Introduction to HTTP

HTTP stands for Hypertext Transfer Protocol. It is a request/response communication protocol invented in 1989 by Sir Tim Berners-Lee, a British computer scientist, and it is the foundation of data exchange on the World Wide Web.

Client / Server

HTTP/1.0 and HTTP/1.1 are relatively simple text-based protocols in which a client sends a request to a server. The server processes the request and returns a response. This exchange is also known as an HTTP transaction. Unencrypted HTTP typically uses port 80 (HTTPS uses port 443) and runs on top of TCP/IP.
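To make the request/response cycle concrete, here is a minimal sketch that uses only Python's standard library. It starts a throwaway server on the loopback interface (on an arbitrary free port rather than port 80) and sends it a single GET request. The handler class, the function name, and the response body are invented for illustration.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<!DOCTYPE html><p>Hello</p>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet instead of logging each request to stderr.
        pass


def run_one_transaction():
    """Perform one HTTP transaction against a local server and return the result."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/SimplePage")       # the client sends the request
        response = conn.getresponse()            # the server returns the response
        result = (
            response.status,
            response.reason,
            response.getheader("Content-Type"),
            response.read(),
        )
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_one_transaction())
```

The client and the server here run in one process only to keep the sketch self-contained; in practice they are separate programs, usually on separate machines.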

Structure

At its core, a request/response consists of:

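  • Header: Metadata about the request or response.
  • Body: The actual data, such as form data, HTML, CSS, JavaScript, JSON, or XML.

Each message also begins with a start line: the request line (command, URL, and protocol version) in a request, or the status line (protocol version and status code) in a response.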
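Because HTTP/1.x is text-based, this structure can be pulled apart with a few string operations. The following is a rough sketch, not a complete parser: the function name and the sample response are made up for illustration, and it assumes a well-formed message in which a blank line separates the header from the body.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into start line, headers, and body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        # Note: a plain dict keeps only the last of any repeated header.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical sample response, shortened from the kind of capture shown in this article.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 15\r\n"
    b"\r\n"
    b"<!DOCTYPE html>"
)

if __name__ == "__main__":
    start_line, headers, body = parse_http_message(raw_response)
    print(start_line)
    print(headers)
    print(body)
```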

Commands

HTTP defines several commands (formally called request methods), but for everyday web development the two to know are GET and POST. GET retrieves data from the server at a given URL, while POST sends data to the server, typically from an HTML form or from JavaScript.

Here’s an example of a simple HTTP 1.1 GET request:

Request

GET http://fdemo2.cronberg.dk/SimplePage HTTP/1.1
Host: fdemo2.cronberg.dk
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Referer: http://fdemo2.cronberg.dk/
Accept-Encoding: gzip, deflate
Accept-Language: en,da;q=0.9,en-US;q=0.8

Response

HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 1498
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/8.5
X-AspNet-Version: 4.0.30319
X-AspNetWebPages-Version: 3.0
X-Powered-By: ASP.NET
Date: Tue, 02 Oct 2018 22:09:33 GMT

<!DOCTYPE html>
...

And here’s an example of a simple HTTP 1.1 POST request:

Request

POST http://fdemo2.cronberg.dk/SimplePost_submit.cshtml HTTP/1.1
Host: fdemo2.cronberg.dk
Connection: keep-alive
Content-Length: 174
Cache-Control: max-age=0
Origin: http://fdemo2.cronberg.dk
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Referer: http://fdemo2.cronberg.dk/SimplePost
Accept-Encoding: gzip, deflate
Accept-Language: en,da;q=0.9,en-US;q=0.8

txtName_name=1234&txtSecretName_name=5678&lstCountry_name=DK&lstSpeak_name=SE&chkFeelYoung_name=on&optSex_name=Male&file1_name=&txtNotes_name=test&btnSubmit1_name=Submit+%231

Response

HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 1684
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/8.5
X-AspNet-Version: 4.0.30319
X-AspNetWebPages-Version: 3.0
X-Powered-By: ASP.NET
Date: Tue, 02 Oct 2018 22:11:34 GMT

<!DOCTYPE html>
...
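The body of the POST request above uses the application/x-www-form-urlencoded format named in its Content-Type header: key=value pairs joined by &, with spaces written as + and reserved characters percent-encoded. As a small sketch, Python's urllib.parse can decode and re-encode that exact body. The variable names are arbitrary.

```python
from urllib.parse import parse_qsl, urlencode

# The request body copied from the POST example above.
post_body = (
    "txtName_name=1234&txtSecretName_name=5678&lstCountry_name=DK"
    "&lstSpeak_name=SE&chkFeelYoung_name=on&optSex_name=Male"
    "&file1_name=&txtNotes_name=test&btnSubmit1_name=Submit+%231"
)

# keep_blank_values=True preserves empty fields such as file1_name.
fields = parse_qsl(post_body, keep_blank_values=True)

# Encoding the decoded fields again reproduces the original body.
re_encoded = urlencode(fields)

if __name__ == "__main__":
    for name, value in fields:
        print(f"{name!r}: {value!r}")
    # The byte length of the body is what the Content-Length header reports.
    print(len(post_body.encode("ascii")))
```

Note how the submit button's value is decoded to "Submit #1": the + becomes a space and %23 becomes #.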

Status Codes

HTTP status codes provide information about the success or failure of a request. They are grouped into:

  • Informational (1XX)
  • Successful (2XX)
      • 200 OK: The request was successful.
  • Redirection (3XX)
      • 301 Moved Permanently: The URL has changed permanently.
      • 302 Found: The URL has changed temporarily.
  • Client Error (4XX)
      • Examples include 404 Not Found and 403 Forbidden.
  • Server Error (5XX)
      • 500 Internal Server Error: The server encountered an unexpected condition.
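Since the group is determined by the first digit, a program can classify any status code with integer division. The sketch below pairs that with the reason phrases from Python's standard http.HTTPStatus enum. The function name and the output format are arbitrary choices for illustration.

```python
from http import HTTPStatus

# The five groups listed above, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its group."""
    status_class = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Not every number in a range is a registered status code.
        phrase = "(no standard reason phrase)"
    return f"{code} {phrase} [{status_class}]"


if __name__ == "__main__":
    for code in (200, 301, 302, 403, 404, 500):
        print(describe_status(code))
```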

Identifying Resources on the Web

In the early days of the internet, as the World Wide Web was taking shape, there was a need for a standardized way to access and locate resources. Tim Berners-Lee, the inventor of the World Wide Web, introduced the concept of the Uniform Resource Identifier (URI) to serve this purpose. Over time, two main subsets of URIs emerged: URLs (Uniform Resource Locators) and URNs (Uniform Resource Names).

What’s the Difference?

  • URI (Uniform Resource Identifier): A generic term for any string of characters used to identify a name or a resource on the internet. Both URLs and URNs are types of URIs.

  • URL (Uniform Resource Locator): Specifies where an identified resource is available and the mechanism for retrieving it. It’s what we commonly think of as a “web address.” A URL not only identifies a resource but also explains how to access it, typically using the HTTP or HTTPS protocol.

  • URN (Uniform Resource Name): Names a resource with a particular namespace, but doesn’t necessarily tell you how to locate or access it. URNs are used less frequently than URLs in everyday web browsing.

Anatomy of a URI/URL

A typical URL consists of several parts, each serving a specific purpose:

  1. Protocol: This indicates the method used to access the resource (formally called the scheme). Common protocols include http, https, ftp, and file.

  2. Subdomain: This is an optional part that comes before the main domain name. Common subdomains include www and api.

  3. Domain: The main address of the website, like google.com or cronberg.dk.

  4. Resource Path: This specifies a particular resource on the web server, like a page or an image.

  5. Query String: Begins with a ? and provides additional parameters that the server can use. It consists of key-value pairs separated by &.

Using the structure:

protocol://subdomain(s).domain/resource?querystring

Here are a few illustrative examples:

http://www.google.com
https://kursusportal.cronberg.dk/forside
https://kursusportal.cronberg.dk/modul/1
https://kursusportal.cronberg.dk/søg?q=web
https://kursusportal.cronberg.dk/viskursus?id=1&bruger=michell
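The same breakdown can be done in code. Here is a small sketch using Python's urllib.parse on one of the example URLs. The function name and the dictionary keys are invented for illustration. Splitting the host into subdomain and domain by taking the last two labels is a naive assumption: it works for cronberg.dk and google.com but not for suffixes such as co.uk.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url: str) -> dict:
    """Break a URL into the parts described above."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    return {
        "protocol": parts.scheme,
        # Naive assumption: the domain is the last two labels of the host name.
        "subdomain": ".".join(labels[:-2]),
        "domain": ".".join(labels[-2:]),
        "resource_path": parts.path,
        # parse_qs returns a list per key, since a key may appear more than once.
        "query": parse_qs(parts.query),
    }


if __name__ == "__main__":
    print(describe_url("https://kursusportal.cronberg.dk/viskursus?id=1&bruger=michell"))
```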

Why is this Important?

Understanding the structure and purpose of URLs is crucial for several reasons:

  • Navigation: URLs are the primary means by which users navigate the web.

  • SEO (Search Engine Optimization): Well-structured URLs can help improve a website’s search engine ranking.

  • Security: Recognizing the structure of URLs can help users identify potentially malicious sites or phishing attempts.

Cookies

A cookie is a small piece of text data that the web browser stores on the user's computer at a website's request. Because HTTP itself is stateless, cookies give websites a way to remember state (such as a login or a shopping cart) between requests, or to record the user's browsing activity.

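Here’s an example of a server response that sets two cookies. The first has no expiry and so lasts only for the browser session; the second carries an expires attribute: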

HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 1765
Content-Type: text/html; charset=utf-8
Server: Microsoft-IIS/8.5
X-AspNet-Version: 4.0.30319
X-AspNetWebPages-Version: 3.0
Set-Cookie: MySessionCookie=My value; path=/
Set-Cookie: MyExpireCookie=My value; expires=Tue, 16-Oct-2018 22:13:16 GMT; path=/
X-Powered-By: ASP.NET
Date: Tue, 02 Oct 2018 22:13:16 GMT

<!DOCTYPE html>
...
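To show how the two Set-Cookie headers above break down, here is a deliberately simple sketch that splits a header value into the cookie's name, value, and attributes, and then builds the Cookie header a browser would send back on later requests. Both function names are made up, and the parsing ignores edge cases such as quoted values.

```python
def parse_set_cookie(header_value: str) -> dict:
    """Split a Set-Cookie header value into name, value, and attributes."""
    first, *rest = header_value.split(";")
    name, _, value = first.strip().partition("=")
    attributes = {}
    for part in rest:
        key, _, attr_value = part.strip().partition("=")
        if key:
            attributes[key.lower()] = attr_value
    # A cookie without an expiry is a session cookie: it is dropped
    # when the browser session ends.
    is_session = "expires" not in attributes and "max-age" not in attributes
    return {"name": name, "value": value, "attributes": attributes, "session": is_session}


def build_cookie_header(cookies: list) -> str:
    """Build the Cookie request header that returns the stored cookies."""
    return "Cookie: " + "; ".join(f"{c['name']}={c['value']}" for c in cookies)


if __name__ == "__main__":
    stored = [
        parse_set_cookie("MySessionCookie=My value; path=/"),
        parse_set_cookie(
            "MyExpireCookie=My value; expires=Tue, 16-Oct-2018 22:13:16 GMT; path=/"
        ),
    ]
    for cookie in stored:
        print(cookie)
    print(build_cookie_header(stored))
```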

Encryption

In the early days of the internet, data transferred between a user’s browser and web servers was sent in plain text, making it vulnerable to eavesdropping and tampering. As the internet grew and began to be used for more sensitive tasks, such as online banking and shopping, the need for a secure method of communication became evident. This led to the development of HTTPS and SSL/TLS encryption.

Basics of HTTP and its Limitations

HTTP (Hypertext Transfer Protocol) is the foundation of any data exchange on the Web. However, being a plain text-based protocol, it has inherent security vulnerabilities. Data sent via HTTP can be intercepted, read, and even modified by malicious actors, especially when using public networks like Wi-Fi hotspots.

Introduction of SSL/TLS

SSL (Secure Sockets Layer) and its successor, TLS (Transport Layer Security), are cryptographic protocols designed to provide secure communication over a computer network. They work by encrypting the data sent between the user’s browser and the web server, so that even if the data is intercepted, it remains unreadable to unauthorized parties. All versions of SSL are now deprecated and modern connections use TLS, although the name “SSL” is still widely used informally.

Evolution to HTTPS

HTTPS (Hypertext Transfer Protocol Secure) is essentially HTTP combined with SSL/TLS. It provides the same functionalities as HTTP but with an added layer of security. When you visit a website that uses HTTPS, the data exchanged between your browser and the website is encrypted. This is especially crucial for websites where sensitive data, such as passwords, credit card numbers, or personal information, is transmitted.

Most modern websites now default to HTTPS, and browsers often flag non-HTTPS websites as “not secure.” This shift was driven by both the increasing threats to online security and the efforts of organizations and browser vendors to promote a more secure web.

Usage and Importance

  1. Online Transactions: HTTPS is vital for online banking, e-commerce, and any other type of transactional website where users input sensitive information.

  2. Data Integrity: HTTPS ensures that the data being sent and received is not tampered with, guaranteeing the integrity of the information.

  3. Authentication: It verifies that users are communicating with the intended website, preventing redirection to malicious sites.

  4. Trust: The padlock icon and the ‘HTTPS’ in the address bar give users confidence in the website’s security, making them more likely to engage in activities like shopping or signing up for services.

  5. SEO Benefits: Search engines, like Google, give preference to HTTPS websites, making it an essential factor for search engine optimization.

Digital Certificates

A digital certificate, often simply referred to as a “certificate,” is an electronic document that uses a digital signature to bind together a public key with an identity. This identity can be an individual’s name, an organization’s name, or a server’s name. Certificates serve as a proof of identity in the digital realm, much like a driver’s license or passport does in the physical world.

How Do Certificates Work?

  1. Issuance: Certificates are issued by trusted entities called Certificate Authorities (CAs). When a website owner wants a certificate, they generate a public and private key pair. They then send the public key, along with some identity information, to a CA. The CA verifies the identity and, if everything checks out, issues a certificate that binds the website owner’s identity to the public key.

  2. Validation: When you connect to a secure website, it presents its certificate to your browser. Your browser checks that the certificate was issued by a trusted CA, that it matches the site's name, and that it is still valid. If the checks pass, the browser uses the public key in the certificate to authenticate the server while the two sides agree on the keys for an encrypted connection.

  3. Revocation: Sometimes, certificates need to be invalidated before they expire, such as when a private key is compromised. CAs maintain lists of revoked certificates, and browsers can check these lists.
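The validation step can be sketched with Python's standard ssl module. ssl.create_default_context() returns a context that requires a certificate from a trusted CA and checks the host name, which mirrors the browser checks described above. The function names are invented for illustration, fetch_peer_certificate needs network access to whatever host you pass in, and this sketch does not cover revocation checking.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Create a TLS context that verifies the CA chain and the host name."""
    return ssl.create_default_context()


def fetch_peer_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to a server and return its validated certificate as a dict."""
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # The handshake raises ssl.SSLCertVerificationError if validation fails.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()


def is_within_validity(cert: dict, now: float) -> bool:
    """Check a timestamp against the certificate's notBefore/notAfter fields."""
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    return not_before <= now <= not_after


if __name__ == "__main__":
    context = make_verifying_context()
    print(context.verify_mode, context.check_hostname)
```

The handshake already rejects expired certificates; is_within_validity is included only to show which fields of the certificate carry the validity period.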

Why Are Certificates Important?

  1. Authentication: Certificates ensure that you’re communicating with the intended website and not a malicious imposter. This is crucial for preventing “man-in-the-middle” attacks, where attackers intercept and potentially alter communications between two parties.

  2. Encryption: Once a browser validates a certificate, it can set up an encrypted connection with the website. This ensures that any data exchanged, like passwords or credit card numbers, remains confidential.

  3. Trust: Seeing a valid certificate (often represented by a padlock icon in the address bar) gives users confidence in a website’s legitimacy and security.

Types of Certificates

While the basic principle remains the same, there are different types of certificates tailored for various needs:

  • Domain Validated (DV) Certificates: These are the most basic type, where the CA only verifies that the applicant has control over the domain name. No company identity information is vetted, so the certificate confirms nothing about who operates the site beyond the domain itself.

  • Organization Validated (OV) Certificates: The CA verifies the company’s details and the domain name. This type provides a moderate level of security and is used by businesses and public-facing websites.

  • Extended Validation (EV) Certificates: This is the highest-ranking certificate. The CA conducts a thorough examination of the company before issuing the certificate. Browsers used to display the company’s name in green in the address bar; most current browsers have dropped that indicator and show the verified company name only in the certificate details.

HTTP/2

HTTP/2, standardized in 2015, is the successor to HTTP/1.1. All modern browsers and servers today can communicate over HTTP/2 if desired. HTTP/2 is a significant update from its predecessor, introducing several new features:

  • Multiplexing: Multiple transactions can be sent in parallel over a single connection.
  • Server push: The server can send resources to the client without the client explicitly asking for them.
  • Binary: Messages are sent as binary frames rather than text, which makes HTTP/2 more efficient to parse.
  • Prioritization: Resources can be sent in order of importance.
  • Speed: Generally faster than HTTP/1.1 due to the above improvements.
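To give a feel for what "binary" means here: every HTTP/2 frame begins with a fixed 9-byte header that holds a 24-bit payload length, an 8-bit frame type, 8 bits of flags, and a 31-bit stream identifier. The stream identifier is what makes multiplexing possible, since frames from different transactions can be interleaved on one connection. The following sketch packs and unpacks such a header. The function names are made up, and only two of the defined frame types are listed.

```python
import struct

# Two of the frame types defined by HTTP/2 (there are several more).
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS"}
END_STREAM = 0x1  # flag: this is the last frame the sender will send on the stream


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream_id must fit in 31 bits")
    # 3 bytes of length, then type, flags, and a 4-byte stream identifier.
    return struct.pack(">I", length)[1:] + struct.pack(">BBI", frame_type, flags, stream_id)


def parse_frame_header(header: bytes) -> dict:
    """Decode a 9-byte HTTP/2 frame header into its fields."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    return {
        "length": int.from_bytes(header[0:3], "big"),
        "type": FRAME_TYPES.get(header[3], f"0x{header[3]:x}"),
        "flags": header[4],
        # The top bit of the last four bytes is reserved and must be ignored.
        "stream_id": int.from_bytes(header[5:9], "big") & 0x7FFFFFFF,
    }


if __name__ == "__main__":
    header = pack_frame_header(length=5, frame_type=0x0, flags=END_STREAM, stream_id=3)
    print(header.hex())
    print(parse_frame_header(header))
```

Compared with scanning text for line breaks and colons, reading fixed-size fields like these is cheap and unambiguous.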

For more on HTTP2, you can check out HTTP2 test.

Tools

There are several tools related to testing and using HTTP.

F12

As a developer, you can typically rely on the Network tab in the browser’s developer tools (opened with F12). It provides most of the information you need.

Fiddler

However, if you want a deeper understanding of HTTP and also want to create scripts that test HTTP communication, Fiddler, a web debugging proxy, is recommended.

Curl

Lastly, there’s curl, a widely used command-line tool for making HTTP requests of all kinds.