HTTP

Internet Engineering
Spring 2024
@1995parham

Introduction

  • HTTP is the transfer protocol for web applications
    • H T T P: Hyper Text Transfer Protocol
    • HTTP 1.0 (RFC 1945), HTTP 1.1 (RFC 2068), HTTP 2 (RFC 7540)
    • In fact, it can be used to transfer everything (not only hyper text)
      • Text Documents e.g. HTML, XML, JSON, etc.
      • Multimedia e.g. JPG, GIF, MP4, MKV, etc.
      • Application e.g. PDF, ZIP, etc.

Introduction (Cont.)

  • HTTP uses the client/server paradigm
    • HTTP server provides resource
    • HTTP client (usually web browser) gets resource
  • But not pure client/server communication
    • Proxies
    • Caches
    • ...

Introduction (Cont.)

  • HTTP is an application layer protocol
  • HTTP assumes reliable communication
    • over TCP
    • default (server) port: 80
    • client port is chosen randomly per connection
  • HTTP is Stateless
    • Server does not keep history/state of client
    • High performance & Low Complexity
    • Problematic in some applications (sessions)
      • Cookies
      • JSON Web Tokens

Data Resources to Transfer

  • HTTP is the protocol to transfer data between server and client (usually from server to client)
  • Which data?
    • It can be anything
    • In web, usually, it is a resource/object on server
  • Each resource must be identified/located uniquely
    • URI (Uniform Resource Identifier)
    • URL (Uniform Resource Locator)
  • URIs identify and URLs locate; however, locators are also identifiers, so every URL is also a URI, but there are URIs that are not URLs.

URI in Action

  • A Uniform Resource Name (URN) is a URI that identifies a resource by name in a particular namespace.
  • International Standard Book Number (ISBN) system

URL

<protocol(scheme)> :// <user> : <pass> @ <host> : <port> / <path> ? <query> # <frag>

  • https://sbu.ac.ir/SitePages/Home.aspx
  • https://www.bing.com/search?q=Hello+World&form=QBLH&sp=-1&pq=&sc=0-0&qs=n&sk=&cvid=7F2B496642F94D5F95989756B4FF60EE
  • file:///home/parham/Documents/Git/parham/dotfiles
  • http://libgen.is
  • ftp://speedtest.tele2.net
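The URL template above can be explored with Python's standard `urllib.parse` module. This is a minimal sketch; the URL below is a hypothetical example chosen to contain every component, not one taken from the slides.

```python
from urllib.parse import urlsplit

# Hypothetical URL that exercises every component of the template.
url = "https://user:secret@www.example.com:8443/docs/index.html?lang=en&page=2#intro"

parts = urlsplit(url)

print("scheme  :", parts.scheme)
print("user    :", parts.username)
print("pass    :", parts.password)
print("host    :", parts.hostname)
print("port    :", parts.port)
print("path    :", parts.path)
print("query   :", parts.query)
print("fragment:", parts.fragment)
```

The fragment is handled by the client only; it is not sent to the server in the request.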

URL (Cont.)

  • Scheme: The application layer protocol
    • HTTP: The web protocol
    • HTTPS: Secure HTTP
    • FTP: File Transfer Protocol
    • file: Access to a local file
    • javascript: Run JavaScript code
    • mailto: Send mail to given address
    • etc.

URL (Cont.)

  • Path: The path of the object on host filesystem
  • E.g. web server root directory is /var/www/
    • http://www.example.com/1.html → /var/www/1.html
    • http://www.example.com/1/2/3.jpg → /var/www/1/2/3.jpg
    • http://www.example.com/1/2/../3.jpg → /var/www/1/3.jpg 🤨
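One way a server might map a URL path onto its root directory is sketched below. The function name `map_path` is hypothetical, and real web servers apply additional rules (aliases, rewrites, symbolic links). Normalizing the path first is what resolves `..` and keeps the result inside the root directory.

```python
import posixpath

DOC_ROOT = "/var/www"  # web server root directory from the example above


def map_path(url_path: str) -> str:
    """Map a URL path to a filesystem path under DOC_ROOT (illustrative only)."""
    # Resolve "." and ".." segments; a leading "/" stops ".." from climbing
    # above the root of the URL space.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    return posixpath.join(DOC_ROOT, normalized.lstrip("/"))


print(map_path("/1.html"))
print(map_path("/1/2/3.jpg"))
print(map_path("/1/2/../3.jpg"))
print(map_path("/../../etc/passwd"))
```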

URL (Cont.)

  • Query: A mechanism to pass information from client to active pages or forms
    • Fill information in a university registration form
    • Ask Bing! to search a phrase
  • Starts with "?"
  • name=value format
  • "&" is the border between multiple parameters
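A query string can be built and parsed with `urllib.parse`. This is a small sketch; the parameter names mirror the Bing example shown earlier, and the values are only illustrative.

```python
from urllib.parse import parse_qs, urlencode

# name=value pairs, joined by "&" and attached to the URL after "?"
params = {"q": "Hello World", "form": "QBLH"}
query = urlencode(params)
url = "https://www.bing.com/search?" + query

# The receiving side splits the string back into names and values.
decoded = parse_qs(query)

print(url)
print(decoded)
```

`urlencode` writes a space as `+`, which is the form-encoding convention seen in the search URL.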

HTTPBin

A simple HTTP Request & Response Service.

URL (Cont.)

  • Domain names are case insensitive according to RFC 4343.
  • The rest of the URL (path and query) is sent to the server in the request line (e.g. of a GET request). Whether it is treated as case sensitive depends on the server.

URL (Cont.)

  • URL is encoded by client before transmission
  • How: Each byte is split into two 4-bit groups; the two hexadecimal digits are prefixed by %
    • ~ → 126 (0x7E) → %7E
  • What & Why?
    • Non-ASCII characters (e.g., Persian characters, emoji)
    • Reserved characters when they are not used in their special role
    • Unsafe characters, e.g. space, %, ...

    https://ganj.irandoc.ac.ir/api/v1/search/main?keywords=hellow%20world
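The encoding rule can be sketched in a few lines. `percent_encode_byte` is a hypothetical helper written only to show the 4-bit split; in practice `urllib.parse.quote` does the whole job. Note that recent Python versions treat `~` as unreserved and leave it unencoded, so the helper is used for that character.

```python
from urllib.parse import quote, unquote

HEX_DIGITS = "0123456789ABCDEF"


def percent_encode_byte(byte: int) -> str:
    """Encode one byte as '%' followed by its two hexadecimal digits."""
    high, low = byte >> 4, byte & 0x0F  # the two 4-bit groups
    return "%" + HEX_DIGITS[high] + HEX_DIGITS[low]


tilde = percent_encode_byte(ord("~"))  # ASCII 126
space = quote("hello world", safe="")  # unsafe character: space
persian = quote("سلام", safe="")       # non-ASCII: encoded as UTF-8 bytes

print(tilde)
print(space)
print(persian)
print(unquote(persian))
```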

URL in Action

  • User asks the browser to retrieve a resource
  • Browser finds the IP address of <host> (DNS lookup)
  • Browser creates a TCP connection to the IP address and the <port>
  • Browser sends HTTP requests through the connection
  • Browser gets the response and processes it
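The steps above can be sketched with plain sockets. This is a simplified illustration, not how a browser is implemented: `build_request` and `fetch` are hypothetical names, and only plain HTTP on port 80 is handled.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close after one response
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Resolve the host, connect over TCP, send a request, read the reply."""
    ip_address = socket.gethostbyname(host)  # DNS lookup
    with socket.create_connection((ip_address, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))  # send the HTTP request
        chunks = []
        while True:  # read until the server closes the connection
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


print(build_request("example.com", "/index.html").decode("ascii"))
```

Calling `fetch("example.com")` needs network access and returns the raw response bytes (status line, headers, and body).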

URL in Action (Cont.)


# ce.aut.ac.ir resolves to 185.211.88.129
# connectify0 is the network interface

sudo tcpdump --interface connectify0 --number -n -v 'port 80 and dst host 185.211.88.129'
  

1  07:47:41.159469 IP (tos 0x0, ttl 64, id 59255, offset 0, flags [DF], proto TCP (6), length 60)
10.202.0.2.55340 > 185.211.88.129.80: Flags [S], cksum 0x5a35 (correct), seq 440330126, win 59040, options [mss 14760,sackOK,TS val 954835682 ecr 0,nop,wscale 7], length 0

2  07:47:41.159904 IP (tos 0x0, ttl 64, id 59256, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xf7e0 (correct), ack 35014, win 462, options [nop,nop,TS val 954835682 ecr 15081515], length 0

3  07:47:41.160032 IP (tos 0x0, ttl 64, id 59257, offset 0, flags [DF], proto TCP (6), length 137)
10.202.0.2.55340 > 185.211.88.129.80: Flags [P.], cksum 0x282f (correct), seq 0:85, ack 1, win 462, options [nop,nop,TS val 954835682 ecr 15081515], length 85: HTTP, length: 85
    GET /~bakhshis/ HTTP/1.1
    Host: ce.aut.ac.ir
    User-Agent: curl/8.3.0
    Accept: */*

4  07:47:41.218732 IP (tos 0x0, ttl 64, id 59258, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xe649 (correct), ack 4334, win 429, options [nop,nop,TS val 954835741 ecr 15081574], length 0

5  07:47:41.219358 IP (tos 0x0, ttl 64, id 59259, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xd5c5 (correct), ack 8592, win 397, options [nop,nop,TS val 954835742 ecr 15081575], length 0

6  07:47:41.224808 IP (tos 0x0, ttl 64, id 59260, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xcb33 (correct), ack 11288, win 397, options [nop,nop,TS val 954835747 ecr 15081580], length 0

7  07:47:41.226004 IP (tos 0x0, ttl 64, id 59261, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xc0a9 (correct), ack 13984, win 397, options [nop,nop,TS val 954835748 ecr 15081581], length 0

8  07:47:41.226548 IP (tos 0x0, ttl 64, id 59262, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xb61f (correct), ack 16680, win 397, options [nop,nop,TS val 954835749 ecr 15081582], length 0

9  07:47:41.227124 IP (tos 0x0, ttl 64, id 59263, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xb332 (correct), ack 17427, win 397, options [nop,nop,TS val 954835750 ecr 15081583], length 0

10  07:47:41.227371 IP (tos 0x0, ttl 64, id 59264, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [F.], cksum 0xb331 (correct), seq 85, ack 17427, win 397, options [nop,nop,TS val 954835750 ecr 15081583], length 0

11  07:47:41.227485 IP (tos 0x0, ttl 64, id 59265, offset 0, flags [DF], proto TCP (6), length 52)
10.202.0.2.55340 > 185.211.88.129.80: Flags [.], cksum 0xb330 (correct), ack 17428, win 397, options [nop,nop,TS val 954835750 ecr 15081583], length 0
  

How does HTTP work? Transactions

  • HTTP data transfer is a collection of transactions
  • Each transaction is composed of 2 HTTP messages (a request and a response)
  • Requests are identified by methods
    • Method: The action that client asks from server
  • Responses are identified by status codes
    • Status: The result of the requested action

HTTP Transactions

[Figure: HTTP transactions]

HTTP Transactions in Web

  • (Typically) each web page contains multiple resources
    • The main skeleton HTML page
    • Some linked materials: figures, videos, JS, CSS, etc.
  • Displaying a web page by a browser
    • Get the HTML page (first transaction)
    • Try to display the page (rendering)
    • Other resources are linked to the page
    • Get the resources (subsequent transactions)

HTTP Transactions in Web (Cont.)

  • HTTP Transactions & TCP Connections
    1. Non-persistent
      • A new TCP connection per object
      • Network overhead + Connection establish delay + Resource intensive
      • Parallel connections speed up browsing
    2. Persistent
      • Get multiple objects using a single TCP connection
      • No extra processing & networking overhead
      • Poor performance if implemented in serial manner
      • Pipeline requests speed up browsing (HTTP/1.1)
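A persistent connection can be observed locally with the standard library. The sketch below starts a throwaway HTTP/1.1 server on the loopback interface (the handler and paths are hypothetical) and fetches two objects through one `http.client.HTTPConnection`. If the client's source port is the same for both requests, the same TCP connection was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive on the server side

    def do_GET(self):
        body = f"you asked for {self.path}\n".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_two_objects():
    """Fetch two paths over one connection; return client ports and bodies."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports, bodies = [], []
    for path in ("/index.html", "/style.css"):
        conn.request("GET", path)
        ports.append(conn.sock.getsockname()[1])  # client-side TCP port
        bodies.append(conn.getresponse().read().decode("ascii"))
    conn.close()
    server.shutdown()
    server.server_close()
    return ports, bodies


ports, bodies = fetch_two_objects()
print("client ports:", ports)
print("same TCP connection:", ports[0] == ports[1])
```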

HTTP/2 for a Faster Web

HTTP Pipelining

  • HTTP pipelining proved very difficult to deploy reliably in practice, and most browsers left it disabled by default.
  • Even when it worked, a single large or slow response still blocked all the responses queued behind it (head-of-line blocking).

HTTP/2 Multiplexing

  • Multiplexing allows multiple request-response messages to be in flight over a single HTTP/2 connection, at the same time.

HTTP/2 for a Faster Web (Cont.)

[Figure: HTTP/2 transactions]

HTTP Transactions in Web: Hands on

  • Get an HTML page from a server
  • Capture the packets
  • Investigate the transactions

HTTP Messages

  • HTTP/1.x is a text-based protocol
    • Human readable headers
    • The header is composed of some lines
[Figure: HTTP message structure]

HTTP Messages (Cont.)

  • E.g. HTTP request message

GET /index.html HTTP/1.1
Host: www.aut.ac.ir
User-Agent: Mozilla/36.0
Accept-Language: en-us
Connection: keep-alive
    

Method<sp>Path<sp>Version<CRLF>
Header-Field:Header-Value<CRLF>
...
Header-Field:Header-Value<CRLF>
<CRLF>
Entity-Body
    
  • E.g. HTTP response message

HTTP/1.1 200 OK
Date: Tue, 02 Oct 2018 20:30:40 GMT
Server: Apache/2.2.2
Last-Modified: Wed, 03 May 2017 10:20:22 GMT
Connection: keep-alive
Content-Length: 3000

data data data ...
    

Version<sp>Code<sp>Reason<CRLF>
Header-Field:Header-Value<CRLF>
...
Header-Field:Header-Value<CRLF>
<CRLF>
Entity-Body
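The response layout above can be taken apart with a few string operations. This is a simplified sketch (`parse_response` is a hypothetical helper) that ignores repeated headers and chunked bodies; the sample message is shortened from the example above.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into its parts."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the empty line ends the header
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = lines[0].split(" ", 2)  # Version<sp>Code<sp>Reason
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache/2.2.2\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"data data ..."
)

version, code, reason, headers, body = parse_response(raw)
print(version, code, reason)
print(headers)
print(body)
```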
    

HTTP Methods

  • Methods are actions that client asks from server to do on the specified resource (given by the path parameter)
  • Which actions?
    • Basic data communication operations
      • Safe operations (read-only)
        • Get a resource from server
      • Unsafe operations (may change server state)
        • Send data to server
        • Delete a resource on server
        • Create/Replace a resource on server
    • Debugging and troubleshooting
      • Get information about a resource
      • Check what the server has got from a client
      • Get the list of operations which can be applied on a resource

HTTP Methods (Cont.)

  • GET: Retrieve resource from server
  • HEAD: Similar to GET but the resource itself is not retrieved, just the HTTP response header
    • Useful for debugging or some other applications
  • POST: Submit data to be processed by the specified resource
    • Data itself is enveloped in message body

HTTP Methods (Cont.)

  • DELETE: Remove the resource
    • Not popular in web, can be used in other applications
  • PUT: Add message body as the specified resource to server
  • PATCH: A PATCH request carries a set of instructions on how to modify a resource. Contrast this with PUT, which carries a complete representation of the resource.
  • TRACE: Server echoes back the received message
    • For troubleshooting & debugging
  • OPTIONS: Request the list of supported methods by server on the resource
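The methods can also be grouped by two properties defined in the HTTP semantics specification (RFC 9110): a safe method does not ask the server to change state, and an idempotent method has the same effect whether it is sent once or several times. The small table below is only an illustration; `describe` is a hypothetical helper.

```python
# Classification as defined in RFC 9110.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE"}


def describe(method: str) -> str:
    """Summarize the safety and idempotency of an HTTP method."""
    method = method.upper()
    safe = method in SAFE_METHODS
    idempotent = method in IDEMPOTENT_METHODS
    return f"{method}: safe={safe}, idempotent={idempotent}"


for name in ("GET", "HEAD", "POST", "PUT", "PATCH", "DELETE", "OPTIONS", "TRACE"):
    print(describe(name))
```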

HTTP Responses

  • The message for the result/response of the requested action
  • Which responses?
    • Basic responses
      • Success
      • Failure
        • Bad client request
        • Server problem
        • ...
    • Others
      • E.g., Redirection to other resources

HTTP Responses (Cont.)

  • 2xx (Successful responses)
    • 200: OK
    • 201: Created
    • 204: No Content
  • 4xx (Client errors)
    • 400: Bad Request
    • 401: Unauthorized (Authorization required)
    • 403: Forbidden
    • 404: Not Found
    • 405: Method Not Allowed

HTTP Responses (Cont.)

  • 5xx (Server errors)
    • 500: Internal Server Error
    • 501: Not Implemented
    • 503: Service Unavailable
  • 3xx (Redirects)
    • 301: Moved Permanently
    • 302: Found
      • Indicates that the requested resource has been temporarily moved to the URL given by the Location header
    • 307: Temporary Redirect
      • Resource has been moved temporarily; the client repeats the request with the same method
      • Location header contains the new location of resource
    • 304: Not Modified
    • 308: Permanent Redirect
    • Both 302 and 307 are used for temporary redirects
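Python's standard library ships the registered status codes in `http.HTTPStatus`, which makes it easy to look up a reason phrase. The class names in `CLASSES` and the helper `describe_status` are illustrative choices, not part of the library.

```python
from http import HTTPStatus

# The first digit of the status code selects the response class.
CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe_status(code: int) -> str:
    """Return e.g. '404 Not Found (Client error)'."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({CLASSES[code // 100]})"


for code in (200, 301, 302, 304, 307, 308, 404, 503):
    print(describe_status(code))
```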

HTTP Responses (Cont.)

  • 3xx (Redirects)
  • 307 was introduced because user agents adopted, as a de facto behaviour, turning a POST request that receives a 302 response into a GET request to the URL in the Location header.

    That behaviour is incorrect: only a 303 should cause a POST to turn into a GET.

    [Figure: 3xx redirects]

HTTP Responses (Cont.)

  • 1xx (Informational responses)
    • 101: Switching Protocols
      • This code is sent in response to an Upgrade request header from the client, and indicates the protocol the server is switching to.

HTTP Messages Hands on

  • Connect to a web server
    • Telnet can create TCP socket
  • Play with the server by sending HTTP methods and checking the responses

HTTP Headers

  • Headers are additional information that is sent by client to server and vice versa
    • Most (almost all) are optional
  • Which headers?
    • Information about client
    • Information about server
    • Information about the requested resource
    • Information about the response
    • Security/Authentication
    • ...

HTTP Headers

  • General headers
    • Appear both on request & response messages
  • Request headers
    • Information about request
  • Response headers
    • Information about response
  • Entity headers
    • Information about body (size, ...)
  • Extension headers
    • New headers (not standard)

General Headers

  • Date: Date & Time that message is created
  • Connection: Close or Keep-Alive
    • Close: Non-persistent connection
    • Keep-Alive: Persistent connection
  • Via: Information about the intermediate nodes between two sides
    • Proxy servers

Request Headers

  • Host: The name of the server (required, why?)
  • Referer: URL of the page from which the requested URL was reached
  • Information about the client
    • User-Agent: The client program
    • Accept: The acceptable media types
    • Accept-Encoding: The acceptable encoding
    • Accept-Language: The acceptable language

Request Headers (Cont.)

  • Range: Specific range (in bytes) of resource
  • Authorization: Credentials sent in response to an authentication challenge
    • Will be discussed later
  • Cookie: To return back the cookies
    • Will be discussed later
  • If-Modified-Since: Request is processed only if the object has been modified since the specified time.
    • Used in Web Caching
    • Will be discussed later

Response Header

  • Server: Information about server
  • WWW-Authenticate: Used to specify authentication parameters by server
  • Proxy-Authenticate: Used to specify authentication parameters by proxy
  • Set-Cookie: To send a cookie to client
  • Location: The location of entity to redirect client

Response Header (Cont.)

  • Last-Modified: The date and time of last modification of entity
  • Content-Range: Range of this entity in the entire resource
  • Expires: The date and time at which the entity will expire

Entity Headers

  • Content-Length: The length of body (in bytes)
  • Content-Type: The type of entity
    • MIME types: text/xml, image/gif
  • Allow: The request methods that can be performed on the entity
    • Sent in response to the OPTIONS method

from https://avatars1.githubusercontent.com/u/8181240?v=4

Extension Headers

  • Custom proprietary headers have historically been used with an X- prefix, but this convention was deprecated in June 2012
  • implementation-specific and private-use parameters could at least incorporate the organization's name
    • ExampleInc-foo
    • VND.ExampleInc.foo (vnd stands for vendor)
  • or primary domain name
    • com.example.foo
    • http://example.com/foo

HTTP Tools

Server-sent events (SSE)

  • Traditionally, a web page has to send a request to the server to receive new data; that is, the page requests data from the server
  • With server-sent events, it's possible for a server to send new data to a web page at any time, by pushing messages to the web page.

Server-Sent Events (SSE) is a server push technology enabling a client to receive automatic updates from a server via an HTTP connection, and describes how servers can initiate data transmission towards clients once an initial client connection has been established.

They are commonly used to send message updates or continuous data streams to a browser client and designed to enhance native, cross-browser streaming through a JavaScript API called EventSource, through which a client requests a particular URL in order to receive an event stream.

The EventSource API is standardized as part of HTML5 by the WHATWG. The media type for SSE is text/event-stream.
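On the wire, a `text/event-stream` body is a sequence of text messages: each message consists of `field: value` lines (`data`, `event`, `id`, `retry`) and ends with a blank line. The sketch below formats one message; `format_sse` is a hypothetical helper, not part of any framework.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one server-sent event in the text/event-stream format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as several consecutive "data:" lines.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"  # the blank line terminates the event


print(format_sse("hello"), end="")
print(format_sse("line one\nline two", event="update", event_id="7"), end="")
```

A server would write such messages to a response that stays open with `Content-Type: text/event-stream`.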

HTTP/2 Server Push

  • HTTP/2 Server Push is an optional feature of the HTTP/2 and HTTP/3 network protocols which allows servers to send resources to a client before the client requests them.
  • Server Push is a performance technique aimed at reducing latency by loading resources preemptively, even before the client knows they will be needed.

Stateless Problem

  • HTTP is a stateless protocol
    • Server does not remember its clients
  • How to personalize pages (personal portal)?
    • Use http header: X-Forwarded-For
      • Is not usually sent by browsers
    • Find client IP address from TCP connection
      • The problem is NAT
    • Clients move between networks, so an IP address does not follow a client

Solution of Stateless Problem: Cookie [RFC 6265]

  • Cookies are pieces of information (e.g. unique identifiers) sent by the server to the user (browser), which the browser returns to the server
  • How it works
    • Server asks client to remember the information
      • Set-Cookie header in response message
      • Set-Cookie: <cookie-name>=<cookie-value>
    • Client gives back the information to server in every request
      • Cookie header in request messages
      • Cookie: <cookie-name>=<cookie-value>; <cookie-name>=<cookie-value>
    • Server customizes responses according to the cookie
  • Types
    • Session cookies: To identify a session
    • Persistent cookies: To identify a client (browser)

Cookies (Cont.)


GET /cookies/set?name=parham&family=alvani HTTP/1.1
Host: httpbin.org

    

HTTP/1.1 302 FOUND
Date: Mon, 07 Sep 2020 05:19:50 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 223
Connection: keep-alive
Server: gunicorn/19.9.0
Location: /cookies
Set-Cookie: name=parham; Path=/
Set-Cookie: family=alvani; Path=/
    

GET /cookies HTTP/1.1
Host: httpbin.org
Cookie: name=parham; family=alvani
    

HTTP/1.1 200 OK
Date: Mon, 07 Sep 2020 05:23:53 GMT
Content-Type: application/json
Content-Length: 58
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

{
  "cookies": {
    "family": "alvani",
    "name": "parham"
  }
}
    

Cookies (Cont.)

  • Limitation
    • Cannot be used to store large data
    • Browsers are expected to support at least (RFC 2109 minimums):
      • 300 cookies in total
      • 4096 bytes per cookie
      • 20 cookies per unique host or domain name
  • Cookies are plain text data, not executable code
    • They cannot spread viruses by themselves
  • The server never sends a request to read cookies
    • By default cookies are sent by browser
    • Browser checks URL and finds appropriate cookies
Cookies (Cont.)

  • Client can control cookies
    • Disable cookies: no cookie is saved & used
    • View & Delete cookies
  • Server can control cookies by its attributes
    • Expiration time
    • Domain
    • Path
    • Security
    • ...

Cookies Attributes

  • Expires & Max-Age: The lifetime of the cookie
    • Expires: An absolute time at which to delete the cookie
    • Max-Age: The maximum lifetime (in seconds) of the cookie
    • If either is present, the cookie is a persistent cookie
    • If neither is present, the cookie is a session cookie and expires when the session ends
    • Send a past time (or a non-positive Max-Age) to delete a cookie
  • Secure: Cookie is sent only if channel is secure
    • Especially useful for login session cookies
    • A cookie with the Secure attribute is sent to the server only with an encrypted request over the HTTPS protocol, never with unsecured HTTP (except on localhost), and therefore can't easily be accessed by a man-in-the-middle attacker.
  • HttpOnly: Cookie is hidden from client-side scripts
    • JavaScript (e.g. document.cookie) cannot access the cookie; it is still sent with HTTP requests
  • SameSite allows you to declare if your cookie should be restricted to a first-party or same-site context
    • Lax: Cookies are not sent on normal cross-site subrequests (for example to load images or frames into a third party site), but are sent when a user is navigating to the origin site (i.e. when following a link).
    • Strict: Cookies will only be sent in a first-party context and not be sent along with requests initiated by third party websites.
    • None: Cookies will be sent in all contexts, i.e. with both first-party and cross-site requests. If SameSite=None is set, the cookie Secure attribute must also be set (or the cookie will be blocked).
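The attributes can be attached to a cookie with `http.cookies` (the `samesite` key is available from Python 3.8). This is a sketch with a hypothetical session cookie; the values are placeholders.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"  # hypothetical session identifier
morsel = cookie["session"]
morsel["max-age"] = 3600      # persistent cookie: lives for one hour
morsel["path"] = "/"
morsel["secure"] = True       # only sent over HTTPS
morsel["httponly"] = True     # hidden from JavaScript
morsel["samesite"] = "Lax"    # restrict cross-site sending

header_value = morsel.OutputString()
print("Set-Cookie:", header_value)
```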

CSRF

  • Cross-site request forgery (CSRF) attacks rely on the fact that cookies are attached to any request to a given origin, no matter who initiates the request.

Warning!

Neither Strict nor Lax is a complete solution for your site's security. Cookies are sent as part of the user's request and you should treat them the same as any other user input. That means sanitizing and validating the input. Never use a cookie to store data you consider a server-side secret.

Cookies Attributes: Domain & Path

  • Domain & Path determine the scope of the cookie
    • For which path and domain, the cookie is saved & returned back by browser
  • If Domain is omitted, defaults to the host of the current document URL.
    • Browser returns back the cookie for the domain and not sub-domains
  • If Path is omitted, defaults to the path of the current document URL.
    • Browser returns back the cookie for the path and also for all sub-paths
  • If present then browser checks validity
    • If they are valid then Browser returns back the cookie for that domain & that path and also for all sub-domains and sub-paths

Cookies Attributes: Domain & Path

  • Validity check by major browsers
    • Earlier specifications required the Domain value to start with a dot
      • Some browsers accepted names without the dot
      • Under the current specification (RFC 6265), a leading dot (.example.com) is ignored
    • Don’t accept for other domains than the base domain
    • Don’t accept cookies for sub-domains
    • Accept cookies for higher domains
      • Except the top level domains, e.g., .com, .ac.ir
    • Accept cookies for other (sub or higher) paths
      • A path that must exist in the requested URL, or the browser won't send the Cookie header.
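The scope rules can be sketched as two small predicates that follow the domain-match and path-match definitions of RFC 6265. They are simplified (for example, IP-address hosts and public-suffix checks are ignored), and the function names are hypothetical.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if a cookie with this Domain attribute is sent to request_host."""
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")  # leading dot is ignored
    return (request_host == cookie_domain
            or request_host.endswith("." + cookie_domain))


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if a cookie with this Path attribute is sent for request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end at a "/" boundary: /docs matches /docs/api
        # but not /docsearch.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(domain_matches("www.example.com", ".example.com"))
print(domain_matches("badexample.com", "example.com"))
print(path_matches("/docs/api", "/docs"))
print(path_matches("/docsearch", "/docs"))
```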

Cookies Attributes: Domain & Path: Hands on

    Single Sign-on (SSO)

    Proxy

    • Proxies sit between client and server
    • Act as server for client
    • Act as client for server
    [Figure: proxy]

    Forward Proxies

    • A forward proxy, or gateway, or just "proxy" provides proxy services to a client or a group of clients.

    Reverse Proxies

    • As the name implies, a reverse proxy does the opposite of what a forward proxy does:
    • A forward proxy acts on behalf of clients (or requesting hosts), a reverse proxy acts on behalf of servers.
    • Forward proxies can hide the identities of clients whereas reverse proxies can hide the identities of servers.
    • Reverse proxies have several use cases, a few are:
      • Load balancing: distribute the load to several web servers,
      • Cache static content: offload the web servers by caching static content like pictures,
      • Compression: compress and optimize content to speed up load time.
    [Figure: reverse proxy]

    HTTP Proxy Applications

    • Authentication
      • Client side: Authenticate clients before they access web
      • Server side: Authenticate clients before access the server
    • Accounting: Log client activities
    • Security: Analyze request before sending it to server
      • Integrated in modern firewalls
    • Filtering: Limit access to specified contents
    • Anonymizer: Anonymous web browsing
    • Caching (more details in the following slide)

    Caching

    • Caching: save a copy of a resource and use it instead of requesting server
    • Browser has its own local caches
    • Cache server is special proxy for caching
    • Benefits
      • Reduce redundant data transfer
      • Reduce network bottleneck
      • Reduce load on server
      • Reduce delay

    Caching Algorithm

    • If the object is not cached, it is fetched from the server, saved in the cache, and sent to the client
    • Else, if object is in cache
      • Cache server must return only fresh objects
      • Freshness check
    • Objects life-time specified by server
    • The Cache-Control HTTP/1.1 general-header field is used to specify directives for caching mechanisms in both requests and responses.

    Expiration

    The maximum amount of time a resource will be considered fresh.

    
    Cache-Control: max-age=<seconds>
        

    No caching

    The cache should not store anything about the client request or server response.

    
    Cache-Control: no-store
        

    Cache but revalidate

    A cache will send the request to the origin server for validation before releasing a cached copy.

    
    Cache-Control: no-cache
        
    • If requested object is not expired
      • Cache server gives it to client
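The three directives above lead to a simple decision for a cached copy. The sketch below is a deliberate simplification (real caches also consider `Expires`, `Age`, `s-maxage`, and more); `cache_decision` and its return labels are hypothetical.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value of a Cache-Control header, or None."""
    match = re.search(r"max-age=(\d+)", cache_control.lower())
    return int(match.group(1)) if match else None


def cache_decision(cache_control: str, age_seconds: float) -> str:
    """Decide what a cache may do with a response of the given age."""
    directives = {d.strip().lower() for d in cache_control.split(",")}
    if "no-store" in directives:
        return "do-not-store"  # never keep a copy
    if "no-cache" in directives:
        return "revalidate"    # keep a copy, but ask the server first
    max_age = max_age_seconds(cache_control)
    if max_age is not None and age_seconds < max_age:
        return "fresh"         # serve directly from the cache
    return "revalidate"        # expired: freshness must be checked


print(cache_decision("max-age=60", 10))
print(cache_decision("max-age=60", 120))
print(cache_decision("no-cache", 0))
print(cache_decision("no-store", 0))
```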

    Caching Algorithm (Cont.)

    • If requested object is expired
      • Its freshness must be checked
    • Freshness is checked by conditional request
      • If-Modified-Since: current last-modified time
      • If-None-Match: the server will send back the requested resource, with a 200 status, only if it doesn't have an ETag matching the given ones.
        • The ETag HTTP response header is an identifier for a specific version of a resource.
    • Server responses
      • 304 Not modified response + new expire time
        • Cached copy is valid until the specified time
      • 200 OK
        • Server provides a new version of the object
        • Cache server updates cached copy
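The server side of a conditional request with `If-None-Match` can be sketched as follows. `respond` is a hypothetical function, and the comparison is simplified (weak validators with the `W/` prefix are not handled).

```python
from typing import Optional


def respond(current_etag: str, body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            # The cached copy is still valid: send no body.
            return 304, {"ETag": current_etag}, b""
    # No validator, or the resource changed: send the current version.
    return 200, {"ETag": current_etag}, body


print(respond('"v2"', b"new content", if_none_match='"v2"'))
print(respond('"v2"', b"new content", if_none_match='"v1"'))
print(respond('"v2"', b"new content"))
```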

    Authentication vs Authorization

    • Authentication is the verification of the credentials of the connection attempt.
    • This process consists of sending the credentials from the remote access client to the remote access server in an either plaintext or encrypted form by using an authentication protocol.
    • Authorization is the verification that the connection attempt is allowed.
    • Authorization occurs after successful authentication.

    HTTP Authentication

    • Not all resources on the web are public; e.g.,
      • Financial documents, Customer information, ...
    • HTTP has two (similar) authentications
      • Basic: Base64 encoded user:pass [The username itself cannot contain a colon]
      • Digest: Plain username + Digest of pass
    • Steps are the same
        1. Client-side app (browser) requests a resource from the server
        2. Server refuses with 401 Unauthorized
        3. Client-side app asks the user for a username & password
        4. Client sends the username & password to the server
        5. Server authenticates the client and allows access
      • Authentication information is sent with every request until the end of the current session

    Base64

    • Base64 is a group of binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation.

    An additional pad character (=) is allocated. It forces the encoded output to an integer multiple of 4 characters when the unencoded input is not a multiple of 3 bytes. Padding characters are discarded when decoding, but they still allow the length of the unencoded data to be calculated: the last non-pad character is encoded so that the final 6-bit block it represents is zero-padded on its least significant bits, and at most two pad characters may occur at the end of the encoded stream.

    If unpadded strings are concatenated, it's impossible to recover the original data because information about the number of odd bytes at the end of each individual sequence is lost
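The padding rule is easy to see with inputs of one, two, and three bytes. This is a small sketch using the standard `base64` module.

```python
import base64

# 3 input bytes become 4 output characters; shorter tails are padded with "=".
encoded = {
    raw: base64.b64encode(raw).decode("ascii")
    for raw in (b"a", b"ab", b"abc")
}

for raw, text in encoded.items():
    print(f"{len(raw)} byte(s) -> {text}")
```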

    HTTP Authentication (Cont.)

    [Figure: Basic authentication in action]

    HTTP Authentication (Cont.)

    
    GET /basic-auth/admin/admin HTTP/1.1
    Host: httpbin.org
    Authorization: Basic YWRtaW46YWRtaW4=
    
      
    
    HTTP/1.1 200 OK
    Date: Mon, 07 Sep 2020 14:14:25 GMT
    Content-Type: application/json
    Content-Length: 48
    Connection: keep-alive
    Server: gunicorn/19.9.0
    Access-Control-Allow-Origin: *
    Access-Control-Allow-Credentials: true
    
    {
      "authenticated": true,
      "user": "admin"
    }
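The `Authorization` value in the request above is just `admin:admin` in Base64. The helpers below are a hypothetical sketch of how a client might build it and how a server might decode it.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic authentication."""
    if ":" in username:
        raise ValueError("the username cannot contain a colon")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_auth(header_value: str):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")  # split at the first colon
    return username, password


print(basic_auth_header("admin", "admin"))
print(parse_basic_auth("Basic YWRtaW46YWRtaW4="))
```

Decoding needs no secret, which is why Basic authentication is only acceptable over an encrypted channel.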
      

    Digest Authentication

    • Basic authentication is insecure
      • Password is sent in base64 encoding
      • Attacker can easily find it
    • Digest authentication: Don’t send password
      • Send its digest (hash)
    • Digest/hash function
      • One way function, irreversible
    • Attacker cannot find password 😌
    • But! Replay attack 😞
      • If an attacker resends a captured digest, the attacker is authenticated
      • Use Nonce
    • Digest authentication uses nonce and digest together

    Digest Authentication (Cont.)

    • Client requests a private resource
    • Server creates a nonce (the server only issues a new nonce
      for each 401 response)
    
    WWW-Authenticate: Digest realm="testrealm@host.com",
                            qop="auth,auth-int",
                            nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
                            opaque="5ccc069c403ebaf9f0171e9517f40e41"
        
    • Client computes digest of password & nonce
      HA1 = MD5(username:realm:password)
      HA2 = MD5(method:digestURI)
      response = MD5(HA1:nonce:HA2)                 (qop not specified)
      response = MD5(HA1:nonce:nc:cnonce:qop:HA2)   (qop=auth, as in the example)
            
      
      Authorization: Digest username="Mufasa",
                           realm="testrealm@host.com",
                           nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
                           uri="/dir/index.html",
                           qop=auth,
                           nc=00000001,
                           cnonce="0a4f113b",
                           response="6629fae49393a05397450978507c4ef1",
                           opaque="5ccc069c403ebaf9f0171e9517f40e41"
            
    • Server looks up the password of the username, computes the same digest,
      and compares it with the client's response
    • More details in RFC 7616
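The digest computation can be reproduced with `hashlib`. The header values come from the example above; the password `Circle Of Life` and the `GET` method are assumptions taken from the well-known RFC 2617 example, since they do not appear in the headers. The formula used is the `qop=auth` variant.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value for qop=auth (MD5)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


response = digest_response(
    username="Mufasa",
    realm="testrealm@host.com",
    password="Circle Of Life",  # assumption: password from the RFC 2617 example
    method="GET",               # assumption: request method of the example
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
)
print(response)
```

The server performs the same computation with the stored password and compares the two values; the password itself never crosses the network.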

    In the Wild

    • Basic/digest authentication is rarely needed today: applications typically send the password in a login form over secure HTTP (HTTPS).
    • Either way, we need a stateful mechanism so that the authentication procedure is not repeated for every request.
    courses.aut.ac.ir
    [Figures: Moodle cookies, Moodle request, Moodle POST]

    Bearer Authentication

    • Bearer authentication (also called token authentication) is an HTTP authentication scheme that involves security tokens called bearer tokens
    • The name Bearer authentication can be understood as "give access to the bearer of this token".

    JWT

    • JSON Web Tokens are an open, industry standard RFC 7519 method for representing claims securely between two parties.

    Sample Token

    
    eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.
    eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IlBhcmhhbSBBbHZhbmkiLCJpYXQiOjE1MTYyMzkwMjIsInByb2plY3QiOiJhbiBhd2Vzb21lIHByb2plY3QifQ.
    gWWHu5Ps_F6lbqJRBXkNjEk_-0QdLhN9l2MNjWOcj90
        
    
    {
      "alg": "HS256",
      "typ": "JWT"
    }
    
    {
      "sub": "1234567890",
      "name": "Parham Alvani",
      "iat": 1516239022,
      "project": "an awesome project"
    }
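A token with this three-part layout (header, payload, signature, each base64url-encoded and joined by dots) can be produced and checked with the standard library. This is a minimal HS256 sketch under stated assumptions: the secret is hypothetical, and claim checks such as `exp` are omitted. A maintained JWT library is the usual choice in practice.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Create header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url_encode(signature)])


def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the signature and return the claims; raise ValueError otherwise."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))


SECRET = b"hypothetical-shared-secret"  # assumption: illustration only
token = sign_hs256({"sub": "1234567890", "name": "Parham Alvani"}, SECRET)
print(token)
print(verify_hs256(token, SECRET))
```

The payload is only encoded, not encrypted: anyone holding the token can read the claims, and the signature only protects them against modification.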
        

    Snapp! Token

    
    {
      "alg": "RS512",
      "kid": "z8a4l4oOFEqgehRYDBZP+fprPnLDLmabkslOxVVpLNE",
      "typ": "JWT"
    }
    {
      "aud": [
        "passenger"
      ],
      "email": "parhamalvani@gmail.com",
      "exp": 1646469738,
      "iat": 1645260138,
      "iss": 1,
      "jti": "2NFKm5FfEey65wIArBQAz289hDgf/E0gjnyXrNCM0v4",
      "sid": "25JzmlUBAwtMfQvT7qmOalw5M7p",
      "sub": "KpQxO5glyv04Ad1"
    }
        

    Security

    • Digest authentication protects the password only
    • The transferred data itself is completely unprotected
    • HTTP itself has no mechanism to protect data
    • HTTP over SSL/TLS is the popular solution
      • An encrypted tunnel between client & server
      • Send HTTP traffic through the tunnel
