The application layer

Principles of Network Applications

Communication paradigm

Client-server paradigm: server:

always-on host
permanent IP address
often in data centers, for scaling clients:
contact, communicate with server
may be intermittently connected
may have dynamic IP addresses
do not communicate directly with each other examples: HTTP, IMAP, FTP

Peer-peer architecture:

no always-on server
arbitrary end systems directly communicate
peers request service from other peers, provide service in return to other peers
peers are intermittently connected and change IP addresses example: P2P file sharing [BitTorrent]

Sockets

Process: program running within a host(resources management and allocation)

Addressing: IP address + port
Communication:
within same host: IPC(inter-process communication)
different host: exchanging messages

Sockets

Protocols

What a protocol defines

Transport protocol

Transport requirement of different apps

Sockets

Securing TCP:

Vanilla TCP & UDP sockets:

no encryption
cleartext passwords sent into socket traverse Internet in cleartext (!)

Transport Layer Security (TLS):

provides encrypted TCP connections
data integrity
end-point authentication

Web and HTTP

HTTP

HTTP: hypertext transfer protocol

Web's application-layer protocol
client/server model
stateless: server maintains no infomation about past client requests(so we need cookie)
HTTP uses TCP in the transport-layer

There are two types of HTTP connections:

Non-persistent HTTP: at most one object sent over TCP connection. It is used by HTTP/1.0 by default.
Persistent HTTP: multiple objects can be sent over single TCP connection between client, and that server. Used by HTTP/1.1 by default.

Non-persistent HTTP example:

Non-persistent HTTP example

But it is bothering to initiate TCP every time we want to sent a request because it takes time and gets OS overhead:

Response time for Non-persistent HTTP

RTT

RTT is short for roundtrip time

Therefore, we use persistent HTTP(HTTP1.1) more. Property of persistent HTTP (HTTP1.1):

server leaves connection open after sending response
subsequent HTTP messages between same client/server sent over open connection
client sends requests as soon as it encounters a referenced object
as little as one RTT for all the referenced objects (cutting response time in half)

HTTP message

Request message HTTP request message general format:

HTTP request message general format

Other HTTP request messages:

GET method: include user data in URL field of HTTP GET request message (following a ‘?’)
POST method: for web page including form input
PUT method: uploads new file (object) to server, and completely replaces file that exists at specified URL with content in entity body of POST HTTP request message
HEAD method: request the headers of a resource, but without the actual content (body) of the resource, similar to the GET method but without getting response body

Response message:

Format:
Status line: HTTP version Status code Status message
Headers: Content-type etc
Body(optional): html, jpeg, etc

Cookies

Why we need cookies:

HTTP GET/response interaction is stateless:

no need for client/server to track “state” of multi-step exchange
all HTTP requests are independent of each other
no need for client/server to “recover” from a partially-completed-but-never-completely-completed transaction

Considering stateful one protocol:

A stateful protocol

What is Cookie

Cookie is a small piece of data sent from a server to a client's web browser when client requests server for the first time and it is stored on the client side. Format: Set-Cookie: <cookie-name>=<cookie-value>; <attributes>

What it can do:

track user behavior on a given website (first party cookies)
track user behavior across multiple websites (third party cookies) without user ever choosing to visit tracker site (!)

How third party cookies(from websites you didn't choose to visit) track users' behavior?

The third-party: like ad.com. When you visit example.com, it returns you cookies and you store it locally. This is the first party cookie. But at the same time you will request ad.com in some way and they send cookies to you and store them.

When you visit example2.com, there is still ad there so you send request to ad.com and they store the cookies recording you have visited example2.com.

In this way, the ad.com know you have visited example.com, example2.com and so on. It knows more about you and could give you personalized ads on sites with ad.com resourses embedded in.

First and third party cookies

Cookie tracking user's behavior

Tip

In chrome the cookie of a specific site we are visiting can be seen by using application in developer tools.

GDPR

When cookies can identify an individual, cookies are considered personal data, subject to GDPR(general data protection regulation) personal data regulations.

Web caches

caching's goal

To satisfy client requests without involving origin server. Then this can reduce response time for client request and reduce traffic on an institution's access link.

How it works:

user configures browser to point to a (local) Web cache and browser sends all HTTP requests to cache
if object in cache: cache returns object to client
else cache requests object from origin server, caches received object, then returns object to client

Web caches

Cache-Control

There is Cache-Control section in both HTTP request and response headers.

In request header, used to indicate how the client (usually a browser) wants to interact with caches when requesting a resource.
In response header: used by the server to specify caching instructions for the client (browser) and intermediate caches (like CDNs or proxy servers). Cache-Control: private: the response is specific to a single user and should not be cached by shared caches (e.g., proxy servers or CDNs). However, it can be cached in the browser's local cache. Cache-Control: public: the response can be cached by both private and shared caches. This is typically used for resources that are meant to be publicly available, such as images or static files.

HTTP/2

Goal: decrease delay in multi-object HTTP requests

Property: increased flexibility at server in sending objects to client:

methods, status codes, most header fields unchanged from HTTP 1.1
transmission order of requested objects based on client-specified object priority (not necessarily FCFS)
push unrequested objects to client
divide objects into frames, schedule frames to mitigate HOL(head-of-line) blocking(first large packets block the latter small ones)

E-mail

3 main components: user agent(mail reader), mail severs, SMTP(simple mail transfer protocol)

DNS(domain name system)

Services

services:

translate hostname to IP address
host/mailserver aliasing

Structure

DNS structure

root name servers: official, contact-of-last-resort by name servers that can not resolve name
Top-Level domain(TLD) severs:
responsible for .com, .org, .net, .edu, .aero, .jobs, .museums, and all top-level country domains, e.g.: .cn, .uk, .fr, .ca, .jp
Network Solutions: authoritative registry for .com, .net TLD
Educause: .edu TLD
Authoritative DNS servers: organization’s own DNS server(s), providing authoritative hostname to IP mappings for organization’s named hosts.

Iterative DNS name resolution

Recursive DNS name resotution

Caching

Once (any) name server learns mapping, it caches mapping, and immediately returns a cached mapping in response to a query.

caching improves response time
cache entries timeout (disappear) after some time (TTL)
TLD servers typically cached in local name servers

DNS records

DNS: distributed database storing resource records(RR)

RR format:(name, value, type, ttl)

Common record types:

A record(address record)
name: domain name
value: IPv4 address
AAAA record(IPv6 address record)
name: domain name
value: IPv6
NS record(name server)
name: domain
value: hostname of authoritative name server for this domain
CNAME
name: alias name for some "canonical"(the real) name(another domain name who has its own records)
value: canonical name
MX record(mail exchange record)
name: domain name
value: name of SMTP mail server associated with the domain name

DNS protocol messages:

DNS message

An DNS response message in wireshark(results with display filter: dns)

DNS message in wireshark

Useful commands

commands

There are some useful commands about DNS: nslookup –option1 –option2 host-to-find dns-server nslookup is used to resolved the IP address of a specific domain. With options like: nslookup –type=NS zju.edu.cn. This gives the DNS servers and theire IP of domain zju.edu.cn. ipconfig \display dns will display the cached DNS records on your computer. ipconfig \flushdns will flush these DNS records.

P2P applications

P2P architecture

Basic idea

no always-on server
arbitrary end systems directly communicate
peers request service from other peers, provide service in return to other peers
self scalability – new peers bring new service capacity, and new service demands
peers are intermittently connected and change IP addresses
complex management
examples: P2P file sharing (BitTorrent), streaming (KanKan), VoIP (Skype)

File distribution time

For the following discussion: F is the file size and u and v are the upload and download speed respectively.

Client-Server model

P2P model

BitTorrent

A peer joining torrent:
has no chunks, but will accumulate them over time from other peers
registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
while downloading, peer uploads chunks to other peers
peer may change peers with whom it exchanges chunks
churn: peers may come and go
once peer has entire file, it may (selfishly) leave or (altruistically) remain in torrent

Requesting chunks: At any given time, different peers have different subsets of file chunks. Periodically, Alice asks each peer for list of chunks that they have. Alice requests missing chunks from peers, rarest first

Sending chunks: tit-for-tat

Alice sends chunks to those four peers currently sending her chunks at highest rate. Other peers are choked by Alice (do not receive chunks from her)
Re-evaluate top 4 every 10 secs
Every 30 secs: randomly select another peer, starts sending chunks, namely“optimistically unchoke” this peer and this newly chosen peer may join top 4
For higher upload rate: find better trading partners

Video streaming and CNDs

Challenges

server-to-client bandwidth will vary over time
packet loss, delay due to congestion will delay playout

Streaming at constant rate

continuous playout constraint: during client video playout, playout timig must match original timing

With client-side buffering:

Streaming at variable rate

DASH

Refer to Dynamic, Adaptive streaming over HTTP. It is an approach that allows a client to adapt the encoding rate of retrieved video to network congestion conditions.

server:
divides video file into multiple chunks
each chunk encoded at multiple different rates
different rate encodings stored in different files
files replicated in various CDN nodes
manifest file: provides URLs for different chunks
client:
periodically estimates server-to-client bandwidth
consulting manifest, requests one chunk at a time
- chooses maximum coding rate sustainable given current bandwidth
- can choose different coding rates at different points in time (depending on available bandwidth at time), and from different servers

General streaming

Streaming video = encoding + DASH + playout buffering

CND(content delivery network)

challenge: how to stream content (selected from millions of videos) to hundreds of thousands of simultaneous users?

store/serve multiple copies of videos at multiple geographically distributed sites (CDN)

enter deep: push CDN servers deep into many access networks
bring home: smaller number of larger clusters in POPs near access nets

An example of netflix

The application layer

Principles of Network Applications

Communication paradigm

Sockets

Protocols

Web and HTTP

HTTP

HTTP message

Cookies

Web caches

HTTP/2

E-mail

DNS(domain name system)

Services

Structure

Caching

DNS records

Useful commands

P2P applications

P2P architecture

File distribution time

BitTorrent

Video streaming and CNDs

Challenges

DASH

CND(content delivery network)

Socket Programming with UDP and TCP

Comments