Application Layer
2.1 Principles of network applications
Some network apps:
e-mail
web
text messaging
remote login
P2P file sharing
multi-user network games
streaming stored video (YouTube, Hulu, Netflix)
voice over IP (e.g., Skype)
real-time video conferencing
social networking
search
At the core of network application development is writing programs that run on different end systems and communicate
with each other over the network. For example, in the Web application there are two distinct programs that communicate
with each other: the browser program running in the user’s host (desktop, laptop, tablet, smartphone, and so on); and the
Web server program running in the Web server host.
Importantly, you do not need to write software that runs on network-core devices, such as routers or link-layer switches.
Even if you wanted to write application software for these network-core devices, you wouldn’t be able to do so. Network-core
devices do not function at the application layer but instead function at lower layers, specifically at the network layer
and below. This basic design, namely confining application software to the end systems, has facilitated the rapid
development and deployment of a vast array of network applications.
Possible structure of applications:
client-server
peer-to-peer (P2P)
Client-server architecture
Server:
In a client-server architecture, there is an always-on host, called the server, which services requests from many other
hosts, called clients. Another characteristic of the client-server architecture is that the server has a fixed, well-known
address, called an IP address.
Some of the better-known applications with a client-server architecture include the Web, FTP, Telnet, and e-mail.
A data center, housing a large number of hosts, is often used to create a powerful virtual server. The most popular
Internet services, such as search engines (e.g., Google and Bing), Internet commerce (e.g., Amazon and eBay), Web-based
email (e.g., Gmail and Yahoo Mail), and social networking (e.g., Facebook and Twitter), employ one or more data
centers.
Client:
communicate with server
may be intermittently connected
may have dynamic IP addresses
do not communicate directly with each other
P2P architecture:
In a P2P architecture, there is minimal (or no) reliance on dedicated servers in data centers. Instead the application
exploits direct communication between pairs of intermittently connected hosts, called peers. The peers are not owned by
the service provider, but are instead desktops and laptops controlled by users, with most of the peers residing in homes,
universities, and offices. Because the peers communicate without passing through a dedicated server, the architecture is
called peer-to-peer. Many of today’s most popular and traffic-intensive applications are based on P2P architectures. These
applications include file sharing (e.g., BitTorrent), peer-assisted download acceleration (e.g., Xunlei), Internet Telephony
(e.g., Skype), and IPTV.
For many instant messaging applications, servers are used to track the IP addresses of users, but user-to-user messages are
sent directly between user hosts (without passing through intermediate servers).
One of the most compelling features of P2P architectures is their self-scalability. For example, in a P2P file-sharing
application, although each peer generates workload by requesting files, each peer also adds service capacity to the system
by distributing files to other peers. P2P architectures are also cost effective, since they normally don’t require significant
server infrastructure and server bandwidth.
Processes communicating
A process can be thought of as a program that is running within an end system. When processes are running on the same
end system, they can communicate with each other with interprocess communication.
Processes on two different end systems communicate with each other by exchanging messages across the computer
network.
In the context of a communication session between a pair of processes, the process that initiates the communication is
labeled as the client. The process that waits to be contacted to begin the session is the server.
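Applications with P2P architectures also have client processes and server processes: in any given session, one peer initiates the communication (client) and the other waits to be contacted (server).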
Sockets
Any message sent from one process to another must go through the underlying network. A process sends messages into,
and receives messages from, the network through a software interface called a socket.
A process is analogous to a house and its socket is analogous to its door. When a process wants to send a message to
another process on another host, it shoves the message out its door (socket). This sending process assumes that there is a
transportation infrastructure on the other side of its door that will transport the message to the door of the destination
process. Once the message arrives at the destination host, the message passes through the receiving process’s door
(socket), and the receiving process then acts on the message.
Addressing Processes
In order for a process running on one host to send packets to a process running on another host, the receiving process
needs to have an address. To identify the receiving process, two pieces of information need to be specified: (1) the address
of the host and (2) an identifier that specifies the receiving process in the destination host.
In the Internet, the host is identified by its IP address. An IP address (in IPv4) is a 32-bit quantity that we can think of as uniquely
identifying the host. In addition to knowing the address of the host to which a message is destined, the sending process
must also identify the receiving process running in the host. This information is needed because in general a host could
be running many network applications. A destination port number serves this purpose.
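The addressing idea above (host address plus port number) can be sketched with two UDP sockets on the same machine. This is only a minimal illustration: the function name `udp_round_trip` is hypothetical, the loopback address 127.0.0.1 stands in for a remote host, and port 0 asks the operating system to pick any free port.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a receiving socket identified by (IP address, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        # The receiving process attaches its "door" to an address and a port.
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        receiver.settimeout(2.0)
        host, port = receiver.getsockname()

        # The sending process names the destination by host address and port number.
        sender.sendto(message, (host, port))

        data, _sender_address = receiver.recvfrom(4096)
        return data
```

The pair `(host, port)` returned by `getsockname()` is exactly the two pieces of information described above: the address of the host and the identifier of the receiving process on that host.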
Application-Layer Protocols
An application-layer protocol defines how an application’s processes, running on different end systems, pass messages
to each other. In particular, an application-layer protocol defines:
The types of messages exchanged, for example, request messages and response messages
The syntax of the various message types, such as the fields in the message and how the fields are delineated
The semantics of the fields, that is, the meaning of the information in the fields
Rules for determining when and how a process sends messages and responds to messages
Some application-layer protocols are specified in RFCs and are therefore in the public domain. For example, the Web’s
application-layer protocol, HTTP, is available as an RFC. If a browser developer follows the rules of the HTTP RFC, the
browser will be able to retrieve Web pages from any Web server that has also followed the rules of the HTTP RFC. Many
other application-layer protocols are proprietary and intentionally not available in the public domain. For example, Skype
and Zoom use proprietary application-layer protocols.
What transport service does an app need?
Data integrity
some apps (e.g., file transfer, web transactions) require 100% reliable data transfer
other apps (e.g., audio) can tolerate some loss
Throughput
some apps (e.g., multimedia) require minimum amount of throughput to be “effective”
other apps (“elastic apps”) make use of whatever throughput they get
Security
encryption, data integrity
Timing
some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”
The Internet (and, more generally, TCP/IP networks) makes two transport protocols available to applications, UDP and
TCP.
TCP service:
reliable transport between sending and receiving process
flow control: sender won’t overwhelm receiver
congestion control: throttle sender when network overloaded
does not provide: timing, minimum throughput guarantee, security
connection-oriented: setup required between client and server processes
UDP service:
unreliable data transfer between sending and receiving process
does not provide: reliability, flow control, congestion control, timing, throughput guarantee, security, or connection setup
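The connection setup that TCP requires can be sketched as follows. This is a minimal loopback example, not a production server: the function name `tcp_echo_once` is hypothetical, the server handles a single client in a background thread, and it replies with the uppercased message simply to show that data flowed both ways.

```python
import socket
import threading


def tcp_echo_once(message: bytes) -> bytes:
    """Run a one-shot TCP server and client on loopback and return the server's reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    server.listen(1)               # the server waits to be contacted
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _client_address = server.accept()  # completes the connection setup
        with conn:
            conn.sendall(conn.recv(4096).upper())

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()

    # The client initiates the connection before any application data is sent.
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
        client.sendall(message)
        reply = client.recv(4096)

    worker.join(timeout=2.0)
    server.close()
    return reply
```

With UDP there is no `listen`, `accept`, or `create_connection` step; a datagram is simply sent to an address, with no delivery guarantee.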
SECURING TCP
Neither TCP nor UDP provides any encryption: the data that the sending process passes into its socket is the same data that
travels over the network to the destination process. So, for example, if the sending process sends a password in cleartext (i.e.,
unencrypted) into its socket, the cleartext password will travel over all the links between sender and receiver, potentially getting
sniffed and discovered at any of the intervening links.
Because privacy and other security issues have become critical for many applications, the Internet community has developed an
enhancement for TCP, called Secure Sockets Layer (SSL), whose standardized successor is Transport Layer Security (TLS). TCP-enhanced-with-SSL not only does everything that traditional
TCP does but also provides critical process-to-process security services, including encryption, data integrity, and end-point
authentication. We emphasise that SSL is not a third Internet transport protocol, on the same level as TCP and UDP, but instead
is an enhancement of TCP.
When an application uses SSL, a cleartext password sent into the socket is encrypted before it traverses the Internet.
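As a rough sketch of "TCP enhanced with SSL", Python's standard `ssl` module (which implements TLS, the successor to SSL) wraps an ordinary TCP socket. The function names below are hypothetical, and `open_secure_connection` needs a reachable server, so it is defined here but not called.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Create a client context with certificate and hostname checking enabled."""
    return ssl.create_default_context()


def open_secure_connection(hostname: str, port: int = 443) -> ssl.SSLSocket:
    """Open a plain TCP connection, then layer encryption on top of it."""
    context = make_tls_context()
    tcp_socket = socket.create_connection((hostname, port), timeout=5.0)
    # Everything the application writes to the wrapped socket is encrypted
    # before it is handed to TCP.
    return context.wrap_socket(tcp_socket, server_hostname=hostname)
```

The application still reads and writes a socket; the security services sit between the application and TCP rather than replacing TCP.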
2.2 Web and HTTP
Overview of HTTP
The HyperText Transfer Protocol (HTTP), the Web’s application-layer protocol, is at the heart of the Web. It is defined
in [RFC 1945] and [RFC 2616]. HTTP is implemented in two programs: a client program and a server program. The client
program and server program, executing on different end systems, talk to each other by exchanging HTTP messages. HTTP
defines the structure of these messages and how the client and server exchange the messages.
A Web page (also called a document) consists of objects. An object is simply a file—such as an HTML file, a JPEG
image, a Java applet, or a video clip—that is addressable by a single URL. Most Web pages consist of a base HTML file
and several referenced objects. For example, if a Web page contains HTML text and five JPEG images, then the Web page
has six objects: the base HTML file plus the five images. The base HTML file references the other objects in the page with
the objects’ URLs. Each URL has two components: the hostname of the server that houses the object and the object’s path
name.
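The two URL components named above can be pulled apart with Python's standard `urllib.parse` module. This is a small illustration using the example URL that appears later in these notes; the helper name `split_url` is hypothetical.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple[str, str]:
    """Return (hostname part, path name) of an HTTP URL."""
    parts = urlsplit(url)
    return parts.netloc, parts.path


host, path = split_url("http://www.someSchool.edu/someDepartment/home.index")
print(host)  # www.someSchool.edu
print(path)  # /someDepartment/home.index
```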
It’s a client/server model
client: browser that requests, receives, (using HTTP protocol) and “displays” Web objects
server: Web server sends (using HTTP protocol) objects in response to requests
HTTP uses TCP as its underlying transport protocol (rather than running on top of UDP).
The HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and the
server processes access TCP through their socket interfaces.
Because an HTTP server maintains no information about the clients, HTTP is said to be a stateless protocol.
HTTP connections
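Each request/response pair can be sent over a separate TCP connection, or all of the requests and their responses can be
sent over the same TCP connection. In the former approach, the application is said to use non-persistent connections; and
in the latter approach, persistent connections.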
non-persistent HTTP
at most one object sent over TCP connection
connection then closed
downloading multiple objects requires multiple connections
persistent HTTP
multiple objects can be sent over single TCP connection between client, server
HTTP with Non-Persistent Connections
Let’s suppose the page consists of a base HTML file and 10 JPEG images, and that all 11 of these objects reside on the
same server. Further suppose the URL for the base HTML file is
http://www.someSchool.edu/someDepartment/home.index
Here is what happens:
1. The HTTP client process initiates a TCP connection to the server www.someSchool.edu on port number 80,
which is the default port number for HTTP. Associated with the TCP connection, there will be a socket at the
client and a socket at the server.
2. The HTTP client sends an HTTP request message to the server via its socket. The request message includes the path name
/someDepartment/home.index. (We will discuss HTTP messages in some detail below.)
3. The HTTP server process receives the request message via its socket, retrieves the object
/someDepartment/home.index from its storage (RAM or disk), encapsulates the object in an HTTP response
message, and sends the response message to the client via its socket.
4. The HTTP server process tells TCP to close the TCP connection. (But TCP doesn’t actually terminate the
connection until it knows for sure that the client has received the response message intact.)
5. The HTTP client receives the response message. The TCP connection terminates. The message indicates that
the encapsulated object is an HTML file. The client extracts the file from the response message, examines the
HTML file, and finds references to the 10 JPEG objects.
6. The first four steps are then repeated for each of the referenced JPEG objects.
The round-trip time (RTT) is the time it takes for a small packet to travel from client to server and then back to the
client. The RTT includes packet-propagation delays, packet-queuing delays in intermediate routers and switches, and
packet-processing delays.
Initiating a TCP connection involves a “three-way handshake”: the client sends a small TCP segment to the server, the server acknowledges and responds with a
small TCP segment, and, finally, the client acknowledges back to the server. The first two parts of the three-way
handshake take one RTT. After completing the first two parts of the handshake, the client sends the HTTP request
message combined with the third part of the three-way handshake (the acknowledgment) into the TCP connection. Once
the request message arrives at the server, the server sends the HTML file into the TCP connection. This HTTP
request/response eats up another RTT. Thus, roughly, the total response time is two RTTs plus the transmission time at the
server of the HTML file.
Persistent HTTP (HTTP 1.1)
non-persistent HTTP issues:
requires 2 RTTs per object
OS overhead for each TCP connection
browsers often open parallel TCP connections to fetch referenced objects
persistent HTTP (HTTP 1.1):
With persistent connections, the server leaves the TCP connection open after sending a response. Subsequent requests and
responses between the same client and server can be sent over the same connection.
client sends requests as soon as it encounters a referenced object
as little as one RTT for all the referenced objects
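The response-time reasoning above can be turned into a rough back-of-the-envelope calculation. The model below is a simplification under stated assumptions: objects are fetched one after another (no parallel connections), every object has the same transmission time, and the numbers (100 ms RTT, 10 ms per object) are made up for illustration. Function names are hypothetical.

```python
def non_persistent_serial_ms(rtt_ms: int, transmit_ms: int, n_objects: int) -> int:
    """Every object pays one RTT for the handshake and one RTT for request/response."""
    return n_objects * (2 * rtt_ms + transmit_ms)


def persistent_pipelined_ms(rtt_ms: int, transmit_ms: int, n_referenced: int) -> int:
    """One connection: handshake plus base file, then all referenced objects in about one RTT."""
    base_file = 2 * rtt_ms + transmit_ms
    if n_referenced == 0:
        return base_file
    return base_file + rtt_ms + n_referenced * transmit_ms


# Base HTML file plus 10 JPEG images, as in the example page above.
print(non_persistent_serial_ms(100, 10, 11))  # 2310
print(persistent_pipelined_ms(100, 10, 10))   # 410
```

Under these assumptions the persistent, pipelined case saves roughly two RTTs for each referenced object, which is why browsers using non-persistent connections tend to open several connections in parallel.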
There are two types of HTTP messages, request messages and response messages
HTTP Request Message
Below we provide a typical HTTP request message:
GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
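To make the message syntax concrete, here is a minimal sketch that builds and parses a request like the one above. It handles only a request line and simple header lines separated by CRLF; the function names are hypothetical and this is not a full HTTP parser.

```python
CRLF = "\r\n"


def build_request(method: str, path: str, host: str) -> str:
    """Build a request line plus a Host header, ended by a blank line."""
    return f"{method} {path} HTTP/1.1{CRLF}Host: {host}{CRLF}{CRLF}"


def parse_request(raw: str) -> tuple[str, str, str, dict[str, str]]:
    """Split a request into method, path, version, and a dict of header fields."""
    head = raw.split(CRLF + CRLF, 1)[0]
    lines = head.split(CRLF)
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return method, path, version, headers


raw = build_request("GET", "/somedir/page.html", "www.someschool.edu")
print(parse_request(raw))
```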
POST method:
web page often includes form input
user input sent from client to server in entity body of HTTP POST request message
GET method (for sending data to server):
include user data in URL field of HTTP GET request message (following a ‘?’)
HEAD method:
requests headers (only) that would be returned if specified URL were requested with an HTTP GET method
PUT method:
uploads new file (object) to server
completely replaces file that exists at specified URL
Method types
HTTP/1.0:
GET
POST
HEAD: asks server to leave requested object out of response
HTTP/1.1:
GET, POST, HEAD
PUT: uploads file in entity body to path specified in URL field
DELETE: deletes file specified in the URL field
HTTP response status codes
200 OK: request succeeded, requested object later in this msg
301 Moved Permanently: requested object moved, new location specified later in this msg (Location:)
400 Bad Request: request msg not understood by server
404 Not Found: requested document not found on this server
505 HTTP Version Not Supported
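The status codes listed above are available in Python's standard `http.HTTPStatus` enumeration. The sketch below parses a status line into its parts; the function name `parse_status_line` is hypothetical, and the sample status lines are illustrative.

```python
from http import HTTPStatus


def parse_status_line(line: str) -> tuple[str, HTTPStatus, str]:
    """Split a status line such as 'HTTP/1.1 404 Not Found' into its three fields."""
    version, code, phrase = line.split(" ", 2)
    return version, HTTPStatus(int(code)), phrase


for sample in ("HTTP/1.1 200 OK", "HTTP/1.1 301 Moved Permanently", "HTTP/1.1 404 Not Found"):
    version, status, phrase = parse_status_line(sample)
    print(version, status.value, status.phrase)
```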
HTTP/2
HTTP/2: [RFC 7540, 2015] increased flexibility at server in sending objects to client:
methods, status codes, most header fields unchanged from HTTP 1.1
transmission order of requested objects based on client-specified object priority (not necessarily FCFS)
push unrequested objects to client
divide objects into frames, schedule frames to mitigate HOL blocking
HTTP/2 to HTTP/3
HTTP/2 over single TCP connection means:
recovery from packet loss still stalls all object transmissions
as in HTTP 1.1, browsers have incentive to open multiple parallel TCP connections to reduce stalling, increase overall
throughput
no security over vanilla TCP connection
HTTP/3: adds security, per-object error and congestion control (more pipelining) over UDP
more on HTTP/3 in transport layer
Cookies
Cookie technology has four components:
(1) a cookie header line in the HTTP response message;
(2) a cookie header line in the HTTP request message;
(3) a cookie file kept on the user’s end system and managed by the user’s browser; and
(4) a back-end database at the Web site.
what cookies can be used for:
authorization
shopping carts
recommendations
user session state (Web e-mail)
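The first two cookie components (the header line in the response and the header line in later requests) can be sketched with Python's standard `http.cookies` module. The cookie name `id` and value `1678` are made-up illustration values, and the function names are hypothetical.

```python
from http.cookies import SimpleCookie


def set_cookie_header(name: str, value: str) -> str:
    """Server side: produce the header line that goes into the HTTP response."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()


def read_cookie_header(header_value: str) -> dict[str, str]:
    """Server side, later request: recover the values the browser sent back."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


print(set_cookie_header("id", "1678"))  # Set-Cookie: id=1678
print(read_cookie_header("id=1678"))    # {'id': '1678'}
```

The identifier returned by `read_cookie_header` is what a site would use as the key into its back-end database (component 4).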
Web Caching
A Web cache—also called a proxy server—is a network entity that satisfies HTTP requests on the behalf of an origin
Web server. The Web cache has its own disk storage and keeps copies of recently requested objects in this storage.
A cache is both a server and a client at the same time. When it receives requests from and sends responses to a browser, it
is a server. When it sends requests to and receives responses from an origin server, it is a client.
Typically a Web cache is purchased and installed by an ISP.
Web caching has seen deployment in the Internet for these reasons:
reduce response time for client request
reduce traffic on an institution’s access link
Internet dense with caches: enables “poor” content providers to effectively deliver content (so too does P2P file
sharing)
The Conditional GET
Although caching can reduce user-perceived response times, it introduces a new problem: the copy of an object residing
in the cache may be stale. In other words, the object housed in the Web server may have been modified since the copy was
cached at the client. Fortunately, HTTP has a mechanism that allows a cache to verify that its objects are up to date. This
mechanism is called the conditional GET.
An HTTP request message is a so-called conditional GET message if
(1) the request message uses the GET method and
(2) the request message includes an If-Modified-Since: header line.
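The conditional GET exchange can be sketched as follows. This is a simplified model, not a real cache: timestamps are plain seconds since the epoch, the function names are hypothetical, and the origin server answers with status 304 (Not Modified, a standard HTTP status code) when the object has not changed and 200 otherwise.

```python
from email.utils import formatdate, parsedate_to_datetime

CRLF = "\r\n"


def conditional_get(path: str, host: str, cached_at: int) -> str:
    """Cache side: a GET request that carries an If-Modified-Since header."""
    since = formatdate(cached_at, usegmt=True)  # e.g. 'Thu, 01 Jan 1970 00:16:40 GMT'
    return (
        f"GET {path} HTTP/1.1{CRLF}"
        f"Host: {host}{CRLF}"
        f"If-Modified-Since: {since}{CRLF}{CRLF}"
    )


def origin_status(request: str, last_modified: int) -> int:
    """Origin side: 304 if the object is unchanged since the given date, else 200."""
    for line in request.split(CRLF):
        if line.lower().startswith("if-modified-since:"):
            value = line.split(":", 1)[1].strip()
            since = parsedate_to_datetime(value).timestamp()
            return 304 if last_modified <= since else 200
    return 200  # an ordinary GET always gets the full object


request = conditional_get("/somedir/page.html", "www.someschool.edu", cached_at=1000)
print(origin_status(request, last_modified=900))   # 304: cached copy is still fresh
print(origin_status(request, last_modified=2000))  # 200: object changed, send it again
```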
2.3 Electronic mail
Electronic mail has been around since the beginning of the Internet. It was the most popular application when the Internet
was in its infancy and has become more and more elaborate and powerful over the years. It remains one of the Internet’s
most important and utilized applications. It has three major components: user agents, mail servers, and the Simple Mail
Transfer Protocol (SMTP).
Example: Alice, sending an e-mail message to a recipient, Bob
User agents allow users to read, reply to, forward, save, and compose messages. Microsoft Outlook and Apple Mail are
examples of user agents for e-mail.
Outgoing and incoming messages are stored on the mail server.
Mail servers form the core of the e-mail infrastructure. Each recipient, such as Bob, has a mailbox located in one of the
mail servers. Bob’s mailbox manages and maintains the messages that have been sent to him. A typical message starts its
journey in the sender’s user agent, travels to the sender’s mail server, and travels to the recipient’s mail server, where it is
deposited in the recipient’s mailbox. If Alice’s server cannot deliver mail to Bob’s server, Alice’s server holds the
message in a message queue and attempts to transfer the message later.
SMTP
SMTP is the principal application-layer protocol for Internet electronic mail. It uses the reliable data transfer service of TCP to
transfer mail from the sender’s mail server to the recipient’s mail server. As with most application-layer protocols, SMTP
has two sides: a client side, which executes on the sender’s mail server, and a server side, which executes on the
recipient’s mail server. Both the client and server sides of SMTP run on every mail server. When a mail server sends mail
to other mail servers, it acts as an SMTP client. When a mail server receives mail from other mail servers, it acts as an
SMTP server.
direct transfer: sending server to receiving server
three phases of transfer
• handshaking (greeting)
• transfer of messages
• closure
command/response interaction (like HTTP)
• commands: ASCII text
• response: status code and phrase
SMTP uses persistent connections, and the SMTP server uses CRLF.CRLF to determine the end of a message.
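The three phases and the CRLF.CRLF terminator can be sketched as the sequence of ASCII lines a client would send. This is a simplified illustration, not a working mail client: it shows only the client side, omits server replies and dot-stuffing, and the hostnames and addresses are placeholders. HELO, MAIL FROM, RCPT TO, DATA, and QUIT are standard SMTP commands; the function name is hypothetical.

```python
CRLF = "\r\n"


def smtp_client_lines(client_host: str, sender: str, recipient: str,
                      body_lines: list[str]) -> list[str]:
    """Return what the client writes to the TCP connection, in order."""
    # The message ends with a line containing only a period: CRLF . CRLF
    message = CRLF.join(body_lines) + CRLF + "." + CRLF
    return [
        f"HELO {client_host}{CRLF}",       # handshaking (greeting)
        f"MAIL FROM:<{sender}>{CRLF}",     # transfer of messages
        f"RCPT TO:<{recipient}>{CRLF}",
        f"DATA{CRLF}",
        message,
        f"QUIT{CRLF}",                     # closure
    ]


for line in smtp_client_lines("mail.example.org", "alice@example.org",
                              "bob@example.com", ["Hello Bob,", "See you soon."]):
    print(repr(line))
```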
Comparing with HTTP:
HTTP: pull
SMTP: push
both have ASCII command/response interaction, status codes
HTTP: each object encapsulated in its own response message
SMTP: multiple objects sent in multipart message
2.4 DNS—The Internet’s Directory Service
Just as humans can be identified in many ways, so too can Internet hosts. One identifier for a host is its hostname.
Because hostnames can consist of variable-length alphanumeric characters, they would be difficult to process by routers.
For these reasons, hosts are also identified by so-called IP addresses.
The DNS is
(1) a distributed database implemented in a hierarchy of DNS servers, and
(2) an application-layer protocol that allows hosts to query the distributed database.
DNS is commonly employed by other application-layer protocols—including HTTP, SMTP, and FTP—to translate user-
supplied hostnames to IP addresses.
Host aliasing. A host with a complicated hostname can have one or more alias names. For example, a hostname such as
relay1.west-coast.enterprise.com could have, say, two aliases such as enterprise.com and www.enterprise.com. In this
case, the hostname relay1.west-coast.enterprise.com is said to be a canonical hostname. Alias hostnames, when present,
are typically more mnemonic than canonical hostnames.
Mail server aliasing. For obvious reasons, it is highly desirable that e-mail addresses be mnemonic. For example, if Bob
has an account with Hotmail, Bob’s e-mail address might be as simple as bob@hotmail.com. However, the hostname of
the Hotmail mail server is more complicated and much less mnemonic than simply hotmail.com (for example, the
canonical hostname might be something like relay1.west-coast.hotmail.com).
Load distribution. DNS is also used to perform load distribution among replicated servers, such as replicated Web
servers: many IP addresses correspond to one name.
Why not centralize DNS?
Single point of failure. If the DNS server crashes, so does the entire Internet!
Traffic volume. A single DNS server would have to handle all DNS queries (for all the HTTP requests and e-
mail messages generated from hundreds of millions of hosts).
Distant centralized database. A single DNS server cannot be “close to” all the querying clients.
Maintenance. The single DNS server would have to keep records for all Internet hosts.
DNS: a distributed, hierarchical database
In order to deal with the issue of scale, the DNS uses a large number of servers, organized in a hierarchical fashion and
distributed around the world. To a first approximation, there are three classes of DNS servers (root DNS servers, top-level
domain (TLD) DNS servers, and authoritative DNS servers), organized in a hierarchy as shown in Figure 2.19. To
understand how these three classes of servers interact, suppose a DNS client wants to determine the IP address for the
hostname www.amazon.com. The following events will take place. The client first contacts one
of the root servers, which returns IP addresses for TLD servers for the top-level domain com. The client then contacts one
of these TLD servers, which returns the IP address of an authoritative server for amazon.com. Finally, the client contacts
one of the authoritative servers for amazon.com, which returns the IP address for the hostname www.amazon.com.
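The three-step lookup described above can be mimicked with three small lookup tables. This is a toy simulation only: no real DNS messages are sent, the server names are invented labels, and 192.0.2.10 is an address from a range reserved for documentation, not the real address of any host.

```python
# Each table stands in for one class of DNS server (all entries are hypothetical).
ROOT_SERVER = {"com": "tld-server-com"}
TLD_SERVERS = {"tld-server-com": {"amazon.com": "auth-server-amazon"}}
AUTHORITATIVE_SERVERS = {"auth-server-amazon": {"www.amazon.com": "192.0.2.10"}}


def resolve(hostname: str) -> tuple[str, list[tuple[str, str]]]:
    """Walk root -> TLD -> authoritative and record what each step returned."""
    labels = hostname.split(".")
    top_level = labels[-1]            # e.g. "com"
    domain = ".".join(labels[-2:])    # e.g. "amazon.com"
    steps = []

    tld_server = ROOT_SERVER[top_level]
    steps.append(("root", tld_server))

    auth_server = TLD_SERVERS[tld_server][domain]
    steps.append(("tld", auth_server))

    address = AUTHORITATIVE_SERVERS[auth_server][hostname]
    steps.append(("authoritative", address))
    return address, steps


address, steps = resolve("www.amazon.com")
print(address)
for server_class, answer in steps:
    print(server_class, "->", answer)
```

Each step returns only a pointer to the next server; no single table needs to know the whole mapping, which is the point of distributing the database.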
Root DNS servers. In the Internet there are 13 root DNS servers.
contacted by local name server that cannot resolve name
root name server:
very important in the Internet
provides IP addresses of TLD servers
Top-level domain (TLD) servers. These servers are responsible for top-level domains such as com, org, net, edu, and
gov, and all of the country top-level domains such as uk, fr, ca, and jp. The company Verisign Global Registry Services
maintains the TLD servers for the com top-level domain, and the company Educause maintains the TLD servers for the
edu top-level domain. See [IANA TLD 2012] for a list of all top-level domains.
Authoritative DNS servers. Every organization with publicly accessible hosts (such as Web servers and mail servers) on
the Internet must provide publicly accessible DNS records that map the names of those hosts to IP addresses. An
organization’s authoritative DNS server houses these DNS records. An organization can choose to implement its own
authoritative DNS server to hold these records; alternatively, the organization can pay to have these records stored in an
authoritative DNS server of some service provider.
There is another important type of DNS server called the local DNS server. A local DNS server does not strictly belong to
the hierarchy of servers but is nevertheless central to the DNS architecture. Each ISP (such as a university, an academic
department, an employee’s company, or a residential ISP) has a local DNS server (also called a default name server).
When a host makes a DNS query, the query is sent to the local DNS server, which acts as a proxy, forwarding the query into
the DNS server hierarchy.