CN 5
Why?
▪ In data networks, devices are assigned IP addresses so that they can participate in sending
and receiving messages over the network.
▪ Most people have a hard time remembering this numeric address.
▪ Domain names were created to convert the numeric address into a simple, recognizable
name.
What?
▪ The Domain Name System (DNS) is a core service of the Internet, and it is directly
responsible for the way web addresses are used.
▪ DNS is a supporting program that is used by other programs
such as e-mail.
Figure.15 shows an example of how a DNS client/server program can support an e-mail program
to find the IP address of an e-mail recipient. A user of an e-mail program may know the e-mail
address of the recipient; however, the IP protocol needs the IP address. The DNS client program
sends a request to a DNS server to map the e-mail address to the corresponding IP address.
Figure.15 Example of using the DNS services
Concept
2) If the address is not in the cache of DNS server 1, the query is passed to another DNS server.
Label
Each node in the tree has a label, which is a string with a maximum of 63 characters. The root
label is a null string (empty string). DNS requires that children of a node (nodes that branch from
the same node) have different labels, which guarantees the uniqueness of the domain names.
Figure.17 Domain Name Space
Domain Name
Each node in the tree has a domain name. A full domain name is a sequence of labels separated by
dots (.). The domain names are always read from the node up to the root. The last label is the label
of the root (null). This means that a full domain name always ends in a null label, which means
the last character is a dot because the null string is nothing.
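The label and domain name rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of any DNS library: the function name `full_domain_name` and the example labels are assumptions made for this sketch.

```python
MAX_LABEL_LEN = 63  # maximum label length stated in the text


def full_domain_name(labels):
    """Join labels (node first, root last) into a full domain name.

    `labels` lists the labels from the node up to, but not including,
    the root; the null root label is appended here.
    """
    for label in labels:
        if not label:
            raise ValueError("only the root may have an empty label")
        if len(label) > MAX_LABEL_LEN:
            raise ValueError(f"label too long: {label!r}")
    # The root label is the empty string, so joining with dots leaves
    # a trailing dot at the end of the name.
    name = ".".join(list(labels) + [""])
    return name or "."  # the root on its own is written as a single dot


print(full_domain_name(["www", "example", "com"]))
```

Because the root label is empty, the trailing dot appears without any special handling for ordinary names.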
Domain
A domain is a subtree of the domain name space. The name of the domain is the domain name of
the node at the top of the subtree. Figure.19 shows some domains. Note that a domain may itself
be divided into domains (or subdomains as they are sometimes called).
Figure.19 Domain
Zone
Since the complete domain name hierarchy cannot be stored on a single server, it is divided among
many servers. What a server is responsible for or has authority over is called a zone. We can define
a zone as a contiguous part of the entire tree. If a server accepts responsibility for a domain and
does not divide the domain into smaller domains, the domain and the zone refer to the same thing.
The server makes a database called a zone file and keeps all the information for every node under
that domain. However, if a server divides its domain into subdomains and delegates part of its
authority to other servers, domain and zone refer to different things.
Figure.21 Zones and Domains
Root Server
A root server is a server whose zone consists of the whole tree. A root server usually does not store
any information about domains but delegates its authority to other servers, keeping references to
those servers. There are several root servers, each covering the whole domain name space. The
servers are distributed all around the world.
A primary server loads all information from the disk file; the secondary server loads all information
from the primary server. When the secondary downloads information from the primary, it is called
a zone transfer.
DNS in the Internet
DNS is a protocol that can be used in different platforms. In the Internet, the domain name space
(tree) is divided into three different sections: generic domains, country domains, and the inverse
domain.
Generic Domains
The generic domains define registered hosts according to their generic behaviour. Each node in
the tree defines a domain, which is an index to the domain name space database.
Looking at the tree, we see that the first level in the generic domains section allows 14 possible
labels. These labels describe the organization types as listed in Table.3
Table.3
Country Domains
The country domains section uses two-character country abbreviations (e.g., us for United States).
Second labels can be organizational, or they can be more specific, national designations. The
United States, for example, uses state abbreviations as a subdivision of us (e.g., ca.us.).
Iterative Resolution
If the client does not ask for a recursive answer, the mapping can be done iteratively. If the server
is an authority for the name, it sends the answer. If it is not, it returns (to the client) the IP address
of the server that it thinks can resolve the query. The client is responsible for repeating the query
to this second server. If the newly addressed server can resolve the problem, it answers the query
with the IP address; otherwise, it returns the IP address of a new server to the client. Now the client
must repeat the query to the third server. This process is called iterative resolution because the
client repeats the same query to multiple servers.
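The referral loop described above can be simulated without any network access. The sketch below is a toy model only: the `SERVERS` table, the addresses, and the function names are hypothetical and stand in for real DNS servers.

```python
# Hypothetical toy servers: each one either answers the query or
# refers the client to the address of another server.
SERVERS = {
    "198.51.100.1": {"www.example.com.": ("REFERRAL", "198.51.100.2")},
    "198.51.100.2": {"www.example.com.": ("REFERRAL", "198.51.100.3")},
    "198.51.100.3": {"www.example.com.": ("ANSWER", "192.0.2.10")},
}


def query(server_ip, name):
    """Simulate one request/response exchange with a single server."""
    return SERVERS[server_ip].get(name, ("UNKNOWN", None))


def resolve_iteratively(name, first_server, max_steps=10):
    """Client-side loop: repeat the same query, following referrals."""
    server = first_server
    path = []
    for _ in range(max_steps):
        path.append(server)
        kind, value = query(server, name)
        if kind == "ANSWER":
            return value, path
        if kind != "REFERRAL":
            raise LookupError(f"{server} cannot resolve {name}")
        server = value  # the client itself contacts the next server
    raise LookupError("too many referrals")


ip, path = resolve_iteratively("www.example.com.", "198.51.100.1")
print(ip, path)
```

The point of the sketch is that the loop lives in the client: each server only answers or refers, and never forwards the query itself.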
Caching
Each time a server receives a query for a name that is not in its domain, it needs to search its
database for a server IP address. Reduction of this search time would increase efficiency. DNS
handles this with a mechanism called caching. When a server asks for a mapping from another
server and receives the response, it stores this information in its cache memory before sending it
to the client. If the same or another client asks for the same mapping, it can check its cache memory
and solve the problem. However, to inform the client that the response is coming from the cache
memory and not from an authoritative source, the server marks the response as unauthoritative.
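A minimal sketch of the caching behaviour follows. It assumes that the hypothetical `authoritative_lookup` function stands in for the other server being asked; the class name, the table, and the addresses are illustrative only.

```python
# Hypothetical authoritative data, standing in for another DNS server.
AUTHORITATIVE = {"www.example.com.": "192.0.2.10"}


def authoritative_lookup(name):
    return AUTHORITATIVE[name]


class CachingServer:
    def __init__(self, upstream):
        self.upstream = upstream      # callable: name -> IP address
        self.cache = {}               # name -> IP address
        self.upstream_queries = 0

    def resolve(self, name):
        """Return (ip, authoritative), where the flag is False for cache hits."""
        if name in self.cache:
            # Served from cache memory, so mark it as unauthoritative.
            return self.cache[name], False
        self.upstream_queries += 1
        ip = self.upstream(name)
        self.cache[name] = ip         # store before replying to the client
        return ip, True


server = CachingServer(authoritative_lookup)
print(server.resolve("www.example.com."))
print(server.resolve("www.example.com."))
```

The second call is answered from the cache, so the upstream server is contacted only once.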
Cybersquatting
Cybersquatting is a common source of disputes. It involves the registrant having registered a name,
or names in most cases, in bad faith to gain some commercial advantage. This can involve trying to
sell the name for an inflated price to a party the registrant knows would be interested in
registering it or, more commonly, using it to direct traffic to the registrant's own website or to the
website of a competitor of the trade mark holder in return for payment of a commission.
Gripe sites
Sites such as www.natwestsucks.com or www.stopecg.com have been problematic. Arbitrators
and the courts have been inclined to order the transfer of the offending domain name particularly
if there is some bad faith or a lack of legitimate use. The reasoning has been that a non-native
English speaker may not dissociate the word "sucks" from the trade mark holder's mark.
Some Other Domain name Disputes
Because of the increasing popularity of the Internet, companies have realized that having a domain
name that is the same as their company name or the name of one of their products can be an
extremely valuable part of establishing an Internet presence. A company wishing to acquire a
domain name must file an application with the appropriate agency. Before doing so, a search is
done to see if their desired domain name is already taken. A good site for doing such a search is
provided by Network Solutions. When a company finds that the domain name corresponding to
their corporate name or product trademark is owned by someone else, the company can either
choose a different name or fight to get the domain name back from its current owners.
Some well-publicized examples of these types of domain name disputes are:
o candyland.com: Both Hasbro and an adult entertainment provider desired the
candyland.com domain name. Hasbro was too late to register the name itself, but it is never
too late to sue (well, almost never). The domain name is now safely in the hands of Hasbro.
o mcdonalds.com: This domain name was taken by an author from Wired magazine who was
writing a story on the value of domain names. In his article, the author requested that people
contact him at ronald@mcdonalds.com with suggestions of what to do with the domain
name. In exchange for returning the domain name to McDonalds, the author convinced the
company to make a charitable contribution.
o micros0ft.com: The company Zero Micro Software obtained a registration for
micros0ft.com (with a zero in place of the second 'o'), but the registration was suspended
after Microsoft filed a protest. When the domain name was abandoned for non-payment
of fees, it was picked up by someone else: Vision Enterprises of Roanoke,
TX.
o mtv.com: The MTV domain name was originally taken by MTV video jockey Adam Curry.
Although MTV originally showed little interest in the domain name or the Internet, when
Adam Curry left MTV the company wanted to control the domain name. After a federal
court action was brought, the dispute settled out of court.
o peta.org: An organization entitled "People Eating Tasty Animals" obtained the peta.org
domain name, much to the disgust of the better known People for the Ethical Treatment of
Animals. This domain name was suspended, but as of May 2000 the domain name was still
registered in the name of People Eating Tasty Animals.
o roadrunner.com: When NSI threatened to suspend the roadrunner.com domain name after
a protest by Warner Brothers, the New Mexico Internet access provider who was using the
domain name filed suit to prevent the suspension. Although the access provider was able
to prevent the suspension, a joint venture company involving Time Warner, MediaOne,
Microsoft, Compaq, and Advance/Newhouse eventually obtained the domain name.
o taiwan.com: The mainland China news organization Xinhua was allowed to register the
domain name taiwan.com, much to the disgust of the government of Taiwan.
The more general portion of the statute protects companies against persons who, in bad faith,
register a domain name that is the same as or confusingly similar to an existing trademark. The statute
then lists the following factors as elements that a court can consider to determine whether the
domain name was registered in bad faith.
✓ Does the domain name holder have trademark rights in the domain name?
✓ Is the domain name the legal name of the domain name holder, or some other name that is
otherwise commonly used to identify that person?
✓ Has the domain name holder made use (prior to the dispute) of the domain name in
connection with a bona fide sale of goods or services?
✓ Is the domain name holder using the mark in a bona fide noncommercial or fair use way at
a web site accessible at the domain name?
✓ Is the domain name holder attempting to divert consumers from the trademark owner's web
site in a confusing way, either for commercial gain or in an attempt to tarnish or disparage
the trademark?
✓ Has the domain name holder offered to sell the domain name to the trademark owner (or
anyone else) for financial gain without having any intent to use the mark with the sale of
goods or services?
✓ Has the domain name holder behaved in a pattern of registering and selling domain names
without intending to use them in connection with the sale of goods or services?
✓ Did the domain name holder provide false information when applying for the registration
of the domain name (or do so in connection with other domain names)?
✓ Has the domain name holder registered domain names of other parties' trademarks?
✓ How distinctive and famous is the trademark owner's trademark?
• File Transfer Protocol (FTP) is a standard network protocol used to transfer files from
one host to another over a TCP-based network, such as the Internet. It belongs to the
application layer of the TCP/IP suite.
1. ASCII mode
2. Image mode (commonly called Binary mode)
3. EBCDIC mode
4. Local mode
2. History:
▪ The original specification for the File Transfer Protocol was written by Abhay Bhushan
and published as RFC 114 on 16 April 1971. It was later replaced by RFC 765 (June 1980)
and RFC 959 (October 1985), the current specification. Several proposed standards amend
RFC 959; for example, RFC 2228 (June 1997) proposes security extensions and RFC 2428
(September 1998) adds support for IPv6 and defines a new type of passive mode.
▪ A Request for Comments (RFC) is a publication of the Internet Engineering Task
Force (IETF) and the Internet Society, the principal technical development and standards-
setting bodies for the Internet.
• Two connections are used: the first is the control connection and the second is the data
connection, which carries the data transfer.
• On both sides of the link the FTP application is built with a protocol interpreter (PI) and a
data transfer process (DTP). On the client side of the link there exists also a user interface.
• The user interface communicates with the protocol interpreter, which is in charge of the control
connection.
• The protocol interpreter, besides its function of responding to the control protocol, has also to
manage the data connection. During the file transfer, the data management is performed by the
DTPs.
4. Protocol Overview:
The FTP protocol uses a control connection (the primary connection) and a data connection (the secondary
connection).
• The control connection is the communication path between the USER-PI and SERVER-PI
for the exchange of commands and replies. This connection follows the Telnet Protocol.
• When an FTP client wants to exchange files with an FTP server, the FTP client
must first set up the control connection. The client makes a TCP connection from a random
unprivileged port N (N > 1023) to the FTP server's well known command port 21 (the
IANA assigned port number).
• The protocol requires the control connection to remain open while the data transfer is in
progress.
• A data connection cannot exist without an open control connection.
• The data connection doesn't need to exist all of the time and there can be many data
connections during the lifetime of a control connection.
• It is the responsibility of the user to request the closing of the control connection when
finished using the FTP service. However, it is the server that takes the action to close the
control connection.
▪ The data connection is the communication path between the USER-DTP and SERVER-
DTP for the exchange of the real data, being directory lists and files. Depending on the
chosen FTP mode, the data connection is initiated from the server (active mode) or the
client (passive mode).
5. Overview: FTP Basics Operations:
Goal:
Steps:
1. H1 requests a control connection with S1.
2. S1 requests a data connection with H1.
3. S1 transfers data to H1.
4. When the data transfer is done, S1 requests to close the data connection and the control connection.
At H1, the user types: ftp 1.1.2.1. This triggers H1 to send a Control Connection Request packet to S1.
When S1 receives this request, it sends an Ack back to H1. Upon receiving the Ack, H1 prints "220
FTP Server ready" to indicate that the control connection is up.
Note: Here FTP runs in active mode, so it is the server that initiates the data connection. The server
first needs to know the client's port number, which is why H1 sends an unsolicited PORT command to S1.
- Upon receiving PORT, S1 sends data_Conn to H1 (source port 20, destination port 54705).
- H1 responds with an Ack_data_Conn. Now the data connection is up.
- S1 receives the Ack and sends a message to H1 (not shown in animation).
- H1 receives the message and prints "150 Opening BINARY...." to indicate that the data transfer is
starting.
S1 transfers foo to H1
- With the data connection established, S1 starts to transmit the foo data one packet (ftp_Data) at a time.
- When H1 receives a data packet, it responds with an Ack_Data.
- When S1 receives the Ack, it sends the next data packet.
- The user has no other FTP tasks to do and types "quit". This triggers a message to S1.
- When S1 receives the quit message, it sends a goodbye message to H1.
- H1 receives this message and prints "221 Goodbye" to tell the user that FTP has exited.
- S1 sends Close_Ctrl to close the control connection with H1.
- H1 receives the request and sends Ack_Close to confirm. Now the FTP control connection is
closed.
Connection Closed
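The replies printed during the session ("220 ...", "150 ...", "221 ...") all start with a three-digit code whose first digit indicates the kind of reply, as defined in RFC 959. The sketch below parses single-line replies only; the function name `parse_reply` is an assumption, and multi-line replies are deliberately not handled.

```python
# First-digit reply classes from RFC 959.
REPLY_CLASSES = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}


def parse_reply(line):
    """Split a single-line FTP reply into (code, reply class, text)."""
    code, _, text = line.strip().partition(" ")
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a single-line FTP reply: {line!r}")
    return int(code), REPLY_CLASSES.get(code[0], "unknown"), text


for reply in ("220 FTP Server ready", "150 Opening BINARY", "221 Goodbye"):
    print(parse_reply(reply))
```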
▪ In active mode, the client creates the TCP control connection, while the data connection is
initiated by the FTP server.
▪ In active mode, the client sends a PORT command to the server. This command
tells the server which host (IP address) and port number (an unprivileged port > 1023)
it must connect back to for the data connection.
▪ After accepting the PORT command, the server establishes the data connection from its
local data port 20 (the IANA-assigned port number) to the IP address and port number
learned from the PORT command.
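In RFC 959 the PORT argument is written as six comma-separated decimal numbers: the four bytes of the IP address followed by the high and low bytes of the port. The sketch below builds that argument; the function names and the client address 192.0.2.5 are assumptions for illustration, while port 54705 is the client port from the walkthrough above.

```python
def port_argument(ip, port):
    """Encode an IPv4 address and port as h1,h2,h3,h4,p1,p2."""
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    p1, p2 = divmod(port, 256)  # high byte, low byte
    return ",".join(octets + [str(p1), str(p2)])


def port_command(ip, port):
    """Build the full command line sent on the control connection."""
    return f"PORT {port_argument(ip, port)}\r\n"


print(port_command("192.0.2.5", 54705))
```

The server reverses the calculation (p1 * 256 + p2) to recover the port it must connect to.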
• In passive mode, the client is responsible for initiating both the control
connection and the data connection.
• In passive mode, the client sends a PASV command to the server. Basically this command
asks the server to "listen" on a data port (which is not its default data port 20) and to wait
for a connection rather than to initiate one.
• If the server supports the passive mode, it will send a reply to this command including the
host (IP address) and port number (unprivileged port > 1023) this server is listening on.
• The client will then establish the data connection from a local random unprivileged port (>
1023) to the IP address and port number learned from the PASV reply.
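The reply to PASV conventionally carries the host and port in the same six-number form used by PORT, for example "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)". The sketch below extracts the address the client should connect to; the function name and the sample reply text are assumptions, and real servers may word the reply differently.

```python
import re

PASV_PATTERN = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply):
    """Return (host, port) from a 227 reply to the PASV command."""
    match = PASV_PATTERN.search(reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError("each number must fit in one byte")
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, low byte
    return host, port


print(parse_pasv_reply("227 Entering Passive Mode (1,1,2,1,195,80)"))
```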
6.3 Login
FTP login utilizes a normal username and password scheme for granting access. The
username is sent to the server using the USER command, and the password is sent using
the PASS command. If the information provided by the client is accepted by the server, the
server will send a greeting to the client and the session will commence. If the server
supports it, users may log in without providing login credentials, but the same server may
authorize only limited access for such sessions.
The FTP commands that may be sent to an FTP server are
standardized in RFC 959 by the IETF.
Note that most command-line FTP clients present their own set of commands to users. For
example, GET is the common user command to download a file instead of the raw command
RETR.
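The difference between user commands and raw commands can be sketched as a small translation table. The raw commands below (RETR, STOR, LIST, CWD, DELE, QUIT) are defined in RFC 959, but the user-side names vary between clients, so the `USER_TO_RAW` table and the function name are illustrative assumptions rather than the behaviour of any particular client.

```python
# Illustrative mapping only; actual clients define their own user commands.
USER_TO_RAW = {
    "get": "RETR",
    "put": "STOR",
    "dir": "LIST",
    "cd": "CWD",
    "delete": "DELE",
    "quit": "QUIT",
}


def to_raw_command(user_line):
    """Translate a line typed by the user into a raw FTP command line."""
    parts = user_line.split(maxsplit=1)
    if not parts:
        raise ValueError("empty command")
    verb = parts[0].lower()
    if verb not in USER_TO_RAW:
        raise ValueError(f"unknown user command: {parts[0]!r}")
    raw = USER_TO_RAW[verb]
    if len(parts) == 2:
        raw += " " + parts[1]  # pass the argument through unchanged
    return raw + "\r\n"


print(repr(to_raw_command("GET foo")))
```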
8. Advantages of FTP
▪ FTP is a fast and efficient way of transferring bulk data across the Internet.
▪ Allows transferring multiple files as well as directories.
▪ Many FTP clients have the ability to schedule transfers.
▪ No size limitation on single transfers (browsers only allow up to 2 GB)
▪ Many clients have scripting capabilities through command line
▪ Most clients have a synchronizing utility
▪ Faster transfers than HTTP
▪ It can serve as a backup. Whenever you edit your files on your local system, you can
update the same files by copying them to the host system at your site. If your site
crashes and all the data is lost, you still have a copy on your own local system. It also
works the other way round.
▪ FTP gives you control over transfer. That is, you can choose the mode in which the data is
transferred over the network. The data can be transferred either in the ASCII mode (for text
files) or in the Binary mode (for executable or compressed files).
▪ You can work with the directories on the remote systems, delete or rename the remote files
while transferring data between 2 hosts.
9. Disadvantages of FTP
World Wide Web (WWW)
The World Wide Web (WWW) is a repository of information linked together from points all over
the world. The WWW has a unique combination of flexibility, portability, and user-friendly
features that distinguish it from other services provided by the Internet.
The WWW project was initiated by CERN (European Laboratory for Particle Physics) to
create a system to handle distributed resources necessary for scientific research.
The World Wide Web (WWW, W3) is an information system of interlinked hypertext documents that
are accessed via the Internet. It has also commonly become known simply as the
Web. Individual document pages on the World Wide Web are called web pages and are accessed with
a software application running on the user's computer, commonly called a web browser. Web pages
may contain text, images, videos, and other multimedia components, as well as web navigation
features consisting of hyperlinks.
History
Tim Berners-Lee, a British computer scientist and former CERN employee, is regarded as the
inventor of the Web. On 12 March 1989, Berners-Lee wrote a proposal for what would
eventually become the World Wide Web.
Architecture
The WWW today is a distributed client/server service, in which a client using a browser
can access a service using a server. However, the service provided is
distributed over many locations called sites, as shown in the figure.
The client needs to see some information that it knows belongs to site A. It sends a request through
its browser, a program that is designed to fetch Web documents. The request, among other
information, includes the address of the site and the Web page, called the URL. The server at site
A finds the document and sends it to the client. When the user views the document, she finds some
references to other documents, including a Web page at site B. The reference has the URL for the
new site. The user is also interested in seeing this document. The client sends another request to
the new site, and the new page is retrieved.
Client (Browser)
Each browser usually consists of three parts: a controller, client protocols, and interpreters. The
controller receives input from the keyboard or the mouse and uses the client programs to access
the document. After the document has been accessed, the controller uses one of the interpreters to
display the document on the screen. The client protocol can be one of the protocols such as FTP or
HTTP. The interpreter can be HTML, Java, or JavaScript, depending on the type of document.
Server
The Web page is stored at the server. Each time a client request arrives, the corresponding
document is sent to the client. To improve efficiency, servers normally store requested files in a
cache in memory; memory is faster to access than disk. A server can also become more efficient
through multithreading or multiprocessing. In this case, a server can answer more than one request
at a time.
Uniform Resource Locator
A Uniform Resource Locator (abbreviated URL; also known as a web address, particularly when
used with HTTP) is a specific character string that constitutes a reference to a resource. Most
web browsers display the URL of a web page above the page in an address bar.
The URL defines four things: protocol, host computer, port, and path, as shown in the figure above.
Protocol
The protocol is the client/server program used to retrieve the document. Many different protocols
can retrieve a document; among them are FTP and HTTP. The most common today is HTTP.
Host
The host is the computer on which the information is located, although the name of the computer
can be an alias. Web pages are usually stored in computers, and computers are given alias names
that usually begin with the characters "www". This is not mandatory, however, as the host can be
any name given to the computer that hosts the Web page.
Port
The URL can optionally contain the port number of the server. If the port is included, it is inserted
between the host and the path, and it is separated from the host by a colon.
Path
Path is the pathname of the file where the information is located.
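The four parts of a URL described above can be pulled apart with Python's standard `urllib.parse.urlsplit`. This is a minimal sketch; the helper name `url_parts` and the example URLs are assumptions for illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return the protocol, host, port, and path of a URL as a dict."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,   # None when the URL does not include a port
        "path": parts.path,
    }


print(url_parts("http://www.example.com:8080/docs/index.html"))
print(url_parts("http://www.example.com/docs/index.html"))
```

When the port is omitted, the client falls back to the default port for the protocol, which is why `port` is reported as `None` in the second call.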
Function of WWW
• The WWW works by establishing hypertext/hypermedia links between documents
anywhere on the network.
• A document might include many links to other documents held on many different servers.
• Selecting any one of those links will take you to the related document wherever it is.
e.g. the references at the end of a paper might have hypertext links to the actual documents
held elsewhere.
WWW Hyperlinks
Hyperlinks can link a part of a hypermedia document to
▪ another part of the same document file.
▪ another document file on the same server computer.
▪ another document file on a server computer located elsewhere in the world.
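The three kinds of link target listed above correspond to three forms of reference that a browser resolves against the URL of the current page. The sketch below uses the standard `urllib.parse.urljoin`; the base URL, the link values, and the helper name are hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical URL of the page that contains the links.
BASE = "http://a.example/docs/page.html"


def resolve_link(href, base=BASE):
    """Resolve a link found in a page against the page's own URL."""
    return urljoin(base, href)


print(resolve_link("#section2"))                  # same document
print(resolve_link("other.html"))                 # same server
print(resolve_link("http://b.example/index.html"))  # a server elsewhere
```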
HTTP
The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World
Wide Web. HTTP functions as a combination of FTP and SMTP. It is similar to FTP because it
transfers files and uses the services of TCP. However, it is much simpler than FTP because it uses
only one TCP connection. There is no separate control connection; only data are transferred
between the client and the server.
HTTP Transaction
HTTP itself is a stateless protocol. The client initializes the transaction by sending a request
message. The server replies by sending a response.
Messages
A request message consists of a request line, a header, and sometimes a body. A response
message consists of a status line, a header, and sometimes a body.
Status Phrase
This field is used in the response message. It explains the status code in text form.
HEADER
The header exchanges additional information between the client and the server. The header can
consist of one or more header lines. Each header line has a header name, a colon, a space, and a
header value.
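The message layout and header-line format described above can be illustrated with a small parser. This is a sketch under simplifying assumptions: lines end in CRLF, a blank line separates the header from the body, and repeated header names are not handled. The function names and sample messages are made up for illustration.

```python
def split_message(raw):
    """Split a raw HTTP message into (first line, headers dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")  # blank line ends the header
    first_line, _, header_block = head.partition("\r\n")
    headers = {}
    for line in header_block.split("\r\n"):
        if line:
            # Each header line is: name, colon, space, value.
            name, _, value = line.partition(":")
            headers[name.strip()] = value.strip()
    return first_line, headers, body


def parse_status_line(status_line):
    """Split a status line into (version, status code, status phrase)."""
    version, code, phrase = status_line.split(" ", 2)
    return version, int(code), phrase


response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 5\r\n"
    "\r\n"
    "hello"
)
first_line, headers, body = split_message(response)
print(parse_status_line(first_line), headers, body)
```

The same `split_message` function works for a request, where the first line is the request line instead of the status line.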
A header line belongs to one of four categories: general header, request header, response header,
and entity header. A request message can contain only general, request, and entity headers. A
response message, on the other hand, can contain only general, response, and entity headers.
General Header
The general header gives general information about the message and can be present in both a
request and a response.
Request Header
The request header can be present only in a request message. It specifies the client's configuration
and the client's preferred document format.
Response Header
The response header can be present only in a response message. It specifies the server's
configuration and special information about the request.
Entity Header
The entity header gives information about the body of the document. Although it is mostly present
in response messages, some request messages, such as POST or PUT methods, that contain a body
also use this type of header.
Body
The body can be present in a request or response message. Usually, it contains
the document to be sent or received.
Search Engines
A web search engine is a software system that is designed to search for information on the World
Wide Web. The search results are generally presented in a line of results often referred to as search
engine results pages (SERPs).
The information may be a mix of web pages, images, and other types of files. Some search engines
also mine data available in databases or open directories.
Unlike web directories, which are maintained only by human editors, search engines also maintain
real-time information by running an algorithm on a web crawler.
A Web crawler is an Internet bot that systematically browses the World Wide Web, typically for
the purpose of Web indexing. A Web crawler may also be called a Web spider, an ant, an
automatic indexer, or (in the FOAF software context) a Web scutter.
Web search engines and some other sites use Web crawling or spidering software to update
their web content or indexes of other sites' web content. Web crawlers can copy all the pages
they visit for later processing by a search engine that indexes the downloaded pages so that users
can search them much more quickly.
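The crawl-then-index idea can be sketched without touching the network by treating a small dictionary as the Web. Everything in this example is hypothetical: the `PAGES` table, the URLs, and the function names are assumptions, and a real crawler would fetch pages over HTTP and respect site policies.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://a.example/": (
        "dns maps names to addresses",
        ["http://a.example/ftp", "http://b.example/"],
    ),
    "http://a.example/ftp": ("ftp transfers files", ["http://a.example/"]),
    "http://b.example/": ("http transfers web pages", []),
}


def crawl(seed):
    """Breadth-first visit of every page reachable from the seed URL."""
    seen, queue, order = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        order.append(url)
        _, links = PAGES.get(url, ("", []))
        for link in links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls):
    """Map each word to the set of crawled pages that contain it."""
    index = {}
    for url in urls:
        text, _ = PAGES.get(url, ("", []))
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index


visited = crawl("http://a.example/")
index = build_index(visited)
print(visited)
print(sorted(index["transfers"]))
```

A search then becomes a dictionary lookup in the index rather than a scan of every page, which is why indexing the downloaded copies makes searching faster.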