14-07-2020
BitTorrent
Introduction
• Scalability issues for client/server systems
Server’s workload grows linearly with the number of clients
Flash crowd problem
Introduction
• An ideal solution is IP multicast
Same stream is shared by all clients receiving same data
Requires infrastructure-level changes
Security issues
No widely accepted transport protocol on IP multicast layer
Peer to Peer System
• Let clients, now called peers, share the server workload
• Peers forward all the data they receive to other peers
Peer to Peer System
Advantages
• P2P solutions are
• Scalable:
Aggregate download bandwidth grows with the number of peers
• Easy to deploy:
No additional hardware
No change to network infrastructure
• Cheap
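The scalability claim above can be illustrated with a rough calculation. The sketch below uses the standard fluid-model lower bounds on distribution time for client/server and P2P delivery; the formulas and all the numbers (file size, link rates) are assumptions for illustration and do not come from these slides.

```python
def client_server_time(file_bits, n_clients, server_up, min_down):
    """Lower bound (seconds) for one server to deliver the file to every client."""
    # The server must push n_clients copies; the slowest client bounds the rest.
    return max(n_clients * file_bits / server_up, file_bits / min_down)


def p2p_time(file_bits, n_peers, server_up, min_down, peer_up):
    """Lower bound (seconds) when every peer also uploads at peer_up bits/s."""
    total_up = server_up + n_peers * peer_up  # upload capacity grows with the swarm
    return max(file_bits / server_up,
               file_bits / min_down,
               n_peers * file_bits / total_up)


if __name__ == "__main__":
    # Hypothetical values: 1 GB file, 100 Mbit/s server, 20 Mbit/s downlinks,
    # 5 Mbit/s peer uplinks.
    F, US, D, U = 8e9, 100e6, 20e6, 5e6
    for n in (10, 100, 1000):
        print(n, round(client_server_time(F, n, US, D)), round(p2p_time(F, n, US, D, U)))
```

Under these assumptions the client/server bound grows linearly with the number of clients, while the P2P bound levels off because each new peer brings upload capacity with it.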
Issues
• Organizing data transfers:
Figuring which peers have which chunks of data
Deciding where to send these chunks
• Dealing with churn:
Peers come and go
• Enforcing fairness:
Some peers do not upload as much data as they download
BitTorrent
• It was developed by Bram Cohen in 2001
• Description
Allows users to join a "swarm" of hosts to download and upload
from each other simultaneously
Shares content (files) efficiently using “file swarming”
Needs many concurrent sessions
Adopts a hybrid P2P design (central tracker, peer-to-peer data transfer) instead of a fully centralized one
BitTorrent
• Unstructured P2P System
• Peers have no parent peers or child peers
• Centralized tracker
• Collects information on peers
• Responds to requests for that information
• Built-in fairness incentive
• Rechoking favors cooperative peers
• Simple user interface
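The tracker's two jobs listed above (collecting information on peers and answering requests for it) can be sketched as a small in-memory registry. This is a toy model under stated assumptions: real trackers answer announce requests over the network, and the class name `Tracker`, the `announce` method signature, the 1800-second timeout, and the 50-peer reply limit are all hypothetical choices for illustration.

```python
import time


class Tracker:
    """Toy in-memory tracker: remembers who is in each swarm, returns peer lists."""

    def __init__(self, peer_timeout=1800):
        self.swarms = {}  # info_hash -> {peer_id: (ip, port, last_seen)}
        self.peer_timeout = peer_timeout

    def announce(self, info_hash, peer_id, ip, port, now=None, max_peers=50):
        """Register the caller and return (ip, port) pairs of the other peers."""
        now = time.time() if now is None else now
        swarm = self.swarms.setdefault(info_hash, {})
        # Forget peers that have not announced recently (handles churn).
        stale = [pid for pid, (_, _, seen) in swarm.items()
                 if now - seen > self.peer_timeout]
        for pid in stale:
            del swarm[pid]
        others = [(peer_ip, peer_port)
                  for pid, (peer_ip, peer_port, _) in swarm.items()
                  if pid != peer_id]
        swarm[peer_id] = (ip, port, now)
        return others[:max_peers]
```

Note that the tracker never touches file data; it only tells peers about each other.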
Terminology
• Peer:
A peer is a node in a network participating in file sharing.
It can simultaneously act both as a server and a client to other nodes on the
network.
• Neighboring peers:
Peers to which a client has an active point-to-point TCP connection.
• Client:
A client is a user agent (UA) that acts as a peer on behalf of a user.
• Torrent:
A torrent is the term for the file (single-file torrent) or group of files
(multi-file torrent) the client is downloading.
Terminology
• Swarm:
A network of peers that actively operate on a given torrent.
• Seeder:
A peer that has a complete copy of a torrent.
• Tracker:
A tracker is a centralized server that holds information about one or more
torrents and associated swarms.
It functions as a gateway for peers into a swarm.
• Metainfo file:
A bencoded file that holds information about the torrent, e.g. the URL of the
tracker. It usually has the extension .torrent.
Terminology
• Peer ID:
A 20-byte string that identifies the peer. How the peer ID is obtained is
outside the scope of this document, but a peer must make sure that the
peer ID it uses has a very high probability of being unique in the swarm.
• Info hash:
A SHA-1 hash that uniquely identifies the torrent. It is calculated from data in
the metainfo file (the bencoded info dictionary).
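To make the peer ID and info hash concrete, here is a minimal sketch. It assumes the usual convention that the info hash is the SHA-1 of the bencoded "info" dictionary of the metainfo file; the small `bencode` helper, the function names, and the `-XX0001-` client prefix are hypothetical and written only for this example.

```python
import hashlib
import os


def bencode(value):
    """Minimal bencoder for ints, strings/bytes, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def info_hash(metainfo):
    """20-byte SHA-1 digest of the bencoded 'info' dictionary."""
    return hashlib.sha1(bencode(metainfo["info"])).digest()


def make_peer_id(prefix=b"-XX0001-"):
    """20-byte peer ID: a hypothetical client prefix plus random bytes."""
    return prefix + os.urandom(20 - len(prefix))
```

A client reading a real .torrent file would normally hash the exact bytes of the info dictionary as stored, rather than re-encoding it, so that the hash matches what other peers compute.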
BitTorrent Architecture
The architecture consists of
• Tracker
• Static meta info file
• Original downloader
• End user/downloader
BitTorrent Architecture
• Tracker
Server software that centrally coordinates the transfer of files among users. It does not
hold a copy of the file; it only helps peers discover each other
• Meta info file
Mainly contains encoded information regarding
- URL of the tracker
- Name of the file
- Hashes of the pieces of the file
• Original downloader
The original downloader is a peer with the whole file (the initial seeder)
• End user/downloader
The peers without a complete copy of the file
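The contents of the metainfo file described above can be sketched as a Python dictionary. The key names ("announce", "info", "name", "length", "piece length", "pieces") follow the common single-file metainfo layout; the function names, the 256 KiB default piece size, and the tracker URL used in the usage note are assumptions for illustration.

```python
import hashlib


def piece_hashes(data, piece_length):
    """SHA-1 digest of each fixed-size piece (the last piece may be shorter)."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]


def build_metainfo(name, data, tracker_url, piece_length=256 * 1024):
    """Assemble the fields a single-file metainfo dictionary carries."""
    return {
        "announce": tracker_url,              # URL of the tracker
        "info": {
            "name": name,                     # name of the file
            "length": len(data),
            "piece length": piece_length,
            # All 20-byte piece hashes concatenated into one byte string.
            "pieces": b"".join(piece_hashes(data, piece_length)),
        },
    }


def verify_piece(metainfo, index, piece):
    """Check a downloaded piece against the hash stored in the metainfo."""
    expected = metainfo["info"]["pieces"][index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected
```

For example, `build_metainfo("f.bin", data, "http://tracker.example/announce")` would describe `data` to downloaders, who can then call `verify_piece` on every piece they receive before sharing it onward.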
Piece Selection Algorithm
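• Random First Piece
A new peer has no pieces to offer, so its first piece is selected at random
rather than by rarity, letting it obtain a complete piece quickly.
Different blocks of a piece can be requested from different peers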
[Figure: Peer 1, Peer 2, Peer 3 and Peer 4 (new); rows 1 to 5]
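The random-first policy, and the idea of fetching blocks of one piece from several peers, can be sketched as follows. The function names, the `peer_have` mapping (peer name to the set of piece indices it holds), the round-robin assignment, and the 16 KiB block size are assumptions made for this example.

```python
import random

BLOCK_SIZE = 16 * 1024  # assumed block size for this sketch


def pick_random_piece(missing, peer_have, rng=random):
    """Pick, at random, any missing piece that at least one neighbour holds."""
    available = [p for p in missing
                 if any(p in have for have in peer_have.values())]
    return rng.choice(available) if available else None


def assign_blocks(piece, piece_length, peer_have):
    """Spread the blocks of one piece round-robin over the peers that hold it."""
    holders = sorted(peer for peer, have in peer_have.items() if piece in have)
    if not holders:
        return []
    requests = []
    for n, offset in enumerate(range(0, piece_length, BLOCK_SIZE)):
        length = min(BLOCK_SIZE, piece_length - offset)  # last block may be short
        requests.append((holders[n % len(holders)], piece, offset, length))
    return requests
```

Each tuple returned by `assign_blocks` is (peer, piece index, byte offset, length), which is the information a block request needs to carry.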
Piece Selection Algorithm
• Rarest First Piece
The pieces held by the fewest peers in the swarm are requested first
[Figure: Peer 1, Peer 2, Peer 3 and Peer 4 (new); rows 1 to 5]
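A minimal sketch of rarest-first selection, assuming the peer knows which pieces each neighbour holds (the `peer_have` mapping and the function name are hypothetical). Ties between equally rare pieces are broken at random here, which is one reasonable choice rather than a stated rule.

```python
import random
from collections import Counter


def rarest_first(missing, peer_have, rng=random):
    """Return a missing piece held by the fewest neighbours, or None."""
    counts = Counter()
    for have in peer_have.values():
        counts.update(have)  # one count per neighbour holding the piece
    # Pieces nobody holds cannot be requested at all.
    candidates = [p for p in missing if counts[p] > 0]
    if not candidates:
        return None
    fewest = min(counts[p] for p in candidates)
    return rng.choice([p for p in candidates if counts[p] == fewest])
```

With neighbours holding {1, 2, 3}, {1, 2} and {1, 4}, pieces 3 and 4 each have a single holder, so one of them is requested before pieces 1 or 2.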
Piece Selection Algorithm
• Strict priority policy
Once a block has been requested from a piece, the remaining blocks of
the same piece are requested with the highest priority
[Figure: Peer 1, Peer 2, Peer 3 and Peer 4 (new); rows 1 to 5]
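The strict policy can be sketched as a block picker that always finishes a started piece before opening a new one. The `partial` bookkeeping dictionary, the `choose_piece` callback (for example a rarest-first function), and the fixed `blocks_per_piece` are assumptions for illustration.

```python
def next_block(partial, missing_pieces, blocks_per_piece, choose_piece):
    """Return the next (piece, block) to request, or None when nothing is left.

    partial maps a started piece to the set of block indices already requested.
    """
    # Highest priority: remaining blocks of pieces that are already started.
    for piece in sorted(partial):
        remaining = [b for b in range(blocks_per_piece)
                     if b not in partial[piece]]
        if remaining:
            partial[piece].add(remaining[0])
            return piece, remaining[0]
    # Only when no started piece has blocks left do we open a new piece.
    fresh = [p for p in missing_pieces if p not in partial]
    if not fresh:
        return None
    piece = choose_piece(fresh)
    partial[piece] = {0}
    return piece, 0
```

Finishing pieces one at a time matters because only a complete, hash-verified piece can be offered to other peers.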
Piece Selection Algorithm
• End Game Mode
The last pieces tend to download more slowly. To prevent a slow finish,
requests for the remaining blocks are sent to all peers, and a cancel is
sent to the other peers once a block has been received
[Figure: Peer 1, Peer 2, Peer 3 and Peer 4 (new); rows 1 to 5]
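End game mode can be sketched as simple bookkeeping: broadcast each remaining request, then cancel the duplicates when the first copy arrives. The class name `EndGame` and the tuple-based message format are hypothetical; a real client would send wire-protocol request and cancel messages instead.

```python
class EndGame:
    """Track blocks requested from every peer during the final phase."""

    def __init__(self, peers):
        self.peers = list(peers)
        self.outstanding = {}  # block -> set of peers it was requested from

    def request_everywhere(self, block):
        """Ask every peer for the same block."""
        self.outstanding[block] = set(self.peers)
        return [("request", peer, block) for peer in self.peers]

    def on_block_received(self, block, from_peer):
        """Cancel the now-redundant requests at all the other peers."""
        asked = self.outstanding.pop(block, set())
        return [("cancel", peer, block)
                for peer in sorted(asked) if peer != from_peer]
```

The cancels keep the wasted bandwidth small, since duplicate blocks are only in flight for the short end phase.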
Choking
• Choking is a temporary refusal to upload.
• It is one of the most powerful ideas in BitTorrent
for dealing with free riders
• At any given time, a peer has 4 of its
peers unchoked
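A minimal sketch of the choking decision, assuming the peer measures the download rate it gets from each neighbour and keeps the best 4 unchoked. The function names and the `download_rates` mapping are hypothetical; ties are broken by peer name only to keep the example deterministic.

```python
def select_unchoked(download_rates, slots=4):
    """Pick the peers we download from fastest; everyone else stays choked."""
    ranked = sorted(download_rates,
                    key=lambda peer: (-download_rates[peer], peer))
    return set(ranked[:slots])


def rechoke(currently_unchoked, download_rates, slots=4):
    """Return (peers to unchoke, peers to choke) to reach the new selection."""
    target = select_unchoked(download_rates, slots)
    return sorted(target - currently_unchoked), sorted(currently_unchoked - target)
```

A free rider that uploads nothing has a download rate of zero from our point of view, so it never earns one of the regular unchoke slots.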
Methods of Choking / Unchoking
• Optimistic Unchoking
Apart from the 4 unchoked peers, one additional peer is unchoked
regardless of its download rate
This is called the optimistic unchoke; it is selected at random every
30 seconds.
This is done to find unused connections that may be better than the
current unchokes
• Anti-snubbing
Sometimes a peer gets choked by all of its peers
If a peer has not received anything from a given peer in the last 60 seconds,
it presumes it has been ‘snubbed’ by that peer
As per tit-for-tat, it will choke the peers from which it isn’t receiving
anything, except as an optimistic unchoke
• Upload Only
Once a peer has downloaded the whole file, it has no download rates on
which to base the previous methods of choking
Unchoking is instead based on the upload rate to each peer, preferring peers
to whom no one else is uploading
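The three methods above can be sketched as small helper functions. The 30-second and 60-second constants come from the text; the function names and data shapes are hypothetical, and the seed-mode helper ranks by upload rate only, without modelling the preference for peers that nobody else is uploading to.

```python
import random

OPTIMISTIC_INTERVAL = 30  # seconds between optimistic unchoke rotations
SNUB_TIMEOUT = 60         # seconds without data before a peer counts as snubbing us


def pick_optimistic(peers, regular_unchoked, rng=random):
    """Choose one currently choked peer at random, ignoring its download rate."""
    choked = sorted(p for p in peers if p not in regular_unchoked)
    return rng.choice(choked) if choked else None


def snubbed_by(last_received, now):
    """Peers that have sent us nothing for longer than SNUB_TIMEOUT."""
    return {peer for peer, t in last_received.items() if now - t > SNUB_TIMEOUT}


def seed_unchoke(upload_rates, slots=4):
    """Upload-only mode: rank peers by how fast we can upload to them."""
    ranked = sorted(upload_rates,
                    key=lambda peer: (-upload_rates[peer], peer))
    return set(ranked[:slots])
```

A client loop would call `pick_optimistic` once every `OPTIMISTIC_INTERVAL` seconds, choke everything returned by `snubbed_by` unless it is the current optimistic unchoke, and switch to `seed_unchoke` once the download is complete.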
Advantages and uses
• Distributing large files like Linux ISO images.
• Distributing Software patches and updates.
• Distributing popular files which have high traffic for relatively short
periods
• Unlike traditional server/client downloads, high traffic leads to more
efficient file sharing via BitTorrent.
Disadvantages and Security issues
• An easy distribution method for pirated/illegal content
• Cannot modify/update the file to newer versions once the torrent has
been distributed
• The IP addresses of all peers and information about the files they are
downloading are publicly available from trackers
• The tracker is a critical component; if it fails, it can disrupt the
distribution of all the files it is tracking