Computer Science
cs670
The University Of Alabama in Huntsville
                                                                    BitTorrent
                                          G. W. Cox – Spring 2005
                                                                     What it is
Computer Science
                                                                                                  cs670
                                           • A p-t-p file sharing system with some novel
                                             features:
The University Of Alabama in Huntsville
                                                  – Load sharing through file splitting
                                                  – Uses bandwidth of peers instead of a server
                                                    (distributes bandwidth costs)
                                                  – A concept of “fairness” among peers
                                           • Successfully used:
                                                  – Used to distribute RedHat 9 ISOs (about 80TB)
                                                  – Lindows: 50% price break for downloading OS via
                                                    BitTorrent
                                          G. W. Cox – Spring 2005
                                                                                                          1
                                                                     The idea (setup)
Computer Science
                                                                                                         cs670
                                           • A “seed” node has the file
The University Of Alabama in Huntsville
                                           • File is split into fixed-size segments (256KB
                                             typ)
                                           • Hash calculated for each segment
                                           • A “tracker” node is associated with the file
                                           • A “.torrent” meta-file is built for the file –
                                             identifies the address of the tracker node
                                           • The .torrent file is passed around the web
                                          G. W. Cox – Spring 2005
                                                                    The idea (download)
Computer Science
                                                                                                         cs670
                                           • A client wants to download the file
                                           • Contacts the tracker identified in the .torrent file (using HTTP)
The University Of Alabama in Huntsville
                                           • Tracker sends client a (random) list of peers who have/are
                                             downloading the file
                                           • Client contacts peers on list to see which segments of the file
                                             they have
                                           • Client requests segments from peers (via TCP)
                                           • Client uses hash from .torrent to confirm that segment is
                                             legitimate
                                           • Client reports to other peers on the list that it has the segment
                                           • Other peers start to contact client to get the segment (while
                                             client is getting other segments)
                                          G. W. Cox – Spring 2005
                                                                                                                 2
                                                                                   The flow
Computer Science
                                                                                                             cs670
                                                                    Segments
                                                    Original           5
The University Of Alabama in Huntsville
                                                                           4                    Seed
                                                     File                      3                Node
                                                                                   2
                                                                                       1
                                                                                                3           Tracker
                                                                                       1                     Node
                                                                                           Client
                                                                                           Node
                                                                                                        3
                                                                                                            New
                                                                       Client                       1       Client
                                                                       Node
                                          G. W. Cox – Spring 2005
                                                                       Load sharing
Computer Science
                                                                                                             cs670
                                           • As the file segments are downloaded by
The University Of Alabama in Huntsville
                                             more and more peers, the peers
                                             become the sources for further
                                             downloads
                                           • Because the tracker randomizes the list
                                             of peers, the load gets spread randomly
                                          G. W. Cox – Spring 2005
                                                                                                                      3
                                                                            Fairness
Computer Science
                                                                                                          cs670
                                           • To spread the load, clients have to have an incentive
                                             to allow peers to download from them – BitTorrent
The University Of Alabama in Huntsville
                                             does this with an economic system
                                           • Each peer A keeps a record of the peers it has gotten
                                             segments from
                                           • When a peer B asks to download from A:
                                                  – A checks its list
                                                  – If B has given A segments in the past, B’s request gets a
                                                    higher priority
                                                  – Else, B’s request gets a low priority
                                           • The net result is a “tit-for-tat” arrangement – in order
                                             to be a successful downloader, a peer must allow
                                             frequent uploading
                                          G. W. Cox – Spring 2005
                                                                    Additional load spreading
Computer Science
                                                                                                          cs670
                                           • When a client needs to choose which segment to
                                             request first, a common approach is “rarest-first”
The University Of Alabama in Huntsville
                                           • In rarest-first approach, client identifies the segment
                                             held by the fewest of its peers and tries to download
                                             that segment
                                           • This tends to keep the number of sources for each
                                             segment as high as possible, speading load
                                           • Rarest-first also gives the client an “attractive”
                                             segment, making it a more popular source, and
                                             giving it more “credits” to download more segments
                                          G. W. Cox – Spring 2005