Ivy: A Read/Write
Peer-to-Peer File
System
Athicha Muthitacharoen, Robert Morris, Thomer M. Gil,
and Benjie Chen
Slides by: Brian Madden
Ivy: Introduction
• Ivy...
• ... is a multi-user read/write peer to peer file
system
• ... has no centralized or dedicated components
Outline
• Design
• Implementation
• Performance
• Conclusions
Ivy: Design
• Overview:
• Uses DHash distributed hash table
• Consists of a set of logs
• Presents semantics like those of NFS v3
Ivy: DHash
• DHash is a distributed peer-to-peer hash table
• Maps keys to arbitrary values
• Key/Value pairs distributed by key
• Block replication to protect against failures
• Encryption for integrity
Ivy: DHash
• Participants communicate only via DHash storage
• Uses simple put(key, value) / get(key) interface
• Replicates blocks to avoid data loss from node
crashes
Ivy: DHash
• Integrity is ensured via encryption
• SHA-1 hashes provide verification of contents
• Public key crypto provides authenticated writes
Ivy: Log Structure
• Consists of a linked list of immutable log records
• Each record is DHash content block
Field Use
DHash key of next oldest log
prev
record
head DHash key of log-head
seq per-log sequence number
timestamp time at which record was created
version version vector
Ivy: Log Structure
• A log record contains information about a single file
system operation
• Roughly equates to an NFS operation
• Contain minimum possible information
Ivy: Log Structure
• Log-head - A public key block with fixed DHash key
• Easy for other participants to find
• Stores key of most recent log record
Ivy: Log Structure
• Log records are kept indefinitely to recover from
malicious participants or from network partitions
Ivy: Concurrency
• Each participant has their own log
• Logs combine to form file systems
• i-number of root directory and head of each
log in the file system is stored in immutable
record called a view block
Ivy: Combining Logs
• In an Ivy file system with multiple logs
• Version vector is used to order log events
• Concurrent updates require application specific
conflict resolution
Ivy: Snapshots
• Every participant periodically creates a private
snapshot of the file system
• Contains entire state of the file system
• Avoids traversing the entire log
• Stored in DHash to make persistent
Ivy: Semantics
• Ivy provides NFS like application semantics but...
• Difficulties arise when the network is partitioned
• Ivy maximizes availability and allows updates
to proceed in all partitions
• Relies on DHash replication to hope data is
replicated across all partitions
Ivy: Semantics
• Each participant keeping a log prevents conflicts
within individual logs
• Ivy has mechanisms to deal with concurrent
updates within a view
• If an application relies on file system techniques for
mutual exclusion no guarantees are made
Ivy: Conflict
Resolution
• lc tool provided with Ivy
• Detects conflicting application updates
• Provides multiple historic views of the
problem file to allow for manual merging
• Application specific resolvers can also be
used
Ivy: Security
• To deal with malicious users:
• Create new view that...
• Does not include bad log
• Includes all logs up to a previous time point
Ivy: Implementation
• Written in C++
• Runs on FreeBSD
• Uses SFS toolkit for event driven programming
• Uses NFS loop-back server support
Ivy: Implementation
Ivy: Performance
• Tests run with replication turned OFF
• Results averaged over 5 runs
• Tested using modified Andrew Benchmark
• Creates directory hierarchy, copies files to
directories, walk the directory structure reading
attributes of each file, read files, compile files
into a program.
Ivy: Performance
Single log & machine 4 logs over WAN
Phase Ivy(s) NFS(s) Phase Ivy(s) NFS(s)
Mkdir 0.6 0.5 Mkdir 11.2 4.8
Create/Write 6.6 0.8 Create/Write 89.2 42.0
Stat 0.6 0.2 Stat 65.6 47.8
Read 1.0 0.8 Read 65.8 55.6
Compile 10.0 5.3 Compile 144.2 130.2
Total 18.8 7.6 Total 376.0 280.4
Ivy: Performance
• Analysis of the local run suggests the previous
performance numbers are inflated heavily by
SHA-1 hashing and memory copies
Ivy: Performance
• Analysis of the WAN run suggests that 274 of the
376 seconds are spent fetching log heads
• Ivy is slower than NFS because Ivy operations
often require more network round-trips
Ivy: Performance
Ivy: Performance
Ivy: Performance
Ivy: Conclusions
• We presented Ivy, a read/write peer-to-peer file
system
• Ivy is suitable for small groups of cooperating
participants
• An Ivy file system consists of a set of logs
• Ivy doesn’t require trust
• Ivy is 2-3 times slower than NFS