Lecture 3: GFS

GFS published in 2003, MapReduce in 2004, Hadoop/HDFS in 2006

Big Storage

performance - sharding
fault - tolerance
tolerance - replication
replication - inconsistency
consistency - low performance
strong vs weak consistency

last write corrupt

GFS

big, fast
global
sharding
automatic recovery
single data center (really :O)
internal use
big sequential access (not random)
single master!
- MapReduce has a single master too, but failure is so unlikely it's fine to rerun all operations

master data
- file name
- chunk handles
  - list of chunk servers (cs)
  - primary, version number (v)
  - lease expiration
- LOG, CHECKPOINT, DISK
  - append to log efficiently
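
A minimal sketch of the master tables listed above, assuming plain in-memory dictionaries and a Python list standing in for the on-disk log. The class, field, and method names (`ChunkInfo`, `MasterState`, `add_chunk`) are made up for illustration and are not taken from GFS itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ChunkInfo:
    """Per-chunk state kept by the master (illustrative field names)."""
    chunk_servers: List[str] = field(default_factory=list)  # replicas holding the chunk
    version: int = 0                    # chunk version number (v)
    primary: Optional[str] = None       # current primary, if any
    lease_expiration: float = 0.0       # when the primary's lease ends


@dataclass
class MasterState:
    files: Dict[str, List[int]] = field(default_factory=dict)   # file name -> chunk handles
    chunks: Dict[int, ChunkInfo] = field(default_factory=dict)  # chunk handle -> chunk info
    log: List[str] = field(default_factory=list)                # stand-in for the on-disk log

    def add_chunk(self, name: str, handle: int, servers: List[str]) -> None:
        # Append a record to the log first, then update the in-memory tables.
        self.log.append(f"add_chunk {name} {handle}")
        self.files.setdefault(name, []).append(handle)
        self.chunks[handle] = ChunkInfo(chunk_servers=list(servers))


master = MasterState()
master.add_chunk("/logs/a", 7, ["cs1", "cs2", "cs3"])
```

Appending to a log is a sequential write, which is why it is cheap; a checkpoint would periodically snapshot `files` and `chunks` so that recovery does not have to replay the whole log.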

READ
- client sends file name (and offset) to master
- master returns chunk handle and list of chunk servers
- client contacts a chunk server, which sends the data back
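
The read path above can be sketched roughly as follows. This assumes the client sends a byte offset along with the file name and that chunks are 64 MB, as described in the GFS paper; the `Master`, `ChunkServer`, and `client_read` names are hypothetical, and reads that cross a chunk boundary are not handled.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # chunk size from the GFS paper (64 MB)


class ChunkServer:
    def __init__(self):
        self.chunks = {}  # chunk handle -> bytes stored for that chunk

    def read(self, handle, start, length):
        return self.chunks[handle][start:start + length]


class Master:
    def __init__(self):
        self.files = {}      # file name -> list of chunk handles
        self.locations = {}  # chunk handle -> list of chunk server names

    def lookup(self, name, offset):
        # Map the byte offset to a chunk index, then to a handle and its servers.
        handle = self.files[name][offset // CHUNK_SIZE]
        return handle, self.locations[handle]


def client_read(master, servers, name, offset, length):
    handle, locations = master.lookup(name, offset)
    # Sketch: use the first listed server; a real client would likely pick a nearby one.
    server = servers[locations[0]]
    return server.read(handle, offset % CHUNK_SIZE, length)


cs1 = ChunkServer()
cs1.chunks[7] = b"hello world"
m = Master()
m.files["/logs/a"] = [7]
m.locations[7] = ["cs1"]
data = client_read(m, {"cs1": cs1}, "/logs/a", 6, 5)
```

The master only hands out metadata here; the bytes themselves travel directly between chunk server and client.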

WRITE
- if there is no primary, the master:
- finds up-to-date replicas
- picks primary (p) and secondaries (s)
- increments version #
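
A rough sketch of the no-primary case above, assuming each chunk server reports the version it holds and that "up to date" means matching the master's version number. The function name `choose_primary` and the dictionary layout are illustrative only.

```python
def choose_primary(chunk, reported_versions):
    """Pick a primary and secondaries among up-to-date replicas, then bump the version.

    chunk: dict with "version" (int) and "servers" (list of server names).
    reported_versions: dict mapping server name -> version that server holds.
    """
    # A replica is up to date only if its version matches the master's.
    up_to_date = [cs for cs in chunk["servers"]
                  if reported_versions.get(cs) == chunk["version"]]
    if not up_to_date:
        raise RuntimeError("no up-to-date replica available")

    # Sketch: first up-to-date replica becomes primary, the rest are secondaries.
    primary, secondaries = up_to_date[0], up_to_date[1:]
    chunk["version"] += 1
    chunk["primary"] = primary
    chunk["secondaries"] = secondaries
    return primary, secondaries


chunk = {"version": 3, "servers": ["cs1", "cs2", "cs3"]}
reported = {"cs1": 3, "cs2": 2, "cs3": 3}  # cs2 missed an update and is stale
primary, secondaries = choose_primary(chunk, reported)
```

Here `cs2` is excluded because it holds version 2 while the master expects 3; after the call the chunk's version is 4.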

problem of split brain
- network partition
- give the primary a lease (has a timer)
  - master knows who has the lease and can wait for it to expire
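
The lease idea can be sketched as below: a new primary is refused until the old lease has run out, so two primaries never overlap even if the old one is only partitioned away rather than dead. The `LeaseTable` class is hypothetical, the 60-second duration is the figure from the GFS paper, times are passed in explicitly, and clock skew is ignored.

```python
class LeaseTable:
    """Master-side record of which server holds the lease on each chunk."""

    def __init__(self, lease_seconds=60.0):
        self.lease_seconds = lease_seconds
        self.leases = {}  # chunk handle -> (primary name, expiration time)

    def grant(self, handle, server, now):
        current = self.leases.get(handle)
        if current is not None:
            holder, expiration = current
            if holder != server and now < expiration:
                # Another server may still believe it is primary: wait it out.
                return False
        self.leases[handle] = (server, now + self.lease_seconds)
        return True


def may_act_as_primary(expiration, now):
    # Primary-side check: stop serving writes once the lease has expired.
    return now < expiration


table = LeaseTable()
first = table.grant(7, "cs1", now=0.0)     # cs1 becomes primary until t=60
too_soon = table.grant(7, "cs2", now=30.0)  # refused: cs1's lease is still live
later = table.grant(7, "cs2", now=61.0)    # allowed: cs1's lease has expired
```

The cost of this design is availability: during a partition nobody can be primary until the timer runs out.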

these are secondaries
mostly appends
primary asks the secondaries if they can do the write
- only write if they promise they can
- what if primary crashes
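
One reading of the "ask first, write only on a promise" idea above, as a sketch. The `Replica` class and `primary_append` function are hypothetical, a replica's promise is reduced to a boolean health flag, and the primary-crash question raised above is not handled.

```python
class Replica:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = []  # records appended so far

    def can_append(self, record):
        # Sketch: a real replica would check disk space, version number, etc.
        return self.healthy

    def append(self, record):
        self.data.append(record)


def primary_append(primary, secondaries, record):
    replicas = [primary] + list(secondaries)
    # Phase 1: ask every replica whether it can do the append.
    if not all(r.can_append(record) for r in replicas):
        return False  # someone could not promise, so nobody writes
    # Phase 2: everyone promised, so apply the append everywhere.
    for r in replicas:
        r.append(record)
    return True


p, s1, s2 = Replica("p"), Replica("s1"), Replica("s2")
ok = primary_append(p, [s1, s2], b"rec1")
s2.healthy = False
failed = primary_append(p, [s1, s2], b"rec2")
```

After the second call no replica holds `rec2`, so the replicas stay identical; the price is an extra round of messages per append.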