Lecture 3: GFS

  • GFS published in 2003, MapReduce in 2004, Hadoop/HDFS in 2006

Big Storage

  • performance - sharding
  • faults - tolerance
  • tolerance - replication
  • replication - inconsistency
  • consistency - low performance
  • strong vs weak consistency

  • last write corrupt

GFS

  • big, fast
  • global
  • sharding
  • automatic recovery
  • single data center (really :O)
  • internal use
  • big sequential access (not random)
  • single master!
    • MapReduce has a single master too, but failure is so unlikely it's fine to rerun all operations

  • master data
    • file name
    • chunk handles
      • list of chunk servers (cs)
      • primary, version number (v)
      • lease expiration
    • LOG, CHECKPOINT, DISK
      • append to log efficiently
  • READ
    • client sends file name and offset to master
    • master returns chunk handle and list of servers
    • client asks a chunk server, which sends the data back
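
A minimal sketch of the master data and the READ path above, assuming everything lives in memory and in one process. The class and function names (`Master`, `ChunkInfo`, `ChunkServer`, `client_read`) are made up for these notes, and picking the first listed server is an arbitrary choice. The 64 MB chunk size is the one GFS uses.

```python
from dataclasses import dataclass, field
from typing import Optional

CHUNK_SIZE = 64 * 1024 * 1024  # GFS chunks are 64 MB


@dataclass
class ChunkInfo:
    servers: list                      # chunk servers (cs) holding a replica
    version: int = 0                   # version number (v)
    primary: Optional[str] = None      # current primary, if any
    lease_expiration: float = 0.0      # when the primary's lease runs out


@dataclass
class Master:
    files: dict = field(default_factory=dict)   # file name -> list of chunk handles
    chunks: dict = field(default_factory=dict)  # chunk handle -> ChunkInfo

    def lookup(self, name, offset):
        """Return (handle, servers) for the chunk covering a byte offset."""
        handle = self.files[name][offset // CHUNK_SIZE]
        return handle, list(self.chunks[handle].servers)


class ChunkServer:
    def __init__(self):
        self.data = {}  # chunk handle -> bytes stored for that chunk

    def read(self, handle, start, length):
        return self.data[handle][start:start + length]


def client_read(master, chunk_servers, name, offset, length):
    """Ask the master where the chunk is, then read from a chunk server.

    Assumption: the read stays inside one chunk; a read crossing a chunk
    boundary would need one lookup per chunk.
    """
    handle, servers = master.lookup(name, offset)
    cs = chunk_servers[servers[0]]  # hypothetical policy: first listed server
    return cs.read(handle, offset % CHUNK_SIZE, length)
```

The master only hands out locations; the data itself flows between client and chunk server, which keeps the single master off the data path.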

  • WRITE
    • if there is no primary, the master:
      • finds up-to-date replicas
      • picks primary (p) and secondaries (s)
      • increments version #
  • problem of split brain
    • network partition
    • give the primary a lease (has a timer)
      • master knows who has the lease and can wait for it to expire
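
A rough sketch of the no-primary steps and the lease rule above, from the master's side. `Chunk`, `choose_primary` and the `replica_versions` dict are hypothetical names for illustration. The 60-second lease is the initial timeout given in the GFS paper; time is passed in as `now` so the logic is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional

LEASE_SECONDS = 60.0  # initial lease timeout in the GFS paper


@dataclass
class Chunk:
    version: int = 0
    primary: Optional[str] = None
    lease_expiration: float = 0.0


def choose_primary(chunk, replica_versions, now):
    """Pick a primary for a chunk, or return None if that is not possible yet.

    replica_versions maps chunk server name -> version that server reports.
    Returns (primary, secondaries) on success.
    """
    if chunk.primary is not None and now < chunk.lease_expiration:
        # The old primary may still be serving clients behind a network
        # partition. Waiting out the lease avoids two primaries (split brain).
        return None
    # Only replicas at the master's current version are up to date.
    up_to_date = sorted(s for s, v in replica_versions.items()
                        if v == chunk.version)
    if not up_to_date:
        return None
    chunk.version += 1                      # increment version #
    for s in up_to_date:
        replica_versions[s] = chunk.version  # tell p and s the new version
    chunk.primary = up_to_date[0]           # hypothetical policy: first by name
    chunk.lease_expiration = now + LEASE_SECONDS
    return chunk.primary, up_to_date[1:]
```

A stale replica keeps its old version number, so it is filtered out the next time the master looks for up-to-date replicas.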

  • these are secondaries
  • mostly appends
  • primary asks secondaries if they can do it
    • only write if they promise they can
    • what if primary crashes?
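
The ask-then-write idea in the last bullets can be sketched as two phases. This illustrates the idea only and is not GFS's actual write protocol. `Secondary` and `primary_append` are hypothetical names, and the sketch assumes every secondary must promise before anything is written. It leaves the primary-crash question open: a crash between the two phases would leave secondaries holding a pending record.

```python
class Secondary:
    def __init__(self, can_write=True):
        self.can_write = can_write
        self.pending = None   # record promised but not yet applied
        self.log = []         # records actually applied

    def prepare(self, record):
        """Phase 1: promise to apply the record, or refuse."""
        if self.can_write:
            self.pending = record
        return self.can_write

    def commit(self):
        """Phase 2: apply the record promised in phase 1."""
        self.log.append(self.pending)
        self.pending = None

    def abort(self):
        """Drop a promised record without applying it."""
        self.pending = None


def primary_append(primary_log, secondaries, record):
    """Append only if every secondary promised it can do the write."""
    promised = [s for s in secondaries if s.prepare(record)]
    if len(promised) < len(secondaries):
        for s in promised:
            s.abort()         # someone refused: nobody writes
        return False
    primary_log.append(record)
    for s in secondaries:
        s.commit()
    return True
```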