Extra 5: Serverless, Coordination-free Distributed Computing, and the CALM Theorem

  • 4 parts
    • Serverless
    • Avoiding Coordination
    • CALM Theorem
    • Hydro

  • new platform, new language!
    • supercomputers with new programming paradigms!

  • how will people program the cloud!?
    • building a programming model is hard
    • distributed systems, consistency, and partial failure

  • more popular than Map Reduce

Serverless

  • fine-grained resource usage and efficiency
    • new economy models for cloud providers and users

  • autoscaling!
  • not unbounded distributed computing

  • pay per IO
    • can't do the "batch" to disks
  • No inbound network communication
    • makes distributed computing difficult
      • embarrassingly parallel work, like Map, is fine
      • Reduce isn't :(
        • communication heavy, like shuffles in Spark :'(
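A minimal sketch of why Map fits this model and Reduce does not, using a toy word count. The function names (`map_phase`, `shuffle`, `reduce_phase`) and the tiny dataset are made up for illustration, and everything runs in one process; on a real serverless platform each `map_phase` call could be its own function instance, while `shuffle` is the all-to-all exchange that needs instances to talk to each other.

```python
from collections import defaultdict

def map_phase(chunk):
    # Each chunk is counted on its own: no communication with other workers.
    counts = defaultdict(int)
    for word in chunk.split():
        counts[word] += 1
    return dict(counts)

def shuffle(partials, num_reducers):
    # All-to-all step: any mapper's output may hold keys for any reducer.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for partial in partials:
        for word, n in partial.items():
            # Deterministic toy partitioner (an assumption, not Spark's).
            buckets[sum(map(ord, word)) % num_reducers][word].append(n)
    return buckets

def reduce_phase(bucket):
    # Each reducer sums the counts routed to it.
    return {word: sum(ns) for word, ns in bucket.items()}

chunks = ["a b a", "b c", "a c c"]
partials = [map_phase(c) for c in chunks]   # embarrassingly parallel
buckets = shuffle(partials, num_reducers=2) # communication heavy
totals = {}
for bucket in buckets:
    totals.update(reduce_phase(bucket))
print(totals)
```

Without inbound networking, the shuffle has to go through external storage, which is where the pay-per-IO pricing hurts.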

Avoiding Coordination

  • How do you embrace state?
    • data gravity
    • consistency - hard :(

  • consistency over long distances is hard
    • split brain problem

  • coordination-based consistency is bad!

  • make consistency as small as possible

  • coordination has really bad tail latency
    • slowdown cascades
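A rough simulation of the tail-latency point, assuming a coordinated request cannot finish until its slowest participant replies. The exponential latency distribution, the participant count, and the helper names are all assumptions chosen for illustration, not measurements from the lecture.

```python
import random
import statistics

def p99(samples):
    # 99th percentile by sorting (good enough for a sketch).
    ordered = sorted(samples)
    return ordered[int(0.99 * (len(ordered) - 1))]

def simulate(num_participants, trials=10_000, seed=0):
    rng = random.Random(seed)
    # Latency of one coordinated request = the slowest participant's reply.
    return [
        max(rng.expovariate(1.0) for _ in range(num_participants))
        for _ in range(trials)
    ]

single = simulate(num_participants=1)
coordinated = simulate(num_participants=10)

print("median:", statistics.median(single), statistics.median(coordinated))
print("p99:   ", p99(single), p99(coordinated))
```

Under these assumptions both the median and the p99 get worse as participants are added, since every request inherits the worst straggler.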

  • instead reason about application semantics
    • rich application logic to READS and WRITES
    • formalize semantics!

CALM Theorem

  • Consistency as Logical Monotonicity
    • if a program is logically monotonic, it can be consistent without coordination!

  • program confluence: same outcome regardless of message order
  • only care about outcomes
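A small sketch of confluence, assuming messages can arrive in any order. `run_monotonic` only ever grows a set, while `run_overwrite` is an order-sensitive register included for contrast; both names and the message values are hypothetical.

```python
import itertools

def run_monotonic(messages):
    # State only grows (set union), so arrival order cannot matter.
    seen = set()
    for m in messages:
        seen |= m
    return frozenset(seen)

def run_overwrite(messages):
    # Each message replaces the state, so the last arrival wins.
    value = None
    for m in messages:
        value = m
    return value

messages = [frozenset({1}), frozenset({2, 3}), frozenset({3, 4})]
orders = list(itertools.permutations(messages))

monotonic_outcomes = {run_monotonic(order) for order in orders}
overwrite_outcomes = {run_overwrite(order) for order in orders}

print(len(monotonic_outcomes))  # one outcome across all orderings
print(len(overwrite_outcomes))  # one outcome per possible last message
```

The monotonic version gives the same answer on every replica with no agreement protocol; the overwrite version needs something to pick an order.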

  • distributed deadlock detection
    • checks for cycles in the waits-for graph
    • monotonic: only asks whether there exists a cycle
  • garbage collection
    • reference between objects on different machines
    • objects O5 and O6 are garbage
      • but machine 2 can't declare them garbage until it hears from machine 3
      • non-monotonic, so it requires coordination
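The contrast between the two examples can be sketched as follows. The graphs here are hypothetical (they are not the O1..O6 figure from the lecture), and `has_cycle` / `garbage` are illustrative helpers. A cycle found in a partial waits-for graph stays a cycle as more edges arrive, whereas a "garbage" verdict from a partial reference graph can be overturned by a reference held on another machine.

```python
def has_cycle(edges):
    # Depth-first search for a cycle in a directed graph.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()

    def dfs(node):
        if node in visiting:
            return True
        if node in done:
            return False
        visiting.add(node)
        found = any(dfs(n) for n in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(dfs(n) for n in list(graph))

def garbage(objects, refs, roots):
    # Objects not reachable from any root, given the references known so far.
    reachable, frontier = set(roots), list(roots)
    while frontier:
        node = frontier.pop()
        for a, b in refs:
            if a == node and b not in reachable:
                reachable.add(b)
                frontier.append(b)
    return set(objects) - reachable

# Deadlock detection: more information never retracts a "yes".
local_waits = [("T1", "T2"), ("T2", "T1")]
all_waits = local_waits + [("T3", "T1")]
print(has_cycle(local_waits), has_cycle(all_waits))

# Garbage collection: more information can retract a "yes".
objects = {"root", "A", "B", "C"}
local_refs = [("root", "A"), ("B", "C")]
all_refs = local_refs + [("A", "B")]  # reference learned from another machine
print(garbage(objects, local_refs, {"root"}), garbage(objects, all_refs, {"root"}))
```

The second check is a "for all" question (no reference exists anywhere), which is why a machine has to wait for everyone before answering.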

  • you can get crazy parallelism!
    • share nothing!
    • how to write in a logic language instead of a declarative language?
      • like SQL
      • maybe it'll be an internal language (IR) for a compiler
        • like databases too!
    • have our cake and eat it too!

Hydro: Stateful Serverless and Beyond

  • Anna: autoscaling multi-tier KVS
  • Cloudburst: stateful FaaS with caches
  • HydroLogic: an IR (doesn't depend on order)
  • Hydrolysis: a compiler

  • Anna be like Redis and S3
  • CALM consistency of simple lattices
    • autoscaling
    • best-of-conference
    • multi-tiered!
      • can be in fast memory
      • or slow persistent disk
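A minimal sketch of what "simple lattices" could look like, assuming each stored value carries a `merge` that is associative, commutative, and idempotent. The class names and tie-breaking rule are assumptions for illustration, not Anna's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MaxInt:
    value: int

    def merge(self, other):
        # Join = max, so the value only grows.
        return MaxInt(max(self.value, other.value))

@dataclass(frozen=True)
class SetUnion:
    items: frozenset

    def merge(self, other):
        # Join = union, so elements are only ever added.
        return SetUnion(self.items | other.items)

@dataclass(frozen=True)
class LastWriterWins:
    timestamp: int
    value: str

    def merge(self, other):
        # Higher timestamp wins; ties broken by value so merge stays commutative.
        return max(self, other, key=lambda r: (r.timestamp, r.value))

a, b, c = LastWriterWins(1, "x"), LastWriterWins(2, "y"), LastWriterWins(2, "z")
print(a.merge(b).merge(c))   # same result as the line below
print(c.merge(a).merge(b))
print(SetUnion(frozenset({1})).merge(SetUnion(frozenset({2}))))
```

Because merges can be applied in any order and any number of times, replicas can exchange updates lazily and still converge, which is what lets each thread or node share nothing.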

  • shared nothing at all scales and threads!!!
  • under contention, cache thrashing problems

  • autoscaling
    • roughly 350x better performance for the cost

  • robot motion planning

  • serverless jupyter
  • each cell is running a cloudburst lambda! :O

  • can handle all the memory pressure

  • sharing model state!

  • motion planning with lambdas

  • run compute, then share state

  • need a coordinating EC2 instance, which is a bottleneck

  • much quicker for the cost