Extra 7: CockroachDB, Spanner, MongoDB

timestamp 18:44
Amazon
- fork of MySQL or PostgreSQL
- custom storage on EBS
  - transaction level on EBS
Redis
- In memory
- single thread engine

CockroachDB

Whitepaper

Database Don't die
Open Source like Spanner
Distributed Database
- Decentralized Shared Nothing
- Log Structured Storage Architecture at individual nodes (RocksDB)
- Concurrency Control Model: Multi-Version OCC
- Serializable Isolation

multi-layer
- RocksDB storage manager (low level)
- Raft - replication and consensus
- key-value api
  - for pages to fetch data on

hybrid clock
- order transaction globally
transaction stage intent check conflict commits (?)

global keyspace
- key -> data
leader
- raft replication
instead of buffer manager -> disk (your classic single node database)
- get from distributed key value store system
FoundationDB does the same thing (distributed database)
engineering! done well, fault tolerant and all the edge cases!

Spanner

Cloud Spanner
Google wrote BigTable in 2006
- give up SQL
- give up joins
- column-based database
Adwords ran on sharded MySQL
- needed transactions
- Megastore

2011
geo-replicated
schematized, semi-relational ?
log structured on disk
strict 2PC MVCC Multi-Paxos 2PC
- Paxos groups
- External Consistency
  - global write_transaction synchronous replication
- lock-free read only transactions

joins are slow :'(
- have to go to another node/table
interleave
- single page efficient physical denormalization

wound-wait deadlock prevention
- don't need deadlock detection
- ordering throuhg unique timestamps from atomic clocks and GPS devices
tablets (shards)
- paxos - elect leader in tablet group
- 2PC for txn spanning tablets

completely physical wall-clock time
- necessary for linearizability
- paxos group decide order transaction commit

internal Google API
- time range

wait long enough
- then can commit
  - at commit + release locks

F1 has OCC
- now SQL
- Built for the ad system
Spanner SQL
- for everyone else

good benchmarks but expensive!

MongoDB

shared nothing document database (2007)
- json document
- now can do multi-action transactions

don't join

instead embed single document
pre join

the json example

no query optimizer
- just heuristics
instead generate a bunch of query plans
- execute all of them!
  - whichever is fastest return

shared nothing architecture
master slave replication
auto-sharding
- partition by hash or range
- automaticly split if a shard is too big
startups liked this, oh if I grow, it'll shard automaticly

mmap - OS storage manage
- changed
- storage backend is WiredTiger but could replace with RocksDB

avoid premature optimizations!
- based on months in the future
MySQL and Postgres should be fine for most cases
If need to ACTUALLY scale, you have money - instead find and pay smart people to help