- timestamp 18:44
- Amazon Aurora
- fork of MySQL or PostgreSQL
- custom storage on EBS
- Redis
- In memory
- single thread engine
- databases don't die
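The in-memory, single-threaded model above can be sketched in a few lines (the class and command set here are hypothetical, not Redis's actual implementation, which is a C event loop multiplexing sockets):

```python
# Minimal sketch of a single-threaded, in-memory key-value engine.
# Hypothetical names; real Redis is far more involved.

class MiniKV:
    def __init__(self):
        self.store = {}  # everything lives in memory

    def execute(self, cmd, *args):
        # A single thread executes one command at a time, so each command
        # is atomic with respect to all others -- no locks needed.
        if cmd == "SET":
            key, val = args
            self.store[key] = val
            return "OK"
        if cmd == "GET":
            return self.store.get(args[0])
        if cmd == "DEL":
            return 1 if self.store.pop(args[0], None) is not None else 0
        raise ValueError(f"unknown command {cmd}")

kv = MiniKV()
kv.execute("SET", "x", "1")
print(kv.execute("GET", "x"))  # prints: 1
```

Because there is only one thread touching `store`, the engine gets atomicity for free; the trade-off is that one slow command blocks everything behind it.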
- CockroachDB: open source, Spanner-inspired
- Distributed Database
- Decentralized Shared Nothing
- Log Structured Storage Architecture at individual nodes (RocksDB)
- Concurrency Control Model: Multi-Version OCC
- Serializable Isolation
- multi-layer
- RocksDB storage manager (low level)
- Raft - replication and consensus
- key-value API
- SQL rows and indexes are encoded as keys; data is fetched through the key-value layer rather than through pages
- hybrid logical clocks (HLC)
- order transactions globally
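The hybrid-clock idea can be sketched as a hybrid logical clock: a physical timestamp plus a logical counter, merged on every message so causally related events stay ordered even when wall clocks are skewed. This is a simplified take on the HLC algorithm; class and method names are my own:

```python
import time

class HLC:
    """Sketch of a hybrid logical clock: (wall_time_ms, logical_counter).
    Timestamps compare as tuples, giving a global total order."""

    def __init__(self):
        self.wall = 0     # max physical time seen so far (ms)
        self.logical = 0  # tie-breaker within one millisecond

    def now(self):
        pt = int(time.time() * 1000)
        if pt > self.wall:
            self.wall, self.logical = pt, 0
        else:
            self.logical += 1  # physical clock hasn't advanced
        return (self.wall, self.logical)

    def update(self, remote):
        """Merge a timestamp received from another node so that our next
        timestamp is greater than both clocks (causality preserved)."""
        pt = int(time.time() * 1000)
        rw, rl = remote
        if pt > self.wall and pt > rw:
            self.wall, self.logical = pt, 0
        elif rw > self.wall:
            self.wall, self.logical = rw, rl + 1
        elif rw == self.wall:
            self.logical = max(self.logical, rl) + 1
        else:
            self.logical += 1
        return (self.wall, self.logical)
```

Each node stamps transactions with `now()` and calls `update()` on incoming messages; tuple comparison then yields a consistent global order.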
- transactions stage writes as provisional intents, check for conflicts, then commit
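That intent-based flow can be sketched as follows (hypothetical API, not CockroachDB's actual code): a writer leaves a provisional "intent" on each key; a conflicting writer sees the intent and must wait or abort; commit promotes the intents to real values.

```python
# Sketch of intent-based writes: data + a side map of uncommitted intents.

class IntentStore:
    def __init__(self):
        self.data = {}     # key -> committed value
        self.intents = {}  # key -> (txn_id, provisional value)

    def write(self, txn_id, key, value):
        owner = self.intents.get(key)
        if owner and owner[0] != txn_id:
            # Another transaction has a pending intent here: conflict.
            raise RuntimeError(f"conflict: {key} held by txn {owner[0]}")
        self.intents[key] = (txn_id, value)

    def commit(self, txn_id):
        # Promote this transaction's intents to committed values.
        for key, (owner, value) in list(self.intents.items()):
            if owner == txn_id:
                self.data[key] = value
                del self.intents[key]

    def abort(self, txn_id):
        for key, (owner, _) in list(self.intents.items()):
            if owner == txn_id:
                del self.intents[key]

s = IntentStore()
s.write("t1", "k", "v1")
s.commit("t1")  # "k" -> "v1" now visible to everyone
```

The real system also versions values (MVCC) and resolves intents across nodes; this only shows the conflict-detection shape.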
- global keyspace
- leader (one Raft leader per range coordinates writes)
- instead of buffer manager -> disk (your classic single node database)
- get from distributed key value store system
- FoundationDB does the same thing (distributed database)
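The contrast in the bullets above can be sketched directly: instead of `buffer_manager.fetch(page)` going to local disk, the SQL layer encodes a row's location as a key and issues a get against a distributed store. `DistributedKV` here is a hypothetical stand-in, not a real client API:

```python
# Sketch: SQL execution on top of a distributed key-value store.

class DistributedKV:
    """Pretend network client; in a real system each get() would route to
    whichever node owns the key's range."""
    def __init__(self, shards):
        self.shards = shards  # one dict per node

    def _node(self, key):
        return self.shards[hash(key) % len(self.shards)]

    def put(self, key, value):
        self._node(key)[key] = value

    def get(self, key):
        return self._node(key).get(key)

class SQLLayer:
    def __init__(self, kv):
        self.kv = kv

    def read_row(self, table, pk):
        # No pages, no local buffer pool: the row's address is a key.
        return self.kv.get(f"/{table}/{pk}")

kv = DistributedKV([{}, {}, {}])
kv.put("/users/42", {"name": "alice"})
sql = SQLLayer(kv)
print(sql.read_row("users", 42))  # prints: {'name': 'alice'}
```

The hard part the notes call "engineering" is everything this sketch omits: replication, failover, and retries behind that one `get()`.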
- engineering! done well: fault tolerant, with all the edge cases handled
- Cloud Spanner
- Google wrote BigTable in 2006
- give up SQL
- give up joins
- column-based database
- AdWords ran on sharded MySQL
- needed transactions
- Megastore
- 2011
- geo-replicated
- schematized, semi-relational data model
- log structured on disk
- strict 2PL + MVCC; Multi-Paxos for replication; 2PC across Paxos groups
- Paxos groups
- External Consistency
- global write transactions with synchronous replication
- lock-free read only transactions
- joins are slow :'(
- have to go to another node/table
- interleave
- parent and child rows stored physically together: efficient physical denormalization
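The interleaving trick can be sketched with tuple keys in a sorted key-value map (an assumed encoding for illustration; Spanner's actual on-disk format differs): a child row's key is prefixed by its parent's key, so parent and children sort adjacently and the "join" becomes one contiguous scan.

```python
# Sketch: interleaved tables as key-prefix colocation in a sorted store.
# Table/column names are made up.

rows = {
    ("Customers", 1): {"name": "alice"},
    ("Customers", 1, "Orders", 100): {"total": 5},
    ("Customers", 1, "Orders", 101): {"total": 9},
    ("Customers", 2): {"name": "bob"},
    ("Customers", 2, "Orders", 200): {"total": 3},
}

def scan_customer(rows, cust_id):
    # A parent and all its orders share the ("Customers", id) key prefix,
    # so they are adjacent in sorted order: one range scan, no join.
    prefix = ("Customers", cust_id)
    return [v for k, v in sorted(rows.items()) if k[:2] == prefix]

print(scan_customer(rows, 1))
```

Without interleaving, the orders would live in a separate key range, and fetching a customer plus their orders would mean hopping to another node.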
- wound-wait deadlock prevention
- don't need deadlock detection
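The wound-wait rule fits in one function: on a lock conflict, compare the two transactions' unique start timestamps. An older requester "wounds" (aborts) the younger holder; a younger requester simply waits. Since waiting only ever goes from younger to older, wait cycles (deadlocks) cannot form.

```python
# Sketch of wound-wait deadlock prevention. Lower timestamp = older txn.

def on_conflict(requester_ts, holder_ts):
    """Decide what the requesting transaction does when the lock is held."""
    if requester_ts < holder_ts:  # requester is older
        return "wound"            # abort the younger lock holder
    else:                         # requester is younger
        return "wait"             # safe: younger waits on older, no cycle

print(on_conflict(5, 9))  # prints: wound
print(on_conflict(9, 5))  # prints: wait
```

This is why the notes say no deadlock detector is needed: prevention replaces detection.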
- ordering through unique timestamps from atomic clocks and GPS devices (TrueTime)
- tablets (shards)
- Paxos: elect a leader in each tablet group
- 2PC for txn spanning tablets
- completely physical wall-clock time
- necessary for linearizability
- Paxos groups decide transaction commit order
- commit wait: wait out the clock-uncertainty interval
- then it is safe to commit
- at commit, release locks
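The commit-wait step can be sketched like this. `EPSILON` is a made-up constant standing in for TrueTime's clock-uncertainty bound (in reality it comes from the GPS/atomic-clock infrastructure), and `tt_now()` returns the `[earliest, latest]` interval the "wait long enough" rule is based on:

```python
import time

# Sketch of Spanner-style commit wait (heavily simplified).

EPSILON = 0.002  # assumed 2 ms clock uncertainty

def tt_now():
    t = time.time()
    return (t - EPSILON, t + EPSILON)  # [earliest, latest]

def commit(release_locks):
    commit_ts = tt_now()[1]  # a timestamp >= every node's current "now"
    # Commit wait: hold the locks until commit_ts is definitely in the
    # past on all nodes, then release and make the txn visible.
    while tt_now()[0] < commit_ts:
        time.sleep(0.0005)
    release_locks()
    return commit_ts

released = []
ts = commit(lambda: released.append(True))
```

The wait costs roughly `2 * EPSILON` of latency per commit, which is why shrinking clock uncertainty with dedicated hardware matters so much for Spanner's throughput.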
- F1 has OCC
- now supports SQL
- Built for the ad system
- Spanner SQL
- good benchmarks but expensive!
- MongoDB: shared-nothing document database (2007)
- json document
- now can do multi-action transactions
- instead, embed related data in a single document
- "pre-joined"
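The embedding idea in plain terms: rather than normalizing orders into their own table and joining at query time, the document holds its related data inline. A sketch with made-up fields:

```python
# Sketch: a "pre-joined" document -- related rows embedded in the parent.

customer = {
    "_id": 1,
    "name": "alice",
    "orders": [  # embedded array: no join needed at read time
        {"order_id": 100, "total": 5},
        {"order_id": 101, "total": 9},
    ],
}

# One single-document read answers "customer + all their orders":
totals = [o["total"] for o in customer["orders"]]
print(sum(totals))  # prints: 14
```

The trade-off: reads that follow the embedding are fast, but data shared across documents gets duplicated, and cross-document operations need multi-document transactions.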
- no query optimizer
- instead generate a bunch of query plans
- execute all of them!
- return whichever finishes fastest
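The plan-racing idea can be sketched as follows (simplified: the candidates run sequentially here and the data and plan functions are made up; MongoDB actually trialed candidates in interleaved rounds and cached the winner):

```python
import time

# Sketch: generate candidate plans, time them, keep the fastest.

DATA = [{"k": i % 10, "v": i} for i in range(1000)]
INDEX = {}
for row in DATA:
    INDEX.setdefault(row["k"], []).append(row)

def full_scan(q):
    # Candidate plan 1: scan every row.
    return [row for row in DATA if row["k"] == q]

def index_lookup(q):
    # Candidate plan 2: probe a prebuilt index.
    return INDEX.get(q, [])

def race(plans, query):
    best = None  # (elapsed, plan name, result)
    for plan in plans:
        start = time.perf_counter()
        result = plan(query)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best[0]:
            best = (elapsed, plan.__name__, result)
    return best[1], best[2]

winner, rows = race([full_scan, index_lookup], 3)
```

Racing sidesteps cost modeling entirely: no statistics, no cardinality estimates, just empirical timing. The cost is wasted work executing the losing plans.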
- shared nothing architecture
- master slave replication
- auto-sharding
- partition by hash or range
- automatically split if a shard is too big
- startups liked this: "if I grow, it'll shard automatically"
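Range-based auto-sharding with splits can be sketched like this (hypothetical and heavily simplified; a tiny size limit so the example actually splits): each shard owns a key range, and when it grows past the limit it splits at its median key.

```python
# Sketch: range partitioning with automatic shard splits.

MAX_KEYS = 4  # tiny limit for demonstration

def insert(shards, key, value):
    # shards: list of {"lo": start_key, "keys": {k: v}}, sorted by "lo".
    for shard in reversed(shards):
        if key >= shard["lo"]:  # last shard whose range covers the key
            shard["keys"][key] = value
            if len(shard["keys"]) > MAX_KEYS:
                split(shards, shard)
            return

def split(shards, shard):
    # Split at the median key; the right half becomes a new shard.
    ordered = sorted(shard["keys"])
    mid = ordered[len(ordered) // 2]
    right = {"lo": mid,
             "keys": {k: v for k, v in shard["keys"].items() if k >= mid}}
    shard["keys"] = {k: v for k, v in shard["keys"].items() if k < mid}
    shards.insert(shards.index(shard) + 1, right)

shards = [{"lo": 0, "keys": {}}]
for i in range(10):
    insert(shards, i, str(i))
print(len(shards))  # prints: 4 -- the single shard split automatically
```

Real systems also have to move the split-off half to another node and update routing metadata, which is where much of the operational pain lives.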
- originally used mmap: let the OS manage storage
- later changed
- storage backend is now WiredTiger, but could be replaced (e.g., with RocksDB)
- avoid premature optimization!
- don't architect for scale you only project months into the future
- MySQL and Postgres should be fine for most cases
- if you ACTUALLY need to scale, you have money: find and pay smart people to help instead