Extra 1: Scalability Harvard CS 75

  • VPS Service
    • for hosting
    • Amazon Web Services :)

Vertical Scaling

  • Just throw a bigger computer at the problem
  • more RAM, more cores, more storage space
  • SAS and SSD drives for write-heavy workloads like databases

Horizontal Scaling

  • a bunch of cheaper slower machines

Load Balancer

  • load balancer has its own IP address
    • request goes from client to load balancer to server 1
    • could keep a full copy of everything on each server
    • could instead split by function, e.g. images.host.com, with a dedicated server for each part

  • could also load balance at our DNS server, returning a different server's IP address for each lookup
  • Round Robin: to server 1, to server 2, to server 3, then back to server 1
  • What if server 1 gets the heavyweight users?
    • Let the load balancer decide based on server load instead of round robin

  • Can use an AWS load balancer:
    • Application Load Balancer (ALB), Network Load Balancer (NLB)
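
The difference between round robin and letting the load balancer decide can be sketched in a few lines of Python. This is only an illustration: the class names and server names are made up, and "fewest open connections" is one assumed way for the balancer to decide, not necessarily what any particular product does.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring how busy they are."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Picks the server that currently has the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes.
        self.active[server] -= 1


rr = RoundRobinBalancer(["server1", "server2", "server3"])
rr_picks = [rr.pick() for _ in range(4)]  # server1, server2, server3, server1

lc = LeastConnectionsBalancer(["server1", "server2", "server3"])
first = lc.pick()    # server1 takes a long-running request
second = lc.pick()   # server2
third = lc.pick()    # server3
lc.release(second)   # server2 finishes first
fourth = lc.pick()   # server2 again, because server1 is still busy
```

Round robin would have sent the fourth request back to server 1 even though it is still busy with its heavyweight user.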

Sticky Sessions

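  • session cookies: the load balancer uses a cookie to keep sending the same client to the same server, where its session lives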
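
A rough sketch of sticky sessions, assuming the load balancer sets its own cookie: the cookie name `backend_id` and the `StickyBalancer` class are hypothetical, and cookies are modelled as plain dicts rather than real HTTP headers.

```python
from itertools import cycle


class StickyBalancer:
    COOKIE = "backend_id"  # hypothetical cookie name

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(range(len(self.servers)))

    def route(self, cookies):
        """Return (server, cookies to set) for one incoming request."""
        raw = cookies.get(self.COOKIE)
        if raw is not None and raw.isdigit() and int(raw) < len(self.servers):
            # Returning visitor: keep them on the server holding their session.
            return self.servers[int(raw)], {}
        # New visitor: assign a server round robin and remember the choice.
        index = next(self._rotation)
        return self.servers[index], {self.COOKIE: str(index)}


lb = StickyBalancer(["server1", "server2"])
server_a, set_cookie = lb.route({})    # new visitor, gets a cookie
server_b, _ = lb.route(set_cookie)     # same visitor comes back
other, _ = lb.route({})                # a different new visitor
```

The cookie stores an opaque index rather than the server's address, so the internal layout is not exposed to the client.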


Shared Storage

  • RAID 0: 2 hard drives, data striped across drive 1 then drive 2 (fast, but no redundancy)
  • RAID 1: 2 hard drives, mirror data, if one dies, you still have a copy of your data
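  • RAID 5: 3 or more drives (e.g. 5), data striped across them with 1 drive's worth of space used for redundancy, so 1 drive can die
  • RAID 6: like RAID 5 but with 2 drives' worth of redundancy, so 2 hard drives can die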
  • NFS (Network File System), a distributed file system
  • What if you trip over a power cord?
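
How RAID 5 survives a dead drive can be illustrated with XOR parity. This is a toy sketch under simplifying assumptions: each "drive" is a short byte string, and the parity sits on one dedicated drive, whereas real RAID 5 rotates parity across all the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data drives plus one parity drive (5 drives total).
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Drive 3 (index 2) dies. XOR the survivors with the parity to rebuild it.
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)
# rebuilt equals the lost block b"CCCC"
```

Losing a second drive before the rebuild finishes would lose data, which is the gap RAID 6 covers.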


Caching

  • serve a static HTML file instead of generating the page dynamically with PHP
  • problem: changing the style would mean regenerating hundreds of thousands of HTML files
  • MySQL query cache: caches the results of identical queries

  • memcached: memory cache, stores in RAM
    • used extensively by Facebook

Data Replication

  • avoid single points of failure

  • Master-Master

  • High availability with Heartbeat Master-Master

  • Load Balancing + Replication
  • heartbeat
    • Active/Passive
    • if server dies, heartbeat ends, Passive becomes Active
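
The Active/Passive handoff can be sketched as a timeout check running on the passive node. This is a simplified model with assumed names (`FailoverMonitor`, `timeout_seconds`); timestamps are passed in explicitly so the logic is easy to follow, and a real setup would also have to take over the active node's IP address.

```python
class FailoverMonitor:
    """Runs on the passive node and promotes it when heartbeats stop."""

    def __init__(self, timeout_seconds, now):
        self.timeout = timeout_seconds
        self.last_heartbeat = now
        self.role = "passive"

    def on_heartbeat(self, now):
        # Called each time a heartbeat arrives from the active node.
        self.last_heartbeat = now

    def check(self, now):
        # Too long without a heartbeat: assume the active node died.
        if self.role == "passive" and now - self.last_heartbeat > self.timeout:
            self.role = "active"
        return self.role


monitor = FailoverMonitor(timeout_seconds=3.0, now=0.0)
monitor.on_heartbeat(now=1.0)
monitor.on_heartbeat(now=2.0)
role_while_alive = monitor.check(now=4.0)   # 2s of silence: still passive
role_after_death = monitor.check(now=6.0)   # 4s of silence: becomes active
```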

  • Partition the data, with slave servers replicating each partition
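
One way to picture partitioning combined with replication: choose a partition from the data itself, send writes to that partition's master, and spread reads over its slaves. The split by first letter of the last name and all server names below are assumptions made up for this sketch.

```python
import random

# Hypothetical layout: two partitions, each with one master and two slaves.
PARTITIONS = {
    "a_to_m": {"master": "db1-master", "slaves": ["db1-slave1", "db1-slave2"]},
    "n_to_z": {"master": "db2-master", "slaves": ["db2-slave1", "db2-slave2"]},
}


def partition_for(last_name):
    """Pick a partition from the first letter of the last name."""
    first = last_name[:1].upper()
    return "a_to_m" if "A" <= first <= "M" else "n_to_z"


def server_for(last_name, is_write):
    """Writes go to the partition's master, reads to one of its slaves."""
    partition = PARTITIONS[partition_for(last_name)]
    if is_write:
        return partition["master"]
    return random.choice(partition["slaves"])


write_target = server_for("Adams", is_write=True)    # db1-master
read_target = server_for("Young", is_write=False)    # db2-slave1 or db2-slave2
```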

Layers of Replication and Redundancy

  • Multiple Load Balancers
  • Multiple Backend Servers
  • Multiple Load Balancers to Databases
  • Multiple Databases
  • Cross Connected
  • Multiple Network Switches
  • In a single Datacenter
    • AWS Availability Zone
      • Then different regions, East, West, Asia, Europe
  • Geography-based load balancing can be done at the DNS level, with DNS acting as the load balancer
  • Avoid Single Points of Failure!


  • want TCP and SSL connections to end at the load balancer
    • behind the load balancer, everything travels over plain HTTP instead