Chapter 9: System Design and Scalability
Networking Metrics
-Latency -Bandwidth -Throughput
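Algorithms that scale: step-by-step
1) Ask questions 2) Make believe: assume away the limits (for example, that all the data fits on one machine) and solve that case 3) Get real: figure out how to do this reasonably once the data and work must be split across machines 4) Solve problems: for each potential issue, either remove it or mitigate it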
Design Outline for systems
1) Scope the problem 2) Make reasonable assumptions 3) Draw the major components 4) Identify the key issues 5) Redesign for the key issues
Load Balancer
Distributes incoming requests across the nodes/servers so that no single server is overloaded, crashes, and takes down the whole system. The servers behind it usually hold cloned data (redundancy), so any of them can handle a request.
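One simple way to spread requests is round-robin rotation. The sketch below is a minimal illustration, not how any particular load balancer is implemented; the class name `RoundRobinBalancer`, its methods, and the server names are all hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, skipping ones marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._rotation = cycle(self.servers)

    def mark_down(self, server):
        # Stop routing to a server that has failed.
        self.healthy.discard(server)

    def mark_up(self, server):
        # Resume routing to a known server once it has recovered.
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # One full pass over the rotation is enough to find a healthy server.
        for _ in range(len(self.servers)):
            server = next(self._rotation)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [balancer.pick() for _ in range(4)]

balancer.mark_down("app-2")  # simulate one server crashing
after_failure = [balancer.pick() for _ in range(4)]
```

Because the remaining servers are assumed to hold the same data, requests keep being served after `app-2` goes down.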
Considerations
-Failures -Availability and reliability -Read-heavy vs write-heavy -Security
Latency
How long it takes for data to go from one end to the other. The delay between the sender and the receiver.
Bandwidth
Maximum amount of data that can be transferred in a unit of time
Vertical Partitioning
Partitioning by feature. Example: One partition has messages, another has profiles, etc
Cache
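Small, fast store (often an in-memory key-value store) that sits close to the application, between it and the slower data source, so rapid lookup is possible. If it misses, it requests the data from the server or hard drive and keeps a copy for next time.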
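The hit/miss behavior can be sketched with a small least-recently-used cache. LRU eviction is an assumption here (the notes do not name an eviction policy), and `LRUCache`, `load_from_store`, and `slow_store` are hypothetical names used only for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps the most recently used entries; loads from the store on a miss."""

    def __init__(self, capacity, load_from_store):
        self.capacity = capacity
        self.load_from_store = load_from_store  # fallback to the slow source
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        # Miss: fetch from the slower store and remember the result.
        self.misses += 1
        value = self.load_from_store(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return value


# A dict stands in for the slow server or disk.
slow_store = {"user:1": "Ada", "user:2": "Grace", "user:3": "Edsger"}
cache = LRUCache(capacity=2, load_from_store=slow_store.__getitem__)

cache.get("user:1")  # miss, loaded from slow_store
cache.get("user:1")  # hit
cache.get("user:2")  # miss
cache.get("user:3")  # miss, evicts "user:1" because capacity is 2
```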
Throughput
The actual amount of data that is transferred in a given time
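A rough worked example of how the three metrics relate, under a deliberately simplified model: one transfer costs the one-way latency plus the payload size divided by the bandwidth. The function names and the numbers are illustrative assumptions, and real networks add protocol overhead, congestion, and retransmissions.

```python
def transfer_time_seconds(size_bits, bandwidth_bps, latency_s):
    # Simplified model: fixed delay plus time to push the bits onto the link.
    return latency_s + size_bits / bandwidth_bps


def throughput_bps(size_bits, bandwidth_bps, latency_s):
    # Throughput is what was actually delivered per second of elapsed time.
    return size_bits / transfer_time_seconds(size_bits, bandwidth_bps, latency_s)


size = 8_000_000         # a 1 MB payload, expressed in bits
bandwidth = 100_000_000  # a 100 Mbit/s link
latency = 0.02           # 20 ms delay

elapsed = transfer_time_seconds(size, bandwidth, latency)
achieved = throughput_bps(size, bandwidth, latency)
```

With these numbers the transfer takes about 0.1 s, so the achieved throughput is about 80 Mbit/s, below the 100 Mbit/s bandwidth.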
Denormalization
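To add redundant information to a database to speed up reads. Joins in a large database get extremely slow, so you might want to denormalize it and copy commonly read information (like a name stored in a related table) into the table being queried. The cost is that every copy must be kept in sync on writes.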
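A small sketch of the idea using Python's built-in `sqlite3` module. The `projects` and `tasks` tables and the `project_name` column are a made-up schema for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        project_id INTEGER REFERENCES projects(id),
        title TEXT,
        project_name TEXT  -- redundant copy of projects.name
    );
    INSERT INTO projects VALUES (1, 'Website');
    INSERT INTO tasks VALUES (10, 1, 'Fix login', 'Website');
""")

# Normalized read: needs a join to find the project name.
normalized = conn.execute(
    "SELECT t.title, p.name FROM tasks t JOIN projects p ON p.id = t.project_id"
).fetchall()

# Denormalized read: the name is already on the task row, no join needed.
denormalized = conn.execute(
    "SELECT title, project_name FROM tasks"
).fetchall()
```

Both queries return the same rows. The trade-off is that renaming a project now means updating `projects.name` and every matching `tasks.project_name`.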
Horizontal Scaling
To increase the number of nodes
Vertical Scaling
To increase the resources on one node
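Key-based (or hash-based) partitioning
Uses some part of the data (for example, an ID) to choose the partition, commonly hash(key) mod N for N servers. The drawback is that the number of servers is effectively fixed: adding a server means reallocating most of the data.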
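A minimal sketch of hash-based placement, and of why changing the server count is expensive. The helper `server_for` and the key format are hypothetical; SHA-256 is used only because it gives the same result on every run, unlike Python's built-in `hash` for strings.

```python
import hashlib


def server_for(key, num_servers):
    """Map a key to a server index with hash(key) mod N."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


keys = [f"user:{i}" for i in range(1000)]

before = {k: server_for(k, 4) for k in keys}  # 4 servers
after = {k: server_for(k, 5) for k in keys}   # one server added

# Count how many keys now live on a different server.
moved = sum(1 for k in keys if before[k] != after[k])
```

Going from 4 to 5 servers relocates the large majority of keys, which is why the server count is treated as effectively fixed under this scheme.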
MapReduce
Uses a Map step and a Reduce step. The Map step takes in some data and emits <key, value> pairs. The Reduce step takes a key and its associated values and "reduces" them in some way, emitting a new key and value.
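The two steps can be sketched on a single machine with a word count. This is only an illustration of the data flow, not a distributed implementation; the function names and the explicit `shuffle` grouping step between Map and Reduce are assumptions of this sketch.

```python
from collections import defaultdict


def map_step(document):
    # Emit a <word, 1> pair for every word in the document.
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    # Group all emitted values by key so each key is reduced once.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_step(key, values):
    # Collapse the values for one key into a single <word, count> pair.
    return key, sum(values)


def map_reduce(documents):
    pairs = (pair for doc in documents for pair in map_step(doc))
    return dict(reduce_step(k, v) for k, v in shuffle(pairs).items())


counts = map_reduce(["the cat sat", "the dog sat", "the end"])
```

In a real cluster the Map calls run in parallel on different machines, and the grouping step routes each key to the machine that reduces it.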