// Use cases
User: connects to a financial account
Service: extracts transactions from the account
Service: recommends a budget
Service: high availability
// Out of scope
Service performs additional logging and analytics
// Assumptions
Traffic is not evenly distributed
Automatically update accounts that have been active within the last 30 days
Asynchronous budget notifications
10M users
30M financial accounts
100M budget items = 10 budget categories per user
50K sellers (used to determine a transaction's category)
5b transactions / month
500M reads / month
10:1 write to read ratio (write-heavy)
1.2. Calculation
8B user_id
5B created_at
32B seller
5B amount
Total size = ~50B
250 GB of transaction content per month = 50 B * 5b transactions
9 TB of content in 3 years = 250 GB * 36 months
2,000 transactions / second
200 reads / second
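A quick sanity check of the numbers above as a Python sketch (a month is approximated as 2.5 million seconds):

# Back-of-the-envelope check of the estimates above.
BYTES_PER_TXN = 8 + 5 + 32 + 5             # user_id + created_at + seller + amount ≈ 50 B
TXN_PER_MONTH = 5_000_000_000              # 5b transactions / month
READS_PER_MONTH = 500_000_000              # 500M reads / month
SECONDS_PER_MONTH = 2_500_000              # ~30 days, rounded

monthly_bytes = BYTES_PER_TXN * TXN_PER_MONTH        # ~250 GB / month
three_year_bytes = monthly_bytes * 36                 # ~9 TB in 3 years
writes_per_sec = TXN_PER_MONTH / SECONDS_PER_MONTH    # ~2,000 / s
reads_per_sec = READS_PER_MONTH / SECONDS_PER_MONTH   # ~200 / s
print(monthly_bytes / 1e9, three_year_bytes / 1e12, writes_per_sec, reads_per_sec)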
1.3. Design
1.4. Core
1.4.1. Use case: User connects to a financial account
Store info for the 10M users in a RDBMS
Client -> Web Server
1. -> Accounts API server
(1) update newly entered account info
// accounts table
id int NOT NULL AUTO_INCREMENT
created_at datetime NOT NULL
modified_at datetime NOT NULL
user_id int NOT NULL
PRIMARY KEY(id)
FOREIGN KEY(user_id) REFERENCES users(id)
Index on id, user_id, and created_at // speeds up lookups to O(log n) instead of O(n)
1.4.2. Use case: Service extracts transactions from the account
Triggered when the user first links the account, manually refreshes, or by the daily automatic update for active users
Client -> Web Server -> Accounts API server -> Queue (Amazon SQS or RabbitMQ) + async
1. -> Transaction Extraction Service
(1) pull from queue & extracts transactions
(2) save raw log files in the Object Store
2. -> Category Service
(1) categorize each transaction
(2) update SQL transactions table
3. -> Budget Service
(1) calculate aggregate monthly spending by category
(2) update SQL monthly_spending table
(3) -> Notification Service: if budget exceeded -> send notification (via Queue)
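A minimal in-memory sketch of steps 2 and 3 above (categorize by seller, roll up monthly spending, emit a notification event when a budget is exceeded). The seller-to-category map, the budgets dict, and the list standing in for the notification Queue are illustrative assumptions, not part of the design itself:

from collections import defaultdict

SELLER_CATEGORY = {"exxon": "gas", "target": "shopping"}   # assumed sample of the ~50K sellers

def process_transactions(transactions, budgets, notify_queue):
    # transactions: dicts with user_id, created_at (datetime), seller, amount
    # budgets: {(user_id, category): monthly limit}; notify_queue: list as a stand-in queue
    monthly_spending = defaultdict(float)        # (user_id, category, "YYYY-MM") -> amount
    for txn in transactions:
        category = SELLER_CATEGORY.get(txn["seller"], "other")       # 2. categorize
        month = txn["created_at"].strftime("%Y-%m")
        key = (txn["user_id"], category, month)
        monthly_spending[key] += txn["amount"]                        # 3. aggregate by category
        limit = budgets.get((txn["user_id"], category))
        if limit is not None and monthly_spending[key] > limit:       # exceeded budget
            notify_queue.append({"user_id": txn["user_id"],
                                 "category": category, "month": month})
    return monthly_spending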
// transaction table
id int NOT NULL AUTO_INCREMENT
created_at datetime NOT NULL
seller varchar(32) NOT NULL
amount decimal NOT NULL
user_id int NOT NULL
PRIMARY KEY(id)
FOREIGN KEY(user_id) REFERENCES users(id)
// monthly_spending table
id int NOT NULL AUTO_INCREMENT
month_year date NOT NULL
category varchar(32)
amount decimal NOT NULL
user_id int NOT NULL
PRIMARY KEY(id)
FOREIGN KEY(user_id) REFERENCES users(id)
Budget templates are based on income tiers
Custom budgets are stored in a budget_overrides table
[1] SOLUTION-1: run SQL queries on the transactions table to generate the monthly_spending aggregate table (far fewer rows)
[2] SOLUTION-2: MapReduce keyed on (user, category, month) -> summed amount (sketched below)
[3] Running analyses on the raw transaction files instead of the database could significantly reduce the load on the database.
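A sketch of SOLUTION-2 using the mrjob library, assuming the raw log files in the Object Store contain tab-separated lines of user_id, created_at, seller, amount; the line format and the small seller-to-category map are assumptions:

from mrjob.job import MRJob

SELLER_CATEGORY = {"exxon": "gas", "target": "shopping"}

class MRSpendingByCategory(MRJob):

    def mapper(self, _, line):
        # e.g. line = "42\t2024-03-05\texxon\t30.00"
        user_id, created_at, seller, amount = line.split("\t")
        month = created_at[:7]                              # "YYYY-MM"
        yield (user_id, SELLER_CATEGORY.get(seller, "other"), month), float(amount)

    def reducer(self, key, amounts):
        yield key, sum(amounts)                             # (user, category, month) -> total

if __name__ == "__main__":
    MRSpendingByCategory.run()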
1.5. Scale
1.5.1. Recent transactions in Cache
Client -> Web Server -> Read API
-> Object Store (S3, CDN) static content
-> Cache content or SQL
[1] Analytics: move the monthly_spending aggregate data from the SQL DB to a warehouse such as Amazon Redshift or Google BigQuery.
[2] Keep monthly sum data & all raw transaction content in S3
[3] 200 read requests / second => Memory Cache keeps popular content + SQL Read Replicas handle cache misses
[4] 2,000 writes / second could be tough for a single SQL Write Master-Slave => apply SQL scaling patterns
2. Data structures for a social network
2.1. Requirement & scope
// Use cases
User: searches for someone & sees the shortest path to that person.
Service: high availability
// Assumptions
Traffic is not evenly distributed
Some searches are popular.
Graph data won't fit on a single machine; the graph is unweighted
2.2. Calculation
100M users (50 friends / user)
5b relationships = 100M * 50 friends
1b friend searches / month
400 search requests per second
2.3. Design
2.4. Core
2.4.1. Use case: User searches for someone & sees the shortest path
BFS: solve unweighted shortest path
Shard users across the user DB (Person Servers)
Client -> Web Server -> Search API server
-> User Graph Service
-> Lookup Service:
(1) find the current user's info @ Person Server
(2) current user's friend_ids
(3) BFS: source = current user, adjacent nodes = the current user's friend_ids
expand adjacent nodes level by level until the target is found
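A sketch of the BFS step, with the friend graph represented as an in-memory adjacency dict standing in for the Lookup Service / Person Server calls:

from collections import deque

def shortest_path(friend_ids, source, target):
    # friend_ids: dict of user_id -> list of friend user_ids (stand-in for Person Servers).
    # Returns the list of user_ids on a shortest path, or None if unreachable.
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for friend in friend_ids.get(node, []):
            if friend in parents:
                continue
            parents[friend] = node
            if friend == target:                  # found: walk parents back to the source
                path = [friend]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(friend)
    return None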
2.5. Scale
400 reads / second => [1] cache person data [2] cache friend relationships [3] run BFS within the Cache
// BFS improvement
(1) offline process & save partial BFS traversals in NoSQL
(2) reduce machine jumps = Shard Person Servers by geo-location
(3) bidirectional BFS: run two searches at the same time, from the source & from the target, and merge the paths (see the sketch at the end of this section)
(4) favor BFS expansion from people with many friends => more likely to reduce the degrees of separation
Set a search limit => ask the user if they want to continue => search or stop
Graph DB such as Neo4j, or a graph query language such as GraphQL
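A sketch of improvement (3), bidirectional BFS, over the same in-memory adjacency dict; the two frontiers expand alternately and the paths are merged at the node where the searches meet:

def _chain(parents, node):
    # Walk parent pointers from node back to the search root.
    chain = [node]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return chain

def bidirectional_shortest_path(friend_ids, source, target):
    if source == target:
        return [source]
    parents_a, parents_b = {source: None}, {target: None}
    frontier_a, frontier_b = [source], [target]
    while frontier_a and frontier_b:
        if len(frontier_a) > len(frontier_b):       # always expand the smaller frontier
            frontier_a, frontier_b = frontier_b, frontier_a
            parents_a, parents_b = parents_b, parents_a
        next_frontier = []
        for node in frontier_a:
            for friend in friend_ids.get(node, []):
                if friend in parents_a:
                    continue
                parents_a[friend] = node
                if friend in parents_b:             # the searches met here: merge the paths
                    merged = _chain(parents_a, friend)[::-1] + _chain(parents_b, friend)[1:]
                    return merged if merged[0] == source else merged[::-1]
                next_frontier.append(friend)
        frontier_a = next_frontier
    return None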
3. Key-value cache to save the results of the most recent web server queries
3.1. Requirement & scope
// use cases
User: request resulting in a cache hit
User: request resulting in a cache miss
Service: high availability
// Assumptions
Traffic is not evenly distributed
Fast lookups in Cache
Limited cache memory
Determine keep/remove
Determine expire/refresh
Use least recently used (LRU) eviction to expire older entries.
Client -> Web Server -> Query API server
-> PRBFN -> boolean fingerprint -> check the Cache for a matching entry
-> hit -> move the entry's position to the front (LRU) -> return the cached results
-> miss
-> query the Reverse Index Service = returns the top matching documents
-> Document Service = return titles and snippets
-> update the cache with the results & place the entry at the front
Cache implementation
-> doubly-linked list
-> added to the head
-> expired from the tail
-> hash table = fast lookups
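A minimal sketch of this structure in Python: collections.OrderedDict is implemented (in CPython) as a hash table plus a doubly linked list, so it gives the same head/tail behavior described above. The max size and the key format (a query fingerprint string) are assumptions:

from collections import OrderedDict

class QueryCache:
    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.entries = OrderedDict()                 # fingerprint -> cached results

    def get(self, fingerprint):
        if fingerprint not in self.entries:
            return None                              # cache miss
        self.entries.move_to_end(fingerprint, last=False)   # move entry to the front (LRU)
        return self.entries[fingerprint]

    def set(self, fingerprint, results):
        if fingerprint not in self.entries and len(self.entries) >= self.max_size:
            self.entries.popitem(last=True)          # expire from the tail
        self.entries[fingerprint] = results
        self.entries.move_to_end(fingerprint, last=False)   # add / move to the head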
When to update the cache: (1) page contents change (2) a page is added / removed (3) page rank changes => use a TTL to bound staleness
3.5. Scale
Expanding the Cache to many machines: (1) to handle heavier request loads (2) because more memory is needed
-> Each machine = its own cache => low cache hit rate
-> Each machine = a copy of the cache => not memory efficient
-> sharding => hash the key to find which cache machine holds it
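A small sketch of the sharded lookup: hash the query fingerprint to pick the cache machine that owns it. The shard host names are placeholders, and a stable hash is used instead of Python's randomized built-in hash():

import hashlib

def cache_shard(fingerprint, shards):
    # shards: list of cache hosts, e.g. ["cache-0", "cache-1", "cache-2"] (placeholders)
    digest = hashlib.md5(fingerprint.encode()).hexdigest()
    return shards[int(digest, 16) % len(shards)]

# cache_shard("normalized query fingerprint", ["cache-0", "cache-1", "cache-2"]) -> "cache-..."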