// Use cases
User: connects to a financial account
Service: extracts transactions from the account
Service: recommends a budget
Service: high availability
// Out of scope
Service performs additional logging and analytics
// Assumptions
Traffic is not evenly distributed
Automatically update accounts that have been active within the last 30 days
Asynchronous budget notifications
10M users
30M financial accounts
100M budget items = 10 budget categories per user
50K sellers (used to determine a transaction's category)
5b transactions / month
500M reads / month
10:1 write to read ratio (write-heavy)
1.2. Calculation
8B user_id
5B created_at
32B seller
5B amount
Total size = ~50B
250 GB of transaction content per month = 50 B * 5b transactions
9 TB of content in 3 years = 250 GB * 36 months
2,000 transactions / second
200 reads / second
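A quick sanity check of the numbers above as a Python sketch (a month is approximated as 2.5 million seconds):

# Back-of-the-envelope check of the estimates above.
BYTES_PER_TXN = 8 + 5 + 32 + 5             # user_id + created_at + seller + amount ≈ 50 B
TXN_PER_MONTH = 5_000_000_000              # 5b transactions / month
READS_PER_MONTH = 500_000_000              # 500M reads / month
SECONDS_PER_MONTH = 2_500_000              # ~30 days, rounded

monthly_bytes = BYTES_PER_TXN * TXN_PER_MONTH        # ~250 GB / month
three_year_bytes = monthly_bytes * 36                 # ~9 TB in 3 years
writes_per_sec = TXN_PER_MONTH / SECONDS_PER_MONTH    # ~2,000 / s
reads_per_sec = READS_PER_MONTH / SECONDS_PER_MONTH   # ~200 / s
print(monthly_bytes / 1e9, three_year_bytes / 1e12, writes_per_sec, reads_per_sec)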
1.3. Design
1.4. Core
1.4.1. Use case: User connects to a financial account
Store info for the 10M users in a RDBMS
Client -> Web Server
1. -> Accounts API server
(1) update newly entered account info
// accounts table
id int NOT NULL AUTO_INCREMENT
created_at datetime NOT NULL
modified_at datetime NOT NULL
user_id int NOT NULL
PRIMARY KEY(id)
FOREIGN KEY(user_id) REFERENCES users(id)
Index on id, user_id, and created_at // speeds up lookups to O(log n) instead of O(n)
1.4.2. Use case: Service extracts transactions from the account
Triggered when the user first links the account, manually refreshes, or by the daily automatic update for active users
Client -> Web Server -> Accounts API server -> Queue (Amazon SQS or RabbitMQ) + async
1. -> Transaction Extraction Service
(1) pull from queue & extracts transactions
(2) save raw log files in the Object Store
2. -> Category Service
(1) categorize each transaction
(2) update SQL transactions table
3. -> Budget Service
(1) calculate aggregate monthly spending by category
(2) update SQL monthly_spending table
(3) -> Notification Service: if budget exceeded -> send notification (via Queue)
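A minimal in-memory sketch of steps 2 and 3 above (categorize by seller, roll up monthly spending, emit a notification event when a budget is exceeded). The seller-to-category map, the budgets dict, and the list standing in for the notification Queue are illustrative assumptions, not part of the design itself:

from collections import defaultdict

SELLER_CATEGORY = {"exxon": "gas", "target": "shopping"}   # assumed sample of the ~50K sellers

def process_transactions(transactions, budgets, notify_queue):
    # transactions: dicts with user_id, created_at (datetime), seller, amount
    # budgets: {(user_id, category): monthly limit}; notify_queue: list as a stand-in queue
    monthly_spending = defaultdict(float)        # (user_id, category, "YYYY-MM") -> amount
    for txn in transactions:
        category = SELLER_CATEGORY.get(txn["seller"], "other")       # 2. categorize
        month = txn["created_at"].strftime("%Y-%m")
        key = (txn["user_id"], category, month)
        monthly_spending[key] += txn["amount"]                        # 3. aggregate by category
        limit = budgets.get((txn["user_id"], category))
        if limit is not None and monthly_spending[key] > limit:       # exceeded budget
            notify_queue.append({"user_id": txn["user_id"],
                                 "category": category, "month": month})
    return monthly_spending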
// transaction table
id int NOT NULL AUTO_INCREMENT
created_at datetime NOT NULL
seller varchar(32) NOT NULL
amount decimal NOT NULL
user_id int NOT NULL
PRIMARY KEY(id)
FOREIGN KEY(user_id) REFERENCES users(id)
// monthly_spending table
id int NOT NULL AUTO_INCREMENT
month_year date NOT NULL
category varchar(32)
amount decimal NOT NULL
user_id int NOT NULL
PRIMARY KEY(id)
FOREIGN KEY(user_id) REFERENCES users(id)
Budget templates are based on income tiers
Custom budgets are stored in a budget_overrides table
[1] SOLUTION-1: run SQL queries on the transactions table to generate the monthly_spending aggregate table (far fewer rows)
[2] SOLUTION-2: MapReduce keyed on (user, category, month) -> summed amount (sketched below)
[3] Running analyses on the raw transaction files instead of the database could significantly reduce the load on the database.
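A sketch of SOLUTION-2 using the mrjob library, assuming the raw log files in the Object Store contain tab-separated lines of user_id, created_at, seller, amount; the line format and the small seller-to-category map are assumptions:

from mrjob.job import MRJob

SELLER_CATEGORY = {"exxon": "gas", "target": "shopping"}

class MRSpendingByCategory(MRJob):

    def mapper(self, _, line):
        # e.g. line = "42\t2024-03-05\texxon\t30.00"
        user_id, created_at, seller, amount = line.split("\t")
        month = created_at[:7]                              # "YYYY-MM"
        yield (user_id, SELLER_CATEGORY.get(seller, "other"), month), float(amount)

    def reducer(self, key, amounts):
        yield key, sum(amounts)                             # (user, category, month) -> total

if __name__ == "__main__":
    MRSpendingByCategory.run()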
1.5. Scale
1.5.1. Recent transactions in Cache
Client -> Web Server -> Read API
-> Object Store (S3, CDN) static content
-> Cache content or SQL
[1] Analytics: move the monthly_spending aggregate data from the SQL DB to a warehouse such as Amazon Redshift or Google BigQuery.
[2] Keep monthly sum data & all raw transaction content in S3
[3] 200 read requests / second => Memory Cache keeps popular content + SQL Read Replicas handle cache misses
[4] 2,000 writes / second could be tough for a single SQL Write Master-Slave => apply SQL scaling patterns
2. Data structures for a social network
2.1. Requirement & scope
// Use cases
User: searches for someone & sees the shortest path to that person.
Service: high availability
// Assumptions
Traffic is not evenly distributed
Some searches are popular.
Graph data won't fit on a single machine; the graph is unweighted
2.2. Calculation
100M users (50 friends / user)
5b relationships = 100M * 50 friends
1b friend searches / month
400 search requests per second
2.3. Design
2.4. Core
2.4.1. Use case: User searches for someone & sees the shortest path
BFS: solve unweighted shortest path
Shard users across the user DB (Person Servers)
Client -> Web Server -> Search API server
-> User Graph Service
-> Lookup Service:
(1) find the current user's info @ Person Server
(2) current user's friend_ids
(3) BFS: source = current user, adjacent nodes = the current user's friend_ids
expand adjacent nodes level by level until the target is found
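A sketch of the BFS step, with the friend graph represented as an in-memory adjacency dict standing in for the Lookup Service / Person Server calls:

from collections import deque

def shortest_path(friend_ids, source, target):
    # friend_ids: dict of user_id -> list of friend user_ids (stand-in for Person Servers).
    # Returns the list of user_ids on a shortest path, or None if unreachable.
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for friend in friend_ids.get(node, []):
            if friend in parents:
                continue
            parents[friend] = node
            if friend == target:                  # found: walk parents back to the source
                path = [friend]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(friend)
    return None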
2.5. Scale
400 reads / second => [1] cache person data [2] cache friend relationships [3] run BFS within the Cache
// BFS improvement
(1) offline process & save partial BFS traversals in NoSQL
(2) reduce machine jumps = Shard Person Servers by geo-location
(3) bidirectional BFS: run two searches at the same time, from the source & from the target, and merge the paths (see the sketch at the end of this section)
(4) favor BFS expansion from people with many friends => more likely to reduce the degrees of separation
Set a search limit => ask the user if they want to continue => search or stop
Graph DB such as Neo4j, or a graph query language such as GraphQL
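A sketch of improvement (3), bidirectional BFS, over the same in-memory adjacency dict; the two frontiers expand alternately and the paths are merged at the node where the searches meet:

def _chain(parents, node):
    # Walk parent pointers from node back to the search root.
    chain = [node]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return chain

def bidirectional_shortest_path(friend_ids, source, target):
    if source == target:
        return [source]
    parents_a, parents_b = {source: None}, {target: None}
    frontier_a, frontier_b = [source], [target]
    while frontier_a and frontier_b:
        if len(frontier_a) > len(frontier_b):       # always expand the smaller frontier
            frontier_a, frontier_b = frontier_b, frontier_a
            parents_a, parents_b = parents_b, parents_a
        next_frontier = []
        for node in frontier_a:
            for friend in friend_ids.get(node, []):
                if friend in parents_a:
                    continue
                parents_a[friend] = node
                if friend in parents_b:             # the searches met here: merge the paths
                    merged = _chain(parents_a, friend)[::-1] + _chain(parents_b, friend)[1:]
                    return merged if merged[0] == source else merged[::-1]
                next_frontier.append(friend)
        frontier_a = next_frontier
    return None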
3. Key-value cache to save the results of the most recent web server queries
3.1. Requirement & scope
// use cases
User: request resulting in a cache hit
User: request resulting in a cache miss
Service: high availability
// Assumptions
Traffic is not evenly distributed
Fast lookups in Cache
Limited cache memory
Determine keep/remove
Determine expire/refresh
Use least recently used (LRU) eviction to expire older entries.
Client -> Web Server -> Query API server
-> PRBFN -> boolean fingerprint -> check the Cache for a matching entry
-> hit -> move the entry's position to the front (LRU) -> return the cached results
-> miss
-> query the Reverse Index Service = returns the top matching documents
-> Document Service = return titles and snippets
-> update the cache with the results & place the entry at the front
Cache implementation
-> doubly-linked list
-> added to the head
-> expired from the tail
-> hash table = fast lookups
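A minimal sketch of this structure in Python: collections.OrderedDict is implemented (in CPython) as a hash table plus a doubly linked list, so it gives the same head/tail behavior described above. The max size and the key format (a query fingerprint string) are assumptions:

from collections import OrderedDict

class QueryCache:
    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.entries = OrderedDict()                 # fingerprint -> cached results

    def get(self, fingerprint):
        if fingerprint not in self.entries:
            return None                              # cache miss
        self.entries.move_to_end(fingerprint, last=False)   # move entry to the front (LRU)
        return self.entries[fingerprint]

    def set(self, fingerprint, results):
        if fingerprint not in self.entries and len(self.entries) >= self.max_size:
            self.entries.popitem(last=True)          # expire from the tail
        self.entries[fingerprint] = results
        self.entries.move_to_end(fingerprint, last=False)   # add / move to the head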
When to update the cache: (1) page contents change (2) a page is added / removed (3) page rank changes => use a TTL to bound staleness
3.5. Scale
Expanding the Cache to many machines: (1) to handle heavier request loads (2) because more memory is needed
-> Each machine = its own cache => low cache hit rate
-> Each machine = a copy of the cache => not memory efficient
-> sharding => hash the key to find which cache machine holds it
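A small sketch of the sharded lookup: hash the query fingerprint to pick the cache machine that owns it. The shard host names are placeholders, and a stable hash is used instead of Python's randomized built-in hash():

import hashlib

def cache_shard(fingerprint, shards):
    # shards: list of cache hosts, e.g. ["cache-0", "cache-1", "cache-2"] (placeholders)
    digest = hashlib.md5(fingerprint.encode()).hexdigest()
    return shards[int(digest, 16) % len(shards)]

# cache_shard("normalized query fingerprint", ["cache-0", "cache-1", "cache-2"]) -> "cache-..."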