# SYSTEM DESIGN - Question

## System Design

Harvard Youtube course

<https://www.youtube.com/watch?v=-W9F__D3oY4>

## 1. Trade off

```
Performance vs scalability
Latency vs throughput
Availability vs consistency
```

### 1.1. Performance vs scalability

scalable = added resources \~ increased performance.

* performance problem = slow for a single user.
* scalability problem = fast for a single user / slow under heavy load.

### 1.2. Latency vs throughput

* Latency = time to perform some actions
* Throughput = requests / time

Aim = maximal throughput with acceptable latency.

### 1.3. Availability vs consistency

* Consistency = read receives \[1] the lastet write or \[2] an error
* Availability = \[1] request always receives a response, \[2] no guarantee of consistency
* Partition Tolerance = \[1] The system continues to operate \[2] despite partitioning due to network failures

```
CP = [1] atomic read & write [2] response from the partitioned node -> timeout error.
AP = [1] eventual consistency [2] response the most recent data, but not the latest
```

#### 1.3.1 Consistency patterns

```
Weak consistency = [1] data may lose [2] video chat
Eventual consistency = [1] asynchronous data [2] email
Strong consistency = [1] synchronous data [2] RDBMSes
```

#### 1.3.2 Availability patterns

```
1. server fail-over (master-slave & master-master)
2. database replication (master-slave & master-master)
```

2 CONS: \[1] added hardware & complexity, \[2] potential for loss of data because not in time

```
99.9% availability - three 9s (downtime per year = 8hr 45min)
99.99% availability - four 9s (downtime per year = 52min)
```

#### 1.3.3 Availability in parallel vs in sequence

multiple components prone to failure

```
# in sequence
99.8% = 99.9% * 99.9%

# in parallel
99.9999% = 1 - [(1-99.9%) * (1-99.9%)]
```

## 2. DNS

```
name space to ip = www.google.com to 140.123.456.789
```

* hierarchical & caching

EX = CloudFlare and Route 53

CONS = \[1] delay (mitigated by caching), \[2] complex (managed by ISP or government), \[3] DDoS attack

## 3. CDN

```
globally distributed proxys, serving content closer to user
```

* static files = HTML/CSS/JS, photos
* improve performance = \[1] Users receive content at closer data centers, \[2] It is other people's server

```
Push CDNs = Owner uploads content and rewrite CDN URLs. (small traffic & infrequent upload)
Pull CDNs = Provider grabs new content from your server
```

CONS: \[1] cost, \[2] content stale

## 4. Load Balancer

\[1] receive request from client \[2] pick a worker to forward request \[3] waiting for response \[4] forward response to client

```
distribute incoming client requests to computing resources
```

EX = HAProxy

```
PROS
1. Horizontal scaling
2. Avoid unhealthy servers
3. Avoid overloading resources
4. Avoid SPOF
5. SSL termination
6. Session persistence
```

```
Distribute methods:
1. Random
2. Round Robin
3. Least loaded
4. Session/cookies/request params
5. Layer 4 = ip (faster)
6. Layer 7 = content (direct video traffic to video servers)
```

```
CONS horizontal scaling:
1. complexity (stateless servers, centralized sessions servers)
2. downstream databases suffer more simultaneous connections.
```

```
CONS load balancer: 
1. performance bottleneck
2. complexity
3. SPOF
```

## 5. Reverse proxy

```
a web server = [1] centralizes internal services [2] unified public interfaces
```

```
PROS:
1. Increased security (Hide backend information & IPs)
2. Increased scalability (Clients only see the reverse proxy's IP)
3. SSL termination
4. Caching
5. Serve Static content
```

```
CONS:
1. complexity
2. SPOF
```

### 5.1 Load balancer vs reverse proxy

load balancer = multiple servers

Reverse proxies = single server

## 6. Application layer

```
Microservices = independently deployable, small services

Service Discovery = [1] Consul, Zookeeper [2] find services by keeping track of registered names
```

CONS: complexity

## 7. Database

### 7.1 RDBMS

```
ACID RDBMS transactions.
1. Atomicity = transaction is all or nothing
2. Consistency = transaction brings DB from one state to another
3. Isolation = exec transactions concurrently or serially is the same
4. Durability = Once committed, it will remain so
```

```
Scale RDBMS
1. master-slave (replicate writes to more read slaves) (CONS: [1] more logics [2] CONS Replication)
2. master-master (CONS: [1] more logics, [2] CONS Replication, [3] LBs, [4] Sync & conflict resolution )
3. federation (split by functions) (CONS: complex join & logic)
4. sharding (split by word order) (CONS: complex join & logic)
5. denormalization (Redundant copies in multi-tables to avoid expensive joins) (CONS: [1] sync duplication, [2] add write loading)
6. SQL tuning (benchmark and profile)
```

PROS of federation & sharding = \[1] less read and write traffic \[2] more cache hits \[3] reduce index size

```
CONS replication
1. loss data before ready
2. more slaves more replication lag
3. replaying writing stuck in slaves
4. complexity
```

### 7.2 NoSQL

```
BASE
1. Basically available = guarantees availability.
2. Soft state = DB state may change over time, even without input.
3. Eventual consistency = Async
```

```
Types
1. Key-value store = O(1) by word order
2. Document store = JSON, MongoDB (collections of key-value)
3. Wide column store = ColumnFamily (Bigtable, HBase, Cassandra)
4. Graph database = complex relationships, such as a social network
```

```
Reasons for SQL:
1. Structured data
2. Strict schema
3. complex joins
4. Transactions


Reasons for NoSQL:
1. Semi-structured data
2. flexible schema
3. big data
4. Very data intensive / high throughput workload

Well-suited for NoSQL:
1. log data
2. Temporary data, such as a shopping cart
3. Frequently accessed ('hot') tables
```

## 8. Cache

```
[1] lookup cache [2] found & return [3] goto DB, save in cache, and return

CONS cache: [1] consistency between cache & DB [2] expiration [3] add logics
```

cache update strategy

```
Cache-aside = cache<-clien->DB
STEP: [1] look cache [2] miss & read DB [3] save to cache and return
CONS: [1] three trips in cache miss [2] stale data [3] replace new node without memory

Write-through = client->cache->db
STEP: [1] read & write to cache [2] sync to DB [3] return
CONS: [1] slow write fast read [2] replace new node without memory

Write-behind = client->cache~~>db
STEP: [1] read & write to cache [2] async to DB & return
CONS: [1] loss data before ready [2] complex 

Refresh-ahead = automatically refresh
CONS: [1] Difficult to predict which to refresh
```

### 8.1 MongoDB vs Redis vs Memcached

<http://stackoverflow.com/questions/7888880/what-is-redis-and-what-do-i-use-it-for>

When to use Redis: If you can map a case to Redis and discover you aren't at risk of running out of RAM.

* MongoDB = `key-value` + `disk-based store`.
* Redis = `key-value` + `built-in persistence`.
* Memcached = Redis - `built-in persistence`.

Persistence to disk = you can use Redis as a real DB instead of a volatile cache. The data won't disappear when you restart as with Memcached.

## 9. Asynchronism

* Reduce request times for expensive operations
* The user is not blocked, and the job is processed in the background.

```
Message queues = RabbitMQ
Task queues = Celery
```

Back pressure: limiting the queue size

## 10. Communication

### 10.1 HTTP

```
encoding and transporting data between a client and a server
request/response protocol
verb + resource
Level 7 on top of TCP/IP
```

### 10.2 REST & RPC

|                | RPC                                  | REST                                                     |
| -------------- | ------------------------------------ | -------------------------------------------------------- |
|                | request-response                     | request-response                                         |
| API            | private                              | public                                                   |
| syntax         | customized                           | general                                                  |
| implementation | tight coupling                       | minimizes coupling                                       |
|                |                                      | stateless and cacheable                                  |
| PROS           | Better performance, customized syntx | general purpose, less couple                             |
| CONS           | difficult to debug, tightly coupled  | few verbs only, difficult to name for nested hierarchies |

```
- - - - - - - REST - - - - - - - 
GET /someresources/anId

PUT /someresources/anId
{
  "anotherdata": "another value"
}

- - - - - - - RPC - - - - - - - 
GET /someoperation?data=anId

POST /anotheroperation
{
  "data":"anId";
  "anotherdata": "another value"
}
```

### 10.3 TCP & UDP

|                      | TCP/IP                                | UDP                                           |
| -------------------- | ------------------------------------- | --------------------------------------------- |
|                      | connection-oriented                   | connectionless                                |
| order                | receive in order                      | order not guarantee                           |
|                      | IP, hankshake = ack + checksum        | broadcast to subnet (in DHCP since no IP yet) |
| no correct response  | \[1] resend the packets \[2] timeouts | X                                             |
| data loss vs latency | intact data, less time critical       | data may lose, real-time                      |
| example              | web servers, database, and SSH.       | video chat                                    |

## 11. Security

```
1. XSS and SQL injection = Sanitize all user inputs
2. SQL injection = parameterized queries
3. least privilege.
```

## Appendix

### A1. power of 2

```
Power           Exact Value         Approx Value        Bytes
---------------------------------------------------------------
8                             256
10                           1024   1 thousand           1 KB
16                         65,536                       64 KB
20                      1,048,576   1 million            1 MB
30                  1,073,741,824   1 billion            1 GB
40              1,099,511,627,776   1 trillion           1 TB
```

### A2. metrics

```
Read sequentially from disk at 40 MB/s (100)
Read sequentially from 1 Gbps Ethernet at 100 MB/s (40)
Read sequentially from SSD at 1 GB/s (4)
Read sequentially from main memory at 4 GB/s (1)
```

```
Reading 1 MB sequentially from memory takes about 250 us
SSD 4x 
Disk 80x
```

```
6-7 world-wide round trips per second
2,000 round trips per second within a data center
```

```
L1 cache reference                           0.5 ns
L2 cache reference                           7   ns                      14x L1 cache
Compress 1K bytes with Zippy            10,000   ns       10 us
Round trip within same datacenter      500,000   ns      500 us
Disk seek                           10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms
```

### A3. per seconds

2.5M seconds per month

400 requests per second = 1 billion requests per month

### A4. varchar & char

```
char 格式, 資料不足長度者後面補空白,儲存空間固定
varchar 格式資料實際之內容,儲存空間不固定

單就儲存空間上來看
char(5)
寫入 'ab' 資料庫為 'ab ' 5 byptes # 會多存3個空白
寫入 'abcd' 資料庫為 'abcd ' 5 byptes # 會多存1個空白
寫入 'abcde' 資料庫為 'abcde' 5 byptes
寫入 'abcdef' 資料庫為 'abcde' 5 byptes

varchar(5)
寫入 'ab' 資料庫為 'ab' 3 byptes # 多 1 byte 是存長度
寫入 'abcd' 資料庫為 'abcd' 5 byptes
寫入 'abcde' 資料庫為 'abcde' 6 byptes
寫入 'abcdef' 資料庫為 'abcde' 6 byptes
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://huang-jason.gitbook.io/complat-software-training-101/interview-system-design-question.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
