System design critique request for below question
Anonymous User

Qn: Design a privacy API which takes an FB post and figures out which users should be able to view it.

Notation: lb → load balancer

Requirements:
Posts should not be visible to excluded users
Authorization and authentication is not in scope
Privacy API should have low latency
Read-to-write ratio is on the order of 10:1
System should be highly available and partition tolerant
No geolocation

Estimates:
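Write QPS: 100M users × 5 posts/day = 500M posts/day, about 5.8k posts/sec on average; the sizing below uses 50k posts/sec. Each post is 1 KB → 500 GB/day, roughly 180 TB/year w/o replication
Read QPS: 100M users × 50 posts/day = 5B reads/day, about 58k reads/sec on average; the sizing below uses 500k reads/sec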
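
The estimates can be sanity-checked with a few lines of arithmetic. This is only a sketch under the stated inputs (100M users, 5 writes and 50 reads per user per day, 1 KB taken as 1,000 bytes); the function names are made up for illustration. The averages come out roughly an order of magnitude below the 50k and 500k figures, so those act as generous design targets rather than average load.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(users: int, events_per_user_per_day: int) -> float:
    """Average events per second, assuming load is spread evenly over the day."""
    return users * events_per_user_per_day / SECONDS_PER_DAY


def storage_tb_per_year(events_per_day: int, bytes_per_event: int) -> float:
    """Raw storage per year in TB (1 TB = 1e12 bytes), without replication."""
    return events_per_day * bytes_per_event * 365 / 1e12


write_qps = average_qps(100_000_000, 5)    # about 5.8k posts/sec
read_qps = average_qps(100_000_000, 50)    # about 58k reads/sec
yearly_tb = storage_tb_per_year(100_000_000 * 5, 1_000)  # about 182.5 TB/year

print(f"writes/sec={write_qps:.0f} reads/sec={read_qps:.0f} TB/year={yearly_tb:.1f}")
```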

Assumption: a typical production server can hold 1 TB of data without sharding and serve 10k requests/sec

Privacy API get method → used to fetch the privacy settings of post_id
Client (timeline generator) → lb → get(post_id) → lb → distributed LRU cache → cache miss? → lb and config router to find the shard → db → fetch results and update cache → send response back to client
Since read traffic is 500k requests/sec and each server handles 10k requests/sec, we need 50 primary db servers, plus 10 primary cache servers. The hash of post_id can be the shard key. Use a write-around cache; an offline process can populate the cache on a cache miss
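
The get path above could look roughly like this. It is a minimal in-memory sketch, not a definitive implementation: `shard_for`, `LRUCache` and `PrivacyReader` are hypothetical names, plain dicts stand in for the db shards, and a single-process LRU stands in for the distributed cache. SHA-256 is used only to get a hash that is stable across processes.

```python
import hashlib
from collections import OrderedDict

NUM_SHARDS = 50  # 500k reads/sec divided by 10k requests/sec per server


def shard_for(post_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a post_id to a shard index using a stable hash of the id."""
    digest = hashlib.sha256(post_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class LRUCache:
    """Tiny LRU cache; evicts the least recently used key when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


class PrivacyReader:
    def __init__(self, shards, cache):
        self.shards = shards  # list of dicts standing in for db shards
        self.cache = cache

    def get(self, post_id: str):
        """Return the excluded-friend set for post_id, or None if unknown."""
        cached = self.cache.get(post_id)
        if cached is not None:
            return cached
        # Cache miss: go to the shard that owns this post_id.
        excluded = self.shards[shard_for(post_id, len(self.shards))].get(post_id)
        if excluded is not None:
            self.cache.put(post_id, excluded)
        return excluded
```

In this sketch the cache is filled inline on a miss; the offline refresh process mentioned above would replace the `cache.put` call.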

Privacy API post method → used to write or update the privacy settings of post_id
Client (timeline generator) → lb → post(post_id, excluded_friends) → processing queue → consumer reads from the queue and hits lb to write to the correct shard in the posts table → another consumer can update the read cache with this post in parallel → send success response back
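
A rough sketch of this write path, under the same simplifications: `PrivacyWriter` and `drain` are hypothetical names, `queue.Queue` stands in for the processing queue, dicts stand in for the shards and the cache, and one consumer does both the db write and the cache update rather than two parallel consumers.

```python
import hashlib
import queue


def shard_for(post_id: str, num_shards: int) -> int:
    """Stable hash of post_id, reduced to a shard index."""
    digest = hashlib.sha256(post_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class PrivacyWriter:
    def __init__(self, shards, cache):
        self.shards = shards          # list of dicts standing in for db shards
        self.cache = cache            # dict standing in for the read cache
        self.pending = queue.Queue()  # stands in for the processing queue

    def post(self, post_id: str, excluded_friends) -> str:
        """Enqueue the write and acknowledge; nothing is persisted yet."""
        self.pending.put((post_id, frozenset(excluded_friends)))
        return "accepted"

    def drain(self) -> int:
        """Consumer step: apply every queued write to its shard and the cache."""
        written = 0
        while True:
            try:
                post_id, excluded = self.pending.get_nowait()
            except queue.Empty:
                return written
            self.shards[shard_for(post_id, len(self.shards))][post_id] = excluded
            self.cache[post_id] = excluded
            written += 1
```

Because `post` returns before the consumer runs, a get issued right after a post can still see the old settings or none at all; whether that window is acceptable for privacy data is a design choice worth stating explicitly.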

Use case
Timeline generator creates a new post → implicitly calls the privacy post API → updates the post settings
Timeline generator fetches an existing post → calls the privacy get API → gets back the list of excluded friends → if the friend whose timeline is being generated is not in the excluded friends → display the post
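
The fetch use case boils down to a membership check per post. A minimal sketch, where `can_view` and `build_timeline` are hypothetical names and `get_privacy` is any callable that behaves like the privacy get API; hiding posts whose privacy record is missing is an assumption made here, not something stated in the design.

```python
def can_view(viewer_id: str, excluded_friend_ids) -> bool:
    """A viewer may see the post unless they are in the excluded list."""
    return viewer_id not in excluded_friend_ids


def build_timeline(viewer_id: str, candidate_posts, get_privacy):
    """Keep only the candidate post ids that the viewer is allowed to see."""
    visible = []
    for post_id in candidate_posts:
        excluded = get_privacy(post_id)
        if excluded is None:
            # Assumption: no privacy record means hide the post (fail closed).
            continue
        if can_view(viewer_id, excluded):
            visible.append(post_id)
    return visible


privacy = {"p1": frozenset({"bob"}), "p2": frozenset()}
print(build_timeline("bob", ["p1", "p2", "p3"], privacy.get))
```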

Data model:
Friends table
id | list_of_friends

Posts table
id | author_id | raw_content | hash | excluded_friend_ids
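
For reference, the two tables could be written as record types like this. The column names come from the data model above; the Python types are assumptions (string ids, a frozenset for the excluded list), and the post does not say what the `hash` column holds, so it is left as an opaque string.

```python
from dataclasses import dataclass, field


@dataclass
class FriendsRow:
    id: str                                   # user id
    list_of_friends: list[str] = field(default_factory=list)


@dataclass
class PostRow:
    id: str                                   # post id, hashed to pick the shard
    author_id: str
    raw_content: str
    hash: str                                 # meaning not specified in the post
    excluded_friend_ids: frozenset[str] = frozenset()


post = PostRow(id="p1", author_id="alice", raw_content="hello",
               hash="", excluded_friend_ids=frozenset({"bob"}))
```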
