Skip to content

Instantly share code, notes, and snippets.

@jianminchen
Created February 22, 2020 20:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jianminchen/824243c17071dab69d938e5ee8bed9c8 to your computer and use it in GitHub Desktop.
Save jianminchen/824243c17071dab69d938e5ee8bed9c8 to your computer and use it in GitHub Desktop.
design twitter - Feb 22, 10:00 AM - the interviewee also passed my coding question. He told me that he is strong at coding, working on system design, and he is reading Design large data intensive application book.
Design twitter, 1000 users,
- users : tweet, follow people, and have followers
users:
userId(twitterId) : Name, location, email address
tweets table:
tweetId -> userId, content of tweet(text), time
Likedtweets
tweetId -> userId, time
Retweets
tweetId -> retrweeUserId, time
FollowTable
user1->user2
user2->user3
user3->user1
POST CreateUser(email, name, password) OK (200)
UpdateUser()
Login(email, password) -> return auth token OK (200)
POST WriteTweet() -> OK(200)
DeleteTweet(tweetId)
Retweet(tweetId)
LikeTweet(tweetId)
GET GetTweets(int GetNext) JSON {tweets}
Post FollowUser
TwitterService
CAP theorom
push message onto serach service queue with tweetID
- analysis of words and hastags that are in tweet
- go ahead and perform some sort of ML model on tweet
- store searchable word -> tweetId
- intwerviewing.io -> 50 tweets
- system desing - > 30 tweets
- rolling average of the last 60 minutes of trending stuff
- we can start fresh at the start of every hour and count things that are trending
Search Service, reads the tweet, parses out hashtags, words and word/hashtag -> tweetId
trending -> count
celebrity, topic, hashtag
- availability over consistency
show case your design:
celebrity with milions of followers, tweet, notification to millions - bottleneck, network, cache, push/pull, search, ...
- user A has 20M followers
we're doing push? then we end up with 20M requests sent down
- push down to users who phave been active in last 24 hours
- periodically prepopulate feed for users
Database scaling
- shard based on userID consistent hashing
- cluser of databases (read replicas) (master slave replication, where we're opting mostly for availaiblity) - async replication
cache
- search service (trending)
- tweet itself
- tweetID does exist in chace? yes, get from cache, no, hit database
- cache users and their tweets
database, storage - build index,
- transaction
deposit
withdraw
- consistency
storage design - cloud storage
- balance binary search tree - B-tree
- transaction - log -
log-structure merge tree -
CA P
-
- large data stream, integer,
1, 2, 3, 4, 5, 6, -> six, 3, 4,
3 + 4/ 2
1, 2,3,4,5 , odd number, median -
assume that you store all integer into a few data structure, order can be lost, it is ok.
Add(int, ) -> do not require ->
findMedian() -> O(1) time complexity
minheap, maxheap, keep size of two heap max diff < 1
determine how to get median,
LSM-tree -> video, Microsoft, merge, Apple employee ->
Log -> append -> merge <- backward ->
client-> load balanace-> reverse proxy -> twitter, socail profile servcice -> cache -> storage -> database
long polling -> ...
booking in United statemet
diagram
- hand
https://gist.github.com/jianminchen/ff21b0a1cad67216376999512dd1b696
booking
design twitter ->
URL -
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment