Created
February 22, 2020 20:01
-
-
Save jianminchen/824243c17071dab69d938e5ee8bed9c8 to your computer and use it in GitHub Desktop.
design twitter - Feb 22, 10:00 AM - the interviewee also passed my coding question. He told me that he is strong at coding, working on system design, and he is reading Design large data intensive application book.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Design twitter, 1000 users, | |
- users : tweet, follow people, and have followers | |
users: | |
userId(twitterId) : Name, location, email address | |
tweets table: | |
tweetId -> userId, content of tweet(text), time | |
Likedtweets | |
tweetId -> userId, time | |
Retweets | |
tweetId -> retrweeUserId, time | |
FollowTable | |
user1->user2 | |
user2->user3 | |
user3->user1 | |
POST CreateUser(email, name, password) OK (200) | |
UpdateUser() | |
Login(email, password) -> return auth token OK (200) | |
POST WriteTweet() -> OK(200) | |
DeleteTweet(tweetId) | |
Retweet(tweetId) | |
LikeTweet(tweetId) | |
GET GetTweets(int GetNext) JSON {tweets} | |
Post FollowUser | |
TwitterService | |
CAP theorom | |
push message onto serach service queue with tweetID | |
- analysis of words and hastags that are in tweet | |
- go ahead and perform some sort of ML model on tweet | |
- store searchable word -> tweetId | |
- intwerviewing.io -> 50 tweets | |
- system desing - > 30 tweets | |
- rolling average of the last 60 minutes of trending stuff | |
- we can start fresh at the start of every hour and count things that are trending | |
Search Service, reads the tweet, parses out hashtags, words and word/hashtag -> tweetId | |
trending -> count | |
celebrity, topic, hashtag | |
- availability over consistency | |
show case your design: | |
celebrity with milions of followers, tweet, notification to millions - bottleneck, network, cache, push/pull, search, ... | |
- user A has 20M followers | |
we're doing push? then we end up with 20M requests sent down | |
- push down to users who phave been active in last 24 hours | |
- periodically prepopulate feed for users | |
Database scaling | |
- shard based on userID consistent hashing | |
- cluser of databases (read replicas) (master slave replication, where we're opting mostly for availaiblity) - async replication | |
cache | |
- search service (trending) | |
- tweet itself | |
- tweetID does exist in chace? yes, get from cache, no, hit database | |
- cache users and their tweets | |
database, storage - build index, | |
- transaction | |
deposit | |
withdraw | |
- consistency | |
storage design - cloud storage | |
- balance binary search tree - B-tree | |
- transaction - log - | |
log-structure merge tree - | |
CA P | |
- | |
- large data stream, integer, | |
1, 2, 3, 4, 5, 6, -> six, 3, 4, | |
3 + 4/ 2 | |
1, 2,3,4,5 , odd number, median - | |
assume that you store all integer into a few data structure, order can be lost, it is ok. | |
Add(int, ) -> do not require -> | |
findMedian() -> O(1) time complexity | |
minheap, maxheap, keep size of two heap max diff < 1 | |
determine how to get median, | |
LSM-tree -> video, Microsoft, merge, Apple employee -> | |
Log -> append -> merge <- backward -> | |
client-> load balanace-> reverse proxy -> twitter, socail profile servcice -> cache -> storage -> database | |
long polling -> ... | |
booking in United statemet | |
diagram | |
- hand | |
https://gist.github.com/jianminchen/ff21b0a1cad67216376999512dd1b696 | |
booking | |
design twitter -> | |
URL - |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment