A client has a CSV like this one that is updated daily with new data (1.2 GB of new data per update). They want to expose an API like this:
curl http://rambo.com/api?month=0
{
"trip_count": 123456,
"pickup_locations": [148, 114,...]
}
where trip_count is the number of trips for month=0 (January) and pickup_locations is the list of distinct PULocationID values (no repeated values) for that month.
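To make the expected response concrete, here is a minimal sketch of the aggregation the API describes. It is only an illustration, not a proposed solution: the column name pickup_datetime, its timestamp format, and the function name aggregate_by_month are assumptions, since only PULocationID is named above.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def aggregate_by_month(csv_file, datetime_column="pickup_datetime"):
    """Return {month: {"trip_count": int, "pickup_locations": [int, ...]}}.

    The datetime column name and its format are assumptions for this sketch.
    """
    counts = defaultdict(int)
    locations = defaultdict(set)
    for row in csv.DictReader(csv_file):
        pickup = datetime.strptime(row[datetime_column], "%Y-%m-%d %H:%M:%S")
        month = pickup.month - 1  # zero-based, so 0 = January as in the API
        counts[month] += 1
        locations[month].add(int(row["PULocationID"]))  # a set drops repeats
    return {
        m: {"trip_count": counts[m], "pickup_locations": sorted(locations[m])}
        for m in counts
    }


# Tiny made-up sample standing in for the real CSV.
sample = io.StringIO(
    "pickup_datetime,PULocationID\n"
    "2019-01-01 00:10:00,148\n"
    "2019-01-15 08:30:00,114\n"
    "2019-01-20 09:00:00,148\n"
    "2019-02-03 12:00:00,90\n"
)
result = aggregate_by_month(sample)
print(result[0])  # {'trip_count': 3, 'pickup_locations': [114, 148]}
```

Note that the result has at most twelve entries no matter how large the CSV is, which is worth keeping in mind when thinking about the latency requirement.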
It must support 1000 QPS, with a p99 response time under 100 ms and a maximum response time under 1 second.
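As a reading aid for the latency requirement, here is a small sketch of how a set of measured response times could be checked against it. The function names and the nearest-rank percentile definition are assumptions chosen for this example; other percentile definitions give slightly different values.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100.

    Returns the smallest sample such that at least p% of all samples
    are less than or equal to it.
    """
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_target(latencies_ms, p99_limit_ms=100, max_limit_ms=1000):
    """True when p99 is under 100 ms and the slowest request is under 1 s."""
    return (
        percentile(latencies_ms, 99) < p99_limit_ms
        and max(latencies_ms) < max_limit_ms
    )


# Made-up measurements: 1% of requests are slow, but none reach 1 second.
ok = [20] * 990 + [150] * 10
# 2% of requests are slow, so the 99th percentile is already 150 ms.
slow_tail = [20] * 980 + [150] * 20
# A single request above 1 second breaks the maximum, even with a good p99.
one_outlier = [20] * 999 + [1200]

print(meets_latency_target(ok))           # True
print(meets_latency_target(slow_tail))    # False
print(meets_latency_target(one_outlier))  # False
```

The third case shows why the two limits are stated separately: the p99 bound tolerates a few slow requests, while the maximum bound tolerates none above 1 second.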
Write a document explaining how you would solve the problem to someone with development skills. You don't need to implement it.
- You can't mention any tech provider (AWS, Google...) or any related technology.
- One page limit, about 700 words, in English. You don't need to be Shakespeare (I'm currently working for this company and this is my real English level).
- This problem is open on purpose, make all the decisions you want.
- Ask anything you want (jobs@tinybird.co). We recommend preparing a set of questions before you start.
- It'd be nice if you could share the writeup with us using a platform where we can leave comments (Google Docs, for example).
- All the questions you ask and, in general, the communication during the test. This is what this "tech" test is about; this is how we work.
- Clarity.
- Technical approach.
- Explain how it'd work if the CSV is updated every 60 seconds.