Data Analysis

I spent the last week working on two side projects.

I have an interest in developer productivity. A few years ago I wrote a tool called Task Ranger to track how I spend my time: TaskRanger.com. The problem with the tool is that you have to manually write down and tag what you did every twenty minutes or so. This is annoying and introduces bias into the data.

It would be cool if I could figure out what I was working on automatically by analyzing objective metrics instead. That’s what metal-detector is all about: https://github.com/JesseAldridge/metal-detector

KPM Gif

Metal-detector is a keylogger combined with analysis code that is meant to figure out how you spent your time from your keystrokes alone. (The analysis part is still a WIP though 😅 Maybe a Hidden Markov Model will work?)
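To give a feel for the kind of objective metric I mean, here is a minimal sketch of turning raw keystroke timestamps into keys per minute (KPM). This is not the actual metal-detector code; the function name `keys_per_minute` and the sample timestamps are made up for illustration.

```python
from collections import Counter


def keys_per_minute(timestamps):
    """Count keystrokes in each one-minute bucket.

    timestamps: iterable of keystroke times in seconds (e.g. Unix time).
    Returns a dict mapping minute index (seconds // 60) to keystroke count.
    Minutes with no keystrokes are simply absent from the result.
    """
    counts = Counter(int(t // 60) for t in timestamps)
    return dict(sorted(counts.items()))


# Hypothetical keystroke times, in seconds from the start of a session.
sample = [0.5, 1.2, 30.0, 61.0, 62.5, 185.0]
print(keys_per_minute(sample))  # {0: 3, 1: 2, 3: 1}
```

A series like this could then be the input to whatever model ends up guessing the activity, whether that is a Hidden Markov Model or something simpler.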

The other side project I worked on is another data analysis tool: https://github.com/JesseAldridge/yc_still_good

In a recent blog post, Michael Seibel raises the question of whether YCombinator used to be better than it is now: https://blog.ycombinator.com/yc-has-changed/

“I suspect people think YC was better in the early days because some companies from that era are
now household names like: Airbnb (w2009) and Dropbox (s2007). However, in reality it often takes
10 years or more for startups to achieve that sort of impact.”

I was wondering if I could do some analysis to figure out whether that is a reasonable claim. So I manually Googled each of the most notable YC companies to find their valuations over time, stuck them all into a JSON file, and graphed them against their age. By comparing the valuation of e.g. Airbnb five years after it was founded with that of Instacart five years after it was founded, we can compare the companies more fairly.
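The age alignment can be sketched roughly like this. The JSON layout, the field names (`founded`, `valuations`, `year`, `usd`), and the two example companies are all assumptions for illustration, not the real file from yc_still_good, and the numbers are placeholders rather than real valuations.

```python
import json

# Placeholder records in an assumed layout. These are NOT real valuations.
RAW = """
[
  {"name": "ExampleCo A", "founded": 2008,
   "valuations": [{"year": 2010, "usd": 10000000},
                  {"year": 2013, "usd": 200000000}]},
  {"name": "ExampleCo B", "founded": 2012,
   "valuations": [{"year": 2014, "usd": 30000000},
                  {"year": 2017, "usd": 150000000}]}
]
"""


def valuations_by_age(companies):
    """Re-index each company's valuations by age (years since founding)."""
    series = {}
    for company in companies:
        points = [(v["year"] - company["founded"], v["usd"])
                  for v in company["valuations"]]
        series[company["name"]] = sorted(points)
    return series


def valuation_at_age(points, age):
    """Return the latest known valuation at or before `age`, else None."""
    known = [usd for a, usd in points if a <= age]
    return known[-1] if known else None


by_age = valuations_by_age(json.loads(RAW))
for name, points in by_age.items():
    print(name, valuation_at_age(points, 5))
```

Once every company is indexed by age instead of calendar year, each series can be drawn on the same axes with any plotting library.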

Judging by this graph I think it’s fair to say YC is doing about as well as ever.

The cool thing about this project is that it includes a general-purpose data processing framework. It has a bunch of utility code that I can use to take unstructured data, structure it, and then do whatever graphing and analysis I want on top of that.

I have a strong interest in data processing, so I think I might focus my auto-code generation efforts in this direction. For example, I might make an app builder tool focused on data processing. It seems like narrowing the scope in this way could be helpful.
