GitHub makes it easy to start contributing to a project. The recommended way is simply to fork the repository, push your changes, and send a pull request to the original project. So the question now is: how are projects using this workflow, and which projects rely on it?
Note: throughout, I use a core query that extracts the pull requests related to forks. The query is limited to one month of data to keep it under the size limit. As an added bonus, it also provides the average latency between fork and pull request.
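To make the idea behind that core query concrete, here is a minimal Python sketch of the same pairing logic. It is not the actual query: the event tuples, field order, actor names, repository names and dates are all made up for illustration, and it assumes a pull request is "related" to a fork when the same actor opened it on the same repository after forking.

```python
from datetime import datetime
from statistics import mean

# Made-up sample events: (event_type, actor, repository, created_at).
# The field layout and every value here are illustrative assumptions.
events = [
    ("ForkEvent", "alice", "example/recipes", datetime(2012, 4, 2, 10, 0)),
    ("PullRequestEvent", "alice", "example/recipes", datetime(2012, 4, 2, 10, 12)),
    ("ForkEvent", "bob", "example/webui", datetime(2012, 4, 3, 9, 0)),
    ("PullRequestEvent", "carol", "example/webui", datetime(2012, 4, 3, 12, 0)),
    ("PullRequestEvent", "bob", "example/webui", datetime(2012, 4, 4, 9, 0)),
]


def fork_to_pull_latencies(events):
    """Pair each fork with the first later pull request by the same
    actor on the same repository; return (repo, actor, seconds) tuples."""
    pending_forks = {}
    pairs = []
    # Walk the events in chronological order.
    for kind, actor, repo, created_at in sorted(events, key=lambda e: e[3]):
        key = (actor, repo)
        if kind == "ForkEvent":
            # Remember the earliest fork that has not been matched yet.
            pending_forks.setdefault(key, created_at)
        elif kind == "PullRequestEvent" and key in pending_forks:
            forked_at = pending_forks.pop(key)
            pairs.append((repo, actor, (created_at - forked_at).total_seconds()))
    return pairs


def average_latency(pairs):
    """Average fork-to-pull-request delay in seconds."""
    return mean([seconds for _, _, seconds in pairs])


pairs = fork_to_pull_latencies(events)
```

With this sample, carol's pull request is ignored because she never forked the repository, and the two remaining pairs have delays of 12 minutes and one day.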
The query named Fork2PullRequestByProject.sql picks out the projects that made the most use of this feature over the month of April:
Number one: Homebrew! The project is based on recipes tied to a large number of sources, and this workflow seems to make it really easy for a newcomer to push a change such as a source version update.
Second: Bootstrap from Twitter. A JavaScript library seems like a pretty good playground for starting to share changes.
Third: Rails
Next, let's sort the combined fork and pull request pairs by language.
The query named Fork2PullRequestByLanguage.sql gives the results.
This time, the result is pretty much as expected: JavaScript, Ruby, Python. One remark though: the JavaScript and Ruby numbers are each more than double Python's, as if those languages were really better suited to this kind of use.
On average, all languages keep the latency between fork and pull request under a day, Java being the longest.
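For readers who prefer code to SQL, the per-language grouping can be sketched roughly as follows. This is only an illustration of the aggregation, not the content of Fork2PullRequestByLanguage.sql; the function name and the sample numbers are invented.

```python
from collections import defaultdict


def summarize_by_language(pairs):
    """Group (language, latency_seconds) pairs and return a list of
    (language, count, average_latency_hours), most frequent first."""
    by_language = defaultdict(list)
    for language, seconds in pairs:
        by_language[language].append(seconds)
    summary = [
        (language, len(values), sum(values) / len(values) / 3600)
        for language, values in by_language.items()
    ]
    # Sort by number of fork + pull request pairs, descending.
    summary.sort(key=lambda row: row[1], reverse=True)
    return summary


# Invented sample values, only to show the shape of the output.
sample = [("JavaScript", 3600), ("JavaScript", 7200), ("Python", 1800)]
summary = summarize_by_language(sample)
```

Each row carries both the count used for the ranking and the average latency used for the comparison between languages.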
The query named Fork2PullByLatency.sql sorts the events on an exponential scale of the delay between fork and pull request.
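As a rough sketch of what bucketing on an exponential scale means, the snippet below assigns each delay to a power-of-two bucket expressed in minutes. The base of 2 is an assumption suggested by bucket boundaries such as 8 to 16 minutes; the function names and sample delays are invented and the real query may proceed differently.

```python
from collections import Counter


def latency_bucket(seconds):
    """Return the lower bound, in minutes, of the power-of-two bucket
    containing the delay: 0 for under a minute, then 1, 2, 4, 8, ..."""
    minutes = int(seconds) // 60
    if minutes < 1:
        return 0
    # Largest power of two that is <= minutes.
    return 1 << (minutes.bit_length() - 1)


def latency_histogram(latencies):
    """Count delays per bucket, returned as (bucket, count) sorted by bucket."""
    return sorted(Counter(latency_bucket(s) for s in latencies).items())


# Invented delays in seconds: 30 s, 12 min, 15 min, 16 min, 1 day.
histogram = latency_histogram([30, 720, 900, 960, 86400])
```

A delay of 12 minutes lands in the bucket starting at 8 (covering 8 to 16 minutes), and a delay of one day, 1440 minutes, lands in the bucket starting at 1024.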
One thing to notice is the two maxima in the series: one peaking at 8 to 16 minutes and a smaller one peaking at around one day.
My interpretation is that two kinds of situations lead to a fork: the quick fix, which is done in the same motion as the fork, and the more substantial code change, which needs a more thorough review before the pull request is submitted.
This hypothesis could be validated by a deeper analysis of the size of the pull requests in each case.