Last active May 19, 2024 10:30
OSBP: Research Questions

The following research questions are for the OSBP course students:

Q1: Is there a correlation between a programmer's open source experience and the quality of the code they write? To answer this question, we analyzed the number of followers of 100K GitHub profiles and the quality of 10M pull requests made by their owners. The quality of a pull request is a composite metric that we introduced; it includes time, comments, corrections, complexity, and other factors.
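
One way to check for such a correlation is a rank correlation, since follower counts tend to be heavily skewed. The sketch below is only an illustration: the choice of Spearman's coefficient, the function names, and the sample numbers are assumptions, not part of the study described above.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: followers per profile and a mean quality score per profile.
followers = [1, 10, 100, 1000]
quality = [0.1, 0.2, 0.3, 0.9]
print(spearman(followers, quality))
```

A value near 1 or -1 would suggest a strong monotonic relationship; a value near 0 would suggest none.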

Q2: It is recommended to start every message in a GitHub conversation (issue or pull request) with the nickname of the person who is supposed to answer it. However, not everybody does this. We assume that messages that start with a nickname receive responses sooner than messages without such a prefix. To confirm or refute this hypothesis, we analyzed 10M messages in 1000 GitHub repositories.
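
A minimal sketch of how the two groups of messages could be separated and compared. The regular expression assumes that a nickname prefix looks like `@name` with letters, digits, and hyphens; the function names and the `(body, hours)` input format are hypothetical.

```python
import re
from statistics import median

# Assumption: a nickname prefix is "@" followed by letters, digits, or hyphens.
MENTION_PREFIX = re.compile(r"^\s*@[A-Za-z0-9][A-Za-z0-9-]*")


def starts_with_mention(body):
    """True if the message body begins with an @nickname."""
    return MENTION_PREFIX.match(body) is not None


def median_delay_by_prefix(messages):
    """messages: iterable of (body, hours_until_first_reply) pairs.

    Returns {True: median for prefixed messages, False: median for the rest},
    omitting a group that has no messages.
    """
    groups = {True: [], False: []}
    for body, hours in messages:
        groups[starts_with_mention(body)].append(hours)
    return {key: median(delays) for key, delays in groups.items() if delays}


# Hypothetical sample.
sample = [
    ("@yegor256 please review", 2),
    ("@bob could you check this?", 4),
    ("Any updates here?", 10),
]
print(median_delay_by_prefix(sample))
```

The median is used here rather than the mean because response delays usually have a long tail; that choice is also an assumption.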

Q3: Sometimes, in order to prove a technical point in a GitHub discussion (issue or pull request), programmers refer to external sources, such as StackOverflow or Wikipedia. It would be interesting to find out which sources are quoted most often in, say, Java projects.
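
Counting quoted sources could start with extracting URLs from comment bodies and grouping them by host name. The sketch below assumes the comments are already available as plain strings; the URL pattern is deliberately simple and the function name is hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simplified URL pattern: stops at whitespace and common closing brackets.
URL = re.compile(r"https?://[^\s)>\]]+")


def quoted_sources(comments):
    """Count how often each host name is linked across the given comment bodies."""
    counts = Counter()
    for text in comments:
        for url in URL.findall(text):
            # Drop punctuation that often trails a URL at the end of a sentence.
            host = urlparse(url.rstrip(".,;:!?")).hostname or ""
            if host.startswith("www."):
                host = host[4:]
            if host:
                counts[host] += 1
    return counts


# Hypothetical comments.
comments = [
    "See https://stackoverflow.com/q/1 and https://en.wikipedia.org/wiki/Java",
    "As explained here (https://www.stackoverflow.com/a/2).",
]
print(quoted_sources(comments).most_common(3))
```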

Q4: Most of us would agree that a software team should express gratitude to programmers who contribute code changes to a repository. At the very least, a simple "thank you" message at the end of a pull request would be an obvious indicator of such gratitude. It would be interesting to analyze a large enough number of pull requests on GitHub and find out how often programmers say "thank you" to the code author after a pull request is merged.
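
Detecting a "thank you" could be approximated with a keyword match on comments posted after the merge by someone other than the author. The list of phrases, the function name, and the `(commenter, posted_at, body)` format are assumptions made for this sketch; real messages would need a richer vocabulary.

```python
import re
from datetime import datetime

# Assumption: these few English phrases are enough to spot gratitude.
THANKS = re.compile(r"\b(thanks|thank\s+you|thx)\b", re.IGNORECASE)


def thanked_after_merge(merged_at, author, comments):
    """comments: iterable of (commenter, posted_at, body) tuples.

    True if anyone other than the author posted a thank-you at or after the merge.
    """
    return any(
        commenter != author and posted_at >= merged_at and THANKS.search(body)
        for commenter, posted_at, body in comments
    )


# Hypothetical pull request.
merged = datetime(2024, 1, 1, 12, 0)
comments = [
    ("bob", datetime(2024, 1, 1, 11, 0), "thanks for the quick fix!"),
    ("alice", datetime(2024, 1, 1, 13, 0), "Merged. Thank you!"),
]
print(thanked_after_merge(merged, "carol", comments))
```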

Q5: It is recommended to explain the motivation of every pull request by linking it to an issue. However, we believe that this doesn't happen very often: programmers don't make such links. To validate this assumption, we analyzed 100K+ pull requests in open GitHub repositories and found out how many of them provide back-links to issues.

Q6: It is believed that issues in repositories must be closed by those who opened them. However, repository maintainers very often violate this rule and close issues themselves instead of leaving this to the authors. It would be interesting to find out how often this happens by analyzing 100K issues on GitHub.
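
Once the author and the closer of each issue are known, the measurement itself is a simple ratio. The sketch below assumes a hypothetical input of `(author, closed_by)` pairs, with `None` for issues that are still open.

```python
def closed_by_others_share(issues):
    """issues: iterable of (author, closed_by) pairs; closed_by is None if open.

    Returns the fraction of closed issues that were closed by someone
    other than the person who opened them.
    """
    closed = [(author, closer) for author, closer in issues if closer is not None]
    if not closed:
        return 0.0
    return sum(1 for author, closer in closed if author != closer) / len(closed)


# Hypothetical sample: one still-open issue is ignored.
sample = [
    ("ann", "ann"),
    ("bob", "maintainer"),
    ("cat", None),
    ("dan", "maintainer"),
]
print(closed_by_others_share(sample))
```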

Q7: What is the average productivity of a programmer in open source projects and in proprietary ones? It would be interesting to measure it by lines of code, hits of code, tasks closed, and pull requests merged. It may turn out that programmers contribute much more actively to open source projects. To find this out, we should build a tool that measures the productivity of a GitHub account, and a similar tool for GitLab (or any other system that programmers use in-house).
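
One of these metrics could be computed from Git history alone. The sketch below assumes that "hits of code" means the total number of added plus deleted lines across commits (the text above does not define it), and that the input is the text produced by `git log --numstat`, where each file line has the form `added<TAB>deleted<TAB>path`. The function name is hypothetical.

```python
def hits_of_code(numstat_output):
    """Sum added and deleted lines from `git log --numstat` text output.

    Lines that are not in the added<TAB>deleted<TAB>path form (commit headers,
    blank lines) and binary files (shown as "-") are skipped.
    """
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, _path = parts
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


# Hypothetical output fragment for two commits.
sample = "10\t2\tsrc/Main.java\n-\t-\tlogo.png\n\n3\t0\tREADME.md\n"
print(hits_of_code(sample))
```

Restricting the log to one author (for example with the `--author` option of `git log`) would give a per-programmer figure; tasks closed and pull requests merged would have to come from the hosting platform instead.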
