SourceCred enables open-source projects to attribute
cred to members of their
communities in a way that fairly reflects the value those members have
contributed to the project.
For SourceCred to succeed in the long term, it needs to be extensible. For example, the initial system might only look at GitHub interactions and Git metadata, but it should be possible to add new data sources, like static code analysis or records from email design discussions. The system needs a more-than-sum-of-parts approach to new information: each new source should enrich the data already present, as opposed to being treated as an independent signal.
SourceCred also needs to be robust against gaming and Sybil attacks. If
gaming the cred algorithm gives better returns than contributing to projects,
cred will cease to be useful as a metric. This rules out simplistic
approaches, like heuristic weights over commit counts or number of lines of
code.
Having explored the design space, we've decided to model project contributions via a contribution graph. Every action, as well as every entity that takes those actions, is a node in the graph; the graph might contain GitHub issues, Git commits, individual files, GitHub user identities, and so forth. Edges connect related entities: for example, one edge might show that a particular GitHub identity authored a pull request, and another that the pull request merged a particular commit. This satisfies our requirements of being extensible and of surfacing rich relationships between different entities.
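As a rough illustration of this model, the following TypeScript sketch builds a tiny contribution graph. All type names, identifiers, and edge labels here (`ContributionGraph`, `AUTHORS`, `MERGED_AS`, and so on) are hypothetical and are not SourceCred's actual Graph API.

```typescript
// Hypothetical types for illustration only; not SourceCred's real Graph class.
type NodeKind = "USER" | "PULL_REQUEST" | "COMMIT" | "ISSUE" | "FILE";

interface GraphNode {
  id: string;
  kind: NodeKind;
}

interface GraphEdge {
  src: string; // id of the source node
  dst: string; // id of the destination node
  label: string; // hypothetical relationship name, e.g. "AUTHORS"
}

class ContributionGraph {
  private nodes = new Map<string, GraphNode>();
  private edges: GraphEdge[] = [];

  addNode(node: GraphNode): this {
    this.nodes.set(node.id, node);
    return this;
  }

  addEdge(edge: GraphEdge): this {
    // Reject edges whose endpoints have not been added yet.
    if (!this.nodes.has(edge.src) || !this.nodes.has(edge.dst)) {
      throw new Error(`dangling edge: ${edge.src} -> ${edge.dst}`);
    }
    this.edges.push(edge);
    return this;
  }

  // Nodes reachable from `id` by following one outgoing edge.
  neighbors(id: string): GraphNode[] {
    return this.edges
      .filter((e) => e.src === id)
      .map((e) => this.nodes.get(e.dst)!);
  }
}

// A GitHub identity authored a pull request, which merged a commit.
const graph = new ContributionGraph()
  .addNode({ id: "user:alice", kind: "USER" })
  .addNode({ id: "pr:1", kind: "PULL_REQUEST" })
  .addNode({ id: "commit:abc", kind: "COMMIT" })
  .addEdge({ src: "user:alice", dst: "pr:1", label: "AUTHORS" })
  .addEdge({ src: "pr:1", dst: "commit:abc", label: "MERGED_AS" });
```

Under this sketch, a new data source only needs to contribute further node kinds and edge labels; existing nodes and edges stay untouched.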
Then, for cred attribution, we will use a PageRank-based approach. PageRank is naturally resilient to Sybil attacks, and does a good job of capturing important transitive relationships. (For example, a pull request deserves a lot of cred, transitively, if the commit it merged contained the first implementation of an important function.)
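To make the transitive flow concrete, here is a minimal sketch of textbook PageRank computed by power iteration. It assumes cred flows along directed edges toward the thing being credited. The function name, the damping factor of 0.85, the iteration count, and the example node identifiers are all assumptions for illustration, not SourceCred's implementation.

```typescript
type Edge = { src: string; dst: string };

// Textbook PageRank by power iteration over an unweighted directed graph.
function pageRank(
  nodes: string[],
  edges: Edge[],
  damping = 0.85,
  iterations = 50,
): Map<string, number> {
  const n = nodes.length;
  const outDegree = new Map<string, number>();
  for (const node of nodes) outDegree.set(node, 0);
  for (const { src, dst } of edges) {
    if (!outDegree.has(src) || !outDegree.has(dst)) {
      throw new Error(`edge references unknown node: ${src} -> ${dst}`);
    }
    outDegree.set(src, outDegree.get(src)! + 1);
  }

  // Start from a uniform distribution.
  let scores = new Map<string, number>();
  for (const node of nodes) scores.set(node, 1 / n);

  for (let i = 0; i < iterations; i++) {
    // Score held by nodes without outgoing edges is spread uniformly.
    let dangling = 0;
    for (const node of nodes) {
      if (outDegree.get(node) === 0) dangling += scores.get(node)!;
    }
    const base = (1 - damping) / n + (damping * dangling) / n;

    const next = new Map<string, number>();
    for (const node of nodes) next.set(node, base);
    for (const { src, dst } of edges) {
      const share = (damping * scores.get(src)!) / outDegree.get(src)!;
      next.set(dst, next.get(dst)! + share);
    }
    scores = next;
  }
  return scores;
}

// Hypothetical example: an important function credits the commit that
// introduced it, the commit credits the pull request that merged it, and
// the pull request credits its author. "commit:def" is unconnected.
const ranks = pageRank(
  ["function:parse", "commit:abc", "commit:def", "pr:1", "user:alice"],
  [
    { src: "function:parse", dst: "commit:abc" },
    { src: "commit:abc", dst: "pr:1" },
    { src: "pr:1", dst: "user:alice" },
  ],
);
```

In this example `pr:1` ends up with a higher score than the unconnected `commit:def`, because it inherits score transitively from `function:parse` through `commit:abc`.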
Q2: Creating the Prototype
In Q2, we created a prototype that ingests a repository's data from Git and GitHub, produces a contribution graph, and runs a fast in-browser PageRank on that graph. The prototype de-risked our basic technical approach. However, it also had technical debt that would have hamstrung future development. Specifically, it used the Graph as an arbitrary key-value store for all data in the SourceCred universe, which was convenient to implement but was not a scalable abstraction.
It also lacked many features that would be necessary for usage by actual teams, such as configurability, the ability to explicitly add contributions missing in the Git and GitHub metadata, and the ability to scale to large repositories.
Q3: Launching the Beta
Our goal for Q3 is to evolve the prototype into a beta: one that has the minimum features real teams need in order to use SourceCred, and that is also a solid technical foundation for years of future development.
Here are the specific key results on that trajectory.
Pay Technical Debt
- Rewrite the Graph class
- Rewrite Git plugin to use new graph
- Rewrite GitHub plugin to use new graph
- Rewrite cred explorer to use new graph
- 🚀 (MAYBE) Switch from Flow to TypeScript
- Implement 'spotlights' for adding new contributions
- created via ad-hoc GitHub interactions
- 🚀 added from within the SourceCred UI
- 🚀 Implement 'domains' for organizing contributions
- 🚀 including 'functional domains' like
- 🚀 including 'component domains' like
- Make the PageRank implementation tunable
- via directed edge weight tuning
- via priors on node types
- Scales to
- 🚀 Scales across repositories, to the entire @ipfs org
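The two tuning knobs listed above, directed edge weights and priors on node types, could be sketched roughly as follows. This is a generic weighted PageRank with a prior-biased teleport vector, written under assumptions: the names `Tuning`, `edgeWeights`, `nodePriors`, and `tunedPageRank`, the default weight of 1, and the example labels are all hypothetical and do not describe SourceCred's actual configuration format.

```typescript
interface WNode { id: string; kind: string }
interface WEdge { src: string; dst: string; label: string }

// Hypothetical tuning config: unlisted labels and kinds default to 1.
interface Tuning {
  edgeWeights: { [label: string]: number };
  nodePriors: { [kind: string]: number };
}

function tunedPageRank(
  nodes: WNode[],
  edges: WEdge[],
  tuning: Tuning,
  damping = 0.85,
  iterations = 50,
): Map<string, number> {
  const lookup = (table: { [k: string]: number }, key: string): number =>
    key in table ? table[key] : 1;

  // Teleport distribution proportional to the prior of each node's kind.
  let priorTotal = 0;
  for (const n of nodes) priorTotal += lookup(tuning.nodePriors, n.kind);
  if (priorTotal <= 0) throw new Error("node priors must sum to a positive value");
  const teleport = new Map<string, number>();
  const outWeight = new Map<string, number>();
  for (const n of nodes) {
    teleport.set(n.id, lookup(tuning.nodePriors, n.kind) / priorTotal);
    outWeight.set(n.id, 0);
  }
  for (const e of edges) {
    if (!outWeight.has(e.src) || !outWeight.has(e.dst)) {
      throw new Error(`edge references unknown node: ${e.src} -> ${e.dst}`);
    }
    outWeight.set(e.src, outWeight.get(e.src)! + lookup(tuning.edgeWeights, e.label));
  }

  let scores = new Map<string, number>();
  for (const n of nodes) scores.set(n.id, teleport.get(n.id)!);

  for (let i = 0; i < iterations; i++) {
    // Nodes with no outgoing weight hand their score to the teleport vector.
    let dangling = 0;
    for (const n of nodes) {
      if (outWeight.get(n.id) === 0) dangling += scores.get(n.id)!;
    }
    const next = new Map<string, number>();
    for (const n of nodes) {
      next.set(n.id, (1 - damping + damping * dangling) * teleport.get(n.id)!);
    }
    for (const e of edges) {
      const w = lookup(tuning.edgeWeights, e.label);
      if (w === 0) continue;
      // Each node splits its score among out-edges in proportion to weight.
      const share = (damping * scores.get(e.src)! * w) / outWeight.get(e.src)!;
      next.set(e.dst, next.get(e.dst)! + share);
    }
    scores = next;
  }
  return scores;
}

// Hypothetical tuning: authorship passes on three times as much as review.
const tuned = tunedPageRank(
  [
    { id: "pr:1", kind: "PULL_REQUEST" },
    { id: "user:alice", kind: "USER" },
    { id: "user:bob", kind: "USER" },
  ],
  [
    { src: "pr:1", dst: "user:alice", label: "AUTHORS" },
    { src: "pr:1", dst: "user:bob", label: "REVIEWS" },
  ],
  { edgeWeights: { AUTHORS: 3, REVIEWS: 1 }, nodePriors: {} },
);
```

With these example weights, `user:alice` receives a larger share of the pull request's score than `user:bob`; setting both weights to the same value would make their scores equal.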
Create the Beta
- Polish end-to-end SourceCred install and usage flow
- Add documentation / tutorial on getting up and running
- 🚀 Launch a website
- 🚀 Create a logo
Collect Feedback & Iterate
- Work with the IPFS community on their cred attribution