📊 Is anyone else super dissatisfied with the tech industry's preferred/tracked open-source metrics?
@GitHub stars; pip install or download counts; @-mentions or tags on social media: all of these stats can, and will, be gamed. We can do much better!
👇🏻Here are some ideas:
(1) Projects listing a particular repo as a dependency.
This can be easily tracked via GitHub's dependency graph, or by scraping which Dockerfiles, conda environment YAMLs, etc. reference a library or framework.
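A rough sketch of the scraping approach in (1), assuming you have already collected the text of each project's manifest files (requirements files, conda environment YAMLs, Dockerfiles). The helper names `mentions_package` and `count_dependents` are made up for illustration, and the regex is a simplification that will miss unusual install patterns:

```python
import re


def mentions_package(manifest_text, package):
    """Return True if `package` appears as a standalone dependency token."""
    # Match the name only when it is not part of a longer identifier:
    # "numpy" matches "numpy==1.26" and "- numpy>=1.20", but not "numpy-stubs".
    pattern = re.compile(
        r"(?<![A-Za-z0-9_.\-])" + re.escape(package) + r"(?![A-Za-z0-9_.\-])",
        re.IGNORECASE,
    )
    for line in manifest_text.splitlines():
        line = line.split("#", 1)[0]  # ignore trailing comments
        if pattern.search(line):
            return True
    return False


def count_dependents(manifests, package):
    """Count manifests (a mapping of path -> file text) that reference `package`."""
    return sum(mentions_package(text, package) for text in manifests.values())


# Hypothetical manifests from three different projects.
manifests = {
    "a/requirements.txt": "numpy==1.26\npandas\n",
    "b/environment.yml": "dependencies:\n  - python=3.11\n  - numpy>=1.20\n",
    "c/Dockerfile": "FROM python:3.11\nRUN pip install numpy-stubs  # not numpy itself\n",
}
print(count_dependents(manifests, "numpy"))
```

Counting distinct projects rather than raw mentions is a deliberate choice here: one repo with twenty Dockerfiles should not count twenty times.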
(2) "Bus factor" of a particular open-source project.
Bus factor measures how resilient a project is to sudden engineering turnover, and it is a solid way to understand the health of an open-source project.
More on bus factors below:
📄https://arxiv.org/abs/2202.01523
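For a feel of how this can be computed, here is a minimal sketch of one common simplification: the smallest number of contributors who together account for more than half of all commits. This is an assumption for illustration, not necessarily the method used in the linked paper, and the function name `bus_factor` and the 50% threshold are my own choices:

```python
from collections import Counter


def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of top authors whose commits exceed `threshold` of the total."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk authors from most to least active, accumulating their share.
    for rank, (_, n_commits) in enumerate(counts.most_common(), start=1):
        covered += n_commits
        if covered / total > threshold:
            return rank
    return len(counts)


# Hypothetical commit log: one author name per commit.
log = ["alice"] * 6 + ["bob"] * 3 + ["carol"]
print(bus_factor(log))
```

Commit counts are a crude proxy for knowledge of the codebase; per-file ownership or review activity would give a more faithful picture, at the cost of more data wrangling.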
(3) General insights into code health:
What is the test coverage like for the open-source project? Are there any code quality concerns? To what extent?
✨📝 "We propose a single framework to identify several categories of code quality issues that can cause low code readability and maintainability. We also developed an automated tool that spots large numbers of such issues in open-source repositories."
https://t.co/QuL0FQTGq7
👩💻 Paige Bailey #BlackLivesMatter @DynamicWebPaige · Nov 14
(4) In a similar vein:
What is the documentation coverage like for each feature? This should include everything from function docstrings, to guides and tutorials.
And are there 'vignettes' (end-to-end examples, stitching together multiple features)?
📃 Documentation is a developer's first impression of your project—and great docs show a commitment to empathy+respect: welcoming people wherever they are, and patiently helping them move to the next level.
Docs are such a powerful way to care for users, to build a community.
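The function-docstring part of (4) is the easiest piece to measure automatically. A minimal sketch for Python sources, using the standard-library `ast` module; the name `docstring_coverage` is hypothetical, and this says nothing about guides, tutorials, or vignettes, which need a different kind of audit:

```python
import ast


def docstring_coverage(source):
    """Fraction of functions and classes in `source` that have a docstring."""
    tree = ast.parse(source)
    definitions = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not definitions:
        return 1.0  # nothing to document
    documented = sum(ast.get_docstring(node) is not None for node in definitions)
    return documented / len(definitions)


# Hypothetical module: two of its four definitions carry docstrings.
example = "\n".join([
    "def documented():",
    "    'Has a docstring.'",
    "    return 0",
    "def undocumented():",
    "    return 1",
    "class Thing:",
    "    'Documented class.'",
    "    def method(self):",
    "        pass",
])
print(docstring_coverage(example))
```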
(5) Do one or more companies depend on the open-source project for their core business?
And, if yes: are engineers at those companies contributing back to the project + upstreaming changes, or have they forked the codebase and kept it closed-source?
(6) For machine learning & research libraries:
Of recent papers that have released source code, how many of those papers import or use your tool? How has that # changed over time?
Do you observe that a given framework or library is preferred for a specific use case/domain?
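One way to approximate (6), assuming you already have the released Python source for each paper along with its publication year. The data layout and the names `imported_modules` and `usage_by_year` are assumptions for illustration, and this only catches static top-level-package imports:

```python
import ast
from collections import Counter


def imported_modules(source):
    """Return the set of top-level packages imported by a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import utils"
            modules.add(node.module.split(".")[0])
    return modules


def usage_by_year(papers, library):
    """Count papers per year whose code imports `library`.

    `papers` is an iterable of (year, [source strings]) pairs, one per paper.
    """
    counts = Counter()
    for year, sources in papers:
        if any(library in imported_modules(src) for src in sources):
            counts[year] += 1
    return dict(sorted(counts.items()))


# Hypothetical corpus of four papers.
papers = [
    (2021, ["import jax.numpy as jnp\n"]),
    (2022, ["from jax import grad\n", "import os\n"]),
    (2022, ["import torch\n"]),
    (2022, ["import jax\n"]),
]
print(usage_by_year(papers, "jax"))
```

Dividing each year's count by the number of papers with released code that year gives a share rather than a raw count, which is the fairer comparison as the overall volume of papers grows.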