Skip to content

Instantly share code, notes, and snippets.

@BenJam
Last active July 12, 2017 22:47
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save BenJam/9e1d6276efabced6e3973137e6993b01 to your computer and use it in GitHub Desktop.
Save BenJam/9e1d6276efabced6e3973137e6993b01 to your computer and use it in GitHub Desktop.

What is this?

A scratchpad for a classifer of software libraries focussing on a number of areas that will guide users to make more informed decisions about whether to include a library within their own project.

See librariesio/libraries.io#1486 for more information.

Themes/Questions

There are a number of key themes here that we need to look at. Within each I have posed a number of questions that might be of interest to a developer or maintainer.

Code

  • What scale is this project? - Looking at things like novel LoC, #dependencies and LoC in dependencies perhaps.
  • How many dependencies are included (inc. transisitve dependencies) - Simple count of dependencies vs. transisitvie dependencies (as we do now)
  • Are dependencies 'pinned' or open?
  • Does the project have any implicit dependencies (i.e. 'Elastic is expected to be running on port X') - This one is going to be very difficult, requiring some readme/documentation scanning
  • Does the project have tests?
  • Are those tests passing? - Some integration with CI provider perhaps?
  • What's the test coverage like? Is it improving? - using external code checking tool for instance
  • Does the project use a CI process or tool? - Detecting configuration file and/or badged in readme.md files
  • Is the code linted? Does it have many warnings?
  • Is the code readable? Does the project use standard code style? Does it follow it? - Check for code style in readme.md, contributing.md or styleguide.md or simialr. Apply a standard, external provider scroe like code climate.
  • Is the project 'performant'? Does it include any performance tests?

Documentation

  • Is the code documented?
  • What's the code comment coverage like? Is it improving?
  • Is there a readme?
  • IS there a wiki?
  • Is there a website?
  • Is there a installation document?
  • Is there an architectural diagram or other 'architecture-type docuemnt?'
  • Are these documents up to date?

Security

  • Does the project have any open CVEs listed against it directly?
  • Does the project have any open CVEs listed in its dependency tree?
  • How quickly are CVEs patched?
  • How quickly are dependency updates patched?
  • Does the project execute any code?
  • What priv's does the project require?
  • Does the project communicate with any external service?
  • Does the porject whitelist or blacklist those services?
  • Does the project communicate using secure connections?
  • Does the project cleanse and validate all inputs form external sources?
  • Does the project show any evidence of a security review (attacker model, risk assesment, thread/library safe checks etc)?

Distribution

  • Is the project distributed via a package manager?
  • Has the project been removed from its package manager?
  • Does that package manager have adequate measures to protect users of that distribution?
  • Does the distibution include a link to source? Is that source validated as the correct source for this project?
  • Does the project use semantic versioning?
  • Is there a regualr release cycle?
  • Has that cycle become longer or shorter?
  • Has there been a release in the last 3/6/12 months?
  • Is there a changelog? Is it updated?
  • What size is the diff of releases relative to the size of the project? i.e. how consistent is the project in its evolution
  • Are releases freuently breaking or non-breaking changes?

Community

Contributors

  • Are there many or few?
  • How many are there relative to the number of 'users' of this project?
  • Are they likely to be over-stretched or can they cover that demand?
  • What kind of redundancy is built into the community's structure?
  • How many contributors are there with a commit bit?
  • How many contributors are permitted to distribute a new patch or release?
  • Is the contirbutor

Contributions

  • How many contributors are actively working on the project versus the number of contributors over time?
  • How distributed is the workload (i.e. are there 1-2 people contributing 80-90% of the work?)
  • What proportion of contributions are code versus documentation, new issues, comments or PRs?
  • Are the number of issues being managed effectively? Are they increaseing, decreasing or staying the same?
  • What proportion of contributions are accepted within the exsting community versus outside?
  • What's the proportion of issues that are new, old and 'legacy'?
  • What's the average time to close an issue? Or accept a PR?

Principles and practices

  • Does the project have good guidance for contributors (code of conduct, readme.md, issue and PR templates)
  • Is there good guidance for getting a local instance up and running? A vagrant install? Test fextures etc?
  • Does the project have good guidance for contributing? (issue labels, code style, architecture, strategy, key features, decisions)
  • Does the project manage and distribute its knowledge well (project website, wiki, FAQ, stackexchange)
  • Are issues labelled, is there guidance on how to label?
  • Are these documents maintained?
  • Is deployment, testing, documentation, reviewing etc. automated?

Academic

  • Is there a citations file?

Ecosystem

  • Which projects would be at risk if certain factors changed? (orgs removing support, maintianers leaving, project yanked)
  • Can we use graph theory to create a stronger open source ecosystem, so that changes to not avalance.

Support

  • What's the turnover of support requests to answers like on sites like stack exchange?
  • Are support responses from a wide or narrow group?
  • Does the project have a readme.md, a installing.md or other installation document?
  • Is that/are those documents maintained?

FAILS

  • Data flow dependency
  • Static analysis
  • Implicit dependencies
  • Process/data/workflow dependencies/complements
@sclinede
Copy link

sclinede commented Jun 6, 2017

Distribution

What size are diffs relative to the size of the project? - How do you plan to use that? I mean that maybe part of the metrics should be listed only as a hint, without any label assigned to it.

Does the project use semantic versioning?
Are releases frequently breaking or non-breaking changes? - How do you plan to measure that? By version name? But we could not be sure that maintainers understand and use the Semantic Versioning... Could we provide any verification for that?

Contributors

Are there many or few? - we need to have a tool to measure that across different languages or so. As I understand for the different community, those values may vary.

What kind of redundancy is built into the community's structure? - Do you have any specific example of that measurement?

Principles and practices

Are issues labeled, is there guidance on how to label? - Don't understand how it could be measured? Is there any standard or docs for labeling?

@BenJam
Copy link
Author

BenJam commented Jun 6, 2017

@scliede is is literally just a dumping ground of ideas at this point that I'll be working through into something practical. Starting from base principles :)

@sclinede
Copy link

sclinede commented Jun 6, 2017

Yeah, I got it just some concerns that I found and mentioned to answer them later, sorry if it was too straight)

@BenJam
Copy link
Author

BenJam commented Jun 6, 2017

Fair enough :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment