Ideas to improve my workflow

Easy

diff-factor-comments

A script that automatically separates a Haskell file diff into two diffs: one with only changes to comments, and one with changes to actual code.

This could be extended to other languages, as long as you can describe the comment syntax. A minimum viable version would only support single-line comment syntax (since I never use multiline comments anyway).
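
As a rough sketch of the core classification step (assuming only Haskell's single-line -- comment syntax, and a unified diff read from stdin), something like the following could work. It only classifies changed lines; producing two independently applyable diffs would additionally require rewriting the hunk headers.

```haskell
module Main (main) where

import Data.Char (isSpace)
import Data.List (isPrefixOf, partition)

-- A changed line is "comment-only" if, after dropping the +/- marker and
-- leading whitespace, it starts with a line comment. Trailing comments on
-- code lines and {- -} block comments are deliberately ignored in this sketch.
isCommentChange :: String -> Bool
isCommentChange ln = "--" `isPrefixOf` dropWhile isSpace (drop 1 ln)

-- Keep only added/removed lines, skipping hunk headers, context lines, and
-- the "--- a/..." / "+++ b/..." file headers. (Caveat: a removed top-level
-- comment also produces a line starting with "---" and is wrongly skipped
-- by this crude check.)
isChange :: String -> Bool
isChange ('+':rest) = not ("++" `isPrefixOf` rest)
isChange ('-':rest) = not ("--" `isPrefixOf` rest)
isChange _          = False

main :: IO ()
main = do
  diffLines <- lines <$> getContents
  let (commentChanges, codeChanges) =
        partition isCommentChange (filter isChange diffLines)
  putStrLn "== comment-only changes =="
  mapM_ putStrLn commentChanges
  putStrLn "== code changes =="
  mapM_ putStrLn codeChanges
```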

Medium

git-commit-comments

Using git and diff-factor-comments, commit only the changes that affect comments. It would have a flag to commit only staged changes and a flag to commit only unstaged changes. The commit name/description would be automatically generated from the diff contents (and perhaps there could be a magic dotfile in the repository to change the behavior on a per-project basis).

The complicated part is dealing with the three-phase structure of git (unstaged, staged, committed) and correctly restoring the relevant parts of the state after the right parts are committed. Perhaps this logic could be usefully separated into a generic higher-order script that takes as input a commit message and an executable with the same "type signature" as diff-factor-comments.
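
A minimal sketch of the unstaged case, assuming a placeholder factorComments function standing in for diff-factor-comments and a hardcoded commit message: generate the working-tree diff, stage only the comment hunks with git apply --cached, and commit the index, leaving the code changes in the working tree.

```haskell
module Main (main) where

import System.Process (readProcess)

-- Placeholder standing in for diff-factor-comments: split a unified diff into
-- (comment-only patch, code-only patch). Here it just treats the whole patch
-- as comment-only so the sketch runs end to end; a real implementation would
-- split at hunk granularity.
factorComments :: String -> (String, String)
factorComments patch = (patch, "")

main :: IO ()
main = do
  -- Unstaged changes: working tree vs. index.
  patch <- readProcess "git" ["diff"] ""
  let (commentPatch, _codePatch) = factorComments patch
  if null commentPatch
    then putStrLn "No comment changes to commit."
    else do
      -- Stage only the comment hunks; everything else stays in the working tree.
      _ <- readProcess "git" ["apply", "--cached", "-"] commentPatch
      -- A real tool would generate this message from the patch contents.
      _ <- readProcess "git" ["commit", "-m", "Update comments"] ""
      pure ()
```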

Hard

hackage-rank

Given a Hackage database, this script will compute various interesting metrics about each package that will be used to determine whether or not I want to depend on it. Not all of these metrics are equally hard to implement; this is just a list of everything I could think of that could feasibly be automatically computed for 90% of codebases. If a metric is marked with PERSONAL, that means it is something that I care about but others may not. A rough sketch of one of these metrics appears after the list.

  • Trustworthiness:
    • How many people depend on this package?
    • What is the PageRank of this package in Hackage's dependency graph?
    • Are people depending on this package more and more over time? Or is it a popular package that is nonetheless implicitly deprecated, like parsec?
    • How long-tailed is the distribution of committers who have made semantics-affecting commits (e.g.: projects that have lots of important commits made by random drive-by contributors are probably less trustworthy)? What does that look like if you weight by various metrics of committer trustworthiness (e.g.: activity on GitHub, number of Haskell repositories, quality of Hackage packages maintained by this person)?
  • Correctness:
    • Does the package compile under Nix? Automatically try doJailbreak and dontCheck for "partial credit" if it doesn't.
    • Are all instances of RecordWildCards
    • Does the package have Liquid Haskell annotations? If so, does the Liquid Haskell typecheck?
    • Does the package include Coq or Agda source code?
    • What is the Safe Haskell status of this source code?
    • Does the codebase use {-# LANGUAGE Trustworthy #-}? If so, perhaps a flag should be raised... though in practice this may actually be a signal of quality (most people don't care about Safe Haskell).
    • How many times does the codebase use functions/values whose names begin with unsafe?
  • Performance:
    • Does the codebase use the [] type constructor?
    • Does the codebase use the String type from base?
    • Does the codebase use ST?
    • Does the codebase use lots of unboxed types?
    • Does the codebase define recursive ADTs whose recursive constructors include types that transitively contain mutable data? If so, the performance will likely be bad, because GHC's garbage collector does work linear in the number of mutable references in the heap.
    • Does the codebase have benchmarks?
  • Documentation:
    • What is the Haddock coverage of this codebase?
    • What is the mean/median number of characters per comment?
    • PERSONAL: Does the codebase use multiline comment syntax?
  • Maintenance:
    • How many commits has this package gotten in the last day/month/year? Might be hard to figure out a robust statistic for this (since we want to approximate the instantaneous derivative of commits/time, but computing the numerical derivative of noisy / discontinuous data will give nonsense).
    • How long has this package been around?
    • Is the upstream of this package on GitHub, or somewhere else?
    • Does the package have Cabal upper bounds? Does it follow the PVP?
  • Code quality:
    • Does the codebase include a .hlint.yaml? If so, how long is it?
    • Does the codebase include a .stylish-haskell.yaml?
    • How many of each type of warning does GHC emit when compiling with the -Weverything flag? It's important to stratify on warning type because some warnings may occur in everyone's code, and thus not be predictive of code quality.
    • How many of each type of warning does HLint emit when using the .hlint.yaml in the codebase?
    • How many of each type of warning does HLint emit when using a user-specified .hlint.yaml?
  • Other:
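
As a rough illustration of what one of these metrics might look like in practice, here is a sketch of the "unsafe count" metric: a crude lexical count of identifiers beginning with unsafe in an unpacked package's .hs files. A real version would use a parser (or the Kythe indexer mentioned below) rather than tokenizing raw text.

```haskell
module Main (main) where

import Control.Monad (filterM, forM)
import Data.Char (isAlphaNum)
import Data.List (isPrefixOf)
import System.Directory (doesDirectoryExist, listDirectory)
import System.Environment (getArgs)
import System.FilePath ((</>), takeExtension)

-- All files under a directory, recursively.
listFilesRecursive :: FilePath -> IO [FilePath]
listFilesRecursive dir = do
  entries <- map (dir </>) <$> listDirectory dir
  dirs <- filterM doesDirectoryExist entries
  let files = filter (`notElem` dirs) entries
  rest <- concat <$> forM dirs listFilesRecursive
  pure (files ++ rest)

-- Split a source file into identifier-ish tokens.
tokens :: String -> [String]
tokens = words . map (\c -> if isAlphaNum c || c `elem` "_'" then c else ' ')

-- Count occurrences of identifiers whose names begin with "unsafe".
countUnsafe :: String -> Int
countUnsafe = length . filter ("unsafe" `isPrefixOf`) . tokens

main :: IO ()
main = do
  [pkgDir] <- getArgs  -- path to an unpacked package
  hsFiles <- filter ((== ".hs") . takeExtension) <$> listFilesRecursive pkgDir
  counts <- forM hsFiles (fmap countUnsafe . readFile)
  putStrLn ("unsafe-prefixed identifier occurrences: " ++ show (sum counts))
```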

Note that, of course, the metrics emitted by this tool are not something you should optimize for when writing code; they should only be used when deciding whether or not to depend on some code. These metrics correlate with quality, but they do not define it.

It may be worthwhile to make these metrics more granular (e.g.: by module, rather than by package).

It would also be interesting to use a neural network to try to learn the function that decides whether or not I will use a package in a given category, based on my past usage of packages and these metrics. Alternatively, I could rate a random sample of packages on Hackage and train on a subset of those ratings (and, to prevent overfitting, stop training when the loss on the complement of that subset starts increasing). The loss function should probably prioritize minimizing false negatives (where the model thinks I wouldn't use the package even though I actually would) over minimizing false positives, since it doesn't take that much time to evaluate the quality of a package.
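
A small sketch of the loss side of this, assuming a weighted binary cross-entropy (the weight of 5 below is an arbitrary illustrative choice, not something fixed above) and the simplest possible early-stopping rule:

```haskell
module Main (main) where

-- Asymmetric binary cross-entropy: a false negative (I would use the package,
-- but the model says no) is weighted more heavily than a false positive.
-- y is the true label (1 = I would use it); p is the predicted probability.
weightedLoss :: Double -> Double -> Double
weightedLoss y p = negate (fnWeight * y * log p + (1 - y) * log (1 - p))
  where fnWeight = 5  -- arbitrary illustrative weight on false negatives

-- Early stopping: index of the last epoch in the initial run of
-- non-increasing holdout losses.
stopEpoch :: [Double] -> Int
stopEpoch losses = length (takeWhile id (zipWith (>=) losses (drop 1 losses)))

main :: IO ()
main = do
  print (weightedLoss 1 0.2)                     -- expensive: likely false negative
  print (weightedLoss 0 0.8)                     -- cheaper: likely false positive
  print (stopEpoch [0.9, 0.7, 0.6, 0.65, 0.7])   -- stops at epoch 2
```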

Kythe has a Haskell indexer that could be useful for computing many of these metrics.

It is important to remember that there are only 12,580 packages on Hackage, so implementing some of these features will only be worthwhile in the following situations:

  1. You are being paid by people in the Haskell community to implement them.
  2. The feature improves the model's accuracy on a random sample of Hackage enough that the resulting time savings justify the necessary work. (Reading through and rating 12,580 packages by hand can probably be done pretty quickly.)
  3. You enjoy implementing the feature enough that it is worthwhile as an activity in and of itself.