Skip to content

Instantly share code, notes, and snippets.

@starsandspirals
Last active August 30, 2019 08:49
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save starsandspirals/3594e0b6072ff95c8d51e5264a7f6277 to your computer and use it in GitHub Desktop.
Save starsandspirals/3594e0b6072ff95c8d51e5264a7f6277 to your computer and use it in GitHub Desktop.

Project Summary

A complete list of my contributions over the course of the project can be found at https://github.com/google/codeworld/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Astarsandspirals+created%3A%3C2019-08-26+created%3A%3C2019-09-03+created%3A%3E2019-05-27.

The aim of my project for Google Summer of Code was to create a GHC source plugin which provides a domain-specific language for embedding exercise requirements into Haskell source files, so that they can be checked and the results automatically reported. This plugin has potential applications for users in a broad range of educational contexts, such as university professors teaching functional programming classes using Haskell, community members writing tutorials for their libraries, and authors who are writing textbooks with exercises.

Much of the groundwork for this plugin was built on work that existed as a prototype within CodeWorld. This is a web-based educational environment for K-12 students to learn the basics of programming in a functional style, as well as being a more general environment for experimenting with Haskell that anyone can use. The requirements checker within CodeWorld was an additional step run as part of the compilation process, which used syb and haskell-src-exts to parse the code. This allowed teachers to easily implement rules such as patterns specified via regular expressions that would be matched as templates against the Haskell code submitted by their students.

My eventual goal was to take this system and rewrite it in the form of a GHC plugin. This would have two main benefits: firstly, it would allow users outside of the CodeWorld ecosystem to make use of the requirements checker, and secondly, it would allow for more complex checks to be performed at different stages of compilation. Previously the system could only analyse the raw source code and the syntax tree parsed by haskell-src-exts, but by making use of the GHC API it is possible to check requirements after renaming, after typechecking, and potentially even carry out runtime tests.

Results and Documentation

The code for this project is available at https://github.com/google/codeworld/tree/master/codeworld-requirements. This folder is present in the main CodeWorld repository but is also a standalone Cabal package, and when installed the source plugin can be loaded at compile time by passing the -fplugin=CodeWorld.Requirements.RequirementsChecker flag to GHC.

Requirements can be embedded in Haskell source files using REQUIRES or XREQUIRES comments, the high-level format of which is described in the next section. If such a comment is provided, then the plugin will output a :: REQUIREMENTS :: block of information, which is then produced as a message by the compiler. The most complex piece for the user is the language used to specify requirements within the comments, which is described at the end.

Requirement Comments

There are two ways to embed requirements into a source file: REQUIRES comments, or XREQUIRES comments.

A REQUIRES comment looks like this:

{-
    REQUIRES

    Description: Some user-readable description.
    Rules:
    - << YAML-format description of 1st rule to check >>
    - << YAML-format description of 2nd rule to check >>
    - << YAML-format description of 3rd rule to check >>
-}

Description is a user-readable arbitrary string, which is used in the output to provide a section header summarizing this requirement. The Rules are YAML machine-readable descriptions of the formal rules that are verified. Details on the rules are below. A program can contain multiple REQUIRES blocks, one for each top-level requirement. Each requirement can then have any number of rules that must pass before that requirement is satisfied.

An XREQUIRES comment is an obfuscated variation on REQUIRES. Instead of a single requirement, it contains a compressed and base64-encoded list of requirements. (It is possible to include multiple XREQUIRES blocks, but it's not necessary.) It looks like this:

{-
    XREQUIRES
    eJyNkE9LxDAQxb/KkNMKQWxtd/Wo0IW96S56sUJmm+km0CQlyeKfT2+oDQq6
    sKcZeL957zEvbNs8Pm22zQ5aJkbvDh6N4CCkiyFNtHLaBQTljoME6yLsCYyT
    utckL1vWWgAwGDtFoXkfqYskF7MTh7paVqvy4l8o+XIoyroolieBwOGmXl3d
    1olg/HfbO+iPtovaWRB9yiIBB7LkcdCfFCAqgs4Zk+QRYyRvc1eFYafNONAD
    prOwmI7/2D+j17gfkhPGyYzSK9ATSOq1JQlvOqpJyOm5Ts6ZwfuP9Sx8JxUc
    cuJpqDwHuj4Hqn4g9voF6VGoEg==
-}

Requirements Output Block

A typical requirements output block looks something like this:

                    :: REQUIREMENTS ::
Obfuscated:

    XREQUIRES
    eJyNkE9LxDAQxb/KkNMKQWxtd/Wo0IW96S56sUJmm+km0CQlyeKfT2+oDQq6
    sKcZeL957zEvbNs8Pm22zQ5aJkbvDh6N4CCkiyFNtHLaBQTljoME6yLsCYyT
    utckL1vWWgAwGDtFoXkfqYskF7MTh7paVqvy4l8o+XIoyroolieBwOGmXl3d
    1olg/HfbO+iPtovaWRB9yiIBB7LkcdCfFCAqgs4Zk+QRYyRvc1eFYafNONAD
    prOwmI7/2D+j17gfkhPGyYzSK9ATSOq1JQlvOqpJyOm5Ts6ZwfuP9Sx8JxUc
    cuJpqDwHuj4Hqn4g9voF6VGoEg==

[Y] First user-visible description
[Y] Second user-visible description
[N] Third user-visible description
    Detailed description of what went wrong.
                  :: END REQUIREMENTS ::

The requirements block includes everything from :: REQUIREMENTS :: to the matching :: END REQUIREMENTS ::.

The block begins with an obfuscated version of the requirements, which can be used to replace the plain-text requirements (REQUIRES) with the obfuscated version (XREQUIRES) as discussed above. After this, there are top-level lines beginning with one of [Y] , [N] , or [?]. These indicate, for each requirement, if the requirement is satisfied, not satisfied, or if there was a problem checking the requirement. In the latter two cases, there are further lines explaining what went wrong.

Requirements Language

These are the current checks implemented, and how to specify them as rules in requirement comments.

  • containsMatch

    Example:

    containsMatch:
      template: |
        __func $any = $any
      cardinality:
        atLeast: 4
    explanation: The code defines fewer than 4 functions.
    

    Checks that the code contains a declaration matching the one given. The template is a standard Haskell declaration, except that it can contain certain wildcards. Any identifier beginning and ending with underscores, like _x_, is a placeholder that matches any chosen name. Additionally, there's some additional extra syntax:

    • $any matches any pattern or expression at all.
    • $var matches any variable name.
    • $con matches any constructor name.
    • $lit matches any literal expression.
    • $num, $char, or $str match specific types of literals.
    • $(tupleOf [| some template |]) matches any tuple whose elements each match the template in Oxford brackets (that is, between [| and |]).
    • $(contains [| some template |]) matches any pattern or expressions with a subexpression that matches the template in Oxford brackets (that is, between [| and |]).
    • $(allOf [ [| first template |], [| second template |] ]) matches any expression that matches all of the child templates. Other logical combinators include anyOf and noneOf.
  • matchesExpected

    Example:

    matchesExpected:
      variable: var
      hash: 999999
    

    Checks that the definition of var, once source locations are cleared, hashes to the given value (modulo 1 million).

  • notDefined

    Example:

    notDefined: var
    

    Checks that there is no definition for a variable named var. If there is, it fails.

  • usesAllParams

    Example:

    usesAllParams: func
    

    Checks that func makes use of all of its named parameters. If any parameters are not used in an equation, it fails.

  • notUsed

    Example:

    notUsed: var
    

    Checks that there are no references to var in the module.

  • maxLineLength

    Example:

    maxLineLength: 80
    

    Checks that there are no lines longer than 80 characters.

  • noWarningsExcept

    Example:

    noWarningsExcept: [Defined but not used]
    

    Checks that there are no warnings, except for those that match any of a list of given regular expressions. This list can be empty, in which case no warnings will be allowed.

  • typeSignatures

    Example:

    typeSignatures: true
    

    Checks that all top-level declarations have accompanying type signatures.

  • blacklist

    Example:

    blacklist: [circle]
    

    Checks that none of the symbols appearing in the blacklist are used. This rule and the whitelist rule below are checked after renaming, so it can be used in cases where a standard library symbol needs to be blacklisted but new symbols with the same name (such as ones in a let or where construct) should still be allowed.

  • whitelist

    Example:

    whitelist: [ellipse]
    

    Checks that only symbols appearing in the whitelist are used. This rule and the blacklist rule above are checked after renaming, so it can be used in cases where a standard library symbol needs to be whitelisted but new symbols with the same name (such as ones in a let or where construct) should not be allowed.

Future Work

  • Integration with CodeWorld

The more powerful source plugin version of the requirements checker has not yet been integrated with CodeWorld, which is still using the original version, although progress was made towards this by migrating from haskell-src-exts to ghc-lib-parser over the course of the project. This is due to some difficulties in getting the plugin functioning with GHCJS. The current state of this integration can be seen at google/codeworld#1074.

  • More powerful rules

Now that checks can be performed after renaming and after typechecking just as easily as they can after parsing or on the raw source code, this opens the door for much more powerful rules to be implemented within the requirements checker, which could involve type unification or other static analysis already implemented in GHC. For example, http://hackage.haskell.org/package/inspection-testing shows some very powerful uses of GHC via plugins to prove things about code, which could provide inspiration.

  • Runtime testing

This would be harder to implement even with the additional power of a source plugin, but is still theoretically possible. One way would be to have the plugin inject the testing code at the beginning of main, and at the same time move requirements reporting from a diagnostic output at compile time to a runtime action. Requirements that are checked at compile time would be hard-coded into the result, but dynamic requirements (checked at runtime) would be evaluated on each execution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment