@gushogg-blake
Last active January 16, 2023 09:31

Codex

Current find & replace features work well for names and single-line patterns that happen to map well to regular expressions. For more complex manipulations, something more comprehensive and purpose-built is needed.

In this article I introduce Codex, a flexible and expressive language for specifying code modifications.

Codex works like traditional find & replace, but with some extensions to make dealing with multiple lines and indentation intuitive, and to allow matching language-specific syntax elements with Tree-sitter queries.

Find expressions

A Codex find expression consists of one or more of the following:

  • Plain text, which matches itself.

  • A newline, which matches one or more newlines, skipping over whitespace-only lines.

  • An increase or decrease in indentation, which matches the same change in indentation in the code, relative to the current context.

  • On its own line, * or +, which match zero or more lines and one or more lines respectively, followed by an optional capture label (see below), e.g. * @someLines.

  • A regular expression (in JavaScript literal syntax) followed by an optional capture label, e.g. /\w+/@functionName.

  • A Tree-sitter query which matches the text of the corresponding nodes, followed by an optional capture label, e.g. (function_declaration) @fn.

  • [ and ] which mark the start and end of the text to replace, respectively. Either or both can be omitted, defaulting to the start and end of the match.

Capture labels

A capture label consists of an @ followed by an alphabetic name for the capture, and makes the associated match available to use in the replacement (see Replacement expressions).
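
As a rough illustration of the label syntax, here is a minimal sketch of how a label could be read from the start of a string. The `parseCaptureLabel` helper and the shape of its return value are my own assumptions for this example, not part of Codex.

```javascript
// Hypothetical helper, not part of Codex: reads a capture label
// ("@" followed by an alphabetic name) from the start of `text`.
function parseCaptureLabel(text) {
	const match = /^@([A-Za-z]+)/.exec(text);

	if (!match) {
		return null;
	}

	return {
		name: match[1],
		length: match[0].length, // characters consumed, including the @
	};
}

console.log(parseCaptureLabel("@functionName rest")); // name: "functionName", length: 13
```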

Examples

  1. Combining literals, regular expressions, and line quantifiers to match a JavaScript function:

    function /\w+/@name\(/[^)]*/@args) {
    	* @body
    }
    
  2. Matching one or more JavaScript functions with a Tree-sitter query:

    (function_declaration)+ @fns
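
To show what the first expression is meant to capture, the sketch below approximates it with an ordinary JavaScript regular expression and runs it against a small function. The regular expression, the `approx` and `source` names, and the sample function are assumptions made for illustration; nothing here implies that Codex compiles expressions this way. The approximation also hard-codes a single tab of indentation, which is exactly the kind of detail the Codex form leaves implicit.

```javascript
// Rough regular-expression approximation of the first find expression.
// Illustration only: an assumption, not how Codex is implemented.
const approx = /function (?<name>\w+)\((?<args>[^)]*)\) \{\n(?<body>(?:\t.*\n|[\t ]*\n)*)\}/;

// Hypothetical input to match against.
const source = [
	"function add(a, b) {",
	"\tconst sum = a + b;",
	"",
	"\treturn sum;",
	"}",
].join("\n");

const captures = approx.exec(source).groups;

console.log(captures.name); // add
console.log(captures.args); // a, b
console.log(JSON.stringify(captures.body));
```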
    

Escaping

The following characters must be escaped with a backslash in literals:

  • \, /, [, ], and (.
  • @ if preceded by a regular expression or Tree-sitter query.
  • * and + if at the start of a line.
  • *, +, and ? if preceded by a Tree-sitter query as in the example above.
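
The context-free rules above can be sketched as a small helper that escapes arbitrary text for use as a literal. `escapeLiteral` is a hypothetical name, and the sketch assumes that "start of a line" allows leading indentation. It does not handle the two rules that depend on what precedes the literal (@ after a regular expression or query, and quantifier characters after a query), since those need knowledge of the surrounding expression.

```javascript
// Hypothetical helper, not part of Codex: escapes plain text so it can be
// used as a literal in a find expression. Covers only the context-free rules.
function escapeLiteral(text) {
	return text
		.split("\n")
		.map((line) => {
			// Escape \, /, [, ], and ( wherever they appear.
			const escaped = line.replace(/[\\\/\[\]\(]/g, "\\$&");

			// A leading * or + would otherwise read as a line quantifier.
			// Assumption: indentation before it still counts as "start of line".
			return escaped.replace(/^(\s*)([*+])/, "$1\\$2");
		})
		.join("\n");
}

console.log(escapeLiteral("a[0] = f(x)")); // a\[0\] = f\(x)
console.log(escapeLiteral("* item"));      // \* item
```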

Capturing & deleting Tree-sitter nodes

Captured nodes within Tree-sitter queries are available in the replacement. A capture name can be prefixed with a dash (e.g. @-name) to delete that node from the result, i.e. it will not be present when a surrounding capture is inserted into the replacement.

Deleted nodes are available to use elsewhere in the replacement without the prefix, e.g. @name.

Replacement expressions

A replacement expression consists of one or more of the following:

  • Plain text, which produces itself.

  • A newline, which produces a newline and preserves the current indentation.

  • An increase or decrease in indentation, which indents or dedents relative to the current context.

  • A capture reference, e.g. @captureName, which produces the corresponding regular expression match, lines, or syntax nodes. Multi-line captures are re-indented to the current context.

To insert a literal @ in the replacement, two @s are used (@@).
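
A minimal sketch of how capture references and @@ could be expanded is shown below. The `renderReplacement` function and the `captures` object are hypothetical, and the sketch assumes each captured text has already been dedented to column zero, so re-indenting only means prefixing continuation lines with the indentation of the line that holds the reference.

```javascript
// Hypothetical sketch of replacement rendering, not Codex's implementation.
// `captures` maps capture names to their (already dedented) matched text.
function renderReplacement(template, captures) {
	return template
		.split("\n")
		.map((line) => {
			const indent = /^[\t ]*/.exec(line)[0];

			return line.replace(/@@|@([A-Za-z]+)/g, (token, name) => {
				if (token === "@@") {
					return "@"; // escaped literal @
				}

				if (!Object.prototype.hasOwnProperty.call(captures, name)) {
					throw new Error(`Unknown capture: ${name}`);
				}

				// Re-indent continuation lines of a multi-line capture to
				// the indentation of the line that holds the reference.
				return captures[name].split("\n").join("\n" + indent);
			});
		})
		.join("\n");
}

const rendered = renderReplacement(
	"function() {\n\t@body\n\treturn @obj; // me@@example.com\n}",
	{ body: "a();\nb();", obj: "{x: 1}" },
);

console.log(rendered);
```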

When performing the replacement (including when removing nodes whose capture names are prefixed with @-), blank lines are inserted or deleted according to context and preferences.

Examples

  1. Converting a JavaScript module that exports an object with an init method into a function that runs the body of the init method and then returns the original object with the init method removed:

    Find
    module.exports = (object
    	(method_definition
    		(property_identifier) @p
    		(statement_block "{" (_)+ @initBody "}")
    	) @-init
    	.
    	"," @-c
    	(#eq? @p "init")
    ) @obj/;?/
    
    Replace with
    module.exports = function() {
    	@initBody
    	
    	return @obj;
    }
    