Skip to content

Instantly share code, notes, and snippets.

@netsensei
Last active March 3, 2021 07:38
Show Gist options
  • Save netsensei/5016312c06fd3a08cb69 to your computer and use it in GitHub Desktop.
Save netsensei/5016312c06fd3a08cb69 to your computer and use it in GitHub Desktop.
Open Refine tips & tricks

Open Refine Tips & Tricks

Asorted commands, formula and such

This gist contains a list of tricks, commands and formula I collected while working with Open Refine.

Merging datasets based on a common key column

See this blogpost: Merging Datasets with Common Columns in Google Refine

Complex expressions

Open Refine supports expressions when cleaning update using its' "transform" tools. However, the documentation is a bit lacking on how to create 'complex' expressions. These are expressions that consist of multiple statements that need to be evaluated.

A classic example would be evaluating two separate cells in a row, in order to determine the value of a third row.

Then you will need boolean operations:

if (and(exp1, exp2), "ValueA", "ValueB")

Another example is mimicking an if(exp1) { valueA } elseif(exp2) { valueB } else { valueC } structure. This can be accomplished by nesting if structures:

if(exp1, (if(exp2), "valueA", "valueB"), "valueC")

Real life example given a dataset with 3 columns "ID", URL" and "Type".

if (isNotNull(cells["URL"]), (if(isNotNull(cells["URL"].value.match(/(.*)jpg/)), "representation", "data")), "")

Translates to:

  • If URL == empty Then type = NULL
  • elseIF (URL ends on .jpg) Then type = "representation"
  • else type = "data"

Note the use of a regular expression with the "match" function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment