@mcampos-quinn
Last active May 14, 2019 22:34
OpenRefine Jython / Python expression to conditionally merge columns
'''
This Jython expression merges sequentially named columns into a single cell,
skipping any column whose cell is blank.
Trying to merge columns that include empty cells with GREL results in an error.
My data included columns like this:
Title 1 | Title 2 | ... | Title 14
----
My Title | My Other Title | etc.
And I wanted the values in a single cell, separated by newlines, like this:
Joined Titles
----
My Title
My Other Title
Looking up a blank cell raises a KeyError, so the try/except clauses
skip blank cells instead of failing.
This was helpful: https://github.com/OpenRefine/OpenRefine/wiki/Jython
To run this with the Title columns above, select Title 1 > Edit Cells > Transform,
then select Python/Jython in the language dropdown and paste the script into the text box.
Change "Title" to match your own column names, and adjust the range of
sequence numbers to match how many columns you have.
Once it has run, you can delete the extra sequence columns if you want.
'''
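# Names of the remaining columns to merge in: "Title 2" .. "Title 14"
_columns = []
for num in range(2, 15):
    _columns.append("Title %s" % num)

try:
    title1 = cells["Title 1"]["value"]
except KeyError:
    # Title 1 is blank, so start with nothing
    title1 = None

for _column in _columns:
    try:
        value = cells[_column]["value"]
    except KeyError:
        # Blank cell: skip it
        continue
    if value:
        if title1:
            title1 = "%s \n %s" % (title1, value)
        else:
            # Avoid a literal "None" prefix when Title 1 is blank
            title1 = value

return title1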
'''
Here's a version where I had speakers in 4 separate columns and I wanted them joined by "; "
_columns = []
for num in range(2, 5):
    _columns.append("speakers %s" % num)

try:
    name1 = cells["speakers 1"]["value"]
except KeyError:
    name1 = None

for _column in _columns:
    try:
        value = cells[_column]["value"]
    except KeyError:
        continue
    if value:
        if name1:
            name1 = "%s; %s" % (name1, value)
        else:
            name1 = value

return name1
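
Both versions follow the same pattern, so here is a rough sketch of it as a plain
function that can be tried in an ordinary Python interpreter, outside OpenRefine.
The names merge_sequential and fake_cells are made up for this example, and
fake_cells is only a stand-in for OpenRefine's cells variable: it models blank
cells as missing keys, which is an assumption based on the KeyError noted above.

```python
# Hypothetical helper, not part of OpenRefine: collect the non-blank values
# from "<prefix> 1" .. "<prefix> <count>" and join them with a separator.
def merge_sequential(cells, prefix, count, separator):
    values = []
    for num in range(1, count + 1):
        try:
            value = cells["%s %s" % (prefix, num)]["value"]
        except KeyError:
            # Treated as a blank cell, as in the expressions above
            continue
        if value:
            values.append("%s" % value)
    if not values:
        # Every column was blank
        return None
    return separator.join(values)


# Stand-in for OpenRefine's `cells` variable (assumption: blank cells
# are simply missing keys).
fake_cells = {
    "speakers 1": {"value": "Ada"},
    "speakers 3": {"value": "Grace"},
}

merged = merge_sequential(fake_cells, "speakers", 4, "; ")
print(merged)
```

Collecting the values in a list and joining them once means a blank first column
never leaves a stray "None" or a leading separator in the result.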
'''