Skip to main content

University Library, University of Illinois at Urbana-Champaign

OpenRefine

A free, open source, powerful tool for working with messy data.

Overview

Google Refine Expression Language (GREL) is to OpenRefine what formulas are to Excel or SQL to a database: a way to accomplish more complex transformations, queries, and arrangement of data. In OpenRefine, GREL can be used in four places:

  • Creating a custom text or numeric facet
  • Adding a column based on another column
  • Transforming cells in a column using the transformation function
  • Creating a new column by fetching a URL

All GREL input windows in OpenRefine share a similar layout:

  1. The expression box is where you can type in your expression.
  2. The space just to the right of the expression box will display a syntax error message in the case of an error.
  3. The preview box will preview the transformation of your data so that you can check the quality before applying the transformation.
  4. Under the "History" tab, you can select from previous GREL expressions used in any of your projects.

Google Refine Expression Language input window

 

Syntax

GREL was designed to resemble Javascript and generally uses “in sequence” syntax with a dot between each argument. For example:

value.replace('cat','dog')

In this case, the values of the current column will be searched for the string ‘cat’ which will be replaced by the string ‘dog’.

GREL includes a specific vocabulary for referring to cell values:

  • value = the values in the current column
  • cells['Column2'].value = the values in the specified column, ‘Column2’

A string can be written with either single or double quotation marks.

Common Expressions

Below is a list of commonly used expressions for quick reference. For more detailed information on GREL, please see About Regular Expressions.

Google Refine Expression Language Function
value.replace('string1','string2') replaces 'string1' with 'string2' in the current column
value.replace(' ','') replaces blank spaces with no space
value.replace("''"," ") replaces quotations with blank space (NOTE: double quotations must be used on the outside.)
cells['Column1'].value + cells['Column2'].value concatenates the values in the specified columns
cells['Column1'].value + 'string' adds 'string' to the values in the specified column
value + ' ' + cells['Column2'].value concatenates the values in the current column with the values in the specified column with a space in between
value.splitByCharType[0] returns the first element of a string based on the character type
value. == cells['Column2'].value compares two columns against one another
value.contains('string') checks to see if values in a column contain 'string'
value.startswith('string') checks to see if values in a column start with 'string'
value.endswith('string') checks to see if values in a column end with 'string'
diff(date 1, date 2, 'results format') returns the difference between 'date 1' and 'date 2' in the format (months, days, etc.) specified

NOTE: GREL is case sensitive, so be sure to check column names, strings, etc. are accurate before applying an expression.

5/21/2018 - Brinna Michael