Skip to Main Content

University Library, University of Illinois at Urbana-Champaign

OpenRefine

A free, open source, powerful tool for working with messy data.

Combining Cell Values

OpenRefine 3.3 allows to combine information from multiple columns. This method allows you to combine the contents of two columns, add a specific string or characters to a column's values, or a combination of both. There are several ways to combine contents from multiple columns. Here, introduce two basic functions to combine information:

Joining Columns

OpenRefine 3.3. allows to join the contents of multiple columns. It can be done in a few simple steps.

 

  1. Go to the one of columns you would like to join.
  2. Select "Edit column" and click "Join columns."
  3. Select columns you would like to join and click "OK."
    1. You can add separator between the contents of each column.
    2. You can replace null values.
    3. You can overwrite combining information into the original column or create a new column for the combining contents.

Helpful tips

  • New column will be placed after the column you chose. You can reorder the columns. The instruction can be found in Reordering columns.

 

Concatenation

 

NOTE: This function requires the use of regular expressions. The relevant expressions are provided here, but for more information, please refer to the section on Google Refine Expression Language (GREL) and the Resources page.

  1. Go to the column you would like alter and click the arrow button on the column header.

  2. Select the “Edit columns” and then choose “Add Column based on this column.” OR

  3. Select the "Edit cells" and then chose "Transform."

  • If selecting 2, you can create a new column for merging data. See an example of the first picture below.

  • If selecting 3, you can overwrite merging data in the column you chose. See an example of the second picture below. 

  1. A pop-up window will appear with several sections, including a GREL expression input box and a preview panel. For more information on this window, refer to the Google Refine Expression Language (GREL) section.

  2. The expression for combining column information is:

value + cells[‘Column’].value OR

cells[‘Column1’].value + cells[‘Colmn2’].value

  • "Value" indicates the values in the current column.

  • "Column" should match the name of the column whose values you would like to combine with the current column.

  • If a data type in a column is number, the result will return to sum of two values in each row. To avoid this, add string as following step 5.

  1. To combine column information AND additional strings, use a “+.” For example:

value + “-” + cells[‘Column’].value OR

cells[‘Column1’].value + “-“ + cells[‘Colmn2’].value

  •  Adds a dash between the values of the current column and the values of the column to be combined with the current column.

  1. Once you have typed your expression into the expression box, check the preview window to confirm that your transformation looks the way it should.There is also a section to the right of the expression box which will tell you if there are any syntax errors in your expression.

  2. When you are satisfied with the preview, select “OK.”

 

 

 

 

Helpful Tips:

  • When inputting the name of the column you wish to combine with the current column, remember that regular expressions are CASE SENSITIVE. Nothing will happen if the column name is not exact.

 

NOTE: If some sells contain null values, null cells will return when merging columns using these techniques. To avoid null values, follow a few simple steps:

  1. Go to "Text facet" and select "Customized facets."
  2. Chose "Facet by null" from the drop-down menu.
  3. A facet window will appear in the pane to the left side of the grid view. "True" means null cells.
  4. Go to "Edit cells" where you faceted and select "Common transforms." For more details, see "Common Transformations."
  5. Chose "To empty string" from the drop-down menu.