OpenRefine

A free, open source, powerful tool for working with messy data.

Layout

Once you have imported your data, it is important to familiarize yourself with OpenRefine’s layout.

  1. In the top right corner there are three buttons:
    1. “Open…” returns you to the home screen where you can select projects.
    2. “Export” opens a dropdown menu of options to export your data.
    3. “Help” opens the OpenRefine User Documentation in a new tab in your browser.

  1. Below the bold header stating how many rows/records there are, you will find two options:
    1. “Show as” allows you to change the grid view between rows and records. For more information on the difference between rows and records, see the explanation of Records and Rows below.
    2. “Show” allows you to change the number of rows/records visible in the grid view.

  1. In the center of the page is your data in the grid view, which looks similar to Excel. Features of the grid view include:
    1. Column headings with dropdown arrows for choosing functions
    2. Row/Record numbers and alternate row/record shading
    3. Selectable flags and stars

  1. On the left, there is a pane with two tabs:
    1. “Facet/Filter” allows you to work on selected sections of your data, including faceting, clustering, and filtering.
    2. “Undo/Redo” tracks and stores your history, allows you to undo or redo transformations, and lets you export a JSON file of your transformations.
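
To give a sense of what the exported JSON file of transformations can be used for, here is a minimal Python sketch that lists the steps recorded in such a file. It assumes the export is a JSON list in which each step has an "op" identifier and a human-readable "description"; the sample history and the summarize_history helper are made up for illustration, and a real export may contain additional fields.

```python
import json

# Hypothetical excerpt of an exported history (illustrative only);
# a real export from the "Undo/Redo" tab may carry more fields per step.
history_json = """
[
  {"op": "core/multivalued-cell-split",
   "columnName": "author",
   "description": "Split multi-valued cells in column author"},
  {"op": "core/text-transform",
   "columnName": "title",
   "description": "Text transform on cells in column title"}
]
"""


def summarize_history(text):
    """Return one (operation id, description) pair per recorded step."""
    steps = json.loads(text)
    # .get() keeps the sketch tolerant of steps that lack one of the keys.
    return [(step.get("op", "?"), step.get("description", "")) for step in steps]


for number, (op, description) in enumerate(summarize_history(history_json), start=1):
    print(f"{number}. {op}: {description}")
```

Reading the history this way is a quick check of which transformations were applied, and in what order, before sharing the file or reusing it on another project.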

Records and Rows

There are two settings for the grid view in OpenRefine: rows or records.

In the “rows” setting, each line of your data is displayed and numbered individually. In the “records” setting, your data is displayed in multi-line groupings, depending on the relationships between the data in those lines. For example:

This data has been transformed using “split multi-valued cells” on the author field to separate different authors into their own lines. On the left, the data is displayed as “records,” showing the different lines with the multiple authors grouped together. On the right, the data is displayed as “rows,” showing each of the multiple authors as a separate line.
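
The relationship between the two views can also be sketched in Python. This is only an illustration of the idea, not OpenRefine's own code: it assumes that a line with a value in the first (key) column starts a new record and that following lines with a blank key column belong to that record. The book data and both helper functions are invented for the example.

```python
def split_multi_valued(rows, column, separator):
    """Give each value in a multi-valued cell its own row.

    Only the first resulting row keeps the other columns; the extra
    rows leave them blank (None).
    """
    out = []
    for row in rows:
        values = (row[column] or "").split(separator)
        for index, value in enumerate(values):
            if index == 0:
                new_row = dict(row)
            else:
                new_row = {key: None for key in row}
            new_row[column] = value.strip()
            out.append(new_row)
    return out


def group_into_records(rows, key_column):
    """Group rows into records: a non-blank key cell starts a new record."""
    records = []
    for row in rows:
        if row[key_column] not in (None, "") or not records:
            records.append([row])
        else:
            records[-1].append(row)
    return records


# Invented sample data: one book with two authors, one with a single author.
books = [
    {"title": "Book A", "author": "Smith, J.; Lee, K."},
    {"title": "Book B", "author": "Garcia, M."},
]

rows = split_multi_valued(books, "author", ";")
records = group_into_records(rows, "title")

print(len(rows), "rows")        # each author is its own line
print(len(records), "records")  # authors of one book stay grouped
```

With this sample, the same data counts as three rows but only two records, which is why the totals in the header change when you switch between the two settings.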

NOTE: Use caution when permanently renumbering rows or records, and be aware of which setting (rows or records) you are viewing your data under.

5/1/2018 - Brinna Michael