Skip to main content

University Library, University of Illinois at Urbana-Champaign

OpenRefine

A free, open source, powerful tool for working with messy data.

Text Facet

Faceting allows you to quickly view unique values in a column, make edits to those values, and narrow your display to show results containing a specific facet.

To display facets:

  1. Go to the column you would like to analyze and click the arrow button on the column header.
  2. Choose “Facet” from the drop-down menu, and then select “Text facet.”
  3. A facet window will appear in the pane to the left side of the grid view.

Process for using the text facet

Helpful Tips:

  • Text facets will automatically display in alpha-numeric order, but you can sort them by count instead. Just select "Count" at the top of the facet window where it says "Sort by."
  • If you click on a facet, the grid view will only display rows or records that match this facet. Any number of facets and/or filters can be combined to narrow your data. To combine facets, hover over the facet you would like to add and select “include” at the right.
  • You can edit values using the facet window by hovering over the facet you want to edit and selecting “edit” at the right. Any edits made will be automatically applied to all rows with that value.
  • You can export a list of facets and their counts by selecting "[#] choices" at the top of the facet window. Simply copy the text in the pop-up window and paste into Excel or a text editor.

Text Filter

The text filter option works like the “Find” function in Excel, allowing you to search a column for values containing a specific string.

To display the text filter function:

  1. Go to the column you would like to search and click the arrow button on the column header.
  2. Choose “Text filter.”
  3. A window with a search box will appear in the pane to the left side of the grid view.

Process for using the text filter

 

In addition to the basic text filter function, OpenRefine 2.8 includes an "invert" function with will return all rows/records that DO NOT include the term in the text filter box. To invert a text filter:

  1. Follow steps 1-3 above.
  2. Enter your term in the text box and select "invert" in the top right corner of the search box.
  3. The top bar of the search box will turn orange and the results will change to display rows/records without the specified search term.

Helpful Tips:

  • To narrow your search further, check the “case sensitive” box below the search box to return only results that exactly match your search term.
  • You can combine a text filter with a facet to narrow results. Either create a facet and then use the text filter to search within a desired column, or narrow results with a text filter and then facet a desired column.

Custom Text Facets

There are two types of custom facets that are applicable to text specifically: word and text length.

To display these facets:

  1. Go to the column you would like to facet and click the arrow button on the column header.
  2. Choose “Facet” and then select “Customized facets.”
  3. Under “Customized facets,” select either “Word facet” or “Text length facet” and a window will appear in the pane to the left side of the grid view.

Process for using the word facet and text length facet

Helpful Tips:

  • The “Word facet” option functions like the “Text facet” but with the individual words within the column values.
  • The “Text length facet” allows you to narrow results based on the length of a value using a sliding scale.
  • Both of these can be helpful in identifying missing or misspelled values.

Facet by Blank

The facet by blank function allows you to narrow your data based on whether or not the value in a particular column is blank.

To display this facet:

  1. Go to the column you would like to facet and click the arrow button on the column header.
  2. Choose “Facet” and then select “Customized facets.”
  3. Under “Customized facets,” select “Facet by blank” and a window will appear in the pane to the left side of the grid view.

Helpful Hints:

  • The “Facet by blank” function returns true or false values where “true” means the value is blank and “false” means the value is not blank.
  • This function is useful for identifying potentially missing sections of data.

Duplicate Facet

The duplicates facet allows you to narrow your data based on whether or not the value in a particular column is unique.

To display this facet:

  1. Go to the column you would like to facet and click the arrow button on the column header.
  2. Choose “Facet” and then select “Customized facets.”
  3. Under “Customized facets,” select “Duplicates facet” and a window will appear in the pane to the left side of the grid view.

Helpful Hints:

  • The “Duplicates facet” function returns true or false values where “true” means the value is not unique and “false” means the value is unique.
  • This function is useful for isolating data with unique values.

8/17/2018 - Brinna Michael