University Library, University of Illinois at Urbana-Champaign

Introduction to Text Mining Methods and Tools: Introduction

An introductory guide to text mining tools.

Credits

This guide was created by Erica Parker, Literatures and Languages Library graduate assistant.

Questions? Ask us!

If you have questions about text mining, please contact us:

Scholarly Commons
306 Main Library
217-244-1331

You can also email Daniel Tracy, Information Sciences and Digital Humanities Librarian.

Text Mining Overview

What is text mining?

Text mining is a research practice that involves using computers to discover information in large amounts of unstructured text.

Unstructured text is data not formatted according to an encoding structure like HTML or XML.

Examples of unstructured data used for text mining include journal and news articles, blog posts, and email.

Researchers use text mining tasks such as:

  • sentiment analysis
  • entity extraction
  • document summarization

By using these tasks, researchers can find connections and draw conclusions about the content of large text corpora.
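To give a rough sense of what one of these tasks looks like in practice, here is a minimal sketch of lexicon-based sentiment analysis in Python. The word lists, the function names (tokenize, sentiment_score), and the sample sentences are all made up for illustration; real projects typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical, very small word lists for illustration only.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "sad"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment_score(text):
    """Return the number of positive words minus the number of negative words."""
    tokens = tokenize(text)
    positives = sum(1 for token in tokens if token in POSITIVE)
    negatives = sum(1 for token in tokens if token in NEGATIVE)
    return positives - negatives

# Invented sample documents, not drawn from any real corpus.
docs = [
    "The service was great and the staff were excellent.",
    "A terrible, sad ending to a bad week.",
]
scores = [sentiment_score(doc) for doc in docs]

for doc, score in zip(docs, scores):
    print(score, doc)
```

A score above zero suggests a mostly positive document and a score below zero a mostly negative one. Simple counting like this ignores negation and context, which is one reason researchers often turn to dedicated tools.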

The image on the right is one example of what you can do with text mining. This pie chart represents the total words spoken by characters in the Jacobean play The Revenger's Tragedy.

Credit: Chart by Pgogy, available via Creative Commons license.
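A chart like this one starts from a simple count of words per speaker. The sketch below shows one way such a count might be computed in Python. It assumes a plain-text script in which each speech begins with a speaker label followed by a colon; that format, the function name words_per_speaker, and the sample lines are assumptions for illustration and are not taken from the play or from the chart's author.

```python
from collections import Counter

def words_per_speaker(lines):
    """Count the words spoken by each speaker in 'SPEAKER: speech' lines."""
    counts = Counter()
    for line in lines:
        speaker, separator, speech = line.partition(":")
        if not separator:
            continue  # no speaker label (e.g. a stage direction), so skip it
        counts[speaker.strip()] += len(speech.split())
    return counts

# Invented sample lines in the assumed format.
sample = [
    "ALICE: Where are you going tonight",
    "BOB: To the market",
    "(They walk on together)",
    "ALICE: Then I will come too",
]

totals = words_per_speaker(sample)
grand_total = sum(totals.values())

# Each speaker's share of all spoken words, which is what a pie chart displays.
for speaker, count in totals.most_common():
    print(f"{speaker}: {count} words ({count / grand_total:.0%})")
```

The resulting counts could then be passed to a charting library to draw the pie chart itself.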

Text Mining Goals

Why do text mining?

Text mining helps researchers detect patterns and connections in large volumes of textual material.

According to researcher Marti Hearst, "In text mining, the goal is to discover heretofore unknown information, something that no one yet knows and so could not have yet written down." Text mining enables researchers to draw conclusions from volumes of material too large for them to otherwise read, synthesize, and incorporate into their scholarship.

Researchers in fields ranging from biological sciences to the humanities have begun using text mining to detect patterns and discover unknown information. 

Getting Started

How do I get started? 

This guide offers tips and resources for beginning text mining and analysis:

For information on finding data sets to use in your research, click the Data Sets tab.

The ATLAS.ti tab offers an overview of the qualitative analysis tool ATLAS.ti, which performs text mining and other forms of analysis for research.

The Projects on Campus tab lists campus organizations and departments that do text mining-related work.

The Resources tab includes links to general resources on text mining, including online courses and blogs on the subject.