University Library, University of Illinois at Urbana-Champaign

Text Mining Tools and Methods

This guide contains resources for conducting research with text mining.

Downloadable Tools for Text Mining

There are many ready-to-use digital tools for conducting text mining research. Some of the most popular ones are:

AntWord Profiler is a freeware tool for profiling the vocabulary level and complexity of texts.  It is available as a free download for Windows, Mac OS X, or Linux.

Developed here at Illinois, ConText is a free, open-source application for performing a variety of text analysis techniques, including network graphs and topic models, based on textual data.

Gephi is an open graph visualization platform that supports exploration of all kinds of networks and complex systems.  Gephi can be downloaded for free onto any Linux, Windows, or Mac OS X device.

Juxta is an open-source tool for comparing and collating multiple witnesses to a single textual work.  It offers a number of possibilities for humanities computing and textual scholarship.  Juxta is a Java-based application that is available as a free download for Windows, Mac OS X, and UNIX operating systems.

Mallet is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. 

PhiloLogic is a full-text search, retrieval, and analysis tool developed by the ARTFL Project and the Digital Library Development Center (DLDC) at the University of Chicago.  It is free software that can be downloaded for a wide range of systems. 

Scrapy is an open source and collaborative framework for extracting the data you need from websites.  It is available as a free download for Linux, Windows, and Mac OS X.

Textal is a free smartphone app that allows you to analyze websites, tweet streams, and documents to explore the relationship between words in the text via an intuitive word cloud interface.  The app allows you to generate graphs and statistics, as well as share the data and visualizations in any way you like.  Textal is available as a free download from the App Store on your Apple iOS device. 

TXM is a free, open source, cross-platform, Unicode- and XML-based text/corpus analysis environment and graphical client.  It is available as a free download for Windows, Linux, and Mac OS X.  It has a comprehensive range of analysis tools, including concordances, collocate searches, and frequency lists.
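
To give a sense of what two of the analyses mentioned above involve, here is a minimal sketch of a frequency list and a keyword-in-context concordance using only the Python standard library. It is not how TXM or any other tool on this page is implemented; the function names (tokenize, frequency_list, concordance), the simple tokenization rule, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Assumption: a token is a run of ASCII letters or apostrophes.
    Real corpus tools use far more careful tokenization.
    """
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens, top_n=5):
    """Return the top_n most frequent tokens as (token, count) pairs."""
    return Counter(tokens).most_common(top_n)


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines for every occurrence of keyword.

    Each line shows up to `width` tokens on either side of the keyword,
    with the keyword itself marked in square brackets.
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, used only to demonstrate the functions.
sample = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(sample)

print(frequency_list(tokens, top_n=3))
for line in concordance(tokens, "sat", width=2):
    print(line)
```

A sketch like this is enough for experimenting with a short passage. For larger corpora, the dedicated tools listed above add indexing, annotation, and visualization that a few lines of code do not provide.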