The University of Illinois has agreements with several scholarly publishing vendors that permit text mining of their content. These vendors are listed below, along with instructions on how to get started. If you have a question about a vendor that is not listed, or need help accessing the data, please contact the Scholarly Communications and Publishing Department.
Data Mining Instructions: For access to these collections, contact the Scholarly Communications and Publishing Department. Review Gale's data mining FAQ. Original image files are also available in JPG or TIF format. See the spreadsheet below for a list of all hard drives and additional information.
Listed below are digital libraries that offer digitized versions of newspapers. Some are not in machine-readable format. For information on making scanned images machine readable, see the library guide on Optical Character Recognition.
The Viral Texts Project uses text mining methods to study how news stories, short fiction, poetry, and more went “viral” in nineteenth-century newspapers and magazines. The project's data is openly available on the Viral Texts GitHub for researchers to reuse in their own text mining projects.
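For readers new to this kind of analysis, the sketch below illustrates one simple way to estimate how much text two newspaper passages share: compare their overlapping word sequences (n-grams) using Jaccard similarity. It is a generic illustration, not the Viral Texts Project's own pipeline, and the function names (shingles, jaccard, reuse_score) and the default n-gram length are hypothetical choices made for this example.

```python
import re


def shingles(text, n=5):
    """Return the set of word n-grams in text, lowercased and stripped of punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Texts shorter than n words yield an empty set.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def reuse_score(text_a, text_b, n=5):
    """Share of n-grams common to both texts (0.0 = none shared, 1.0 = identical sets)."""
    return jaccard(shingles(text_a, n), shingles(text_b, n))
```

A higher score suggests that one passage may reprint part of the other. OCR errors in scanned newspapers lower the score, so in practice a shorter n-gram length or a fuzzier comparison may be needed.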