The University of Illinois has contract agreements with several scholarly publishing vendors that permit text mining. These vendors are listed below, along with instructions on how to get started. If you have a question about a vendor not listed below, or otherwise need assistance accessing the data, please contact the Scholarly Communications and Publishing Department.
The following pages link directly to the scholarly publishing databases and do not automatically sign you in through the University of Illinois institutional login. To log in to these databases, either use each database's "Institutional Login" page or open the database from the library's A-Z Databases page.
Elsevier provides research articles and books focused on fields of science and technology, including engineering, medicine, social science, and GIS. Notable databases include INSPEC, ScienceDirect, Scopus, and Engineering Village.
Data Mining Instructions: Access the Elsevier API, and review their data mining policy. Each researcher must create an Elsevier account and register for their own API key. Data delivered in XML format.
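As a rough illustration of the steps above, the following Python sketch prepares a request for a single article and flattens the returned XML into plain text. The endpoint URL and the `X-ELS-APIKey` header name are assumptions based on Elsevier's public developer documentation and should be checked against the current API reference. The function names are hypothetical.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint; confirm against Elsevier's current API documentation.
ARTICLE_URL = "https://api.elsevier.com/content/article/doi/"


def build_article_request(doi: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a request for one article as XML."""
    url = ARTICLE_URL + urllib.parse.quote(doi, safe="/")
    headers = {
        "X-ELS-APIKey": api_key,  # assumed header name for the personal API key
        "Accept": "text/xml",     # the guide states data is delivered as XML
    }
    return urllib.request.Request(url, headers=headers)


def xml_to_text(xml_payload: str) -> str:
    """Join every text node of an XML document and normalize whitespace."""
    root = ET.fromstring(xml_payload)
    return " ".join("".join(root.itertext()).split())


def fetch_article_text(doi: str, api_key: str) -> str:
    """Download one article and return its text (needs network access)."""
    request = build_article_request(doi, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return xml_to_text(response.read().decode("utf-8"))
```

Calling `fetch_article_text` with a DOI and your own key would perform the download. Because each researcher registers for their own key, it is best read from an environment variable or a local configuration file rather than written into shared scripts.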
JSTOR is a collection of research articles and books dating back to the earliest publications in humanities fields, especially language, literature, history, and philosophy.
Data Mining Instructions: JSTOR Data for Research provides an API that can be used to retrieve metadata and reference information for up to 25,000 documents. For researchers needing to conduct full-text analysis OR retrieve more than 25,000 documents, contact JSTOR directly to request the data set. Data delivered in PDF, HTML (if available), XML, or JSON format.
Wiley's AGU Digital Library covers 100 years of earth and space science research and adds books to its collection as they become available.
Data Mining Instructions: Use the CrossRef data mining service, which covers thousands of publishers, including Wiley. See Wiley's terms and conditions for more details. Data delivered in JSON format.
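To make the CrossRef route more concrete, here is a minimal Python sketch that looks up one work by DOI and keeps only the full-text links flagged for text mining. It assumes the public CrossRef REST API at `api.crossref.org`, and assumes that a work record carries a `link` list whose entries have `URL` and `intended-application` fields. The function names are hypothetical, and downloading the linked files remains subject to each publisher's terms.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public CrossRef REST API.
WORKS_URL = "https://api.crossref.org/works/"


def works_url(doi: str) -> str:
    """Build the metadata URL for a single DOI."""
    return WORKS_URL + urllib.parse.quote(doi, safe="/")


def text_mining_links(message: dict) -> list[str]:
    """Return the URLs of links marked as intended for text mining."""
    return [
        link["URL"]
        for link in message.get("link", [])
        if link.get("intended-application") == "text-mining" and "URL" in link
    ]


def fetch_text_mining_links(doi: str) -> list[str]:
    """Query CrossRef for one DOI and extract its text-mining links (needs network access)."""
    with urllib.request.urlopen(works_url(doi), timeout=30) as response:
        record = json.load(response)
    # The work's metadata is assumed to sit under the "message" key.
    return text_mining_links(record["message"])
```

Separating the filter (`text_mining_links`) from the network call makes it possible to test the filtering logic on saved JSON responses before running a larger harvest.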
Libraries and archives make some digitized content available online for use in text analysis. Due to copyright restrictions, most of the available texts were created before the early twentieth century.
Text Creation Partnership for EEBO, ECCO, Evans
Digital Public Library of America