
University Library, University of Illinois at Urbana-Champaign

Scholarly Research Toolbox For Business Administration

Resources for the Illinois Business Administration Faculty and Doctoral Students

Cline Center for Advanced Social Research

The Cline Center for Advanced Social Research has developed several data sets for textual research. According to the Center's website, its purpose as a campus research center is as follows:

The Cline Center’s research draws on two types of data resources: extreme-scale collections of unstructured textual data created by news content providers around the world, and structured datasets created by the Cline Center that support a wide range of social science research on human flourishing and societal well-being. The following illustrates just one of the data assets curated or supported by the Cline Center.

Global News Index

  • Over 70 million unduplicated news stories collected with the Cline Center’s Voyager web crawl system from thousands of global news websites from 2006 to the present, with new stories added daily;
  • The entire population of newspaper stories published between 1945 and 2005 by the New York Times and Wall Street Journal;
  • Open-source information encompassing over 4 million news stories from BBC Monitoring’s Summary of World Broadcasts (SWB) from 1979 to the present and over 3 million news stories captured by the U.S. Government’s Foreign Broadcast Information Service (FBIS) from 1994 to 2003. News data from these sources consist of stories from every country in the world that have been translated into English by fluent speakers who are culturally resonant with the countries in which news items originally appeared.
  • Another 6.2 million scanned and digitized microfilm and microfiche records of SWB and FBIS content captured from the 1940s to the 1990s.
  • The only digitized record in the world of story summaries for four of the five main American newsreels that formed the hub of a worldwide newsreel system serving a global audience with visual news items nearly seven decades before the advent of CNN. The Cline Center also holds story summaries for one of the leading British newsreel companies along with a unique set of early television news summaries. Together, these various sources include around 130,000 stories broadcast between 1915 and 1985.

For database access and training, contact Dan Shalmon, Engagement Coordinator. Note: This resource is for UIUC researchers interested in textual and sentiment analysis; it is not intended for classroom assignments.

International and Social Science Datasets