University Library, University of Illinois at Urbana-Champaign

Plot lexical trends using HathiTrust+Bookworm: Testing hypotheses

Testing hypotheses with HT+Bookworm

You may be able to use HT+Bookworm for testing actual hypotheses that you formulate in your research.

Consider the two words “lady” and “woman”. Many people believe that Britain has historically been a more class-divided country than the United States, with class hierarchies more prevalent in Britain than in the supposedly more democratic USA. A reasonable hypothesis, then, would be that the relative prevalence of the word “lady” over the word “woman” would be more pronounced in texts published in Britain than in those published in the USA.
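The "relative prevalence" in this hypothesis can be made concrete as a per-year ratio of the two word frequencies. Here is a minimal sketch in Python, assuming you have already copied per-year frequencies for each word into dictionaries. The function name and the numbers are hypothetical placeholders for illustration, not HT+Bookworm output or part of its interface.

```python
def lady_to_woman_ratio(lady, woman):
    """Return {year: lady/woman} for years present in both series.

    Years where 'woman' has zero frequency are skipped to avoid division by zero.
    """
    return {
        year: lady[year] / woman[year]
        for year in sorted(lady)
        if year in woman and woman[year] > 0
    }


# Hypothetical placeholder frequencies, NOT real data.
uk_lady = {1850: 120.0, 1900: 90.0}
uk_woman = {1850: 60.0, 1900: 90.0}

ratios = lady_to_woman_ratio(uk_lady, uk_woman)
print(ratios)
```

A ratio above 1 means "lady" is the more frequent word in that year. Under the hypothesis, the ratios for British texts would sit above those for US texts in the same years.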

Generate plots as shown below to check whether this hypothesis holds. (This example is not meant to be a research study. It is intended only as a proof of concept.)

In the first graph, we plot the trends for the terms "woman" and "lady" with the "Publication Country" facet selected to be 'United States':


Figure: trends for "woman" and "lady" in works published in the United States, showing no major divergences.


For the second graph, we plot the trends for the words "woman" and "lady", but with the "Publication Country" facet selected to be 'United Kingdom':

Figure: trends for "woman" and "lady" in works published in the United Kingdom, showing a steady decline in the use of "lady" until 1960, when "woman" overtakes "lady".


Exercise 1.
One could reasonably expect that the decline in the usage of “lady” relative to “woman” would have set in earlier in Britain than in the USA. Review the plots above to check whether they corroborate this intuition.
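One possible way to compare the two countries is to look for the first year in which "woman" becomes more frequent than "lady" in each plot. The sketch below assumes this crossover year is an acceptable stand-in for when the decline "set in", which is a simplification chosen for illustration. The function name and the values are hypothetical, not readings from the plots.

```python
def first_crossover_year(lady, woman):
    """Return the earliest year in which 'woman' is strictly more
    frequent than 'lady', or None if that never happens."""
    for year in sorted(set(lady) & set(woman)):
        if woman[year] > lady[year]:
            return year
    return None


# Hypothetical placeholder frequencies, NOT real data.
example_lady = {1900: 5.0, 1950: 3.0, 1960: 2.0}
example_woman = {1900: 2.0, 1950: 3.0, 1960: 4.0}

crossover = first_crossover_year(example_lady, example_woman)
print(crossover)
```

Running the same function on the series for each country would give two years that can be compared directly. An earlier year for one country would suggest that the shift happened sooner there.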

Exercise 2.
Click on the Metrics option in the upper right corner. Change it to Number of Words, Number of Volumes, or Percent of Volumes. What do the resulting plots look like? Can you intuitively explain (in general terms) why the plots look this way?

Scholarly Commons

306 Main Library
Drop-ins welcome
Monday-Friday 8:30am-6:00pm
Phone: 217-244-1331