A Guide to the HathiTrust Research Center: Algorithms

An introductory guide to the tools and resources of the HathiTrust Research Center.

Using Algorithms

After you have created a workset, go back to the HTRC Portal to run analyses of your workset(s):

1. On the HTRC Portal homepage, click "Text Analysis Algorithms", then click "Execute an Algorithm" near the top of the page. This takes you to the "Algorithms" page.

The "Algorithms" page provides a brief description of what each algorithm does. Click on an algorithm to select it.

2. Click on "Meandre_OpenNLP_Entities_List" and you will be taken to that algorithm's page.

3. Enter a Job Name for this analysis. It will show up later as "Job Title" when you look at the results.

4. Select the workset THATCamptest@harrigreen in the drop-down list below the message "Please select a workset for analysis."

Note: Avoid running an algorithm against a randomly chosen public workset. Many public worksets are very large, and a large workset may not finish within a reasonable amount of time or may crash after running out of memory.

5. Click on Submit to execute the algorithm. You will be taken to the "Job Staging" screen, where you may have to refresh the page to see the status of your job.

Viewing Your Results

Your workset may take a few minutes to process. On the analysis page, you will see the status of your job under "Active Jobs". 

When your results are ready, the job appears under "Completed Jobs". Click on the blue link under "Job Name" to view the results.


Here is an example result using the TagCloud algorithm:


Now you can perform topic modeling, generate word clouds, and more! 

Try experimenting with different worksets and algorithms, and change the parameters where an algorithm allows it, to see how your results differ.
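
If you are curious about what a tag cloud summarizes, it is essentially a picture of term frequencies: the more often a word occurs in the workset, the larger it is drawn. The following is a minimal Python sketch of that counting step. It is not the HTRC TagCloud implementation; the function name top_terms, the short stopword list, and the sample sentence are all illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical, very short stopword list used only for this illustration.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in"})

def top_terms(text, n=5, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword terms as (term, count) pairs."""
    # Lowercase the text and keep runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(n)

# Made-up sample text standing in for the contents of a workset.
sample = ("The whale and the sea: the whale swims in the sea, "
          "and the ship follows the whale.")

for term, count in top_terms(sample, n=2):
    print(term, count)
```

A real tag cloud would then scale each term's font size by its count. Comparing the counts produced for two different worksets is one simple way to see how the choice of workset changes the result.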