
University Library, University of Illinois at Urbana-Champaign

Introduction to Generative AI

This library guide is a UIUC campus resource to read and reference for instructional, professional, and personal learning. Updates will occur on a semester basis. Last Updated: March 2024


A content aggregator, or moderator, is an individual or organization that collects data. Content aggregators are often employees who train and improve a tool's algorithms. Some of these worker communities have been exploited, as reported in Time magazine's article "150 African Workers for ChatGPT, TikTok and Facebook Vote to Unionize at Landmark Nairobi Meeting." Often described as "invisible workers" or "ghost workers," these employees perform tasks that range from annotating and labeling training data to testing and refining algorithms and models. Outsourced and contract data workers are especially susceptible to exploitative conditions.

However, if a tool relies on web scraping, the automated collection of public information from the Internet, then its data comes from anyone who has posted content online or has had their content published online. Training AI models on copyrighted material is often defended as fair use, but the practice can exploit artists and authors and devalue their labor. In addition to equity and wage concerns, data workers also face repercussions to their mental health from exposure to sensitive topics.

Many companies have hired, or are hiring, professionals to create original content and provide oversight, although these practices are still being developed.