Copyright protection is automatic when a sufficiently original (minimally creative) work of authorship is fixed in a tangible medium of expression. Fixation is easy: taking a picture, writing something down on paper, or typing it into a computer all constitute fixation. Many people are surprised that copyright is automatic, but it is. You therefore need not place a copyright notice on a work or register it with the Copyright Office to hold a valid copyright. Copyright gives creators the exclusive rights to copy their work, distribute it (even over the internet), make derivative works from it, and publicly display or perform it.
Copyright owners can also grant others the ability to exercise these rights through licenses (which is another way of saying contracts). Textbooks, works of art, and computer programs are all examples of copyright-protected material. You even own the copyright to your blog posts! More information about copyright can be found at our Copyright Reference LibGuide. While generative AI is currently an emerging tool and not explicitly referenced in the Copyright Act, AI tools and AI-generated works are still subject to copyright law, which protects a creator’s exclusive right to manage the use of their work.
Need help locating sources for text and data mining? Check out the Finding Text Data LibGuide: https://guides.library.illinois.edu/c.php?g=1281804
Not sure what you can and cannot do with material that you find through the Library’s digital collections? Email Scholarly Communication and Publishing: scpub@illinois.edu.
In some circumstances, you may also be able to use copyrighted material without a license under fair use. Fair use is an affirmative defense to copyright infringement that allows someone to use copyrighted material without first seeking permission from the copyright holder. When determining whether a particular use is a fair use, one must consider each of the four factors in the fair use statute: the purpose of the use; the nature of the underlying work used; the amount of the work taken; and the effect on the potential market for the underlying work. Also, if a new use is transformative, or comments on the original with a new purpose or meaning, it is more likely to be considered a fair use. (For a deeper explanation of fair use, watch this fair use video.)
Generally, training generative AI on legally accessed copyrighted material may constitute a transformative fair use because inputting many works into generative AI software is for the purpose of machine learning and large language modeling rather than reading the works. However, when engaging copyrighted material with AI – both as training input and AI-generated output – researchers should note that terms of service can and often do limit fair use, and contract law trumps any rights and exceptions to copyright law a user of copyrighted works might otherwise be able to assert.
For example, the University’s licensing agreement with Elsevier allows researchers affiliated with the University of Illinois to access content that Elsevier has published, but forbids users from scraping that content – a common practice when training generative AI. In this scenario, scraping violates Elsevier’s terms of service regardless of fair use. Because many online resources come with terms of service or other legal constraints, users should review these constraints before using AI to engage with copyrighted material.