OpenAI has filed a motion to dismiss two similar lawsuits from book authors, and is essentially saying the plaintiffs don’t really understand copyright infringement.
The motion to dismiss, filed in a US district court in California, marks OpenAI’s first attempt to throw out legal cases accusing the company of copyright infringement. OpenAI claims that the authors “misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”
Authors sue ChatGPT
This year, two nearly identical class-action lawsuits, filed back-to-back by groups of book authors, accused OpenAI of copyright infringement.
One lawsuit, filed by Sarah Silverman, Richard Kadrey, and Christopher Golden in early July, alleges that OpenAI “illegally” trained its models on their copyrighted works without their consent or credit. The other lawsuit, filed by authors Paul Tremblay and Mona Awad, carries a similar allegation.
Silverman alleges that her copyrighted memoir The Bedwetter was used to train ChatGPT and other AI models without her permission.
OpenAI, ChatGPT, and copyright
OpenAI says that in the US, “the constitutional purpose of copyright is to ‘promote the Progress of Science and Useful Arts’.” The company cites notable past copyright battles involving Big Tech – including Oracle, Sony and Google – highlighting that in some cases, using copyrighted material in “transformative ways” does not violate copyright.
“These are key legal principles upon which countless artificial intelligence products have been developed by a wide array of technology companies,” OpenAI wrote.
OpenAI also challenges the book authors’ use of the term “derivative work” to describe ChatGPT’s outputs. The company disputes the claim that every answer ChatGPT produces is necessarily an infringing “derivative work.”
OpenAI asks: if ChatGPT answers a simple ‘Yes’ or ‘No’ question, a general knowledge question such as ‘the name of the President of the United States’, or a request to describe the plot of Homer’s The Iliad, does that mean ChatGPT has infringed upon millions of texts in existence?
“The plaintiff’s theory is simply incorrect, and would be unworkable were it not… that is not how copyright law works,” OpenAI explains.
OpenAI also claims that the authors fail to provide solid proof that ChatGPT copied their works word-for-word. The company argues that in copyright infringement lawsuits, proof of “substantial similarity” is a key test in determining whether one work copies another.
Thus, “if a defendant’s work is not ‘substantially similar’ to an original, it is neither a ‘copy’ nor a ‘derivative work’ for purposes of Section 106 [of US copyright law],” argued OpenAI.
Copyright infringement and “ideas”
Perhaps the most bizarre part of OpenAI’s motion to dismiss is its attempt to convince the judge that copyright laws “don’t protect ideas, facts, or language.”
“Copyright protects the particular way an author expresses an idea – not the underlying idea itself, facts embodied within the author’s articulated message, or building blocks of creative expression,” OpenAI argues.
OpenAI says that in both cases, the authors registered copyrights for their specific books. However, elements such as “word frequencies, syntactic patterns, and thematic markers” cannot be copyrighted – they’re “simply beyond the scope of protection,” the company argues.
AI and creatives don’t get along
Creatives have long expressed strong disdain for large language models (LLMs). Hollywood’s actors and writers – who remain on strike – have also highlighted AI’s encroachment on the creative industry as a priority concern that studios and streamers need to address.
Do OpenAI’s arguments hold water? The judge will decide.