In August, we wrote about how DALL-E 2, Microsoft Image Creator, and other artificial intelligence technologies use visual artists’ and photographers’ work to learn how to generate images based on user prompts. The high-stakes question of whether the unauthorized use of copyrighted works to train AI is legal is not confined to visual images.
While there are multiple AI training lawsuits against companies such as Google and Stability AI, comedian and author Sarah Silverman and writers Christopher Golden and Richard Kadrey made headlines when they filed two suits in the same court in July 2023. One case, set to be argued in December, names Microsoft-backed OpenAI; the other, which names Meta over its Llama system, was recently argued before a judge. Each claims that AI training infringed the copyrights in Silverman’s The Bedwetter, Kadrey’s Sandman Slim, and Golden’s Ararat.
Illegally obtained training content
Both suits allege that the AI systems were trained on illegally acquired datasets containing the books, sourced from so-called “shadow library” websites such as Bibliotik, Library Genesis, and Z-Library. The Meta suit claims that Llama was trained on The Pile, a dataset that the plaintiffs say includes books copied from Bibliotik. According to the complaints, these illegal libraries make books available in bulk through torrent technology.
The plaintiffs argue in the Meta case that Llama did not cite copyright information when prompted to reproduce content from the authors’ books, and that they never consented to the use of their books as training material. In all, the authors bring six counts, covering copyright violations, unjust enrichment, unfair competition, and negligence. They seek statutory damages, restitution of profits, and more.
Motion to dismiss ruling with a caveat
On November 9, federal judge Vince Chhabria said he would grant Meta’s motion to dismiss the allegations that text generated by Llama infringes the authors’ copyrights. Still, he indicated that the plaintiffs were welcome to amend their claims.
Judge Chhabria let stand the core claim that the books were used to train Llama but told the authors’ attorneys that their other arguments do not hold up. He noted at one point that, for those claims to succeed, the text Llama generates would need to be substantially similar to the plaintiffs’ works, which it is not. The judge permitted the plaintiffs to amend most of the dismissed claims.
While this firm is not directly involved in these cases, watch this space for more on this fast-moving issue.