OpenAI and Microsoft (NYSE: MSFT) may have already started cleaning up after themselves following a federal lawsuit from The New York Times alleging that the companies unlawfully used the media organization’s content to train the artificial intelligence chatbot ChatGPT.
Large language models (LLMs) like ChatGPT can do what they do because of the vast amounts of information they’re trained on. For these LLMs to keep performing well, AI developers like OpenAI will need to continuously train them on new information. With the lawsuit, the Times is telling the world that OpenAI and Microsoft can’t keep piggybacking off its publicly accessible but copyrighted content, at least not for free.
IP and AI lawyer Cecilia Ziniti noted that ChatGPT’s current responses now seem to be “copyright-aware,” suggesting that this move could shift the discussion into fair-use territory. “Things get more interesting now. Is it fair use to use NYT content without consent as input for training? Is that use transformative?”
🦜OpenAI seems to have fixed verbatim content parrot-backs, at least since NYT put together Exhibit J.
— Cecilia Ziniti (@CeciliaZin) December 29, 2023
Some copyright-aware answers from ChatGPT …
"I'm sorry, but I can't provide verbatim excerpts from copyrighted texts"
"I can't complete the paragraph"
"I can summarize or… https://t.co/499awrZeRD pic.twitter.com/7P1fpS4wy2
It’s starting to appear that this is the tack OpenAI and Microsoft are taking. Ziniti found similar changes in Dall-E, OpenAI’s image generator. The program now comes with what looks like an infringement filter, which lets it either refuse outright to create images of copyrighted characters or offer an abstracted version instead.
🚀 Luigi's revenge? More OpenAI progress on copyright today.
— Cecilia Ziniti (@CeciliaZin) January 2, 2024
Ask for a copyrighted character. Now, DallE either refuses or — critically for copyright — abstracts your requests. It's a filter to avoid infringing outputs. Examples and legal analysis in thread. 👇 https://t.co/8kK5YOhJ6F pic.twitter.com/m5x1SAgsBg
With the filter in place, GPT’s outputs become “much less likely to be infringement,” Ziniti said, citing a similar case involving a dollmaker.
6/ My take? GPT's post-filter outputs are much less likely to be infringement.
— Cecilia Ziniti (@CeciliaZin) January 2, 2024
Here's a real example. One farting doll maker sued another. And won. Reason:
✅ Farting doll concept – not copyrightable. It's ok to copy someone's idea of a farting doll.
❌ Farting doll with… pic.twitter.com/ViHUmdClmF
The changes may not affect the current case, but they offer a glimpse into how OpenAI plans to handle training its models on copyrighted content in the future. Neither OpenAI nor Microsoft has formally responded to the lawsuit as of this writing. The outcome of the case will affect not just the two companies but also the relationship between AI developers and content publishers more broadly.
Information for this story was found via Cecilia Ziniti on X, and the sources and companies mentioned. The author has no securities or affiliations related to the organizations discussed. Not a recommendation to buy or sell. Always do additional research and consult a professional before purchasing a security. The author holds no licenses.