Meeka Bondy

What’s In An AI-Generated Voice? Maybe More Than The Law Can Hear

An examination of voice in the age of voice cloning and the shortcomings of copyright, right of publicity, and trademark law.

April 15, 2026

Ever since a song called “Heart On My Sleeve” went viral in 2023, the music industry has been grappling with the conundrum of the AI-generated voice. Universal Music Group shut down the “Fake Drake” song, but not because of any rights it held to the song itself (or the vocals behind it). In rapid succession, Suno and Udio hit the music scene, enabling thousands of would-be musicians to create music in the style of their favorite songs, genres, and artists. The three major record labels quickly sued both AI music generators, but some of those lawsuits have since settled, both services continue to operate, and at least one is expected to relaunch with licensed label content.

By the time OpenAI announced Sky, a new voice assistant chatbot that sounded “eerily similar” to Scarlett Johansson, voice cloning technology was already threatening to make dubbing studios and voice actors obsolete. That same year, two voice actors sued an AI text-to-speech company called Lovo for violating their copyright, trademark, right of publicity, contractual, and other rights. The putative class action alleged Lovo used their voices and sound recordings without permission, not only as training material to improve its voice cloning technology but also as cloned voices that users could use to create their own AI-generated voice recordings.

It turns out the human voice—along with its AI-generated sound-alike—raises “a number of difficult questions, some of first impression.” A song can be protected as a copyrightable “musical work,” “musical composition,” and/or “sound recording.” A film can be protected as a copyrightable “motion picture” or “audiovisual work.” But the human voice that animates those works sits at a more elusive intersection of copyright, right of publicity, trademark, and perhaps in the near future, biometric law. 

Voice Is Not Copyrightable

Decades before AI music and voice generators arrived, the only way to clone a voice was by hiring a human sound-alike. In 1988 and 1991, two advertisers did just that, prompting Bette Midler and Tom Waits to each sue for voice misappropriation. The defendants—who had cleared the rights to use the songs and hired the singers to sing them—argued there could be no copyright infringement, since voice misappropriation was preempted by federal copyright law. The Ninth Circuit disagreed, noting in the Waits case that the plaintiffs’ claims were “for infringement of voice, not for infringement of a copyright subject such as sound recording or musical composition.”

As to the copyrightability of voice, the Ninth Circuit began with the basic premise that voice is a sound, stating in the Midler case:

“Copyright protects ‘original works of authorship fixed in any tangible medium of expression.’ A voice is not copyrightable. The sounds are not ‘fixed.’ What is put forward as protectable [in voice] is more personal than any work of authorship.” (citations omitted)

When it comes to copyright protection, musical compositions qualify. Sound recordings qualify. Audiovisual works qualify. But the human voice itself does not. And, if voice does not qualify as copyrightable subject matter, then, as the Ninth Circuit reasoned, a human sound-alike copying that voice could not amount to copyright infringement:

“Mere imitation of a recorded performance would not constitute a copyright infringement even where one performer deliberately sets out to simulate another’s performance as exactly as possible.”

Unlike the sound-alikes in Midler and Waits, the sound-alikes in Lehrman v. Lovo, Inc. were AI-generated with voice cloning technology. In their copyright infringement claims, the voice actor plaintiffs pointed to Lovo’s unauthorized use of sound recordings voiced by them to both train its large language models (LLMs) and enable cloned voice outputs by its users. In rejecting both claims, the Lehrman court returned to first principles—that voice is uncopyrightable—and accordingly, imitating or simulating voice cannot constitute copyright infringement.  

In reaching its decision, the court underscored copyright law’s well-established idea-expression dichotomy:

“[C]opyright ‘must concern the expression of ideas, not the ideas themselves,’ and does not extend to something as abstract and intangible as a ‘voice,’ (distinguishing between ‘the recognizable sound of [the musician’s] voice (which is not within the subject matter of copyright)’ and ‘the copyrighted work in which that voice is embodied (which, of course, is within the subject matter of copyright)’).”

In the eyes of copyright law, there is a substantive difference between the copying of a sound recording and the copying of the sounds fixed in that sound recording. Absent any allegation or evidence that Lovo’s Genny model did or could generate infringing copies of the plaintiffs’ original sound recordings, the court found no copyright infringement in Genny’s ability “to create new recordings that mimic attributes of their voices like ‘pitch, loudness, tone, timbre, cadence, inflection, breathiness, roughness, strain, jitter (variation in pitch), shimmer (variation in amplitude), spectral tilt, and overall intelligibility.’”

Accordingly, as the court explained, imitating or simulating the sounds (or voices) in a sound recording does not amount to a copying of the sound recording itself:

“The rights of duplication and derivation, with respect to sound recordings, ‘do not extend to the making or duplication of another sound recording that consists entirely of an independent fixation of other sounds, even though such sounds imitate or simulate those in the copyrighted sound recordings.’” (citations omitted)

Voice Under Right Of Publicity Law

Voice is protectable, if at all, under right of publicity law—and only in some states. Even so, not all voices qualify for protection, not all uses are prohibited, and a cloned voice is not anyone’s actual voice.

When the U.S. Copyright Office issued its Copyright and AI Report, it noted, in Part 1: Digital Replicas, how federal copyright law and state right of publicity laws both fall short, particularly in the face of “today’s sophisticated digital replicas.” After confirming “the Copyright Act protects original works of authorship but does not prevent the unauthorized duplication of an individual’s image or voice alone,” the Copyright Office made a point of addressing the limits of state laws: 


“State laws are both inconsistent and insufficient…. [S]ome states currently do not provide rights of publicity and privacy, while others only protect certain categories of individuals. Multiple states require a showing that the individual’s identity has commercial value. Not all states’ laws protect an individual’s voice; those that do may limit protection to distinct and well-known voices, to voices with commercial value, or to use of actual voices without consent (rather than a digital replica).”

Case in point—the California right of publicity statute at issue in the Midler case included “voice,” but the court held that “voice” applied only to Midler’s actual voice—not to its imitation by a human sound-alike. The court also declined to read “voice” into the statute’s reference to “likeness,” concluding that “likeness” “refers to a visual image not a vocal imitation.”

The court closed one statutory door only to open a common law one—recognizing a tort claim for the “appropriation of the attributes of one’s identity,” including attributes as integral to our identity as voice:

“A voice is as distinctive and personal as a face. The human voice is one of the most palpable ways identity is manifested. We are all aware that a friend is at once known by a few words on the phone. At a philosophical level it has been observed that with the sound of a voice, ‘the other stands before me.’ A fortiori, these observations hold true of singing, especially singing by a singer of renown. The singer manifests herself in the song. To impersonate her voice is to pirate her identity.” (citations omitted)

Even where voice qualifies for protection, that protection may still turn on whether the particular use is prohibited under that state’s right of publicity law. Not all uses are prohibited. Some state laws cover a limited set of advertising, merchandising, or other commercial uses, while others incorporate broad First Amendment exceptions. The result is a “patchwork of protections, with the availability of a remedy dependent on where the affected individual lives or where the unauthorized use occurred.” 

Even where a person’s voice qualifies for right of publicity protection and the use would otherwise be prohibited, a cloned voice—however convincingly human—remains a synthetic voice that most right of publicity statutes do not explicitly cover. The first state statute to address this gap was Tennessee’s aptly named ELVIS Act, which extends protection to “voice” and its “simulations.” Touted as “the first-of-its-kind law to protect musicians from AI-generated media,” the Act reflects the music industry’s successful push to bring voice and sound-alikes within Tennessee’s amended right of publicity law:

“‘Voice’ means a sound in a medium that is readily identifiable and attributable to a particular individual, regardless of whether the sound contains the actual voice or a simulation of the voice of the individual.”

The Act goes on to include “voice” wherever “photograph” and “likeness” are mentioned, and expands “use” to include “the commercial availability of a sound recording or audiovisual work in which the individual’s name, photograph, voice, or likeness is readily identifiable,” thereby ensuring that any “simulation” of someone’s “readily identifiable” voice is actionable.

New York and California also amended their laws to cover “digital replicas,” though neither amendment was done in the context of their right of publicity statutes. New York’s version amends its General Obligations Law and California’s is part of its Labor Code. Both laws impose certain requirements when contracting for the use of digital replicas of voice or likeness in personal and professional services agreements. Illinois enacted a more comprehensive update—more akin to the ELVIS Act—to address digital replicas under its amended Right of Publicity Act. 

The Lehrman case was filed in New York before those digital replica amendments took effect, but the Southern District of New York’s view of voice clones is nonetheless telling. Citing the “potentially weighty consequences” for “ordinary citizens who may fear the loss of dominion over their own identities,” the court acknowledged the absence of any remedy under federal copyright or trademark law, yet allowed the right of publicity claims to proceed.

New York Civil Rights Law §§ 50 and 51 cover the voice of living persons, the voice of deceased persons, and digital replicas of the voice of deceased persons. However, New York’s right of publicity statute does not explicitly cover AI-generated voice clones. Even so, the Lehrman court allowed the voice actors to proceed with their right of publicity claims, perhaps persuaded that “the essence of their voices ha[d] been replicated, such that the Genny algorithm can and does continually produce entirely new sound clips using their voices.” 

Voice as a Trademark?

Trademark law may be the next frontier for voice protection in the age of AI-generated voice cloning. Like other sounds, a voice can be protectable as a trademark—but only when it functions as a source identifier, acquiring secondary meaning beyond merely being a part of the product or service itself.

In 1950, NBC became the first company to register a federal trademark for a sound—the musical notes G, E, and C known as the NBC Chimes—marking the first "purely audible" service mark recognized by the U.S. Patent and Trademark Office. Four decades later, voice-based sound marks began to follow, including “Toot Toot At Beneficial You’re Good for More” in 1992, and the Tarzan Yell and AOL’s “You’ve Got Mail” in 1998. Earlier this year, Matthew McConaughey made some AI noise when he secured eight trademarks, including sound marks for audio of him saying “Alright, alright, alright”—his memorable line from Dazed and Confused—and “Just Keep livin’, right?”  

The Trademark Manual of Examining Procedure (TMEP) defines a sound mark as one that “identifies and distinguishes a product or service through audio rather than visual means.” Audio elements such as “a series of tones or musical notes, with or without words” and “wording accompanied by music” may qualify—but only where they “assume a definitive shape or arrangement” and “create in the hearer’s mind an association of the sound” with a particular product or service. 

Not all sounds qualify for sound mark registration. Only those that function as source indicators can serve as trademarks—sounds that are “arbitrary, unique or distinctive and can be used in a manner so as to attach to the mind of the listener and be awakened on later hearing in a way that would indicate for the listener that a particular product or service was coming from a particular … source.” By contrast, sounds that “resemble or imitate ‘commonplace sounds,’” or that “goods make in the normal course of operation,” require a showing of acquired distinctiveness beyond the everyday associations listeners attach to such commonplace or functional sounds.

There has been little litigation involving voice-based sound marks and the case law on sound marks more generally remains thin. In Ride the Ducks v. Duck Boat Tours, perhaps the most on-point case, a federal district court in Pennsylvania held that “a quacking noise made by tour guides and tour participants by use of duck call devices” was not sufficiently “distinctive” to support trademark infringement or dilution claims against a competing tour operator using a “nearly identical” “Kwacky Kwacker.” 

While a number of companies have registered voice and other sound marks to identify their goods and services, relatively few, if any (before McConaughey’s J.K. Livin’ Brands Inc.’s registrations), have been tied to a particular person’s vocal or other services. Instead, most voice-based sound marks have been associated with fictional characters (e.g., Tarzan, Homer Simpson), company mascots (e.g., the AFLAC duck), or corporate brands (e.g., AOL, Yahoo).

Years before voice-based sound marks were issued, Michael Buffer—the voice behind the iconic phrase “Let’s Get Ready to Rumble”—registered a service mark tied to his announcing services. He successfully enforced that mark in several lawsuits and settlements involving unauthorized uses in radio, recordings, and film. Because the mark covered a phrase, however, it was a word mark—not a sound mark—and thus protected the words themselves, irrespective of how and by whom they were spoken.

Unlike Buffer and McConaughey, the plaintiffs in Lehrman had not registered any word, sound, or other service marks for their voice services. Instead, they argued, in essence, that a “trademark-like interest in [one’s] image, likeness, persona, and identity”—which the court confirmed can include one’s voice—made the unauthorized use of their voice actionable under the Lanham Act.

The court, for reasons unrelated to the protectability of voice under trademark law, rejected the Lanham Act claims but confirmed that voice is just as capable as an image of attaining the secondary meaning necessary to function as a trademark, stating:

“[V]oices are protectable only to the extent they function primarily as source identifiers rather than as products themselves.”

Central to the court’s analysis was distinguishing between voice as a “brand identifier” and voice as part of the plaintiffs’ “ordinary services in the voice-over market.” The court elaborated: 

“[V]oices may be protectable to the extent that they are being used primarily to identify the source of particular sound recordings, but are not protectable to the extent that they primarily function as content in those sound recordings.” 

It remains to be seen how far trademark law will stretch in protecting human voice from its AI-generated counterparts. McConaughey may be among the first celebrities to secure voice-based sound marks tied to his “entertainment services and persona,” but as musicians and celebrities confront the limits of copyright and right of publicity law, more may turn to trademark law to supplement whatever protection publicity rights may provide. 

What To Look Out For Next

Voice will continue to occupy a unique—and uneasy—position under intellectual property law. Copyright law provides little shelter for voice, separate and apart from its fixation in a sound recording. Right of publicity coverage can be spotty or nonexistent, particularly when faced with an AI-generated doppelganger. Trademark law, in turn, requires a secondary meaning that may be difficult to attain—especially when voice is both embedded in the brand and part of the services offered under it. 

None of this is likely to stop the rapid advancements in voice cloning technology or, for that matter, the acceleration of lawsuits testing the proprietary limits of both human and AI-generated voice. Plaintiffs’ law firms have already set their sights on a host of voice and speech recognition technologies, alleging that the collection of “voiceprints” violates privacy statutes such as Illinois’ Biometric Information Privacy Act. The next frontier of AI voice litigation may have already arrived.


Meeka Bondy

An AI-Generated Image Might Be Worth More Than A Thousand Words

How new AI-generated personas are drawing new lines around copyright and right of publicity law.

March 13, 2026

When a hyper-realistic deepfake video featuring Brad Pitt and Tom Cruise throwing punches over Jeffrey Epstein surfaced weeks ago, many bemoaned the end of Hollywood movie-making as we know it. A little-known ByteDance-backed AI model called Seedance 2.0 became more famous than Tilly Norwood overnight, striking fear across the creative community and prompting cease and desist letters from at least four major movie studios.

Welcome to the next phase of generative AI litigation. 

Generative AI Litigation 2.0 Has Arrived

We saw the preview when three major movie studios handed Midjourney a 110-page complaint, alleging direct and indirect copyright infringement by Midjourney’s Image Service and upcoming Video Service. We saw the prequel when fair use won, at least when it came to copying books to train the text-to-text generative AI tools that burst on the scene back in 2022. But in this next phase of generative AI technology, the digital replicas, deepfakes, and synthetic media made possible by today’s image, video, audio, and increasingly multimodal versions implicate right of publicity, copyrightability, and fair use questions not at issue in Bartz v. Anthropic or Kadrey v. Meta.

The AI-generated humans, characters, and synthetic media from Seedance, Midjourney, and Tilly Norwood raise intellectual property (IP) issues that simply do not arise when the inputs and outputs are limited to words and punctuation marks. In the wake of these new AI image and video generators, three distinct AI personas are emerging, each following a different IP path:

· The Pitt-Cruise deepfake—digital replicas of real humans, which implicate right of publicity law.

· The Midjourney images—digital replicas of fictional characters, which implicate copyright law.

· Tilly Norwood—synthetic media of digital humans, which may implicate neither.

The first wave of generative AI litigation was about words, text, and training data. The next wave will be about faces, characters, and identity.

Human Likeness Is Not Copyrightable

Integral to the Pitt-Cruise deepfake, and its ability to garner more than 15 million views, is the identity and recognizability of the well-known actors in the video. Yet neither Brad nor Tom has any copyright basis to object to the AI-generated video itself.

To the extent any copyright exists in the AI-generated images, video, and audio comprising a deepfake, those rights belong not to the humans depicted in the deepfake, but to the human who prompted it into being and posted it for all to see. Moreover, while copyright might attach to the video itself as a copyrightable audiovisual work, copyright protects “original works of authorship”—not the humans depicted in them. Copyright law does not recognize a human’s name, face, voice, or likeness as a copyrightable work.

Under U.S. copyright law, only “original works of authorship” that are “fixed in any tangible medium of expression,” are “independently created by a human author,” and possess “at least some minimal degree of creativity” qualify as copyrightable subject matter. In contrast, “identity is not fixed in a tangible medium of expression” and “[a] person’s likeness—her persona—is not authored and it is not fixed.” Accordingly, a “person’s name or likeness is not a work of authorship” and therefore falls outside the scope of copyright protection.

That does not leave Brad or Tom without any recourse. While actors rarely own the copyright to the tabloid photographs or blockbuster films in which they appear, they do possess right of publicity rights, not to the photographs or videos themselves, but to their likeness as captured in those works. “The right of publicity is an intellectual property right that protects against the misappropriation of a person’s name, likeness, or other indicia of personal identity—such as nickname, pseudonym, voice, signature, likeness, or photograph—for commercial benefit.” 

Courts—and Hollywood—have long recognized this distinction between the copyright rights to the audiovisual work and the right of publicity rights to the likenesses embedded in that work. As one court put it:  

“There is no ‘work of authorship’ at issue in [a] right of publicity claim…. The fact that an image of the person might be fixed in a copyrightable photograph does not change this.... The fact that the photograph itself could be copyrighted, and that defendants owned the copyright to the photograph that was used, is irrelevant to the [right of publicity] claim.... The defendants did not have her consent to continue to use the photograph....”

For AI image and video generators like Seedance, the uncopyrightability of human likeness becomes particularly meaningful in the context of AI-generated videos like the Pitt-Cruise deepfake. While Seedance need not worry about any copyright claims from famous people like Brad or Tom, it faces newfound exposure under right of publicity law, risks that did not arise when generative AI tools were limited to text. 

Moreover, any fair use defenses that allowed AI text generators to copy books to train their LLMs have no bearing in the context of AI-generated deepfakes featuring human likenesses. In the face of right of publicity claims, AI image generators are left with less than convincing arguments—that it was the user, not them, who was the bad actor, or that the deepfake constituted protected free speech or was somehow not commercial in nature. 

Characters Are Different

The Pitt-Cruise deepfake highlights the limits of copyright law when it comes to human identity. But the Midjourney images present a different picture. Unlike human likenesses, fictional characters occupy a more storied place under copyright law. Characters can, in and of themselves, qualify as copyrighted works, transcending the scripts, dialogue, photography, recordings, and performances that depict them. 

Since the radio days of the so-called Sam Spade case, courts have recognized the distinct copyrightability of characters where “the character really constitutes the story being told, but [not] if the character is only the chessman in the game of telling the story.” In the more recent so-called Batmobile case, the Ninth Circuit articulated the now widely adopted “three-part test for determining whether a character in a comic book, television program, or motion picture is entitled to copyright protection.” In order to be protectable under copyright law, the character must (1) possess “physical as well as conceptual qualities,” (2) be “‘sufficiently delineated’ to be recognizable as the same character whenever it appears” and “display consistent, identifiable character traits and attributes,” and (3) be “‘especially distinctive’ and ‘contain some unique elements of expression’” and not “a stock character such as a magician in standard magician garb.”

Iconic characters like Yoda and Darth Vader would easily meet these standards and qualify as copyrightable characters. In bringing both direct and secondary (i.e., vicarious and contributory) copyright infringement claims, the plaintiffs point not only to Midjourney’s use of their copyrighted characters to train its Image Service but to its other acts of infringement—its distribution of infringing outputs in response to user prompts, its refusal to institute copyright protection measures used by other AI image generators, and its use of user-generated outputs in the marketing and promotion of its Image Service.

The fair use arguments that protected the training activity behind text-to-text versions like Claude and Llama might not hold when it comes to Midjourney’s use of copyrighted characters to train its models, let alone to how it “directly reproduces, publicly displays, and distributes reproductions and derivative works of [copyrighted] content” and “directly produces image outputs that infringe on [the plaintiffs’] copyrighted characters.” We will have to wait and see whether fair use continues to win out when the infringement involves the AI-enabled image generation of copyrightable characters.

Outputs and Inputs Measure Differently on the Fair Use Scale

If the Midjourney lawsuit is any indication, in this next wave of generative AI, the fair use battleground is shifting—from the training inputs used by AI companies to the outputs generated by user prompts. In the first wave of generative AI litigation, the copyright claims focused on the use of copyrighted works by AI companies in the context of AI training. Key to those earlier fair use rulings was the finding that “training LLMs did not result in any exact copies nor even infringing knockoffs of their works being provided to the public.”

For the Bartz court, it mattered little to the fair use analysis that the LLM’s “mapping of contingent relationships was so complete” that it “indeed simply ‘memorized’ the work it trained upon almost verbatim” and could “recite works it had trained upon.” Rather, the important question was whether “any LLM outputs infringing upon their works ever reached users of the public-facing Claude service.” That “Claude created no exact copy, nor any substantial knock-off[,] [n]othing traceable to Authors’ works” was key. The court underscored the point repeatedly: 

“To repeat and be clear: Authors do not allege that any LLM output provided to users infringed upon Authors’ works. Our record shows the opposite…. Here, if the outputs seen by users had been infringing, Authors would have a different case. And, if the outputs were ever to become infringing, Authors could bring such a case. But that is not this case. Instead, Authors challenge only the inputs, not the outputs, of these LLMs.” (citations omitted)

Similarly, the Kadrey court reinforced this input-output distinction in the context of the plaintiffs’ regurgitation argument, “that the model will regurgitate their works (or outputs that are substantially similar), thereby allowing users to access those works or substitutes for them for free via the model.” Meta countered that its “mitigations” (the post-training of models to prevent them from “memorizing” and outputting copyrighted material) successfully lowered Llama’s regurgitation rate to a mere 50 words and punctuation marks 60% of the time, even with “adversarial prompting” (the testing of models designed to get them to regurgitate copyrighted material from their training data). The court found that rate to not be a “meaningful portion” when it came to the copying of books.

In this next phase of generative AI litigation, the very ability of Midjourney’s Image Service to so easily produce near-perfect replicas of so many of our most cherished movie characters could ultimately be the feature that topples fair use protection for the AI-generated images and video of the future. Even as generative AI technology accelerates beyond simple text to lifelike images and videos, courts may be less wowed by how “transformative” these technologies are and more troubled by the “market substitution” posed by how well and easily they regurgitate so many copyrightable characters.

What to Look Out for Next

With Seedance, Midjourney, and Tilly Norwood, the future of generative AI technology promises to bring a lot more than an endless torrent of words. We now have at least three distinct AI personas to watch, each moving down a different legal trajectory. In the Pitt-Cruise deepfake, we see user-generated, AI-generated replicas of real humans, which will turn more to right of publicity law than copyright law. In the Midjourney images and video, we see user-generated, AI-generated replicas of fictional characters, which will rise and fall under copyright law. 

With Tilly Norwood, the uncanny valley—and broader workforce displacement concerns—that make us so uneasy with synthetic media might be the only forces holding “her” back, as her appearance neither invokes nor infringes any preexisting copyright or right of publicity rights. Ironically, the very human creativity that courts have recognized when extending copyright protection to fictional characters could pave a path for synthetic media like Tilly Norwood to move beyond the general uncopyrightability of AI-generated images.  

When it comes to AI-generated images, it turns out looks matter—and character counts.


Meeka Bondy

Three Fair Use Ironies From 2025 To Watch Out For In 2026

A look at how generative AI technology is upending the boundaries of copyright law and fair use.

February 13, 2026

The year 2025 was a riveting year for those of us in media and technology—including me. I launched Meeka J. Bondy, PLLC to bring my big law, big media, and big tech background to the cutting-edge businesses shaping the future of entertainment. No doubt 2026 will bring a wave of dealmaking for media and technology companies alike. But before we get too far into the year, allow me to share my first post under my new banner. 

Copyrighting the Uncopyrightable

2025 brought us the first fair use decisions—one against and two in favor—out of the dozens of copyright cases filed against AI companies since 2022. The first AI fair use ruling, Thomson Reuters v. Ross Intelligence, held that it was not fair use to use copyrighted works for AI training purposes. That rejection, however, was carefully cabined, applying only in the context of a nongenerative AI tool developed for the purpose of creating a competing product. More remarkable than the court’s fair use analysis was its soliloquy on copyrightability and the apparent fine art of lawyering.

After confirming the longstanding legal precedent that judicial opinions themselves are uncopyrightable, the court began on familiar ground, finding that Westlaw headnotes, taken as a whole, constitute a copyrightable “factual compilation,” based on the “minimal degree of creativity” reflected by the compiler’s “choices as to selection and arrangement.” But the judge went further, concluding that each Westlaw headnote, standing on its own, constituted “an individual, copyrightable work”—“even any that quote judicial opinions verbatim.” Analogizing “the lawyer’s editorial judgment to that of a sculptor,” the court explained:

“A block of raw marble, like a judicial opinion, is not copyrightable. Yet a sculptor creates a sculpture by choosing what to cut away and what to leave in place. That sculpture is copyrightable. So too, even a headnote taken verbatim from an opinion is a carefully chosen fraction of the whole.”

While I wholeheartedly agree that the practice of law involves a fair amount of creativity, I have my doubts that a lawyer—through the craft of verbatim copying and basic paraphrasing—can, simply by lifting important quotes and placing them in a proprietary database, infuse copyrightability into uncopyrightable judicial opinions. 

For me, the seemingly lower bar for “originality” here is hard to square with the U.S. Copyright Office’s approach to AI-generated works, which largely refuses copyright registration for AI-generated outputs (or at most, limits protection to the human author’s “selection, coordination, and arrangement”). In the age of AI, it feels disjointed to protect the work of human lawyers chiseling verbatim quotes from uncopyrightable judicial opinions, while discounting the work of human artists chiseling visual artwork from uncopyrightable AI-generated images. I, for one, will be looking for the Third Circuit to clarify this copyrightability issue (if not the rest of the fair use analysis).

Getting paid for sitting idle.

On the heels of the first rejection of the fair use defense in a nongenerative AI training case came two back-to-back fair use decisions—Bartz v. Anthropic and Kadrey v. Meta—both ruling in favor of fair use where the purpose of the AI training was to create a generative AI tool. Both cases arose out of the Northern District of California. Both involved the use of books to train LLMs. Both found generative AI to be highly “transformative.” Yet their copyright analyses were worlds apart, leading to some interesting, if not ironic, results.

For the Bartz court, two issues loomed large. First, did the AI company actually purchase the initial training copies? Second, were those copies actually used to train the LLM? 

On the first issue, the judge blessed AI training only if the initial copies were purchased “fair and square.” Under the court’s reasoning, AI companies that source their training materials from pirated shadow libraries are barred from invoking fair use. No amount of subsequent transformative use could cleanse the illegitimate origins of the initial copies. For some, it may seem fair and equitable to require AI companies to pay the list price for the books in order to be eligible to raise the fair use defense. For fair use purists, however, this reasoning cuts against the doctrine’s core premise—that fair use excuses what would otherwise be infringing conduct. 

On the second issue, the court drew a sharp distinction between copies that were used for training and those that were not. If the copy was actually used to train an LLM, that use was fair use. If the copy merely sat in a centralized library and never made its way to the training dock, that use was not fair use—and therefore constituted copyright infringement. At first glance, that binary distinction appears sensible. But it leads to a somewhat absurd result—authors whose books were used to train AI received no compensation (other than any royalties owed for a single copy), while authors whose books were never used for training were entitled to be made whole for the harms caused simply by having their books sit idle in a corporate database. A less absurd framing of the ruling might go like this—using books to train AI is fair use, but stockpiling “all the books in the world” in a “general purpose” library “forever” is not.

Shortly after the decision, the parties settled the remaining claims for a staggering $1.5 billion—roughly $3,000 per book spread across 500,000 titles. But importantly, that settlement applied only to the pirated books that sat unused in the central library—not to the books actually used to train AI. For now, this headline-grabbing settlement offers little guidance on what AI training license fees for books should be.

The Maybe Not So Vertical Fair Use Lane.

Taking a markedly different approach to fair use, the Kadrey court agreed that generative AI is highly transformative, but sharply criticized the Bartz court for fixating on the first fair use factor (the purpose and character of the use), while “blowing off” the most important factor, the fourth factor (the effect of the use on the potential market for or value of the copyrighted work). Rejecting what it called an “inapt analogy,” the Kadrey court countered:

“[U]sing books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.” 

For the Kadrey judge, the issue was not just about generative AI’s ability to memorize, regurgitate, or produce exact replicas or even substantially similar outputs. It was more existential. The court elaborated:

"So, by training generative AI models with copyrighted works, companies are creating something that will often dramatically undermine the market for those works, and thus dramatically undermine the incentive for human beings to create things the old-fashioned way.” 

The opinion reads almost as a plea, urging future plaintiffs to regroup and focus squarely on the fourth factor. Reluctant but nonetheless compelled to rule against the authors in this case, the judge captured well the angst shared by so many whose livelihoods depend on the value of human creativity:

“No matter how transformative LLM training may be, it’s hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing books that could significantly harm the market for those books.”

For now, the fair use shield remains intact, at least in the context of generative AI training. But as AI becomes more competitive, more specialized, and more vertical, cracks may begin to form. Read together, the fourth-factor focus in Kadrey and the first-factor emphasis in Thomson Reuters could spell tougher terrain for the next generation of AI. In the spirit of Kadrey and Bartz, broadly “horizontal” AI tools may continue to qualify for fair use protection. But as AI becomes more verticalized, it may be harder for AI companies to claim fair use when the copyrighted works they seek to emulate both fuel their models and define their outputs—and when the competing tools they seek to build are capable of displacing the entire market for those copyrighted works.
