What’s In An AI-Generated Voice? Maybe More Than The Law Can Hear

April 15, 2026

Ever since a song called “Heart On My Sleeve” went viral in 2023, the music industry has been grappling with the conundrum of the AI-generated voice. Universal Music Group shut down the “Fake Drake” song, but not because of any rights it held to the song itself (or the vocals behind it). In rapid succession, Suno and Udio hit the music scene, enabling thousands of would-be musicians to create music in the style of their favorite songs, genres, and artists. The three major record labels quickly sued both AI music generators, but some of those lawsuits have since settled, both services continue to operate, and at least one is expected to relaunch with licensed label content.

By the time OpenAI announced Sky, a new voice assistant chatbot that sounded “eerily similar” to Scarlett Johansson, voice cloning technology was already threatening to make dubbing studios and voice actors obsolete. That same year, two voice actors sued an AI text-to-speech company called Lovo for violating their copyright, trademark, right of publicity, contractual, and other rights. The putative class action alleged Lovo used their voices and sound recordings without permission, not only as training material to improve its voice cloning technology, but also as cloned voices that users could use to create their own AI-generated voice recordings.

It turns out the human voice—along with its AI-generated sound-alike—raises “a number of difficult questions, some of first impression.” A song can be protected as a copyrightable “musical work,” “musical composition,” and/or “sound recording.” A film can be protected as a copyrightable “motion picture” or “audiovisual work.” But the human voice that animates those works sits at a more elusive intersection of copyright, right of publicity, trademark, and perhaps in the near future, biometric law. 

Voice Is Not Copyrightable

Decades before AI music and voice generators arrived, the only way to clone a voice was by hiring a human sound-alike. In 1988 and 1991, two advertisers did just that, prompting Bette Midler and Tom Waits to each sue for voice misappropriation. The defendants—who had cleared the rights to use the songs and hired the singers to sing them—argued there could be no copyright infringement, since voice misappropriation was preempted by federal copyright law. The Ninth Circuit disagreed, noting in the Waits case that the plaintiffs’ claims were “for infringement of voice, not for infringement of a copyright subject such as sound recording or musical composition.”

As to the copyrightability of voice, the Ninth Circuit began with the basic premise that voice is a sound, stating in the Midler case:

“Copyright protects ‘original works of authorship fixed in any tangible medium of expression.’ A voice is not copyrightable. The sounds are not ‘fixed.’ What is put forward as protectable [in voice] is more personal than any work of authorship.” (citations omitted)

When it comes to copyright protection, musical compositions qualify. Sound recordings qualify. Audiovisual works qualify. But the human voice itself does not. And, if voice does not qualify as copyrightable subject matter, then, as the Ninth Circuit reasoned, a human sound-alike copying that voice could not amount to copyright infringement:

“Mere imitation of a recorded performance would not constitute a copyright infringement even where one performer deliberately sets out to simulate another’s performance as exactly as possible.”

Unlike the sound-alikes in Midler and Waits, the sound-alikes in Lehrman v. Lovo, Inc. were AI-generated with voice cloning technology. In their copyright infringement claims, the voice actor plaintiffs pointed to Lovo’s unauthorized use of sound recordings voiced by them to both train its large language models (LLMs) and enable cloned voice outputs by its users. In rejecting both claims, the Lehrman court returned to first principles—that voice is uncopyrightable—and accordingly, imitating or simulating voice cannot constitute copyright infringement.  

In reaching its decision, the court underscored copyright law’s well-established idea-expression dichotomy:

“[C]opyright ‘must concern the expression of ideas, not the ideas themselves,’ and does not extend to something as abstract and intangible as a ‘voice,’ (distinguishing between ‘the recognizable sound of [the musician’s] voice (which is not within the subject matter of copyright)’ and ‘the copyrighted work in which that voice is embodied (which, of course, is within the subject matter of copyright)’).”

In the eyes of copyright law, there is a substantive difference between the copying of a sound recording and the copying of the sounds fixed in that sound recording. Absent any allegation or evidence that Lovo’s Genny model did or could generate infringing copies of the plaintiffs’ original sound recordings, the court found no copyright infringement in Genny’s ability “to create new recordings that mimic attributes of their voices like ‘pitch, loudness, tone, timbre, cadence, inflection, breathiness, roughness, strain, jitter (variation in pitch), shimmer (variation in amplitude), spectral tilt, and overall intelligibility.’”

Accordingly, as the court explained, imitating or simulating the sounds (or voices) in a sound recording does not amount to a copying of the sound recording itself:

“The rights of duplication and derivation, with respect to sound recordings, ‘do not extend to the making or duplication of another sound recording that consists entirely of an independent fixation of other sounds, even though such sounds imitate or simulate those in the copyrighted sound recordings.’” (citations omitted)

Voice Under Right Of Publicity Law

Voice is protectable, if at all, under right of publicity law—and only in some states. Even so, not all voices qualify for protection, not all uses are prohibited, and a cloned voice is not anyone’s actual voice.

When the U.S. Copyright Office issued its Copyright and AI Report, it noted, in Part 1: Digital Replicas, how federal copyright law and state right of publicity laws both fall short, particularly in the face of “today’s sophisticated digital replicas.” After confirming “the Copyright Act protects original works of authorship but does not prevent the unauthorized duplication of an individual’s image or voice alone,” the Copyright Office made a point of addressing the limits of state laws: 


“State laws are both inconsistent and insufficient…. [S]ome states currently do not provide rights of publicity and privacy, while others only protect certain categories of individuals. Multiple states require a showing that the individual’s identity has commercial value. Not all states’ laws protect an individual’s voice; those that do may limit protection to distinct and well-known voices, to voices with commercial value, or to use of actual voices without consent (rather than a digital replica).”

Case in point—the California right of publicity statute at issue in the Midler case included “voice,” but the court held that “voice” applied only to Midler’s actual voice—not to its imitation by a human sound-alike. The court also declined to read “voice” into the statute’s reference to “likeness,” concluding that “likeness” “refers to a visual image not a vocal imitation.”

The court closed one statutory door only to open a common law one—recognizing a tort claim for the “appropriation of the attributes of one’s identity,” including attributes as integral to our identity as voice:

“A voice is as distinctive and personal as a face. The human voice is one of the most palpable ways identity is manifested. We are all aware that a friend is at once known by a few words on the phone. At a philosophical level it has been observed that with the sound of a voice, ‘the other stands before me.’ A fortiori, these observations hold true of singing, especially singing by a singer of renown. The singer manifests herself in the song. To impersonate her voice is to pirate her identity.” (citations omitted)

Even where voice qualifies for protection, that protection may still turn on whether the particular use is prohibited under that state’s right of publicity law. Not all uses are prohibited. Some state laws cover a limited set of advertising, merchandising, or other commercial uses, while others incorporate broad First Amendment exceptions. The result is a “patchwork of protections, with the availability of a remedy dependent on where the affected individual lives or where the unauthorized use occurred.” 

Even where a person’s voice qualifies for right of publicity protection and the use would otherwise be prohibited, a cloned voice—however convincingly human—remains a synthetic voice that most right of publicity statutes do not explicitly cover. The first state statute to address this gap was Tennessee’s aptly named ELVIS Act, which extends protection to “voice” and its “simulations.” Touted as “the first-of-its-kind law to protect musicians from AI-generated media,” the Act reflects the music industry’s successful push to bring voice and sound-alikes within Tennessee’s amended right of publicity law:

“‘Voice’ means a sound in a medium that is readily identifiable and attributable to a particular individual, regardless of whether the sound contains the actual voice or a simulation of the voice of the individual.”

The Act goes on to include “voice” wherever “photograph” and “likeness” are mentioned, and expands “use” to include “the commercial availability of a sound recording or audiovisual work in which the individual’s name, photograph, voice, or likeness is readily identifiable,” thereby ensuring that any “simulation” of someone’s “readily identifiable” voice is actionable.

New York and California also amended their laws to cover “digital replicas,” though neither amendment was done in the context of their right of publicity statutes. New York’s version amends its General Obligations Law and California’s is part of its Labor Code. Both laws impose certain requirements when contracting for the use of digital replicas of voice or likeness in personal and professional services agreements. Illinois enacted a more comprehensive update—more akin to the ELVIS Act—to address digital replicas under its amended Right of Publicity Act. 

The Lehrman case was filed in New York before those digital replica amendments took effect, but the view of the United States District Court for the Southern District of New York on voice clones is nonetheless telling. Citing the “potentially weighty consequences” for “ordinary citizens who may fear the loss of dominion over their own identities,” the court acknowledged the absence of any remedy under federal copyright or trademark law, yet allowed the right of publicity claims to proceed.

New York Civil Rights Law §§ 50 and 51 cover the voice of living persons, the voice of deceased persons, and digital replicas of the voice of deceased persons. However, New York’s right of publicity statute does not explicitly cover AI-generated voice clones. Even so, the Lehrman court allowed the voice actors to proceed with their right of publicity claims, perhaps persuaded that “the essence of their voices ha[d] been replicated, such that the Genny algorithm can and does continually produce entirely new sound clips using their voices.” 

Voice As A Trademark?

Trademark law may be the next frontier for voice protection in the age of AI-generated voice cloning. Like other sounds, a voice can be protectable as a trademark—but only when it functions as a source identifier, acquiring secondary meaning beyond merely being a part of the product or service itself.

In 1950, NBC became the first company to register a federal trademark for a sound—the musical notes G, E, and C known as the NBC Chimes—marking the first "purely audible" service mark recognized by the U.S. Patent and Trademark Office. Four decades later, voice-based sound marks began to follow, including “Toot Toot At Beneficial You’re Good for More” in 1992, and the Tarzan Yell and AOL’s “You’ve Got Mail” in 1998. Earlier this year, Matthew McConaughey made some AI noise when he secured eight trademarks, including sound marks for audio of him saying “Alright, alright, alright”—his memorable line from Dazed and Confused—and “Just Keep livin’, right?”  

The Trademark Manual of Examining Procedure (TMEP) defines a sound mark as one that “identifies and distinguishes a product or service through audio rather than visual means.” Audio elements such as “a series of tones or musical notes, with or without words” and “wording accompanied by music” may qualify—but only where they “assume a definitive shape or arrangement” and “create in the hearer’s mind an association of the sound” with a particular product or service. 

Not all sounds qualify for sound mark registration. Only those that function as source indicators can serve as trademarks—sounds that are “arbitrary, unique or distinctive and can be used in a manner so as to attach to the mind of the listener and be awakened on later hearing in a way that would indicate for the listener that a particular product or service was coming from a particular … source.” By contrast, sounds that “resemble or imitate ‘commonplace sounds,’” or that “goods make in the normal course of operation,” require a showing of acquired distinctiveness beyond the everyday associations listeners attach to such commonplace or functional sounds.

There has been little litigation involving voice-based sound marks and the case law on sound marks more generally remains thin. In Ride the Ducks v. Duck Boat Tours, perhaps the most on-point case, a federal district court in Pennsylvania held that “a quacking noise made by tour guides and tour participants by use of duck call devices” was not sufficiently “distinctive” to support trademark infringement or dilution claims against a competing tour operator using a “nearly identical” “Kwacky Kwacker.” 

While a number of companies have registered voice and other sound marks to identify their goods and services, relatively few, if any (before McConaughey’s J.K. Livin’ Brands Inc.’s registrations), have been tied to a particular person’s vocal or other services. Instead, most voice-based sound marks have been associated with fictional characters (e.g., Tarzan, Homer Simpson), company mascots (e.g., AFLAC duck), or corporate brands (e.g., AOL, Yahoo).

Years before voice-based sound marks were issued, Michael Buffer—the voice behind the iconic phrase “Let’s Get Ready to Rumble”—registered a service mark tied to his announcing services. He successfully enforced that mark in several lawsuits and settlements involving unauthorized uses in radio, recordings, and film. Because the mark covered a phrase, however, it was a word mark—not a sound mark—and thus protected the words themselves, irrespective of how and by whom they were spoken.

Unlike Buffer and McConaughey, the plaintiffs in Lehrman had not registered any word, sound, or other service marks for their voice services. Instead, they argued, in essence, that a “trademark-like interest in [one’s] image, likeness, persona, and identity”—which the court confirmed can include one’s voice—made the unauthorized use of their voice actionable under the Lanham Act.

The court, for reasons unrelated to the protectability of voice under trademark law, rejected the Lanham Act claims but confirmed that voice is just as capable as an image of attaining the secondary meaning necessary to function as a trademark, stating:

“[V]oices are protectable only to the extent they function primarily as source identifiers rather than as products themselves.”

Central to the court’s analysis was distinguishing between voice as a “brand identifier” and voice as part of the plaintiffs’ “ordinary services in the voice-over market.” The court elaborated: 

“[V]oices may be protectable to the extent that they are being used primarily to identify the source of particular sound recordings, but are not protectable to the extent that they primarily function as content in those sound recordings.” 

It remains to be seen how far trademark law will stretch in protecting human voice from its AI-generated counterparts. McConaughey may be among the first celebrities to secure voice-based sound marks tied to his “entertainment services and persona,” but as musicians and celebrities confront the limits of copyright and right of publicity law, more may turn to trademark law to supplement whatever protection publicity rights may provide. 

What To Look Out For Next

Voice will continue to occupy a unique—and uneasy—position under intellectual property law. Copyright law provides little shelter for voice, separate and apart from its fixation in a sound recording. Right of publicity coverage can be spotty or nonexistent, particularly when faced with an AI-generated doppelganger. Trademark law, in turn, requires a secondary meaning that may be difficult to attain—especially when voice is both embedded in the brand and part of the services offered under it. 

None of this is likely to stop the rapid advancements in voice cloning technology or, for that matter, the acceleration of lawsuits testing the proprietary limits of both human and AI-generated voice. Plaintiffs’ law firms have already set their sights on a host of voice and speech recognition technologies, alleging that the collection of “voiceprints” violates privacy statutes such as Illinois’ Biometric Information Privacy Act. The next frontier of AI voice litigation may have already arrived.
