If a vendor tells you a vocal dataset is "copyright-cleared," that sentence contains at least five distinct legal concepts, each of which can fail independently, and each of which will be examined during the due diligence phase of any deal worth more than a few million dollars.
This post breaks down what "cleared" actually means when the data in question is a singing voice, what rights have to be stacked on top of each other for a dataset to be genuinely safe, and what specific documentation your legal team should demand before you sign. It is not a substitute for legal advice. It is a framework for asking better questions.
## The rights stack: why "we own the recording" is not enough
Every commercially exploitable vocal recording sits under a layered structure of intellectual property rights. For a dataset to be safe for AI training, each layer has to be accounted for. Get one layer wrong and the entire clearance collapses.
| Layer | What it protects | Who typically holds it |
|---|---|---|
| Master recording (sound recording copyright) | The specific audio file itself | Label, producer, or independent artist |
| Musical composition (publishing) | The underlying melody, lyrics, and chord progression | Publisher or songwriter |
| Performer's neighboring rights | The vocalist's exclusive rights in the performance itself | The singer, directly |
| Moral rights | The right to be identified and to object to distortion of the performance | The singer, often non-waivable in Europe |
| Right of publicity / voice rights | The commercial use of a recognizable voice | The singer, under state or national law |
Each row in that table is a separate legal question. A vendor who owns the master recording but who never obtained explicit AI training authorization from the vocalist has only cleared the first row. The performer's neighboring rights and voice rights are still attached to the recording and still enforceable, sometimes decades after the session.
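The all-or-nothing logic of the stack is easy to express as a per-recording checklist. The sketch below is illustrative only (the field names are ours, not legal terms of art): a single unaccounted-for layer fails the whole record, which is exactly the point of the table above.

```python
from dataclasses import dataclass, fields

@dataclass
class RightsStack:
    """One clearance record per recording; every layer must be True."""
    master_recording: bool = False    # sound recording copyright
    composition: bool = False         # publishing (melody and lyrics)
    neighboring_rights: bool = False  # performer's rights in the performance
    moral_rights: bool = False        # waiver or use restriction, where permitted
    voice_rights: bool = False        # right-of-publicity release

    def fully_cleared(self) -> bool:
        # A single False anywhere collapses the clearance.
        return all(getattr(self, f.name) for f in fields(self))

# A vendor who only owns the master has cleared one of five layers.
master_only = RightsStack(master_recording=True)
assert not master_only.fully_cleared()
```

Keeping a record like this per recording, rather than per dataset, mirrors how a diligence review actually proceeds: layer by layer, file by file.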
## Layer 1: The master recording
This is the easiest layer to verify and the one most vendors actually have in order. Ownership of the master is typically documented in a producer agreement, a work-for-hire agreement, or an assignment from the original artist. If the vendor can show you a signed document transferring ownership of the specific audio file to them, you have cleared the first layer.
What to ask for: a clear chain of title for the recording, ideally including the original session contract, any subsequent assignments, and confirmation that the vendor has the exclusive right to authorize reproduction and derivative works of the audio file.
What can go wrong: split ownership that was never documented, joint works where one co-owner never agreed to the current use, or session musicians who were paid but never signed a work-for-hire agreement.
## Layer 2: The musical composition
If your training data includes vocals singing existing songs (covers, for example), the composition layer matters. The underlying melody and lyrics are a separate copyrighted work from the recording. Training an AI on a cover version of "Hotel California" without composition clearance is still a copyright problem, even if the cover recording itself is fully owned.
For most modern AI training use cases, this layer is handled in one of two ways:
- Original compositions only. The vendor provides only vocals singing original compositions (lyrics and melody written by or for the vocalist), avoiding the composition layer entirely.
- Compulsory mechanical licenses. In the U.S., Section 115 of the Copyright Act provides a statutory mechanical license for reproducing compositions in cover recordings, but this only covers distribution of the recording, not AI training. Mechanical licenses do not automatically authorize training use.
If your intended use case involves training on recordings of existing songs, you need explicit authorization from the composition rightsholder, not just a standard mechanical license. This is not theoretical. The Concord Music Group v. Anthropic case, currently active in the Northern District of California, centers on music publishers alleging that Anthropic trained on copyrighted song lyrics without authorization. The court dismissed the contributory infringement claims in March 2025 but the direct infringement claim is still pending.
## Layer 3: Performer's neighboring rights
This is the layer that most AI training deals get wrong, and it is the layer where legal risk is growing fastest in 2026.
In the European Union, performer rights are codified in Directive 2006/115/EC (the Rental and Lending Directive). Performers have exclusive related rights in their fixed performances, including the rights of fixation, reproduction, distribution, rental, and making available to the public. These rights are separate from the sound recording copyright and they are held directly by the performer, not by the label or producer.
What this means in practice: a European vocalist who sang on a session in 2019 has a direct, enforceable property right in her performance. If that recording is later used to train an AI model in 2026, she can sue the AI company, not just the original label, because her neighboring rights attach to the performance itself. The EU Rental Directive also includes an equitable remuneration right that performers cannot waive.
In the United States, the analogous rights are weaker and more fragmented, but they still exist. Work-for-hire agreements generally transfer economic rights, but the specific right to authorize AI training use is a recent enough concept that most pre-2023 session contracts never addressed it. That leaves a drafting gap that plaintiffs' lawyers are already exploiting.
## What "explicit AI training rights" should look like
A clean vocal dataset agreement should include, at minimum, a specific grant of rights from each vocalist covering:
- Reproduction of the recording for the purpose of training machine learning models
- Creation of derivative works, including statistical representations and model weights derived from the recording
- Sublicensing to third-party enterprise buyers for their own AI training purposes
- Distribution of the recording as part of a commercial dataset
- Storage of the recording and associated metadata for the duration of the training and model lifecycle
- An acknowledgment that the vocalist has been informed of the specific purpose (AI training) and the categories of potential downstream use
If a vendor cannot show you this language in their vocalist agreements, the dataset is not cleared for AI training in any jurisdiction that recognizes performer rights. Full stop.
## Layer 4: Moral rights
Moral rights are the right to be identified as the performer and the right to object to distortion or modification of the performance that would be prejudicial to the performer's reputation. They are codified at the international level in Article 5 of the WIPO Performances and Phonograms Treaty.
In most of continental Europe, moral rights are non-waivable. A performer can contractually agree not to exercise them, but in some jurisdictions the agreement itself may not be enforceable, particularly where the use is deemed to damage the performer's reputation. Germany and France are particularly strict.
For AI training datasets, the moral rights exposure typically comes downstream. A vocalist who consents to her voice being used for training may still have a cause of action if the resulting model is used to generate vocals that damage her reputation (imagine the model being fine-tuned to produce political propaganda or explicit content using a voice clearly derived from her original contributions).
The defensive posture most enterprise dataset providers take is a combination of a contractual moral rights waiver (where legally permissible), a use restriction baked into the dataset license, and a contractual requirement that downstream buyers impose equivalent restrictions on their own outputs. None of these are airtight, but together they materially reduce exposure.
## Layer 5: Right of publicity and voice rights
The fifth layer is the right of publicity, which protects a person from commercial use of their recognizable voice without authorization. In the United States, right of publicity is a patchwork of state laws. Approximately 25 to 30 states recognize some form of right of publicity, and the strongest statutory version is California's, which explicitly protects "name, voice, signature, photograph, or likeness" under Civil Code §3344.
The leading precedent on voice rights in California is Midler v. Ford Motor Co., 849 F.2d 460 (9th Cir. 1988), in which Bette Midler successfully sued Ford for hiring a sound-alike singer to mimic her voice in a commercial. The Ninth Circuit held that the deliberate imitation of a distinctive voice for commercial purposes is a tort under California law, regardless of whether the specific recording being imitated was used.
In March 2024, Tennessee became the first state to explicitly add "voice" to its right-of-publicity statute. The ELVIS Act (Ensuring Likeness Voice and Image Security Act) went into effect on July 1, 2024, and it criminalizes the unauthorized distribution of algorithms, software, or technology "primarily designed" to produce unauthorized voice replicas. Violations are Class A misdemeanors, with civil remedies for damages and injunctive relief.
At the federal level, the NO FAKES Act was reintroduced in Congress on April 9, 2025. It would create a federal private right of action against unauthorized AI-generated replicas of a person's voice or likeness. As of April 2026, the bill has not been enacted, but it has significant industry backing and is one of the more likely AI-related statutes to pass in the current session.
What this means for dataset buyers: even if every other layer of the rights stack is cleared, you can still be sued by a vocalist (or a vocalist's estate) if your model outputs a voice that is recognizable as theirs and you used their contributions in training without a proper right-of-publicity release. The defense is a specific contractual grant of voice rights and, where possible, a waiver of Midler-style claims.
## The six documents your legal team should ask for
When you are evaluating a vocal dataset vendor, request the following six documents. If any one of them is missing or evasive, treat the dataset as partially cleared at best.
- The vocalist agreement template in its current form, with the sections granting AI training rights clearly identified.
- A sample executed agreement (redacted for personal information) showing a real signed contract from a real vocalist.
- The consent log for at least one specific recording in the dataset, showing the vocalist's identifier, timestamp, IP address if applicable, and the exact consent language shown at the time.
- The chain of title for the master recordings, including any assignments from producers or co-owners.
- The composition rights documentation (or confirmation that all recordings are of original compositions by the vocalist).
- The withdrawal procedure describing what happens if a vocalist withdraws consent, including contractual obligations passed through to downstream buyers.
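The consent log in item three is the document most often missing in practice. One minimal sketch of what an auditable entry might look like, with hypothetical field names of our own choosing, is:

```python
# Hypothetical shape of a single consent-log entry (these field names are
# illustrative, not a standard): enough to reconstruct what the vocalist
# saw and agreed to at the moment of consent.
REQUIRED_FIELDS = {"vocalist_id", "timestamp", "agreement_version", "consent_text"}
OPTIONAL_FIELDS = {"ip_address"}  # "if applicable," per the checklist above

def validate_consent_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry is auditable."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - entry.keys())]
    problems += [f"unexpected field: {f}"
                 for f in sorted(entry.keys() - REQUIRED_FIELDS - OPTIONAL_FIELDS)]
    return problems

entry = {
    "vocalist_id": "v-0184",
    "timestamp": "2026-02-11T14:32:09Z",
    "ip_address": "203.0.113.7",
    "agreement_version": "3.2",
    "consent_text": "I consent to the use of my recordings for AI model training.",
}
assert validate_consent_entry(entry) == []
```

The key design point is storing the exact consent language (or an exact version pointer to it), not a summary: a log that cannot reproduce what the vocalist actually saw will not survive an audit.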
A vendor who has their house in order can produce all six of these within a week. A vendor who cannot is not actually in the business of selling cleared training data. They are selling the liability, and you are buying it.
## Why "sourced ethically" is not the same as "cleared"
One of the marketing patterns we see in the vocal data space is the use of the phrase "ethically sourced" as a substitute for a detailed clearance argument. "Ethically sourced" can mean almost anything. It can mean the vocalists were paid a fair session rate. It can mean the vendor did not scrape from YouTube. It can mean the vendor feels good about where the data came from.
None of those meanings are enforceable. A fair session rate does not transfer AI training rights. Not scraping from YouTube does not establish that the vocalists consented to AI use. Feeling good about provenance is not a legal posture.
When you see "ethically sourced" on a vocal dataset page, treat it as a marketing claim, not a legal one. Follow up with the six-document request. The vendors who can answer move forward. The vendors who cannot get cut from the evaluation.
## What "cleared" means at The Vocal Market
Our enterprise vocal dataset licensing program was built specifically to satisfy the rights stack described above. Every vocalist who contributes to the dataset signs a digital agreement that includes:
- A specific, timestamped consent to the use of their recordings for AI model training
- An explicit grant of rights for sublicensing to enterprise buyers for their own training purposes
- An acknowledgment of the special category nature of voice data under GDPR Article 9 and explicit consent under Article 9(2)(a)
- A moral rights waiver to the extent permitted under the vocalist's jurisdiction, combined with contractual use restrictions that bind downstream buyers
- A withdrawal mechanism that meets GDPR's "as easy as giving" standard
The consent chain for every recording is logged with user identifier, timestamp, IP address, and the exact version of the agreement shown at the time of consent. If your legal team wants to audit the clearance for a specific recording before you commit, we can produce the documentation within 48 hours. Request a sample dataset and we will send the corresponding clearance records alongside it.