Google is using a vast library of YouTube content to train its artificial intelligence models, including Gemini and the new media generator Veo 3. According to CNBC, the company draws on a catalog of 20 billion videos.
Google has confirmed that it uses this content but clarified that only a portion of the material is involved and that the training is conducted under agreements with creators and media companies.
A YouTube representative explained that the company has always employed its own content to enhance its services, and the emergence of generative AI has not changed this practice. "We recognize the importance of protections, which is why we have developed robust mechanisms to safeguard creators," the company stated.
However, experts are concerned about potential copyright violations. They argue that using others' videos to train AI without the creators' knowledge may lead to a crisis in intellectual property rights. Although YouTube says it has previously disclosed the practice, most creators were unaware that their content was being used for training.
Google does not disclose how many videos it uses to train its models. Even 1% of the library would amount to over 2.3 billion minutes of content, about 40 times more than the training data of its competitors.
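To put those figures in perspective, the arithmetic behind them can be worked through in a few lines. The sketch below uses only the numbers cited in the article and makes two simplifying assumptions: "over 2.3 billion minutes" is treated as exactly 2.3 billion, and "40 times more" is read as a ratio of exactly 40. The function name `implied_figures` is illustrative only.

```python
# Figures cited in the article.
LIBRARY_VIDEOS = 20_000_000_000   # videos in the YouTube catalog
SAMPLE_PERCENT = 1                # the hypothetical 1% slice
SAMPLE_MINUTES = 2_300_000_000    # assumption: "over 2.3 billion" taken as exactly 2.3 billion
COMPETITOR_RATIO = 40             # assumption: "40 times more" read as a ratio of exactly 40

MINUTES_PER_YEAR = 60 * 24 * 365


def implied_figures():
    """Derive the quantities the article's numbers imply (illustrative helper)."""
    sample_videos = LIBRARY_VIDEOS * SAMPLE_PERCENT // 100
    return {
        # Number of videos in a 1% slice of the catalog.
        "sample_videos": sample_videos,
        # Average video length implied by the minutes figure.
        "avg_minutes_per_video": SAMPLE_MINUTES / sample_videos,
        # Size of a competitor's training set implied by the 40x ratio.
        "competitor_minutes": SAMPLE_MINUTES / COMPETITOR_RATIO,
        # Continuous viewing time the 1% slice represents.
        "sample_years": SAMPLE_MINUTES / MINUTES_PER_YEAR,
    }


if __name__ == "__main__":
    for name, value in implied_figures().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, a 1% slice is 200 million videos averaging 11.5 minutes each, the implied competitor dataset is about 57.5 million minutes, and the slice would take roughly 4,376 years to watch end to end.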
When uploading videos, creators grant YouTube broad rights to use their content. However, there is no option for creators to opt out of having their videos used for training Google's models.
Representatives from organizations protecting digital rights argue that creators' years of work are being used to develop AI without compensation or notification. Vermillio, for example, has created a service called Trace ID that measures how closely AI-generated videos resemble original content; in some cases the similarity exceeds 90%.
Some creators are open to their content being used for training, viewing new tools as opportunities for experimentation. However, the majority feel the situation is opaque and requires clearer regulations.
YouTube has also entered into an agreement with Creative Artists Agency to develop a management system for AI content that imitates famous individuals. Yet the mechanisms for tracking or removing such content remain inadequate.
Meanwhile, there are calls in the U.S. to provide authors with legal protections that would allow them to control the use of their creative works in the generative AI landscape.
Recently, Google also changed its internal content moderation rules on YouTube, allowing videos that partially violate guidelines to remain online if deemed socially important.
