Creative Commons — AI Shouldn't Take Your Data Without Asking

Publication date: 02.07.2025 08:29:00
The Creative Commons organization, known for its free licenses, has introduced a new project — CC signals. This is a set of legal and technical tools that allow data owners to indicate whether their content can be used to train artificial intelligence and under what conditions.

The project is aimed at solving a growing problem: uncontrolled data extraction for AI can lead to the closure of open resources, the emergence of paid access to sites, and a decrease in transparency on the Internet. CC signals offers an alternative — not prohibitions, but clear signals.

Signals can be both legally binding and advisory, but always with an emphasis on the ethical side of the issue. This is in harmony with the philosophy of Creative Commons: openness, reciprocity, and respect for authorship even in the era of machine learning.

Owners of datasets will be able to set conditions: for example, an obligation to credit the source, share results, open the AI model's code, or refrain from using the data for commercial purposes. All of this is presented in both machine-readable and human-readable form.

The project is especially relevant against the backdrop of how large platforms are revising their policies. Reddit restricts access to bots via robots.txt, Cloudflare is developing paid schemes, and developers are creating tools that “waste” the resources of unethical AI scrapers.
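To make the robots.txt mechanism mentioned above concrete, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The crawler name `ExampleAIBot`, the rules, and the URL are hypothetical and are not taken from Reddit's actual robots.txt; the helper `is_allowed` is likewise introduced only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/posts/1"
    print(is_allowed("ExampleAIBot", url))  # the blocked crawler
    print(is_allowed("SomeBrowser", url))   # any other agent
```

Note that robots.txt is only a request: a crawler has to choose to run a check like this and honor the answer, which is why well-behaved and unethical scrapers end up being treated differently.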

CC signals offers a more sustainable model: agreement instead of confrontation. It is an attempt to create a new social contract between those who share knowledge and those who use it to train AI.

The project is still in its early stages. The first drafts have been published on the Creative Commons website and GitHub. The organization is collecting feedback and plans an alpha launch in November 2025.

Open discussions are also planned, where developers, lawyers, and platform representatives will be able to ask questions and make suggestions.

As data becomes the “new oil”, it is important not only to protect resources but also to build trust. CC signals is an attempt to maintain openness without compromising technical progress or your privacy.

(text translation is done automatically)