Does AI content need transparency to protect society?


AI content is currently in the Wild West phase – you generate what works. There are no transparency rules. China and OpenAI are working on countermeasures.

Besides the issue of copyright on training content for AI models, another fundamental question plagues the AI industry: what about all the AI content that is barely recognizable as such (images) or no longer recognizable (texts)?

What consequences could this lack of transparency in AI content have for society? Are we facing news overload and mass layoffs in the media industry? Are essays and dissertations dead? Is a wave of fake news and spam headed our way? Of course, all these problems already exist. But AI automation could scale them to a new level.

For our society to make a conscious decision about these and similar issues and regulate them, we first need transparency. Which works were created by a human, and which by a machine? Without this transparency, attempts at regulation will struggle.


China bans AI media without watermark

The Cyberspace Administration of China, which among other things regulates and censors the Internet in China, prohibits the creation of AI media without watermarks. This new rule will come into effect on January 10, 2023.

The authority talks about the dangers posed by “deep synthesis technology,” which, while meeting user needs and improving user experience, is also misused to spread illegal and harmful information, damage reputations, and forge identities.

Such misuse endangers national security and social stability, according to a statement from the authority. New products in this segment must first be evaluated and approved by the authority.

The authority insists on the importance of watermarks that identify AI content as such without restricting the function of the software. These watermarks must not be removed, manipulated, or hidden. Users of AI software must register for accounts under their real names, and the content they generate must be traceable, the authority explains.

OpenAI explores AI text detection systems

Unlabeled AI-generated texts, in particular, could pose new challenges to society. One example is the education system, parts of which have feared the death of homework since the introduction of ChatGPT.



And rightly so: large language models like ChatGPT are particularly good at restating frequently written and clearly documented knowledge in new words, in a compact, comprehensible, and largely error-free way. They are therefore tailor-made for school work, which is generally based on relatively basic existing knowledge.

Other examples of potentially harmful use of AI texts include sophisticated spam and the mass distribution of fraudulent content and propaganda on fake websites or social media profiles. All of this is already happening, but large language models could increase the quality and volume of this content.

OpenAI, the company behind ChatGPT and GPT-3, is therefore working to make AI-generated content detectable via a technical, statistical marking. The company is aiming for a future in which it will be much more difficult to pass off AI-generated text as written by a human.

The company is experimenting with a server-level cryptographic wrapper for AI text that can be recognized as a watermark with the help of a key. The same key is used both for the watermark and for the authenticity check.

“Empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text is from [an AI system]. In principle, you can even take a long text and isolate which parts are probably from [the system] and which parts probably aren’t,” says Scott Aaronson, professor of computer science at the University of Texas, who is currently a visiting scholar at OpenAI working on the system.
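How a keyed statistical watermark of this kind might work can be illustrated with a toy sketch. This is not OpenAI's implementation, which has not been published in detail. It assumes a hypothetical generator that chooses among a few equally plausible candidate tokens, and the function names (`keyed_score`, `generate`, `detect`) and the HMAC-based scoring are illustrative assumptions only.

```python
import hashlib
import hmac
import random


def keyed_score(key: bytes, prev_token: str, token: str) -> float:
    """Map a (previous token, token) pair to a pseudorandom value in [0, 1)."""
    message = f"{prev_token}\x00{token}".encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def generate(key, vocab, length, rng, watermark, candidates=4):
    """Toy text generator (assumption: stands in for a language model).

    At each step a few candidate tokens are drawn at random. With the
    watermark on, the candidate with the highest keyed score is chosen;
    otherwise one candidate is picked at random.
    """
    tokens = ["<start>"]
    for _ in range(length):
        options = rng.sample(vocab, candidates)
        if watermark:
            choice = max(options, key=lambda t: keyed_score(key, tokens[-1], t))
        else:
            choice = rng.choice(options)
        tokens.append(choice)
    return tokens[1:]


def detect(key, tokens):
    """Average keyed score of a text; values well above 0.5 suggest a watermark."""
    prev = "<start>"
    total = 0.0
    for token in tokens:
        total += keyed_score(key, prev, token)
        prev = token
    return total / len(tokens)


if __name__ == "__main__":
    secret = b"demo-key"  # hypothetical key, held only by the provider
    vocab = [f"w{i}" for i in range(1000)]
    marked = generate(secret, vocab, 300, random.Random(1), watermark=True)
    plain = generate(secret, vocab, 300, random.Random(2), watermark=False)
    print("marked text, right key:", round(detect(secret, marked), 3))
    print("plain text, right key: ", round(detect(secret, plain), 3))
    print("marked text, wrong key:", round(detect(b"wrong", marked), 3))
```

In this sketch, a few hundred tokens generated with the key should average well above the roughly 0.5 expected by chance, while unmarked text, or marked text checked with the wrong key, should stay close to 0.5. The same key serves both for marking and for checking.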

OpenAI researchers plan to present this system in more detail in a paper in the coming months. According to the company, it is also just one of the detection techniques currently being researched.

But even if OpenAI or another company manages to implement a working detection mechanism and the industry can agree on a standard, it probably won’t solve the AI transparency problem once and for all.

Stable Diffusion shows that open source generative AI can compete with commercial offerings. This could also apply to language models. In addition to labeling AI-generated content, an authentication system for human authorship may also be needed in the future.

This text was entirely written by a human (Matthias Bastian).
