How much do AI-generated images differ from their training material? A study of diffusion models aims to answer this question.
Debates about the art that Stable Diffusion, DALL-E 2 and Midjourney do or do not create have accompanied these tools since their inception. Probably the most intense debate concerns copyright, particularly with regard to the training material used. Recently, the popular “trending on ArtStation” prompt has been at the center of protests against AI images. It relies on imitating popular artworks on ArtStation.
Diffusion models trained on large and broad datasets are supposed to ensure that each new prompt produces a unique image that is far removed from the originals in the training data. But is this really the case?
Dataset size and replication are related
Researchers from New York University and the University of Maryland have addressed this question in a new paper titled “Diffusion Art or Digital Forgery?”
They examined several diffusion models trained on different datasets such as Oxford Flowers, Celeb-A, ImageNet and LAION. They wanted to find out how factors such as the amount of training data and the training procedure affect the rate of image replication.
At first glance, the results of the study are not surprising: diffusion models trained on smaller datasets are more likely to produce images that are copied from, or very similar to, the training data. The amount of replication decreases as the size of the training set increases.
The study only looked at a small portion of the training data
With the twelve million images of “12M LAION Aesthetics v2 6+”, the researchers examined only a small portion of Stable Diffusion’s two-billion-image training dataset. They found that models like Stable Diffusion in some cases “blatantly copy” from their training data.
However, near-exact reproduction of training data is not inevitable, as older studies of generative models such as GANs have shown, the paper states. The team confirms this with a latent diffusion model (LDM) trained on ImageNet, where there is no evidence of significant data replication. So what does Stable Diffusion do differently?
Copies are not the norm, but they occur often enough
The researchers suspect that the replication behavior in Stable Diffusion results from a complex interplay of factors, such as the model being conditioned on text rather than on class labels, and the training dataset having a skewed distribution of image repetitions.
In random samples, about two out of every 100 generated images were very similar to images in the dataset (similarity score > 0.5).
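To make the “similarity score > 0.5” criterion more concrete, here is a minimal Python sketch of how such a replication rate could be computed. It is an illustration under stated assumptions, not the researchers’ actual pipeline: it assumes every image has already been turned into a feature vector by some image feature extractor, and it uses cosine similarity as the score. The function names (`best_match_similarity`, `replication_rate`) are hypothetical, and random vectors stand in for real image features.

```python
import numpy as np


def l2_normalize(features):
    """Scale each row (one feature vector per image) to unit length."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.clip(norms, 1e-12, None)


def best_match_similarity(generated, training):
    """For each generated image, return the highest cosine similarity
    to any training image and the index of that training image."""
    sims = l2_normalize(generated) @ l2_normalize(training).T
    return sims.max(axis=1), sims.argmax(axis=1)


def replication_rate(generated, training, threshold=0.5):
    """Fraction of generated images whose best match exceeds the threshold.
    The 0.5 default mirrors the threshold quoted in the article."""
    best, _ = best_match_similarity(generated, training)
    return float((best > threshold).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data (assumption): random 512-dimensional "features".
    training = rng.normal(size=(1000, 512))
    generated = rng.normal(size=(100, 512))
    # Turn two generated images into near-copies of training images.
    generated[:2] = training[:2] + 0.1 * rng.normal(size=(2, 512))
    print(f"replication rate: {replication_rate(generated, training):.2%}")
```

In practice the outcome depends heavily on which feature extractor produces the vectors, which is one reason a measurement like this can miss copies.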
The researchers summarize: “The purpose of this study was to assess whether diffusion models are able to reproduce high-fidelity content from their training data, and we find that they are. Although typical images from large-scale models do not appear to contain copied content detectable using our feature extractors, the copies seem to occur often enough that their presence cannot be safely ignored.”
Since only 0.6% of Stable Diffusion’s training data was searched for matches, there are likely many replicated examples that can only be found in the full dataset, the researchers write. In addition, the methods used may not detect all cases of data replication.
For both of these reasons, the paper states, “the results here consistently underestimate the amount of replication in Stable Diffusion and other models.”