Stable Diffusion 2.0 is a comprehensively improved version


Version 2.0 of Stable Diffusion brings many advances. The most important new feature is the improved OpenCLIP text encoder.

In August 2022, AI startup Stability AI, in collaboration with RunwayML, LMU Munich, EleutherAI and LAION, released Stable Diffusion, an open source image AI that was immediately well received by the community.

Stable Diffusion can be used online for a fee and with content filters, or downloaded for free and used locally with no content restrictions. Version 2.0 continues this open source approach, with Stability AI leading the way.

Improved text encoder and new image modes

For version 2.0, the team used OpenCLIP, an enhanced version of CLIP (Contrastive Language-Image Pre-training), the multimodal AI system that learns visual concepts from natural language in a self-supervised way. LAION released OpenCLIP in three versions in mid-September, and it is now integrated into Stable Diffusion. Stability AI supported the training of OpenCLIP. CLIP models can compute representations of images and text as embeddings and compare their similarity. This way, an AI system can generate an image that matches a given text.
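
To make the similarity comparison concrete, here is a minimal sketch in Python. It assumes the embeddings already exist as plain lists of numbers. The three-dimensional vectors and the helper names `cosine_similarity` and `best_caption` are made up for illustration and are not part of OpenCLIP or Stable Diffusion; real CLIP embeddings have hundreds of dimensions and come from the trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding is most similar to the image embedding."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(
            image_embedding, caption_embeddings[caption]
        ),
    )


# Hypothetical toy embeddings, invented for this example only.
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
}

print(best_caption(image_embedding, caption_embeddings))
```

With these toy values the image vector points in nearly the same direction as the first caption's vector, so that caption is selected.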


Thanks to this new text encoder, Stable Diffusion 2.0 can generate significantly better images than version 1.0, according to Stability AI. The model can generate images with resolutions of 512×512 and 768×768 pixels, which are then upscaled to 2048×2048 pixels by an upscaler diffusion model that is also new.

The new upscaler in action: The left image has 128×128 pixels, the right image has been upscaled to 512×512. | Image: Stability AI

The new OpenCLIP model was trained with an “aesthetic dataset” compiled by Stability AI based on the LAION-5B dataset. Sexual and pornographic content was filtered out beforehand.

Also new is a depth-to-image model, which analyzes the depth of an input image and then uses text input to transform it into new motifs that keep the contours of the original image.

Depth analysis allows Stable Diffusion 2.0 to accurately transform existing motifs into new motifs that resemble the original image. | Image: Stability AI

Stable Diffusion version 2.0 also gets an inpainting model that can be used to replace individual elements in an existing image, such as painting a cap or VR headset onto a person's head.
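
The idea of replacing only selected image elements can be illustrated with a mask. The following sketch is a simplified illustration, not the way the Stable Diffusion model works internally: it assumes tiny grayscale images stored as nested lists and a hypothetical helper `composite` that takes newly generated pixels where the mask is 1 and keeps the original pixels everywhere else.

```python
def composite(original, generated, mask):
    """Use generated pixels where mask is 1, original pixels where mask is 0."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("all inputs must have the same number of rows")
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("all rows must have the same length")
        result.append(
            [g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)]
        )
    return result


# Hypothetical 3x3 grayscale images, invented for this example only.
original = [
    [10, 10, 10],
    [10, 10, 10],
    [10, 10, 10],
]
generated = [
    [99, 99, 99],
    [99, 99, 99],
    [99, 99, 99],
]
# Only the top-middle pixel is marked for replacement.
mask = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]

for row in composite(original, generated, mask):
    print(row)
```

Everything outside the masked region stays untouched, which is why only the marked element changes while the rest of the picture is preserved.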

