Picture: Warner Bros. / Screenshot
Currently, Stable Diffusion is just a powerful AI image generator. Long-term plans go way beyond that.
Generative AI is clearly on the rise: from text-to-image to text-to-HD-video and text-to-3D, AI systems can generate more and more media formats, some of them fully automatically. New models appear almost every week and are constantly being improved.
Additionally, generative AI tools are increasingly facilitating the digitization of the real world. Relatively simple apps for PCs and smartphones use NeRF (Neural Radiance Fields) technology to generate a volumetric 3D scene from a few photos of an object or room.
Based on current trends, generative artificial intelligence is going to be a powerful engine of digitization, dramatically increasing both the quantity and the quality of digital content. The ultimate tool would be a single model for creating many types of media, one that professionals and non-professionals alike could control using natural language.
Stable Diffusion head thinks an AI-generated Holodeck is achievable in a few years
Stability AI CEO Emad Mostaque’s comments in a Reddit AMA should be seen in the context of the above thesis. Stability AI is the startup behind the open-source image AI Stable Diffusion mentioned at the beginning of this article.
Mostaque cites an experience similar to the Oasis from the VR sci-fi movie Ready Player One or the famous Star Trek Holodeck as the goal of the company’s generative AI models.
This AI system should remain open source so that everyone can “create anything they can imagine”. This, he said, requires “full multimodality” in AI models, that is, generative AI systems trained on many content and data formats.
Mostaque says Stability AI is already in talks about data acquisition with game studios and other companies that have access to 3D data. “Yeah, we’ll be doing something like the Holodeck in a few years,” Mostaque says when asked about generative AI for VR and gaming.
Midjourney CEO David Holz expressed similar thoughts not too long ago. He expects AI-powered real-time video games to emerge in ten years. Recently, a developer gave a glimpse of how Stable Diffusion could be implemented in VR worlds.
Mostaque hints at better models and a possible copyright solution for Stable Diffusion
For the near future, Mostaque announced further significant improvements to Stable Diffusion. Stability AI is currently training models with billions of parameters, which will then be optimized.
“You can think of it like bulking and cutting, because then we optimize those. Personally, I expect the models to run at the cutting edge in the future with much higher quality than MJ v4 or DALLE 2. The future being next year or two,” says Mostaque.
The CEO also addresses criticism of the current model, which uses copyrighted data for AI training. This allows it to generate images in the style of renowned artists if their names are included in the prompt. Competitors DALL-E 2 and Midjourney work the same way.
“We are working on fully licensed datasets as well as opt-out mechanisms for future model development that we run and support. We will soon make some announcements on this subject. It should be noted that these models are unlikely to ‘mature’ next year, so they will be regularly updated,” says Mostaque.
According to Mostaque, Stability AI is also in discussions with governments about open-source datasets and models, and is working on international AI education initiatives.