Stable Diffusion is the most important neural network in the history of generative art

Stability.ai has announced the public release of Stable Diffusion, its image-generating neural network. You might think this is just one more announcement of yet another ordinary neural network arriving in the art world. It is far from that, for two reasons, one of which you can already see in the hubs.


HollyB#1382 (portrait)

First, unlike DALL·E 2 and Midjourney, which are comparable in quality, Stable Diffusion is open source. Anyone can therefore build applications on top of it, free of charge, for specific text-to-image tasks.

People are already building Google Colab notebooks (from Deforum and Pharmapsychotic), a Figma plugin that creates designs from text descriptions, and Lexica.art, a search engine for text descriptions, generated images, and seed values. In addition, the developers of Midjourney made it possible to combine their neural network with Stable Diffusion, which led to amazing results (the feature is temporarily disabled, but may return once the potential for malicious use of the combination has been dealt with):

Midjourney + Stable Diffusion; alessandrochille, Darkens, eyecon01

As I write these lines, fewer than three days have passed since the release of Stable Diffusion. It’s hard to imagine what might happen in the weeks and months ahead.

Second, unlike DALL·E mini (Craiyon) and Disco Diffusion, which are comparably open, Stable Diffusion can produce photorealistic and artistic images that are not inferior in quality to the output of OpenAI’s and Google’s models. Many even describe it as a state-of-the-art “generative search engine,” as Stability.ai founder Emad Mostaque likes to call such systems.

To give you an idea of Stable Diffusion’s level of artistry and technology, I’ll include a few of my favorite images from the Discord communities (all of them were generated by Stable Diffusion unless otherwise noted).

ai_coo#2852 (street art)

Stable Diffusion embodies the best of what neural networks have brought to the art world: it is arguably the best open-source image-generating neural network in existence. It has no equal yet, and a great future surely awaits it.

In my articles I have often written about neural networks that are still in development, years before they become suitable for everyday use. Such models are interesting only in theory, whereas Stable Diffusion is interesting both in theory and in practice. It combines the achievements of modern research with real-world use. Applications built on it are already appearing, and very soon you will be able to use them for tasks both serious and not so serious.

Curiously, news about such services may reach you from people you would never expect: parents, children, spouses, friends, and colleagues. In short, people with no connection to the world of image-generating neural networks may suddenly learn about its latest developments. Through images, AI can reach even those who are used to overlooking how quickly the future is approaching. Isn’t that poetic?

HollyB#1382 (seascape)

Stable Diffusion is more than an open-source DALL·E 2


Stability.ai was founded to “develop open neural networks to realize our capabilities.” Its models are not research prototypes that most people will never see; they are tools that anyone can use. This sets the company apart from OpenAI, which guards the secrets of its best systems (GPT-3 and DALL·E 2) like the secrets of the universe, and from Google, which does not even plan beta releases of its own systems (PaLM, LaMDA, Imagen, and Parti). A few months ago I was already hearing rumors that Stability.ai had built something more than the DALL·E 2 alternative it had planned.

Stability.ai founder Emad Mostaque learned from OpenAI’s mistakes. The fact that Craiyon went viral, for example, shows that the closed beta of DALL·E had real shortcomings. People do not want to watch masterpieces being created; they want to create them. Because the code and model weights on their own mean little to most users, Stability.ai took one more important step: it created a ready-to-use online platform for those who cannot or do not want to code.

Twobob#2909 (nature)

This platform is called DreamStudio Lite. It lets you generate up to 200 images for free to get a feel for what Stable Diffusion can do. As with DALL·E 2, there is also a paid tier: for £10 you can create up to 1000 images (OpenAI tops up 15 free credits once a month, but to get more you need to buy 115 credits for $15). To put these prices on a common footing: an image costs about 3 cents ($0.03) in DALL·E and only about 1 penny (£0.01) in Stable Diffusion.

Stable Diffusion can also be used through an API (the cost scales linearly: 100 generated images will cost you £1). Beyond image generation, Stability.ai will soon announce DreamStudio Pro (audio/video) and Enterprise (studios).
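The per-image prices quoted above can be checked with a few lines of arithmetic. The Python sketch below uses only the prices mentioned in this article. The figure of four images per DALL·E credit is an assumption that the article does not state (it is what makes the quoted 3 cents per image work out), and all function names are hypothetical.

```python
# Prices quoted in the article.
DALLE_CREDITS = 115            # credits in one paid bundle
DALLE_BUNDLE_USD = 15.0        # price of that bundle
IMAGES_PER_CREDIT = 4          # ASSUMPTION: not stated in the article

DREAMSTUDIO_IMAGES = 1000      # images in the paid tier
DREAMSTUDIO_PRICE_GBP = 10.0   # price of that tier

API_PRICE_GBP_PER_100 = 1.0    # "100 generated images will cost you GBP 1"


def dalle_cost_per_image_usd() -> float:
    """Cost of one DALL-E 2 image in US dollars, under the assumption above."""
    return DALLE_BUNDLE_USD / (DALLE_CREDITS * IMAGES_PER_CREDIT)


def dreamstudio_cost_per_image_gbp() -> float:
    """Cost of one DreamStudio image in pounds sterling."""
    return DREAMSTUDIO_PRICE_GBP / DREAMSTUDIO_IMAGES


def api_cost_gbp(n_images: int) -> float:
    """API cost in pounds; scales linearly with the number of images."""
    return n_images * API_PRICE_GBP_PER_100 / 100


if __name__ == "__main__":
    print(f"DALL-E 2:    ${dalle_cost_per_image_usd():.3f} per image")
    print(f"DreamStudio: GBP {dreamstudio_cost_per_image_gbp():.3f} per image")
    print(f"API, 250 images: GBP {api_cost_gbp(250):.2f}")
```

Note that the two prices are in different currencies, so the comparison is only approximate unless you apply an exchange rate.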

It is also worth noting that, besides creating images from a text description, DreamStudio will soon be able to generate one image from another, again guided by a text description. Here are some examples:

clif08#7318

symmetry#5379

Neverduft#5541

The same site offers a guide to writing text descriptions, which will be useful to anyone new to this (after all, finding a “common language” with these models is extremely difficult). And unlike DALL·E 2 (and even Craiyon), DreamStudio lets you steer the result more precisely through generation parameters.

Stability.ai has done everything it can to make its models easy to access. OpenAI was the pioneer, so its progress was slowed by the need to assess every risk and every case of model bias. Still, OpenAI did not need to drag out the closed beta for so long or to adopt a subscription business model that limits creative freedom. Midjourney and Stable Diffusion have already proven as much.

RobotElbows#3572 (ukiyo-e style)

Openness and security are more important than privacy and control


Open-source technologies also have their limits. As I wrote in the article “GPT-4chan ‘the Worst AI Ever’”, openness is more important than privacy and tight control, but it must never come at the expense of security.

Stability.ai takes this seriously. It worked with the lawyers and ethicists of the Hugging Face community to release the model under the Creative ML OpenRAIL-M license (whose terms are close to those of the license for BigScience’s BLOOM model). As the announcement states, this is a “liberal license for commercial and non-commercial use” that provides for open but responsible downstream use of the model. However, any derivative works must be distributed under terms that are no less restrictive.

not#2122 (stained glass)

Releasing a model as open source is a very important step, but it is just as important to build safeguards that keep the model from becoming a tool for deception or for self-assertion at the expense of other people’s rights. Such undesirable outcomes are nevertheless possible even without violating the terms of the license. On his blog, Emad Mostaque put it this way: “Since we trained these models on image-text pairs found on the wide expanses of the World Wide Web, the model can reproduce some of the prejudices of society and create dangerous content, so mitigating this effect and openly discussing such distortions can lead everyone to the right dialogue.” In any case, openness and security matter more than privacy and control.

Open source – new horizons


With strong ethical values and openness, Stable Diffusion aims to outdo its competitors in global impact. If you want to download the neural network and run it locally, note that it needs 6.9 GB of VRAM, which means a high-end consumer GPU. That is lighter than DALL·E 2, but still a heavy load for the computers of most ordinary users. If your machine is not up to it, you can, like me, use DreamStudio.
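Whether a machine clears the 6.9 GB bar can be checked before downloading anything. The sketch below is only an illustration: it assumes PyTorch is installed and that the first CUDA device is the one you would use, it treats “GB” as 10^9 bytes (the article does not say which unit is meant), and the helper names are hypothetical. If PyTorch or a CUDA GPU is missing, it simply reports that it cannot tell.

```python
from typing import Optional

REQUIRED_VRAM_GB = 6.9    # figure quoted in the article
BYTES_PER_GB = 10**9      # ASSUMPTION: decimal gigabytes


def has_enough_vram(total_bytes: int, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Return True if the given amount of video memory meets the requirement."""
    return total_bytes / BYTES_PER_GB >= required_gb


def detect_vram_bytes() -> Optional[int]:
    """Return the total memory of the first CUDA device, or None if unknown."""
    try:
        import torch  # optional dependency; absent on many machines
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory


if __name__ == "__main__":
    vram = detect_vram_bytes()
    if vram is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    elif has_enough_vram(vram):
        print(f"{vram / BYTES_PER_GB:.1f} GB of VRAM: enough to run locally.")
    else:
        print(f"{vram / BYTES_PER_GB:.1f} GB of VRAM: below the 6.9 GB requirement.")
```

Total memory is an upper bound: other programs using the GPU reduce what is actually free, so a card that barely passes this check may still run out of memory.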

pontap#4224 (watercolor)

Widely recognized as the best generative model, Stable Diffusion will become the foundation for countless apps, sites, and services that will change the way people create and work with art. Until now, decent results required DALL·E 2 or Midjourney, both limited by being completely closed (Craiyon is better suited to memes and falls short of professional quality).

Now applications for all kinds of tasks will spring up like mushrooms after rain, and everyone will be able to use them. People are already improving children’s drawings, assembling collages with outpainting and inpainting, designing magazine covers, drawing cartoons, making transformation and animation videos, and generating one image from another.

Many of these features are available in DALL·E and Midjourney too, but Stable Diffusion takes image generation to the next level. Andrej Karpathy shares this view: “I consider the day of the release of Stable Diffusion to be historical for human creativity, compressed into a single and public artifact. This is a significant part of the phase transition to the fusion of the work of natural and artificial intelligence, an area in which we have not moved an inch before.”

Stable Diffusion leads to a very important dialogue


Global change does not please everyone. As I wrote in my article on image-generating neural networks, “How Today’s AI Art Debate Will Shape the Creative Landscape of the 21st Century”: “… we are facing a very difficult situation – and open source is only making things worse. Artists and other creative minds are sounding the alarm, and for good reason. Many of them will lose their jobs because they will not be able to compete with modern programs. Firms such as OpenAI, Midjourney and Stability.ai have built their success on the work of many artists. And instead of rewarding those artists, they have let the entire audience of their neural networks ride on the artists’ backs without asking.”

As I wrote in the same article, Stable Diffusion belongs to a new class of software tools. To understand it, you need to adapt your thinking to new realities. The consequences of such neural networks cannot be reliably predicted by analogy with the past. Some of them we have already seen, and others we will see for the first time. The future that awaits us is uncharted territory and must be treated as such.

HollyB#1382 (portrait)

Conclusion


The public release of Stable Diffusion is, without a doubt, the most significant event so far in the world of image-generating neural networks. And this is just the beginning. Emad Mostaque posted on Twitter: “Because our models are faster, better and more specific, we can expect their quality to go up across the board. Not just images, but audio from next month, and then we’ll move on to 3D and video. Language, code and more machine learning now…”

We are on the verge of a revolution that will last for several years and will change how we understand images and creativity in general, how we interact with them, and how we feel about them. It will do so not only on a philosophical and intellectual level, but as something ordinary that each of us experiences. The world of creativity will never be the same again, and we must stay open to the new and respect one another in order to build this bright future together. Only a responsible attitude toward open-source technologies will bring the changes we will be glad to see.

Joe#5956 (Cityscape)

Try a beautiful text description of your own. The result may surprise you.

