Generate images on any hardware without Midjourney

Today there are many image generation services. Some are paid, others are free, but even the free ones usually impose restrictions: the number of images per unit of time, image resolution, and so on. The explanation is simple. The technology itself is widely available, but the hardware needed for generation remains expensive, and few people want to give away GPU resources for free. Still, the appeal of free things is hard to resist. So in this article we will learn how to generate images anyway, using just a browser and a few lines of code.

Hardware requirements

There aren't really any. You need a modern browser on your computer or tablet, which you most likely already have. You will also need a Google account to connect to Google Drive. That is enough.

Colab

We will need a free GPU. First, go to Colab at https://colab.research.google.com and, in the “File” menu, create a new notebook.

Next, connect the GPU. Open the “Runtime” menu and select “Change runtime type”. Choose any available accelerator other than the CPU. Most likely the T4 GPU will be available for free, and that is enough for us.

In the upper right corner of the notebook, find “Connect” and click it.
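Once connected, it can be worth confirming that a GPU is really attached before downloading anything. The following is a minimal sketch using only the standard library; the helper name gpu_visible is hypothetical, and it assumes the runtime ships the nvidia-smi tool, as GPU runtimes normally do.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi exists and lists at least one GPU.

    Hypothetical helper: assumes an NVIDIA GPU runtime where the
    nvidia-smi command-line tool is installed.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # tool not found: most likely a CPU-only runtime
    try:
        # "-L" asks nvidia-smi to list the GPUs it can see, one per line
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU attached:", gpu_visible())
```

If this prints False, the runtime type is probably still set to CPU.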

Install

In the first cell of the notebook, install the necessary dependencies:

!pip -q install git+https://github.com/huggingface/diffusers transformers accelerate -q

Now run the code. Click the run button to the left of the cell, or place the cursor in the cell and press CTRL+ENTER (COMMAND+ENTER on a Mac). Wait for the installation to finish.

Imports

Copy the following code into the next cell:

from diffusers import StableDiffusionXLPipeline
from diffusers.utils import make_image_grid
import torch

Run the cell.

Initialization

Next, copy the code into the following cell:

pipe = StableDiffusionXLPipeline.from_pretrained("segmind/SSD-1B", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

Run the cell. We are deliberately not using the largest model. Larger models exist, but we have limitations: on the free tier, storage space in Colab is limited. You can see the limits by clicking the indicator in the upper right corner labeled RAM and Disk.

We see that only about 15 GB is provided. For this model, though, that much free memory is quite enough.
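The disk figure from that panel can also be read from code. This is a small sketch using only the standard library; the helper name free_disk_gb is hypothetical, and it reports disk space only, not RAM or GPU memory.

```python
import shutil


def free_disk_gb(path: str = "/") -> float:
    """Return the free disk space at `path` in gigabytes (GiB)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 1024**3


print(f"Free disk space: {free_disk_gb():.1f} GB")
```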

Now for the magic!

Prompts

To get images, we need prompts. This implementation uses two of them. The positive prompt describes the task for the model: what we expect from it. The negative prompt describes what we do not want to get from the model.

prompt = "Fish plays the guitar" # Positive prompt
neg_prompt = "ugly, blurry, poor quality" # Negative prompt

It is also better to write prompts in English. The model was trained on English text, so it will understand Russian, but not as well as English. There is nothing complicated here: run the text through Google Translate and carry on.

In this example, we expect the model to produce a picture of a fish playing a guitar.

In the negative prompt, we indicated that we do not want the picture to be ugly, blurry, or of poor quality. Sounds reasonable.

Generation

We will need the following code in the new cell:

images = pipe(prompt=prompt, negative_prompt=neg_prompt, width=512, height=512, num_images_per_prompt=4).images
make_image_grid(images, rows=2, cols=2)

Let's not run this code right away; first, let's look at what is what. The call to pipe takes several parameters. prompt and negative_prompt are already clear: we discussed them earlier.

width – image width

height – picture height

The values of these parameters must be multiples of 8. It is better not to set them above 1024. For 4 images in one pass, a resolution of 512 is optimal. Anything larger increases memory consumption, and the system will raise OutOfMemoryError and print a traceback. You will then need to open the “Runtime” menu again, restart the runtime, and re-run every cell in the notebook in order.

If you need an image with a resolution of 1024, it is better not to generate more than 2 per pass. The higher the resolution, the more detailed the picture will be. Experiment.
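The size rules above can be turned into a couple of small helpers. This is only a sketch: safe_size and images_per_pass are hypothetical names, not part of diffusers, and the 4-versus-2 split simply encodes the rule of thumb from the text (everything above 512 is treated conservatively as if it were 1024).

```python
def safe_size(value: int, step: int = 8, maximum: int = 1024) -> int:
    """Clamp value to [step, maximum] and round it down to a multiple of step."""
    clamped = max(step, min(value, maximum))
    return clamped - clamped % step


def images_per_pass(width: int, height: int) -> int:
    """Rule of thumb from the text: 4 images up to 512 px, otherwise 2."""
    return 4 if max(width, height) <= 512 else 2


width, height = safe_size(700), safe_size(1030)
print(width, height, images_per_pass(width, height))  # 696 1024 2
```

The results can then be passed to the pipeline as width, height, and num_images_per_prompt.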

num_images_per_prompt – the number of images that will be generated in one run of this code.

Executing the first line of code returns a list of image objects, 4 of them in this case. To view them together, we use the function make_image_grid, which arranges the pictures in a 2×2 table.

Run the cell.
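To get a feel for how such a grid is laid out, here is a sketch of the placement arithmetic. It is not the library's source code: grid_offsets is a hypothetical function that only computes where the top-left corner of each picture would land, assuming the pictures fill the grid row by row and all have the same size.

```python
def grid_offsets(count: int, rows: int, cols: int, width: int, height: int):
    """Return the (x, y) top-left corner for each of `count` pictures."""
    if count != rows * cols:
        raise ValueError("the number of pictures must equal rows * cols")
    # Picture i goes to column i % cols and row i // cols
    return [((i % cols) * width, (i // cols) * height) for i in range(count)]


print(grid_offsets(4, rows=2, cols=2, width=512, height=512))
# [(0, 0), (512, 0), (0, 512), (512, 512)]
```

With four 512×512 pictures, the resulting canvas would be 1024×1024.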

Each time you run this code you will get different images. You can see this for yourself.

To make it easier to tweak things and run everything from one cell, let's combine the code in a new cell as follows:

prompt = "Fish plays the guitar" # Positive prompt
neg_prompt = "ugly, blurry, poor quality" # Negative prompt

images = pipe(prompt=prompt, negative_prompt=neg_prompt, width=512, height=512, num_images_per_prompt=4).images
make_image_grid(images, rows=2, cols=2)

Saving the result

Recall that images is a list of pictures, 4 of them in our case. So, to get a specific picture from the list, execute the following in the next cell:

images[i]

Here i is the index of the picture in the list, starting from 0. That is, to get the first picture from the list, run the code

images[0]
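Displaying a picture is not the same as saving it to a file. Below is a minimal sketch of writing every picture to disk, assuming the pipeline returns PIL images (its default output), which provide a save method. The function name save_images, the folder name, and the file name pattern are all hypothetical.

```python
from pathlib import Path


def save_images(images, folder: str = "results", prefix: str = "image") -> list:
    """Save each picture as <folder>/<prefix>_<index>.png and return the paths."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, image in enumerate(images):
        path = out_dir / f"{prefix}_{index}.png"
        image.save(path)  # PIL picks the PNG format from the file extension
        paths.append(str(path))
    return paths


# In the notebook, after the generation cell has run:
# save_images(images, prefix="fish")
```

The saved files should then be available in the notebook's file browser, from which they can be downloaded.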

Styles

The style, framing, and other properties of the pictures are determined by the prompt. You can try this prompt:

prompt = "A brave cat protects the Galaxy from aliens. Anime style. Close up." # Positive prompt
neg_prompt = "ugly, blurry, poor quality" # Negative prompt

images = pipe(prompt=prompt, negative_prompt=neg_prompt, width=512, height=512, num_images_per_prompt=4).images
make_image_grid(images, rows=2, cols=2)

Run the cell.

From here on, everything depends only on your imagination. Experiment with prompts and generate, generate, generate.
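When trying many styles, it can be convenient to assemble the prompt from parts instead of retyping it. This is just one possible approach; build_prompt is a hypothetical helper, and the model sees nothing but the final string.

```python
def build_prompt(subject: str, *modifiers: str) -> str:
    """Join a subject and optional style/framing notes into one prompt string."""
    parts = [subject.strip().rstrip(".")]
    # Skip empty modifiers and drop trailing periods to avoid doubled dots
    parts += [m.strip().rstrip(".") for m in modifiers if m.strip()]
    return ". ".join(parts) + "."


prompt = build_prompt(
    "A brave cat protects the Galaxy from aliens", "Anime style", "Close up"
)
print(prompt)  # A brave cat protects the Galaxy from aliens. Anime style. Close up.
```

Swapping "Anime style" for another style keeps the subject and framing unchanged.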
