Creating Isometric Game Levels Using Stable Diffusion

Hi all. Today I'll show you how to create 2.5D isometric levels using a rapid prototyping technique called grayboxing and generative AI, namely Stable Diffusion. Almost the entire process described in this article can easily be automated.

Step 1: Prototype the level geometry using grayboxing

Grayboxing is a game design technique in which the geometry of a level is prototyped using simple primitives. This step can be done in Unity or Blender. Creating the geometry directly in Unity (using primitives or ProBuilder) lets us immediately bake the navmesh and test the level's navigation and logic before we move on to the visuals.
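
If you prefer to script the graybox rather than place primitives by hand, here is a minimal sketch of building a level out of 1×1 unit cubes with Blender's Python API (the floor plan below is a made-up placeholder):

import bpy

# Hypothetical floor plan: 0 = empty, 1 = floor tile, 2 = wall (stacked one cube higher).
layout = [
    [1, 1, 1, 2],
    [1, 0, 1, 2],
    [1, 1, 1, 2],
]

for y, row in enumerate(layout):
    for x, cell in enumerate(row):
        if cell == 0:
            continue
        height = 1 if cell == 1 else 2
        for z in range(height):
            # One 1x1x1 cube per grid cell, centered on the cell.
            bpy.ops.mesh.primitive_cube_add(size=1.0, location=(x + 0.5, y + 0.5, z + 0.5))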

Step 2: Export the level graybox render

The next step is to export a render of the level geometry from the angle at which the player will see it. This is necessary so that later we can correctly project the level texture onto the geometry. For example, we use a camera with rotation angles (54.7°, 0°, 45°) in Blender coordinates and (35.3°, 135°, 0°) in Unity, respectively. To create the screenshot, a separate scene with predefined lighting and camera settings is used.
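
As a sketch, this is how the camera and render could be set up in Blender with Python using the angles above; the camera position, orthographic scale and output path are placeholders to adjust for your scene:

import bpy
from math import radians

scene = bpy.context.scene

# Create a camera with the isometric-style rotation (54.7°, 0°, 45°).
cam_data = bpy.data.cameras.new("IsoCamera")
cam_data.type = 'ORTHO'            # assumption: orthographic projection for a clean 2.5D look
cam_data.ortho_scale = 12.0        # placeholder: widen/narrow to fit the level
cam = bpy.data.objects.new("IsoCamera", cam_data)
scene.collection.objects.link(cam)
cam.rotation_euler = (radians(54.7), 0.0, radians(45.0))
cam.location = (10.0, -10.0, 10.0)  # placeholder: move the camera back along its view axis

# Render the graybox screenshot at the resolution used later for generation.
scene.camera = cam
scene.render.resolution_x = 1200
scene.render.resolution_y = 1200
scene.render.filepath = "//graybox_render.png"
bpy.ops.render.render(write_still=True)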

Option 2: Export depth map directly from Unity/Blender

As an additional option, instead of exporting the lit graybox geometry, you can render the depth map directly from Blender or Unity (using DepthTextureMode.Depth). This lets you skip the depth map generation step later. In some cases this can give a better result when generating the level texture, but I didn't notice a significant difference.
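
For the Blender route, one way to get such a depth map is to enable the Z pass and normalize it in the compositor; a rough sketch (socket names can differ slightly between Blender versions):

import bpy

scene = bpy.context.scene
bpy.context.view_layer.use_pass_z = True          # enable the Z (depth) pass

scene.use_nodes = True
tree = scene.node_tree
tree.nodes.clear()

rl = tree.nodes.new("CompositorNodeRLayers")
norm = tree.nodes.new("CompositorNodeNormalize")   # remap raw depth to the 0..1 range
inv = tree.nodes.new("CompositorNodeInvert")       # MiDaS-style depth maps are bright up close
comp = tree.nodes.new("CompositorNodeComposite")

tree.links.new(rl.outputs["Depth"], norm.inputs[0])
tree.links.new(norm.outputs[0], inv.inputs["Color"])
tree.links.new(inv.outputs[0], comp.inputs["Image"])

scene.render.filepath = "//graybox_depth.png"
bpy.ops.render.render(write_still=True)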

Level geometry created using 1×1 unit cubes

Installation and preparation of Stable Diffusion

I will not describe the Stable Diffusion installation process in detail; many articles have already been written on this topic. In this article we will use Stable Diffusion WebUI, aka Automatic1111. However, nothing prevents you from using ComfyUI or calling Stable Diffusion directly from a Python script to automate the process.

We run Stable Diffusion WebUI in a Docker container on Paperspace GPU servers with an Ampere A4000 (16 GB VRAM).

We will use Stable Diffusion XL (SDXL 1.0) as the base model. You can download it from Hugging Face (copy it to the models/Stable-diffusion folder).

We will also need two extensions – LoRA (already built into Automatic1111) and ControlNet (link). To install ControlNet support in Automatic1111, simply go to the Extensions tab, copy the git repository link into the “URL for extension's git repository” field, then click Install and restart Automatic1111.

LoRA (Low-Rank Adaptation) models are lightweight models that can be used on top of the base model to obtain a specific style, element, character, or otherwise influence the result of image generation. Checkpoints and LoRAs are what the Stable Diffusion community lives and breathes today. Civitai is the best-known site for finding and sharing checkpoints, LoRAs and more. We will use a LoRA called “Zavy's Cute Isometric Tiles – SDXL” – it produces isometric art, which is exactly what we need. However, nothing prevents you from training your own LoRA.
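
For reference, a downloaded LoRA (placed in the models/Lora folder) is usually activated in Automatic1111 by adding a tag to the prompt together with the LoRA's trigger word, for example (the file name here is a placeholder, and the 0.8 weight is something to experiment with):

<lora:ZavysCuteIsometricTiles:0.8>, zavy-ctsmtrc, isometric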

ControlNets – like LoRAs, these are models used in addition to the base one to control the generation result. Unlike LoRAs, they do not affect the image style; instead they let you influence its shape. For example, you can use an OpenPose skeleton to put a character into a specific pose. You can download various ControlNet models for SDXL from Hugging Face.

We will use diffusers_xl_depth_full.safetensors to drive the level layout with a depth map. Download it to the models/ControlNet folder.

Step 3: Generate a depth map

Go to the ControlNet tab and select our geometry screenshot. Check Enable, then set Control Type to Depth. Choose depth_midas as the preprocessor and the model we installed – diffusers_xl_depth_full.safetensors.

Set the control network weight (Control Weight) to 0.9. The higher this value, the more closely the result follows the input image, but if it is too high, the variety is lost and the result becomes boring.

We will also enable My Prompt is more important, which slightly increases the influence of the text prompt.

Click the orange icon between the preprocessor and the model – this generates the depth map and shows a preview of it.

If you exported the depth map directly from Blender/Unity instead of the lit graybox, you can skip the generation step and simply select none as the preprocessor.

At this point the ControlNet setup is complete and we are ready to generate our image.

Step 4: Generating the Level Texture

For the main generation I used the following settings:

  • Sampling method: DPM++ 3M SDE Exponential

  • Sampling steps: 40

  • Width, Height: 1200, 1200

  • CFG Scale: 7

Prompt:

An ancient, dimly lit chamber with ornate carvings on the walls, Pressure plates are visible on the floor, surrounded by torches emitting flickering light, ancient, dimly lit, chamber, ornate carvings, pressure plates, torches, isometric, zavy-ctsmtrc, art, masterpiece, breathtaking, monument valley, lara croft go, empty background,

Negative prompt:

photography, cropped, crop

You can increase the Batch Size and, by tweaking the prompt, pick the design you need.
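
Since the whole point is automation, here is a hedged sketch of driving the same generation through the WebUI API (launch the WebUI with the --api flag). The exact ControlNet unit fields depend on the extension version, and the file names below are placeholders:

import base64
import requests

with open("graybox_depth.png", "rb") as f:
    depth_b64 = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "ancient, dimly lit chamber, ornate carvings, pressure plates, torches, "
              "isometric, zavy-ctsmtrc, masterpiece, empty background",  # use the full prompt from above
    "negative_prompt": "photography, cropped, crop",
    "sampler_name": "DPM++ 3M SDE Exponential",
    "steps": 40,
    "width": 1200,
    "height": 1200,
    "cfg_scale": 7,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "enabled": True,
                "image": depth_b64,
                "module": "none",                    # the depth map is already prepared
                "model": "diffusers_xl_depth_full",  # exact name as shown in the WebUI dropdown
                "weight": 0.9,
                "control_mode": "My prompt is more important",
            }]
        }
    },
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
r.raise_for_status()
with open("level_texture.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))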

Stable Diffusion WebUI

Example images

Step 5: Export albedo, specular and normal maps

To create realistic 2.5D lighting, we will need several texture maps. You can create a normal map using img2img, or simply with the Normal Map preprocessor, similar to how we created the depth map in step 3. Exporting a separate albedo map lets you “remove the lighting” and reapply it later in the game engine.
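
If you already have a depth map, you can also derive an approximate normal map from it offline instead of going through Stable Diffusion; a minimal sketch with NumPy and Pillow (the strength value and file names are placeholders):

import numpy as np
from PIL import Image

# Load the depth/height map as a grayscale float array in the 0..1 range.
depth = np.asarray(Image.open("graybox_depth.png").convert("L"), dtype=np.float32) / 255.0

# Screen-space gradients approximate the surface slope.
strength = 2.0
dy, dx = np.gradient(depth)
nx, ny, nz = -dx * strength, -dy * strength, np.ones_like(depth)

# Normalize and pack into the usual 0..255 tangent-space encoding (flat surface = pale blue).
length = np.sqrt(nx * nx + ny * ny + nz * nz)
normal = np.stack([nx / length, ny / length, nz / length], axis=-1)
Image.fromarray(((normal * 0.5 + 0.5) * 255).astype(np.uint8)).save("normal_from_depth.png")

Depending on the engine, you may need to flip the green channel (OpenGL vs DirectX normal map conventions).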

Unfortunately, I did not find an open-source solution for exporting albedo and specular maps. This can be done, for a fee, with the SwitchLight service. Write in the comments if you know of ways to do this with Stable Diffusion.

Albedo, Specular, Normal map

Step 6: Finishing Textures and Removing Detail Using Inpaint

Our texture is almost ready, but for gameplay it would be nice to remove some elements such as the background or the fire (we will add it back with particles). You can also modify the texture for interactive elements such as doors, buttons, levers, etc. The Inpaint tool, located in the img2img tab, is great for this.

Set Masked content to fill and use settings similar to the original generation. As the prompt, describe what you want to paint in; for example, for a closed door: closed stone door, zavy-ctsmtrc, isometric,
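
Inpainting can also be scripted through the img2img endpoint by passing the texture and a mask; a hedged sketch (field names may vary slightly between WebUI versions, file names are placeholders):

import base64
import requests

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "init_images": [b64("level_texture.png")],
    "mask": b64("door_mask.png"),   # white where the texture should be repainted
    "inpainting_fill": 0,           # 0 = fill (matches "Masked content: fill")
    "denoising_strength": 0.75,
    "prompt": "closed stone door, zavy-ctsmtrc, isometric",
    "sampler_name": "DPM++ 3M SDE Exponential",
    "steps": 40,
    "cfg_scale": 7,
    "width": 1200,
    "height": 1200,
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=600)
r.raise_for_status()
with open("level_texture_inpainted.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))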

Step 7: Texture Reprojection

Our texture is ready; all that remains is to use it in the game. To do this, you can use various techniques similar to photogrammetry. However, the simplest method, which works for a large number of similar levels, is simply to create a UV template on the inside of the cube.
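
One scriptable way to do the reprojection in Blender is a UV Project modifier that uses the same isometric camera as the projector; a rough sketch (object names are placeholders, and only the ratio of the aspect values matters):

import bpy

obj = bpy.data.objects["LevelGraybox"]   # placeholder: the graybox mesh (assumes a UV layer named "UVMap")
cam = bpy.data.objects["IsoCamera"]      # the camera used for the graybox render

# Project the generated texture back onto the geometry from the camera's point of view.
mod = obj.modifiers.new(name="TextureProjection", type='UV_PROJECT')
mod.uv_layer = "UVMap"
mod.projector_count = 1
mod.projectors[0].object = cam

# Match the projector aspect to the render so the texture lines up with the screenshot.
render = bpy.context.scene.render
mod.aspect_x = render.resolution_x
mod.aspect_y = render.resolution_y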

UV unwrapping and projection onto a simplified geometry template

Result:

A little self-promotion:

We are a small remote indie studio that makes mobile games. We'd love for you to try our latest game, Bifrost: Heroes of Midgard, which came out last year on iOS and Android.
