Create a third-person zombie shooter with DOTS

Greetings, Habr readers. As we wrote earlier, January is full of new launches, and today we are announcing enrollment for a new OTUS course, “Unity Game Developer”. Ahead of the course start, we are sharing a translation of an interesting article.


We are rebuilding the Unity core using our Data-Oriented Technology Stack (DOTS). Many game studios also see great advantages in using the Entity Component System (ECS), the C# Job System and the Burst compiler. At Unite Copenhagen, we had the opportunity to talk with Far North Entertainment and dig into how they implement DOTS functionality in a traditional Unity project.

Far North Entertainment is a Swedish studio co-owned by five engineer friends. Since releasing Down to Dungeon for Gear VR in early 2018, the company has been working on a game in a classic PC genre: post-apocalyptic zombie survival. What sets the project apart is the number of zombies chasing you. The team's vision was thousands of hungry zombies following you in huge hordes.

However, they quickly ran into serious performance problems as early as the prototyping stage. Spawning, killing, updating and animating that many enemies remained the main bottleneck, even after the team tried to solve the problem with object pooling and animation instancing.

This led the studio’s technical director, Anders Ericsson, to turn to DOTS and shift his mindset from object-oriented to data-oriented. “The key idea that helped bring about this shift was that you had to stop thinking about objects and hierarchies of objects and start thinking about data, how it is being transformed, and how to access it,” he said. In other words, a code architecture does not have to be modeled on real-life objects, nor does it have to solve the most general and abstract version of the problem. He has plenty of advice for those who, like him, are going through this change of perspective:

“Ask yourself what the real problem is that you are trying to solve, and what data is needed to solve it. Will you transform the same set of data in the same way over and over again? How much useful data can you fit in one cache line of the processor? If you change existing code, evaluate how much junk data you are adding to the cache line. Can the computation be split across several threads, or does it need a single stream of instructions?”

The team came to understand that entities in the Unity Entity Component System are just lookup identifiers into streams of components. Components are just data, while systems contain all the logic and filter entities by a specific component signature, known as an archetype. “I think one of the insights that helped us visualize our ideas was to picture ECS as an SQL database. Each archetype is a table in which each column is a component, and each row is a unique entity. In essence, you use systems to run queries against these archetype tables and perform operations on the entities,” says Anders.
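
The database analogy can be made concrete with a small sketch in plain C#, with no Unity dependency. The names ArchetypeTable and MoveSystem are hypothetical and invented for illustration only; Unity's real chunk storage is considerably more involved than a few lists.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: one "archetype table" whose columns are
// component arrays and whose rows are entities. Not Unity's actual storage.
public sealed class ArchetypeTable
{
    // Each list is one column (component stream); index i is one row (entity).
    public readonly List<int> EntityIds = new List<int>();
    public readonly List<float> PositionX = new List<float>();
    public readonly List<float> TargetX = new List<float>();

    // Adding an entity appends one value to the end of every column.
    public void Add(int entityId, float positionX, float targetX)
    {
        EntityIds.Add(entityId);
        PositionX.Add(positionX);
        TargetX.Add(targetX);
    }
}

public static class MoveSystem
{
    // A "system" acts like a query plus an update over every matching row,
    // roughly: UPDATE table SET PositionX = PositionX + (TargetX - PositionX) * t
    public static void Run(ArchetypeTable table, float t)
    {
        for (int row = 0; row < table.EntityIds.Count; row++)
        {
            table.PositionX[row] += (table.TargetX[row] - table.PositionX[row]) * t;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var zombies = new ArchetypeTable();
        zombies.Add(entityId: 1, positionX: 0f, targetX: 10f);
        zombies.Add(entityId: 2, positionX: 4f, targetX: 8f);

        MoveSystem.Run(zombies, 0.5f);

        Console.WriteLine(zombies.PositionX[0]); // 5
        Console.WriteLine(zombies.PositionX[1]); // 6
    }
}
```

The system never touches an individual "zombie object"; it only walks the columns it needs, which is the property the rest of the article relies on.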

Introducing DOTS

To reach this understanding, he studied the Entity Component System documentation, the ECS samples, and the example that we made together with Nordeus and presented at Unite Austin. General information about data-oriented design was also very helpful to the team. “Mike Acton's talk on data-oriented design from CppCon 2014 is exactly what first opened our eyes to this way of programming.”

The Far North team published what they learned on their dev blog, and in September of this year they came to Copenhagen to share their experience of moving to a data-oriented approach at Unite.

This article is based on that talk and explains in more detail the specifics of their implementation using ECS, the C# Job System and the Burst compiler. Far North also kindly shared many code samples from their project.

Zombie Data Organization

“The problem we were faced with was interpolating the positions and rotations for thousands of objects on the client side,” says Anders. Their initial object-oriented approach was to create an abstract script, ZombieView, that inherited from a common parent class, EntityView. EntityView is a MonoBehaviour attached to a GameObject. It acts as the visual representation of the game model. Each ZombieView was responsible for handling its own movement and rotation interpolation in its Update function.

This sounds fine until you realize that each object lives at an arbitrary location in memory. This means that when you access thousands of objects, the CPU has to fetch them from memory one at a time, which is extremely slow. If you lay your data out in neat, contiguous blocks, the processor can cache a whole batch of data at once. Most modern processors can read about 128 or 256 bits from the cache in one cycle.
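
As a rough back-of-the-envelope sketch of why layout matters, the arithmetic below compares how much of a fetched cache line a position update can actually use. The 64-byte cache line is an assumption (a common size on current desktop CPUs), and the 32-byte "view object" is a hypothetical stand-in, not a measurement of Far North's ZombieView.

```csharp
using System;

public static class CacheLineMath
{
    // Assumption: 64-byte cache lines, common on current desktop CPUs.
    public const int CacheLineBytes = 64;

    // How many whole elements with the given stride fit in one cache line.
    public static int ElementsPerLine(int strideBytes)
    {
        return CacheLineBytes / strideBytes;
    }

    // Percentage of a fetched cache line that holds data the loop really reads.
    // usefulBytes: bytes read per element; strideBytes: distance between elements.
    public static int UsefulPercent(int usefulBytes, int strideBytes)
    {
        return 100 * ElementsPerLine(strideBytes) * usefulBytes / CacheLineBytes;
    }
}

public static class Program
{
    public static void Main()
    {
        // Tightly packed positions: two floats (8 bytes) per entity, stride 8.
        Console.WriteLine(CacheLineMath.ElementsPerLine(8));   // 8
        Console.WriteLine(CacheLineMath.UsefulPercent(8, 8));  // 100

        // Hypothetical 32-byte object of which only the 8-byte position is read.
        Console.WriteLine(CacheLineMath.ElementsPerLine(32));  // 2
        Console.WriteLine(CacheLineMath.UsefulPercent(8, 32)); // 25
    }
}
```

Under these assumptions, the packed layout delivers eight positions per cache line fetch instead of two, before even considering that heap objects are usually not adjacent at all.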

The team decided to convert the enemies to DOTS in the hope of resolving the client-side performance issues. First in line was the Update function in ZombieView. The team worked out which parts should be split into separate systems and which data they required. The first and most obvious requirement was interpolating positions and headings. Because the game world is a two-dimensional grid, the position is stored as a pair of floats, and so is the heading, which describes where the zombie is going. The last component is the target position, which tracks the enemy's position on the server.

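using System;
using Unity.Entities;
using Unity.Mathematics;

// Current interpolated position of the enemy on the 2D grid.
[Serializable]
public struct PositionData2D : IComponentData
{
    public float2 Position;
}

// Direction in which the enemy is heading.
[Serializable]
public struct HeadingData2D : IComponentData
{
    public float2 Heading;
}

// Latest enemy position received from the server; interpolation moves toward it.
[Serializable]
public struct TargetPositionData : IComponentData
{
    public float2 TargetPosition;
}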

The next step was to create an archetype for the enemies. An archetype is the set of components that belong to a certain entity; in other words, it is the entity's component signature.

The project uses prefabs to define archetypes, because enemies require more components and some of them need references to a GameObject. The way this works is that you wrap your component data in a ComponentDataProxy, which turns it into a MonoBehaviour that can in turn be attached to the prefab. When you instantiate the prefab through the EntityManager, it creates an entity with all the component data that was attached to the prefab. All component data is stored in 16-kilobyte memory chunks called ArchetypeChunk.

[Figure: how the component streams are organized in the archetype chunk]

“One of the main advantages of archetype chunks is that you rarely need to allocate from the heap when creating new entities, because the memory has already been allocated in advance. This means that creating an entity amounts to writing data to the end of the component streams inside an archetype chunk. The only case in which a heap allocation is needed is when creating an entity that does not fit within the chunk. In that case, either a new 16 KB archetype chunk is allocated or, if an empty chunk of the same archetype exists, it is reused. The data for the new entities is then written to the component streams of the new chunk,” explains Anders.
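
To get a feel for the numbers, here is a rough capacity estimate in plain C#. It is a sketch under stated assumptions: the archetype contains only the three float2 components shown earlier plus an 8-byte entity identifier, and the chunk header and any alignment padding are ignored, so the result is an upper bound rather than the figure Unity would report. ChunkMath and its members are hypothetical names.

```csharp
using System;

public static class ChunkMath
{
    public const int ChunkBytes = 16 * 1024;           // 16 KB chunk, as stated in the article
    public const int EntityIdBytes = 8;                // assumption: id stored as two 32-bit ints
    public const int Float2Bytes = 2 * sizeof(float);  // 8 bytes per float2

    // Upper bound on entities per chunk; ignores chunk header and padding.
    public static int EntitiesPerChunk(int componentBytesPerEntity)
    {
        return ChunkBytes / (EntityIdBytes + componentBytesPerEntity);
    }

    // Number of chunks needed for a given entity count (rounded up).
    public static int ChunksNeeded(int entityCount, int entitiesPerChunk)
    {
        return (entityCount + entitiesPerChunk - 1) / entitiesPerChunk;
    }
}

public static class Program
{
    public static void Main()
    {
        // Position, heading and target position: three float2 values = 24 bytes.
        int componentBytes = 3 * ChunkMath.Float2Bytes;
        int capacity = ChunkMath.EntitiesPerChunk(componentBytes);

        Console.WriteLine(capacity);                                // 512
        Console.WriteLine(ChunkMath.ChunksNeeded(10000, capacity)); // 20
    }
}
```

Under these assumptions, ten thousand zombies would need on the order of twenty chunk allocations in total, rather than one heap allocation per enemy. The real archetype in the project has more components, so its per-chunk capacity would be lower.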

Multithreading your zombies

Now that the data was densely packed and laid out in memory in a cache-friendly way, the team could easily use the C# Job System to run their code on several CPU cores in parallel.

The next step was to create a system that selects, from all archetype chunks, every entity that has the PositionData2D, HeadingData2D and TargetPositionData components.

For this, Anders and his team created a JobComponentSystem and built its query in the OnCreate function. It looks something like this:

private EntityQuery m_Group;

protected override void OnCreate()
{
	base.OnCreate();

	var query = new EntityQueryDesc
	{
		All = new [] 
		{
			ComponentType.ReadWrite<PositionData2D>(),
			ComponentType.ReadWrite<HeadingData2D>(),
			ComponentType.ReadOnly<TargetPositionData>()
		},
	};

	m_Group = GetEntityQuery(query);
}

The code declares a query that selects all entities in the world that have a position, a heading and a target position. Next, the team wanted to schedule a job every frame using the C# Job System to distribute the calculations across several worker threads.

“The coolest thing about the C# Job System is that it is the same system that Unity uses in its own code, so we didn’t have to worry about threads blocking each other, competing for the same processor cores and causing performance problems,” says Anders.

The team decided to use IJobChunk, because thousands of enemies meant a large number of archetype chunks that would match the query at runtime. IJobChunk distributes the matching chunks across the worker threads.

Every frame, a new job, UpdatePositionAndHeadingJob, is scheduled. It is responsible for interpolating the positions and headings of the enemies in the game.

The code for scheduling the job is as follows:

protected override JobHandle OnUpdate(JobHandle inputDeps)
{
	var positionDataType       = GetArchetypeChunkComponentType<PositionData2D>();
	var headingDataType        = GetArchetypeChunkComponentType<HeadingData2D>();
	var targetPositionDataType = GetArchetypeChunkComponentType<TargetPositionData>(true);

	var updatePosAndHeadingJob = new UpdatePositionAndHeadingJob
	{
		PositionDataType = positionDataType,
		HeadingDataType = headingDataType,
		TargetPositionDataType = targetPositionDataType,
		DeltaTime = Time.deltaTime,
		RotationLerpSpeed = 2.0f,
		MovementLerpSpeed = 4.0f,
	};

	return updatePosAndHeadingJob.Schedule(m_Group, inputDeps);
}

This is what the job declaration looks like:

public struct UpdatePositionAndHeadingJob : IJobChunk
{
    public ArchetypeChunkComponentType<PositionData2D> PositionDataType;
    public ArchetypeChunkComponentType<HeadingData2D> HeadingDataType;

    [ReadOnly]
    public ArchetypeChunkComponentType<TargetPositionData> TargetPositionDataType;

    [ReadOnly] public float DeltaTime;
    [ReadOnly] public float RotationLerpSpeed;
    [ReadOnly] public float MovementLerpSpeed;
}

When a worker thread takes a job from its queue, it calls the job's Execute kernel.

Here’s what the Execute kernel looks like:

public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
{
	// Contiguous component arrays for every entity stored in this chunk.
	var chunkPositionData       = chunk.GetNativeArray(PositionDataType);
	var chunkHeadingData        = chunk.GetNativeArray(HeadingDataType);
	var chunkTargetPositionData = chunk.GetNativeArray(TargetPositionDataType);

	for (int i = 0; i < chunk.Count; i++)
	{
		var target       = chunkTargetPositionData[i];
		var positionData = chunkPositionData[i];
		var headingData  = chunkHeadingData[i];

		float2 toTarget = target.TargetPosition - positionData.Position;
		float distance  = math.length(toTarget);

		// Turn toward the target only while it is far enough away;
		// otherwise keep the current heading.
		headingData.Heading = math.select(
			headingData.Heading,
			math.lerp(headingData.Heading, 
					math.normalize(toTarget), 
					math.mul(DeltaTime, RotationLerpSpeed)),
			distance > 0.008
		);

		// Interpolate when within one unit of the target;
		// otherwise snap straight to the target position.
		positionData.Position = math.select(
			target.TargetPosition,
			math.lerp(
				positionData.Position, 
				target.TargetPosition, 
				math.mul(DeltaTime, MovementLerpSpeed)),
			distance <= 1
		);

		// Components are structs (copies), so write the results back.
		chunkPositionData[i] = positionData;
		chunkHeadingData[i]  = headingData;
	}
}

“You may notice that we use select instead of branching; this lets us avoid the effect known as branch misprediction. The select function evaluates both expressions and picks the one that matches the condition, so if your expressions aren’t too expensive to compute, I would recommend using select, as it is often cheaper than waiting for the CPU to recover from a mispredicted branch,” notes Anders.
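
The idea can be tried outside Unity with a minimal sketch built on System.Numerics. The Select helper below is a hypothetical stand-in that only mirrors the semantics of math.select (return the second argument when the condition is true, otherwise the first); it makes no claim about the machine code the C# compiler emits, which is where Burst's real benefit comes from.

```csharp
using System;
using System.Numerics;

public static class Interpolation
{
    // Stand-in for math.select: returns b when c is true, otherwise a.
    // Mirrors the semantics only, not the generated machine code.
    public static Vector2 Select(Vector2 a, Vector2 b, bool c)
    {
        return c ? b : a;
    }

    // Select-style position step mirroring the article's kernel:
    // both candidate results are computed first, then one is chosen.
    public static Vector2 StepPosition(Vector2 position, Vector2 target, float t)
    {
        float distance = Vector2.Distance(position, target);
        Vector2 snapped = target;
        Vector2 lerped = Vector2.Lerp(position, target, t);
        return Select(snapped, lerped, distance <= 1f);
    }

    // Equivalent branching version, for comparison.
    public static Vector2 StepPositionBranching(Vector2 position, Vector2 target, float t)
    {
        if (Vector2.Distance(position, target) <= 1f)
        {
            return Vector2.Lerp(position, target, t);
        }
        return target;
    }
}

public static class Program
{
    public static void Main()
    {
        var target = new Vector2(10f, 0f);

        // Far from the target (distance 10): the position snaps to the target.
        Console.WriteLine(Interpolation.StepPosition(new Vector2(0f, 0f), target, 0.5f));

        // Within one unit (distance 1): the position moves halfway, to x = 9.5.
        Console.WriteLine(Interpolation.StepPosition(new Vector2(9f, 0f), target, 0.5f));
    }
}
```

Both versions return the same values; the difference is that the select form always pays for the lerp, which is the trade-off Anders describes.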

Boost Performance with Burst

The final step in converting enemy position and heading interpolation to DOTS was to enable the Burst compiler. The task seemed quite simple to Anders: “Since the data is laid out in contiguous arrays and since we use the new mathematics library from Unity, all we had to do was add the BurstCompile attribute to our job.”

[BurstCompile]
public struct UpdatePositionAndHeadingJob : IJobChunk
{
    public ArchetypeChunkComponentType<PositionData2D> PositionDataType;
    public ArchetypeChunkComponentType<HeadingData2D> HeadingDataType;

    [ReadOnly]
    public ArchetypeChunkComponentType<TargetPositionData> TargetPositionDataType;

    [ReadOnly] public float DeltaTime;
    [ReadOnly] public float RotationLerpSpeed;
    [ReadOnly] public float MovementLerpSpeed;
}

The Burst compiler gives us Single Instruction, Multiple Data (SIMD): machine instructions that operate on multiple sets of input data and produce multiple sets of output data with a single instruction. This helps fill more of the 128-bit-wide cache bus with the right data. The Burst compiler, combined with a cache-friendly data layout and the job system, allowed the team to increase performance significantly. The team measured performance after each conversion step and compiled the results in a table.

This meant that Far North completely got rid of the problems associated with client-side interpolation of zombie positions and headings. Their data is now stored in a cache-friendly layout, and cache lines are filled only with useful data. The load is distributed across all CPU cores, and the Burst compiler produces highly optimized machine code with SIMD instructions.

Far North Entertainment DOTS Tips and Tricks

  • Start thinking in terms of data streams, because in ECS, entities are simply lookup indices into parallel streams of component data.
  • Think of ECS as a relational database in which archetypes are tables, components are columns, and entities are rows in a table.
  • Organize your data into sequential arrays to take advantage of the processor cache and hardware prefetching.
  • Resist the urge to build object hierarchies and to look for a general solution before you understand the real problem you are trying to solve.
  • Think about garbage collection. Avoid excessive heap allocations in performance-critical areas. Use the new Unity native containers instead, but be careful: you have to handle cleanup manually.
  • Understand the cost of your abstractions, and beware of the overhead of virtual function calls.
  • Use all CPU cores with the C# Job System.
  • Analyze what happens at the hardware level. Does the Burst compiler actually generate SIMD instructions? Use the Burst Inspector to check.
  • Stop wasting cache lines. Think of packing data into cache lines the way you would pack data into UDP packets.

The main piece of advice Anders Ericsson wants to share is more general and is aimed at those whose project is already in development: “Try to identify specific areas in your game where you are having performance issues, and see if you can apply DOTS specifically in this isolated area. You do not need to change the entire code base!”

Future plans

“We want to use DOTS in other areas of our game, and we were delighted with the announcements at Unite about DOTS animation, Unity Physics and Live Link. We would like to learn how to convert more game objects into ECS entities, and it seems that Unity has made significant progress in implementing this,” concludes Anders.
If you have additional questions for the Far North team, we recommend that you join their Discord!
Check out the Unite Copenhagen DOTS playlist to find out how other modern game studios use DOTS to create great high-performance games, and how DOTS-based components like DOTS Physics, the new Conversion Workflow, and the Burst compiler work together.

That concludes the translation. We invite you to attend a free webinar in which we will show you how to create your own zombie shooter in an hour.
