Speeding up Dictionary in C# using structures and CollectionsMarshal
If you are a C# developer, then you are probably familiar with the class Dictionary
. You most likely used classes as values. But what if I say that Dictionary
can we use structures? Don't worry about structures being copied when passed to or returned from a method. There is a way to avoid this, and it works quickly.
Disclaimer
The information in this article is only correct under certain conditions. I admit that the benchmark may show different results on different hardware, with a different compiler, during a different phase of the moon, or under a different scenario for using the language functionality in question. Always check your code and don't rely only on articles from the Internet.
Usage scenario
Let's imagine that there is some data array data
. We are required to implement the following functionality:
Convert
data
to the dictionary for later searching.Using a certain key, find an object in the dictionary and change it.
Repeat step 2 as many times as required.
Let's write code that simulates this behavior:
// Инициализация словаря
var dict = new Dictionary<int, SomeClass>(Length);
// Заполнение словаря
foreach (var (i, obj) in data) {
dict.Add(i, obj);
}
// Поиск значения и изменение данных
for (int i = 0; i < Length; i++) {
dict[i].DoWork(i);
}
The code above works as intended. Let's try to speed it up. Let's replace the class SomeClass
on the structure SomeStruct
and compare the performance of both options.
// Инициализация словаря
var dict = new Dictionary<int, SomeClass>(Length);
// Заполнение словаря
foreach (var (i, obj) in data) {
dict.Add(i, obj);
}
// Поиск значения и изменение данных
for (int i = 0; i < Length; i++) {
var obj = dict[i];
obj.DoWork(i);
dict[i] = obj;
}
Benchmark
Performance measurements were carried out on a data array of 100,000 elements. The size of classes (without header) and structures varied from 4 to 128 bytes. For performance measurements I used the library BenchmarkDotNet. The benchmark code and results can be found in GitHub.
The benchmark results show performance degradation when using structures larger than 20 bytes. In the implementation with structures, they are copied multiple times, and the dictionary is searched twice. This has a negative impact on performance. Let's break down the code measurements into parts to understand what can be improved.
Initializing the Dictionary
The benchmark showed the expected results. The initialization time and the size of the dictionary with structures increase linearly with the size of the structures.
This is due to the fact that the array entries in this case it stores values directly, not references. Accordingly, storing such a dictionary requires simply more memory.
To be fair, it should be noted that the CLR allocated even more memory for classes, it just happened earlier – during array initialization data
. If you measure the time spent on initializing an array of classes and structures, the results will not be in favor of the classes. But this is beyond the scope of the article.
Populating the dictionary
And again the expected results. The time it takes to copy structures when the dictionary is full depends linearly on the size of the structures. Although the difference between structures and classes is practically unnoticeable up to 20 bytes.
Finding a value and changing it
And for the third time, the results are not in favor of the structures.
This is due to the fact that searching by key and copying structures is carried out twice:
SomeStruct s = dict[i]; // 1-й поиск по ключу и копирование структуры
s.DoWork(i);
d[i] = s; // 2-й поиск по ключу и копирование структуры
This is where the class will help us CollectionsMarshal
.
Who is this CollectionsMarshal of yours?
In short, this is Class with four extension methods:
AsSpan<T>
— returnsSpan<T>
for elementsList<T>
.GetValueRefOrAddDefault<TKey, TValue>
— by key returns a link to an element from the dictionaryTValue
creatingdefault
value if the element does not exist.GetValueRefOrNullRef<TKey, TValue>
— by key returns a link to an element from the dictionaryTValue
or link tonull
if the element does not exist.SetCount<T>
— sets the value of Count forList<T>
.
We are only interested in GetValueRefOrAddDefault
And GetValueRefOrNullRef
. Using these methods, you can retrieve values from a dictionary by reference, avoiding double key lookups and double copying of the structure. For example, the code above could be rewritten as follows:
ref SomeStruct s = ref CollectionsMarshal.GetValueRefOrNullRef(dict, i);
s.DoWork(i);
A few more benchmarks
We will take measurements of the implementation with GetValueRefOrNullRef
and compare with previous results:
Code execution time s CollectionsMarshal
even faster than with classes. To compensate for the performance overhead of initializing and populating the dictionary, the number of lookups must be many times greater than the size of the array.
Features of CollectionsMarshal
Checking for default and null
As mentioned earlier, methods GetValueRefOrAddDefault
And GetValueRefOrNullRef
return a link to default structure and a reference to null.
Check whether the structure is default, i.e. all fields have a default value, it’s quite simple – you need to check the exists flag:
ref var element = ref CollectionsMarshal.GetValueRefOrAddDefault(dict, key, out bool exist);
if (exist) {
// some code here
}
With reference to null
the situation is different. There is no boolean flag, but when comparing with null
an exception will be thrown NullReferenceException
. It's better to use the method Unsafe.IsNullRef
ref var element = ref CollectionsMarshal.GetValueRefOrNullRef(dict, key);
if (Unsafe.IsNullRef<T>(ref element)) {
// some code here
}
Changing the dictionary after receiving a structure reference
In the documentation for the methods GetValueRefOrAddDefault
And GetValueRefOrNullRef
it is explicitly stated that the dictionary cannot be changed after a reference to the structure has been received. Why this should not be done is demonstrated in the example below. After changing the dictionary, any changes to the structure obtained from the reference will not affect the value in the dictionary.
ref var element = ref CollectionsMarshal.GetValueRefOrNullRef(dict, key);
Console.WriteLine($"ref element: {element.Item1}"); // 30
Console.WriteLine($"dict[key]: {dict[key].Item1}"); // 30
element.Item1 = 50; // change #1
Console.WriteLine($"ref element: {element.Item1}"); // 50
Console.WriteLine($"dict[key]: {dict[key].Item1}"); // 50
dict.Add(100, new (100)); // add a new element
element.Item1 = 60; // change #2
Console.WriteLine($"ref element: {element.Item1}"); // 60
Console.WriteLine($"dict[key]: {dict[key].Item1}"); // 50
conclusions
Structures are an underrated element of C# that, under certain conditions, can speed up your application. When using structures as values for Dictionary
better to use class CollectionsMarshal
. Methods of this class GetValueRefOrAddDefault
And GetValueRefOrNullRef
allow you to get dictionary elements by reference. This, in turn, can have a positive impact on code performance when the number of dictionary lookups is relatively large.