Go: Dealing with Lock Contention with the Atomic Package



This article is based on Go 1.14.

Go provides memory synchronization mechanisms such as channels and mutexes to help solve various problems. Where shared memory is concerned, a mutex protects it against data races. Besides the two mutex types in the sync package, Go also offers atomic memory primitives in the atomic package that can improve performance. But before diving into the solutions, let's go back to data races.

Data race

A data race can occur when two or more goroutines access the same memory location at the same time and at least one of them is writing. While Go's map has a built-in mechanism that detects some concurrent accesses and aborts the program, plain structures have no such safeguard, which leaves them fully exposed to this problem.

To illustrate the data race, I’ll take an example of a configuration that is constantly updated by a goroutine. Here is its code:
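The code block itself appears to have been lost in this translation. Here is a sketch consistent with the output shown below; the type name `Config`, the field `a`, and the helper `nextValues` are my assumptions, not necessarily the original identifiers:

```go
package main

import (
	"fmt"
	"sync"
)

// Config is the shared structure: one goroutine updates it
// while several others read it, with no synchronization.
type Config struct {
	a []int
}

// nextValues builds the slice the writer stores on each iteration:
// six consecutive integers starting at i.
func nextValues(i int) []int {
	return []int{i, i + 1, i + 2, i + 3, i + 4, i + 5}
}

func main() {
	cfg := &Config{}

	// Writer: updates the configuration in a tight loop.
	go func() {
		for i := 0; ; i++ {
			cfg.a = nextValues(i)
		}
	}()

	// Readers: print the configuration concurrently — these reads
	// race with the writer's assignment above.
	var wg sync.WaitGroup
	for n := 0; n < 4; n++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := 0; n < 100; n++ {
				fmt.Printf("%v\n", cfg)
			}
		}()
	}
	wg.Wait()
}
```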

Running this code clearly shows that the result is non-deterministic because of the data race:

[...]
&{[79167 79170 79173 79176 79179 79181]}
&{[79216 79219 79220 79221 79222 79223]}
&{[79265 79268 79271 79274 79278 79281]}

Each line was expected to contain a sequence of consecutive integers, but in reality the result is completely random. Running the same program with the -race flag reports the data race:

WARNING: DATA RACE
Read at 0x00c0003aa028 by goroutine 9:
  [...]
  fmt.Printf()
      /usr/local/go/src/fmt/print.go:213 +0xb5
  main.main.func2()
      main.go:30 +0x3b

Previous write at 0x00c0003aa028 by goroutine 7:
  main.main.func1()
      main.go:20 +0xfe

Reads and writes can be protected against data races with a mutex — the most common solution — or with the atomic package.

Mutex vs Atomic

The standard library provides two kinds of mutexes in the sync package: sync.Mutex and sync.RWMutex; the latter is optimized for programs with many readers and very few writers. Here's one solution:

The program now prints the expected result; the numbers increase as they should:

[...]
&{[213 214 215 216 217 218]}
&{[214 215 216 217 218 219]}
&{[215 216 217 218 219 220]}

The second solution uses the atomic package. Here is the code:

The result is also quite expected:

[...]
&{[32724 32725 32726 32727 32728 32729]}
&{[32733 32734 32735 32736 32737 32738]}
&{[32753 32754 32755 32756 32757 32758]}

Judging by the output alone, the atomic-based solution looks much faster, since it reaches much higher numbers in the same running time. Benchmarking both programs will tell us which one is actually more efficient.

Performance

A benchmark should be interpreted according to what is being measured. In this case, I will measure the previous program, which has one writer that continuously stores a new configuration and several readers that continuously read it. To cover more potential cases, I'll also benchmark a reader-only scenario, for programs whose configuration rarely changes. Here's an example of this new case:
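The benchmark code is not included in this translation. Here is a reconstruction of the two atomic cases whose names match the result table below; the bodies are my sketch, not the original. The mutex variants are analogous, with `mu.RLock()`/`mu.RUnlock()` around the read:

```go
package main

import (
	"sync/atomic"
	"testing"
)

type Config struct {
	a []int
}

// One writer goroutine keeps storing new configs while the
// benchmarked readers load them in parallel.
func BenchmarkAtomicOneWriterMultipleReaders(b *testing.B) {
	var v atomic.Value
	v.Store(&Config{a: []int{0, 1, 2, 3, 4, 5}})

	done := make(chan struct{})
	go func() {
		for i := 0; ; i++ {
			select {
			case <-done:
				return
			default:
				v.Store(&Config{a: []int{i, i + 1, i + 2, i + 3, i + 4, i + 5}})
			}
		}
	}()

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			_ = v.Load().(*Config)
		}
	})
	close(done)
}

// Reader-only case: the config is stored once and never changes
// during the benchmark.
func BenchmarkAtomicMultipleReaders(b *testing.B) {
	var v atomic.Value
	v.Store(&Config{a: []int{0, 1, 2, 3, 4, 5}})

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			_ = v.Load().(*Config)
		}
	})
}
```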

Running the test ten times side by side gives the following results:

name                              time/op
AtomicOneWriterMultipleReaders-4  72.2ns ± 2%
AtomicMultipleReaders-4           65.8ns ± 2%
MutexOneWriterMultipleReaders-4    717ns ± 3%
MutexMultipleReaders-4             176ns ± 2%

The benchmark confirms what we’ve seen before in terms of performance. To understand exactly where the mutex bottleneck is, we can restart the program with the tracer enabled.

For more information on the trace package, I suggest you read my article “Go: Discovery of the Trace Package”.

Here is the profile of the program using the atomic package:

Goroutines run non-stop and can complete their tasks. As for the profile of a program with a mutex, the picture is completely different:

The timeline is now quite fragmented, because the mutex parks the goroutines. This is confirmed by the goroutine overview, which shows the time spent blocked waiting for the lock:

Blocking accounts for about a third of the total time. The blocking profile gives a detailed view:

The atomic package clearly has the advantage in this case. In other cases, however, it can degrade performance: for example, if you need to store a large map, you have to copy the whole map every time it is updated, which is inefficient.

For more information on mutexes, I suggest you read my article “Go: Mutex and Starvation“.

