Do you know Go well? Here are a few useful things about code optimization

We'll talk about:

  • techniques for concurrent programming

  • general Go techniques, such as using iota, error handling, printing structs, mocking interfaces, etc.

  • methods for optimizing work with slices

We’ll discuss how to avoid unnecessary memory allocation, how to deal with race conditions, talk about the compactness and conciseness of the code, and a lot of other useful things.

I run into some of these things especially often; some techniques are borrowed from colleagues, and some I found on the Internet. I'm sure they are relevant not only to me, which is why I'm writing this article.

My channel for Go developers has breakdowns of tricky interview questions, code examples, training lessons, and a whole folder of other useful things for everyone who loves Go.

Let's go already!

Useful things for concurrent programming

Combining for and select loops

This avoids blocking the loop on any single state; everything happens inside one goroutine:

func (s *sub) loop() {
  // ... declare the mutable state ...
  for {
      // ... set up channels for the different cases ...
      select {
          case <-c1: // read from a channel without storing the value
           // ... read/write state ...
          case c2 <- x: // write to a channel
           // ... read/write state ...
          case y := <-c3: // read from a channel into a variable
           // ... read/write state ...
      }
  }
}

As you can see, in this select we handle all the possible cases:

  • if the value can be read from the channel without storing it in a variable

  • if the variable's value can be written to the channel

  • if the value can be read from the channel and stored into a variable

Because the for loop is infinite, select will pick one of the ready cases and execute the corresponding code; then the loop comes back around and the process repeats.

This way we can handle different cases within the same goroutine, avoiding the loop blocking and allowing the goroutine to continue its work.
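
To make the pattern concrete, here is a minimal self-contained sketch (the channel names ticks, requests, counts, and done are made up for this illustration): a single goroutine owns its counter state and serves all the case types in one for-select loop.

package main

import (
	"fmt"
	"time"
)

func main() {
	ticks := time.Tick(50 * time.Millisecond) // read without storing the value
	requests := make(chan int)                // read into a variable
	counts := make(chan int)                  // write to a channel
	done := make(chan struct{})

	go func() {
		count := 0 // mutable state owned by this goroutine alone
		for {
			select {
			case <-ticks: // read from a channel without storing the value
				count++
			case n := <-requests: // read from a channel into a variable
				count += n
			case counts <- count: // write the current state into a channel
			case <-done: // shutdown signal
				return
			}
		}
	}()

	requests <- 10
	time.Sleep(120 * time.Millisecond)
	fmt.Println("count:", <-counts) // roughly 12: 10 from the request plus a couple of ticks
	close(done)
}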

Service channel, response channel (chan chan error)

When we use goroutines, checking for completion with a boolean flag can lead to a data race. The race can be detected by building with the race detector enabled (go build -race main.go) and running the binary. To avoid it, let's create a channel that carries another channel.

type sub struct {
  closing chan chan error // request / response
}
// usage
func (s *sub) Close() error {
  errChan := make(chan error)
  s.closing <- errChan
  return <-errChan
}
// handling the shutdown signal in loop
func (s *sub) loop() {
  // ...
  var err error // set when an error occurred while doing the main work
  for {
    select {
      case errChan := <-s.closing: // check whether a shutdown has been requested
        errChan <- err // return the error via the provided channel; it may be nil or an error value
        close(s.updates) // close the channel used to send data to the main goroutine
        return // finish loop()
    }
  }
 // ...
}

It looks rather strange, a channel of error channels, but this design allows bidirectional communication between goroutines. We pass a channel through which the response can be returned to us. The loop() method acts like a small server, and to stop it we send it a shutdown request: we write into the sub.closing channel, and the value we write is the channel into which the server will place its response when it finishes. On a normal stop that channel will carry nil; otherwise it will carry an error. A runnable sketch of the whole pattern follows the list of advantages below.

The advantages of this entire design are obvious:

  1. Synchronization: Using channels, we synchronize the execution of goroutines, ensuring the correct order of operations and preventing race conditions.

  2. Safety: The use of channels ensures the security of data access during parallel execution of operations. Channels ensure that only one goroutine can access data at a time.

  3. Error management: Channels allow us to pass error information between goroutines, allowing us to handle errors efficiently.
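
Here is the promised minimal runnable sketch of the pattern. The field and method names (closing, updates, Close, loop) mirror the snippets above; the "work" simulated inside loop and the error it produces are made up for the demo.

package main

import (
	"errors"
	"fmt"
	"time"
)

type sub struct {
	closing chan chan error
	updates chan string
}

// Close asks loop() to stop and returns whatever error it accumulated.
func (s *sub) Close() error {
	errChan := make(chan error)
	s.closing <- errChan
	return <-errChan
}

func (s *sub) loop() {
	var err error
	for {
		select {
		case errChan := <-s.closing: // shutdown requested
			errChan <- err   // report the accumulated error (may be nil)
			close(s.updates) // no more updates will be sent
			return
		case s.updates <- "tick": // the "main work": pretend we produced an update
			err = errors.New("pretend something went wrong") // made-up error for the demo
		}
	}
}

func main() {
	s := &sub{
		closing: make(chan chan error),
		updates: make(chan string),
	}
	go s.loop()

	fmt.Println(<-s.updates) // consume one update: tick
	time.Sleep(10 * time.Millisecond)
	fmt.Println("close:", s.Close()) // close: pretend something went wrong
}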

nil channels in select statements for temporary pausing

If we want to send elements into a channel one at a time, we can take them from a queue; a naive version looks like this:

var pending []Item // filled by the data-fetching code, drained by the code that sends work reports
// send information about a completed task into the s.updates channel
for {
  select {
  case s.updates <- pending[0]: // send the first element
    pending = pending[1:] // when the send succeeds, drop the first element by reslicing without it
  }
}
// this code will panic

Why does this code fail? As soon as pending becomes empty, pending[0] is an index out of range and the program panics.
Channels have a special property: if a channel variable is nil, both sending to and receiving from it block forever. We can "deinitialize" a channel manually by setting it to nil.
This feature can be used to introduce a temporary block.
Sample code:

a := make(chan string)
go func(){ a <- "a" }()
a = nil
select {
case s := <-a: // the channel is nil here, so this case blocks
  fmt.Println("s=", s)
}

This way we can temporarily disable some execution options in select.
So let's fix the problem with this method:

var pending []Item // filled by the data-fetching code, drained by the code that sends work reports
// send information about a completed task into the s.updates channel
for {
  var first Item
  var updatesChan chan Item // nil by default, so the send case below stays disabled
  if len(pending) > 0 {
    first = pending[0]
    updatesChan = s.updates // point at the real channel to enable the send case in select
  }

  select {
  case updatesChan <- first: // send the first element; blocks while updatesChan is nil
    pending = pending[1:] // when the send succeeds, drop the first element by reslicing without it
  }
}

Here we use an infinite for loop in which elements from the pending list are sent into the s.updates channel. If pending is not empty, its first element is extracted into the variable first, and updatesChan is set to the real channel s.updates.

Then select runs with a single case: sending first into updatesChan. When updatesChan points at the real s.updates channel (i.e. it is not nil), the case can proceed as soon as another goroutine is ready to receive from that channel; after a successful send, the first element is removed from pending.

When updatesChan is nil (pending is empty), the send case blocks, so the goroutine simply pauses on it instead of panicking on pending[0]. In the full loop there are other select cases (for example, one that receives new items), so as soon as something lands in pending the next iteration re-enables the send case.

This is how simple and efficient goroutine synchronization works in Go; we use channels to transmit data and select to wait for events.
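
Below is a small self-contained sketch under assumed names (the incoming and updates channels are made up): one goroutine queues items arriving on incoming and forwards them on updates, using the nil-channel trick so the send case is disabled while the queue is empty.

package main

import "fmt"

func main() {
	incoming := make(chan int)
	updates := make(chan int)

	go func() {
		var pending []int
		for {
			// Enable the send case only when there is something to send.
			var first int
			var updatesChan chan int // nil => send case disabled
			if len(pending) > 0 {
				first = pending[0]
				updatesChan = updates
			}

			select {
			case v, ok := <-incoming:
				if !ok { // producer is done; flush what is left and stop
					for _, v := range pending {
						updates <- v
					}
					close(updates)
					return
				}
				pending = append(pending, v)
			case updatesChan <- first:
				pending = pending[1:]
			}
		}
	}()

	go func() {
		for i := 1; i <= 5; i++ {
			incoming <- i
		}
		close(incoming)
	}()

	for v := range updates {
		fmt.Println(v) // 1 2 3 4 5
	}
}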

Quite simple and popular Go tricks

Checking for the presence of a key in map

An obvious thing, and many people know this, but I simply have to mention it. To check whether a key is in a map, we assign the result of the lookup to two variables; the second one is a boolean that reports whether the key was present:

_, keyIsInMap := myMap["keyToCheck"]
if !keyIsInMap {
  fmt.Println("key not in map")
}

Checking type assertions (type casts)

Sometimes you need to assert that a value of an interface type holds a specific concrete type. The problem is that if the type doesn't match, the code panics. For example, the following code asserts that data holds a string:

value := data.(string)

If data does not actually hold a string, this line panics. So let's look at a better way to do the cast.

As with checking for a key in a map, the two-value form of the assertion gives us a boolean that tells us whether the cast happened:

value, ok := data.(string)

In this example ok is a boolean value that tells whether the type cast was successful or not. This way type mismatches are handled more gracefully than with the panic mechanism.
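
Here is a tiny sketch of how this looks in practice (the describe helper is my own, made up for illustration):

package main

import "fmt"

// describe reports what it was given without panicking on unexpected types.
func describe(data interface{}) string {
	if value, ok := data.(string); ok {
		return "a string: " + value
	}
	return fmt.Sprintf("not a string: %v", data)
}

func main() {
	fmt.Println(describe("hello")) // a string: hello
	fmt.Println(describe(42))      // not a string: 42
}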

Specifying the slice size when using append, for optimization

To add elements to a slice, it is best to use append. For example:

for _, v := range inputArray {
  myArray = append(myArray, v)
}

However, with large slices the appending process slows down, because append has to keep growing myArray to fit new values, which means repeated memory reallocation and copying. It's better to specify the length up front and assign each value directly:

myArray := make([]int, len(inputArray))
for i, v := range inputArray {
  myArray[i] = v
}

There is a third option, which I like even more: it combines the previous two. I find it a little easier to read, and there is no loss of speed, because the capacity is set at the start:

myArray := make([]int, 0, len(inputArray))
for _, v := range inputArray {
  myArray = append(myArray, v)
}

Here the slice length is set to 0 and the capacity to the length of the input slice, so append doesn't have to grow the backing array on the fly.
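
If you want to measure the difference yourself, here is a rough benchmark sketch (the package layout, input size, and benchmark names are my own assumptions); drop it into a file ending in _test.go and run go test -bench=.:

package main

import "testing"

var input = make([]int, 100_000)

func BenchmarkAppendNoPrealloc(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var out []int
		for _, v := range input {
			out = append(out, v) // grows and reallocates repeatedly
		}
		_ = out
	}
}

func BenchmarkAppendPrealloc(b *testing.B) {
	for i := 0; i < b.N; i++ {
		out := make([]int, 0, len(input)) // capacity known up front
		for _, v := range input {
			out = append(out, v)
		}
		_ = out
	}
}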

Using append and the ellipsis to join slices

It's quite an obvious thing, but I'll mention it. Sometimes you need to combine two slices, and it's very useful that append is a variadic function. Here is what a typical append call looks like:

myArray = append(myArray, value1)

And append lets you add multiple elements at once:

myArray = append(myArray, value1, value2)

But the coolest thing is unpacking a slice with ... when passing it to the function. Let's append the slice inputArray to myArray:

myArray = append(myArray, inputArray...)

Here the elements of inputArray are unpacked and added to the end of myArray.

Displaying field names and values when printing a struct

I use this all the time now. Previously, to log a struct's field names and values, I marshaled it to JSON. But there is a much simpler way: add + or # to the %v verb in the Printf format, depending on what you want to see.

package main
import "fmt"

func main() {
    MyStruct := struct{
        Value1 string
        Value2 int
    }{Value1:"first value", Value2:2}
    
    fmt.Printf("%v \n", MyStruct)  // {first value 2} 
    fmt.Printf("%+v \n", MyStruct) // {Value1:first value Value2:2}
    fmt.Printf("%#v \n", MyStruct) // struct { Value1 string; Value2 int }{Value1:"first value", Value2:2}
}

Using iota with custom types when enumerating

For enumerations in Go it is better to use the iota keyword. Within a const block it yields successive integer values. This is great for creating enums, and it is used together with a custom integer type so that the compiler helps ensure callers of your code use only the declared values. Example:

type PossibleStates int

const (
 State1 PossibleStates = iota  // 0
 State2                        // 1
 State3                        // 2
)

Here we create a custom type PossibleStates based on int, after which each constant in the enum has the type PossibleStates and gets its value from iota.

Eventually State1, State2, State3 will have values ​​0, 1, 2 respectively.
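
To illustrate the type-safety point, here is a small sketch; the setState function and the String method are my own additions for the example. A function that takes a PossibleStates won't accept an arbitrary int variable, and a String method makes the values readable in logs.

package main

import "fmt"

type PossibleStates int

const (
	State1 PossibleStates = iota // 0
	State2                       // 1
	State3                       // 2
)

// String makes the enum readable when printed or logged.
func (s PossibleStates) String() string {
	switch s {
	case State1:
		return "State1"
	case State2:
		return "State2"
	case State3:
		return "State3"
	default:
		return fmt.Sprintf("PossibleStates(%d)", int(s))
	}
}

func setState(s PossibleStates) {
	fmt.Println("switching to", s)
}

func main() {
	setState(State2) // switching to State2

	var raw int = 1
	_ = raw
	// setState(raw) // compile error: raw has type int, not PossibleStates
}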

Using function fields that match the interface methods when creating a mock

This technique was a revelation for me. Let's say there is an interface that needs to be mocked:

type DataPersistence interface {
 SaveData(string, string) error
 GetData(string) (string, error)
}

This interface has several different persistence implementations behind it. We need to test the code that uses it, so let's create a mock DataPersistence for tests. But instead of writing a complex mock structure, we simply create a struct whose fields are functions matching the interface methods.

This is what the mock looks like:

type MockDataPersistence struct {
 SaveDataFunc func(string, string) error
 GetDataFunc func(string) (string, error)
}

// SaveData simply calls the SaveDataFunc field
func (mdp MockDataPersistence) SaveData(key, value string) error {
 return mdp.SaveDataFunc(key, value)
}

// GetData simply calls the GetDataFunc field
func (mdp MockDataPersistence) GetData(key string) (string, error) {
 return mdp.GetDataFunc(key)
}

This means that when testing, functions are configured as we need them, right in the same test:

func TestMyStuff(t *testing.T) {
 mockPersistor := MockDataPersistence{}
 // here we set SaveData to simply return an error
 mockPersistor.SaveDataFunc = func(key, value string) error {
  return fmt.Errorf("error to check how your code handles an error")
 }

 // now check how thingToTest (the code under test) deals with
 // SaveData returning an error
 err := thingToTest(mockPersistor)
 assert.Nil(t, err)
}

Readability really improves: you can see clearly what the mock is capable of in each test. Additionally, we now have access to test data inside the mocked function without needing to maintain external data files.

Creating your own interface if you don't have one

Let's say you're using another Go library, and there's a structure there, but no interface – create it yourself. For example, this structure:

type OtherLibsStruct struct {}

func (ols OtherLibsStruct) DoCoolStuff(input string) error {
 return nil
}

We create the required interface directly in the code:

type InterfaceForOtherLibsStruct interface {
 DoCoolStuff(string) error
}

Once you have created the interface, you can write your code against it. When you need to test that code, you can use the mock-interface trick we talked about earlier.
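
A minimal sketch of how this might fit together (doMyThing is a made-up consumer): our code depends only on the interface, so either the real OtherLibsStruct or a test mock satisfies it.

package main

import "fmt"

type OtherLibsStruct struct{}

func (ols OtherLibsStruct) DoCoolStuff(input string) error { return nil }

type InterfaceForOtherLibsStruct interface {
	DoCoolStuff(string) error
}

// doMyThing depends only on the interface, not on the concrete library type.
func doMyThing(dep InterfaceForOtherLibsStruct, input string) error {
	if err := dep.DoCoolStuff(input); err != nil {
		return fmt.Errorf("doing cool stuff: %w", err)
	}
	return nil
}

func main() {
	// In production we pass the real struct; in tests, a mock with function fields.
	fmt.Println(doMyThing(OtherLibsStruct{}, "payload")) // <nil>
}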

Instantiating Nested Anonymous Structures

I have had to use this technique several times with generated code. Sometimes code generation produces a nested anonymous struct, something like this:

type GeneratedStuct struct {
  Value1 string `json:"value1"`
  Value2 int `json:"value2"`
  Value3 *struct {
    NestedValue1 string `json:"NestedValue1"`
    NestedValue2 string `json:"NestedValue2"`
  } `json:"value3,omitempty"`
}

Let's say we now need to create an instance of this struct. With Value1 and Value2 everything is simple, but how do we instantiate the pointer to the anonymous struct in Value3?

The first thing that comes to mind is to write the value as JSON and unmarshal it into the struct. But that is ugly and amateurish. It turns out you just need another anonymous struct literal when instantiating it:

myGeneratedStruct := GeneratedStuct{
  Value3: &struct {
   NestedValue1 string `json:"NestedValue1"`
   NestedValue2 string `json:"NestedValue2"`
  }{
   NestedValue1: "foo",
   NestedValue2: "bar",
  },
 }

Generally obvious, but keep in mind that this anonymous structure must match exactly, down to the JSON tags.

For example, the following will not compile because of a type mismatch (the JSON tags differ, so the anonymous struct type is different), although at first glance everything looks the same:

myGeneratedStruct := GeneratedStuct{
  Value3: &struct {
   NestedValue1 string `json:"nestedValue1"`
   NestedValue2 string `json:"nestedValue2"`
  }{
   NestedValue1: "foo",
   NestedValue2: "bar",
  },
 }

Techniques for working with slices

Filtering without memory allocation

To avoid reallocation, we use the fact that the slice a[:0] points to the same array and has the same capacity as the original slice a.

b := a[:0]
for _, x := range a {
    if f(x) {               // f() is some filtering predicate
        b = append(b, x)
    }
}

b := a[:0] creates a slice b of zero length but with the same backing array and capacity as the slice a.

For element types that contain pointers (so the discarded items can be garbage collected), you can additionally clear the tail of the original slice:

for i := len(b); i < len(a); i++ {
    a[i] = nil
}
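
Here is a compact runnable sketch of the technique (keepEven is an arbitrary example predicate): the filtered result reuses the original backing array, so no new slice is allocated.

package main

import "fmt"

func keepEven(x int) bool { return x%2 == 0 }

func main() {
	a := []int{1, 2, 3, 4, 5, 6}

	b := a[:0] // shares a's backing array; length 0, capacity 6
	for _, x := range a {
		if keepEven(x) {
			b = append(b, x) // writes into a's backing array, never reallocates
		}
	}

	fmt.Println(b) // [2 4 6]
	fmt.Println(a) // [2 4 6 4 5 6] - the original is overwritten in place
}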

Reversing a slice

To reverse the elements of a slice in place, you can do this:

for i := len(a)/2-1; i >= 0; i-- {
    opp := len(a)-1-i
    a[i], a[opp] = a[opp], a[i]
}

The same thing, only with two indexes:

for left, right := 0, len(a)-1; left < right; left, right = left+1, right-1 {
    a[left], a[right] = a[right], a[left]
}

Shuffling a slice

You can shuffle the slice elements like this, using the Fisher-Yates algorithm and math/rand:

for i := len(a) - 1; i > 0; i-- {
    j := rand.Intn(i + 1)
    a[i], a[j] = a[j], a[i]
}

In general, starting with Go 1.10, we can use a ready-made function math/rand.Shuffle:

rand.Shuffle(len(a), func(i, j int) { a[i], a[j] = a[j], a[i] })

Creating batches with minimal resource allocation

This is useful if you want to batch process large slices.

actions := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
batchSize := 3
batches := make([][]int, 0, (len(actions) + batchSize - 1) / batchSize)

for batchSize < len(actions) {
    actions, batches = actions[batchSize:], append(batches, actions[0:batchSize:batchSize])
}
batches = append(batches, actions)

// [[0 1 2] [3 4 5] [6 7 8] [9]]

A little more detail about what's going on here:

  1. Declaring a slice actions with integers from 0 to 9.

  2. Set the batch size (sub-slice size) batchSize equal to 3.

  3. Create an empty slice batches for storing the batches. We set its capacity to (len(actions) + batchSize - 1) / batchSize to make sure there is enough room for all of them.

  4. Start the for loop, which runs as long as batchSize is less than the length of actions. Inside the loop:

    • Update the slice actions, discarding the first batchSize elements with actions[batchSize:].

    • Update the slice batches by appending a new batch made from the first batchSize elements of actions via append(batches, actions[0:batchSize:batchSize]); the three-index slice caps the batch's capacity, so later appends to a batch cannot overwrite the shared backing array.

  5. After the loop, when len(actions) is less than or equal to batchSize, add the remaining elements to batches with append(batches, actions).

Throwing out duplicates (deduplication)

import "sort"

in := []int{3,2,1,4,3,2,1,4,1} // любой элемент можно отсортировать
sort.Ints(in)
j := 0
for i := 1; i < len(in); i++ {
	if in[j] == in[i] {
		continue
	}
	j++
	// сохраняем исходные данные
	// in[i], in[j] = in[j], in[i]
	// устанавливаем только то, что требуется
	in[j] = in[i]
}
result := in[:j+1]
fmt.Println(result) // [1 2 3 4]

A little more detail:

  1. Sort the slice in by using sort.Ints(in) in ascending order.

  2. We create j to track the position where we will place the next unique element.

  3. Inside the loop we check whether the current element equals in[j]. If it does, we skip the iteration with continue, without changing j.

  4. If the current element is not equal to in[j], we increase j by 1 and copy the current element into position in[j]. That means in[j] now holds the next unique value, and j marks the end of the deduplicated prefix.

This way we end up with all the unique values from in in the slice result.

Moving an element to the front, or inserting it if missing

// moveToFront moves needle to the front of the haystack slice
func moveToFront(needle string, haystack []string) []string {
    if len(haystack) != 0 && haystack[0] == needle {
        return haystack
    }
    prev := needle
    for i, elem := range haystack {
        switch {
        case i == 0:
            haystack[0] = needle
            prev = elem
        case elem == needle:
            haystack[i] = prev
            return haystack
        default:
            haystack[i] = prev
            prev = elem
        }
    }
    return append(haystack, prev)
}

haystack := []string{"a", "b", "c", "d", "e"} // [a b c d e]
haystack = moveToFront("c", haystack)         // [c a b d e]
haystack = moveToFront("f", haystack)         // [f c a b d e]

Overall, the code is pretty transparent:

  1. If the first element of the slice is equal to needle, the function returns the slice unchanged.

  2. Otherwise, the function walks the slice in a for loop, shifting each element one position to the right, until it finds needle.

  3. When needle is found, the function writes the previously shifted element into the current position and returns the modified slice.

  4. If needle is not found, the function appends prev (the original last element, which was shifted out) to the end and returns the extended slice.

Something similar is useful in scenarios where you need to move an element to the front quickly, for example in queue- or stack-like structures where hot elements should sit at the top; such structures show up in caching systems, where frequently requested items must be quickly available.

Sliding window

We cut the slice into overlapping pieces of the required size.

func slidingWindow(size int, input []int) [][]int {
    // if the input is no longer than the window, return it as the only window
    if len(input) <= size {
        return [][]int{input}
    }

    // allocate a slice of exactly the size we need
    r := make([][]int, 0, len(input)-size+1)

    for i, j := 0, size; j <= len(input); i, j = i+1, j+1 {
        r = append(r, input[i:j])
    }

    return r
}

a := []int{1, 2, 3, 4, 5}
aa := slidingWindow(3, a)
fmt.Println(aa) // [[1 2 3] [2 3 4] [3 4 5]]

This sliding-window slicing is used a lot in signal processing, time-series analysis, and especially in ML.
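
As a tiny illustration of the time-series use case, here is a sketch of my own (movingAverage is a made-up helper) that builds on the slidingWindow function defined above and assumes fmt is imported: it averages each window, giving a simple smoothed series.

// movingAverage averages each window produced by slidingWindow,
// which is a basic smoothing step for a series of measurements.
func movingAverage(size int, input []int) []float64 {
    windows := slidingWindow(size, input)
    out := make([]float64, 0, len(windows))
    for _, w := range windows {
        sum := 0
        for _, v := range w {
            sum += v
        }
        out = append(out, float64(sum)/float64(len(w)))
    }
    return out
}

fmt.Println(movingAverage(3, []int{1, 2, 3, 4, 5})) // [2 3 4]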

The end

Well, we looked at useful features and techniques in Go. There are some things I use regularly and I hope they will make your life easier too.

Write what you have encountered in practice – it will be interesting to read)
