Why iterators in Go 1.23 are poorly designed

NOTE: This post is an adaptation of the following tweet (but is completely self-contained): https://x.com/TheGingerBill/status/1802645945642799423

TL;DR Go language now it's perceived as too much “functional” rather than so shamelessly imperative language.

Recently I came across twitter A post demonstrating how iterators will be designed in Go 1.23 (this version will be released in August 2024). I get the impression that many in the community do not like this innovation. I decided to speak on this matter from the point of view of a language designer.

United pull request for this offer is here: https://github.com/golang/go/issues/61897

It explains in detail and in depth why certain decisions were made when designing the language and not others, so I recommend you read it if you are familiar with Go.

Here's an example from the original tweet I found at the time:

func Backward[E any](s []E) func(func(int, E) bool) {
	return func(yield func(int, E) bool) {
		for i := len(s)-1; i >= 0; i-- {
			if !yield(i, s[i]) {
				// Здесь идёт код очистки
				return
			}
		}
	}
}

s := []string{"a", "b", "c"}
for _, el in range Backward(s) {
	fmt.Print(el, " ")
}
// c b a

In this example it is quite clear What is done here, but the whole structure of the example seems a little crazy to me, at least for most common practical cases.

As far as I can tell, the code will then be transformed and take approximately the following form:

Backward(s)(func(_ int, el string) bool {
	fmt.Print(el, " ")
	return true // `возврат false` был бы равноценен явному `break`
})

In this way, Go iterators are much closer to the “for each” methods used in some languages ​​(e.g. .forEach() in JavaScript), for example, in that a callback is passed to such a method. Moreover, here’s what’s interesting: this approach was already possible in Go before version 1.23, but there was still no syntactic sugar that would allow it to be used inside an operator for range.

I'll try to summarize the rationale behind the introduction of iterators in Go 1.23, but it seems that the developers were trying to achieve the following while minimizing the undesirable consequences of these factors:

  • Make iterators look/act like generators from other languages ​​(hence see yield)

  • Minimize the need to share large numbers of stack frames

  • Provide cleaning using defer

  • Reduce instances of storing data outside the control flow

This is how it all explains Russ Cox (rsc) in the original sentence:

A note about the two types of iterators: push vs pull. In the vast majority of cases, push iterators are more convenient to implement and use, since they can be built and demolished in the context of calls yield. This is much better than having to implement them as separate operations and then provide them to the caller. When using push iterators directly (including with a loop range) has to avoid storing any data on the control thread, so some clients may need pull-up iterators from time to time. In any such code it is easy to call Pull and put it aside stop.

Russ Cox explores this situation in more detail in his article Storing Data in Control Flow, telling why he likes this approach to design.

More complicated example

Note: Don't try to understand what exactly this code does. With this example, I just want to show how cleanup is organized in code where something like defer is used.

The example from the original pull request illustrates a much more complex approach: here cleanup is required when working with directly pulled up meanings:

// Pairs возвращает итератор, перебирающий последовательные пары значений из данной последовательности.
func Pairs[V any](seq iter.Seq[V]) iter.Seq2[V, V] {
	return func(yield func(V, V) bool) bool {
		next, stop := iter.Pull(it)
		defer stop()
		v1, ok1 := next()
		v2, ok2 := next()
		for ok1 || ok2 {
			if !yield(v1, v2) {
				return false
			}
		}
		return true
	}
}

Alternative pseudo-sentence (with state machine)

Note: I don't think Go does this at all.

When designing Odin I wanted to give the user the ability to design their own “iterators”, but to keep these elements very simple – in fact, ordinary procedures. I didn't want to add a special construct to the language strictly for this purpose. This would make the language overly complex, which is exactly the kind of complexity I wanted to minimize in Odin.

For example, as a pseudo-suggestion, I would try giving Go iterators the following capability:

func Backward[E any](s []E) func() (int, E, bool) {
	i := len(s)-1
	return func(onBreak bool) (idx int, elem E, ok bool) {
		if onBreak || !(i >= 0) {
			// Если у нас есть код очистки, то он идёт здесь 
			return
		}
		idx, elem, ok = i, s[i], true
		i--
		return
	}
}

This pseudo-sentence would work something like this:

for it := Backward(s);; {
	_, el, ok := it(false)
	if !ok {
		break // it(true) не требуется вызывать, поскольку был вызван вариант `false` 
	}

	fmt.Print(el, " ")
}

This is roughly how I operate in Odin: BUT Odin does not support stack-frame-scope closures, only procedural literals with out-of-scope capture. Since Go has garbage collection, I can't see why they would be needed for this. In this respect, Odin differs from Go mainly in that it does not attempt to unify these ideas into a single design.

I know that for some this approach seems much more complicated than usual. Here we do exactly the opposite of what Cox prefers – he suggests storing data in the control flow, and we store it outside the control flow. But this is how I usually want to use an iterator, and not the way they plan to implement it in Go. This is the problem: with my approach, we sacrifice the beauty of storing data in the control flow, and therefore the convenient distinction between push and pull operations that Cox talks about.

Note: I am an imperative programmer by nature, I like to see exactly how certain things are done. For me, this is more important than building “outwardly elegant” code. Therefore, the approach that I describe above is fundamentally the following: we think, first of all, about how things will be done.

By the way, the typeclass/interface route will not work in Go, since from a design point of view such a concept is not independent and, in fact, only introduces unnecessary confusion. That's why I didn't offer it in the first place. Different languages ​​have different requirements, so what works in some may not work in others.

What philosophy is evident in Go?

Apparently, the approach taken in Go 1.23 is in line with explicit philosophy to target Go at the typical (to be honest, mediocre) programmer working at Google. Such a programmer does not want (and cannot) use a “complicated” language such as C++.

I will quote here Rob Pike:

The key point here is this: our programmers are Googlers, not researchers. That is, they are usually quite young, recent graduates have probably learned Java, maybe C or C++, and could also have learned Python. They can't understand high-end language, but we want them to make good software for us. Therefore, the language that we give them should be easily understandable for them and such that it is easy to adopt.

I know many will be offended by such a remark, but upscale designing a language requires authors to understand who they are making the language for. This is not an insult, but simply a statement of fact: Go was originally intended for current and former Google employees, as well as for representatives of similar companies. You may be “skillier and more capable” than the typical Google employee, but that doesn't matter. It's clear why people like Go: the language is simple, straightforward, and most programmers can get up to speed with it very quickly.

But if you look at it, the iterators above are designed quite in the spirit of Go, which should be especially obvious to members of the Go team like Russ Cox (assuming that he is the author of the original version of the proposal in question). With this approach, Go becomes much more complex and even seems “magical”. I understand how iterators work since literally I am part of those responsible for designing the language and implementing the compiler. Another problem with this approach is that it is not very performant because it requires closures or callbacks.

Perhaps this choice in design is due to the fact that from the average Go programmernot required implement iterators, but just be able to use them enjoy. However, most of the iterators that you might need in practice will already be available in the Go standard library or will be provided in some third-party packages. Therefore, the main burden falls on the shoulders authorbut not user package.

I think that's why there are so many people who are “angry” about how this version is designed. According to many, this approach goes against everything Go was originally “intended” for. In addition, this option may be seen as an overcomplicated “hodgepodge”. I see the beauty in the fact that it looks like a generator with yield that also supports code inlining, but I admit that this is not what many people who are accustomed to this language expect from Go. In Go, much of the magic is hidden behind the scenes, most notably garbage collection, goroutines, the select statement, and many other constructs. But I see that Go is going a bit overboard with disclosure this magic to the user, because for the average Go programmer such constructions still look too complicated.

Another aspect that confuses many is related to funcreturning something like this func, which accepts func as an argument. And also with the fact that the body for range converted to func and that's all break (and other constructs outside of control flow) are converted to return false. There are simply three levels of procedures in depth, but a language with this design is perceived more as functional than imperative.

Note: I’m not suggesting replacing the current iterator structure with the structure I’m describing here, but pointing out that it’s unlikely that such a generalized approach to iterators will take root well in Go. In my opinion, Go is an undeniably imperative language, and its first-class constructs are about the same as those in Csharp. It does not pretend to be a functional language. Iterators are in a weird place in Go because they are exactly where iterators are supposed to be in imperative languages, but conceptually they are very “functional”. In functional languages, iterators can be very elegant, but in overtly imperative languages ​​they will always seem strange. The point is exactly inthat they are unified as a separate construct, and do not serve as separators in the construct (initialization + iterator + elimination).

An aside: the Odin approach

As I hinted above, in Odin an iterator is just a procedure call, where the last of many return values ​​is a boolean and indicates whether to continue or not. Because Odin doesn't support closures, you have to write extra code to implement the equivalent of Go's Backward iterator.

Note: before you start saying “it has become even more difficult,” it’s better to scroll further. Most iterators in Odin are designed differently, and I would never recommend writing such an iterator in cases where a trivial for loop would suffice. Such a loop would be more convenient for everyone – those who write the code and those who read it.

// Структура, в которой явно описано состояние
Backward_Iterator :: struct($E: typeid) {
	slice: []E,
	idx:   int,
}

// Конструкция, явно описывающая итератор
backward_make :: proc(s: []$E) -> Backward_Iterator(E) {
	return {slice = s, idx = len(s)-1}
}

backward_iterate :: proc(it: ^Backward_Iterator($E)) -> (elem: E, idx: int, ok: bool) {
	if it.idx >= 0 {
		elem, idx, ok = it.slice[it.idx], it.idx, true
		it.idx -= 1
	}
	return
}


s := []string{"a", "b", "c"}
it := backward_make(s)
for el, _ in backward_iterate(&it) { // можно было бы написать и `for el in` 
	fmt.print(el, " ")
}
// c b a

This code is actually Seems It's much more complex than the Go approach because there's a lot more writing involved. But, at the same time, it is much easier to understand, comprehend, and even faster to execute. The iterator does not call the body of the for loop; on the contrary, the iterator is called in the loop body itself. I know how much Cox likes the opportunity store data in control flow and I agree – it's really cute. But this approach does not fit well with Odin, in particular because Odin does not have closures. But there are no Closures in Odin, since memory management in this language is done manually.

The “iterator” in this case is simply syntactic sugar to perform the following operation:

for {
	el, _, ok := backward_iterate(&it)
	if !ok {
		break
	}
	fmt.print(el, " ")
}

// с `or_break`

for {
	el, _ := backward_iterate(&it) or_break
	fmt.print(el, " ")
}

Odin simply takes away all the magic and makes it crystal clear what exactly is going on. “Construction” and “destruction” must be done manually, there are clearly procedures for this. The iteration itself is also a simple procedure called on each pass of the loop. All three constructs are processed separately, rather than merging into one confusing tangle, as is done in Go 1.23.

Odin, unlike Go, does not hide its magic. In Odin, you have to manually handle “closure-like” values ​​while simultaneously constructing and destroying the “iterator” itself.

Plus, with Odin's approach, you can easily make as many return values ​​as you want! A good example of this is the Odin package core:encoding/csvwhere Reader can be treated as an iterator:

// core:encoding/csv
iterator_next :: proc(r: ^Reader) -> (record: []string, idx: int, err: Error, more: bool) {...}
// Пользовательский код
for record, idx, err in csv.iterator_next(&reader) {
	...
}

Another digression: iterators in C++

In this article, I will try not to be too critical of the “iterators” used in C++. In C++ such constructions are far from just iteratorswhereas with the Go approach it is still just iterators. I fully understand why “iterators” in C++ work the way they do, but 99.99% of the time I need just an iteratorand not a structure with a full set of algebraic properties and simply universal applicability.

Those who are not very familiar with C++ may think of an iterator as a specialized class/structure that must be accompanied by overloaded operators and act as a “pointer”. Historically, an “iterator” in C++ looked like this:

{
	auto && __range = range-expression ;
	auto __begin = begin-expr ;
	auto __end = end-expr ;
	for ( ; __begin != __end; ++__begin) {
		range-declaration = *__begin;
		loop-statement
	}
}

In addition, it was wrapped in a macro until C++11 introduced loop syntax and auto.

The biggest problem is that when working with “iterators” in C++, you need to define at least 5 different operations.

First, the following three operator overloads:

And also two independent procedures, they are also related methods that return a value iterator:

If I were designing for C++ regular iterators, then I would just add a simple method to the struct/class called iterator_next or something like that. That's all. Yes, in this case all other algebraic properties would be lost, but, honestly, I do not need them AT ALL to work on any problems that I encounter. When I do problems of this kind, I always have to either work with a contiguous array or implement the algorithm manually, since I want to guarantee high performance for this data structure. True, I wrote my own language (Odin) precisely because I categorically disagree with the entire C++ philosophy and want to stay away from this madness.

“Iterators” in C++ are much more complex than iterators in Go, but when used locally they are much more straightforward. At least in Go you don't have to construct a type with five different properties.

Conclusion

I think Go iterators are truly appropriate in the context for which they were designed, But Seems, as if they contradict the ideas about Go that most specialists in this language have already formed. Yes, I know that Go “would have” to become more complex over the years, especially after generics appeared in the language. By the way, I think generics in Go are designed well, there are only a few syntactic rough edges, but adding iterators of the same kind seems like a mistake to me.

To summarize, I note that all this, in my opinion, clearly contradicts the philosophy of Go, to which many are already accustomed, and, in addition, forces the programmer to work in a distinctly functional, rather than imperative, style.

I think it is for these reasons that neither I nor many others like all this baggage with iterators in Go, although I completely understand why the language developers made the decisions they did. It’s just that this Go seems to many to be very far from the original.

Maybe my friends and I are exaggerating too much, and most people will never have to implement or even use iterators, so it's not a big deal that iterators are so complicated in practice.

The penultimate, non-disputable thesis: maybe Go needs to be even stricter, ask the “functionalists” to get out and stop requesting features that only make Go unnecessarily complicated.

One last point that is not controversial: if it were up to me, I wouldn’t allow custom iterators in Go at all, but I’m not on the Go development team and I’m not trying to be there.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *