C# 8 and nullability: how do we live with it?

Hello colleagues!

The time has come to mention that we plan to release Ian Griffiths's fundamental book on C# 8.

Meanwhile, the author has published two related articles on his blog in which he examines the subtleties of new concepts such as “nullability”, “null-obliviousness” and “null-awareness”. We have translated both articles under one heading and invite you to discuss them.

The most ambitious new feature in C# 8.0 is nullable references.

The purpose of this feature is to limit the damage caused by a dangerous thing that the computer scientist Tony Hoare once called his “billion-dollar mistake”. C# has the keyword null (with equivalents in many other languages), and its roots can be traced back to the language ALGOL W, which Hoare helped to design. In this ancient language (it appeared in 1966), variables referring to instances of a certain type could hold a special value indicating that the variable currently refers to nothing. This capability was very widely borrowed, and today many experts (including Hoare himself) believe that it has become the biggest source of costly software errors of all time.

What is wrong with nullability? In a world where any reference may be null, you have to take this into account wherever your code uses references, otherwise you risk a failure at runtime. Sometimes this is not too burdensome: if you initialize a variable with a new expression where you declare it, you know that the variable is not null. But even such a simple example involves some cognitive load: before C# 8, the compiler could not tell you whether you were doing anything that might set this value to null. And as soon as you start stitching different pieces of code together, it becomes much harder to judge such things with certainty: how likely is it that the property I am reading right now can return null? Is it permissible to pass null to that method? In which situations can I be sure that the method I am calling will set this out argument to something other than null? Moreover, the matter is not limited to remembering to check such things; it is not always clear what you should do if you do run into null.

With numeric types in C# there is no such problem: if you write a function that takes some numbers as input and returns a number as a result, you do not have to wonder whether the values passed in are really numbers, or whether something else might be mixed in among them. When calling such a function, there is no need to think about whether it might return something other than a number. Unless, that is, you want this as an option: in that case, you can declare parameters or results of type int?, indicating that in this particular case you really do want to allow a null value to be passed or returned. So for numeric types and, more generally, value types, nullability has always been something you opt in to.
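As a minimal sketch of this opt-in behavior for value types (the class and method names below are hypothetical, chosen only for illustration), a method can return int? to signal that there may be no number, while a plain int can never be null:

```csharp
using System;

public static class OptionalNumbers
{
    // Hypothetical helper: returns null when the text is not a valid integer.
    public static int? ParseOrNull(string text) =>
        int.TryParse(text, out int value) ? value : (int?)null;

    // A plain int can never be null, so no check is needed here.
    public static int Twice(int n) => n * 2;

    // The caller has to deal with the "no value" case before using the number.
    public static int TwiceOrZero(string text)
    {
        int? parsed = ParseOrNull(text);
        return parsed.HasValue ? Twice(parsed.Value) : 0;
    }
}
```

The compiler will not let you pass the int? result straight to Twice; you have to unwrap it first, which is exactly the kind of check that nullable references bring to reference types.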

As for reference types, before C# 8.0 nullability was not only the default; it could not be turned off at all.

In fact, for reasons of backward compatibility, nullability remains the default even in C# 8.0, since the new language features in this area stay disabled until you explicitly ask for them.

However, as soon as you enable this new feature, everything changes. The easiest way to activate it is to add <Nullable>enable</Nullable> inside a <PropertyGroup> element in your .csproj file. (Note that finer-grained control is also available. If you really need it, you can configure nullability behavior separately for each line. But when we recently decided to enable this feature in all our projects, it turned out that enabling it one project at a time was entirely feasible.)

When nullable references are fully enabled in C# 8.0, the situation changes: references are now assumed to be non-null by default unless you specify otherwise, just as with value types. Even the syntax is the same: you could always write int? if you wanted an integer value to be optional, and now you write string? if you want either a reference to a string or null.
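A minimal sketch of what this looks like in code, assuming a C# 8.0 compiler (the Greeter class is hypothetical and exists only to illustrate the two kinds of declaration):

```csharp
#nullable enable
using System;

public static class Greeter
{
    // 'string' now means "never null"; passing a possibly-null value here
    // produces a compiler warning at the call site.
    public static int NameLength(string name) => name.Length;

    // 'string?' means "a string or null", so the method handles both cases.
    public static string Greet(string? name) =>
        name is null ? "Hello, stranger" : $"Hello, {name}";
}
```

Note that these are compile-time warnings only: nothing changes at runtime, and a null can still reach NameLength from code that ignores or suppresses the warning.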

This is a very significant change, and it is precisely because of its significance that the feature is disabled by default. Microsoft could have designed this language feature differently: it could have left references nullable by default and introduced new syntax for stating that null is not allowed. Perhaps this would have lowered the barrier to adopting the feature, but in the long run it would have been the wrong decision, since in practice most references in the huge mass of existing C# code are not meant to be null.

Nullability is the exception, not the rule, and that is why non-nullable becomes the new default when this language feature is enabled. This is reflected even in the feature's name: “nullable references”. The name is curious, given that references have been able to be null ever since C# 1.0. But the designers chose to emphasize that nullability now becomes something you have to request explicitly.

C# 8.0 eases the adoption of nullable references, since it lets you introduce the feature gradually. You do not have to make a yes-or-no choice. This is very different from the async/await feature added in C# 5.0, which tended to spread: asynchronous operations effectively oblige the caller to be async, which means that the code calling that caller must be async too, and so on to the very top of the stack. Fortunately, nullable types work differently: they can be introduced selectively and gradually. You can work through the files one by one, or even line by line if necessary.

The most important aspect of nullable types (the one that makes the transition to them easier) is that they are disabled by default. Otherwise, most developers would refuse to use C# 8.0, since such a transition would cause warnings in almost any code base. For the same reason, however, the barrier to adopting this feature feels rather high: if a new feature makes such dramatic changes that it is disabled by default, you may well prefer not to touch it, and the problems associated with switching to it will always seem like unnecessary hassle. That would be a shame, because the feature is very valuable. It helps you find bugs in your code before your users do it for you.

So, if you are considering adopting nullable types, keep in mind that you can introduce this feature step by step.

Warnings only

The coarsest level of control over an entire project, beyond a simple on/off switch, is the ability to enable warnings independently of annotations. For example, if I fully enable nullability for Corvus.ContentHandling.Json in our Corvus.ContentHandling repository by adding <Nullable>enable</Nullable> to a property group in the project file, then in its current state 20 compiler warnings immediately appear. However, if I use <Nullable>warnings</Nullable> instead, I get just one warning.
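For reference, here is a sketch of the two project-level settings being compared, assuming an SDK-style project that already compiles as C# 8.0. Only the relevant fragment of the .csproj file is shown, and the rest of the project file is an assumption left out here:

```xml
<!-- Fragment of a .csproj file; the surrounding Project element is omitted. -->
<PropertyGroup>
  <!-- "enable" turns on both nullable annotations and nullable warnings. -->
  <!-- <Nullable>enable</Nullable> -->

  <!-- "warnings" turns on the warnings only; declarations stay null-oblivious. -->
  <Nullable>warnings</Nullable>
</PropertyGroup>
```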

But wait! Why am I shown fewer warnings? After all, warnings are exactly what I asked for. The answer is not entirely obvious: some variables and expressions can be null-oblivious.

Null-obliviousness

C# supports two interpretations of nullability. First, any variable of a reference type can be declared as nullable or non-nullable; second, where possible, the compiler infers whether a variable can be null at any particular point in the code. This article deals only with the first kind of nullability, that is, with the static type of a variable. (In fact, this applies not only to variables and their close relatives, parameters and fields; both static and inferred nullability are defined for every expression in C#.) Nullability in this first sense, the one we are considering, is effectively an extension of the type system.

However, it turns out that if we focus only on the nullability of a type, the picture is not as tidy as one might assume. It is not just a contrast between “nullable” and “non-nullable”. In fact, there are two more possibilities. There is an “unknown” category, which is required because of generics: if you have an unconstrained type parameter, it is impossible to know anything about its nullability, because code that uses the generic method or type can supply either a nullable or a non-nullable type argument. You can add constraints, but such constraints are often undesirable, since they narrow the applicability of the generic type or method. So for variables or expressions of some unconstrained type parameter T, nullability must be unknown; in each particular case their nullability will be settled separately, but in the generic code we cannot know which way, since it depends on the type argument.
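A small sketch of the “unknown” case, using a hypothetical generic helper: inside the method nothing can be said about whether T is nullable, because that is decided only by each caller.

```csharp
#nullable enable
using System.Collections.Generic;

public static class Fallbacks
{
    // T is unconstrained, so inside this method the compiler cannot know
    // whether T is a nullable reference type, a non-nullable one, or a value type.
    public static T FirstOrFallback<T>(IEnumerable<T> items, T fallback)
    {
        foreach (T item in items)
        {
            return item;
        }
        return fallback;
    }
}
```

A caller can instantiate it as FirstOrFallback<string>(...), where the result is never null, or as FirstOrFallback<string?>(...), where it may be; the generic code itself is the same in both cases.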

The last category is called “oblivious”. This is how everything worked before C# 8.0, and how things still work if you do not enable nullable references. (Essentially, this is an example of retroactive continuity. Even though the idea of null-obliviousness was first introduced in C# 8.0 as the natural state of code before nullable references are enabled, the C# designers insist that this property was never really alien to C#.)

Perhaps there is no need to explain what “oblivious” means here, since this is how C# has always worked, so you understand it already... although perhaps that is a little disingenuous. So, to spell it out: in a nullable-aware world, the most important characteristic of null-oblivious expressions is that they do not cause nullability warnings. You can assign a null-oblivious expression to either a nullable or a non-nullable variable. And null-oblivious variables (as well as properties, fields, and so on) can be assigned expressions that the compiler considers either “maybe null” or “not null”.

That is why turning on warnings alone does not produce many new warnings. All the code remains in a disabled nullable annotation context, so all variables, parameters, fields and properties are null-oblivious, which means that you get no warnings when you use them together with any null-aware entities.

Why, then, do I get any warnings at all? A common reason is an attempt to combine two pieces of null-aware code in an incompatible way. For example, suppose I have a library in which nullable references are enabled, and this library contains the following deeply contrived class:

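using System;

public static class NullableAwareClass
{
    // Returns null in the afternoon, so the return type is annotated as nullable.
    public static string? GetNullable() => DateTime.Now.Hour > 12 ? null : "morning";

    // Expects a non-null argument and dereferences it without a check.
    public static int RequireNonNull(string s) => s.Length;
}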

Then, in another project, I can write this code in a context where nullable warnings are enabled but nullable annotations are disabled:

static int UseString(string x) => NullableAwareClass.RequireNonNull(x);

Since nullable annotations are disabled, the parameter x here is null-oblivious. This means that the compiler cannot determine whether this code is correct or not. If the compiler issued warnings whenever null-oblivious expressions were mixed with null-aware ones, a significant proportion of those warnings would be spurious, so no warnings are issued.

With this wrapper, I have effectively hidden the fact that the code is null-aware. This means that I can now write this:

	int x = UseString(NullableAwareClass.GetNullable());

The compiler knows that GetNullable can return null, but since I called a method with a null-oblivious parameter, it does not know whether this is right or wrong. By using a null-oblivious wrapper, I have disarmed the compiler, which no longer sees a problem here. However, if I combined the two methods directly, things would be different:

int y = NullableAwareClass.RequireNonNull(NullableAwareClass.GetNullable());

Here I pass the result of GetNullable straight into RequireNonNull. If I tried this in a context where nullable warnings are enabled, the compiler would generate a warning, regardless of whether the annotation context is enabled or disabled. In this particular case the annotation context does not matter, since there are no declarations with a reference type. If you enable nullable warnings but disable annotations, all declarations become null-oblivious, but that does not mean that all expressions do. Here, we know that the result of GetNullable is nullable. Therefore, we get a warning.

In summary: since all declarations in a disabled nullable annotation context are null-oblivious, we will not get many warnings, because most expressions will be null-oblivious. But the compiler can still catch nullability errors in cases where expressions do not pass through some null-oblivious intermediary. Moreover, the greatest benefit here comes from detecting attempts to dereference potentially null values with ., for example:

int z = NullableAwareClass.GetNullable().Length;

If your code is well designed, there should not be many errors of this kind.
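When such a warning does appear, the usual remedy is to handle the null case before dereferencing. The following is only a sketch of two common options, using a hypothetical stand-in method rather than the class from the example above:

```csharp
#nullable enable

public static class SafeLength
{
    // Hypothetical stand-in for a null-aware API that may return null.
    public static string? MaybeGetText(bool hasText) => hasText ? "morning" : null;

    // Option 1: null-conditional access plus a fallback value.
    public static int LengthOrZero(bool hasText) =>
        MaybeGetText(hasText)?.Length ?? 0;

    // Option 2: an explicit check; after it the compiler infers "not null".
    public static int LengthOrMinusOne(bool hasText)
    {
        string? text = MaybeGetText(hasText);
        if (text is null)
        {
            return -1;
        }
        return text.Length;
    }
}
```

Which fallback is appropriate (zero, a sentinel, an exception) depends on the calling code; the point is only that the decision becomes explicit.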

Gradual annotation of the entire project

Once you have taken the first step and enabled warnings, you can move on to enabling annotations gradually, file by file. A convenient approach is to enable them for the entire project at once, see which files produce warnings, and then pick a file with relatively few of them. Disable annotations again at the project level, and write #nullable enable at the top of the file you picked. This fully enables nullability (both warnings and annotations) for the entire file (unless you turn it off again with another #nullable directive). Then you can go through the file and make sure that everything that may be null is annotated as nullable (that is, add ?), and then deal with any warnings that remain in this file.
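A sketch of what a single annotated file might look like after this step (the Customer class is hypothetical and serves only as an illustration):

```csharp
// Full nullable context (annotations and warnings) from here to the end of the file.
#nullable enable

public class Customer
{
    // Always set by the constructor, so it stays non-nullable.
    public string Name { get; }

    // Genuinely optional, so it is annotated with '?'.
    public string? MiddleName { get; }

    public Customer(string name, string? middleName = null)
    {
        Name = name;
        MiddleName = middleName;
    }

    public string DisplayName =>
        MiddleName is null ? Name : $"{Name} ({MiddleName})";
}
```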

It may turn out that adding all the necessary annotations is all it takes to eliminate every warning. The reverse is also possible: once you have carefully annotated one file for nullability, you may notice that new warnings have surfaced in other files that use it. As a rule, there are not many such warnings, and you can fix them quickly. But if for some reason you find yourself drowning in warnings after this step, you still have a couple of options. First, you can simply back out, leave this file alone and pick another one. Second, you can selectively turn off annotations for the members that you think are causing the most trouble. (You can use the #nullable directive as many times as you like, so you can control nullability settings even line by line if you want.) Perhaps, if you return to this file later, once nullability is enabled in most of the project, you will see fewer warnings than the first time.
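As a sketch of the second option, annotations can be switched off around a single troublesome member and switched back on afterwards. The class and its members are hypothetical. Note that #nullable restore would return to the project-level setting rather than to the file's own #nullable enable, so this sketch re-enables annotations explicitly:

```csharp
#nullable enable
using System;

public static class LegacyInterop
{
    // Reviewed and annotated: 'text' must not be null.
    public static int CountWords(string text) =>
        text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;

#nullable disable annotations
    // Not yet reviewed: with annotations disabled, 'string' here is null-oblivious,
    // so neither this method nor its callers get nullability warnings for it.
    public static string LegacyLookup(string key) => key == "known" ? "value" : null;
#nullable enable annotations
}
```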

There are times when problems cannot be solved in such a straightforward way. In certain serialization-related scenarios (for example, when using Json.NET or Entity Framework), the work may be harder. I think this problem deserves a separate article.

Nullable references improve the expressiveness of your code and increase the chances that the compiler will catch your errors before your users run into them. So it is best to enable this feature if you can. And if you enable it selectively, you will start to feel the benefits sooner.
