The type system is a programmer’s best friend

I’m tired of the excessive use of primitive types to model the domain.
A string value is not the best type for recording a user’s email address or country of residence. These values deserve much richer and more specialized types. I want an EmailAddress data type that cannot be null. I want a single entry point for creating a new object of this type, and the value must be validated and normalized before a new instance is returned. I want this data type to have useful methods like .Domain() or .NonAliasValue(), which for the input foo+bar@gmail.com would return gmail.com and foo@gmail.com respectively. This functionality should be built into the type itself. That provides safety, helps prevent bugs, and greatly improves maintainability.
Carefully designed types with useful functionality motivate the programmer to do the right thing.
For example, EmailAddress can have two methods for checking equality:
Equals would return true if two (normalized) email addresses are identical.
EqualsInPrinciple would also return true for foo@gmail.com and foo+bar@gmail.com.
Such type-specific methods would be extremely useful in many cases. A user should not be denied login because they registered as jane@gmail.com and are now trying to log in as Jane@gmail.com. It is also very convenient to be able to match a user who contacted support from their main address (foo@gmail.com) with the account they registered as foo+svc@gmail.com. These are typical requirements that a simple string cannot meet without a bunch of extra logic scattered around the codebase.
Note: according to the official RFC, the part of the email address before the @ symbol can be case sensitive, but all popular hosts treat it as case insensitive, so it makes sense to build this knowledge into the domain type.
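To make the idea concrete, here is a minimal sketch of what such an EmailAddress type could look like. It is an illustration under stated assumptions rather than a finished implementation: the name Parse for the single entry point is my own choice, the validation is deliberately simplistic (it only checks for an @ with something on both sides), and lower-casing the whole address follows the note above about popular hosts.

```csharp
using System;

// Hypothetical sketch: validation rules are intentionally minimal.
public sealed class EmailAddress : IEquatable<EmailAddress>
{
    private readonly string _local;   // part before '@', lower-cased
    private readonly string _domain;  // part after '@', lower-cased

    // Private constructor: Parse is the single entry point.
    private EmailAddress(string local, string domain)
    {
        _local = local;
        _domain = domain;
    }

    // Validates and normalizes the input before returning a new instance.
    public static EmailAddress Parse(string input)
    {
        if (input is null) throw new ArgumentNullException(nameof(input));
        var trimmed = input.Trim();
        var at = trimmed.LastIndexOf('@');
        if (at <= 0 || at == trimmed.Length - 1)
            throw new FormatException("Not a valid email address.");
        return new EmailAddress(
            trimmed.Substring(0, at).ToLowerInvariant(),
            trimmed.Substring(at + 1).ToLowerInvariant());
    }

    public string Domain() => _domain;

    // Drops a "+alias" suffix from the local part:
    // foo+bar@gmail.com becomes foo@gmail.com.
    public string NonAliasValue()
    {
        var plus = _local.IndexOf('+');
        var local = plus <= 0 ? _local : _local.Substring(0, plus);
        return local + "@" + _domain;
    }

    // Strict equality on the normalized address.
    public bool Equals(EmailAddress other) =>
        other is not null && _local == other._local && _domain == other._domain;

    // Looser equality that ignores the "+alias" part.
    public bool EqualsInPrinciple(EmailAddress other) =>
        other is not null && NonAliasValue() == other.NonAliasValue();

    public override bool Equals(object obj) => Equals(obj as EmailAddress);
    public override int GetHashCode() => HashCode.Combine(_local, _domain);
    public override string ToString() => _local + "@" + _domain;
}
```

With this in place, EmailAddress.Parse("Jane@gmail.com") and EmailAddress.Parse("jane@gmail.com") compare as equal, and the alias-matching logic lives in exactly one place.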
Good types help prevent bugs
Ideally, I’d like to go even further. An email address can be verified or unverified. Typically, an email address is verified by sending a unique code to the inbox. Such “business” interactions can also be expressed through the type system. Let’s have a second type named VerifiedEmailAddress. If you like, it can even inherit from EmailAddress. I don’t mind either way, but make sure that a new instance of VerifiedEmailAddress can be obtained in only one place in the code, namely in the service responsible for verifying the user’s address. Suddenly the rest of the application can use this new type to prevent bugs.
Any function that sends email can rely on the safety of VerifiedEmailAddress. Imagine what would happen if the email address were stored as a simple string. The developer would first have to find and load the corresponding user account, look for some obscure flag like HasVerifiedEmail or IsActive (which, by the way, is the worst kind of flag, since it takes on more and more meanings over time), and then hope that the flag was set correctly and not wrongly initialized to true in some default constructor. There are too many opportunities for error in such a system! Using a primitive string for something that is easy to express as its own type is lazy and unimaginative programming.
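One possible way to enforce the "only one place in the code" rule in C# is a private constructor combined with a nested class, since a nested class can reach the private members of the type that contains it. The sketch below is an assumption-laden illustration: the names Verifier, TryVerify and Mailer are hypothetical, the code check is injected as a delegate, Send only returns a string instead of sending anything, and EmailAddress is reduced to a one-line stand-in so the example compiles on its own.

```csharp
using System;

// Minimal stand-in so the sketch is self-contained.
public sealed record EmailAddress(string Value);

public sealed class VerifiedEmailAddress
{
    public EmailAddress Address { get; }

    // Private constructor: code outside this type cannot create instances.
    private VerifiedEmailAddress(EmailAddress address) => Address = address;

    // Hypothetical verification service. It is nested so that it is
    // the only code able to call the private constructor above.
    public sealed class Verifier
    {
        private readonly Func<EmailAddress, string, bool> _codeMatches;

        public Verifier(Func<EmailAddress, string, bool> codeMatches) =>
            _codeMatches = codeMatches;

        public bool TryVerify(EmailAddress address, string code,
                              out VerifiedEmailAddress verified)
        {
            verified = _codeMatches(address, code)
                ? new VerifiedEmailAddress(address)
                : null;
            return verified is not null;
        }
    }
}

public static class Mailer
{
    // The parameter type documents and enforces the precondition:
    // there is no flag to look up and no way to pass an unverified address.
    public static string Send(VerifiedEmailAddress to, string body) =>
        $"to={to.Address.Value}; body={body}";
}
```

A caller that only holds an EmailAddress simply cannot call Mailer.Send; the compiler rejects it before the code ever runs.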
Extended types protect against future errors
Another great example is money! A huge number of applications express monetary values using the decimal type. Why? This type has so many problems that the choice seems incomprehensible to me. Where is the currency? In any domain that deals with people’s money there should be a specialized Money type. At a minimum it should include the currency and operator overloading (or other safeguards) to prevent silly mistakes like multiplying $100 by £20. Also, not all currencies use only two decimal places. Some, such as the Bahraini or Kuwaiti dinar, have three. If you are dealing with investments or loans in Chile, you will need four decimal places. These aspects alone are enough to justify creating a dedicated Money type, but that’s not all.
Unless your company builds everything in-house, sooner or later you will have to work with third-party systems. For example, most payment gateways send monetary values in requests and responses as integers. Integer values do not suffer from the rounding problems inherent in the float and double types, so they are preferred over floating-point numbers. The only subtlety is that the values must be passed in minor units (for example cents, pence, dirhams, groszy, kopecks, and so on). So if your program works with decimal values, you will have to constantly convert them back and forth when communicating with an external API. As mentioned earlier, not all currencies use two decimal places, so simply multiplying or dividing by 100 every time is not enough. Things can get very complicated very quickly. The situation would be much simpler if these rules were encapsulated in a single, concise type:
var x = Money.FromMinorUnit(100, "GBP");   // £1
var y = Money.FromUnit(1.50m, "GBP");      // £1.50
Console.WriteLine(y.AsUnit());             // 1.50
Console.WriteLine(y.AsMinorUnit());        // 150
To make matters worse, the formats for writing monetary amounts also differ from country to country. In the UK, ten thousand pounds and fifty pence can be written as 10,000.50, whereas in Germany ten thousand euros and fifty cents would be written as 10.000,50. Imagine the amount of money- and currency-related code that would be scattered (and possibly duplicated with small discrepancies) throughout the codebase if these rules were not kept in a single Money type.
In addition, a specialized Money type can offer many other features that greatly simplify working with monetary values:
var gbp = Currency.Parse("GBP");
var loc = Locale.Parse("Europe/London");
var money = Money.FromMinorUnit(1000050, gbp);
money.Format(loc) // ==> £10,000.50
money.FormatVerbose(loc) // ==> GBP 10,000.50
money.FormatShort(loc) // ==> £10k
Of course, modeling such a Money type takes some effort, but once it is implemented and tested, the rest of the codebase will be much safer. It will also prevent most of the bugs that would otherwise seep into the code over time. Minor aspects such as guarded initialization of a Money object through Money.FromUnit(decimal v, Currency c) or Money.FromMinorUnit(int v, Currency c) may seem insignificant, but they force later developers to think about which kind of value they have actually received from the user or an external API, and so they prevent bugs from the very beginning.
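As a rough sketch of how the minor-unit rules could be encapsulated, the following type stores every amount as an integer number of minor units and looks up the number of decimal places per currency. Several things here are my own simplifying assumptions: the currency is a plain string rather than a Currency type, the table contains only three hard-coded entries (a real one would follow ISO 4217), and only addition is implemented.

```csharp
using System;
using System.Collections.Generic;

public readonly struct Money : IEquatable<Money>
{
    // Decimal places per currency. Three entries only, for illustration.
    private static readonly Dictionary<string, int> Exponents = new()
    {
        ["GBP"] = 2, ["EUR"] = 2, ["KWD"] = 3,
    };

    private readonly long _minor;       // amount in minor units (e.g. pence)
    public string Currency { get; }

    private Money(long minor, string currency)
    {
        _minor = minor;
        Currency = currency;
    }

    // 10^exponent for the currency; throws for currencies not in the table.
    private static decimal Scale(string currency)
    {
        if (!Exponents.TryGetValue(currency, out var exponent))
            throw new ArgumentException($"Unknown currency: {currency}");
        var scale = 1m;
        for (var i = 0; i < exponent; i++) scale *= 10m;
        return scale;
    }

    public static Money FromMinorUnit(long value, string currency)
    {
        Scale(currency); // validates the currency code
        return new Money(value, currency);
    }

    public static Money FromUnit(decimal value, string currency)
    {
        var minor = value * Scale(currency);
        if (minor != decimal.Truncate(minor))
            throw new ArgumentException(
                "Value has more decimal places than the currency allows.");
        return new Money((long)minor, currency);
    }

    public decimal AsUnit() => _minor / Scale(Currency);
    public long AsMinorUnit() => _minor;

    // Mixing currencies is a programming error, so it fails loudly.
    public static Money operator +(Money a, Money b) =>
        a.Currency == b.Currency
            ? new Money(checked(a._minor + b._minor), a.Currency)
            : throw new InvalidOperationException(
                "Cannot add amounts in different currencies.");

    public bool Equals(Money other) =>
        _minor == other._minor && Currency == other.Currency;
    public override bool Equals(object obj) => obj is Money m && Equals(m);
    public override int GetHashCode() => HashCode.Combine(_minor, Currency);
}
```

Because the conversion factor comes from the currency rather than a hard-coded 100, Money.FromUnit(1.234m, "KWD").AsMinorUnit() yields 1234, while the same value in GBP is rejected as having too many decimal places.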
Thoughtful types can prevent unwanted side effects
Extended types are nice because they can be given any shape. If my article hasn’t sparked your imagination yet, let me show you another good example of how a specialized type can save a development team from wasting resources and even security bugs.
Every codebase I’ve worked with had something like string secretKey or string password as a function parameter. What can go wrong with these variables?
Imagine this code:
// Declared outside the try block so that the catch block can see it.
var userLogin = new UserLogin
{
    Username = username,
    Password = password
};
try
{
    var success = _loginService.TryAuthenticate(userLogin);
    if (success)
        RedirectToHomeScreen(userLogin);
    else
        ReturnUnauthorized();
}
catch (Exception ex)
{
    // Problem: this logs the whole userLogin object, password included.
    Logger.LogError(ex, "User login failed for {login}", userLogin);
}
The problem here is this: if an exception is thrown during authentication, the application will (accidentally) write the user’s password to the logs in clear text. Of course, code like this shouldn’t exist at all, and hopefully you’ll catch it during code review before it reaches production, but in reality it happens from time to time. The likelihood of most such bugs increases over time.
Initially the UserLogin class may have had a different set of properties, and at the first code review this snippet was probably fine. Years later, someone may have changed UserLogin so that it now contains the password in clear text. The logging call did not even show up in the diff submitted for that later review, and voilà, you have just added a security bug to your code. I am sure that every developer with several years of experience has run into problems like this sooner or later.
However, this bug could easily be avoided by adding a specialized type.
In C# (which I will use as the example), when an object is written to the log (or anywhere else), its .ToString() method is called automatically. Knowing this, one can design a Password type like this:
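public readonly record struct Password
{
    private readonly string _cleartext;

    public Password(string cleartext) => _cleartext = cleartext;

    // Called implicitly by logging and string formatting,
    // so it must never reveal the actual value.
    public override string ToString()
    {
        return "****";
    }

    // The only way to get the raw value; the name makes the risk obvious.
    public string Cleartext()
    {
        return _cleartext;
    }
}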
This is a minor change, but now it is impossible to accidentally print the password somewhere in clear text. Isn’t that great?
Of course, during authentication you may still need the value in plain text, but you can get it through a method with the very telling name Cleartext(). The sensitivity of this operation is immediately obvious, which nudges the developer to use the method only for its intended purpose and to handle the result carefully.
The same principle applies to personal data (for example a passport number, an individual tax number, and so on). Model this information using specialized types. Override standard functions like .ToString() and expose sensitive data only through functions with appropriate names. That way personal data will never leak into logs and other places from which removing it would take a lot of work.
These little tricks can make a big difference!
Make it a habit
Whenever you are dealing with data that has certain rules, behaviors, or dangers, think about how you can help yourself by creating an explicit type.
Building on the Password example, we can take the idea further!
Passwords are hashed before they are stored in the database, right? Of course, but a hash is not just a string. During login we have to compare the previously stored hash with the newly computed one. The problem is that not every developer is a security specialist, so they may not be aware that naively comparing two hash strings can make the code vulnerable to timing attacks.
It is recommended to check the equality of hashes of two passwords in a non-optimized way:
// Requires: using System.Runtime.CompilerServices;
// Compares two byte arrays for equality.
// The method is deliberately written so that the loop is not optimized.
[MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.NoOptimization)]
private static bool ByteArraysEqual(byte[] a, byte[] b)
{
    if (a == null && b == null)
    {
        return true;
    }
    if (a == null || b == null || a.Length != b.Length)
    {
        return false;
    }
    var areSame = true;
    // Always walk the full length so timing does not reveal
    // the position of the first mismatch.
    for (var i = 0; i < a.Length; i++)
    {
        areSame &= (a[i] == b[i]);
    }
    return areSame;
}
Note: the code example is taken from the ASP.NET Core repository.
So it makes perfect sense to code this functionality into a specialized type:
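public readonly record struct PasswordHash
{
    // Implementation goes here

    // No "override" keyword: this is the strongly typed Equals that a
    // record struct lets you supply yourself. GetHashCode should be
    // overridden as well so that both stay consistent.
    public bool Equals(PasswordHash other)
    {
        return ByteArraysEqual(this.Bytes(), other.Bytes());
    }
}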
If a PasswordHasher returns only values of type PasswordHash, then even developers who are not security savvy will be forced to use a secure way of checking for equality.
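As an alternative to the hand-written loop, .NET ships a built-in constant-time comparison, CryptographicOperations.FixedTimeEquals in System.Security.Cryptography. The sketch below swaps that method in; this is my substitution, not what the original example uses. It is also written as a plain struct with an explicit constructor and a defensive copy, and the length-based hash code is a simplification chosen only to stay consistent with Equals.

```csharp
using System;
using System.Security.Cryptography;

public readonly struct PasswordHash : IEquatable<PasswordHash>
{
    private readonly byte[] _bytes;

    public PasswordHash(byte[] bytes)
    {
        if (bytes is null) throw new ArgumentNullException(nameof(bytes));
        _bytes = (byte[])bytes.Clone(); // defensive copy
    }

    // default(PasswordHash) has no array, so treat it as empty.
    private byte[] BytesOrEmpty() => _bytes ?? Array.Empty<byte>();

    // For equal-length inputs the running time does not depend on
    // where (or whether) the contents differ.
    public bool Equals(PasswordHash other) =>
        CryptographicOperations.FixedTimeEquals(BytesOrEmpty(), other.BytesOrEmpty());

    public override bool Equals(object obj) => obj is PasswordHash h && Equals(h);

    // Deliberately coarse: equal hashes always have equal lengths,
    // so this stays consistent with Equals without touching the contents.
    public override int GetHashCode() => BytesOrEmpty().Length;

    public static bool operator ==(PasswordHash a, PasswordHash b) => a.Equals(b);
    public static bool operator !=(PasswordHash a, PasswordHash b) => !a.Equals(b);
}
```

Every comparison path (Equals, ==, !=) funnels into the same constant-time check, so a caller cannot accidentally pick an unsafe one.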
Think carefully about how you model the domain!
Needless to say, there are no clear right or wrong decisions in programming, and there are always nuances that cannot be covered in one post, but in general, I recommend thinking about how to make the type system your best friend.
Many modern programming languages have very rich type systems, and I think we greatly underestimate them as a way to improve our code.