Rust's Magic Type System

if !is_valid_email(&form.email) || !is_valid_password(&form.password) {
	return HttpResponse::BadRequest().finish();
}

This code is a piece of shit; a nightmare waiting to happen. To understand why and how to fix it, you first need to understand the biggest lesson Rust has taught me: the power of using the type system to enforce invariants.

Let's figure it out. In programming, an invariant is a rule or condition that must always be true. For example, if we are writing software to manage bank accounts, one of the invariants might be that the balance should never be less than zero (assuming overdrafts are not allowed).

struct BankAccount {
    // in cents
    balance: i32,
}

But how can we enforce this invariant?

There are several approaches that can be divided into two categories: manual invariant enforcement and automatic invariant enforcement.

Manual invariant enforcement

Manual enforcement of an invariant includes:

  • Code reviews

  • Comments in code

  • Documentation

  • Project documents

  • Even verbal agreements shared between team members

As you can imagine, this type of enforcement is quite fragile, but it has its uses. For example, imagine a UI invariant that any delete action requires user confirmation. This would be very difficult to enforce automatically, so you simply describe these rules in documentation and try to enforce them during code reviews. Violating this invariant would not be catastrophic, so perhaps it is sufficient in this case.

However, if we use a manual approach in our bank balance example, it will quickly lead us to ruin.

struct BankAccount {
    // in cents
    // should never be less than 0!
    balance: i32,
}

We cannot afford to violate the invariant here; we need a more reliable method of enforcing it.
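To see why a comment is not enough, here is a minimal sketch (the main function and the amounts are illustrative, not from the original code). Nothing stops code from pushing the balance below zero, and the compiler accepts it without complaint.

```rust
struct BankAccount {
    // in cents
    // should never be less than 0!
    balance: i32,
}

fn main() {
    // The comment above is the only "enforcement"; this compiles fine.
    let mut account = BankAccount { balance: 1_000 };
    account.balance -= 5_000;

    // The invariant is now silently broken.
    println!("balance: {}", account.balance);
}
```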

Automatic invariant enforcement

Includes:

  • Runtime assertions

  • Runtime checks

  • Testing

  • Input validation

  • Using the type system

We will cover all of these approaches, with a particular focus on using the type system, which is the most reliable way to enforce invariants.

Asserts

Let's start with assertions as a method of automatic invariant enforcement.

struct BankAccount {
    balance: i32,
}

impl BankAccount {
    fn new(initial_balance: i32) -> Self {
        assert!(initial_balance >= 0, "Initial balance cannot be negative");
        Self {
            balance: initial_balance,
        }
    }

    fn deposit(&mut self, amount: i32) {
        assert!(amount >= 0, "Deposit amount cannot be negative");
        self.balance += amount;
    }

    fn withdraw(&mut self, amount: i32) {
        assert!(amount >= 0, "Withdrawal amount cannot be negative");
        assert!(self.balance >= amount, "Insufficient funds");
        self.balance -= amount;
    }
}

We assert in new that the initial balance is greater than or equal to zero, and we add similar assertions to the deposit and withdraw methods.

Our invariant is now checked automatically by the code, but there are a few problems. Assertions are checked at runtime, which means developers can still write incorrect code that compiles. Also, if an assertion fails, our program will panic and crash.
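The sketch below illustrates both problems with a trimmed-down version of the account. The incorrect withdrawal compiles, and the mistake only shows up as a panic when the program runs. std::panic::catch_unwind is used here only to observe the panic without ending the program; it is not a suggested way to handle errors.

```rust
use std::panic;

struct BankAccount {
    balance: i32,
}

impl BankAccount {
    fn new(initial_balance: i32) -> Self {
        assert!(initial_balance >= 0, "Initial balance cannot be negative");
        Self { balance: initial_balance }
    }

    fn withdraw(&mut self, amount: i32) {
        assert!(amount >= 0, "Withdrawal amount cannot be negative");
        assert!(self.balance >= amount, "Insufficient funds");
        self.balance -= amount;
    }
}

fn main() {
    // This compiles, even though the withdrawal can never succeed.
    let result = panic::catch_unwind(|| {
        let mut account = BankAccount::new(100);
        account.withdraw(500); // panics at runtime: "Insufficient funds"
    });

    // The closure panicked, so catch_unwind reports an error.
    assert!(result.is_err());
    println!("withdraw panicked: {}", result.is_err());
}
```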

Using Rust's type system

Let's improve this code by using Rust's type system. We'll change the balance type from a 32-bit signed integer (i32) to a 32-bit unsigned integer (u32), so the balance can no longer be negative at all. We can now remove the assertions in new and deposit, as well as the first assertion in withdraw.

impl BankAccount {
    fn new(initial_balance: u32) -> Self {
        Self {
            balance: initial_balance,
        }
    }

    fn deposit(&mut self, amount: u32) {
        self.balance += amount;
    }

    fn withdraw(&mut self, amount: u32) {
        assert!(self.balance >= amount, "Insufficient funds");
        self.balance -= amount;
    }
}

However, we still need to make sure there are enough funds in the account.
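As a small sketch of what the type change buys us: a negative amount is now rejected by the compiler. Signed values arriving from elsewhere have to be converted explicitly, and the conversion reports failure. The use of u32::try_from at the boundary is an assumption of this example, not part of the original code.

```rust
use std::convert::TryFrom;

struct BankAccount {
    // in cents; u32 cannot represent a negative balance
    balance: u32,
}

impl BankAccount {
    fn new(initial_balance: u32) -> Self {
        Self { balance: initial_balance }
    }
}

fn main() {
    // let account = BankAccount::new(-5);
    // ^ does not compile: u32 has no negative values

    // Signed input from elsewhere has to be converted explicitly,
    // and the conversion returns an error instead of a bad balance.
    let from_outside: i32 = -5;
    assert!(u32::try_from(from_outside).is_err());

    let account = BankAccount::new(u32::try_from(250_i32).unwrap());
    println!("balance: {}", account.balance);
}
```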

Here we can take advantage of an important feature of Rust's type system. We will change the return value to the type Result to account for this potential error.

fn withdraw(&mut self, amount: u32) -> Result<u32, String> {
	if self.balance >= amount {
		self.balance -= amount;
		Ok(self.balance)
	} else {
		Err("Insufficient funds".to_string())
	}
}

Inside the function we perform a simple runtime check. Since withdraw returns a Result, the calling code has to deal with the potential error before it can use the new balance. We can also add tests to make sure that withdraw works correctly.

With this approach, our code will not compile or our tests will fail if the invariant is violated. This makes our code much more robust.
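Here is a minimal sketch of what the calling code and a test might look like. The main function and the test name are illustrative, while withdraw is the version shown above.

```rust
struct BankAccount {
    balance: u32,
}

impl BankAccount {
    fn new(initial_balance: u32) -> Self {
        Self { balance: initial_balance }
    }

    // Returns the new balance, or an error if funds are insufficient.
    fn withdraw(&mut self, amount: u32) -> Result<u32, String> {
        if self.balance >= amount {
            self.balance -= amount;
            Ok(self.balance)
        } else {
            Err("Insufficient funds".to_string())
        }
    }
}

fn main() {
    let mut account = BankAccount::new(100);

    // The caller has to decide what happens in both cases.
    match account.withdraw(500) {
        Ok(remaining) => println!("remaining: {remaining}"),
        Err(reason) => println!("withdrawal rejected: {reason}"),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn withdraw_rejects_overdraft() {
        let mut account = BankAccount::new(100);
        assert_eq!(account.withdraw(40), Ok(60));
        assert!(account.withdraw(500).is_err());
        // A rejected withdrawal leaves the balance untouched.
        assert_eq!(account.balance, 60);
    }
}
```

Note that Result is marked must_use, so calling withdraw and ignoring the return value produces a compiler warning rather than a hard error.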

This powerful software design technique, which uses the type system to enforce invariants, is called type-driven design. While our previous example was simple, a type system can enforce very complex invariants, especially if the language is statically typed and has an expressive type system like Rust's.

Why is this code a nightmare?

At the beginning, I said that this code is a nightmare waiting to happen:

if !is_valid_email(&form.email) || !is_valid_password(&form.password) {
	return HttpResponse::BadRequest().finish();
}

Why?

#[post("/user/register")]
pub async fn register_user(
    form: web::Form<FormData>,
    pool: web::Data<PgPool>
) -> HttpResponse {
    if !is_valid_email(&form.email) || !is_valid_password(&form.password) {
        return HttpResponse::BadRequest().finish();
    }

    let user = User {
        email: form.email.clone(),
        password: form.password.clone(),
    };

    match insert_user(&pool, &user).await {
        Ok(_) => HttpResponse::Ok().finish(),
        Err(_) => HttpResponse::InternalServerError().finish(),
    }
}

We have an API endpoint for creating new users. The invariant is that the email and password must always be valid. We ensure this through input validation: users provide unchecked data, and we call several validation functions to ensure that the entered data meets our requirements. Only then do we save the data to the database.

The problem is that these checks are performed only once, at the beginning of the request handler. So can the insert_user function safely assume that the email and password are valid?

#[derive(Debug)]
struct User {
    pub email: String,
    pub password: String,
}

async fn insert_user(pool: &PgPool, user: &User) -> Result<Uuid, sqlx::Error> {
    let user_id = Uuid::new_v4();
    let password = hash_password(user.password.as_str());

    // insert user into database ...

    Ok(user_id)
}

If we look at the function signature in isolation, nothing in it guarantees that the email and password are valid; they are defined as plain strings. The function must take it on faith that the calling code validated the input before passing it in, and that is a recipe for disaster.

As the code grows and changes, you can imagine a validation check being accidentally removed or data being changed in some way, and here we are:

#[post("/user/register")]
pub async fn register_user(form: web::Form<FormData>, pool: web::Data<PgPool>) -> HttpResponse {
    if !is_valid_email(&form.email) || !is_valid_password(&form.password) {
        return HttpResponse::BadRequest().finish();
    }

-   let user = User {
+   let mut user = User {
        email: form.email.clone(),
        password: form.password.clone(),
    };

+   // remove sensitive data before logging
+   user.password.clear();
+   dbg!("Registering user: {:?}", &user);

    match insert_user(&pool, &user).await {
        Ok(_) => HttpResponse::Ok().finish(),
        Err(_) => HttpResponse::InternalServerError().finish(),
    }
}
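The same failure can be reproduced without a web framework or a database. In the sketch below the two validators and the synchronous insert_user are simplified, hypothetical stand-ins for the real functions. The point is only that the value changes after validation and nothing notices.

```rust
#[derive(Debug)]
struct User {
    email: String,
    password: String,
}

// Hypothetical stand-ins for the validators used in the handler.
fn is_valid_email(email: &str) -> bool {
    email.contains('@')
}

fn is_valid_password(password: &str) -> bool {
    password.len() >= 8
}

// Trusts its caller: nothing in the signature says `user` was validated.
// Returns the length of the password it would have stored.
fn insert_user(user: &User) -> usize {
    user.password.len()
}

fn main() {
    let mut user = User {
        email: "bogdan@email.com".to_string(),
        password: "correct horse".to_string(),
    };

    // Validation passes here...
    assert!(is_valid_email(&user.email) && is_valid_password(&user.password));

    // ...but the value changes afterwards, and the compiler has no objection.
    user.password.clear();

    let stored_len = insert_user(&user);
    println!("stored a password of length {stored_len} for {user:?}");
    assert!(!is_valid_password(&user.password));
}
```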

One way to prevent this is to perform the validation again inside the function insert_user:

async fn insert_user(pool: &PgPool, user: &User) -> Result<Uuid, sqlx::Error> {
+   if !is_valid_email(&user.email) || !is_valid_password(&user.password) {
+       // return error...
+   }

    let user_id = Uuid::new_v4();
    let password = hash_password(user.password.as_str());

    // insert user into database ...

    Ok(user_id)
}

However, this introduces unnecessary redundancy and is prone to errors.

The principle of “Don't validate but parse”

Instead, we can use a principle from type-driven design: don't validate but parse (often phrased as "parse, don't validate"). Rather than scattering validation functions throughout our code, we parse user input into new types that are guaranteed to respect our invariants.

First we will create two new types: Email and Password.

pub struct Email(String);
pub struct Password(String);

Both of these are tuple structs that wrap a string value. Wrapping built-in types with lax requirements in user-defined types with stricter requirements is called the newtype pattern in Rust. In this case, our requirements (or invariants) are that the email is properly formatted and the password meets length requirements.

To ensure this, we will add a parse function, which takes an unchecked string as input and parses it into an Email or Password. The parsing operation may fail, so it returns a Result.

impl Email {
    pub fn parse(email: String) -> Result<Email, AuthError> {
        if !is_valid_email(&email) {
            Err(AuthError::ValidationError("Email must be valid".to_string()))
        } else {
            Ok(Email(email))
        }
    }
}

impl Password {
    pub fn parse(password: String) -> Result<Password, AuthError> {
        if !is_valid_password(&password) {
            Err(AuthError::ValidationError("Password must be valid".to_string()))
        } else {
            Ok(Password(password))
        }
    }
}

Here we exploit a few features of Rust's type system. Because of Rust's visibility rules, the inner string is private and inaccessible outside the module that defines the struct. And because Rust has no built-in or default constructors, the only way for outside code to create an Email or Password is through the parse function.

We still use the same validation functions as before, but now the validation logic is contained within the type, and the validation state is stored within the type. We can also add an as_str method to provide read-only access to the inner string data.

impl Email {
    pub fn parse(email: String) -> Result<Email, AuthError> {
        // ...
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl Password {
    pub fn parse(password: String) -> Result<Password, AuthError> {
        // ...
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
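Putting the pieces together, here is a self-contained sketch of the Email newtype. The auth module name, the shape of AuthError, and the very simple is_valid_email rule are assumptions made for illustration. Real email validation is considerably stricter.

```rust
mod auth {
    #[derive(Debug, PartialEq)]
    pub enum AuthError {
        ValidationError(String),
    }

    // Hypothetical validator, far simpler than a real one.
    fn is_valid_email(email: &str) -> bool {
        match email.split_once('@') {
            Some((local, domain)) => !local.is_empty() && domain.contains('.'),
            None => false,
        }
    }

    #[derive(Debug)]
    pub struct Email(String); // the inner String is private to `auth`

    impl Email {
        pub fn parse(email: String) -> Result<Email, AuthError> {
            if is_valid_email(&email) {
                Ok(Email(email))
            } else {
                Err(AuthError::ValidationError("Email must be valid".to_string()))
            }
        }

        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

use auth::Email;

fn main() {
    let email = Email::parse("bogdan@email.com".to_string()).unwrap();
    println!("parsed: {}", email.as_str());

    assert!(Email::parse("not-an-email".to_string()).is_err());

    // let forged = Email("not-an-email".to_string());
    // ^ does not compile outside `auth`: the tuple field is private
}
```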

Now we can update the User struct to use our new types:

#[derive(Debug)]
struct User {
    pub email: Email,
    pub password: Password,
}

and update the request handler:

#[post("/user/register")]
pub async fn register_user(
    form: web::Form<FormData>,
    pool: web::Data<PgPool>
-) -> HttpResponse {
-   if !is_valid_email(&form.email) || !is_valid_password(&form.password) {
-       return HttpResponse::BadRequest().finish();
-   }

-   let user = User {
-       email: form.email.clone(),
-       password: form.password.clone(),
-   };
+) -> Result<HttpResponse, AuthError> {
+   let email = Email::parse(form.email.clone())?;
+   let password = Password::parse(form.password.clone())?;

+   let user = User::new(email, password);

    // `user.password.clear()` no longer compiles: Password exposes no method
    // that mutates its inner string, so a parsed value stays valid.

    match insert_user(&pool, &user).await {
        Ok(_) => Ok(HttpResponse::Ok().finish()),
        Err(_) => Ok(HttpResponse::InternalServerError().finish()),
    }
}

Now any subsequent code can be sure that the email and password are valid.
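The flow of the new handler can be sketched without actix-web or a database. Everything below is a simplified stand-in: register plays the role of the handler, insert_user is synchronous, and the validation rules are hypothetical. For brevity everything lives in one module, so field privacy is not enforced here. In real code the newtypes would sit in their own module.

```rust
#[derive(Debug, PartialEq)]
enum AuthError {
    ValidationError(String),
}

struct Email(String);
struct Password(String);

impl Email {
    // Hypothetical rule for this sketch: an email contains '@'.
    fn parse(email: String) -> Result<Email, AuthError> {
        if email.contains('@') {
            Ok(Email(email))
        } else {
            Err(AuthError::ValidationError("Email must be valid".to_string()))
        }
    }
}

impl Password {
    // Hypothetical rule for this sketch: at least 8 characters.
    fn parse(password: String) -> Result<Password, AuthError> {
        if password.chars().count() >= 8 {
            Ok(Password(password))
        } else {
            Err(AuthError::ValidationError("Password must be valid".to_string()))
        }
    }
}

struct User {
    email: Email,
    password: Password,
}

// No checks in here: the parameter type already says the data was parsed.
fn insert_user(user: &User) -> String {
    format!("stored {} ({} password bytes)", user.email.0, user.password.0.len())
}

// `?` returns early with the AuthError if either parse fails.
fn register(email: &str, password: &str) -> Result<String, AuthError> {
    let email = Email::parse(email.to_string())?;
    let password = Password::parse(password.to_string())?;
    Ok(insert_user(&User { email, password }))
}

fn main() {
    println!("{:?}", register("bogdan@email.com", "long enough"));
    println!("{:?}", register("not-an-email", "long enough"));
}
```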

Advanced Approaches to Type-Driven Design

Type-driven design is a big topic, and this is just the tip of the iceberg. We just talked about the principle don't validate but parse and how to implement it using the newtype pattern in Rust. We can also use more advanced approaches, such as the typestate pattern, which lets you define the different states an object can be in, define specific actions for each state, and ensure valid transitions between states.

For example, a user in our API can be in one of three states: viewer, editor, or administrator. We will first create a struct representing each state, and then define the User struct, which is generic over UserRole, with Viewer as the default.

use std::marker::PhantomData;

pub struct Viewer;
pub struct Editor;
pub struct Admin;

pub struct User<UserRole = Viewer> {
    pub email: Email,
    pub password: Password,
    state: PhantomData<UserRole>,
}

impl User {
    pub fn new(email: Email, password: Password) -> Self {
        Self {
            email,
            password,
            state: PhantomData,
        }
    }
}

We store the type parameter in a field named state, which uses PhantomData, a zero-sized marker type, so the role takes up no space at runtime.
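A quick way to check the zero-size claim is std::mem::size_of. In this sketch plain Strings stand in for Email and Password, and PlainUser is a hypothetical struct without the marker field, added only for comparison.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

#[allow(dead_code)]
struct Viewer;
#[allow(dead_code)]
struct Editor;

// Simplified: plain Strings stand in for Email and Password.
#[allow(dead_code)]
struct User<UserRole = Viewer> {
    email: String,
    password: String,
    state: PhantomData<UserRole>,
}

// The same fields without the role marker, for comparison only.
#[allow(dead_code)]
struct PlainUser {
    email: String,
    password: String,
}

fn main() {
    // PhantomData is a zero-sized type.
    assert_eq!(size_of::<PhantomData<Editor>>(), 0);

    // Print the sizes side by side to compare them.
    println!("User<Viewer>: {} bytes", size_of::<User<Viewer>>());
    println!("User<Editor>: {} bytes", size_of::<User<Editor>>());
    println!("PlainUser:    {} bytes", size_of::<PlainUser>());
}
```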

We can then define methods that are available in all states as well as state-specific methods, such as edit for editors. We can also ensure proper transitions between states: viewers can be promoted to editors, editors to admins, and admins can be demoted to editors.

impl User<Viewer> {
    pub fn promote(self) -> User<Editor> { /*...*/ }
}

impl User<Editor> {
    pub fn edit(&self) { /*...*/ }
    pub fn promote(self) -> User<Admin> { /*...*/ }
}

impl User<Admin> {
    pub fn demote(self) -> User<Editor> { /*...*/ }
}

Note that we are using Rust's ownership model: these state transition functions take self as input, which moves the instance into the function and makes it unavailable in the future. This means that if the viewer instance is promoted to an editor, the old user instance can no longer be used.

fn main() {
    let viewer = User::new(
        Email::parse("bogdan@email.com".to_string()).unwrap(),
        Password::parse("password".to_string()).unwrap(),
    );

    let editor = viewer.promote();

    viewer.get_email(); // error: borrow of moved value 'viewer'
}
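For a version that can be compiled and run end to end, here is one possible way to fill in the elided method bodies. It is a sketch under assumptions: a plain String replaces the Email and Password newtypes, the private with_role helper is invented for this example, and get_email is implemented for every state.

```rust
use std::marker::PhantomData;

pub struct Viewer;
pub struct Editor;
pub struct Admin;

// Simplified: a plain String stands in for the Email and Password newtypes.
pub struct User<UserRole = Viewer> {
    email: String,
    state: PhantomData<UserRole>,
}

impl User {
    pub fn new(email: String) -> Self {
        Self { email, state: PhantomData }
    }
}

// Available in every state.
impl<UserRole> User<UserRole> {
    pub fn get_email(&self) -> &str {
        &self.email
    }

    // Hypothetical private helper: consumes the user and rebuilds it
    // under a different role marker.
    fn with_role<NewRole>(self) -> User<NewRole> {
        User { email: self.email, state: PhantomData }
    }
}

impl User<Viewer> {
    pub fn promote(self) -> User<Editor> {
        self.with_role()
    }
}

impl User<Editor> {
    pub fn edit(&self) -> String {
        format!("{} edited a document", self.email)
    }

    pub fn promote(self) -> User<Admin> {
        self.with_role()
    }
}

impl User<Admin> {
    pub fn demote(self) -> User<Editor> {
        self.with_role()
    }
}

fn main() {
    let viewer = User::new("bogdan@email.com".to_string());
    // viewer.edit(); // does not compile: `edit` exists only on User<Editor>

    let editor = viewer.promote();
    println!("{}", editor.edit());

    let admin = editor.promote();
    let editor_again = admin.demote();
    println!("{}", editor_again.get_email());
}
```

Because only the listed transitions exist as methods, something like promoting a viewer straight to admin is simply not expressible.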

So how do you start using type-driven design in your own Rust projects? There are many ways to apply this powerful software design technique in Rust. In theory, you can apply some of these patterns in other languages, but Rust makes it especially practical because of the way its type system is designed. In other languages, these approaches are not always practical, if they are possible at all.
