Generic Newtypes: A way to work around the orphan rule

September 21, 2019•1,180 words

Rust's orphan rule prevents us from implementing a foreign trait on a foreign type. While this may appear limiting at first, it is actually a good thing: it is one of the ways the Rust compiler can guarantee at compile time that our code works the way we intended.

This blog post is a follow-up to one I wrote some time ago: https://blog.eizinger.io/5835/rust-s-custom-derives-in-a-hexagonal-architecture-incompatible-ideas
In this one, we will go more in-depth into the "local wrapper type" idea and rebrand it as "generic newtypes".

An example use case

Imagine we have an application that allows us to manage a database of users via an HTTP API. At the core of our application, we are dealing with a User struct that for now tracks only the name and the user's signup date.

use chrono::prelude::*;

pub struct User {
    name: String,
    signedup_on: DateTime<Utc>
}

While this is a fairly simple type, we will now establish some constraints that should help demonstrate how the pattern of generic newtypes is applicable and helpful in other, more complicated scenarios.

The constraints are:

Our struct User must not directly implement Serialize.

This could be the case if the crate this struct is defined in should not have any dependencies or only selected ones. Many crates in the ecosystem already expose a serde feature-flag that gives you some serialization implementation. In my opinion, this is sub-optimal because it does not really scale in the long run. What about libraries like diesel? diesel provides similar traits to serde with FromSql and ToSql. Should crates also give you a diesel feature flag in case you want to use those types directly in a database schema? Testing libraries like quickcheck are another example. As the Rust ecosystem grows, this list will likely grow.

The serialisation provided by chrono's serde feature flag does not meet our requirements.

This is easily imaginable for any datatype that is not defined in some specification.¹ Even if it is, the crate might just not expose a serde feature flag and we are left with writing our own implementation of Serialize anyway.

Existing solutions

1. Create a regular newtype

We can create a newtype for DateTime<Utc> like this:

pub struct SignupDate(pub DateTime<Utc>);

That newtype may well be useful in its own right, even without the idea I am trying to sketch out here. Regardless, we now have a type that is local to our crate, and we can implement Serialize on it: impl Serialize for SignupDate.
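To make the mechanics concrete without pulling in serde or chrono, here is a minimal std-only sketch. It assumes std::fmt::Display as a stand-in for Serialize and std::time::SystemTime as a stand-in for DateTime<Utc>; both are foreign to our crate, so the orphan rule applies in the same way. The helper example_signup_date is hypothetical and exists only for illustration.

```rust
use std::fmt;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// `SystemTime` and `fmt::Display` are both defined in `std`, so
// `impl fmt::Display for SystemTime` would be rejected by the orphan rule.
// The newtype is local to our crate, so implementing the trait on it is fine.
pub struct SignupDate(pub SystemTime);

impl fmt::Display for SignupDate {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Render as whole seconds since the Unix epoch. Dates before the
        // epoch fall back to 0 in this sketch.
        let secs = self
            .0
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        write!(f, "{}", secs)
    }
}

// Hypothetical helper: a signup date of 2019-09-21T00:00:00Z.
pub fn example_signup_date() -> SignupDate {
    SignupDate(UNIX_EPOCH + Duration::from_secs(1_569_024_000))
}
```

Calling example_signup_date().to_string() yields "1569024000". With serde, the shape would be the same: the impl block targets the local SignupDate instead of the foreign type.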

The downside of this approach is that we need to create a newtype for every type. Even though there is no runtime cost associated with that thanks to Rust's zero-cost abstractions, we still have to write those newtypes, which can be tiring depending on how many there are. In addition, naming those newtypes can become tricky if their only purpose is to allow trait implementations: SerdeSignupDate is kind of a weird name. The bottom line is that while creating newtypes works, it is not ideal.

2. Create a serde module

We can create a module that exposes dedicated serialize and deserialize functions and then use this module to tell serde how to serialize/deserialize our type:

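use chrono::prelude::*;
use serde::{Deserialize, Serialize};

// The `#[serde(...)]` field attribute is only recognized when the struct uses serde's derive.
#[derive(Serialize, Deserialize)]
pub struct User {
    name: String,
    #[serde(with = "serde_signup_date")]
    signedup_on: DateTime<Utc>
}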

where serde_signup_date is the module exposing the serialize and deserialize functions.

The downside of this approach is that you have to repeat yourself on all the call sites: i.e. every struct that contains our signedup_on field will need to use this attribute. Another constraint is that it only works as long as you don't use any type parameters for the field you are trying to annotate with #[serde(with)]:

pub struct Foo<T> {
    #[serde(with = "...")] // how are the functions in the module supposed to know how to serialize `T`?
    bar: T
}

Generic newtypes

Instead of repeating ourselves in creating newtypes, we can define a reusable, generic newtype:

pub struct Http<T>(pub T);

Similar to regular newtypes, this allows us to implement Serialize:

impl Serialize for Http<DateTime<Utc>> {
 // ...
}
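As a dependency-free sketch of the same idea, the block below again assumes std::fmt::Display in place of Serialize and two std types (SystemTime and Duration) in place of foreign types such as DateTime<Utc>. It shows one wrapper carrying several independent implementations. The chosen output formats are illustrative assumptions only.

```rust
use std::fmt;
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// One reusable wrapper, named after the context it is used in.
pub struct Http<T>(pub T);

// Allowed by the orphan rule: `Http` is local to our crate, even though
// `fmt::Display` and `SystemTime` are both foreign.
impl fmt::Display for Http<SystemTime> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Whole seconds since the Unix epoch; earlier dates fall back to 0.
        let secs = self
            .0
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        write!(f, "{}", secs)
    }
}

// A second impl for a different foreign type, without a second newtype.
impl fmt::Display for Http<Duration> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}ms", self.0.as_millis())
    }
}
```

The two impls do not overlap because Http<SystemTime> and Http<Duration> are distinct types, so each wrapped type can get its own representation.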

Later, we would use it like this:

use chrono::prelude::*;

pub struct User {
    name: String,
    signedup_on: Http<DateTime<Utc>>
}

What is cool about this?

We only have to define one newtype that can be reused all over the place.
It actually reads fairly nicely: no more awkward names for newtypes.
It works with type parameters: we simply have to use Http<Bar> instead of just Bar:

pub struct Foo<T> {
    bar: T
}

let instance: Foo<Http<Bar>> = Foo {
    bar: Http(Bar(...))
};
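A runnable variant of this, under the assumption that Vec<u8> plays the role of the foreign Bar and that std::fmt::Display plays the role of Serialize. The render method and the hex output format are hypothetical and only serve to show that the generic code needs no knowledge of the wrapper.

```rust
use std::fmt;

pub struct Http<T>(pub T);

// `Vec<u8>` is foreign and has no `Display` impl; the wrapper provides one.
impl fmt::Display for Http<Vec<u8>> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Lowercase hex, two digits per byte.
        for byte in &self.0 {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}

pub struct Foo<T> {
    pub bar: T,
}

// Generic code only asks for the trait bound; the caller decides whether
// to pass a type that implements it natively or one wrapped in `Http`.
impl<T: fmt::Display> Foo<T> {
    pub fn render(&self) -> String {
        format!("bar={}", self.bar)
    }
}
```

Here Foo { bar: Http(vec![0xde, 0xad]) }.render() produces "bar=dead", while Foo { bar: 42 }.render() works unchanged with a type that already implements the trait.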

Going back to our User example, we had one more constraint that we ignored so far, that is: User by itself must not implement Serialize.
Since we have Http<T>, should we just do: impl Serialize for Http<User>?

In short: no. We had bad experiences doing that.
The reason is that User is a record type² (i.e. it is a type with named fields).
In order to serialize it, we would need to call serde's serializer.serialize_struct() and state all the field names ourselves. This is fairly repetitive and exactly the reason why #[derive(Serialize, Deserialize)] exists. How can we leverage this functionality without breaking our constraint?

Our API is the contract we have with our clients. I would consider it good practice to explicitly define this contract in the code. Hence, I recommend creating dedicated types for the wire format:

#[derive(Serialize, Deserialize)]
pub struct UserResponse {
    name: String,
    signedup_on: Http<DateTime<Utc>>
}

As we can see, things fall neatly into place. We can leverage serde's custom derive for our structural type and at the same time, simply wrap the types that don't define a Serialize implementation with our Http newtype and we are good to go!

Converting between UserResponse and User is fairly trivial as all we need to do is wrap the DateTime<Utc> in our Http<T> newtype.
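One way this conversion could look is a pair of From implementations. The sketch below is std-only: it assumes SystemTime as a stand-in for DateTime<Utc>, leaves out the serde derive on the wire type, and makes the fields public for brevity.

```rust
use std::time::SystemTime;

pub struct Http<T>(pub T);

// Domain type: knows nothing about serialization.
pub struct User {
    pub name: String,
    pub signedup_on: SystemTime,
}

// Wire type: in the real application this would carry
// `#[derive(Serialize, Deserialize)]`.
pub struct UserResponse {
    pub name: String,
    pub signedup_on: Http<SystemTime>,
}

impl From<User> for UserResponse {
    fn from(user: User) -> Self {
        UserResponse {
            name: user.name,
            // Wrapping is the only work needed at the boundary.
            signedup_on: Http(user.signedup_on),
        }
    }
}

impl From<UserResponse> for User {
    fn from(response: UserResponse) -> Self {
        User {
            name: response.name,
            // Unwrap the newtype again when entering the domain.
            signedup_on: response.signedup_on.0,
        }
    }
}
```

An HTTP handler could then build its response with UserResponse::from(user), keeping the wrapper confined to the API module.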

Conclusion

Instead of repetitively creating specific newtypes, we can create generic newtypes for certain contexts in our applications like API modules. This allows us to work around the orphan rule while not taking too much of an ergonomic hit. In combination with dedicated types for the wire format, we can leverage a lot of serde's functionality.

The key here is to be as modular as possible: Ideally, you want to use the generic newtype for things that serialize to a single value like a JSON string or number. Then you create specific wire types for the messages you are exchanging with your clients and use the generic newtype so that you are able to derive Serialize.

Discussion

Comments or ideas?
Post them to the /r/rust thread: https://www.reddit.com/r/rust/comments/d79lh9/generic_newtypes_a_way_to_work_around_the_orphan/

¹ I guess using DateTime for this example is kind of bad because there is actually a well-defined serialization for these data types in ISO 8601.
² A previous version of this post used the term "structural type". Thanks to /u/thristian99 for suggesting a different term to avoid confusion.

More from Thomas Eizinger
All posts

Rust's custom derives in a hexagonal architecture: Incompatible ideas?

Using GitHub actions and GitFlow to automate your release process
