How to simultaneously support synchronous and asynchronous code in Rust

Introduction

Sit back and listen to the old man's tale: what happened when I asked Rust for too much.

Let's say you want to write a new library in Rust. All you want to do is wrap a public API that provides access to some other product, such as the Spotify API or a database like ArangoDB. It's not that hard: after all, you're not inventing anything new and you don't have to deal with complex algorithms. So you assume the problem can be solved in a relatively straightforward manner.

You decide to implement the library using async. The work your library does is mostly performing HTTP requests, which are I/O-bound, so async makes sense (incidentally, it's one of the things that makes Rust so popular today). You sit down to code, and a few days later, you have v0.1.0 ready. “Nice,” you think as cargo publish finishes successfully and uploads your work to crates.io.

A few days pass, and you get a new notification from GitHub. It turns out that someone opened an issue:

How to use this library synchronously?

My project doesn't use async because it's too complex for what I need. I wanted to try your new library, but I'm not sure how to do it easily. I'd rather not fill my code with block_on(endpoint()) everywhere. I've seen crates like reqwest export a blocking module with exactly the same functionality. Perhaps you could do the same?

At a low level, this sounds like a very difficult task. Is it possible to provide a common interface for both regular synchronous code and asynchronous code, which requires a runtime like tokio, awaiting futures, pinning, and so on? Still, they asked nicely, so you decide to give it a try. After all, the only difference in the code would be the async and await keywords here and there, because you aren't doing anything fancy.

Well, this is more or less exactly what happened with the crate rspotify, which I used to maintain together with its creator, Ramsay. For those who don't know it, it's a wrapper for the Spotify Web API. Just to be clear: I eventually got this to work, although not as cleanly as I'd hoped.

First approaches

To give a broader context, I'll show you what the Rspotify client looks like in general:

use std::collections::HashMap;

struct Spotify { /* ... */ }

impl Spotify {
    // One of several dozen endpoints; they all follow this shape.
    async fn some_endpoint(&self, param: String) -> SpotifyResult<String> {
        let mut params = HashMap::new();
        params.insert("param", param);

        self.http.get("/some-endpoint", params).await
    }
}

Essentially, we need to make some_endpoint available to users of both the asynchronous and the blocking mode. The important question here is: what do you do when you have several dozen endpoints? And how do you make it as easy as possible for the user to switch between synchronous and asynchronous code?

Good old copypasta

The first thing we implemented was the following. It was pretty simple and it worked: just copy the regular client code into a new blocking module in Rspotify. Here reqwest (our HTTP client) and reqwest::blocking share the same interface, so we can manually remove keywords like async and .await and import reqwest::blocking instead of reqwest in the new module.

Then the Rspotify user can simply use rspotify::blocking::Client instead of rspotify::Client, and voila! Their code is now blocking. This would bloat the binary for users who only need async, so we can put the module behind a feature flag named blocking, and we're done.

Later the problem became much clearer. It turned out that half of the code in the crate was duplicated. If it was necessary to add a new endpoint or modify an existing one, then everything would have to be written or deleted twice.

There's no way to be sure that the two implementations are equivalent without testing both thoroughly. Testing both isn't a bad idea, but what if you copy-pasted the tests wrong as well? Didn't think of that? The poor reviewer would have to read the same code twice, line by line, to make sure both sides look OK, and that leaves a huge margin for human error.

In our experience developing Rspotify, this really slows things down, especially for newcomers who are not used to such ordeals. As a newly minted Rspotify maintainer, I enthusiastically began researching other possible solutions.

Call block_on

The second approach is to implement everything on the async side and then write wrappers for the blocking interface that call block_on internally. block_on runs the future to completion, effectively making it synchronous. You still need to copy the method definitions, but the implementation is written only once:

mod blocking {
    struct Spotify(super::Spotify);

    impl Spotify {
        fn endpoint(&self, param: String) -> SpotifyResult<String> {
            // `runtime` has to be created first; see the note below.
            runtime.block_on(async move {
                self.0.endpoint(param).await
            })
        }
    }
}
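To make the idea more concrete, here is a minimal sketch of what block_on boils down to, written with only the standard library so that it compiles without tokio. The toy executor, the ThreadWaker type, and the stub endpoint body are all assumptions made for illustration; a real runtime does much more (timers, I/O drivers, scheduling).

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread when the future can make progress again.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy executor (not tokio's): polls `fut` on the current thread until done.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the waker is called, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical async client standing in for the real one.
pub struct Spotify;

impl Spotify {
    pub async fn endpoint(&self, param: String) -> Result<String, String> {
        // A real client would perform an HTTP request here.
        Ok(format!("response for {param}"))
    }
}

pub mod blocking {
    /// Blocking wrapper: same signature minus `async`, the body delegates.
    pub struct Spotify(pub super::Spotify);

    impl Spotify {
        pub fn endpoint(&self, param: String) -> Result<String, String> {
            super::block_on(self.0.endpoint(param))
        }
    }
}
```

A caller would then write blocking::Spotify(Spotify).endpoint("x".to_string()) and get the result back without ever touching .await.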

Please note: in order to call block_on, you first need to create some kind of runtime in the endpoint method. For example, with tokio:

let mut runtime = tokio::runtime::Builder::new()
    .basic_scheduler()
    .enable_all()
    .build()
    .unwrap();

The question arises: should the runtime be initialized on every call to the endpoint, or can it be shared? It could be made global (ewwww) or, even better, kept inside the Spotify struct. But since block_on takes a mutable reference to the runtime, it would have to be wrapped in Arc<Mutex<T>>, which completely kills concurrency in your client. The proper way to do this is with Tokio's Handle, which looks like this:

use lazy_static::lazy_static;
use tokio::runtime::Runtime;

lazy_static! { // You can also use `once_cell`
    static ref RT: Runtime = Runtime::new().unwrap();
}

fn endpoint(&self, param: String) -> SpotifyResult<String> {
    RT.handle().block_on(async move {
        self.0.endpoint(param).await
    })
}

While the handle does make our blocking client faster [1], an even more performant solution is possible. It's what reqwest itself does, in case you're interested. In short, it spawns a thread that calls block_on while waiting on a channel with jobs [2] [3].
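The thread-plus-channel idea can be sketched with the standard library alone. This is a deliberately simplified illustration and not reqwest's actual code: the Worker type, its method names, and the toy block_on are all made up, and jobs here run strictly one after another.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{mpsc, Arc};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy executor standing in for a real runtime's `block_on`.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// A type-erased unit of async work sent to the background thread.
type Job = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Hypothetical handle to a background thread that runs async jobs.
pub struct Worker {
    jobs: mpsc::Sender<Job>,
}

impl Worker {
    /// Spawns the thread once; it is then shared by every blocking call.
    pub fn spawn() -> Self {
        let (jobs, rx) = mpsc::channel::<Job>();
        thread::spawn(move || {
            // The loop ends once every `Sender` has been dropped.
            for job in rx {
                block_on(job);
            }
        });
        Worker { jobs }
    }

    /// Sends a future to the worker and blocks until its result comes back.
    pub fn run<T, F>(&self, fut: F) -> T
    where
        T: Send + 'static,
        F: Future<Output = T> + Send + 'static,
    {
        let (tx, rx) = mpsc::channel();
        let job: Job = Box::pin(async move {
            let out = fut.await;
            // The caller may have gone away; ignore a failed send.
            let _ = tx.send(out);
        });
        self.jobs.send(job).expect("worker thread has stopped");
        rx.recv().expect("worker dropped the job")
    }
}
```

The point of this design is that the executor lives on a single long-lived thread, so blocking callers on any thread only pay for a channel round trip instead of creating or locking a runtime.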

Unfortunately, this solution comes with significant overhead. You pull in big dependencies like futures or tokio and include them in your binary. All this for the sake of… still ending up writing blocking code. So the cost is paid not only at runtime, but also at compile time. This just feels wrong to me.

You also still have a good amount of duplicated code in your project. Even though these are just definitions, they add up. reqwest is a huge project and can probably afford this for its blocking module. But for a less popular crate such as rspotify, it is harder to pull off.

Crate duplication

Another way to deal with this problem, as the Cargo documentation on features suggests, is to create separate crates. We would have rspotify-sync and rspotify-async, and users would pick whichever they want as a dependency, or even both if needed. But the same problem remains: how exactly do we generate both versions of the crate? I couldn't find a way to do it other than copy-pasting the entire crate, even with various Cargo tricks such as having two Cargo.toml files, one per crate (which is quite inconvenient anyway).

With this idea we can't even use procedural macros, because you can't just create a new crate from within a macro. We could define a file format for writing templates of Rust code in which parts such as async/.await can be replaced. But that seems completely out of scope.

The option that worked: the maybe_async crate

Our third attempt was based on a crate called maybe_async. I remember foolishly thinking it was the perfect solution as soon as I discovered it.

In general, the idea of this crate is to automatically remove every occurrence of async and .await from the code with a procedural macro, essentially automating the copy-paste approach. For example:

#[maybe_async::maybe_async]
async fn endpoint() { /* stuff */ }

Generates the following code:

#[cfg(not(feature = "is_sync"))]
async fn endpoint() { /* stuff */ }

#[cfg(feature = "is_sync")]
fn endpoint() { /* stuff with `.await` removed */ }

You can pre-configure whether you want your code to be async or blocking. This is done by simply toggling the maybe_async/is_sync feature when compiling the crate. This macro works with functions, traits, and impl blocks. If some transformation is not as simple as removing async and .await, you can define your own implementation using the async_impl and sync_impl procedural macros. This works great, and we've been using it at Rspotify for a while now.

In fact, it worked so well that I made Rspotify HTTP-client agnostic, which is even more flexible than being async/sync agnostic. This way we can support multiple HTTP clients at once, for example reqwest and ureq, regardless of whether a particular client is synchronous or asynchronous.

Being HTTP-client agnostic is not that hard to implement if you have maybe_async at hand. You just need to define a trait for the HTTP client and then implement it for each of the clients you want to support.

A small listing is worth a thousand words. (The full source code of the two Rspotify clients is on GitHub: here for reqwest and here for ureq.)

use maybe_async::{async_impl, maybe_async, sync_impl};

// Declared once; `async` is stripped when `is_sync` is enabled.
#[maybe_async]
trait HttpClient {
    async fn get(&self) -> String;
}

// Only compiled in sync mode.
#[sync_impl]
impl HttpClient for UreqClient {
    fn get(&self) -> String { ureq::get(/* ... */) }
}

// Only compiled in async mode.
#[async_impl]
impl HttpClient for ReqwestClient {
    async fn get(&self) -> String { reqwest::get(/* ... */).await }
}

struct SpotifyClient<Http: HttpClient> {
    http: Http
}

#[maybe_async]
impl<Http: HttpClient> SpotifyClient<Http> {
    async fn endpoint(&self) { self.http.get(/* ... */).await; }
}

This could then be extended so that the user picks which client to use through feature flags in their Cargo.toml. For example, if client-ureq is enabled, it would in turn enable maybe_async/is_sync, because ureq is synchronous. This would remove async/.await and the #[async_impl] blocks, and the Rspotify client would rely internally on the implementation for ureq.

This solution does not have any of the disadvantages that I indicated above when describing previous attempts:

  • The code is not duplicated at all

  • No overhead, neither at runtime nor at compile time. If the user wants a blocking client, they can use ureq, which doesn't pull in tokio and friends

  • For the user, everything is also quite clear: you just need to configure the flag in the Cargo.toml file

But here, take a few minutes off from reading and try to think of why not to stop at this option. In fact, I can give you 9 months – that's how long it took me to answer this question…

Problem

The thing is that features in Rust must be additive: “enabling a feature should not disable functionality, and it should usually be safe to enable any combination of features”. Cargo may merge the features of a crate that appears more than once in the dependency tree, so that the same crate does not have to be compiled several times. If you want to understand this in more detail, the Cargo documentation explains it quite well [4].

Because of this optimization, mutually exclusive features can break a dependency tree. In our case, maybe_async/is_sync is a toggle feature that gets enabled through client-ureq. So if you try to compile with client-reqwest enabled as well, the build fails, because maybe_async is configured to generate synchronous function signatures. It is impossible to have a crate that depends, directly or indirectly, on both the synchronous and the asynchronous version of Rspotify. In fact, according to the Cargo reference, the whole concept of maybe_async is currently wrong.
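The reasoning above can be played out in a few lines of plain Rust. This is only a toy model of feature unification, not how Cargo is implemented, and the function names are invented for the example; the feature names are the ones used in this article.

```rust
use std::collections::BTreeSet;

/// Toy model: a crate that appears several times in the dependency tree is
/// built once, with the union of the features all its dependents asked for.
pub fn unify<'a>(requests: &[&[&'a str]]) -> BTreeSet<&'a str> {
    requests.iter().flat_map(|r| r.iter().copied()).collect()
}

/// Models the rule that `client-ureq` turns on `maybe_async/is_sync`.
pub fn is_sync_enabled(features: &BTreeSet<&str>) -> bool {
    features.contains("client-ureq")
}

/// `client-reqwest` needs async signatures, which `is_sync` strips away,
/// so the two cannot be active in the same build.
pub fn builds(features: &BTreeSet<&str>) -> bool {
    !(features.contains("client-reqwest") && is_sync_enabled(features))
}
```

With a blocking dependent asking for ["client-ureq"] and an async one asking for ["client-reqwest"], each set passes builds on its own, but the unified set contains both features and does not.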

The feature resolver v2

There is a common misconception that this problem is solved by the “feature resolver v2”, which is also explained very well in the documentation. It is enabled by default starting with the 2021 edition, and in earlier editions you can opt into it in your Cargo.toml. Among other things, this new version avoids unifying features in some special cases, but not in ours:

  • Features enabled on platform-specific dependencies for targets not currently being built are ignored.

  • Build dependencies and proc-macros do not share features with normal dependencies.

  • Dev-dependencies do not activate features unless building a target that needs them (like tests or examples).

Just in case, I tried to reproduce this myself, and it behaved as I expected. This repository is an example of conflicting features, which breaks with either resolver.

Other failures

There were several more crates that also had this problem:

  • arangors and aragog: wrappers for ArangoDB. Both use maybe_async to switch between async and sync mode [5] [6].

  • inkwell: a wrapper for LLVM. It supports many versions of LLVM that are not compatible with each other [7].

  • k8s-openapi: a wrapper for Kubernetes; it has the same problem as inkwell [8].

Fixing maybe_async

Once this crate started gaining popularity, an issue about this problem was opened in maybe_async. It explains the situation and demonstrates a fix:

async and sync in the same program fMeow/maybe-async-rs#6

maybe_async would now have two feature flags: is_sync and is_async. In both cases the crate would generate the functions in the same way, but with a _sync or _async suffix appended to the identifier, which avoids the conflict. For example:

#[maybe_async::maybe_async]
async fn endpoint() { /* stuff */ }

Generates the following code:

#[cfg(feature = "is_async")]
async fn endpoint_async() { /* stuff */ }

#[cfg(feature = "is_sync")]
fn endpoint_sync() { /* stuff with `.await` removed */ }

True, these suffixes are noisy, so I wondered if there was a more ergonomic way to solve this problem. I forked maybe_async and gave it a try; you can read more about the result in this comment thread. Long story short, it turned out to be too difficult and I eventually gave up.

The only path to fixing this edge case forced us to make Rspotify usability worse for everyone. But I find it unlikely that anyone would have to depend on both synchronous and asynchronous code – at least no one has complained so far. rspotify, unlike reqwest, is a “high-level” library, so it’s hard to imagine it appearing more than once in the dependency tree.

Perhaps it would be worth contacting the Cargo developers for help?

Official support

We at Rspotify are not the first to encounter this problem, so you may be interested in reading other discussions on this topic that have occurred in the past:

  • A now-closed RFC for the Rust compiler proposed adding the oneof configuration predicate (think #[cfg(any(…))] and the like) to support exclusive features. It would only make it easier to have conflicting features in cases where there is no other choice; features would still have to remain strictly additive.

  • That RFC started some discussion about whether exclusive features should be allowed in Cargo itself. It has some interesting material, but it did not go very far.

  • This issue in Cargo explains a similar case with the Windows API. The discussion contains many more examples and ideas, but none of them have made it into Cargo yet.

  • Another issue in Cargo asks for a way to easily test and build with combinations of flags. If features are strictly additive, then cargo test --all-features covers everything. If they are not, the user has to run the command with many combinations of feature flags, which is quite cumbersome. Unofficially this is already possible thanks to cargo-hack.

  • A completely different approach is based on the Keyword Generics Initiative. It appears to be the latest attempt at solving this problem, but it is still in the “research” stage, and at the time of writing there are no RFCs describing it.

According to this old comment, the Rust team has not given up on this topic; the discussion continues.

Although unofficial, another interesting approach that could be explored further in Rust is “Sans I/O”. It comes from the Python world and keeps the protocol logic (HTTP, in our case) separate from the code that actually performs network I/O, which maximizes reusability. There is an existing example of this in Rust called tame-oidc.
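As a rough illustration of the Sans I/O idea, the sketch below splits an endpoint into pure functions that build a request and interpret a response, plus a thin driver that is the only part touching the network. Every name in it (Request, track_request, parse_track_name, get_track_blocking) is hypothetical; neither Rspotify nor tame-oidc is claimed to look like this.

```rust
/// A request described as plain data; no socket or HTTP client involved.
#[derive(Debug, PartialEq)]
pub struct Request {
    pub method: &'static str,
    pub path: String,
}

/// Pure protocol logic: how to ask for a track.
pub fn track_request(id: &str) -> Request {
    Request { method: "GET", path: format!("/tracks/{id}") }
}

/// Pure protocol logic: how to interpret the raw response.
pub fn parse_track_name(status: u16, body: &str) -> Result<String, String> {
    match status {
        200 => Ok(body.trim().to_string()),
        other => Err(format!("unexpected status {other}")),
    }
}

/// The only part that performs I/O. A blocking caller passes a closure that
/// sends the request synchronously; an async driver would wrap the same two
/// protocol steps around an `.await` instead.
pub fn get_track_blocking<T>(id: &str, transport: T) -> Result<String, String>
where
    T: FnOnce(&Request) -> (u16, String),
{
    let request = track_request(id);
    let (status, body) = transport(&request);
    parse_track_name(status, &body)
}
```

Since the protocol functions never perform I/O, they can be shared by a sync and an async driver and unit-tested with a fake transport, with no feature flags involved.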

Conclusion

Here are the options we now have to choose from:

  • Ignore the Cargo reference. We could assume that no one will use both the synchronous and the asynchronous version of Rspotify at once.

  • Fix maybe_async and add the _async and _sync suffixes to every endpoint in our library.

  • Stop supporting both asynchronous and synchronous code. It has become a mess that we don't have the manpower to deal with, and it affects other parts of Rspotify. The problem is that some crates that depend on rspotify are blocking, such as ncspot or spotifyd, while others, such as spotify-tui, use async, so I'm not sure what they would do.

I know this is a problem I brought upon myself. We could simply say, “No, we only support async” or “No, we only support sync”. While there are users interested in being able to use both, sometimes you just have to say “no”. If a feature like this becomes so difficult to handle that the entire codebase turns into a mess, and you don't have the manpower to deal with it, then you have no other choice. If someone cared enough, they could simply fork the crate and convert it to synchronous for their own needs.

After all, most API wrappers and similar entities support either asynchronous or blocking code. So, serenity (Discord API), sqlx (SQL toolkit) and teloxide (Telegram API) are strictly asynchronous and at the same time very popular.

While it can be frustrating at times, I have no regrets about spending so much time trying to get synchronous and asynchronous code to work at the same time. My main reason for contributing to Rspotify was to learn. I had no deadlines and no stress; I just wanted to improve a Rust library as best I could in my free time. And I did learn a lot; I hope you did too by reading this article.

I guess the moral of the article is this: it's important to remember that Rust is, first and foremost, a low-level language, and some things are simply impossible to implement without overcomplicating things. In any case, I'm interested to see how the Rust team plans to solve these problems.

Sources

[1] Cleaning up the blocking module ramsayleung/rspotify#112 (comments)

[2] reqwest/src/blocking/client.rs @ line 757 — GitHub

[3] Cleaning up the blocking module ramsayleung/rspotify#112 (comments)

[4] Cargo's Documentation, “Feature unification”

[5] Proposal: Move sync and async features into separate modules fMeow/arangors#37

[6] aragog/src/lib.rs @ line 488 — GitLab

[7] inkwell/src/lib.rs @ line 107 — GitHub

[8] k8s-openapi/build.rs @ line 31 — GitHub
