The problem of alternative implementation

In this post, I'd like to discuss a trend I see in the software world all the time. In fact, I'd go so far as to say that it's happening in the hardware world as well, but I'll focus on software systems because that's what I work with. In this discussion, I'll touch on human psychology and describe a common pitfall that I hope you can avoid.

I've spent most of my career, both in academia and in industry, working on optimizing dynamically typed programming languages. In my master's thesis, I worked on a simple optimizing JIT for MATLAB. In my PhD, I worked on a JIT for JavaScript. Today I am working on YJIT, an optimizing JIT for Ruby that has already been merged into CRuby.

As part of my PhD, while working on my own JavaScript JIT, I read a lot of academic papers and blog posts about JIT compilers for other dynamic languages. Among other things, I read about the architecture of HotSpot, Self, LuaJIT, PyPy, TruffleJS, V8, SpiderMonkey, and JavaScriptCore. I also had the chance to talk to and even meet a lot of very smart people who were working on these projects.

One of the things that struck me was that the PyPy project found itself in a strange situation. The developers had created an advanced JIT compiler for Python that could provide much higher speeds compared to CPython. It seems like a lot of people could benefit from this performance boost, but PyPy has seen almost no “real world” use. One of the challenges for developers is that Python is a moving target. New versions of CPython come out regularly, always adding lots of new features, and PyPy can’t keep up, always being many Python versions behind. If you want your Python software to be compatible with PyPy, you’ll have to severely limit the Python features you use, and most Python programmers don’t want to bother with that.

Reading about LuaJIT, I found that it was and is highly regarded. Many people regard its author, Mike Pall, as an amazing programmer. LuaJIT provides a big increase in performance compared to the standard interpreted version of Lua, and is used quite actively in real projects. However, I again noticed that many Lua programmers are reluctant to use LuaJIT, because Lua keeps adding new features, and LuaJIT is many versions behind. This is a bit strange, considering that Lua is a language known for its minimalism. It seems that the Lua developers could have made an effort to slow down the addition of new features and/or coordinate with Mike Pall, but this has not been done.

Almost four years ago, I was hired by Shopify to work on Ruby. For some reason, the Ruby JIT space is particularly competitive: there were a lot of projects building Ruby JITs. TruffleRuby JIT boasted the most impressive performance numbers, but it wasn't used very much. There are practical reasons for this, like TruffleRuby having a much longer warm-up time than CRuby, but I again saw a dynamic similar to PyPy and LuaJIT: CRuby kept adding features, and TruffleRuby contributors had to work hard to keep up. It didn't even matter that TruffleRuby could be much faster, because Ruby users always treated CRuby as the canonical implementation, and anything that wasn't fully compatible with it wasn't worth considering.

Hopefully you get my drift. In my experience, positioning your project as an alternative implementation of something is a losing strategy. It doesn't matter how smart you are. It doesn't matter how hard you work. The problem is that when you start building an alternative implementation, you are subject to the whims of the canonical implementation. Its developers control the direction of the project, and you are left playing catch-up. In the case of JIT implementations of traditionally interpreted languages, there is a rather strange dynamic, because it is much faster to implement new features in an interpreter. The developers of the canonical implementation may even see you as a competitor they are trying to outrun. And you may end up in the role of Sisyphus.

Almost four years ago, with the support of Shopify, two colleagues and I started a project to build YJIT, another Ruby JIT. The key difference was that we built YJIT not as an alternative implementation, but directly inside CRuby itself. This meant making a lot of architectural compromises, but, most importantly, it meant that YJIT could be fully compatible with every feature of CRuby out of the box. YJIT is now the “official” Ruby JIT, used by Shopify, Discourse, GitHub, and more. If you’ve visited github.com or any Shopify store today, you’ve interacted with YJIT. We’ve had far more success than any other Ruby JIT compiler, and the most important factor in achieving that was ensuring compatibility.

After reading this, you might think that the most important lesson is the old adage “if you can't beat them, join them.” In a way, it is. What I've been saying is that if you start a project that tries to position itself as an alternative but superior implementation of something, you are likely to find yourself constantly playing catch-up and living in the shadow of the canonical implementation. The canonical project continues to evolve, and you have no choice but to follow it, with little or no say in the direction of your own project. That's not fun. You might have better luck merging with the canonical implementation. However, that's not the whole answer.

In the Ruby world, there is Crystal, a Ruby-like language that is statically compiled with type inference. Its developers chose to diverge from Ruby, so the language is intentionally incompatible with it, yet it has still achieved modest success. I think this is interesting because it puts the issue into perspective. Ruby users tend to dislike Crystal because it is “almost Ruby, but not quite.” Syntactically, it looks like Ruby, but it has many small differences and is in practice very incompatible. This only serves to confuse people by violating their expectations. Crystal would probably have had better luck if it had not marketed itself as being similar to Ruby in the first place.

Peter Thiel has a saying: “Competition is for losers”. His main point is that if you don't have to, you shouldn't put yourself in a position where you're forced to compete. My advice to young programmers: if, for example, you're thinking about creating your own programming language, don't try to create a subset of Python or something that looks too close to an existing language. Make something of your own. That way, you can develop your system at your own pace and in your own direction, without the expectation that your language should match the performance, feature set, or library ecosystem of another implementation.

I'll end with some caveats. What I said above applies if you're in a situation where there's a canonical implementation of a language or system. It doesn't apply in areas where there are open standards. For example, if you want to implement your own JSON parser, there's a well-written specification that's relatively small and doesn't evolve very quickly. Full compatibility is an achievable goal there. We also have a situation where there are multiple browser implementations of JavaScript. Part of what makes this possible is that there's an external standards body that manages the JS spec, and the people working on the JS standard understand that JIT-compiled implementations are critical to performance, and they're steering the language accordingly. They're not in a race to add a bunch of new features as fast as possible.
