Migrating to the New Emscripten LLVM Backend

At PSPDFKit, we’re big fans of the Emscripten compiler. For a number of years now, it’s allowed us to compile our battle-tested PSPDFKit SDK code and run it directly in modern browsers!

In this post, I’ll share our experience of updating to the third iteration of Emscripten’s backend: vanilla LLVM. It’s a change welcomed by many in the community, and since the LLVM backend is now the only backend shipped with Emscripten, we thought it was time to share how our migration from the old Fastcomp backend went.

Why the Update?

In the past, Emscripten shipped with a forked version of LLVM named Fastcomp, which offered support for WebAssembly and asm.js. This worked great, but it had its drawbacks.

Then came the announcement of a new backend based on upstream LLVM. It promised faster link times and faster runtime performance, and it also gives us access to many of the more modern clang tools.

Fastcomp was based on LLVM 6, which posed a slight problem for our codebase, as we build with at least LLVM 9 in every other environment. Supporting different compilers and versions meant lots of branching in our build setup, which further complicated our build procedures.

Now that upstream LLVM has replaced Fastcomp, we could finally resolve that issue and gain the advantages outlined above. With that said, I’ll delve into the ins and outs of our experience.

What Issues Did We Encounter?

I’m going to start by introducing the issues we had while migrating. I’m doing so not to be negative, but because there were actually very few, and highlighting these issues might help others save time during the transition.

Upstream LLVM Bugs

It’s great that the backend is up to date with the upstream LLVM repo, but that also has some drawbacks. For example, when we first made the switch, we were hitting compilation issues on specific files (although only a couple).

After some poking around, it became apparent that we were hitting a known LLVM bug that occurs when compiling with fast-math. Fortunately, it had already been reported in the Emscripten repository, so we didn’t have to spend time tracking it down ourselves.

The lesson here is that dependencies can obviously have bugs; that’s the natural consequence of living on the edge.

Applying Emscripten Options

Even with the LLVM backend, there are still many Emscripten-specific settings, which are passed with the -s option. With Fastcomp, you often only had to pass these -s settings to the linker, because a lot of the code generation happened at link time (something I’ll talk more about later). This is no longer true, so it’s often much safer to pass the settings at both compile time and link time. This is advice you’ll often see in the Emscripten documentation.

Ironically, this was only an issue for us while we were initially testing the migration. With the Fastcomp backend, passing -s WASM=0 produced output in the asm.js format; with the new LLVM backend, it produces plain old JavaScript. As a result, we were able to remove most of the flags we had been passing to the compiler and rely on many more default values, which in turn means we’re much less likely to be tripped up by flags that are only passed to the linker.
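To make the compile-time-and-link-time advice concrete, here’s a minimal sketch using C++ exceptions, one feature the Emscripten documentation says to enable at both stages. The helper names (parse_page, parse_page_or) and the file names are made up for illustration, and the build commands in the comments are an assumption about a typical setup, not a description of our actual build.

```cpp
// Assumed build steps (hypothetical file names):
//   em++ -fexceptions -c pages.cpp -o pages.o        (compile step)
//   em++ -fexceptions pages.o main.o -o app.js       (link step)
// The same flag appears in both commands on purpose.
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a 1-based page number and throws on bad input.
int parse_page(const std::string& text) {
    std::size_t used = 0;
    // std::stoi throws std::invalid_argument when no digits can be parsed.
    const int page = std::stoi(text, &used);
    if (used != text.size() || page < 1) {
        throw std::invalid_argument("bad page number: " + text);
    }
    return page;
}

// Hypothetical wrapper: relies on exception catching being enabled.
int parse_page_or(const std::string& text, int fallback) {
    try {
        return parse_page(text);
    } catch (const std::exception&) {
        return fallback;
    }
}
```

Calling parse_page_or("12", 1) returns the parsed page, while parse_page_or("abc", 1) falls back to 1. If the flag were only passed at one of the two stages, the catch block might not behave as expected, which is the kind of mismatch described above.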

What Did We Gain?

As mentioned earlier, it’s great that the backend has now been pulled in line with a more modern version of LLVM. This reduces the amount of effort required to support multiple compilers. But it doesn’t stop there!

For most of the gains covered in the following sections, we didn’t have to do anything. The new backend just worked out of the box, and what’s more, it worked much better.

Binary Size

While we were migrating to the new backend, we dug a little further into ways we could reduce the size of the binary we distribute. The great news is that we shaved off around 12 percent of our original binary size! Not all of that was due to the update, so I ran some separate tests just for this blog post and found that switching from Fastcomp to the LLVM backend, with no other changes to the build procedure or code, accounted for a 6 percent decrease.

If you know anything about the web environment, you’ll understand how great that is. Every extra byte is a byte that needs downloading over the network, and if we can reduce that in any way, users have a much better experience.

Performance

Most of the performance gains follow from the point above: Because the binary is smaller, the WebAssembly binary downloads and instantiates faster.

I measured a roughly 7 percent reduction in download and instantiation time, which again is important when you’re aiming for a faster start-to-render time.

Aside from the gains that come from the smaller binary, I also measured a 7 percent speedup in document opening. This is highly dependent on the nature of the PDF, but any gain is welcome.
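This post doesn’t go into how the timings were collected, so the following is only a sketch of one way to take such a measurement in C++ with std::chrono. The names measure_ms and percent_faster are hypothetical, and reporting the result as a percentage reduction in elapsed time is an assumption about how a figure like the ones above could be computed.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds.
template <typename Fn>
double measure_ms(Fn&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: percentage by which `after_ms` is lower than
// `before_ms`. Assumes before_ms is greater than zero.
double percent_faster(double before_ms, double after_ms) {
    return (before_ms - after_ms) / before_ms * 100.0;
}
```

With made-up numbers, an operation that took 200 ms before and 150 ms after gives percent_faster(200.0, 150.0) == 25.0. In practice, you’d want to repeat each measurement several times and compare averages or medians, since a single run can be noisy.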

Compilation Performance

Runtime performance wasn’t the only area to benefit. As I mentioned earlier, moving to the new LLVM backend shifted some responsibilities between the compiler and the linker. In simple terms, the compiler took on a little more work and the linker a little less. This may seem uninteresting, but once you realize that it results in a faster turnaround time for incremental builds, it starts to make more sense.

In recent years, even lower-level developers have become reliant on quick incremental builds. How many times have you missed a ; or called the wrong function? Being able to quickly change the code and know that it’s not going to take another five minutes (or longer!) to link your code is a real time saver!

In our codebase, using the -O0 flag, we found a speedup of 44 percent in link performance. The overall compile and link time for a full build of the codebase didn’t change significantly, but the gains for incremental builds are great.

It’s worth noting that our release build time actually slowed down, which is a shame, but it’s a price worth paying for smaller binaries and better performance.

Clang Tools

I talked about debugging tools for WebAssembly in The State of Debugging in WebAssembly, so if you’re interested in that topic, I’d suggest heading over and reading that blog post.

At PSPDFKit, we’ve been using clang sanitizers to check our code for leaks and undefined behavior for a while. But with WebAssembly, that was never possible. Since updating to a more modern LLVM backend, we now have the power of sanitizers when running in WebAssembly.

What’s even cooler is that we’ve already caught a few bugs because of sanitizers. So no matter how confident you are, new tools can always give you a helping hand.
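As an illustration of the kind of bug a sanitizer is designed to report, here’s a minimal sketch. The functions are hypothetical and not taken from our codebase, and the build command in the comment is an assumption about a basic setup; the Emscripten sanitizer documentation may call for additional memory-related settings.

```cpp
// Assumed build command for a program that includes this file:
//   em++ -fsanitize=address -fsanitize=undefined -g app.cpp -o app.html
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper with a latent bug: it trusts `count` and never
// checks it against values.size().
int sum_first(const std::vector<int>& values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];  // out-of-bounds read when count > values.size()
    }
    return total;
}

// One possible fix: clamp the count to the number of available elements.
int sum_first_checked(const std::vector<int>& values, std::size_t count) {
    const std::size_t n = std::min(count, values.size());
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

With valid input, both functions agree. Calling sum_first with a count larger than the vector reads past the end of its storage, which can go unnoticed in a normal build but is the kind of access AddressSanitizer is built to flag at runtime.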

Conclusion

If you’re still stuck using Fastcomp, I hope I’ve convinced you to make the switch and to make it soon. This is not only because, as of version 2 of Emscripten, there is no Fastcomp backend, but also because the migration is fairly painless and brings many benefits.

Have you already migrated? What roadblocks did you hit? It’s always interesting to hear about others’ experiences with Emscripten. Feel free to share with us on Twitter.
