Clang has powerful sanitizers that help you find bugs faster, fix them with more confidence, and track down otherwise impossible-to-reproduce race conditions. These tools are mature and have existed in various forms since 2010. We find them so useful that we collected our experiences using them across iOS, Android, and our C++ core to help you through some gotchas.
Enter Clang’s AddressSanitizer (ASan). It’s a fast memory error detector, based on both compiler instrumentation and a runtime library. It tracks the state of application memory in a shadow memory region, and the typical slowdown is a very acceptable 2×. Apple added a checkbox for it in Xcode 7 and further improved support for it in Xcode 8.
Let’s take a small detour here. Our SDK consists of many subprojects and multiple testing targets. We keep all of our configuration in sync using xcconfig files. The actual project file only contains the absolute minimal set of changes, and we configure everything we can with these shared files.
However, we have two opposing goals:
Tests should be easy to debug.
Tests should match the resulting binary as closely as possible.
We resolve this by having a configuration file, Defaults-Testing-CI.xcconfig, only for our Jenkins CI. Our actual configuration script is much larger, but this is just a selection of the most interesting settings in the context of this article.
GCC_TREAT_WARNINGS_AS_ERRORS = YES
ENABLE_NS_ASSERTIONS = YES

// from https://github.com/WebKit/webkit/blob/master/Tools/asan/asan.xcconfig
// This should allow us to get better stack traces on errors.
OTHER_CFLAGS = $(OTHER_CFLAGS_COMMON) -fno-omit-frame-pointer -g

// Allows conditional `include` of files (CI files should only exist on CI).
#include? "Defaults-Testing-CI.xcconfig"
// Release flags.
LLVM_LTO = YES_THIN
GCC_UNROLL_LOOPS = YES
GCC_OPTIMIZATION_LEVEL = s
SWIFT_OPTIMIZATION_LEVEL = -Owholemodule

// Code protection.
STRIP_INSTALLED_PRODUCT = YES
SEPARATE_STRIP = YES
COPY_PHASE_STRIP = YES
DEAD_CODE_STRIPPING = YES
STRIP_STYLE = non-global
What’s not documented is the equivalent switch that you can use in your xcconfig files:
CLANG_ADDRESS_SANITIZER = YES
And when we say “not documented,” we mean it: on the first page of Google, you’ll literally find only a relevant tweet and our own radar (rdar://28250805 – Document xcode config settings to enable Clang Sanitizers). Let’s be thankful WebKit is open source.
⚠️ Warning: Since this flag is undocumented, it might change without warning, and there are some hints that it might be renamed.
Using this flag makes it easier to dynamically switch this on or off without having to create a separate Xcode configuration that would be much harder to maintain. Additionally, you can configure your CI to run tests both with and without ASan, which gives you strong assurance of memory correctness while still testing the binary that you actually ship to customers. We currently have a farm of 15 Mac Minis that have a lot of fun testing all variants per commit.
Now, that would be far too easy, wouldn’t it?
Enabling ASan for us triggered an issue in some C++ code on launch. It seems this is a known false positive on ODR detection, and it’s easy to fix by setting ASAN_OPTIONS=detect_odr_violation=0 in the environment variables. This works great in Xcode if you run your binary normally. It also used to work for tests in Xcode 7.3.1. However, since Xcode needs some custom settings for tests as well, there’s a bug in Xcode 8 that no longer merges the custom settings with yours, so your settings are ignored. rdar://28103342 – ASAN_OPTIONS no longer settable when running tests within Xcode. (Regression) took us a long time to figure out. We even opened a DTS, but Apple doesn’t help for beta software, and back then, Xcode 8 was already GM, but not yet officially released.
The good news is that ASan also checks for a function named __asan_default_options() to get custom settings at runtime, and this works. You need to make sure to plant that function in your test host. If you don’t yet have a test host, set one up; you’ll need it for so many things (e.g. NSUserDefaults), and it makes tests much more predictable. It also helps when you want to track code coverage.
Of course, that’s not the full story. There might be places that report invalid memory access that you’re unable to control. We still use Apple’s CoreGraphics CGPDF code in some older tests to generate test data. There are many problems around correctness with Apple’s PDF renderer, which forced us to move away from it over a year ago. PSPDFKit now ships its own PDF renderer, but we haven’t yet migrated all of our tests. For generating test data, CGPDF is good enough. However, ASan reported that CGPDF accesses already freed memory. Since this is deep in CoreGraphics, we can’t do anything to fix it, other than writing a radar.
That’s where ASan suppression lists come in. Suppressions can’t be directly added inside the options string; they have to be in a separate file, which is referenced via the suppressions=FILEPATH option. We have a project-wide file named AddressSanitizerSuppressions.supp that we copy into our test app hosts in the resource step, which allows us to use a relative path. (This sounds easy, but it took us days and the help of Twitter folks to really figure this one out.)
# Apple has issues in there that we cannot fix.
interceptor_via_fun:pdf_Finalize
The file is always copied, even if we don’t enable ASan. In that case, it just doesn’t do anything, and it’s only used for the test hosts anyway. Google has great documentation on the available options, and the suppressor file options are well-documented on the LLVM website.
We have an ASAN_OPTIONS.h file that contains the implementation. Import this into your test host app delegate and you’re good to go.
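As a rough illustration, a header like this could look something like the following minimal sketch. The option string is an assumption pieced together from the settings mentioned in this article (the ODR workaround and the relative suppressions path); `verbosity=1` is added only as a guess at what produces startup logging, and the attributes follow common practice for keeping the hook visible and uninstrumented.

```c
/* ASAN_OPTIONS.h - hypothetical sketch, not our exact file.
   Import it into exactly one source file of the test host, since it
   contains a function definition rather than just a declaration. */

/* ASan looks up this hook at startup to read default options.
   The attributes keep the function exported, prevent it from being
   stripped, and exclude it from ASan instrumentation. */
__attribute__((no_sanitize_address))
__attribute__((visibility("default")))
__attribute__((used))
const char *__asan_default_options(void) {
    /* Options are separated by colons. All values below are assumptions
       chosen for illustration. */
    return "detect_odr_violation=0:"
           "suppressions=AddressSanitizerSuppressions.supp:"
           "verbosity=1";
}
```

Because the function is only called by the ASan runtime, it’s harmless in builds where ASan is disabled.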
You’ll see a log similar to this when you run it:
==32672==AddressSanitizer: libc interceptors initialized
|| `[0x200000000000, 0x7fffffffffff]` || HighMem ||
|| `[0x140000000000, 0x1fffffffffff]` || HighShadow ||
|| `[0x120000000000, 0x13ffffffffff]` || ShadowGap ||
|| `[0x100000000000, 0x11ffffffffff]` || LowShadow ||
|| `[0x000000000000, 0x0fffffffffff]` || LowMem ||
MemToShadow(shadow): 0x120000000000 0x123fffffffff 0x128000000000 0x13ffffffffff
redzone=16
max_redzone=2048
quarantine_size_mb=64M
malloc_context_size=30
SHADOW_SCALE: 3
SHADOW_GRANULARITY: 8
SHADOW_OFFSET: 0x100000000000
==32672==Installed the sigaction for signal 11
==32672==Installed the sigaction for signal 10
==32672==T0: stack [0x7fff59945000,0x7fff5a145000) size 0x800000; local=0x7fff5a13cfe8
AddressSanitizer: reading suppressions file at /Volumes/CI/ci/Library/Developer/CoreSimulator/Devices/3F1B4940-0E64-42B2-9982-C9D02DC17001/data/Containers/Bundle/Application/3285F08A-FA39-4704-904F-85538FFD8D70/PSPDFTestHost.app/AddressSanitizerSuppressions.supp
==32672==AddressSanitizer Init done
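To make the memory ranges in this log a little less cryptic, here is a small sketch of the address-to-shadow mapping that ASan is documented to use: shift the address right by the shadow scale, then add the shadow offset. The function name `mem_to_shadow` is our own for illustration, and the two constants are simply copied from the log above; they differ between platforms.

```c
#include <stdint.h>

/* Values copied from the log above (SHADOW_SCALE / SHADOW_OFFSET).
   They are specific to that run and platform. */
#define SHADOW_SCALE  3
#define SHADOW_OFFSET 0x100000000000ULL

/* Maps an application address to the address of its shadow byte.
   One shadow byte describes 2^SHADOW_SCALE = 8 bytes of application
   memory, which matches SHADOW_GRANULARITY: 8 in the log. */
uint64_t mem_to_shadow(uint64_t addr) {
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}
```

Feeding the LowMem and HighMem boundaries from the log through this function yields exactly the LowShadow and HighShadow boundaries. For example, `mem_to_shadow(0x200000000000)` gives `0x140000000000`, the start of HighShadow.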
With this setup, ASan helped us discover a few rare, small, but hard-to-track-down memory corruptions, including one that was sitting extremely deep in CoreImage’s Kernel compiler (rdar://28252672 – Crash: Double-Free when using Core Image (deep in CoreImage /libFosl_dynamic)). We now run all our tests with it.
Don’t panic; this also works in a pure Swift project, but you can’t just import the ASAN_OPTIONS.h file in your app delegate. And because the bridging header is also just a header file, importing it there also won’t work. So you have to be a little bit creative. The trick is to create an empty .m file and import ASAN_OPTIONS.h there.
ASan/TSan can also be enabled via the command line directly:
xcodebuild -help | grep -i sanitizer
    -enableAddressSanitizer YES|NO    turn the address sanitizer on or off
    -enableThreadSanitizer YES|NO     turn the thread sanitizer on or off
Thanks to John Engelhart (Mr. JSONKit) for the tip!
It’s possible to run your NDK-compiled code with ASan, but it’s not as straightforward. That’s because ASan has to intercept the calls to free to properly do memory accounting. That’s not a problem in statically linked binaries, but in Android apps, that means the ASan runtime library needs to be preloaded into the Zygote process that’s actually running the app.
The Android NDK includes a script that modifies the OS running on a device to preload ASan. It’s called asan_device_setup and lives inside the toolchains/prebuilt/<arch>/bin directory. Since it copies ASan libraries to the system partition, it’ll only work on devices that allow root access. The script itself will enable ASan for all processes on the device, which turns out to be a serious problem. If a system service loads a library with memory leaks, ASan will trigger an exception and cause the device to boot loop. After bricking three separate devices, we found out that trying to run ASan on an actual device is not a good idea. We noticed the same issues when trying to use ASan with the Genymotion emulator.
You can successfully use asan_device_setup with the bundled Android Emulator. We found that both the 6.0 and 7.0 x86 images worked well. We recompiled our NDK code, adding -fsanitize=address -fno-omit-frame-pointer to LOCAL_LDFLAGS. Running the app on the AVD then showed any memory issues as stack traces inside Logcat and helped a lot with finding memory leaks. The only remaining issue was that the stack traces weren’t fully symbolicated; it seems that ndk-stack doesn’t recognize them, and ASan on a device itself can’t properly resolve all the symbols. The Chromium project actually uses a custom-built script to do desymbolication, which is unfortunately not compatible with general apps. This is a good idea for a side project. ;)
More information about ASan on Android can be found on the google/sanitizers project Wiki.
There’s also ThreadSanitizer (ENABLE_THREAD_SANITIZER) and MemorySanitizer, which both have a much higher runtime cost, but can help you find even more bugs. (MemorySanitizer currently isn’t supported on macOS.)
ThreadSanitizer especially can help you fix a whole set of hard-to-find edge cases and races. Since performance is much worse, we use a separate Jenkins job. By commenting “Run TSAN” on a pull request, we trigger a complete run, which takes about an hour (compared to ~20 minutes with ASan). Start with TSAN_OPTIONS.h, similar to how ASan is enabled.
ASan and ThreadSanitizer can’t be enabled in the same process, so disable ASan if you run ThreadSanitizer. We needed suppressions (see ThreadSanitizerSuppressions.supp) for the ThreadSanitizer, ironically, for Apple’s PDF implementation in CoreGraphics, which just seems to be not very well tested. (We implemented our own PDF renderer in our PDF SDK, but we’re using Apple’s PDF code in our automated tests to generate simple test assets.)
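ThreadSanitizer reads its defaults from a hook that mirrors the ASan one. The following is only a sketch of what a TSAN_OPTIONS.h might contain, assuming the suppressions file is copied next to the test host binary in the same way as for ASan; the relative path is an assumption for illustration.

```c
/* TSAN_OPTIONS.h - hypothetical sketch, not our exact file.
   Import it into exactly one source file of the test host. */

/* The ThreadSanitizer runtime calls this hook at startup to read
   default options, analogous to __asan_default_options(). */
__attribute__((visibility("default")))
__attribute__((used))
const char *__tsan_default_options(void) {
    /* Assumed relative path to the suppressions file in the bundle. */
    return "suppressions=ThreadSanitizerSuppressions.supp";
}
```

Only import one of the two option headers into a given build variant if you want to keep the ASan and TSan configurations clearly separated.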
Some people could get the idea that enabling sanitizers for production binaries is a good way to harden security and find more bugs. But they aren’t hardening tools; please don’t abuse them as such.
For “serious” business, there’s also Valgrind and Dr. Memory. Since they have lacking (or no) support for macOS and don’t really work with iOS either (although some people try), they’re only mentioned for completeness. However, if you have a cross-platform codebase (like we do), these tools are phenomenal.
If you want to see for yourself how rock solid our upcoming PDF Viewer app is, get the app. It’s available for both iOS and Android — onboarding is just a few clicks, and it’s free.