Reliable Profiling in iOS with Flame Charts

Measuring the performance of iOS applications is typically done by profiling the application to calculate the time spent in each function. This is usually done using the Time Profiler in Xcode Instruments, but it is known to be slow and unreliable.

Emerge offers a profiling tool as part of Performance Testing in CI. The profile is rendered as a flame graph, which has proved to be an easy way to gain insight into the critical paths that affect application performance and to find fixes. Today we’re introducing a new way to use the same great profiling visualizations, completely local and open source.

ETTrace is an open source framework written in Objective-C, paired with a CLI (command-line interface) written in Swift, that profiles and renders data entirely locally. It’s designed to be simple and fast: just link the framework into your application, run ettrace to start profiling, and stop to instantly see the flame chart. There is no need to restart the application, nor to wander through long menus to see the result.

Why we need a new profiler

Performance Analysis by Emerge is designed to prevent regressions from being merged into your codebase. It gives you consistent results under certain scenarios configured in CI. However, when debugging, sending to CI is not always convenient. You need something fast and local.

ETTrace makes it easy to debug performance issues. If you’re using Emerge, then when you’re ready, you can validate your fix by submitting it to CI. In addition, you can explore all areas of code in your application without writing custom tests. If you find critical performance paths with ETTrace, you can set them up in CI for monitoring.

You may be wondering: why not just use Instruments for this? Although Time Profiler is the best iOS profiling tool, it can be tricky to use if you are not a performance expert (and even if you are). This is partly due to its non-intuitive visualization and partly due to the sheer size of the tool.

At Emerge, I have spoken to many engineers working on large applications, and the feedback is always the same: Time Profiler can be brittle and slow. Even while capturing the screenshots for this article, I ran into a few freezes and had to force quit. Symbolization problems are also frequent, with the generated “trace”* showing only addresses instead of function names.

*trace: a recording of performance data collected while the application runs on an iOS device.

Instruments vs. ETTrace

ETTrace supports “symbolization” in two ways. First, if you have dSYMs (debug symbols), you can pass them directly to the tool via the --dsyms argument. Second, for simulator builds, ETTrace automatically uses the symbol table in the application binary. Because the tool is open source, any symbolization issues you run into are easy to debug, unlike with Instruments.

How Sampling Works (Selecting Discrete Data)

ETTrace is a sampling profiler, which means that it records the stack at fixed intervals to build the visualization. Only the main thread is sampled, since that is where UI-related performance issues such as freezes typically occur. Sampling runs on a background thread, something like this:

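// Background thread that periodically records the main thread's stack.
sStackRecordingThread = [[NSThread alloc] initWithBlock:^{
    NSThread *thread = [NSThread currentThread];
    // Keep sampling until this thread is cancelled.
    while (!thread.cancelled) {
        [self recordStack];
        usleep(4500); // sleep 4.5 ms between samples
    }
}];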

Each sample includes the list of addresses found on the stack and the current timestamp. The addresses are “symbolized” after the “trace” is complete, and the symbolized stacks are then aggregated. The maximum time between any two stacks should be 5 ms (given that the sleep can take up to 0.5 ms longer than the time we specified). To keep the traces accurate, any additional time beyond this is accumulated and reported separately.
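The capping rule described above can be sketched in a few lines of Swift. This is only an illustration, not ETTrace's actual code: the `StackSample` type, the `attributeDurations` function, and the choice to measure timestamps in seconds are all assumptions made for the example.

```swift
// Hypothetical sample record: the addresses captured from the main
// thread's stack plus the time (in seconds) at which they were captured.
struct StackSample {
    let addresses: [UInt64]
    let timestamp: Double
}

// Cap taken from the text: no single sample is credited with more than 5 ms.
let maxSampleInterval = 0.005

/// Credits each sample (except the last) with the time until the next
/// sample, capped at `maxSampleInterval`. Anything beyond the cap is
/// summed into `excess` instead of being attributed to a stack.
func attributeDurations(_ samples: [StackSample]) -> (durations: [Double], excess: Double) {
    var durations: [Double] = []
    var excess = 0.0
    for (current, next) in zip(samples, samples.dropFirst()) {
        let gap = next.timestamp - current.timestamp
        durations.append(min(gap, maxSampleInterval))
        excess += max(0, gap - maxSampleInterval)
    }
    return (durations, excess)
}
```

With samples taken at 0 ms, 4 ms, and 14 ms, the first sample is credited with 4 ms, the second with the 5 ms cap, and the remaining 5 ms ends up in `excess`.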

Technically, the visualization is a flame chart, which means that the nodes are ordered by time on the x-axis. It follows this data structure:

struct FlameChartNode {
  let name: String
  let duration: Double
  let children: [FlameChartNode]
}
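To make the time ordering concrete, here is a minimal sketch of how such a tree could be built from symbolized samples. It is not taken from ETTrace: `SymbolizedSample` and `buildChart` are hypothetical names, and frames are assumed to be listed outermost first. Only consecutive samples that share a frame are merged, so a function that runs twice with something else in between appears as two separate nodes.

```swift
struct FlameChartNode {
    let name: String
    let duration: Double
    let children: [FlameChartNode]
}

// Hypothetical input: one symbolized stack (outermost frame first)
// and the time credited to it.
struct SymbolizedSample {
    let frames: [String]
    let duration: Double
}

/// Builds the nodes for one stack depth, preserving sample order.
func buildChart(_ samples: [SymbolizedSample], depth: Int = 0) -> [FlameChartNode] {
    var nodes: [FlameChartNode] = []
    var index = 0
    while index < samples.count {
        // A stack that ends above this depth contributes no node here.
        guard depth < samples[index].frames.count else {
            index += 1
            continue
        }
        let name = samples[index].frames[depth]
        // Extend the run while consecutive samples share this frame.
        var end = index
        while end < samples.count,
              depth < samples[end].frames.count,
              samples[end].frames[depth] == name {
            end += 1
        }
        let run = Array(samples[index..<end])
        let duration = run.reduce(0.0) { $0 + $1.duration }
        nodes.append(FlameChartNode(name: name,
                                    duration: duration,
                                    children: buildChart(run, depth: depth + 1)))
        index = end
    }
    return nodes
}
```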

The visualization generated by performance testing with Emerge is a flame graph, where stacks are merged by name at each level. Nodes have no specific start or end time because they are not ordered by time; the x-axis shows only duration. The data structure looks like this:

struct FlameGraphNode {
  let duration: Double
  // Children is keyed by node name
  let children: [String: FlameGraphNode]
}
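Merging by name can be sketched as follows. This is an illustrative assumption rather than Emerge's implementation: `buildGraph` is a hypothetical helper, the input is a list of symbolized stacks (outermost frame first) with the time credited to each, and the returned root is an unnamed node holding the total.

```swift
struct FlameGraphNode {
    let duration: Double
    // Children is keyed by node name
    let children: [String: FlameGraphNode]
}

/// Groups samples by the frame name at `depth`, regardless of when
/// they occurred, and recurses into each group.
func buildGraph(_ samples: [(frames: [String], duration: Double)],
                depth: Int = 0) -> FlameGraphNode {
    let total = samples.reduce(0.0) { $0 + $1.duration }
    // Only stacks that reach this depth can contribute a child.
    let deeper = samples.filter { $0.frames.count > depth }
    let grouped = Dictionary(grouping: deeper) { $0.frames[depth] }
    let children = grouped.mapValues { buildGraph($0, depth: depth + 1) }
    return FlameGraphNode(duration: total, children: children)
}
```

Unlike a time-ordered tree, two separate runs of the same function under the same parent collapse into a single child whose duration is their sum.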

Since ETTrace only visualizes one “trace” of the application (rather than the average of many), the data is presented as a flame chart, which is easier to debug. ETTrace also has a comparison feature that lets you load two traces and compare them to see what improved or regressed. When using this option, the visualization is presented as a flame graph.
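One way to think about the comparison is as a per-name subtraction of two merged trees. The sketch below is an assumption about how such a diff could work, not a description of ETTrace's comparison feature; `DiffNode` and `diffGraphs` are hypothetical names.

```swift
struct FlameGraphNode {
    let duration: Double
    let children: [String: FlameGraphNode]
}

// Hypothetical result type: positive delta means the second trace
// spent more time in this node, negative means less.
struct DiffNode {
    let delta: Double
    let children: [String: DiffNode]
}

/// Compares two merged trees name by name. A node missing from one
/// side is treated as having zero duration there.
func diffGraphs(before: FlameGraphNode?, after: FlameGraphNode?) -> DiffNode {
    let delta = (after?.duration ?? 0) - (before?.duration ?? 0)
    let beforeChildren = before?.children ?? [:]
    let afterChildren = after?.children ?? [:]
    let names = Set(beforeChildren.keys).union(afterChildren.keys)
    var children: [String: DiffNode] = [:]
    for name in names {
        children[name] = diffGraphs(before: beforeChildren[name],
                                    after: afterChildren[name])
    }
    return DiffNode(delta: delta, children: children)
}
```

Because names rather than timestamps line the two traces up, this kind of comparison only makes sense on the merged (flame graph) form.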

Understanding Protocol Conformance

As an example, let’s use ETTrace to analyze the impact of protocol conformance checks on application startup, using the open source Mastodon application, modified to include more protocol conformances. Normally, ETTrace is used after application startup, but to profile directly from startup, we add the ETTraceRunAtStartup key, set to YES, to Info.plist.

Now we can run the application with ETTrace.framework linked and start profiling! Make sure the app is removed from your phone before installing from Xcode. Then install the app but don’t launch it. Finally, run ettrace from the command line and follow the instructions, including manually launching the application from the home screen. The resulting flame chart shows a large amount of time spent on protocol conformance checks: more than 60 ms!

Slow protocol conformance checks in ETTrace.

Next, launch the same application a second time and run ettrace to capture another trace. This time the conformance checks are so fast that ETTrace doesn’t even record them! With both traces selected, you can use the differential flame graph to confirm the cause of the slowdown.

This demonstrates that protocol conformance checks still take a slow path the first time an app launches on iOS 16 (including after installing an update) and are very fast on subsequent launches. However, other kinds of protocol conformance checks, such as an as? cast whose result is nil, may still be very slow. Running your application locally with ETTrace can help identify these slow spots.
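For readers less familiar with the pattern, this is the kind of cast being discussed. The types here are made up for illustration (`Trackable`, `Plain`, and `Tagged` do not come from Mastodon or ETTrace); the point is only that the cast has to be resolved at runtime and can come back nil.

```swift
// Hypothetical protocol and types, used only to show the cast.
protocol Trackable {
    var trackingID: String { get }
}

struct Plain {}

struct Tagged: Trackable {
    let trackingID: String
}

/// The static type is `Any`, so whether the value conforms to
/// `Trackable` can only be decided by a runtime conformance check.
func trackingID(of value: Any) -> String? {
    (value as? Trackable)?.trackingID
}

// `Plain` does not conform, so the cast fails and the result is nil.
let missing = trackingID(of: Plain())
// `Tagged` conforms, so the cast succeeds.
let found = trackingID(of: Tagged(trackingID: "abc"))
```

Casts like the first one, where the answer is "does not conform", are the case the paragraph above flags as potentially slow.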

Automation in CI

Debugging performance locally with ETTrace is only part of the performance optimization process. You want a fast iteration cycle that lets you evaluate new ideas, but continuous testing and alerting provide additional protection against problems in production and validate local performance measurements. Emerge offers a performance testing feature that does just that. Local performance debugging with ETTrace, together with Emerge Performance Analysis, provides a unified performance optimization workflow for developers, resulting in continuous improvement in application performance. If you want to know more about these tools, feel free to reach out, and if you have any feedback on ETTrace, please open a GitHub issue!

Thanks to Itay Brenner for his work on this project, and to Filip Busic, Miguel Jimenez and Keith Smiley for their feedback while testing the tool earlier!
