How to speed up the launch of an iOS application 2x using Network Instrument

An application is essentially data from the network connected to a graphical interface. There are plenty of articles about UI, but almost no one talks about the network, even though it is what determines how long the user waits for a response. From the developer's side it often looks simple: "well, I created a session, ran a request, handled an error – what else could there be?"

If you look at all the requests from the outside, many questions arise: should you reuse URLSession.shared? Why do the first requests, even very simple ones, take longer than the rest? How do you speed up launch when there are many requests? How do you speed up image loading, monitor network quality, and so on?

When analyzing our applications with Network Instrument, we found a dozen different problems. I'm sure at least one of them is in your application too.

How to launch

Feel free to skip this part if you have already worked with other tools.

Build the application for profiling on your phone: select your device and press ⌘I. Profiling usually uses the Release configuration, which may prevent the app from being installed on the device.

If the application is already built and you don't want to rebuild it, open Instruments from the menu bar: Xcode → Open Developer Tool → Instruments:

Among all the tools, select Network:

Network will show where requests are being made, but if you want to analyze what's going on in the gaps between them, you'll need to add Time Profiler. Click + Instrument and add the one you need by filtering the instruments by name:

To start recording, press the red round button or ⌘R. Then perform the actions in the app that you want to profile – for example, wait until the menu is shown – and press the black stop square. Usually you then have to wait another minute while the data is processed.

To properly scale the graph, you need to hold down Option and select the desired area – it will stretch across the entire width of the screen.

There will be many data tracks, and the top ones are unlikely to be useful. But if you expand the application's row, you will find a lot of interesting things: which data was processed by which URLSession, on which thread, etc. If you want to focus on just a couple of tracks, click the plus sign to the left of a row – it will move to the lower section, which can be stretched to fill the screen.

You can only profile your own applications – the profiler will not be able to attach to others.

As a result, you will capture a complete profile of the various requests. Let's analyze them in detail.

What does a request consist of?

The URLSessionTaskTransactionMetrics documentation includes a good diagram of a request's stages. Let's look at it in more detail.

  1. A Task is created for the network request;

  2. A domain lookup via DNS turns the host string into the IP address to connect to;

  3. A TCP connection is established, then a secure TLS connection is configured;

  4. Finally, we send the request to the server, wait, and receive the response.

Of course, all this detail is hard to see in code or logs, which is why we need the Network instrument: it draws a graph of all requests at once. With several requests the picture changes significantly: URLSession learns the server's IP from the first request, so subsequent requests don't need to establish a connection – they only execute the last part of the diagram. Below is a snapshot of the launch of the Kebster! application, consisting of 10 requests:

What can be seen in the diagram:

  • purple indicates the period when the request is blocked waiting for the connection to the host and is not actually executing. The connection to the server is reused by all requests within one URLSession if you are using HTTP/2. Below we'll look at examples of how this breaks;

  • blue indicates the connection work that blocked the other two requests. This does not happen for subsequent requests, because they all reuse this connection and no reconnection is needed. There are three stages within the connection:

    • resolving DNS, i.e. turning our endpoint's name into an IP address;

    • TCP connection to the server: send a connection request, get permission, send another confirmation, and finally establish the connection. What exactly happens here isn't very important – what matters is that it blocks the request;

    • TLS – setting up encryption for the connection.

  • after a successful connection, both blocked requests continue. The gray area is waiting for a response. I made requests from Argentina to Russia, so waiting took up a significant part of every request;

  • after the gray stripe there is a short green segment – receiving the data. Its width depends on the amount of data in the response. For the first application-configuration request the response is tiny, while the menu response is much wider, although it is only 500 KB;

  • some requests run in parallel (and this is good), while others are sequential, which is why the application takes longer to launch and users wait longer for it to open;

  • green marks successful requests, orange the failed ones. Processing the latter produced 400 and 401 errors.

Meanwhile, in the bottom panel you can select a request and view its request headers, response headers, and the response body.

Alas, the exact size of the received response is not shown, but the headers reveal Content-Encoding: br (meaning Brotli), so the JSON was compressed – we received not 500 KB but 45 KB, and URLSession unpacked it automatically. Exact numbers can be obtained via URLSessionTaskTransactionMetrics.
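If you want those exact numbers programmatically, a task delegate can log them. A minimal sketch, assuming iOS 13+; the `MetricsLogger` name is mine, not from our codebase:

```swift
import Foundation

// Logs how many bytes came over the wire versus the size after decompression.
final class MetricsLogger: NSObject, URLSessionTaskDelegate {
    func urlSession(_ session: URLSession,
                    task: URLSessionTask,
                    didFinishCollecting metrics: URLSessionTaskMetrics) {
        for transaction in metrics.transactionMetrics {
            let wire = transaction.countOfResponseBodyBytesReceived
            let decoded = transaction.countOfResponseBodyBytesAfterDecoding
            print("\(task.originalRequest?.url?.absoluteString ?? "?"): "
                  + "\(wire) bytes on the wire, \(decoded) after decoding")
        }
    }
}
```

To use it, pass the delegate when creating the session: `URLSession(configuration: .default, delegate: MetricsLogger(), delegateQueue: nil)`.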

There is room for improvement in this example, but let's start with examples that deviate greatly from the normal picture.

Connection

One of the most unexpected and interesting stages turned out to be connecting to the server. There were many variants and problems; let's look at each separately.

URLSession reuse

When we ran the Kebster! analysis, we saw a completely different picture: many individual URLSessions, each containing only one connection, and each request taking 500 ms. Remember: all requests go from Argentina to Russia.

If we expand the rows and switch the display mode from Task to HTTP Transactions by Connection, we can see a more accurate picture:

It turns out that every request establishes its own connection and spends 80% of its time doing so! These blocks are highlighted in blue and purple:

The reason is simple: due to a mistake, a separate URLSession was created for each request. The requests did not share a connection, which is why the connection was established over and over.

The fix is to use only one session – for example, the standard URLSession.shared. Now the picture looks normal: the first two requests take 500 ms as before, but the subsequent ones are faster – 130 ms each!

The same thing happened with images – each request created its own session, which is why the connection took most of the time and images did not load as quickly as they could. We created a single dedicated session for images so it could be reused between requests.

As a result, when we reused the sessions, we got two clean tracks: one for API requests and one for images. They connect to different hosts, so they have different connections.

At the same time, it is normal to use separate sessions for different hosts. For example, analytics, logs, the API, and CDN access for images can live in different sessions with different configurations. If you route them all through one URLSession, it will still create its own connection for each host, so there is no benefit; but with separate URLSessions you can give them different names and different configurations. This is useful if you download large files in the background.
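A sketch of how such a split might look – the hosts, cache sizes, and the background identifier below are invented for the example:

```swift
import Foundation

// One session per backend, each with its own configuration.
enum Sessions {
    // API: small JSON responses, fail fast.
    static let api: URLSession = {
        let config = URLSessionConfiguration.default
        config.timeoutIntervalForRequest = 15
        return URLSession(configuration: config)
    }()

    // Images from a CDN: more parallel downloads and a larger cache.
    static let images: URLSession = {
        let config = URLSessionConfiguration.default
        config.httpMaximumConnectionsPerHost = 6
        config.urlCache = URLCache(memoryCapacity: 50_000_000,
                                   diskCapacity: 200_000_000)
        return URLSession(configuration: config)
    }()

    // Background session for large files; survives app suspension.
    static let downloads = URLSession(
        configuration: .background(withIdentifier: "com.example.downloads")
    )
}
```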

By reusing URLSession, we sped up both application launch and image loading twofold. A good result for changing three lines of code.

Sudden HTTP 1.1

We recently changed our anti-bot service, and the launch of the Dodo Pizza application slowed down. Running the launch through Network Instrument, we saw that there was only one URLSession, but connections were still created separately. Because of this, each batch of simultaneous requests establishes new connections. Later requests reuse them, but the number of connections equals the number of simultaneous requests.

The clue is the ① in the upper left corner of the request. It means the connection was established via HTTP/1.1, which does not share connections between parallel requests, though it does reuse them for later ones thanks to the Keep-Alive header.

The HTTP protocol version can also be viewed in the bottom panel:

The HTTP protocol version is controlled by the backend, because all current iOS versions automatically support modern protocol versions. To get HTTP/2 back, we had to change the anti-bot provider.

Check the connection end to end: different proxies can break the protocol version. For example, Variti does not support HTTP/2, but Servicepipe does.
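The negotiated protocol can also be checked from code via URLSessionTaskTransactionMetrics.networkProtocolName – handy for monitoring that no proxy silently downgraded you. A sketch (the class name is mine):

```swift
import Foundation

// Logs which HTTP version each transaction negotiated; "http/1.1" instead
// of "h2" or "h3" means something in the chain downgraded the protocol.
final class ProtocolChecker: NSObject, URLSessionTaskDelegate {
    func urlSession(_ session: URLSession,
                    task: URLSessionTask,
                    didFinishCollecting metrics: URLSessionTaskMetrics) {
        for transaction in metrics.transactionMetrics {
            let host = transaction.request.url?.host ?? "?"
            print("\(host): \(transaction.networkProtocolName ?? "unknown")")
        }
    }
}
```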

Pre-connection

Typically, an application talks to several different hosts: one for the API, a second for images, a third for payment, etc. Even with HTTP/2, the first request to each of them will take much longer. But this time can be reduced if you establish the connection in advance!

Literally: send an empty request to the root of your host with method CONNECT, HEAD, or OPTIONS so that the connection is established before the first real requests. The hosts can be hardcoded in the application in advance – nothing terrible happens even if they become outdated and you make an extra request. The connection can be warmed up literally from the first line of the application: while you configure the first dependencies, read data from the database, and draw the first interface, 100-300 ms easily pass, during which the connection completes, and the first "real" requests go faster.

public func preheatConnections(endpoints: [URL]) {
  for endpoint in endpoints {
    Task(priority: .userInitiated) {
      try? await preheatConnection(to: endpoint)
    }
  }
}

private func preheatConnection(to endpoint: URL) async throws {
  var request = URLRequest(url: endpoint)
  request.httpMethod = "HEAD"

  let session = URLSession.shared
  _ = try await session.data(for: request)

  // No need to process the result –
  // the connection state inside URLSession is enough
}
Early connection completed before the first real requests occurred

In the screenshot, the connection request started 400 ms early, which allowed all subsequent requests to skip the connection stage. Sometimes the connection takes a bit longer and overlaps part of a request – this is normal.

Oddly enough, a successful connection is not always displayed in Instruments.

However, in the bottom panel you can see how long the request took and when it started by selecting the URLSession Tasks display type.

The beginning of the connection is not visible, but it partially takes up space in other requests

This way you can speed up the first requests in the application – the first image or video load, or any other service that talks to its own backend. For example, you can speed up the first payment request if a separate host handles payments.

Connecting to all hosts can be done as the very first line of your application – while it is starting up, the connections are established, and the first requests go through faster.

Static IP

The connection consists of three stages: DNS, TCP, and TLS. You can save on DNS by specifying the host's IP address instead of its domain name. It's not certain this will suit you in 2024, but you can try reusing the IP from the last session for the first requests to gain another 50-100 ms. It's very easy to break everything this way, so this is advice for the bravest. We, for example, didn't do it.

Connecting directly via IP eliminates the DNS step

Of course, you can experiment with a static IP from the last session. But make sure you have a fallback in case the connection fails – for example, the load balancer may have moved your client to another IP.
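The fallback logic might be sketched like this. This is only an illustration: the host, URL paths, and defaults keys are invented, and a real implementation would also have to deal with TLS certificate validation, since the certificate is issued for the host name, not the IP.

```swift
import Foundation

// Sketch: try the IP remembered from the previous session to skip DNS,
// falling back to the domain name if that fails.
func fetchConfig() async throws -> Data {
    let host = "api.example.com"  // invented host
    if let ip = UserDefaults.standard.string(forKey: "lastKnownIP"),
       let ipURL = URL(string: "https://\(ip)/config") {
        var request = URLRequest(url: ipURL)
        // The server still needs the real host name for routing.
        // Note: TLS validation against an IP URL needs extra handling.
        request.setValue(host, forHTTPHeaderField: "Host")
        if let (data, response) = try? await URLSession.shared.data(for: request),
           (response as? HTTPURLResponse)?.statusCode == 200 {
            return data
        }
    }
    // Fallback: a normal request through DNS,
    // e.g. if the balancer moved the client to another IP.
    let url = URL(string: "https://\(host)/config")!
    let (data, _) = try await URLSession.shared.data(from: url)
    return data
}
```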

Analyzing the launch of the Dodo Pizza application

We have analyzed in detail what a request consists of and what can go wrong. Now let's look at how the Dodo Pizza application launches and what else Network Instrument shows when we analyze the entire launch picture.

The graph shows several interesting places:

  1. Requests do not start right away; there is a pause while the application launches, dependencies are built, and the first interface is drawn. During this time you can pre-connect, bypassing the rest of the launch.

  2. After the pause come 3 parallel requests. They are blocked by the connection, but we already know we can try to connect even earlier.

  3. In the middle is the request for pizzerias – it has a large green area, meaning a lot of data arrives. This is expected: I requested the menu in Moscow, and there are many pizzerias there. But there are two more observations:

    • for some reason this request blocks the others, so the cart and menu requests happen much later – this delays application launch;

    • after this request there is a noticeable hole with not a single request at all! Time Profiler will help find this problem later.

  4. The third block of requests obtains cart information. The menu doesn't need the cart itself, only the order type it carries – the next menu request depends on it, because the menu differs for restaurant and delivery. If you save the order type on the phone from the last session, this stage can run in parallel with the menu request. Together with fixing the pizzeria request, this speeds up launch by 30%!

  5. The last big request retrieves the menu. It consists of three stages:

    • The blue area is a redirect: I requested the menu in English, and the backend replied that I needed to go to another endpoint. Logical, but it takes time.

    • The second gray area: on redirect, the request performs a handshake with the host again. Unfortunately, redirects also cost time.

    • The green area is receiving the data. The menu is large, and the data arrives in several packets – we need to make sure we receive it compressed. Unfortunately, Network Instrument doesn't show this, but you can check via URLSessionTaskTransactionMetrics.

A couple of launches under Network Instrument revealed a dozen problems at application startup.

Hole between requests due to long parsing

The pizzeria request turned out to be blocking by accident: we forgot to move it into a separate Task. But the hole after it was more interesting and consisted of two parts:

  1. Parsing pizzerias. There is a lot of pizzeria data, so parsing takes a long time. We added Time Profiler to analyze this spot, and it pointed at the problem: when parsing each pizzeria, we computed its working hours, and for that we took the opening and closing times for 7 days a week for all 300 pizzerias and created a DateFormatter right in the loop. DateFormatter initialization is very slow, so we moved its creation out of the loop – parsing became noticeably faster. Before the fix, the request blocked execution, and parsing this data delayed loading the menu.

  2. Interface creation on the main thread, about 60 ms. During this time the cart and menu requests could already have started. To improve this, we stopped waiting for the screen to draw and requested data immediately – the screen only has to subscribe to menu updates.
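The DateFormatter fix from point 1 can be sketched like this – the types and field names are simplified for the example:

```swift
import Foundation

struct OpeningHours { let opensAt: String; let closesAt: String }

// Before the fix, a DateFormatter was created inside the loop – expensive.
// After: create it once and reuse it for every pizzeria and every day.
func parseSchedules(_ rawHours: [(open: String, close: String)]) -> [OpeningHours] {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.dateFormat = "HH:mm"

    return rawHours.compactMap { raw in
        // Validate both times with the shared formatter.
        guard formatter.date(from: raw.open) != nil,
              formatter.date(from: raw.close) != nil else { return nil }
        return OpeningHours(opensAt: raw.open, closesAt: raw.close)
    }
}
```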

Request data before loading the interface – you can save 50-100 ms before the first request.

Simultaneous downloads

You may already be asking: how many requests can run at the same time? As many as you like – you are limited only by the protocol:

  • HTTP/1.1 supports about 6-8 connections;

  • HTTP/2 has no hard limit and can usually hold about 100 concurrent streams, depending on server settings. Most of the time URLSession is simply waiting for a response from the network, so parallelism gives you all its benefits.

On the other hand, you may hit the problem of receiving large amounts of data at the same time. In the screenshots these show up as large green areas. Here you are limited by the bandwidth of the user or the server, and the dependence is more complex because the data arrives in several packets.

Here I can only share an example: once, to migrate from one menu contract to another, we started loading two menus at once – and the loading speed of the old menu immediately dropped by 50%.

The time to obtain the first version of the menu increased noticeably after enabling parallel loading of the second menu

Sometimes the second menu arrived only after the first one had fully loaded. Such maneuvers are not free. If you download several large files at once (for example, images, videos, or 3D models), it is worth experimenting with the number of parallel downloads – a smaller number may give greater speed.
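One way to experiment with this is a sliding-window download, where only a fixed number of files load at once. A sketch using a throwing task group and the iOS 15+ async URLSession API; the window size is something to tune, not a recommendation:

```swift
import Foundation

// Download a list of large files with at most `maxConcurrent` running
// at once, instead of firing them all simultaneously.
func download(urls: [URL], maxConcurrent: Int = 2) async throws -> [Data] {
    var results: [Data] = []
    try await withThrowingTaskGroup(of: Data.self) { group in
        var iterator = urls.makeIterator()
        // Seed the group with the first `maxConcurrent` downloads.
        for _ in 0..<maxConcurrent {
            guard let url = iterator.next() else { break }
            group.addTask { try await URLSession.shared.data(from: url).0 }
        }
        // Each time one finishes, start the next – a simple sliding window.
        while let data = try await group.next() {
            results.append(data)
            if let url = iterator.next() {
                group.addTask { try await URLSession.shared.data(from: url).0 }
            }
        }
    }
    return results
}
```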

Large parallel downloads can delay each other, but the gray waiting areas do not slow anything down – they are simply waiting for a response.

A redirect that was not included in the connection

Let's look at a very rare case, just to see "how it happens." We have already discussed that the menu redirect is not free – the data for the repeated request does not arrive immediately, and there is a double wait:

It's even worse if HTTP/2 doesn't work. For example, at launch only 3 requests run in parallel, but when opening the menu, 4-5 requests can fire simultaneously. In this case, the first menu request may create an additional connection, receive a redirect, and after the redirect end up on the old connection. It turns out we waited for a new connection only to be redirected back to the old one!

On HTTP/1.1, connection problems can arise at the most varied moments, and you will only see them in the profiler 🙂

Request analysis via Firebase Performance

Measuring the network only on your own phone is not enough – people's real-world experience varies greatly. For example, the first Drinkit coffee shop opened in a business center with very poor Internet. This already affected guests' experience, and the Drinkit app even plays videos directly in the menu!

We built monitoring through Firebase Performance like this:

  • all network requests are already analyzed through Firebase;

  • we additionally mark up large client scenarios, for example:

    • first launch of the application before the map is shown;

    • restarting the application until the menu is shown;

    • time to create an order and start payment.

  • we gradually refine the tracing so that every span becomes clear: we divide it into large stages and clarify which processes take up the time between requests;

  • if something goes wrong, we can go into a specific session and see what happened there.
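The scenario markup can be sketched with Firebase Performance custom traces; the class, trace, and metric names below are examples, not our real ones:

```swift
import FirebasePerformance

// Marks up one launch scenario as a custom trace.
final class LaunchTracing {
    private var trace: Trace?

    func appDidLaunch() {
        // Starts timing the "cold start until the menu is shown" scenario.
        trace = Performance.startTrace(name: "cold_start_to_menu")
    }

    func menuRequestFinished(bytes: Int) {
        // Attach a numeric metric to the trace, e.g. response size.
        trace?.setValue(Int64(bytes), forMetric: "menu_bytes")
    }

    func menuShown() {
        // Stops the trace; Firebase uploads it with the session data.
        trace?.stop()
        trace = nil
    }
}
```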

The details are hidden behind the View all sessions button on the right:

What can we learn here about loading two menus? How many times they were downloaded from the network, how many different models were converted, and where there are unexplained waits – for example, a mysterious hole toward the end. As a result, we can trace from the overall user metric down to individual tasks in the code.

Alas, the graph had to be simplified for the article, because Firebase puts all application traces on one graph – it's not easy to look only at the "how the menu loads" picture. But it's possible.

Start collecting user metrics, and then drill down into areas that are unclear.

How to analyze on Android

On Android, you can collect similar graphs through the mitmproxy utility: launch the app while recording and export the capture in HAR format. The file can be opened in the PerfCascade visualizer. The graph shows the different stages of requests but, unfortunately, does not group them by connection.

We found that launch speed cannot be compared directly between platforms: even at the 50th percentile, the Android application launches 2-3 times slower despite an identical request pattern. This is because Android phones are, on average, much weaker than iPhones in processor power, memory speed, and the phone's network modules.

It is worth comparing phones of similar power – for example, the iPhone 13, 14, and 15 against Samsung Galaxy and Google Pixel models released over the past couple of years.

In that comparison the launch times matched, but on the iPhone it was the 50th percentile, while on Android it barely reached the 10th-15th.

Bottom line

Using Network Instrument, we have diagnosed and solved many problems:

  1. Now we use one URLSession per host to reuse connections.

  2. We added warming up the connection as the first line in the application.

  3. We discovered broken HTTP/2 in time and fixed it at the infrastructure level: replacing the anti-bot reduced application launch time by 0.5 seconds.

  4. The most work went into changing the order of requests:

    • made loading the menu almost the first request – previously only the feature toggles were requested first;

    • we started caching feature toggles between launches and updating them asynchronously, so on a relaunch the menu is requested even earlier;

    • removed redirects on the network layer because they only slowed down the response;

    • we separated the requests that can be called after rendering the first interface – this is how we parallelized the long pizzeria update; now it does not affect interface rendering at all;

    • we checked that all large JSONs are compressed, and for the new menu we used JSON Reference, which let us transfer less data and avoid duplicating objects in RAM;

    • we divided the launch scenarios into first and repeated launches in order to collect separate statistics on them in Firebase.

We started with a picture containing a dozen problems, but in the end we made the application twice as fast and arrived at a scheme where the user waits for only one request – the menu – before seeing it. In fact, the interface is shown in the middle of the bottom graph, and the large green area arrives in the saved time.

In the bottom graph, we get more work done in less time. In the second launch scenario, requests are executed twice as fast as in the original, because the menu does not wait for feature toggles to load

What else can be improved?

  1. We will remove the loading of two menu versions so they do not clog the channel – planned before the end of 2024. This will also save parsing time.

  2. Cache the menu. It updates more often than people open our application, but the first screen changes little, so we can show data from the cache and update the menu in the background. In the worst case it will blink a little, but we will show it much earlier; metrics on "blinks" can be collected and controlled separately.

  3. Request the second batch of requests earlier, without waiting for the menu to be shown. These requests update various background data, so they are invisible to the user and will have no visible effect.

  4. Add a change marker for large requests – an ETag with handling of HTTP code 304 – for example, for the pizzeria list. We can avoid downloading it when nothing changed, because the pizzeria schedule updates quite rarely.
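The ETag flow might look like the sketch below. The defaults keys are invented, and note that URLSession's own URLCache can also handle revalidation transparently when the server sends proper cache headers:

```swift
import Foundation

// Loads the pizzeria list, using an ETag saved from the previous
// launch to skip the download when nothing has changed.
func loadPizzerias(from url: URL, cache: UserDefaults = .standard) async throws -> Data {
    var request = URLRequest(url: url)
    if let etag = cache.string(forKey: "pizzeriasETag") {
        // Ask the server to reply 304 Not Modified if the data is unchanged.
        request.setValue(etag, forHTTPHeaderField: "If-None-Match")
    }

    let (data, response) = try await URLSession.shared.data(for: request)
    guard let http = response as? HTTPURLResponse else { return data }

    if http.statusCode == 304, let cached = cache.data(forKey: "pizzeriasBody") {
        return cached  // nothing changed – reuse the stored copy
    }

    // Fresh data: remember the body and the new ETag for the next launch.
    cache.set(data, forKey: "pizzeriasBody")
    if let etag = http.value(forHTTPHeaderField: "ETag") {
        cache.set(etag, forKey: "pizzeriasETag")
    }
    return data
}
```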

In this article I tried to cover problems at application startup, but there is still a whole layer of work on optimizing image loading, both in catalog lists and between screens. Like the article if the topic is interesting, and subscribe to the Dodo Mobile channel on Telegram so as not to miss the next one.
