Azure Active Directory Gateway is now on .NET Core 3.1


Azure Active Directory Gateway is a reverse proxy that fronts the hundreds of services that make up Azure Active Directory (Azure AD). If you have used services such as office.com, outlook.com, azure.com, or xbox.live.com, you have used the Azure AD gateway. The gateway is present in more than 53 Azure datacenters worldwide and serves ~115 billion requests daily. Until recently, Azure AD Gateway ran on .NET Framework 4.6.2. It has been running on .NET Core 3.1 since September 2020.

Motivation to migrate to .NET Core

At the gateway's scale, compute consumption is significant, and that costs money. Finding ways to reduce the cost of running the service was a key goal for the team behind it. The buzz around .NET Core performance caught our attention, especially since TechEmpower listed ASP.NET Core as one of the fastest web frameworks on the planet. We ran our own benchmarks of gateway prototypes on .NET Core, and the results made the decision very simple: we had to migrate our service to .NET Core.

Does .NET Core Provide Real Cost Savings?

It certainly does. With Azure AD Gateway, we were able to cut our CPU costs by 50%.

Previously, the gateway ran on IIS with .NET Framework 4.6.2. Today it runs on IIS with .NET Core 3.1. The image below shows that CPU usage was cut in half on .NET Core 3.1 compared to .NET Framework 4.6.2 (effectively doubling our throughput).

As a result of the increased throughput, we were able to reduce our fleet size from ~40K cores to ~20K cores (a 50% reduction).

How did the transition to .NET Core come about?

It took place in 3 stages.

Stage 1: Select an Edge Server

When we started working on the transition, the first question we had to ask ourselves was: which of the three servers available in .NET Core should we choose?

We ran our scenarios on all three servers and realized it all comes down to TLS support. Given that the gateway is a reverse proxy, supporting a wide variety of TLS scenarios is very important.

Kestrel:

When we started our transition (November 2019), Kestrel did not support per-hostname client certificate negotiation or revocation. Support for these features was added in .NET 5.0.

As of .NET 5.0, Kestrel (because it relies on SslStream) does not support per-hostname CTL stores. Support is expected in .NET 6.0.

HTTP.sys:

With the HTTP.sys server, we encountered a mismatch between the TLS configuration at the HTTP.sys layer and the .NET implementation: even when the binding is configured not to negotiate client certificates, accessing the ClientCertificate property in .NET Core triggers an unwanted TLS renegotiation.

For example, a simple null check in C# causes the TLS handshake to be renegotiated:

if (HttpContext.Connection.ClientCertificate != null)

This was addressed in ASP.NET Core 3.1; see https://github.com/dotnet/aspnetcore/issues/14806. When we started the transition in November 2019, we were on ASP.NET Core 2.2 and therefore did not select this server.

IIS:

IIS met all our TLS requirements, so we chose this server.

Stage 2: Migrate the Application and Dependencies

Like many large services and applications, the Azure AD gateway has many dependencies. Some were written specifically for this service, and some were written by others inside and outside Microsoft. In some cases, these libraries already targeted .NET Standard 2.0. In other cases, we updated them to support .NET Standard 2.0 or found alternative implementations. For example, we removed our deprecated dependency injection library and used the built-in .NET Core support for dependency injection instead. The .NET Portability Analyzer was a great help at this stage.

For the application itself:

  • The Azure AD gateway had a dependency on IHttpModule and IHttpHandler from classic ASP.NET that ASP.NET Core doesn’t have. Therefore, we redesigned the application using middleware constructs in ASP.NET Core.

  • Azure Profiler (a service that collects performance traces on Azure VMs) really helped with the migration. We deployed our nightly builds to test sites, used wrk2 as a load agent to exercise scenarios under load, and collected Azure Profiler traces. These traces then told us what to tune next to get peak performance out of our application.
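The move from IHttpModule and IHttpHandler to middleware described above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: the class name RequestTaggingMiddleware, the "gateway.request-id" item key, and the X-Request-Id header are all hypothetical. It only shows where logic that used to hook BeginRequest and EndRequest style events would sit in an ASP.NET Core pipeline.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

// Hypothetical middleware standing in for logic that once lived in an IHttpModule.
public sealed class RequestTaggingMiddleware
{
    private readonly RequestDelegate _next;

    public RequestTaggingMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        // Runs before the rest of the pipeline, roughly where a
        // BeginRequest handler would have run in classic ASP.NET.
        context.Items["gateway.request-id"] = context.TraceIdentifier;

        // Headers can only be changed until the response starts, so the
        // header is added in a callback that fires just before that point.
        context.Response.OnStarting(() =>
        {
            context.Response.Headers["X-Request-Id"] = context.TraceIdentifier;
            return Task.CompletedTask;
        });

        // Hand the request to the next component in the pipeline.
        await _next(context);

        // Code placed here runs after the downstream components finish,
        // roughly where an EndRequest handler would have run.
    }
}

public static class RequestTaggingExtensions
{
    // Convenience method so Startup.Configure can call app.UseRequestTagging().
    public static IApplicationBuilder UseRequestTagging(this IApplicationBuilder app)
        => app.UseMiddleware<RequestTaggingMiddleware>();
}
```

In Startup.Configure, the order in which such Use... calls appear determines the order in which the components see each request, which takes the place of module ordering in the classic ASP.NET configuration.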

Stage 3: Gradual Deployment

The philosophy we followed during the deployment was to detect as many issues as possible with minimal or no impact on operations.

  • We deployed our initial builds to test, integration, and DogFood environments. This led to early detection of bugs and helped us fix them before they reached production.

  • After the code was complete, we deployed the .NET Core build to a single production machine in a scale unit. A scale unit is a load-balanced pool of VMs.

  • The scale unit had ~100 machines: 99 were still running our existing .NET Framework build, and only 1 machine ran the new .NET Core build.

  • All ~100 machines in this scale unit receive the same type and amount of traffic. We then compared the status codes, error counts, functional scenarios, and performance of the single machine with the other 99 machines to detect anomalies.

  • We wanted this single machine to behave functionally like the other 99 machines, but with much higher performance and throughput, which is what we observed.

  • We also “redirected” traffic from live production machines (running the .NET Framework build) to machines running .NET Core to compare and contrast as above.

  • Once we achieved functional equivalence, we began to increase the number of scale units running on .NET Core and gradually expanded them to a whole datacenter.

  • After migrating the entire datacenter, the final step was to gradually roll out globally to all Azure datacenters where the Azure AD Gateway service is present. The migration is complete!
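The comparison of the single .NET Core machine against the 99 .NET Framework machines can be sketched as a check on status-code distributions. This is an illustration only: the CanaryComparison class, its method names, and the tolerance parameter are hypothetical, and the post does not describe the tooling the team actually used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CanaryComparison
{
    // Converts raw per-status-code counts into shares of total traffic,
    // so that 1 machine can be compared with the aggregate of 99 machines.
    public static Dictionary<int, double> Distribution(IReadOnlyDictionary<int, long> counts)
    {
        double total = counts.Values.Sum();
        return counts.ToDictionary(kv => kv.Key, kv => total == 0 ? 0.0 : kv.Value / total);
    }

    // Returns the status codes whose share of traffic differs between the
    // canary and the baseline by more than the given tolerance (hypothetical knob).
    public static List<int> Anomalies(
        IReadOnlyDictionary<int, long> canary,
        IReadOnlyDictionary<int, long> baseline,
        double tolerance)
    {
        var canaryShare = Distribution(canary);
        var baselineShare = Distribution(baseline);

        return canaryShare.Keys.Union(baselineShare.Keys)
            .Where(code => Math.Abs(Share(canaryShare, code) - Share(baselineShare, code)) > tolerance)
            .OrderBy(code => code)
            .ToList();
    }

    // A status code that never occurred on one side counts as a share of zero.
    private static double Share(Dictionary<int, double> shares, int code)
        => shares.TryGetValue(code, out var share) ? share : 0.0;
}
```

For example, if the baseline shows 98% 200s and 1% 500s while the canary shows 90% 200s and 9% 500s, a tolerance of 0.01 flags both 200 and 500. Because all machines in the scale unit receive the same mix of traffic, comparing shares rather than raw counts is enough for this kind of check.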

Lessons learned

  • ASP.NET Core is strict about RFCs. This is a very good feature as it promotes good practice. However, classic ASP.NET and .NET Framework were more forgiving, which causes some backward compatibility issues:

    • HttpClient on .NET Core previously only supported ASCII values in HTTP headers.

    • Forms and cookies that do not comply with RFCs result in validation exceptions. Therefore, we created “fallback” parsers using the classic ASP.NET source code to maintain backward compatibility for clients.

  • We observed a performance degradation in the CopyToAsync() method of FileBufferingReadStream, caused by multiple 1-byte copies of an n-byte stream. This issue was addressed in .NET 5.0 by choosing a default buffer size of 4K: https://github.com/dotnet/aspnetcore/issues/24032.

  • Be aware of classic ASP.NET quirks:

    • Whitespace characters are auto-trimmed:

foo.com/oauth? client = abc is trimmed to foo.com/oauth?client=abc in classic ASP.NET.

Over time, clients and downstream services came to depend on this trimming, and ASP.NET Core does not auto-trim. So we had to trim the whitespace ourselves to mimic the classic ASP.NET behavior.
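A compatibility shim for this trimming could look like the sketch below. It assumes that removing every whitespace character from the raw request target is the desired behavior, which matches the example above but may not match every classic ASP.NET case. The LegacyUrlCompat class and StripWhitespace method are hypothetical names.

```csharp
using System;
using System.Linq;

public static class LegacyUrlCompat
{
    // Removes all literal whitespace characters from a raw request target.
    // Percent-encoded spaces such as %20 are left untouched.
    public static string StripWhitespace(string rawTarget)
    {
        if (string.IsNullOrEmpty(rawTarget))
        {
            return rawTarget;
        }

        return new string(rawTarget.Where(c => !char.IsWhiteSpace(c)).ToArray());
    }
}
```

Calling LegacyUrlCompat.StripWhitespace("foo.com/oauth? client = abc") returns "foo.com/oauth?client=abc".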

    • The Content-Type header is auto-generated when it is missing:

When the response size is greater than zero bytes but the Content-Type header is missing, classic ASP.NET generates a default Content-Type: text/html header. ASP.NET Core does not force-generate a default Content-Type header, so clients that assume the Content-Type header always appears in the response start to experience problems. We mimicked the classic ASP.NET behavior by adding a default Content-Type when downstream services do not provide one.
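The default Content-Type rule can be expressed as a small helper, sketched below under some assumptions. The ContentTypeCompat class is hypothetical, and the headers are modeled as a plain dictionary rather than the ASP.NET Core header collection. In a real pipeline, a check like this would have to run before the response headers are sent, for example from an HttpResponse.OnStarting callback.

```csharp
using System;
using System.Collections.Generic;

public static class ContentTypeCompat
{
    // The default that classic ASP.NET applied, per the description above.
    public const string DefaultContentType = "text/html";

    // Adds a default Content-Type when the response has a body but the
    // downstream service did not set one. The caller is expected to pass a
    // dictionary built with StringComparer.OrdinalIgnoreCase, because HTTP
    // header names are case-insensitive.
    public static void EnsureContentType(IDictionary<string, string> headers, long bodyLength)
    {
        if (bodyLength <= 0)
        {
            return; // Empty responses are left alone.
        }

        if (headers.TryGetValue("Content-Type", out var existing) &&
            !string.IsNullOrWhiteSpace(existing))
        {
            return; // The downstream service already chose a type.
        }

        headers["Content-Type"] = DefaultContentType;
    }
}
```

An existing Content-Type is never overwritten, so services that already set the header see no change in behavior.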

Future

The move to .NET Core doubled the throughput of our service, and it was a great decision. Our journey with .NET Core does not end with the migration. In the future, we are considering the following options:
