Today, ladies and gentlemen, we are not talking about basic email authentication. The (web hosting) dream of the 90s is still alive at Fraudmarc. Enjoy the Portlandia sketch (you may need a VPN), and then we'll dive into our origins as a server company.
Moving bits across the network has been our business for decades. Fraudmarc grew out of a hosting company that ran geographically redundant, high-uptime servers. That background gives us an edge in operating a global, around-the-clock infrastructure that processes critical messaging data.
This site runs on WordPress, and until recently we relied on the best and most expensive WP hosting provider. But our wp-admin interface had been plagued by poor performance for several months, despite the best efforts of their friendly tech support team. If you asked me, I would guess that they fell into the classic trap of overselling their infrastructure to drive higher margins in the increasingly competitive WordPress hosting space. I understand why, but no thanks.
Can’t slow down, can’t get distracted
Hundreds of businesses delegate management of their email policy to Fraudmarc so they can focus on what only they can do. They concentrate on maximizing earnings and leave email to us.
Likewise, Fraudmarc is very good at email authentication (SPF, DKIM and DMARC), so we want to stay as focused as possible on providing that service to our clients. We cannot afford to be distracted by managing our own web hosting servers. We work with many other web frameworks, so the temptation is always there, and it grew when our old WP hosting provider started to lose ground. This is where SpinupWP comes in: a new WP hosting company that lets us supply the underlying servers while they do all the dirty work with WP. I am happy to promote other startups that are making the web better, so there is a $50 promo at the bottom of this article to encourage you to try their service.
We currently love ARM processors and use them for much of our API, DB and DNS infrastructure at Fraudmarc, but you have to be careful, because they are sometimes much slower than good old proven x86 processors.
WordPress is the only place where Fraudmarc uses PHP, so I went looking for existing PHP benchmarks on arm64. The AWS post "PHP on Arm64" was a good start, but it missed some important points. It was written before PHP 8 was released, and it inadvertently (?) skipped the single-threaded tests that show arm64 to be 50% slower than x86. Even so, the article you are reading is served by PHP on ARM, and I'll explain why.
We will benchmark with two official PHP scripts: bench.php and micro_bench.php. Testing is done on Ubuntu 20.04 on an x86_64 c5.large instance and an arm64 c6g.large instance. Each instance has two virtual cores (vCPUs) and 4 GB of memory. PHP 7 is installed from the default Ubuntu repository and PHP 8 from the ondrej/php PPA. Here are the specific versions we used:
PHP 7.4.3 (cli) (built: Oct 6 2020 15:47:56) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
with Zend OPcache v7.4.3, Copyright (c), by Zend Technologies
PHP 8.0.3 (cli) (built: Mar 5 2021 07:54:13) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.3, Copyright (c) Zend Technologies
with Zend OPcache v8.0.3, Copyright (c), by Zend Technologies
Like SPF compression, this WP site relies heavily on caching at hundreds of network edge locations. Requests are therefore served very quickly from locations close to visitors (or to receiving mail servers, in the case of SPF compression) and keep working even if the origin server goes down. The servers we are testing have two vCPUs, which is enough for our needs because the vast majority of requests are handled by edge nodes without running any PHP on the origin server.
We will test the servers with 1 to 4 parallel benchmarks, which corresponds to a requested load of 50% to 200% of the hardware. Obviously we cannot use more than 100% of the hardware (two vCPUs), but because internet traffic is bursty, at peak there is more work in the queue than the server can handle. The AWS post mentioned above ran benchmarks only at exactly 100% requested load, which rarely happens in real life. I find it more interesting to look at the underloaded and overloaded cases, where the number of tasks does not line up neatly with the number of virtual cores. In particular, I want to see the multitasking penalty when the OS is juggling more tasks than it has virtual cores.
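The procedure above can be sketched as a small harness. This is a minimal illustration, not the script used for the measurements in this article: the function names (`timed_run`, `run_parallel`, `requested_load_percent`, `sweep`) and the `repeats` default are assumptions made for the example.

```python
import statistics
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_parallel(cmd, copies):
    """Start `copies` instances of cmd at the same time; return each one's time."""
    # One thread per copy, so every process is launched and timed independently.
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(timed_run, [cmd] * copies))


def requested_load_percent(copies, vcpus=2):
    """Requested load relative to the hardware: 1 copy on 2 vCPUs is 50%."""
    return 100 * copies / vcpus


def sweep(cmd, max_copies=4, repeats=3):
    """Average execution time for 1..max_copies simultaneous benchmarks."""
    results = {}
    for copies in range(1, max_copies + 1):
        times = []
        for _ in range(repeats):  # hypothetical repeat count, not from the article
            times.extend(run_parallel(cmd, copies))
        results[copies] = statistics.mean(times)
    return results
```

Calling `sweep(["php", "bench.php"])` from a directory containing the script would return a mapping from the number of simultaneous benchmarks (1 to 4) to the mean time per run. On a two-vCPU instance, `requested_load_percent` maps those counts to 50%, 100%, 150% and 200%.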
Average execution time (in seconds) over multiple runs. 2x, 3x and 4x represent the number of benchmarks running simultaneously.
The green bar shows what AWS did not: single-threaded PHP is 50% slower on arm64 “Graviton2” instances. Yes, really. But that is not the whole story.
The white bar shows what happened when we ran two benchmarks at the same time. With both vCPUs fully loaded, the arm instance was 20% faster. Arm stayed in the lead as the number of parallel benchmarks increased.
micro_bench.php told a similar story:
Let’s visualize these results with the same bar colors.
I believe what we are seeing here is the effect of simultaneous multithreading (SMT) when the processor cores are fully loaded. Desktop vendors and server companies tend to equate threads with cores. A typical cloud virtual core (vCPU) is just a thread, not a whole physical core. Picture two threads sharing one processor core: that is roughly how x86 instances work. Arm servers are different. The c6g runs on the AWS Graviton2 arm64 processor, which has no SMT, so each virtual core is a real CPU core.
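One way to check how many physical cores sit behind a set of vCPUs is to read the `CPU(s)` and `Thread(s) per core` fields that `lscpu` prints on Linux. The sketch below parses that kind of output. The two sample strings are hand-written illustrations of an SMT and a non-SMT two-vCPU machine, not output captured from the test instances, and the function names are made up for this example.

```python
def parse_lscpu(text):
    """Turn lscpu-style 'Key: value' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no 'Key: value' shape
            fields[key.strip()] = value.strip()
    return fields


def physical_cores(lscpu_text):
    """Physical cores = logical CPUs divided by hardware threads per core."""
    fields = parse_lscpu(lscpu_text)
    vcpus = int(fields["CPU(s)"])
    threads_per_core = int(fields["Thread(s) per core"])
    return vcpus // threads_per_core


# Illustrative samples only (assumed shape of lscpu output, not real captures).
X86_SMT_SAMPLE = "CPU(s): 2\nThread(s) per core: 2\nCore(s) per socket: 1\n"
ARM_NO_SMT_SAMPLE = "CPU(s): 2\nThread(s) per core: 1\nCore(s) per socket: 2\n"

print(physical_cores(X86_SMT_SAMPLE))     # 1
print(physical_cores(ARM_NO_SMT_SAMPLE))  # 2
```

On a real machine, the text could come from `subprocess.run(["lscpu"], capture_output=True, text=True).stdout`.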
PHP on arm64 was 50% slower than PHP on x86 when comparing physical cores. I expect this to improve over time as AWS and other developers continue to release arm-related PHP patches. PHP on arm is already fast enough to be worth considering. Comparing virtual cores, arm64 was faster and significantly cheaper than x86. AWS charges about 20% less per vCPU on arm64 than on x86. Their arm64 virtual core is a complete physical core, whereas x86 virtual cores are just threads sharing a single physical processor core. Let me show you this visually:
x86 (green arrow) was faster than arm (red arrow) when comparing physical cores. Each of our test instances had two virtual cores, which means only one physical core on x86 but two on arm. At the lowest level of parallelism, a single x86 core runs PHP fastest.
Look to the right of the red arrow and you will see where arm really pulls ahead. Because arm vCPUs are full CPU cores, an instance with two arm vCPUs ran two simultaneous benchmarks as fast as it ran one.
arm64 is faster when comparing virtual cores
Remember what happened when we ran two or more tasks on these two-vCPU instances?
The green arrows show how far the real processor cores on arm outperform the shared (multithreaded) cores on x86.
In one sentence: this site now runs on arm
On the notoriously expensive AWS, arm vCPUs were 20% cheaper than x86 vCPUs, and that is before accounting for the extra x86 vCPUs you would have to provision to match the physical cores of an equivalent arm-based instance. There is something very natural about decoupling server selection from WP management, especially after months of poor performance from an expensive WP platform. The excellent Query Monitor plugin shows how fast our admin pages now load.
Our origin server now runs WordPress (PHP) on an arm processor of our choosing, yet it is managed by real WP experts, so we can stay focused on email innovations like universal SPF, a next-generation solution that overcomes the DNS lookup limit and other quirks of SPF records. By adding a generic SPF string to the beginning of any SPF record, you can easily and almost instantly speed up email delivery.
If you are part of the ~40% of the web that is powered by WordPress, you should consider moving to arm, because it is not often that you can improve performance while lowering costs. We haven’t found any WP hosting providers offering arm yet, but it is only a matter of time. Whether they will pass the savings on to end users is anyone’s guess. It looks like we were the first SpinupWP customer to use arm64 instead of x86_64, but Ubuntu has excellent arm support, so our only problem was that backups didn’t work at first. Hopefully someone at SpinupWP sees this and updates their install script to install an architecture-appropriate version of rclone.
And here is the referral link for $50 toward a SpinupWP account, in case you would like to support another small startup that does important things.
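The cost argument can be made concrete with a little arithmetic. The sketch below uses normalized prices (x86 vCPU = 1.00, arm vCPU = 0.80, reflecting the roughly 20% discount mentioned in this article), not actual AWS rates, and assumes two threads per physical core on x86 and one on arm. The function name is made up for this example.

```python
def price_per_physical_core(price_per_vcpu, threads_per_core):
    """A physical core costs as much as all the vCPUs (threads) it is sold as."""
    return price_per_vcpu * threads_per_core


# Normalized, illustrative prices: arm is ~20% cheaper per vCPU.
x86_core = price_per_physical_core(1.00, threads_per_core=2)  # SMT: 2 vCPUs per core
arm_core = price_per_physical_core(0.80, threads_per_core=1)  # no SMT: 1 vCPU per core

ratio = arm_core / x86_core
print(f"x86: {x86_core:.2f} per physical core")
print(f"arm: {arm_core:.2f} per physical core")
print(f"arm costs {ratio:.0%} of x86 per physical core")
```

Under these assumptions a physical arm core costs about 40% of a physical x86 core, which helps explain why arm can win on price even though each arm core ran single-threaded PHP more slowly.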
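An architecture-aware install step might look something like the sketch below. It is not SpinupWP's actual install script. It assumes the download URL pattern published on the rclone downloads site (`rclone-current-linux-<arch>.zip`), and the `RCLONE_ARCH` table and function name are hypothetical.

```python
import platform

# Map machine names as reported by the OS to the architecture suffix
# assumed to be used in rclone's Linux download file names.
RCLONE_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def rclone_download_url(machine=None):
    """Build the rclone download URL for this machine's CPU architecture."""
    machine = (machine or platform.machine()).lower()
    try:
        arch = RCLONE_ARCH[machine]
    except KeyError:
        raise ValueError(f"unsupported architecture: {machine}") from None
    # Assumed URL pattern; verify against the rclone downloads page before use.
    return f"https://downloads.rclone.org/rclone-current-linux-{arch}.zip"
```

Called with no argument, the function uses `platform.machine()`, which typically reports `aarch64` on an arm64 Ubuntu server and `x86_64` on a c5-style instance.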