PHP's internal structure, schematically and in simple terms (Zend Engine, OPCache, JIT)

Introduction

This post is aimed at inexperienced PHP specialists. This information will not make you a better programmer. Expected benefits:

  • I feel better cognitively and mentally when the “magic” of what I'm working with is reduced. Maybe you do too.

  • Perhaps articles on Habr will scare you off a little less often.

I will explain using 4 examples – each one is only a little more complicated than the previous one.

Example 1: Running a program written in a compiled language

Let's start with the basics. You know that programming languages can be compiled or interpreted. This division is quite loose, and the last example in this post shows just how loose.

PHP is interpreted.

Nevertheless, let us consider in the first example the operation of an application written in a compiled language.

If a language is compiled, a program written in it can be translated ahead of time into machine code, which the computer's processor can execute directly.

Here is a simple diagram of how a Go program works (the binary code is fictitious, for illustration purposes only):

Example 1: Go

First we have the Go “source code” file hello-world.go:

package main

import "fmt"

func main() {
    fmt.Println("hello world")
}

The processor cannot run such code on its own – first we need to convert it into machine code.

Process #0 – Compilation to machine code

It is number zero because we only run the program in the next step. But that step is impossible without this one.

To get the program's machine code, run the Go compiler with the command:

go build hello-world.go

A compiler is a program that translates code from one programming language into another. In practice this is almost always from a higher-level language (usually closer to how humans think) to a lower-level one (closer to what the computer understands).

As a result of compilation, a file named hello-world appears in the same folder. It contains binary code (zeros and ones): this is the machine code.

Machine code already contains the instructions and data in a form meant for a specific processor architecture. For example, the machine code of the same program will differ between x86 and ARM.

Process #1 – Machine Code Execution

The finished machine code can now be executed directly. Enter the command:

./hello-world

So, after a one-time compilation, running our program takes just one action: executing the machine code.

Example 2: Running a PHP script without OPCache and JIT (i.e. running PHP before version 5.5)

Let's get back to PHP. In an interpreted language, the program is not executed straight from ready-made machine code. In PHP's case, what you launch is the source code itself.

This means that on every launch the system has to analyze the source code again and turn it into actions the processor can actually perform (in the end, the processor only ever runs machine code).

Here is a schematic representation of the whole sequence for a PHP script with OPCache and JIT disabled (we will look at each of them separately in the next two examples).

Example 2: PHP (no OpCache & no JIT)

We have the usual PHP “source code” (file hello-world.php):

<?php
echo "Hello world";

Again, let's go in order and look at what happens when we run the command:

php hello-world.php

Process #1 – Compilation to bytecode

First, the source code is processed by the Zend Compiler, which is PHP's compiler. It is the first of the two main components of the Zend Virtual Machine.

Unlike the Go compiler discussed above:

  • The job of the PHP compiler is to transform the source code not into machine code but into an intermediate code: bytecode;

  • the compilation happens every time the program is started (instead of just once before launch, as in the Go example).

You can read more about the PHP compilation process in this post on Habr.

In PHP, this bytecode is called PHP OPCode (opcodes).

Bytecode is lower-level than source code and consists of a set of commands for the interpreter (more on the interpreter in the next section). Bytecode cannot be executed directly by the processor.

To see the result of the compiler's work, the bytecode itself, we run the following command (there is a detailed article about getting PHP bytecode):

php -d opcache.opt_debug_level=0x20000 -d opcache.enable_cli=1 hello-world.php

We get:

$_main:
; (lines=3, args=0, vars=0, tmps=1)
; (after optimizer)
; /hello-world.php:1-2
0000 EXT_STMT
0001 ECHO string("Hello world")
0002 RETURN int(1)

  • First we see $_main: which means that the following lines belong to the function main. The appearance of such a function in the bytecode for PHP's global scope is an interesting historical feature inherited from other languages;

  • The next 3 lines start with ; which marks them as comments. Among other things, they carry debug information;

  • The last 3 lines are the actual code of our application, which will be executed by the virtual machine in the next step.
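To make the idea of "source code in, list of commands out" more tangible, here is a toy sketch in Go (the language from the first example). It is not how the Zend Compiler works: the Op type and the Compile function are invented for illustration, and the only statement this sketch understands is echo with a double-quoted string.

```go
package main

import (
	"fmt"
	"strings"
)

// Op is one toy bytecode instruction: a command name plus an operand.
// The names mimic the dump above, but the structure is hypothetical.
type Op struct {
	Name    string
	Operand string
}

// Compile turns a tiny PHP subset (only `echo "...";`) into toy ops.
func Compile(src string) ([]Op, error) {
	body := strings.TrimPrefix(strings.TrimSpace(src), "<?php")
	var ops []Op
	// Naive split: a ';' inside a string literal would break this toy parser.
	for _, stmt := range strings.Split(body, ";") {
		stmt = strings.TrimSpace(stmt)
		if stmt == "" {
			continue
		}
		if !strings.HasPrefix(stmt, "echo ") {
			return nil, fmt.Errorf("unsupported statement: %q", stmt)
		}
		lit := strings.TrimSpace(strings.TrimPrefix(stmt, "echo "))
		if len(lit) < 2 || lit[0] != '"' || lit[len(lit)-1] != '"' {
			return nil, fmt.Errorf("expected a double-quoted string, got %q", lit)
		}
		ops = append(ops, Op{Name: "ECHO", Operand: lit[1 : len(lit)-1]})
	}
	// As in the dump above, the script ends with an implicit "return 1".
	ops = append(ops, Op{Name: "RETURN", Operand: "1"})
	return ops, nil
}

func main() {
	ops, err := Compile("<?php\necho \"Hello world\";\n")
	if err != nil {
		fmt.Println("compile error:", err)
		return
	}
	for i, op := range ops {
		fmt.Printf("%04d %s %s\n", i, op.Name, op.Operand)
	}
}
```

The point is only the shape of the result: a flat, numbered list of simple commands that something else still has to execute.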

Process #2 – Bytecode Execution

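This is done by the Zend Executor, PHP's interpreter and the second of the two main components of the Zend Virtual Machine.

It goes through the bytecode instruction by instruction and, for each opcode, calls the matching handler: a routine inside the PHP binary that has already been compiled to machine code. Put simply, this is how bytecode ends up as machine code that the processor runs.

Process #3 – Machine Code Execution

The interpreter hands this machine code to the processor for execution piece by piece. We get the output we wanted: Hello world.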
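As a rough illustration of what "executing bytecode" means, here is a toy interpreter loop in Go. It is a sketch under simplifying assumptions, not the Zend Executor: the Op type and the Run function are hypothetical, and only two opcodes exist. The thing to notice is that each opcode is dispatched to a handler that is already part of the compiled program.

```go
package main

import (
	"fmt"
	"strings"
)

// Op is one toy bytecode instruction (hypothetical structure).
type Op struct {
	Name    string
	Operand string
}

// Run walks the ops in order and runs a handler for each opcode.
// It returns everything the program "echoed".
func Run(ops []Op) (string, error) {
	var out strings.Builder
	for pc, op := range ops {
		switch op.Name {
		case "ECHO":
			// Handler for ECHO: append the operand to the output.
			out.WriteString(op.Operand)
		case "RETURN":
			// Handler for RETURN: stop executing this script.
			return out.String(), nil
		default:
			return out.String(), fmt.Errorf("unknown opcode %q at %04d", op.Name, pc)
		}
	}
	return out.String(), nil
}

func main() {
	program := []Op{{"ECHO", "Hello world"}, {"RETURN", "1"}}
	out, err := Run(program)
	if err != nil {
		fmt.Println("runtime error:", err)
		return
	}
	fmt.Println(out)
}
```

A real virtual machine has far more opcodes and a much more elaborate dispatch mechanism, but the loop-and-dispatch idea is the same.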

The big picture

Of course, this approach is slower than the one in the Go example, but it also has its advantages. Neither is discussed in this post.

Already in this example we see that every launch of a PHP script goes through a virtual machine, the Zend Virtual Machine (or Zend Engine). It is responsible for processes #1 and #2 in the scheme (compilation to bytecode and bytecode execution).

Example 3: Running a PHP script with OPCache, but without JIT (i.e. running PHP 5.5 – 7.4)

To improve performance, the PHP developers came up with the following:

To avoid compiling the same source code into bytecode again on every run, the compiled bytecode is stored once in a dedicated cache: OPCache (short for OPCode Cache). Unfortunately, for various reasons not every section of our program's code can be stored in OPCache.

The extension has been bundled with PHP since version 5.5, and it can also be installed separately on earlier versions. Note that for command-line runs it is off by default unless opcache.enable_cli=1 is set.

Example 3: PHP 5.5 (OpCache)

Let's compare it with the previous diagram:

  • 2 additional actions appeared (look in cache and write to cache)

  • A fork appeared immediately after the first action

  • It can be seen that this approach allows us to skip the compilation stage for specific sections of code that were already in the cache.
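The "look in cache, otherwise compile and write to cache" logic can be sketched in Go as follows. This is a toy model, not OPCache's implementation: the OpCache type, its Load method and fakeCompile are all hypothetical, and the sketch ignores things a real cache has to handle, such as noticing that a file changed on disk.

```go
package main

import "fmt"

// Op is one toy bytecode instruction (hypothetical structure).
type Op struct {
	Name    string
	Operand string
}

// OpCache is a toy in-memory cache of compiled bytecode, keyed by script path.
type OpCache struct {
	entries  map[string][]Op
	Compiles int // how many times the expensive compile step actually ran
}

func NewOpCache() *OpCache {
	return &OpCache{entries: make(map[string][]Op)}
}

// Load returns cached bytecode when present; otherwise it compiles the
// script, stores the result in the cache, and returns it.
func (c *OpCache) Load(path string, compile func(path string) []Op) []Op {
	if ops, ok := c.entries[path]; ok {
		return ops // cache hit: the compile step is skipped
	}
	c.Compiles++
	ops := compile(path)
	c.entries[path] = ops
	return ops
}

// fakeCompile stands in for a real compiler and always returns the same ops.
func fakeCompile(path string) []Op {
	return []Op{{"ECHO", "Hello world"}, {"RETURN", "1"}}
}

func main() {
	cache := NewOpCache()
	for i := 0; i < 3; i++ {
		cache.Load("hello-world.php", fakeCompile)
	}
	fmt.Println("runs: 3, compilations:", cache.Compiles)
}
```

In this sketch three runs of the same script trigger the compile step only once, which is the saving the diagram shows.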

Of course, such caching under the hood is not easy to implement – many thanks to the guys.

The extension is described in great detail in an article on Habr.

Example 4: Running a PHP 8 Script

We have finally reached the latest versions of PHP!

If you look closely at the logic of the previous diagram, a question may arise: why not cache sections of the source code directly as machine code rather than as bytecode? That is exactly what JIT does.

A non-obvious detail: JIT is an add-on to OPCache. That is, JIT will not work unless OPCache is enabled (it is enabled by default). This can be seen in the last diagram:

Example 4: PHP 8 (OpCache & JIT)

Let's consider the most optimistic path of our script's execution, when the code section is found first in the OPCode cache and then in the JIT buffer (the leftmost path in the diagram). Here we come close to how the program ran in the very first example with Go: by the time the code section is run, its machine code has already been compiled.

This may seem like a panacea, since we are getting closer to the performance of compiled languages, but it is not.
The JIT buffer has a limited size, so the opcodes are profiled, i.e. their execution is analyzed (action #5 in the diagram), and the Zend virtual machine decides whether it makes sense to keep the machine code for a given section in the buffer.

Read more about JIT at php.watch

Conclusion

Congratulations, you've made it to the end! Hopefully, the PHP execution process is now clearer to you, and the next articles you read on the topic won't seem so confusing.

It seems to me that these ideas are quite enough to quickly get your bearings in how most other popular languages work.

I would be glad to see you on my little PHP Telegram blog.
