JavaScript call stack and more magic

In early April, an article “JavaScript: Call stack and the magic of its size” was published on Habré – its author came to the conclusion that each stack frame takes (72 + 8 * number of local_variables) bytes: “It turns out that we calculated everything correctly and can say that the size of an empty ExecutionStack in Chrome is 72 bytes, and the size of the callstack is a little less than one megabyte. Great job! “

To start, let’s slightly change the code used by AxemaFr for experiments:

{let i = 0;

const func = () => {
  i += 1.00000000000001;
  func();
};

try {
  func();
} catch (e) {
  console.log(i);
}}

Instead of 1, now at each step we add a little more, and as a result, instead of 13951, we get 12556.000000000002 – as if a local variable was added to the function!

Let’s repeat the questions Senior Frontend Developer AxemaFr asked: “Why is it so? What changed? How to understand, looking at the function, how many times it can be executed recursively ?! “

Cooking tools

On the Chrome command line, you can pass arguments to the JS engine; in particular, you can change the stack size from 984 KB to any other key --js-flags=--stack-size=

To figure out how much stack each function requires, the key will help us --print-bytecodealready mentioned earlier. It was not mentioned that the debug output is sent to stdout, which Chrome under Windows stupidly does not have, because it is compiled as a GUI application. The fix is ​​easy: make a copy of chrome.exe and fix the byte in your favorite hex editor 0xD4 from the meaning 0x02 on the 0x03 (for those who are not friends with the hex editor, this byte will help fix Python script). But if you’re reading this article right now in Chrome, and just run the patched file – let’s say you named it cui_chrome.exe – then a new window will open in an existing browser instance, and the argument --js-flags will be ignored. To start a new instance of Chrome, you need to pass it some new one --user-data-dir:

cui_chrome.exe --no-sandbox --js-flags="--print-bytecode --print-bytecode-filter=func" --user-data-dir=WindowsTemp

Without --print-bytecode-filter you will be drowned in kilometer bytecode dumps of functions built into Chrome.

After launching the browser, open the developer console and enter the code used by AxemaFr:

{let i = 0;

const func = () => {
  i++;
  func();
};

func()}

Before you hit Enter, a dump will appear in the console window behind Chrome:

[generated bytecode for function: func (0x44db08635355 <SharedFunctionInfo func>)]
Parameter count 1
Register count 1
Frame size 8
   36 S> 000044DB086355EE @    0 : 1a 02             LdaCurrentContextSlot [2]
         000044DB086355F0 @    2 : ac 00             ThrowReferenceErrorIfHole [0]
         000044DB086355F2 @    4 : 4d 00             Inc [0]
         000044DB086355F4 @    6 : 26 fa             Star r0
         000044DB086355F6 @    8 : 1a 02             LdaCurrentContextSlot [2]
   37 E> 000044DB086355F8 @   10 : ac 00             ThrowReferenceErrorIfHole [0]
         000044DB086355FA @   12 : 25 fa             Ldar r0
         000044DB086355FC @   14 : 1d 02             StaCurrentContextSlot [2]
   44 S> 000044DB086355FE @   16 : 1b 03             LdaImmutableCurrentContextSlot [3]
         000044DB08635600 @   18 : ac 01             ThrowReferenceErrorIfHole [1]
         000044DB08635602 @   20 : 26 fa             Star r0
   44 E> 000044DB08635604 @   22 : 5d fa 01          CallUndefinedReceiver0 r0, [1]
         000044DB08635607 @   25 : 0d                LdaUndefined
   52 S> 000044DB08635608 @   26 : ab                Return
Constant pool (size = 2)
Handler Table (size = 0)
Source Position Table (size = 12)

How will the dump change if the line i++; replaced by i += 1.00000000000001;?

[generated bytecode for function: func (0x44db0892d495 <SharedFunctionInfo func>)]
Parameter count 1
Register count 2
Frame size 16
   36 S> 000044DB0892D742 @    0 : 1a 02             LdaCurrentContextSlot [2]
         000044DB0892D744 @    2 : ac 00             ThrowReferenceErrorIfHole [0]
         000044DB0892D746 @    4 : 26 fa             Star r0
         000044DB0892D748 @    6 : 12 01             LdaConstant [1]
         000044DB0892D74A @    8 : 35 fa 00          Add r0, [0]
         000044DB0892D74D @   11 : 26 f9             Star r1
         000044DB0892D74F @   13 : 1a 02             LdaCurrentContextSlot [2]
   37 E> 000044DB0892D751 @   15 : ac 00             ThrowReferenceErrorIfHole [0]
         000044DB0892D753 @   17 : 25 f9             Ldar r1
         000044DB0892D755 @   19 : 1d 02             StaCurrentContextSlot [2]
   60 S> 000044DB0892D757 @   21 : 1b 03             LdaImmutableCurrentContextSlot [3]
         000044DB0892D759 @   23 : ac 02             ThrowReferenceErrorIfHole [2]
         000044DB0892D75B @   25 : 26 fa             Star r0
   60 E> 000044DB0892D75D @   27 : 5d fa 01          CallUndefinedReceiver0 r0, [1]
         000044DB0892D760 @   30 : 0d                LdaUndefined
   68 S> 000044DB0892D761 @   31 : ab                Return
Constant pool (size = 3)
Handler Table (size = 0)
Source Position Table (size = 12)

Now let’s figure out what has changed and why.

Exploring examples

All V8 opcodes are described in github.com/v8/v8/blob/master/src/interpreter/interpreter-generator.cc

The first dump is decoded like this:

LdaCurrentContextSlot [2]           ; a := context[2]
ThrowReferenceErrorIfHole [0]       ; if (a === undefined)
                                    ;   throw("ReferenceError: %s is not defined", const[0])
Inc [0]                             ; a++
Star r0                             ; r0 := a
LdaCurrentContextSlot [2]           ; a := context[2]
ThrowReferenceErrorIfHole [0]       ; if (a === undefined)
                                    ;   throw("ReferenceError: %s is not defined", const[0])
Ldar r0                             ; a := r0
StaCurrentContextSlot [2]           ; context[2] := a
LdaImmutableCurrentContextSlot [3]  ; a := context[3]
ThrowReferenceErrorIfHole [1]       ; if (a === undefined)
                                    ;   throw("ReferenceError: %s is not defined", const[1])
Star r0                             ; r0 := a
CallUndefinedReceiver0 r0, [1]      ; r0()
LdaUndefined                        ; a := undefined
Return

The last argument of the opcodes Inc and CallUndefinedReceiver0 specifies a feedback slot, in which statistics about used types are accumulated for the optimizer. This does not affect the semantics of the bytecode, so today we are not at all interested.

Under the dump there is a postscript: “Constant pool (size = 2)” – and indeed we see that two lines are used in the bytecode – "i" and "func" – for substitution in the exception message when symbols with such names are undefined. There is a postscript above the dump: “Frame size 8” – in accordance with the fact that the function uses one interpreter register (r0).

Our function’s stack frame consists of:

  • the only argument this;
  • return addresses;
  • number of arguments passed (arguments.length);
  • references to constant pool with used strings;
  • links to context with local variables;
  • three more pointers needed by the engine; and finally
  • space for one register.

Total 9 * 8 = 72 bytes, as Signor AxemaFr calculated.

Of the seven listed terms, theoretically, three can change – the number of arguments, the presence of a constant pool, and the number of registers. What did we get in the version with 1.00000000000001?

LdaCurrentContextSlot [2]      ; a := context[2]
ThrowReferenceErrorIfHole [0]  ; if (a === undefined)
                               ;   throw("ReferenceError: %s is not defined", const[0])
Star r0                        ; r0 := a
LdaConstant [1]                ; a := const[1]
Add r0, [0]                    ; a += r0
Star r1                        ; r1 := a
                               ; ...дальше как раньше

First, the added constant took third place in the constant pool; secondly, one more register was needed to load it, so the function’s stack frame grew by 8 bytes.

If you do not use named symbols in the function, then you can do without the constant pool. On the github.com/v8/v8/blob/master/src/execution/frame-constants.h#L289 the format of the V8 stack frame is described and it is indicated that when the constant pool is not used, the size of the stack frame is reduced by one pointer. How can you be sure of this? At first glance, it seems that a function that does not use named symbols cannot be recursive; but take a look:

{let i = 0;

function func() {
  this()();
};

const helper = () => (i++, func.bind(helper));

try {
  helper()();
} catch (e) {
  console.log(i);
}}

[generated bytecode for function: func (0x44db0878e575 <SharedFunctionInfo func>)]
Parameter count 1
Register count 1
Frame size 8
   37 S> 000044DB0878E8DA @    0 : 5e 02 02 00       CallUndefinedReceiver1 <this>, <this>, [0]
         000044DB0878E8DE @    4 : 26 fa             Star r0
   43 E> 000044DB0878E8E0 @    6 : 5d fa 02          CallUndefinedReceiver0 r0, [2]
         000044DB0878E8E3 @    9 : 0d                LdaUndefined
   47 S> 000044DB0878E8E4 @   10 : ab                Return
Constant pool (size = 0)
Handler Table (size = 0)
Source Position Table (size = 8)

The goal – “Constant pool (size = 0)” – has been achieved; but the stack overflow, as before, occurs through 13951 calls. This means that even when the constant pool is not used, the function’s stack frame still contains a pointer to it.

Will it be possible to achieve a smaller stack frame size than the minimum value calculated by AxemaFr? – yes, if no register is used inside the function:

{function func() {
  this();
};

let chain = ()=>null;
for(let i=0; i<15050; i++)
  chain = func.bind(chain);

chain()}

[generated bytecode for function: func (0x44db08c34059 <SharedFunctionInfo func>)]
Parameter count 1
Register count 0
Frame size 0
   25 S> 000044DB08C34322 @    0 : 5d 02 00          CallUndefinedReceiver0 <this>, [0]
         000044DB08C34325 @    3 : 0d                LdaUndefined
   29 S> 000044DB08C34326 @    4 : ab                Return
Constant pool (size = 0)
Handler Table (size = 0)
Source Position Table (size = 6)

(In this case, the call chain of 15051 already results in “RangeError: Maximum call stack size exceeded”.)

Thus, the conclusion of Signor AxemaFr that “Size of empty ExecutionStack in Chrome is 72 bytes”, successfully refuted.

Refining predictions

We can argue that the minimum stack frame size for a JS function in Chrome is 64 bytes. To this you need to add 8 bytes for each declared formal parameter, another 8 bytes for each actual parameter in excess of the number of declared ones, and another 8 bytes for each used register. A register is allocated for each local variable, for loading constants, for accessing variables from an external context, for passing actual parameters during calls, etc. It is hardly possible to determine the exact number of used registers from the source code in JS. It is worth noting that the JS interpreter supports an unlimited number of registers – they are not related to the registers of the processor on which the interpreter is running.

Now it’s clear why:

  • adding an unused formal parameter (func = (x) => { i++; func(); };) consumes as much memory as an additional local variable;
  • passing undeclared actual parameter (func = () => { i++; func(1); };) consumes twice as much memory as the additional local variable – because the transfer required an additional register:
    [generated bytecode for function: func (0x44db08e12da1 <SharedFunctionInfo func>)]
    Parameter count 1
    Register count 2
    Frame size 16
       34 S> 000044DB08E12FE2 @    0 : 1a 02             LdaCurrentContextSlot [2]
             000044DB08E12FE4 @    2 : ac 00             ThrowReferenceErrorIfHole [0]
             000044DB08E12FE6 @    4 : 4d 00             Inc [0]
             000044DB08E12FE8 @    6 : 26 fa             Star r0
             000044DB08E12FEA @    8 : 1a 02             LdaCurrentContextSlot [2]
       35 E> 000044DB08E12FEC @   10 : ac 00             ThrowReferenceErrorIfHole [0]
             000044DB08E12FEE @   12 : 25 fa             Ldar r0
             000044DB08E12FF0 @   14 : 1d 02             StaCurrentContextSlot [2]
       39 S> 000044DB08E12FF2 @   16 : 1b 03             LdaImmutableCurrentContextSlot [3]
             000044DB08E12FF4 @   18 : ac 01             ThrowReferenceErrorIfHole [1]
             000044DB08E12FF6 @   20 : 26 fa             Star r0
             000044DB08E12FF8 @   22 : 0c 01             LdaSmi [1]
             000044DB08E12FFA @   24 : 26 f9             Star r1
       39 E> 000044DB08E12FFC @   26 : 5e fa f9 01       CallUndefinedReceiver1 r0, r1, [1]
             000044DB08E13000 @   30 : 0d                LdaUndefined
       48 S> 000044DB08E13001 @   31 : ab                Return
    Constant pool (size = 2)
    Handler Table (size = 0)
    Source Position Table (size = 12)
  • changing the added value in the previous example to 1.00000000000001 does not affect the size of the stack frame – because the same r1 is used both to load a constant and to pass an actual parameter.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *