Phantom double in firmware for Cortex-M* cores
Recently, many microcontrollers based on ARM Cortex-M* cores have appeared that implement floating-point math in hardware (FPU). These FPUs mostly work with single precision (float), which is quite enough for processing signals received from an ADC. An FPU lets you forget about quantization and integer overflow problems. It is also fast: on the Cortex-M4F, every single-precision float operation except division and square root takes one clock cycle. So, after switching to the Cortex-M4F, we breathed freely and began writing our math in float. How surprised we were to find math operations on double in the compiled code, carried out by very slow software emulation.
This article explains how to detect and fix the presence of double in firmware for a core that supports the float type in hardware but does not support double.
The work is carried out in the IAR Embedded Workbench environment using real C code as an example.
The problem of double numbers in ARM Cortex-M*
Is it a big problem that some of the math is done with software-emulated double instead of hardware-supported float? Here it is shown that double is 27 times slower. Here the numbers are smaller (I counted a difference of 7 to 15 times). You will agree that losing an order of magnitude in the speed of math algorithms because double was used by mistake where float would be enough is rather sad. At the same time, the solution sometimes suggested for the phantom double problem is simply not to use the FPU:
My mantra is not to use any floating point data types in embedded applications,
or at least to avoid them whenever possible: for most applications they are not
necessary and can be replaced by fixed point operations. Not only floating point
operations have numerical problems, but they can also lead to performance problems…
Well, we will not be afraid of performance problems; instead, we will learn to avoid phantom double.
We are interested in Cortex-M cores that have an FPU for single precision but not for double precision. Below is a table of Cortex-M cores. For cores marked with ✅, the FPU is optional. Cores that include this block are often marked with the suffix F, as in Cortex-M4F.
Core | FPU (half precision) | FPU (single precision) | FPU (double precision) |
---|---|---|---|
Cortex-M0 | ❌ | ❌ | ❌ |
Cortex-M0+ | ❌ | ❌ | ❌ |
Cortex-M1 | ❌ | ❌ | ❌ |
Cortex-M3 | ❌ | ❌ | ❌ |
Cortex-M4 | ❌ | ✅ (Optional) | ❌ |
Cortex-M7 | ❌ | ✅ (Optional) | ✅ (Optional) |
Cortex-M23 | ❌ | ❌ | ❌ |
Cortex-M33 | ❌ | ✅ (Optional) | ❌ |
Cortex-M35P | ❌ | ✅ (Optional) | ❌ |
Cortex-M55 | ✅ (Optional) | ✅ (Optional) | ✅ (Optional) |
Cortex-M85 | ✅ (Optional) | ✅ (Optional) | ✅ (Optional) |
Software emulation
The C standard requires support for floating-point types such as float and double. How is code compiled for cores without hardware support for fractional types? Each math operation is replaced by a call to a software emulation function. Each compiler has its own implementation of these emulation routines. Any work with fractional types on a core without an appropriate FPU results in calls to emulation functions! Moreover, if you do not explicitly enable FPU support in the compiler settings, software emulation will be used for floating-point numbers even where they could be handled in hardware.
Consider the case of emulating floating point numbers.
// Example for no-FPU device
const int N = 1000;
int half_of_N;
// 1) Inefficient way
half_of_N = 0.5 * N; // C standard treats 0.5 as double literal
// 2) Optimized way
half_of_N = N / 2;
Multiplying an integer by 0.5 takes place in four stages:
1. converting N to the double type, via a call to the corresponding function ..._i2d();
2. multiplying the double constant 0.5 by N converted to double, i.e. a call to the emulation function ..._dmul();
3. converting the result back to int, via the function ..._d2i();
4. assigning the result to half_of_N.
A simple multiplication by a fractional number on a core without an FPU resulted in calls to at least three runtime library functions. Moreover, the implementations of these functions were linked into the firmware, which increases its size by roughly 1 kB. To keep the firmware from growing, you need to find all the places where software emulation is used and remove them.
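The four stages can be written out by hand to make the hidden work visible. This is only a sketch: the helper names soft_i2d, soft_dmul and soft_d2i are hypothetical stand-ins for the compiler's runtime routines (whose real names depend on the toolchain), and they use native arithmetic here so that the example also runs on a PC.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the runtime helpers. On a real no-FPU target
 * these are library routines built from integer instructions. */
static double  soft_i2d(int32_t v)           { return (double)v; }
static double  soft_dmul(double a, double b) { return a * b; }
static int32_t soft_d2i(double v)            { return (int32_t)v; }

/* Roughly what `half_of_N = 0.5 * N;` turns into without a double FPU. */
int32_t half_via_double(int32_t n)
{
    double  n_as_double = soft_i2d(n);                /* stage 1: int -> double  */
    double  product     = soft_dmul(0.5, n_as_double); /* stage 2: double multiply */
    int32_t result      = soft_d2i(product);          /* stage 3: double -> int  */
    return result;                                    /* stage 4: caller assigns */
}

/* The integer-only alternative: no helper calls at all. */
int32_t half_via_int(int32_t n)
{
    return n / 2;
}
```

Both functions return 500 for an input of 1000, but only the first one pulls three extra routines into the firmware image.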
Checking for double emulation
So, we decided to check whether double had crept into our firmware by accident. Sometimes the compiler itself can diagnose this (link, see comments). For example, gcc has a special flag, -Wdouble-promotion, which warns the programmer about an implicit conversion from float to double. Fractional constants can also be interpreted as float if gcc is passed the flag -fsingle-precision-constant. IAR Embedded Workbench has no built-in tool for this problem, so you have to work backwards from the end of the firmware build process.
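For comparison, here is a small sketch of the kind of code the gcc flags react to. The function names are hypothetical, and the command line in the comment is an assumed example for a Cortex-M4F target; only -Wdouble-promotion and -fsingle-precision-constant come from the text above.

```c
/* Assumed build command (toolchain name and target options are examples):
 *   arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard \
 *       -Wdouble-promotion -c scale.c
 */

/* 0.1 is a double literal, so x is promoted to double before the multiply.
 * With -Wdouble-promotion gcc reports the implicit float -> double conversion. */
float scale_promoted(float x)
{
    return x * 0.1;
}

/* 0.1f is a float literal, so the multiply stays in single precision
 * and no warning is produced. */
float scale_single(float x)
{
    return x * 0.1f;
}
```

Passing -fsingle-precision-constant instead would make gcc treat the 0.1 in scale_promoted as a float constant, at the cost of changing the meaning of every unsuffixed constant in the translation unit.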
1. Open the project and go to its settings.
2. Select the sub-item Linker->List and check the checkbox Generate linker map file.
3. Build the project. A *.map file should appear in the output folder in the project tree.
4. Open it and scroll down to the “ENTRY LIST” section. There you will find the names of all the functions used in the code, including all kinds of ..._f2d(), ..._dmul() and others.
5. It is enough to search for the conversion functions f2d, ui2d and l2d.
In my case, I found the following functions.
This means that phantom double is present in the firmware. The screenshot shows that the list contains many more functions for type conversion and for software emulation of floating-point math operations. Once the last use of the double type is removed, they all disappear together.
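The map-file check can also be automated. The sketch below is one possible way to do it, not part of the original workflow: it looks for the name fragments f2d, ui2d and l2d in each line of a map file, without assuming any vendor-specific prefix. The function names are hypothetical, and a plain substring match can give false positives on unrelated symbols that happen to contain these fragments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Name fragments taken from the text; the prefix is deliberately not assumed. */
static const char *const kDoubleHelperFragments[] = { "f2d", "ui2d", "l2d" };

/* Returns true if the text contains any of the conversion helper fragments. */
bool map_line_mentions_double_helper(const char *line)
{
    size_t count = sizeof kDoubleHelperFragments / sizeof kDoubleHelperFragments[0];
    for (size_t i = 0; i < count; ++i) {
        if (strstr(line, kDoubleHelperFragments[i]) != NULL) {
            return true;
        }
    }
    return false;
}

/* Counts the lines of an open map file that mention a helper.
 * Long lines are read in chunks; a fragment split exactly across a
 * chunk boundary would be missed, which is acceptable for a sketch. */
size_t count_double_helper_lines(FILE *map)
{
    char chunk[512];
    size_t hits = 0;
    bool line_counted = false;

    while (fgets(chunk, sizeof chunk, map) != NULL) {
        if (!line_counted && map_line_mentions_double_helper(chunk)) {
            ++hits;
            line_counted = true;
        }
        if (strchr(chunk, '\n') != NULL) {
            line_counted = false; /* the next chunk starts a new line */
        }
    }
    return hits;
}
```

A result of zero from count_double_helper_lines is the state we are aiming for; any other value means the hunt continues.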
Searching for double in the code
As a reminder, we are optimizing code for a core with an FPU that supports only float, for example the Cortex-M4F.
First step – search by code text
Let us skip over the uninteresting search for the word double across all project files (Ctrl + Shift + F). It did turn out that somewhere a programmer's hand had slipped and he had written it after all.
Built-in IAR EWARM search in project files for the word double
And here is a common mistake to look for:
// ST5918L3008 parameters are used as default values.
const static StepperParameters_t StepperParametersDefault =
{
.L = 0.0076 * 0.1f, // H -> 0.1H
.R = 2.2 * 0.1f, // Ohm -> 0.1Ohm
.Fm = 0.009 * 100.0f, // Wb -> cWb
};
static void StepperFOCSensorless_PIDReconfigureCallback(void)
{
/*
* Update PID if SPID was changed
* 1. Kp, Ki, Kd units in firmware: mA/rad, mA/(rad*T) and mA*T/rad, where T is period.
* 2. Kp, Ki, Kd units in GUI: A/rad, A/(rad*s), A*s/rad but XiLab multiplies them on 0.001 before sending.
* 4. Parameters can be changed during movement.
*/
// assign new values
PositionRegulator_param.Kp = 1e6 * BCDFlashParams.SPID.Kpf; // rad -> mA
PositionRegulator_param.Kd = 1e6 * BCDFlashParams.SPID.Kdf * STEPPER_FOC_PWM_FREQ ; // rad/T -> mA
PositionRegulator_param.Ki = 1e6 * BCDFlashParams.SPID.Kif * STEPPER_FOC_PWM_PERIOD_FLOAT; // rad T -> mA
}
The C language is insidious in that, by default, all fractional constants are interpreted as double. For example, 0.0076 is a double. If we multiply it by 0.1f, the result takes the wider of the two types, i.e. double. It gets even more insidious with exponent notation: the number 1e6 is not an integer; its type is also double. Use the suffix f/F at the end of numeric constants to specify the float type explicitly, for example 1e6f.
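Applied to the PID scaling shown earlier, the fix amounts to adding the suffix to every constant. The sketch below is a simplified illustration: the PidGains_t type, the function name and the PWM frequency value are hypothetical, because the real firmware definitions are not shown in the article.

```c
/* Hypothetical minimal type standing in for the firmware's PID structure. */
typedef struct {
    float Kp;
    float Ki;
    float Kd;
} PidGains_t;

/* Assumed example values; note the f suffix on every constant. */
#define PWM_FREQ_HZ  20000.0f
#define PWM_PERIOD_S (1.0f / PWM_FREQ_HZ)

/* Every operand below is float, so the arithmetic never leaves float. */
PidGains_t pid_gains_from_flash(float Kpf, float Kif, float Kdf)
{
    PidGains_t gains;
    gains.Kp = 1e6f * Kpf;                 /* 1e6f, not 1e6 */
    gains.Kd = 1e6f * Kdf * PWM_FREQ_HZ;
    gains.Ki = 1e6f * Kif * PWM_PERIOD_S;
    return gains;
}
```

With the unsuffixed 1e6, each of the three assignments would convert the float operand to double, multiply in double and convert back.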
How do you find the wrong constants in the firmware? For this we use the regular expression search built into IAR Embedded Workbench. The search expression:
[^\.\d\w#][0-9]+[\.eE][0-9]*[^\.\w0-9]
The regular expression looks for numbers that are not preceded by a letter, a dot, or #, that have a dot or an exponent letter in the middle or at the end, and that are not followed by a dot, a letter, or a digit. It produces a lot of hits in code comments, which are quickly filtered out by eye. It may also mishandle constants such as 3e-4 or 1e3L, which are not of type float, but this regular expression was enough for me.
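The same search can be run outside the IDE. The sketch below assumes a POSIX system with <regex.h>. POSIX extended syntax has no \d or \w, so those classes are spelled out as [:alnum:] and _; this is my translation of the expression, not the author's original, and the function name is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* POSIX ERE translation of the search expression:
 *   [^\.\d\w#]  ->  [^.[:alnum:]_#]
 *   [^\.\w0-9]  ->  [^.[:alnum:]_]
 */
static const char *const kSuspectLiteralPattern =
    "[^.[:alnum:]_#][0-9]+[.eE][0-9]*[^.[:alnum:]_]";

/* Returns true if the line contains a numeric constant with a dot or
 * exponent that is not directly followed by a letter such as the f suffix.
 * The pattern is compiled on every call to keep the sketch short. */
bool line_has_suspect_literal(const char *line)
{
    regex_t re;
    if (regcomp(&re, kSuspectLiteralPattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return false; /* pattern failed to compile */
    }
    bool hit = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}
```

Like the original expression, this one needs a character before and after the number, so a constant at the very start or very end of a line is not reported.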
Now for some more complicated errors:
/*
* right half of voltage cicle is inside current circle
* NOTE: if voltage circle is inside current circle (Imax > Iu + I0) then intersection point does not exists,
* but Iu^2 - Imax^2 - I0^ < -2*I0^2 -2*I0*Iu and Id < -I0 - Iu <= I0
*/
if (Id_limit < -I0)
{
// FW_VOLTAGE_ONLY: limit only voltage
if (fabs(Irq) < Iu)
{
pIr->d = -I0 + sqrtf(Iu * Iu - Irq * Irq);
if (pIr->d > 0)
{
pIr->d = 0.0f;
}
pIr->q = Irq;
return; // nolimit
}
else
{
pIr->d = -I0;
if (Irq > 0.0f)
{
pIr->q = Iu;
}
else
{
pIr->q = -Iu;
}
return; // limit
}
}
/*
* Main case: Id_limit is between located between -I0 and 0.
*/
/*
* Check if line Iq = Irq intersects the voltage circle
* then calculate intersection Id coordinate and check if it is is not greater then Id_limit
*/
if (fabs(Irq) >= Iu || (Id_fw = -I0 + sqrtf(Iu * Iu - Irq * Irq)) < Id_limit)
{
/*
* LIMIT_MAIN_CASE: apply maximal current in Irq direction
*/
pIr->d = Id_limit;
pIr->q = sqrtf(Imax * Imax - Id_limit * Id_limit);
if (Irq < 0)
{
pIr->q = -pIr->q;
}
return; // limit
}
In this code, all fractional variables are of type float, but double still occurs. The cause is a function call. Try to guess which function it is.
In the meantime, a little theory on how math library functions are named. For square root, sine, cosine, and even the absolute value of a number, there are functions that accept and return double, but there are also faster alternatives that work with float. They have the suffix f at the end.
// Double functions
sin(x); // double
cos(x); // double
sqrt(x); // double
fabs(x); // double
// Float functions
sinf(x); // float alternative
cosf(x); // float alternative
sqrtf(x); // float alternative
fabsf(x); // float alternative
In our case, it was fabs(). At first glance it may seem, as it did to me, that fabs() already returns float. This is wrong: fabs means floating-point abs and returns double, while fabsf is floating-point abs for float, and that one really is a float function.
How do you find functions that accept the double type? I can only advise searching for math functions: sqrt, cos, sin, fabs… Replace them with their float analogues: sqrtf, cosf, sinf, fabsf…
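If you are unsure what a particular call returns, the compiler can be asked directly. The sketch below uses the C11 _Generic keyword to name the type of an expression; it assumes a C11 compiler, and the TYPE_NAME macro and the three functions are hypothetical helpers written for this illustration. The expression inside _Generic is not evaluated; only its type is inspected.

```c
#include <math.h>

/* Maps the static type of an expression to a readable name. */
#define TYPE_NAME(expr) _Generic((expr), \
    float: "float",                      \
    double: "double",                    \
    long double: "long double",          \
    default: "other")

/* fabs() is declared as double fabs(double), so the result is double
 * even though the argument is a float. */
const char *type_of_fabs(float x)
{
    return TYPE_NAME(fabs(x));
}

/* fabsf() is declared as float fabsf(float). */
const char *type_of_fabsf(float x)
{
    return TYPE_NAME(fabsf(x));
}

/* Mixing the double result with a float drags the whole expression,
 * including the other operand, into double. */
const char *type_of_fabs_plus_float(float x, float limit)
{
    return TYPE_NAME(fabs(x) + limit);
}
```

This is why a comparison such as fabs(Irq) < Iu runs in double even when both variables are float.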
The first step is complete. Compile the project and run the diagnostics again. In our case, the diagnostics still showed the presence of double, so we dug deeper.
Second step – searching through object files
If the first step did not rid you of double, you need at least to narrow the search space. To do this, make a list of the emulation functions found by the diagnostics, and then search for the names of these functions in the object files, *.o. The idea is that the linker added them to the project because they are referenced in one or more object files. The reference is recorded by function name, and we know the name. We used Double Commander for the search. It is an open-source, cross-platform clone of Total Commander.
One of the object files will contain a reference to the functions from the *.map file (if not, then you are linking some other binary code, such as libraries; this was not our case). The corresponding *.c file explicitly or implicitly requires the use of double. Unfortunately, the only thing left to do is to go through the code line by line and look for problems by eye. Still, it's better than nothing 🙂
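The object-file search does not need a file manager. The sketch below does the same thing in code, assuming, as the search described above does, that function names are stored in the object file as plain byte strings. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Byte-wise search that tolerates the zero bytes found in binary files
 * (strstr would stop at the first one). */
bool buffer_contains_symbol(const unsigned char *data, size_t size, const char *symbol)
{
    size_t n = strlen(symbol);
    if (n == 0 || n > size) {
        return false;
    }
    for (size_t i = 0; i + n <= size; ++i) {
        if (memcmp(data + i, symbol, n) == 0) {
            return true;
        }
    }
    return false;
}

/* Loads a whole object file into memory and searches it for the name. */
bool file_contains_symbol(const char *path, const char *symbol)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return false;
    }
    bool found = false;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size > 0 && fseek(f, 0, SEEK_SET) == 0) {
            unsigned char *data = malloc((size_t)size);
            if (data != NULL && fread(data, 1, (size_t)size, f) == (size_t)size) {
                found = buffer_contains_symbol(data, (size_t)size, symbol);
            }
            free(data); /* free(NULL) is allowed */
        }
    }
    fclose(f);
    return found;
}
```

Calling file_contains_symbol for every *.o file and every name from the map file gives the short list of source files worth reading line by line.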
For example, searching through the object files told me that double was used in the file bldc.c. Studying the code carefully, I came across an unfamiliar library function, arm_inv_clarke_f32. Going into its definition, I found the problem:
It turned out to be an implementation error in a library function in arm_math.h from CMSIS, revision v1.4.4. Here we see the familiar multiplication by 0.5 from the example above. This is a double! I had to patch the system library.
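A patch of this kind could look like the sketch below. It is a standalone illustration, not the CMSIS source: the function name inv_clarke_f32_patched is hypothetical, and the body is the standard inverse Clarke transform with every constant given the f suffix so that the arithmetic stays in float.

```c
/* Inverse Clarke transform, two-phase (alpha, beta) to phase currents a and b:
 *   Ia = Ialpha
 *   Ib = -1/2 * Ialpha + sqrt(3)/2 * Ibeta
 * Both constants carry the f suffix; an unsuffixed 0.5 would promote
 * the whole expression to double. */
void inv_clarke_f32_patched(float Ialpha, float Ibeta, float *pIa, float *pIb)
{
    *pIa = Ialpha;
    *pIb = -0.5f * Ialpha + 0.8660254039f * Ibeta;
}
```

After a change like this, rebuilding and checking the map file again shows whether the emulation functions have gone.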
Conclusions
What is done in GCC with one or two flags is done in IAR by searching with regular expressions, as well as searching in binary files, and then scrutinizing the code. And the most difficult mistake was found in the CMSIS library. That is, even if you write error-free code, this does not guarantee the absence of phantom doubles. They can be inherited from libraries and slow down the work of mathematical algorithms by an order of magnitude.
The goal was achieved: using the steps outlined in this article, we got rid of double in real firmware for the Cortex-M4F. All code examples are real. The job took about 6 hours. The firmware consists of about 80 object files.
Finally, here is the disassembly of a code fragment before and after the changes.
Authors: Zapunidi Sergey, Shampletov Nikita