Whetstone Benchmark – Historical Performance Test

I am continuing my review of old cross-platform performance tests. The first article covered Dhrystone, an integer benchmark.

The test was written by Harold Curnow (CCTA) in Algol 60 in 1972. A Fortran implementation followed in 1973, and Roy Longbottom's C implementation in 1996. The program is very simple: about 150 statements with eight active loops, three of which are executed through procedure calls.

The dominant loop, which typically accounts for 30-50% of the run time, performs floating-point calculations through procedure calls. Results are expressed in MWIPS (millions of Whetstone instructions per second). For a meaningful measurement, the processor should have a hardware FPU.

Whetstone Test Structure

The test consists of 8 sections:

  • N1 floating point: Test of arithmetic over floating point arrays;

  • N2 floating point: Test of arithmetic over floating point arrays, which are passed as parameters to the function;

  • N3 if then else: Conditional jump test;

  • N4 fixed point: Integer arithmetic;

  • N5 sin, cos etc.: Test of trigonometric functions;

  • N6 floating point: Test of procedure calls with parameters passed through pointers;

  • N7 assignments: Assignments;

  • N8 exp, sqrt etc.: Test of other mathematical functions (square root, exponential, logarithm).
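
To make the section list above more concrete, here is a minimal Python sketch of an N1-style loop. It is modeled on the publicly available C sources as I recall them, not copied from them: the constant `T`, the starting values, and the count of 16 floating-point operations per pass are assumptions, and the helper names `n1_floating_point` and `time_section` are hypothetical. Python uses double precision, so the values will not match the single-precision C output.

```python
import math
import time

# Assumption: scaling constant recalled from the public C version of Whetstone.
T = 0.49999975


def n1_floating_point(iterations):
    """N1-style section: arithmetic over a four-element floating-point array."""
    e = [1.0, -1.0, -1.0, -1.0]
    for _ in range(iterations):
        # Each line does 3 additions/subtractions and 1 multiplication.
        e[0] = (e[0] + e[1] + e[2] - e[3]) * T
        e[1] = (e[0] + e[1] - e[2] + e[3]) * T
        e[2] = (e[0] - e[1] + e[2] + e[3]) * T
        e[3] = (-e[0] + e[1] + e[2] + e[3]) * T
    return e


def time_section(func, iterations, flops_per_iteration):
    """Time one section and convert the elapsed time into MFLOPS."""
    start = time.perf_counter()
    result = func(iterations)
    seconds = time.perf_counter() - start
    if seconds > 0:
        mflops = iterations * flops_per_iteration / seconds / 1e6
    else:
        mflops = float("inf")  # timer resolution too coarse for this run
    return result, seconds, mflops


if __name__ == "__main__":
    # The iteration count here is arbitrary, chosen only for illustration.
    result, seconds, mflops = time_section(n1_floating_point, 1_000_000, 16)
    print(f"N1 floating point  result={result[3]:.17f}  "
          f"MFLOPS={mflops:.3f}  seconds={seconds:.3f}")
```

The other sections follow the same pattern: run a small fixed kernel many times, time it, and divide the operation count by the elapsed time.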

Advantages of the test:

  • Small amount of code;

  • It is possible to compare the results of processors of different architectures;

  • Easy to compile and port (but requires FPU).

Flaws:

  • Does not exercise all the capabilities of the processor;

  • The results are highly dependent on compiler optimization;

  • Fits entirely in the processor cache, so it puts almost no load on the memory subsystem.

Whetstone Benchmark Output

Example benchmark output (MWIPS is the main performance indicator):
##############################################

Whetstone Single Precision C Benchmark  amd64 _ optimized, Fri Jan 31 22:19:52 2020


Loop content                   Result              MFLOPS      MOPS   Seconds

N1 floating point      -1.12475013732910156       796.837               0.081
N2 floating point      -1.12274742126464844       627.771               0.721
N3 if then else         1.00000000000000000                5177.749     0.067
N4 fixed point         12.00000000000000000                3237.101     0.328
N5 sin,cos etc.         0.49911010265350342                  81.010     3.461
N6 floating point       0.99999982118606567       647.448               2.808
N7 assignments          3.00000000000000000                1752.687     0.355
N8 exp,sqrt etc.        0.75110864639282227                  57.663     2.174

MWIPS                                            3371.316               9.996

There is also a multi-threaded variant of the test (Whetstone MP).
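
To help read the sample output above, the following Python sketch computes each section's share of the total run time and the amount of work implied by the MWIPS figure. The timings are copied from the sample; the function names `time_shares` and `whetstone_instructions_millions` are hypothetical, and the second function assumes that MWIPS is simply millions of Whetstone instructions divided by elapsed seconds.

```python
# Section name -> seconds, copied from the sample output above.
SECTION_SECONDS = {
    "N1 floating point": 0.081,
    "N2 floating point": 0.721,
    "N3 if then else": 0.067,
    "N4 fixed point": 0.328,
    "N5 sin,cos etc.": 3.461,
    "N6 floating point": 2.808,
    "N7 assignments": 0.355,
    "N8 exp,sqrt etc.": 2.174,
}
REPORTED_MWIPS = 3371.316
REPORTED_TOTAL_SECONDS = 9.996


def time_shares(section_seconds):
    """Return each section's share of the summed run time, in percent."""
    total = sum(section_seconds.values())
    return {name: 100.0 * s / total for name, s in section_seconds.items()}


def whetstone_instructions_millions(mwips, seconds):
    """Work implied by a result, assuming MWIPS = millions of instructions / seconds."""
    return mwips * seconds


if __name__ == "__main__":
    for name, share in time_shares(SECTION_SECONDS).items():
        print(f"{name:<20} {share:5.1f}%")
    work = whetstone_instructions_millions(REPORTED_MWIPS, REPORTED_TOTAL_SECONDS)
    print(f"Implied work: {work:.0f} million Whetstone instructions")
```

In this particular run, sections N5, N6 and N8 together take roughly 84% of the time, so the MWIPS figure is driven mostly by the math library functions and procedure calls.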

I ported the Whetstone benchmark to other programming languages: Python, JavaScript, PHP, Lua, C#, Java. The code can be viewed here: https://github.com/entityfx/entityfx-bench.

Table with test results for some processors
                      MWIPS  MFLOP  MFLOP  MFLOP   COS    EXP   FIXPT   IF    EQUAL
CPU              MHz            1      2      3    MOPS   MOPS   MOPS   MOPS   MOPS

AM386/387         40   5.68  0.928  0.884  0.673  0.461  0.275   2.36   2.16  0.638
80486DX2          66   15.3   4.92   3.59   2.38  0.501  0.320   6.18   5.91   5.32
AMD 5X86         100   25.0   7.70   6.04   4.05  0.806  0.522   9.52   9.37   7.83
Winchip C6       200   40.6   16.9   12.3   7.81   1.02   0.71   18.4   28.4   17.0
Pentium           75   48.2   19.6   12.6   7.39   1.86   1.12   11.9   14.6   16.7
Cyrix P150       120   53.5   14.1   11.6   7.55   2.07   1.36   18.3   30.0   12.9
IBM C6x86        150   66.1   17.3   14.3   9.32   2.56   1.68   22.6   36.9   15.9
Pentium          100   66.2   27.1   17.3   10.2   2.56   1.53   16.3   19.9   23.1
Apple G3         266   66.5   17.3   13.7   11.4   2.73   1.13   39.6   25.0   10.9
Pentium          120   79.5   32.4   20.8   12.2   3.07   1.83   19.5   24.1   27.7
Cyrix PR233M     188   88.0   24.3   19.5   12.8   3.28   2.16   28.5   48.4   23.4
Pentium          133   88.3   36.1   23.1   13.6   3.41   2.03   21.7   26.5   30.8
Cyrix MII300     233    109   30.6   24.0   15.8   4.09   2.70   35.7   59.4   29.4
Pentium          166    109   44.6   28.5   16.8   4.21   2.50   26.9   33.0   38.0
Pentium MMX      166    112   45.0   28.9   17.0   4.28   2.53   28.4   38.1   38.6
AMD K6           200    124   46.3   29.8   16.9   5.54   2.53   68.6   25.2   35.4
Pentium          200    132   54.2   34.7   20.3   5.12   3.03   32.5   39.8   46.2
Pentium MMX      200    134   54.2   34.3   20.3   5.13   3.03   34.0   46.2   45.8
Pentium Pro      200    161   50.3   45.2   31.5   4.46   2.77    102   20.6    119
AMD K6           266    167   65.9   41.4   22.9   7.48   3.36   92.0   33.0   47.0
Pentium Pro      233    189   58.9   51.3   36.7   5.36   3.30    119   24.0    116
AMD K6           300    191   75.5   47.1   26.2   8.51   3.83    105   37.3   53.4
Pentium II       233    191   58.8   51.9   36.9   5.37   3.34    119   24.3    141
Pentium II       266    218   67.3   59.6   42.1   6.12   3.81    137   27.4    160
Celeron A        300    245   76.2   66.7   47.4   6.91   4.28    153   31.0    181
Pentium II       300    245   75.7   66.9   47.4   6.90   4.28    152   31.1    180
AMD K63          450    286    113   70.8   39.2   12.8   5.74    157   56.2   79.7
Pentium II       350    286   88.1   78.6   55.4   8.06   4.99    178   36.1    212
AMD K62          500    309    122   76.4   42.3   13.8   6.20    170   61.0   86.0
Pentium III      450    367    114    101   71.2   10.3   6.42    231   46.6    271
Pentium II       450    368    115    100   71.3   10.4   6.41    230   46.5    269
Athlon           500    381   98.9   91.6   60.1   10.9   8.34    265    115    128
Celeron          466    382    118    103   73.8   10.8   6.66    240   48.6    279
Pentium III      550    448    139    122   86.8   12.6   7.83    279   56.9    331
Duron            600    463    121    111   73.1   13.3   10.2    321    140    156
Duron            700    541    140    131   85.3   15.5   11.8    377    164    181
Athlon           700    544    141    131   85.8   15.6   11.9    380    165    185
Atom M          1600    554    248    159   89.9   15.1   10.2    418    338    198
Pentium III      700    572    176    156    111   16.2   9.96    357   72.3    415
Celeron          733    598    185    162    116   16.9   10.4    373   75.6    437
Pentium 4       1700    603    204    166    104   21.3   12.1    369   71.1    155
Pentium 4       1800    639    216    176    110   22.6   12.8    391   75.2    164
Pentium 4       1900    671    227    184    116   23.6   13.4    409   79.4    172
Pentium 4       2052    726    245    200    125   25.6   14.5    443   85.5    187
Athlon Tbird    1000    769    200    185    121   22.0   16.8    532    233    260
Duron           1000    772    200    186    122   22.1   16.9    536    235    260
P4 Xeon         2200    773    260    213    134   26.9   16.2    469     85    198
Pentium III     1000    816    253    222    158   23.1   14.2    510    103    599
Atom Z8300      1840    954    432    292    162   26.5   15.9    986    256    417
PIII Tualatin   1200    972    304    268    188   27.4   16.9    604    123    715
Pentium 4E      3000   1028    313    262    154   38.7   18.5    824    217    298
Celeron M       1295   1034    321    301    202   29.0   17.6    661    132    762
Pentium 4       3066   1119    365    300    188   40.2   22.3    660    152    278
Athlon 4        1533   1193    308    284    186   34.6   26.3    833    358    398
Pentium 4       3382   1233    402    331    206   44.4   24.7    726    167    306
Pentium 4       3678   1342    436    357    224   48.3   26.8    791    184    333
Athlon 4        1789   1389    358    331    217   40.3   30.6    971    416    464
Ath4 Barton     1800   1397    348    333    227   40.3   30.4    957    357    447
Athlon XP       1865   1450    373    345    226   42.1   32.0   1010    437    484
Turion 64 M     1900   1506    375    346    245   42.3   32.1   1291    437    473
Pentium M       1862   1538    471    439    292   42.0   25.3    945    293   1113
Core 2 Duo M    1830   1557    431    437    293   42.8   26.2   1641    286    590
Opteron         1991   1580    393    364    255   44.7   33.8   1349    457    496
Celeron C2 M    2000   1688    487    472    315   46.9   28.4   1792    310    632
Athlon 64       2150   1720    427    395    280   48.5   36.6   1465    495    537
Athlon 64       2211   1766    439    406    286   49.9   37.6   1503    509    552
Athlon XP       2338   1805    457    424    295   51.5   39.7   1224    463    581
Core i5 2467M   @@@@   1813    537    501    317   50.4   32.1   2242    457    561
Core 2 Duo 1 CP 2400   2057    586    580    387   56.7   34.4   2192    381    771
Phenom II       3000   2145    594    492    297   67.1   50.7   2053    694    703
Core i7 930     ****   2496    691    671    441   72.0   43.9   2663    489    838
Core i7 860     ####   2790    793    752    499   81.2   49.5   2133    559    964
Core i7 3930K   &&&&   3004    883    829    525   83.6   53.1   3715    749    936
Core i7 4820K   $$$1   3063    887    831    533   85.2   53.7   4110    769    972
Core i7 4820K   $$$2   3124    920    857    545   86.2   55.3   4149    782    979
Core i7 3930K   OC     3722   1096   1017    649    104   65.7   4538    937   1166

      ####   Rated as 2800 MHz but running at up to 3466 MHz using Turbo Boost     
      ****   Rated as 2800 MHz but running at up to 3066 MHz using Turbo Boost     
      @@@@   Rated as 1600 MHz but running at up to 2300 MHz using Turbo Boost     
      &&&&   Rated as 3200 MHz but running at up to 3800 MHz using Turbo Boost     
      $$$1   Rated as 3700 MHz but running at up to 3900 MHz using Turbo Boost     
      $$$2   Power setting Performance rather than Balanced, running at 3900 MHz   
        OC   OverClocked ~4730 MHz                                                    
        M    Mobile CPU           
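
One simple way to compare architectures using the table above is to divide MWIPS by the clock frequency. The Python sketch below does this for a handful of rows copied from the table; the selection of rows and the helper name `mwips_per_mhz` are my own choices for illustration. Rows marked with Turbo Boost symbols are left out because their actual clock during the run is not given in the MHz column.

```python
# (CPU, MHz, MWIPS) copied from a few rows of the table above.
ROWS = [
    ("Pentium", 100, 66.2),
    ("Pentium III", 1000, 816),
    ("Athlon Tbird", 1000, 769),
    ("Atom M", 1600, 554),
    ("Pentium 4", 1700, 603),
    ("Core 2 Duo 1 CP", 2400, 2057),
    ("Phenom II", 3000, 2145),
]


def mwips_per_mhz(rows):
    """Return (cpu, MWIPS per MHz) pairs sorted from most to least efficient."""
    ratios = [(cpu, mwips / mhz) for cpu, mhz, mwips in rows]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for cpu, ratio in mwips_per_mhz(ROWS):
        print(f"{cpu:<16} {ratio:.3f} MWIPS/MHz")
```

For these rows the Pentium 4 at 1700 MHz delivers about 0.35 MWIPS per MHz, less than half of the Pentium III at 1000 MHz (about 0.82), which shows why clock frequency alone says little about performance in this test.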

PS

For Windows, the benchmark can be downloaded here: http://www.roylongbottom.org.uk/benchnt.zip. For Linux, compile it from the sources in my repository: https://github.com/EntityFX/anybench.
