I've set up some benchmarks that can be run on a variety of hardware and software. I've run these benchmarks on various machines I have access to, trying to keep them comparable through time and across architectures.
Matrix (naive): Multiplying 1000x1000 matrices by the simplest possible algorithm. They will not fit in the CPU cache, so every operand reference requires a trip to memory, and with a fast floating point processor the test is really testing the memory. The reported number is in Mflops/sec, that is, units of 1.0e6 floating point operations per second, where the naive algorithm multiplies a pair of N x N matrices in 2*N^3 flops (one multiplication and one addition for each of the N terms in each of the N^2 result cells).
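To make the flop count concrete, here is a minimal sketch of the textbook triple loop and the 2*N^3 conversion to Mflops/sec. It is not the actual benchmark code, which is not shown here; the names `naive_matmul` and `mflops` are made up for illustration, and pure Python will report far lower rates than a compiled version.

```python
import time

def naive_matmul(a, b, n):
    """Multiply two n x n matrices (lists of lists) with the plain triple loop."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                # One multiplication and one addition per term.
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c

def mflops(n, seconds):
    """Rate in Mflops/sec, counting 2*n^3 flops for one n x n product."""
    return 2.0 * n ** 3 / seconds / 1.0e6

if __name__ == "__main__":
    n = 100  # kept small so the sketch finishes quickly; the benchmark uses 1000
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    t0 = time.perf_counter()
    naive_matmul(a, b, n)
    elapsed = time.perf_counter() - t0
    print(f"{mflops(n, elapsed):.1f} Mflops/sec")
```

For N = 1000 the count is 2.0e9 flops, so a run that takes 2 seconds would be reported as 1000 Mflops/sec.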
Blocked matrix: Multiplying 1000x1000 matrices with a hand-optimized algorithm, doing 28x28 internal blocks. These fit in the cache, so under 4% of the operand references come from memory. This test puts the maximal load on the floating point processor. Also reported in Mflops/sec.
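The blocking idea can be sketched as follows. This is a generic tiling loop under my own assumptions, not the hand-optimized code used for the numbers below; `blocked_matmul` and its `bs` parameter are hypothetical names, and pure Python will not show the cache benefit that a compiled version does.

```python
def blocked_matmul(a, b, n, bs=28):
    """Multiply two n x n matrices, working on bs x bs tiles at a time."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, bs):
        i_end = min(ii + bs, n)          # last tile may be smaller than bs
        for kk in range(0, n, bs):
            k_end = min(kk + bs, n)
            for jj in range(0, n, bs):
                j_end = min(jj + bs, n)
                # Inside a tile, the same small pieces of a, b and c are
                # reused many times, which is what keeps them in cache.
                for i in range(ii, i_end):
                    row_a = a[i]
                    row_c = c[i]
                    for k in range(kk, k_end):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, j_end):
                            row_c[j] += aik * row_b[j]
    return c
```

The flop count is still 2*N^3; only the order of the operations changes, so the result matches the naive algorithm.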
Sums: Does the SysV checksum algorithm on 8 Mbytes of memory, over and over. This emphasizes integer and memory performance. The reported number is the ratio of the measured speed to the speed empirically measured on a 1.0 GHz Pentium III Coppermine.
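For reference, a minimal sketch of the SysV checksum as it is commonly defined (the same result as `sum -s`): add up all the bytes, then fold the sum down to 16 bits. The benchmark's own implementation is not shown here, so the function name `sysv_checksum` is an assumption for illustration.

```python
def sysv_checksum(data: bytes) -> int:
    """SysV-style checksum: byte sum folded down to 16 bits."""
    # Accumulate all bytes; mask to 32 bits to mimic a C unsigned accumulator.
    s = sum(data) & 0xFFFFFFFF
    # Fold the high 16 bits into the low 16 bits, twice, to absorb any carry.
    r = (s & 0xFFFF) + (s >> 16)
    return (r & 0xFFFF) + (r >> 16)

if __name__ == "__main__":
    buf = bytes(range(256)) * 4096  # 1 Mbyte here; the benchmark uses 8 Mbytes
    print(sysv_checksum(buf))
```

The inner loop is nothing but a load and an integer add per byte, which is why the test stresses integer and memory performance.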
X11perf: 2D graphics acceleration, 9 tests (out of 374 possible) emphasizing the larger objects, mostly 500x500px rectangles. The reported number is an average (equal weights per test) of the ratios of each test's speed to that of an ATI Radeon X1400. This is a somewhat naive figure, since the relative speed of different graphics processors varies greatly from test to test, and actual graphics workloads have not been analysed to produce a rational weighting for the different tests.
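The scoring can be sketched like this. The test names and rates in the usage example are invented placeholders, not real x11perf results, and `x11perf_score` is a hypothetical helper.

```python
def x11perf_score(rates, reference_rates):
    """Equal-weight average of per-test speed ratios against a reference card."""
    ratios = [rates[name] / reference_rates[name] for name in reference_rates]
    return sum(ratios) / len(ratios)

if __name__ == "__main__":
    # Placeholder numbers only: one test twice as fast, one half as fast.
    reference = {"test_a": 100.0, "test_b": 100.0}
    measured = {"test_a": 200.0, "test_b": 50.0}
    print(x11perf_score(measured, reference))
```

The example shows why the figure is naive: a card that is twice as fast on one test and half as fast on another still scores above 1.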
Mesademo Fire: 3D graphics acceleration. This is a very old demo from the Mesa OpenGL library, but I like it because it seems to put more stress on the graphics processor than glxgears. The help text and fog are turned off, except for figures marked *; in these cases the measured number is divided by 0.72, which is the empirical ratio of the speed with this junk to the speed without it on the ATI Radeon X1400.
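The correction for starred figures is a one-line calculation; a small sketch follows, with `fire_adjusted` as a hypothetical helper name. It assumes the 0.72 ratio measured on the Radeon X1400 carries over to other cards, which is only an approximation.

```python
FIRE_JUNK_RATIO = 0.72  # speed with help text and fog / speed without, on the X1400

def fire_adjusted(measured, junk_on):
    """Estimate the junk-free Fire figure from a measured one."""
    return measured / FIRE_JUNK_RATIO if junk_on else measured

if __name__ == "__main__":
    # Going the other way: a starred table entry of 179 corresponds to a
    # raw measurement of roughly 179 * 0.72 with the junk turned on.
    print(round(179 * FIRE_JUNK_RATIO, 1))
```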
Glxgears: 3D graphics acceleration. This demo is widely used as a speed test in web postings.
The date refers to approximately when the machine was acquired. Some of these machines are at the UCLA Mathematics Department.
| Date | System | Processor | Matrix | Blocked | Sums | Graphics | X11perf | Fire | Glxgears |
|---|---|---|---|---|---|---|---|---|---|
| 2007-12-27 | Mica | TI OMAP-2420 0.4GHz | -- | -- | 0.10 | Epson S1D13745 | -- | -- | -- |
| 2007-12-27 | Jacinth | AMD Geode LX 800@0.9W 0.5GHz | 25.3 | 56.4 | 0.28 | Geode Cimmaron(fbdev) | 0.19 | 0.4 | 59 |
| | | | | | | Geode Cimmaron(amd) | 0.24 | 0.4 | 73 |
| 2003-09-27 | Fafnir | Intel Pentium 4 2.40GHz | 147.9 | 874.1 | 1.83 | nVidia GeForce4 MX 440 AGP 8x | 0.92 | 179* | 1943 |
| 2006-11-17 | Diamond | Intel 6300 1.86GHz x2 | 245.7 | 1333.3 | 3.18 | ATI Radeon X1300Pro | 1.73 | 537 | 4901 |
| 2007-05-15 | Xena | Intel T5600 1.83GHz x2 | 240.7 | 1295.2 | 3.08 | ATI Radeon X1400 Mobility | 1.0 | 203 | 1504 |
| 2007-10-02 | Godzilla | Intel Xeon X5365 3.0GHz x8 | 325.7 | 1886.8 | 6.21 | nVidia Quadro FX 4600 | 5.41 | -- | 22843 |
| 2003-01-01 | Bamboo | Intel Pentium 4 3.06GHz | 169.5 | 1224.5 | 2.67 | (No graphics, compute cluster) | -- | -- | -- |