Performance Measurement on ARM

Robert Schwebel | | benchmarks, kernel, performance

Having worked mostly with ARM processors in the 200 to 400 MHz range in many Embedded Linux projects over the past few years, we have recently seen an interesting development in the market:

  • ARM CPUs, known for their low power consumption, are becoming faster and faster (examples: OMAP3, Beagleboard, MX51/MX53).
  • x86, known for its high computing performance, is becoming more and more SoC-like, power-friendly and slower.

If you read the marketing material from the chip manufacturers, it sounds as if ARM is the next x86 (in terms of performance) and x86 is the next ARM (in terms of power consumption). But where do we stand today? How fast are modern ARM derivatives?

The Pengutronix Kernel team wanted to know, so we measured in order to get some real numbers. Here are the results, and they raise some interesting questions. Don't take the "observations" below too scientifically; they are my attempt to sum up the results in short claims.

As ARM is explicitly a low-power architecture, it would have been interesting to measure some "performance vs. power consumption" data. However, we ran our experiments on board-level products, so this was not possible. Some manufacturers put more peripheral chips on their modules than others, so we would only have measured the effects of the boards' bills of materials (BoMs).

Test Hardware

To find out more about the real speed of today's hardware, we collected some typical industrial hardware in our lab. These are the devices we benchmarked:

Test Hardware                      CPU               Freq.     Core               RAM    Kernel
phyCORE-PXA270                     PXA270 (Marvell)  520 MHz   XScale (ARMv5)     SDRAM  2.6.34
phyCORE-i.MX27                     MX27 (Freescale)  400 MHz   ARM926 (ARMv5)     DDR    2.6.34
phyCORE-i.MX35                     MX35 (Freescale)  532 MHz   ARM1136 (ARMv6)    DDR2   2.6.34
O3530-PB-1452 (Texas Instruments)  OMAP3530          500 MHz   Cortex-A8 (ARMv7)  DDR    2.6.34
Beagleboard C3                     OMAP3530          500 MHz   Cortex-A8 (ARMv7)  DDR    2.6.34
phyCORE-Atom                       Z510 (Intel)      1100 MHz  Atom               DDR2   2.6.34

LMbench command lines

lat_ops

root@target:~ lat_ops 2>&1

filtered by

grep "^float mul:" | cut -f3 -d" "
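
To make the filter concrete, here is a minimal sketch that wraps it in a shell function. The function name extract_float_mul and the sample lines are hypothetical. The sketch assumes that lat_ops reports each operation on a line of the form "float mul: <value> nanoseconds", which is the layout the grep/cut pair implies. The numbers are placeholders, not measurements.

```shell
# Hypothetical helper: keep only the "float mul" line and cut out its
# third space-separated field, which is assumed to be the latency value.
extract_float_mul() {
    grep "^float mul:" | cut -f3 -d" "
}

# Placeholder input in the assumed lat_ops layout (values are made up).
printf 'float add: 1.00 nanoseconds\nfloat mul: 2.00 nanoseconds\n' \
    | extract_float_mul
```

On a target, the same function would be fed from the command above, for example with lat_ops 2>&1 | extract_float_mul.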

bw_mem

root@target:~ list="rd wr rdwr cp fwr frd bzero bcopy"; \
for i in $list; \
do echo -en "$i\t";  done; \
echo; \
for i in $list; \
do res=$(bw_mem 33554432 $i 2>&1 | awk "{print \$2}"); \
echo -en "$res\t"; done; \
echo MB/Sec

filtered by

awk "/rd\twr\trdwr\tcp\tfwr\tfrd\tbzero\tbcopy/ { getline; print \$3 }"
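
The effect of this filter is easier to see on a small example. The sketch below is illustrative only: the function name pick_third_column and the bandwidth values are hypothetical, and the input simply mimics the two lines the loop above prints (a tab-separated header, then the values followed by "MB/Sec").

```shell
# Hypothetical helper: find the header line, step to the next line with
# getline, and print its third whitespace-separated field.
pick_third_column() {
    awk '/rd\twr\trdwr\tcp\tfwr\tfrd\tbzero\tbcopy/ { getline; print $3 }'
}

# Placeholder table in the layout produced by the bw_mem loop
# (values are made up, not measurements).
{
    printf 'rd\twr\trdwr\tcp\tfwr\tfrd\tbzero\tbcopy\t\n'
    printf '10\t20\t30\t40\t50\t60\t70\t80\tMB/Sec\n'
} | pick_third_column
```

Because the third column of the header is rdwr, the filter picks the rdwr result out of the eight bw_mem runs.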