The ARM Thumb-2 instruction set is nothing new: it was announced back in 2003. Yet the standard ARM instruction set is still often used simply because it is the default, even where Thumb-2 would be the better choice. This post explains why Thumb-2 can be a better option for many applications and how to enable it in the Yocto build system for the Linux kernel, system libraries, utilities and user binaries.
ARM, Thumb and Thumb-2
The original ARM instruction set is a RISC (reduced instruction set computing) architecture that uses fixed-length 32-bit instructions and 32-bit registers (newer ARM cores also support 64-bit registers and addressing). In short, the RISC approach uses a simpler instruction set than CISC (complex instruction set computing) architectures such as x86, which leads to a simpler and more power-efficient CPU design. A RISC machine might, however, need multiple instructions to accomplish a task that a CISC machine can do with a single instruction.
Even on RISC machines, most programs frequently use only a small subset of the instruction set. To improve code density, ARM introduced the Thumb instruction set, which was created by deriving the best-fit 16-bit instructions from the full ARM instruction set. This reduces code size but sacrifices some performance, because the more complex instructions that were left out of Thumb would sometimes yield the best performance. It is possible to mix Thumb and ARM code, but this requires switching the processor between ARM state and Thumb state, which also introduces a performance penalty.
The next iteration was Thumb-2, which allows the compiler to freely mix the short 16-bit instructions with the more expressive 32-bit instructions. This gives a good combination of performance and code density.
Thumb-2 performance comparison
One performance comparison of ARM, Thumb and Thumb-2 can be found in an Embedded Linux Conference presentation from 2007. A figure in the presentation shows that in these benchmarks the performance of Thumb-2 was around 98% of the ARM performance.
With GCC and the -O2 optimization level, the code size was reduced by ~20% for common libraries such as libc and by ~29% for the Linux kernel. Even though these results are already a bit old, the technologies have remained the same, so the results should still be comparable today. Of course, the code size reduction and the performance depend on the application, so it is best to experiment with your own project.
Benefits of higher code density
As the performance comparison shows, the main benefit of the Thumb-2 instruction set is higher code density with performance comparable to the standard ARM instruction set. The smaller binary size that comes with higher code density has many benefits that make Thumb-2 worth considering.
When a device boots up, the kernel binary is loaded from non-volatile storage such as a hard drive or flash chip into main memory. The load time increases roughly linearly with the binary size, so a 30% reduction in binary size cuts the load time, and with it that part of the boot time, by roughly the same proportion. The kernel image is often also compressed, and the decompression time shrinks with the binary size as well.
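The linear relationship can be illustrated with a small calculation. This is only a sketch: the function name `estimate_load_ms`, the image size and the storage throughput are made-up values chosen for easy arithmetic, and real load times also depend on the storage device and the bootloader.

```shell
# Estimate the time to read an image from storage, in milliseconds.
# Assumes load time scales linearly with size (sequential read, fixed throughput).
# Usage: estimate_load_ms <size_kib> <throughput_kib_per_s>
estimate_load_ms() {
    size_kib="$1"
    throughput_kib_per_s="$2"
    echo $(( size_kib * 1000 / throughput_kib_per_s ))
}

# Hypothetical figures: an 8000 KiB image read at 20000 KiB/s,
# compared with the same image after a 30% size reduction (5600 KiB).
arm_ms=$(estimate_load_ms 8000 20000)
thumb2_ms=$(estimate_load_ms 5600 20000)
echo "ARM: ${arm_ms} ms, Thumb-2: ${thumb2_ms} ms"
```

With these numbers the load time drops from 400 ms to 280 ms, the same 30% as the size reduction.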
The same also applies to libraries and applications. When the device starts up, these binaries are loaded from storage into main memory. Overall, the significantly smaller binaries give a faster boot time than the slightly faster ARM instruction set with its bigger binaries.
The smaller binary size also reduces the memory footprint and requires less storage. How important these things are depends heavily on the application. For low-end devices with small memories and limited storage capacity in particular, these aspects might be very important.
Smaller binaries also mean smaller firmware update packages. Now that ARM cores are running in many IoT-thingies, the size of the over-the-air update package can be important, especially if mobile connections are used.
Denser code can also help to make better use of the instruction cache in the CPU and thus reduce accesses to main memory. This might have some positive impact on energy efficiency (at least according to ARM marketing material).
Configuring Thumb-2 in Yocto
Thumb-2 brings the most benefit when it is used for all the components in the system: the kernel, system libraries and utilities as well as user applications and libraries. Manually setting the necessary compiler flags for all these components would be very laborious. Luckily, switching to Thumb-2 in Yocto is relatively easy.
Thumb mode for the ARM Linux kernel can be easily enabled in the kernel configuration. In Yocto, the kernel configuration can be modified with, for instance, the menuconfig bitbake command as shown below.
$> bitbake -c menuconfig virtual/kernel
This command opens the menu-based kernel configuration editor. The Thumb-2 mode (the “Compile the kernel in Thumb-2 mode” option, CONFIG_THUMB2_KERNEL) can be selected from the “Kernel Features” menu.
To add support for userspace Thumb binaries, the “Support Thumb user binaries” option should be selected from the “System Type” menu.
Now, when the kernel is compiled, it will use Thumb-2. Remember that when the kernel configuration is edited with the bitbake menuconfig command, the kernel recipe and the defconfig are not updated automatically. The changes must be carried over to the kernel recipe so that they also apply to subsequent builds. For more details, refer to the Yocto kernel development manual.
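Once the changes have been carried over to the recipe, it can be worth confirming that they really ended up in the generated kernel configuration. The following is a minimal sketch: `check_thumb_config` is a hypothetical helper name, and it assumes the two menu entries correspond to the Kconfig symbols CONFIG_THUMB2_KERNEL and CONFIG_ARM_THUMB.

```shell
# Report whether a kernel .config enables the two Thumb-related options.
# Returns non-zero if either option is missing or disabled.
# Usage: check_thumb_config path/to/.config
check_thumb_config() {
    config_file="$1"
    status=0
    for option in CONFIG_THUMB2_KERNEL CONFIG_ARM_THUMB; do
        if grep -q "^${option}=y\$" "$config_file"; then
            echo "${option}: enabled"
        else
            echo "${option}: not enabled"
            status=1
        fi
    done
    return "$status"
}
```

The function would be pointed at the .config file in the kernel build directory, whose exact location depends on the machine and the kernel recipe in use.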
System and user binaries
Yocto provides a global configuration option that can be set to enable Thumb-mode for all the target libraries and utilities that are built by the normal Yocto build, and end up in the core image.
ARM_INSTRUCTION_SET = "thumb"
This option can be set, for instance, in local.conf or in the respective machine configuration. Note that there is no “thumb2” option. If the target processor is new enough, Thumb-2 is used automatically. When the build is started, “thumb” should now be visible in the TUNE_FEATURES list in the bitbake output.
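The effective values can also be inspected without starting a full build, because `bitbake -e` prints the expanded variables. The helper below is a small sketch with a hypothetical name, `show_thumb_settings`; it assumes both variables appear in that output as plain `NAME="value"` lines.

```shell
# Print the Thumb-related variables from `bitbake -e` output read on stdin.
# Usage: bitbake -e | show_thumb_settings
show_thumb_settings() {
    grep -E '^(ARM_INSTRUCTION_SET|TUNE_FEATURES)='
}
```

Reading from stdin keeps the helper independent of how bitbake is invoked, so it can equally be fed the output for a single recipe.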
The same option also sets Thumb mode in the compiler options of the Yocto SDK. This means that all user applications and libraries built with the Yocto-generated SDK will also use Thumb automatically, without additional configuration. This can be verified from the environment-setup script that is installed with the SDK: the compiler flags for GCC should include the “-mthumb” flag.
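That check is easy to script. The sketch below uses a hypothetical helper, `sdk_uses_thumb`, and assumes the environment-setup script defines the compiler on a line starting with `export CC=`.

```shell
# Succeed if the CC definition in an SDK environment-setup script contains -mthumb.
# Usage: sdk_uses_thumb /path/to/environment-setup-<target>
sdk_uses_thumb() {
    grep -E '^export CC=' "$1" | grep -q -e '-mthumb'
}
```

The function prints nothing and only sets its exit status, so it can be used directly in an `if` statement or a CI step.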
That’s it. The kernel and all the userspace binaries are now using Thumb-2. If the device uses U-Boot as the bootloader, the CONFIG_SYS_THUMB_BUILD option might also be worth checking out.
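As a final sanity check on the userspace side, the ARM build attributes of a binary can be read with `readelf -A`, which reports a `Tag_THUMB_ISA_use` entry. The helper name `thumb_isa_use` below is hypothetical, and the attribute only records which Thumb variant the binary is allowed to use, so treat it as a hint and not as proof that every function was compiled as Thumb-2.

```shell
# Extract the Thumb build attribute from `readelf -A` output read on stdin.
# Usage: readelf -A path/to/binary | thumb_isa_use
thumb_isa_use() {
    awk -F': ' '/Tag_THUMB_ISA_use/ { print $2 }'
}
```

For a binary built as described in this post, the expected value is `Thumb-2`. If the host readelf does not handle ARM binaries, the readelf from the cross toolchain can be used instead.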