Friday, November 04, 2011

Alignment and modern ARM processors

Back when RISC OS was in its heyday and the ARM processor was new, reading or writing a word at a non-word-aligned address didn't cause an exception, nor did it cause two memory words to be read (and half of each discarded). Instead, a load would read the word at (address & ~3), i.e. the word containing the addressed byte, and place it in the register rotated so that the addressed byte was the least significant byte (IIRC).

ARM Architecture Reference Manual:

load single word ARM instructions are architecturally defined to rotate right the word-aligned data transferred by a non word-aligned address one, two or three bytes depending on the value of the two least significant address bits.
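
To make the difference concrete, here is a minimal C sketch that models both behaviours over a byte array standing in for little-endian memory. The function names (read_le32, ldr_traditional, ldr_fixed_up) are made up for illustration and are not part of any ARM or RISC OS API; the rotation follows the description quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a little-endian 32-bit word from four bytes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Traditional behaviour: fetch the aligned word containing the addressed
 * byte, then rotate right by 8 bits per byte of misalignment, so that the
 * addressed byte ends up in the least significant position. */
static uint32_t ldr_traditional(const uint8_t *mem, size_t len, uint32_t addr)
{
    uint32_t base = addr & ~3u;
    unsigned rot = (addr & 3u) * 8u;
    assert((size_t)base + 4 <= len);
    uint32_t word = read_le32(mem + base);
    /* Avoid a shift by 32, which is undefined in C, when rot is 0. */
    return rot ? (word >> rot) | (word << (32u - rot)) : word;
}

/* Fixed-up behaviour: the four bytes starting at addr, treated as one word. */
static uint32_t ldr_fixed_up(const uint8_t *mem, size_t len, uint32_t addr)
{
    assert((size_t)addr + 4 <= len);
    return read_le32(mem + addr);
}
```

With memory holding the bytes 11 22 33 44 55 66 77 88 (hex), a load from address 1 yields 0x11443322 under the traditional rule and 0x55443322 under the fixed-up rule. Code that relied on the first result silently gets the second.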

Modern ARM processors, when faced with this case, can trigger an alignment exception, allowing the OS to fix up the instruction so that it behaves in the (admittedly more high-level-language-friendly) way of reading the four bytes starting at the address and treating them as a single word; some hardware can even perform the fixup automatically.

Unfortunately, that means that some RISC OS (read: old) code breaks.

There are two Linux features that appear to have a bearing on the problem:
  1. /proc/cpu/alignment (allows you to set whether an alignment exception should be fixed up or a SIGBUS sent to the process).  Unfortunately, that has at least two problems: firstly, it's system-wide, and so may cause other programs to break; secondly, it doesn't register unaligned LDRs (probably because the hardware performs the fixup).
  2. The prctl system call has two relevant options defined in linux/prctl.h, PR_SET_UNALIGN and PR_GET_UNALIGN, with the possible values PR_UNALIGN_NOPRINT (for silent fixup) and PR_UNALIGN_SIGBUS (to signal the exception to the process for fixing up).  This is a per-process feature but, unfortunately, these options are not implemented in the ARM kernel.
Ideally, I think I'd like to use the prctl approach and have a third value to set the system to, PR_UNALIGN_ARM_TRADITIONAL, which would simply work the way ARM processors used to.  How practical that is remains to be seen (especially in the presence of multi-core processors; the behaviour has to be settable on a per-core basis).
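
For reference, this is roughly what the per-process request looks like from C. It is only a sketch: set_unalign_policy is a made-up wrapper name, and on a kernel that doesn't implement the option (which, as noted above, includes ARM) the call is expected to fail rather than change anything. PR_UNALIGN_ARM_TRADITIONAL does not exist, so it is not used here.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Hypothetical wrapper: ask the kernel to apply the given unaligned-access
 * policy (PR_UNALIGN_NOPRINT or PR_UNALIGN_SIGBUS) to the calling process.
 * Returns 0 on success, or a negative errno value if the kernel refuses,
 * for example because the architecture does not implement PR_SET_UNALIGN. */
static int set_unalign_policy(unsigned long policy)
{
    if (prctl(PR_SET_UNALIGN, policy, 0, 0, 0) == 0)
        return 0;
    return -errno;
}
```

A caller would try set_unalign_policy(PR_UNALIGN_SIGBUS) at start-up and, on a non-zero result, fall back to some other strategy, since the kernel will not deliver the exception to the process.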

