Wednesday, June 15, 2011

Why does the language version being used affect standard semantics?

knoppix@Microknoppix:/tmp$ cat x.c
#include <signal.h>

siginfo_t info;

knoppix@Microknoppix:/tmp$ gcc x.c -c
knoppix@Microknoppix:/tmp$ gcc x.c -c -std=c99
x.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'info'
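A likely explanation, offered as a sketch rather than a definitive diagnosis: siginfo_t is a POSIX type, not an ISO C one. With -std=c99, gcc asks for strict ISO C and glibc's headers stop exposing POSIX declarations unless a feature-test macro requests them. Building with -std=gnu99, or defining _POSIX_C_SOURCE before the first #include, should bring the type back. The helper function below is hypothetical and only there to show the type in use.

```c
/* Request POSIX.1-2008 declarations; this must come before any #include. */
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

siginfo_t info;

/* Hypothetical helper: clear the structure and return its signal number. */
int cleared_signo(void)
{
    memset(&info, 0, sizeof info);
    return info.si_signo;
}
```

With the macro in place, "gcc x.c -c -std=c99" should compile this file. Passing -D_POSIX_C_SOURCE=200809L on the command line has the same effect without touching the source.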


Saturday, June 11, 2011

Root filesystem for Beagleboard on MMC card

Update: There is of course the rootwait kernel parameter, which is better. (I'll leave the rest in case someone else searches for the same symptoms.)

Just in case someone else has the same problem, I've had a frustrating few days trying to get my own kernel to start up with the root filesystem on /dev/mmcblk0p2.

I based my kernel on the config found here, and based my boot.scr file on the one that the Debian Net Install generates with "--uboot beagle_bx".

No matter which kernel I tried, or which boot.scr options I used, I kept getting a kernel panic whenever I tried to use root=/dev/mmcblk0p2 (rather than an initrd):

[    4.075317] regulator_init_complete: VDAC: incomplete constraints, leaving on
[    4.083404] md: Waiting for all devices to be available before autodetect
[    4.090637] md: If you don't use raid, use raid=noautodetect
[    4.097534] md: Autodetecting RAID arrays.
[    4.101928] md: Scanned 0 and added 0 devices.
[    4.106628] md: autorun ...
[    4.109558] md: ... autorun DONE.
[    4.113159] Root-NFS: no NFS server address
[    4.117614] VFS: Unable to mount root fs via NFS, trying floppy.
[    4.124816] VFS: Cannot open root device "mmcblk0p2" or unknown-block(2,0)
[    4.132141] Please append a correct "root=" boot option; here are the available partitions:
[    4.140991] 1f00             512 mtdblock0  (driver?)
[    4.146362] 1f01            1920 mtdblock1  (driver?)
[    4.151672] 1f02             128 mtdblock2  (driver?)
[    4.157012] 1f03            4096 mtdblock3  (driver?)
[    4.162384] 1f04          255488 mtdblock4  (driver?)
[    4.167724] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(2,0)
[    4.176483] [] (unwind_backtrace+0x0/0xf8) from [] (panic+0x5c/0x190)
[    4.185150] [] (panic+0x5c/0x190) from [] (mount_block_root+0x160/0x214)
[    4.194061] [] (mount_block_root+0x160/0x214) from [] (mount_root+0xa8/0xc4)
[    4.203338] [] (mount_root+0xa8/0xc4) from [] (prepare_namespace+0x15c/0x1c0)
[    4.212707] [] (prepare_namespace+0x15c/0x1c0) from [] (kernel_init+0x150/0x194)
[    4.222381] [] (kernel_init+0x150/0x194) from [] (kernel_thread_exit+0x0/0x8)

The solution turns out to be dead simple (but I couldn't find a reference to it on the web): the rootdelay kernel parameter gives the kernel time to detect the MMC card before trying to mount it.

Now, if I can just work out how to stop Debian's "Detecting Network Hardware" stopping the installation process in its tracks, I'll be a happy man...

Friday, June 03, 2011

Why did Linux use a RISC OS SWI number, when they had 24 million to choose from?

The EABI syscall interface for Linux on ARM processors uses the SWI (SoftWare Interrupt instruction) number zero as its way into supervisor mode and the kernel.  Unfortunately, that instruction has been in use for more than twenty years by the operating system used by the originators of the ARM, Acorn's RISC OS.

Now, Acorn weren't against looking ahead (I like to think of their software interfaces as forward-compatible, with support for likely developments many years, or even decades, into the future).  One example: when they defined the SWI numbering for the then-new ARM SWI instruction, they included these bits:

Bits 20 - 23
       These are used to identify the particular operating system that the SWI expects to be in the machine. All SWIs used under RISC OS have these bits set to zero. Under RISC iX, bit 23 is set to 1 and all other bits are set to zero.

Linux used to stick to this definition, using the value 9 in these bits for the OABI (SWI 0x9xxxxx), but when it moved to EABI, the operating system bits were quietly dropped and SWI 0 (RISC OS's OS_WriteC - "Writes a character to all of the active output streams") was used.
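
To make the clash concrete, here is a minimal sketch in C of how those bits could be decoded from a raw 32-bit ARM SWI instruction. The function and enum names are hypothetical (they are not from the kernel or from ROLF); the only facts assumed are that the SWI number is the low 24 bits of the instruction, and the bit 20 - 23 values described above.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum swi_os { SWI_OS_RISC_OS, SWI_OS_RISC_IX, SWI_OS_LINUX_OABI, SWI_OS_UNKNOWN };

/* The SWI number is the low 24 bits of the instruction word. */
static uint32_t swi_number(uint32_t instruction)
{
    return instruction & 0x00FFFFFFu;
}

/* Bits 20 - 23 of the SWI number identify the expected operating system. */
static unsigned swi_os_bits(uint32_t instruction)
{
    return (swi_number(instruction) >> 20) & 0xFu;
}

static enum swi_os classify_swi(uint32_t instruction)
{
    switch (swi_os_bits(instruction)) {
    case 0x0: return SWI_OS_RISC_OS;    /* all OS bits zero */
    case 0x8: return SWI_OS_RISC_IX;    /* only bit 23 set */
    case 0x9: return SWI_OS_LINUX_OABI; /* Linux OABI: SWI 0x9xxxxx */
    default:  return SWI_OS_UNKNOWN;
    }
}
```

By this decoding, an OABI call such as 0xEF900004 is recognisably Linux, while the EABI entry instruction 0xEF000000 (SWI 0) is indistinguishable from a RISC OS OS_WriteC, which is exactly the problem.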

Now, this has become a bit of a headache for me, because I'm trying to get Linux to run RISC OS programs using the ROLF compatibility library, which provides native Linux implementations for many RISC OS system calls.  What I want is for the kernel to signal the application when a SWI is encountered in RISC OS code so that the ROLF routines can be executed in user space and the application (or module) can continue on its merry way.

There are a few ways to approach this:
  1. Check the personality of the process for each SWI executed
  2. Redefine the EABI to insist on SWI 0x009fxxxx as the kernel entry SWI (and treat all other SWIs as an illegal instruction or use a special signal code)
  3. Make OABI the standard interface
  4. Use a similar mechanism to the x86 emulator to generate Linux code from the RISC OS code, which won't execute any RISC OS SWIs
  5. Make the personality implementation more object-oriented and allow each process to determine how it should handle SWIs, undefined instructions, data aborts, etc.
Option 1 will work, but will slow down all system calls even more than just moving back to OABI.

Option 2 will make some existing programs incompatible with the kernel, although not if they just use libc to make system calls, or are recompiled against a new unistd.h.

Option 3 would be a step backward for Linux (and support will probably be phased out over time).  Also, it suffers from the same incompatibilities as option 2.

Option 4 will be fully compatible and would allow running of old 26-bit applications but would be much more work and probably quite a lot slower.

Option 5 would be cool, but probably too high impact.

Jan Rinze likes 1, I prefer 2, but I'll probably go with whichever produces a working solution first!