ARM Emulation on ARM - progress
There are still some bugs to be ironed out, and the (unoptimised) speed is less than I'd hoped. Actually it's about 100 times slower than my 200MHz RiscPC!
Still, several features are yet to be implemented: a hash table for cache searching; fixing up jumps to non-local code so that the second and subsequent jumps make no search; compiling the library with gcc optimisation on (which made a surprisingly large difference on the x86 ARM emulator); and, if all else fails, reorganising the instruction identification mechanism to be more efficient.
Update 19:39. Implemented a simple three-instruction hash, for a better than 10x speedup. Now only 10 times slower than the 15-year-old machine!
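For illustration, here is a minimal C sketch of the kind of hashed cache lookup described above. It is not the emulator's actual code: the names (`cache_entry`, `cache_hash`, `cache_insert`, `cache_lookup`), the table size and the direct-mapped layout are all assumptions. The hash itself (shift out the two alignment bits, then mask) is just one plausible choice that happens to compile to a handful of instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size; must be a power of two for the mask to work. */
#define CACHE_BITS 10u
#define CACHE_SIZE (1u << CACHE_BITS)

typedef struct {
    uint32_t guest_pc;   /* emulated address the translated block starts at */
    void    *host_code;  /* translated code, or NULL if the slot is empty */
} cache_entry;

static cache_entry cache[CACHE_SIZE];

/* ARM instructions are word aligned, so drop the low two bits, then mask. */
static uint32_t cache_hash(uint32_t guest_pc)
{
    return (guest_pc >> 2) & (CACHE_SIZE - 1u);
}

/* Direct-mapped insert: a colliding entry is simply overwritten. */
static void cache_insert(uint32_t guest_pc, void *host_code)
{
    assert(host_code != NULL);
    cache_entry *e = &cache[cache_hash(guest_pc)];
    e->guest_pc  = guest_pc;
    e->host_code = host_code;
}

/* Returns the translated code for guest_pc, or NULL on a miss. */
static void *cache_lookup(uint32_t guest_pc)
{
    const cache_entry *e = &cache[cache_hash(guest_pc)];
    return (e->host_code != NULL && e->guest_pc == guest_pc) ? e->host_code
                                                             : NULL;
}
```

A miss here would fall back to whatever slower search or translation step the emulator already has; the point is only that the common case becomes one index calculation and one compare.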
Update 19:55. -O4 optimisation gives a 30% speedup (a test that used to take 167cs now takes 115cs; cs being centiseconds), although that's probably more about turning off some debug output than anything else.
Update 00:21. One fixup approach takes the time down to 78cs (for 1 million times around a FOR...NEXT loop).
Initially, I thought I'd have to clear the cache (which involves a system call, plus whatever the OS does) for each fixup, but I realised that I could use LDR pc, [pc,...] to load the destination address through the data cache: the loaded word initially holds the address of the fixup routine and is later overwritten with the actual code address. I might try some other approaches in the morning. (I just noticed some debug output still in there, so I deleted it and... the time went UP to 106cs. I'm going to bed.)
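The same idea can be sketched in C, with a function pointer standing in for the word that LDR pc, [pc,...] loads. This is only an analogy under stated assumptions, not the generated ARM code: `jump_slot`, `fixup_stub`, `find_block`, `demo_block` and `take_jump` are hypothetical names, and `find_block` stands in for the real cache search. What it shows is that patching touches only a data word, so no instruction has to be rewritten.

```c
#include <stdint.h>

typedef struct jump_slot jump_slot;
typedef uint32_t (*block_fn)(jump_slot *slot, uint32_t arg);

/* The slot plays the role of the literal word read by LDR pc, [pc, ...]. */
struct jump_slot {
    block_fn target;      /* starts as fixup_stub, later the real block */
    uint32_t guest_dest;  /* emulated address this jump goes to */
};

static unsigned search_count;  /* counts how often the slow search runs */

/* Stand-in for a translated block of code. */
static uint32_t demo_block(jump_slot *slot, uint32_t arg)
{
    (void)slot;
    return arg + 1u;
}

/* Stand-in for the (slow) search of the translation cache. */
static block_fn find_block(uint32_t guest_dest)
{
    (void)guest_dest;
    search_count++;
    return demo_block;
}

/* First jump lands here: search once, patch the data word, then continue. */
static uint32_t fixup_stub(jump_slot *slot, uint32_t arg)
{
    slot->target = find_block(slot->guest_dest);
    return slot->target(slot, arg);
}

/* Every jump is just an indirect branch through the slot. */
static uint32_t take_jump(jump_slot *slot, uint32_t arg)
{
    return slot->target(slot, arg);
}
```

With a slot initialised as `{ fixup_stub, 0x8000u }`, the first `take_jump` runs the search and patches the slot, and later calls go straight to the block. The cost is one extra load per jump compared with a patched direct branch, which is the trade for never having to flush the instruction cache.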