Better timings
Using gprof, I found that:
- The image translation from RISC OS Sprite to screen mode was unoptimised (it made two unoptimised function calls per pixel). I improved this, but it only made a small overall difference, since it happens after the emulator has finished the rendering work.
- gprof is useless for following what's going on in the emulator-generated code.
With that six- or seven-line code change, the BeagleBoard now renders the celtic_knot3 file in a little over 37s, down from 85s (roughly 2.3 times faster).
x86 PC: 14s, BeagleBoard: 37s
The branch fixup code currently avoids having to clear the code cache by calling the fixup code using LDR pc, [pc, #...], having stored the code address in a scratch register. That way, only the word loaded by that instruction has to be changed to point to the generated code instead of the fixup routine, and the next time the instruction is reached, the fixup routine will be bypassed (although the setup for the call remains).
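The trick described above can be modelled in plain C as a minimal sketch. Everything here is hypothetical and not taken from the emulator source: a function pointer (`slot`) stands in for the literal word that LDR pc, [pc, #...] loads, `fixup_routine` stands in for the real fixup code, and `generated_block` stands in for the translated code. The point it illustrates is that only a data word changes, so no instruction bytes are rewritten and nothing has to be flushed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one record per emulated branch site. */
typedef struct branch_site branch_site;
typedef int (*code_ptr)(branch_site *site, int arg);

struct branch_site {
    code_ptr slot;   /* stands in for the word read by LDR pc, [pc, #...] */
    int fixups_run;  /* counts how often the slow path was taken */
};

/* Stands in for the generated code the branch should eventually reach. */
static int generated_block(branch_site *site, int arg)
{
    (void)site;
    return arg + 1;
}

/* Slow path: repoint the slot at the generated code, then continue there.
   Only data is modified, so the instruction stream stays untouched. */
static int fixup_routine(branch_site *site, int arg)
{
    assert(site != NULL);
    site->fixups_run++;
    site->slot = generated_block;
    return site->slot(site, arg);
}

/* The "branch": always an indirect jump through the slot. */
static int take_branch(branch_site *site, int arg)
{
    return site->slot(site, arg);
}
```

With a site initialised as `{ fixup_routine, 0 }`, the first `take_branch` call goes through the fixup routine and every later call goes straight to `generated_block`, while still paying for the indirect jump each time.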
Further possible improvements to be tried:
- Modify the first instruction of the fixed up code to be a proper branch to the address, as well as the current change; if the code falls out of the code cache by itself, the next time the code is run it will be quicker.
- Clear the ARM code cache explicitly, so that the faster code will be called straight away. This may be slower, due to the overhead of a system call the first time the branch occurs.
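A rough sketch of what those two ideas involve, assuming a GCC or Clang toolchain: `encode_arm_branch` builds an unconditional ARM-mode B instruction for a given site and target, and `patch_word` stores it and then calls the compiler builtin `__builtin___clear_cache` on that word. Both function names are made up for illustration and are not from the emulator source. On ARM Linux the builtin typically ends up in a system call, which is the overhead mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode an unconditional ARM-mode "B target" for an instruction located
   at address `site`. Returns 0 if the target is misaligned or out of the
   branch's reach (0 is never a valid B encoding). */
static uint32_t encode_arm_branch(uint32_t site, uint32_t target)
{
    /* In ARM state the pc reads as the instruction's address plus 8. */
    int64_t offset = (int64_t)target - ((int64_t)site + 8);

    if ((offset & 3) != 0 || offset < -(1 << 25) || offset >= (1 << 25))
        return 0;

    /* Condition AL (0xE), opcode 101, L = 0, then the 24-bit word offset. */
    return 0xEA000000u | (uint32_t)(((uint64_t)offset >> 2) & 0x00FFFFFFu);
}

/* Overwrite one instruction word and ask the toolchain to make the change
   visible to instruction fetch. */
static void patch_word(uint32_t *where, uint32_t insn)
{
    assert(where != NULL);
    *where = insn;
    __builtin___clear_cache((char *)where, (char *)(where + 1));
}
```

For example, `encode_arm_branch(0x8000, 0x8000)` gives `0xEAFFFFFE`, the familiar branch-to-self encoding. Skipping the `__builtin___clear_cache` call corresponds to the first option in the list, where the patched instruction only takes effect once the stale copy has dropped out of the cache by itself.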
If you compile with -DSTANDALONE, you get an executable that takes the name of a file containing ARM instructions and runs them (only really useful with gdb, to see what's going on).
The handling of unaligned memory accesses is still incorrect (except on my custom kernel, which fixes up the accesses in the old-fashioned way).
Update: http://ro-lf.svn.sourceforge.net/viewvc/ro-lf/ROLF/rolf/Libs/Compatibility?view=tar downloads the whole ROLF compatibility library, including the include files and disassembly code. (I can't check this at the moment, but...) The following should generate an executable on an ARM system:
tar xf ro-lf-Compatibility.tar.gz
cd Libs/Compatibility/ # Possibly other subdirectory
touch config.h # Usually generated by the ROLF configure routine
gcc -o standalone_emulator arm_arm_emulator.c arm_d*.c -DARCH_ARM -DSTANDALONE -DDISASSEMBLE -Iincludes -I.
2 Comments:
Hi Simon!
I've just found Rolf on S/F - looks very nice!
I was just wondering...
I'm a real "public-domain weenie" (fan of PD software) so just thought I'd ask - would you consider releasing Rolf as "public domain" (or CC0)? (Hope you're not offended by my asking - I mean no offense).
It's just that there are *so many* GPL GUI apps "out there", and the "public-domain world" could **really** do with a beautiful app like this!
It would also make Rolf **really stand out** from the rest of the pack of apps like this.
I'm not much of a coder myself, but I like to support PD software however I can (and I bought a book about a recently-released PD subset-of-C compiler to show my support for the app author).
Thanks for your time.
I imagine the answer will probably be "no", but just thought I'd ask.... :)
Keep up the great work! Bye for now -
- Andy
(andy dot elvey at paradise dot net dot nz)
Hi Andy,
Thanks for the nice comments about ROLF. (Have you actually tried to use it?)
I must admit I don't understand the question (I'm working on finding out about the CC license and PD). Is there something about the GPL that stops you (or anyone else) from building a system and releasing it as PD? The source is less than a megabyte, and you could (as far as I know) just include a link to SourceForge where the source is? (e.g. http://ro-lf.svn.sourceforge.net/viewvc/ro-lf/ROLF/rolf/?view=tar&pathrev=236). What would you like to include in it?
tl;dr: Not no, just why?