Vampire FPU emulation
The very first version of the SoftFPU called femu 0.1 is released and of course I want to try how good (and how fast) it works. It is the first version so no one should expect wonders. It comes in three versions, 030, 040 and 080 (why there is no 020?). In principle I wanted to try all of them but only the 080 Version works on the Vampire. the 030 Version crashes directly, the 040 crashes on first FPU command. So I have stay with the 080 Version.
First again my Mandelbrot program. (sadly the picture output does not work currently, not big endian compatbile 😉 so I can not check if the result is ok)
Mandelbrot results (Runtimes, shorter is better)
|Test||68060/50 MHz FPU||68060/50Mhz SoftFPU||Vampire SoftFPU||68030 68882/50 Mhz FPU||68030 SoftFPU||Vampire Femu 0.10|
|Mandelbrot single precision||0.12 s||9.53 s||3.81 s||2.14 s||38.03 s||11.14 s|
|Mandelbrot double precision||0.15 s||23.72 s||13.37 s||2.31 s||71.87 s||10.31 s|
Thats already rather interesting, it seems the femu calculates everything in double, which makes sense because the FPU always use extended. There was a hint already in the manual that femu needs the double precision math libraries from the system. It seems that femu is just a wrapper to guide the TRAPs to the libraries. Not a bad idea actually. In double it’s even a little bit faster than the FPC SoftFPU, not bad, as guessed in the FPC SoftFPU is a lot of optimization potential 😉
Next the Scimark test:
SciMark2 results (MFlops, higher is better)
Vampire V600 V2+ 128 MB FPU Code femu 0.10Mininum running time = 2.00 seconds Composite Score MFlops: 0.08 FFT Mflops: 0.04 (N=1024) SOR Mflops: 0.12 (100 x 100) MonteCarlo: Mflops: 0.05 Sparse matmult Mflops: 0.09 (N=1000, nz=5000) LU Mflops: 0.09 (M=100, N=100)
Vampire V600 V2+ 128 MB SoftFPU codeMininum running time = 2.00 seconds Composite Score MFlops: 0.06 FFT Mflops: 0.03 (N=1024) SOR Mflops: 0.12 (100 x 100) MonteCarlo: Mflops: 0.03 Sparse matmult Mflops: 0.08 (N=1000, nz=5000) LU Mflops: 0.02 (M=100, N=100)
This SciMark tests are usually done in Double precision so we see the same trend as in the Mandelbrot. It’s very nice that this tests run without any problems already kudos to the coder, it works.
To check for more FPU commands I took out my real time raytracer (ok on Amiga not that real time anymore :-P) changed that to a saving routine of a single picture and compiled for FPU and SoftFPU. It works and the picture looks very nice, as it should be:
As visible in the picture it needed 730 s to render that picture (as I said, not really realtime) with fpc SoftFPU it needs 280s the 68030/68882/50 Mhz needs 224 s. (sidemark on my AROS i386 box that image needs 0.2 s) and for all cases the picture looks good. The femu does what it promised, not actually very fast but reliable. A little bit disturbing of course is the freezing mouse, when the TRAPs appear. But here the coder of femu can’t do anything, as far as I understood he works closely together with the Vampire developer, so maybe he get a faster (or even not-) TRAP mechanism for the emulation in a later Vampire Firmware.
At the moment I still would prefer to use FPCs SoftFPU for MUIMapparium because there the Mouse will not freeze so the GUI feels more snappy.