Vampire V2.7 with FPU

Posted by ALB42 on 2 March 2018, 8 Comments

The new Vampire firmware V2.7 is released, and it contains a hardware FPU in the FPGA (some rarely used 68881/68882 commands are still emulated, as the 68040/68060 do as well). That is very nice for my MUIMapparium (you remember the problem?). Of course I flashed the new version right away. First the bad news: it’s VERY unstable for me. It’s said that some early Vampire cards have a fault which makes the power supply to the FPGA unreliable and that they need some soldering… something like this, and I’m affected by that. So it’s a little bit annoying to work with, because there are drawing errors on the screen and it crashes often. I reduced the screen resolution and ended all background programs, which made it much better. Nevertheless, to really try it out I have to wait until someone fixes my card. I can’t do that myself; the most scary thing in the world is a coder with a screwdriver, let alone soldering equipment :-P.
But this was not the topic of this post. I tried the MUIMapparium FPU version on my new Vampire 2.7. Good news: it starts and does not crash. Bad news: the map stays empty. The same executable worked well with femu on the old Gold 2 (I checked that specifically before I flashed the new core) and still works in UAE. The FPU calculation as such seems to work, because the mouse pointer movement shows reasonable coordinates. I was a little bit surprised that even the GUI in the map window was gone. I checked the code, and ah yes, there is a tiny floating point calculation, fine, let’s see what that is. And my guess was right, it is the floating point calculation: the button size is calculated as font size * 1.2 to have a little bit more space around it. After adding some debug output it seems that the floating point calculation works well but the rounding always returns zero. So I wrote a little test program to test the rounding. Here are the outputs of the test program in my setups:

Vampire 2.7:
a := 5 = 5
a * 1.0 = 5.000000000E+00
Round(a * 1.0) = 0
b := a * 1.3 =  6.499999523E+00
Round(b) = 0
Ceil(b) = 7
Floor(b) = 6
Floor(b) = 6
press enter
end

Amiga 1200/030/68882:
a := 5 = 5
a * 1.0 = 5.000000000E+00
Round(a * 1.0) = 5
b := a * 1.3 =  6.500000000E+00
Round(b) = 6
Ceil(b) = 7
Floor(b) = 6
Floor(b) = 6
press enter
end

UAE 68060 emul:
a := 5 = 5
a * 1.0 = 5.000000000E+00
Round(a * 1.0) = 5
b := a * 1.3 =  6.500000000E+00
Round(b) = 6
Ceil(b) = 7
Floor(b) = 6
Floor(b) = 6
press enter
end

Well, who spots the difference? Funny that only Round() has this problem but Ceil, Trunc and Floor do not. This also explains why MUIMapparium shows no maps at all, if everything is rounded to 0. OK, I have to wait until they fix that… yeah, I could replace all Round(a) by Floor(a+0.5), but why should I do that? Something is clearly broken in the FPGA here.
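A side note on the Floor(a+0.5) workaround: it is not an exact replacement for Round(), because FreePascal's Round() rounds ties to the even neighbour (banker's rounding), which is why Round(6.5) prints 6 in the working setups. Here is a minimal sketch of the difference, written in Python only for illustration (Python's built-in round() also rounds ties to even; the helper names are made up for this example):

```python
import math

def round_half_even(x: float) -> int:
    # Python's built-in round() rounds ties to the even neighbour,
    # the same tie rule as FreePascal's Round().
    return round(x)

def round_half_up(x: float) -> int:
    # The workaround from the post: Floor(x + 0.5).
    return math.floor(x + 0.5)

# Compare both variants for a few values, including two exact ties.
for value in (5.0, 6.4, 6.5, 7.5):
    print(value, round_half_even(value), round_half_up(value))
```

The two variants agree everywhere except on exact ties such as 6.5, where banker's rounding gives 6 and Floor(x+0.5) gives 7. For button sizes that one pixel hardly matters, but it is worth knowing before replacing Round() everywhere.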
 
Want to try it on your own computer? Exe for m68k with FPU: TestFPU, and the source: TestFPU.pas

About this year … 2017

Posted by ALB42 on 13 January 2018, No Comments

A little bit delayed, but as always here is the summary of the last year. Due to personal circumstances I have only little time for these things and also not much motivation. I hope this will clear up throughout the year.
 

But back to the review. The year started as suggested in the last summary. The MUI-LCL was accepted into the official Lazarus repository, and I also made some (small) improvements to it. Sadly I keep hitting the limits of what MUI can do, so there was not much progress here.
 

Charlie improved FreePascal, and Frank Wille improved vasm and vlink, to support section linking for the Amiga platforms. This decreases the file size very much. Finally vlink and vasm became the default assembler and linker for Amiga68k, MorphOS and AmigaOS4.
 
FPC for Amiga systems got its own subforum at Amigacoding.de. In the beginning there was some discussion, but sadly it fell asleep again.
 
Magorium played around with SDL on AROS with FPC, and I also used it to bring a gaming tutorial source to AROS. I’m always interested in nice 3D routines or even realtime raytracers. I did some coding on that before, and also this year.
 
Over the Easter holidays I was on a trip, and I took my Raspberry Pi with an attached touchscreen and keyboard with me, of course with AROS ARM installed. So I was able to code in the evenings in the hotel. I started a new approach for Mapparium. Mapparium is a nice program (I use it mainly to display my bicycle tours recorded with my GPS device or iPhone), but the GUI is still rather clumsy because MUI/Zune does not like this direct placement of positions. On a native 68k Amiga it becomes very slow because of the huge LCL layer. I decided to write a native MUI/Zune application, as I did before with the ZuPaPlayer, again to learn a little bit about how to code MUI and also to fix some serious problems I found in Mapparium. The new version is called MUIMapparium and went through several releases up to the current 0.5 release. MUIMapparium got its own release page. It still does not have the same feature set as the LCL Mapparium, but it runs much more smoothly, really nicely even on a native 68k Amiga, especially on an Amiga 600 with Vampire.
 
Speaking of Vampire, I bought a complete Amiga 600 with a Vampire V2 card, a really nice piece of hardware. Very fast (only the card keeps popping off the chip it should be attached to). The only drawback is that there is no FPU included in the FPGA emulation (so it’s more like a 68020/68030 than a 68040/68060, which have a built-in FPU). For my programs that’s not a big problem, there is a SoftFloat option in FreePascal and it works really well. I never thought about the speed of those routines. Yes, I knew they are much slower than a native FPU calculation, but I had no idea how much. Later someone released an FPU emulation software, femu, which improved the situation a lot. But there is still a lot to improve. I heard that the next core release should include some basic FPU functions and that femu will be more closely connected to the FPGA. I guess they built a way to avoid the trap which is raised on every unknown FPU command and eats up a lot of time. But I have no idea if this is true and if this will ever be released. In fact I doubt it a little bit, it seems the whole project is sleeping or so. Just to illustrate, these are the core releases for the Vampire:
GOLD2       2017-01-23
GOLD        2016-09-05
SILVER9     2016-08-03
SILVER8     2016-07-29
SILVER7     2016-07-10
SILVER6     2016-05-16
SILVER5     2016-05-06
SILVER3a    2016-04-04
SILVER3     2016-04-03
SILVER2     2016-02-28
SILVER1     2015-12-25

That means the last FPGA core release was almost a year ago, and screenshots/show-offs of the coming Gold 2.7 or Gold 3 have been around for several months already, but there is no sign of an actual release. (To be complete: there is a Gold 2.5 release, but the FPGA core is not changed there, same build number.)
 
My Blizzard 1260 broke and I sent it in for repair (sadly it’s still unknown what exactly is wrong, MACH chips? I have no knowledge about such stuff). For the time without a proper turbo card for my A1200 I bought a Blizzard 1230IV, which works very nicely. Another thing the Vampire is missing is an MMU (at least a Motorola-compatible MMU), so until now Linux or NetBSD will not work on a Vampire Amiga. But on this Blizzard 1230 it runs, though very slowly of course. I did some qemu-m68k stuff on my home server to run automatic tests of the FreePascal 68k compiler, which Charlie is improving all the time. My plan was to also try it on a real Amiga with Linux. Nice, but it really takes ages to do something like compiling or installing stuff. Maybe when my Blizzard 1260 is back I will try that again.
 
I read an interesting article about a C compiler in a web browser (in JavaScript) and got the idea to promote the FreePascal compiler for Amiga systems a little bit more. It should be possible to create a page with a simple text editor on it which can compile programs for all our beloved Amiga systems from Pascal sources. I don’t have much PHP knowledge, but for that it’s more than enough. The Online Compiler was born. This project even got the attention of the big German tech news site Heise, which published an article about it. The Amiga community also showed some interest by using it, and the biggest Amiga journal today, “Amiga Future”, wrote an article about Pascal and also published an interview with me.
 
To keep the enemies separated I kept the Atari online compiler on a different page. Both are still nicely in use by some people.
 
Currently I’m trying to find some motivation to continue with MUIMapparium. I already improved it a lot, especially the route finding, but it’s still not ready for release. In FPC I was working on the basic threading stuff, like events, which were still missing. It was triggered by a change in the FCL package which needs the event functionality. It still needs improvement.

Unify ASL

Posted by ALB42 on 21 August 2017, No Comments

I checked the ASL.library units of MorphOS and Amiga 68k against the official C includes of the SDKs. Especially the TFileRequester structure always caused a little bit of trouble, because the old Amiga asl unit still used the old field names rf_*, but since V38 of the library these fields are all named fr_*, with some other tiny name differences (e.g. Dir vs. Drawer). In AROS and AmigaOS4 I only added the newer names because there I do not have any “old” code. This resulted in a big inconsistency between the platforms and required ifdefs in the final programs. To prevent a direct breakage of the existing sources (e.g. LCL, MUIMapparium), Amiga and MorphOS now have both field names in the structure (as a case, i.e. a variant record). The aim is to remove those ifdefs from the sources.
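The idea of keeping both field names on the same storage can be sketched outside of Pascal as well. The following is only an illustration in Python's ctypes, not the real unit: the structure is reduced to a single field, the class names are made up, and only the rf_Dir / fr_Drawer pair from the text is shown.

```python
import ctypes

class DrawerField(ctypes.Union):
    # Both names overlay the same memory, like the "case" part of a
    # Pascal variant record.
    _fields_ = [
        ("fr_Drawer", ctypes.c_char_p),  # name used since V38
        ("rf_Dir", ctypes.c_char_p),     # old name, kept for existing sources
    ]

class FileRequesterSketch(ctypes.Structure):
    # Hypothetical, heavily reduced stand-in for TFileRequester.
    _anonymous_ = ("drawer",)
    _fields_ = [("drawer", DrawerField)]

req = FileRequesterSketch()
req.rf_Dir = b"RAM:"      # old code writes through the old name
print(req.fr_Drawer)      # new code reads the same value through the new name
```

Because both names share one slot, the structure does not grow, and old and new sources compile against the same layout without ifdefs.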

More colors

Posted by ALB42 on 20 August 2017, No Comments

Due to user wishes one can now change the color of each track and route individually, and the color is also saved to the GPX file. Routes and tracks in GPX have an extension area where you can add your own properties without violating the GPX format, which is very nice. The route and track property windows now have a color button next to the name to choose the color of the feature (you have to save it before the change is visible in the map).
In principle it would be nice to have the color next to the name in the track/route list as a little colored square (like I did for the track plot axes). But I’m not sure if and how that is possible at all for such a list without creating a self-drawn one.
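To illustrate the GPX extension area mentioned above, here is a small sketch in Python. GPX 1.1 allows an extensions element inside trk and rte; everything else here is an assumption for the example: the extension namespace, the element name color and the helper names are hypothetical and are not what MUIMapparium actually writes.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"
# Hypothetical namespace for the private color property.
EXT_NS = "http://example.org/gpx-color"

def track_with_color(name, color_hex):
    """Build a minimal GPX document with one track carrying a color."""
    ET.register_namespace("", GPX_NS)
    ET.register_namespace("ext", EXT_NS)
    gpx = ET.Element(f"{{{GPX_NS}}}gpx", version="1.1", creator="sketch")
    trk = ET.SubElement(gpx, f"{{{GPX_NS}}}trk")
    ET.SubElement(trk, f"{{{GPX_NS}}}name").text = name
    # Private data goes into <extensions>, other readers simply skip it.
    extensions = ET.SubElement(trk, f"{{{GPX_NS}}}extensions")
    ET.SubElement(extensions, f"{{{EXT_NS}}}color").text = color_hex
    return gpx

def read_color(gpx):
    """Return the stored color, or None if the file has no such property."""
    path = f"{{{GPX_NS}}}trk/{{{GPX_NS}}}extensions/{{{EXT_NS}}}color"
    node = gpx.find(path)
    return None if node is None else node.text

doc = track_with_color("Bicycle tour", "#ff0000")
print(ET.tostring(doc, encoding="unicode"))
print(read_color(doc))
```

A reader that does not know the private namespace still sees a valid track, which is the nice property of the extension area.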
 
Besides that, MUIMapparium now remembers the position and open status of the Statistics window. It seems some users like to keep it always open to observe the loading status or something like this.
 

Colored routes and Tracks in MUIMapparium

Vampire FPU emulation

Posted by ALB42 on 29 July 2017, 2 Comments

The very first version of the SoftFPU called femu, version 0.1, is released, and of course I want to try how well (and how fast) it works. It is the first version, so no one should expect wonders. It comes in three versions: 030, 040 and 080 (why is there no 020?). In principle I wanted to try all of them, but only the 080 version works on the Vampire. The 030 version crashes directly, the 040 version crashes on the first FPU command. So I have to stay with the 080 version.
First again my Mandelbrot program (sadly the picture output does not work currently, not big endian compatible 😉, so I cannot check if the result is OK).

Mandelbrot results (Runtimes, shorter is better)

System                     Mandelbrot single precision   Mandelbrot double precision
68060/50 MHz FPU           0.12 s                        0.15 s
68060/50 MHz SoftFPU       9.53 s                        23.72 s
Vampire SoftFPU            3.81 s                        13.37 s
68030 68882/50 MHz FPU     2.14 s                        2.31 s
68030 SoftFPU              38.03 s                       71.87 s
Vampire Femu 0.10          11.14 s                       10.31 s

That’s already rather interesting. It seems femu calculates everything in double, which makes sense because the FPU always uses extended precision. There was already a hint in the manual that femu needs the double precision math libraries from the system. It seems that femu is just a wrapper which routes the TRAPs to the libraries. Not a bad idea actually. In double it’s even a little bit faster than the FPC SoftFPU, not bad. As guessed, there is a lot of optimization potential in the FPC SoftFPU 😉

Next the Scimark test:

SciMark2 results (MFlops, higher is better)

Vampire V600 V2+ 128 MB FPU Code femu 0.10
Mininum running time = 2.00 seconds
Composite Score MFlops:     0.08
FFT             Mflops:     0.04    (N=1024)
SOR             Mflops:     0.12    (100 x 100)
MonteCarlo:     Mflops:     0.05
Sparse matmult  Mflops:     0.09    (N=1000, nz=5000)
LU              Mflops:     0.09    (M=100, N=100)

Vampire V600 V2+ 128 MB SoftFPU code
Mininum running time = 2.00 seconds
Composite Score MFlops:     0.06
FFT             Mflops:     0.03    (N=1024)
SOR             Mflops:     0.12    (100 x 100)
MonteCarlo:     Mflops:     0.03
Sparse matmult  Mflops:     0.08    (N=1000, nz=5000)
LU              Mflops:     0.02    (M=100, N=100)

These SciMark tests are usually done in double precision, so we see the same trend as in the Mandelbrot test. It’s very nice that these tests already run without any problems. Kudos to the coder, it works.

To check more FPU commands I took out my real time raytracer (OK, on Amiga not that real time anymore :-P), changed it to save a single picture, and compiled it for FPU and SoftFPU. It works and the picture looks very nice, as it should:

TraceRay FPU on Vampire with femu 0.10

As visible in the picture, it needed 730 s to render that picture (as I said, not really realtime). With the FPC SoftFPU it needs 280 s, and the 68030/68882/50 MHz needs 224 s (side note: on my AROS i386 box that image needs 0.2 s). In all cases the picture looks good. The femu does what it promised, not actually very fast but reliable. A little bit disturbing of course is the freezing mouse when the TRAPs occur. But here the coder of femu can’t do anything. As far as I understood he works closely together with the Vampire developers, so maybe he gets a faster TRAP mechanism (or even one without TRAPs) for the emulation in a later Vampire firmware.
At the moment I would still prefer to use FPC’s SoftFPU for MUIMapparium, because there the mouse does not freeze, so the GUI feels more snappy.

Show me some routes

Posted by ALB42 on 28 June 2017, No Comments

Working a little bit on MUIMapparium again. I want to include some more features before the next release 0.5. I added a marker to the plot, which is then also shown as a little triangle in the map. I’m not really satisfied with the colors and the visibility of the currently open track and of the marker for the point; maybe I will find a better solution later. The marker is currently only one pixel wide. In principle it would be possible to make it 2 or 3 pixels wide, but that looked a little bit too massive.

MUIMapparium with Track Marker

Other things I wanted to include in the next version are calculated routes and maybe also photos with EXIF tags. When this is done, MUIMapparium has the same feature set as Mapparium, even a little bit more. I started with routes, which is not very difficult; some routines can even be reused from the track drawing and so on. Currently it only loads routes from GPX files (created by Mapparium for example). The creation of a new route will be the next step.

MUIMapparium with a Route


Also visible in the image is the new option to disable the drawing of all markers, tracks and routes. Helpful if you have many items and want to concentrate on the map (or just to increase the speed).

To FPU or not to FPU

Posted by ALB42 on 28 May 2017, No Comments

When I work on MUIMapparium I usually only work in Linux and test on AROS Linux-hosted, which is very convenient and fast. When I started MUIMapparium I also tested at the end on every platform whether it works and how fast it is. For the last two releases I skipped this part due to lack of time.

But yesterday I tried MUIMapparium on my Amiga 600 with Vampire and was shocked how slowly it behaves. Moving the map is just not usable, with around a second of reaction time. I downloaded/compiled older versions to check when this problem appeared. Deep in the back of my brain I already guessed that the fixed position calculation could be the reason (see here). That’s pure floating point calculation, and a lot of it. I tested that on the initial implementation and it seemed not too slow, because for simple map moving and zooming this conversion has to be done only two or three times, so the influence is not very big.

So why is it so slow now? The difference is that before I tested with a bare MUIMapparium without any markers or tracks loaded. A marker only adds a single conversion to the list. But tracks need a conversion for every recorded (and maybe drawn) point. Remember that most GPS devices measure the position once per second, which means for an hour’s walk you get about 3600 points (usually the GPS already strips the “not moved” points, nevertheless you get around 1000 points). For NG Amigas with their massive computing power, especially on the FPU side, this is not much of a problem: 1000 FPU calculations at some hundred MFlops are done in a few milliseconds.
But on the Vampire it’s a different story: no FPU, so it has to use the SoftFPU emulation of FreePascal. This raised the question: how fast is the SoftFPU emulation on a Vampire in comparison to a real 68060/50 MHz FPU? The Vampire integer performance is much higher than that of the 68060 (around twice as fast, see here), but for an emulated FPU a lot of code is needed to emulate it correctly.
I used two tests for that: a simple Mandelbrot algorithm in single and double precision, and the well-known SciMark from NIST, each compiled either with the FPC SoftFPU emulation or with the 68881 FPU support.

Mandelbrot results (Runtimes, shorter is better)

System                     Mandelbrot single precision   Mandelbrot double precision
68060/50 MHz FPU           0.12 s                        0.15 s
68060/50 MHz SoftFPU       9.53 s                        23.72 s
Vampire SoftFPU            3.81 s                        13.37 s

When comparing the SoftFPU times of the 060 and the Vampire you can see roughly the factor of 2-3 I experienced before (here about 2.5 in single and 1.8 in double precision). But the (often called “very slow”) 68060 FPU leaves the SoftFPU Vampire far behind in the dust. (In fact the dust has already settled again before the SoftFPU finishes the calculation.) Of course the error bars for the calculations with FPU are huge, because the time is too short for a reliable measurement, but a bigger calculation would just take ages with SoftFPU 😉 and the trend is nicely visible.
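For those who like to see the factors spelled out, here is a small Python sketch that just divides the runtimes from the Mandelbrot table. The dictionary keys and the helper name are made up for the example; the numbers are the measured ones from above.

```python
# Mandelbrot runtimes in seconds, taken from the table above.
runtimes = {
    "single": {"060 FPU": 0.12, "060 SoftFPU": 9.53, "Vampire SoftFPU": 3.81},
    "double": {"060 FPU": 0.15, "060 SoftFPU": 23.72, "Vampire SoftFPU": 13.37},
}

def slowdown(precision, slower, faster):
    """How many times longer 'slower' runs compared to 'faster'."""
    row = runtimes[precision]
    return row[slower] / row[faster]

for precision in runtimes:
    vampire_vs_060_soft = slowdown(precision, "060 SoftFPU", "Vampire SoftFPU")
    vampire_vs_060_fpu = slowdown(precision, "Vampire SoftFPU", "060 FPU")
    print(f"{precision}: Vampire SoftFPU is {vampire_vs_060_soft:.1f}x faster "
          f"than 060 SoftFPU, but {vampire_vs_060_fpu:.0f}x slower than the 060 FPU")
```

So the Vampire SoftFPU code is roughly 32 times (single) to 89 times (double) slower than the real 68060 FPU, with the caveat about the huge error bars on the short FPU runtimes.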

Next is SciMark. It uses various real-life floating point calculations like FFT, matrix multiplication and Monte Carlo simulation. If you work in science you know that stuff, if not, just believe me that this is what science programs do all day 😉

SciMark2 results (MFlops, higher is better)


Vampire V600 V2+ 128 MB SoftFPU code
**                                                               **
** SciMark2a Numeric Benchmark, see http://math.nist.gov/scimark **
**                                                               **
** Delphi Port, see http://code.google.com/p/scimark-delphi/     **
**                                                               **
Mininum running time = 2.00 seconds
Composite Score MFlops:     0.06
FFT             Mflops:     0.03    (N=1024)
SOR             Mflops:     0.12    (100 x 100)
MonteCarlo:     Mflops:     0.03
Sparse matmult  Mflops:     0.08    (N=1000, nz=5000)
LU              Mflops:     0.02    (M=100, N=100)

Amiga1200 68060/50 FPU code
**                                                               **
** SciMark2a Numeric Benchmark, see http://math.nist.gov/scimark **
**                                                               **
** Delphi Port, see http://code.google.com/p/scimark-delphi/     **
**                                                               **
Mininum running time = 2.00 seconds
Composite Score MFlops:     2.26
FFT             Mflops:     1.18    (N=1024)
SOR             Mflops:     5.05    (100 x 100)
MonteCarlo:     Mflops:     0.86
Sparse matmult  Mflops:     1.81    (N=1000, nz=5000)
LU              Mflops:     2.41    (M=100, N=100)

So it just shows the same trend. Attention: do not compare these MFlops with the theoretical MFlops most speed tests show you (like SysInfo). You can see how differently the tests behave. It depends very strongly on which commands are used and how much memory bandwidth is needed.

In conclusion, it shows really nicely why MUIMapparium with a track is so slow on a Vampire currently: because of the slow SoftFPU. Very sad that the Vampire still lacks proper FPU support. We (ChainQ and I) believe that it is possible to optimize the SoftFPU, maybe to make it 50% faster or even twice as fast. Or let’s aim for the stars: 10 times faster than now (I do not believe that is even close to possible). It would still be around 5 times slower than a 68060/50 MHz FPU. This is for the people who believe a SoftFPU implementation could be a replacement for a native FPU in the FPGA.
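The "around 5 times" estimate can be checked against the measurements with a few lines of Python. This is only back-of-the-envelope arithmetic under the assumption of a uniform 10x speedup of the SoftFPU code; the function name and the case labels are made up for the example.

```python
def gap_after_speedup(soft_value, fpu_value, speedup, higher_is_better=False):
    """Factor by which the real FPU stays ahead after a hypothetical speedup."""
    if higher_is_better:                       # MFlops
        return fpu_value / (soft_value * speedup)
    return (soft_value / speedup) / fpu_value  # runtimes in seconds

# (Vampire SoftFPU value, 68060/50 MHz FPU value, higher is better?)
cases = {
    "Mandelbrot single (s)": (3.81, 0.12, False),
    "Mandelbrot double (s)": (13.37, 0.15, False),
    "SciMark composite (MFlops)": (0.06, 2.26, True),
}

for name, (soft, fpu, higher) in cases.items():
    gap = gap_after_speedup(soft, fpu, speedup=10, higher_is_better=higher)
    print(f"{name}: 68060 FPU still {gap:.1f}x faster")
```

Depending on the test, the remaining gap comes out between roughly 3 and 9 times, so "around 5 times slower" is a fair middle value.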

That means: if it reacts very slowly on the Vampire, just remove the track 😉 I will work on this: reduce the needed calculations (by using more memory), check at which places I could possibly go down to single precision (not much hope there ;-)) and of course reduce the number of points, in principle a LOD depending on the zoom level.
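One possible way to reduce the number of points depending on the zoom level is to drop every track point that would land almost on top of the previously drawn one. The following Python sketch shows that idea only; it is not how MUIMapparium does it, and the function name and the 3 pixel threshold are assumptions for the example.

```python
def thin_track(pixels, min_dist=3.0):
    """Keep a point only if it is at least min_dist pixels away from the
    last kept point. 'pixels' are (x, y) positions at the current zoom."""
    if not pixels:
        return []
    kept = [pixels[0]]                    # always keep the start
    limit = min_dist * min_dist           # compare squared distances, no sqrt
    for x, y in pixels[1:-1]:
        last_x, last_y = kept[-1]
        if (x - last_x) ** 2 + (y - last_y) ** 2 >= limit:
            kept.append((x, y))
    if len(pixels) > 1:
        kept.append(pixels[-1])           # always keep the end
    return kept

# 1000 points that are only half a pixel apart at this zoom level.
track = [(i * 0.5, 0.0) for i in range(1000)]
print(len(track), "->", len(thin_track(track)))
```

The thinned list can be cached per zoom level (this is the "more memory" part), so the expensive floating point conversion only has to be repeated for the points that are actually drawn.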

P.S.
If you want to test SciMark, you can download the FPU and SoftFPU exe from my server: SciMark FPU Version, SciMark SoftFPU Version. (I would be very interested in 68881/2 results.)

Some Curves

Posted by ALB42 on 18 May 2017, No Comments

LCL has powerful packages, for example the Chart component (or the Edit component with Highlighter). MUI already has some, but they are not as powerful and not available for all platforms (especially ARM-AROS and AROS64). I try to stick to the included basic classes to keep it compatible.

I used the very powerful TAChart component in Mapparium to show the height/speed trace of a track. Of course I also want to have that for MUIMapparium, so I started to implement a plot class for MUI. Not as powerful, but already very nice, with two Y-axes, zoom and autoscale.

MUI Plot component start

It already works rather nicely and, as a first approach, can be used like this.

I already designed it in a very abstract way, in principle a TPaintBox for MUI with the Plot class based on it. That means I can use them in other programs as well.

Locale Localization position

Posted by ALB42 on 8 May 2017, No Comments

Working on a very old bug. I’m not sure if someone noticed it, at least nobody reported it. The coordinate to pixel conversion was not very precise because it used an average resolution for every tile. This works well for higher zoom levels, where the resolution does not change much inside a tile. But for lower zoom levels, especially the whole world picture, this is certainly not right and easily visible when using waypoints. See for example Mapparium 0.6 on the right side of the image: all waypoints (and tracks) are shifted to the north. The solution was not very difficult but needed some thinking, basically a rounding error and precision problem.
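The effect can be reproduced with a few lines of Python. This is only a sketch under assumptions that are not stated in the post: the usual Web Mercator "slippy map" tiling with 256 pixel tiles, and "average resolution" read as a linear interpolation of the latitude between the top and bottom edge of a tile. All function names are made up for the example.

```python
import math

TILE_SIZE = 256  # assumed tile size in pixels

def lat_to_pixel_y(lat_deg, zoom):
    """Exact Web Mercator conversion from latitude to global pixel row."""
    lat = math.radians(lat_deg)
    n = 2 ** zoom
    return (1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n * TILE_SIZE

def tile_y_to_lat(tile_y, zoom):
    """Latitude of the top edge of tile row tile_y."""
    n = 2 ** zoom
    return math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * tile_y / n))))

def lat_to_pixel_y_linear(lat_deg, zoom):
    """Same conversion, but with one average resolution per tile."""
    tile_y = int(lat_to_pixel_y(lat_deg, zoom) // TILE_SIZE)
    top = tile_y_to_lat(tile_y, zoom)
    bottom = tile_y_to_lat(tile_y + 1, zoom)
    fraction = (top - lat_deg) / (top - bottom)
    return (tile_y + fraction) * TILE_SIZE

berlin_lat = 52.5
for zoom in (0, 4, 12):
    exact = lat_to_pixel_y(berlin_lat, zoom)
    linear = lat_to_pixel_y_linear(berlin_lat, zoom)
    print(zoom, round(exact, 2), round(linear, 2))
```

On the whole world tile (zoom 0) the linear version puts Berlin many pixels too far up, i.e. shifted to the north, while at zoom 12 the difference is far below one pixel. That matches the description: fine for higher zoom levels, clearly wrong for the world picture.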

MUIMapparium (Left) and Mapparium (right) Waypoint position comparison and german locale

I also started to play with localization. I never did that before, especially not in FreePascal, but it’s not very complicated, just diligent work to replace all strings. So the next version will also be available in German (and maybe later in some more languages, at least I got an offer for a French localization). There is one small problem with that: there is no locale library unit in FreePascal for AmigaOS4, so either I make some defines to turn it off for OS4 or I implement the library unit.

Symbols

Posted by ALB42 on 30 April 2017, No Comments

Yesterday I was at the Amiga and retro computer meeting here in Berlin. It’s always nice to be there and talk to the people. ChainQ was also there, so we worked a little bit on this strange WBStartup problem on MorphOS. As some already mentioned in the comments to MUIMapparium, it is not possible to start FPC-compiled programs on MorphOS via an icon. The programs do not start, they freeze right away. The reason is simple: when starting via icon, the WBStartup message is sent to the application via a MsgPort in the Process structure. FreePascal waits for this message, which never arrives. After some trying we found that the needed structures and functions in FPC are in good shape, so they are not the root of this problem. The problem appeared when ChainQ changed the startup code from an assembler one to a pure Pascal startup code. So my guess was that it somehow relates to the symbol __abox__ needed inside the executable, which marks it as a MorphOS executable. Of course I do not have much knowledge about it, it was just a wild guess. But it turned out to be quite right. The first problem was that I used an older version of vlink, which seems to have a bug and removed this symbol (because it is never used). After a new compilation of vlink the __abox__ symbol is there, but it still does not work. ChainQ knows a little bit more about this stuff and also knows whom to ask in the inner circle of MorphOS developers about what could be the root of it. I was able to see some strange things: the program stopped when I started it via icon, but in TaskManager you can see that two of them are started. Another thing: with Snoopium you can check what the program does at the start. Especially the OpenLibrary function is interesting in this case. You can see how the program tries to open the ppc.library, which, as I learned now, is a sign that MorphOS thinks this program is a PowerUp executable, which explains the odd behavior.
In the end the solution was not so difficult: FreePascal did not set the symbol type and symbol size as expected by MorphOS. (The size is important because the __abox__ symbol should point to a longword with value one.) It appears FreePascal already has a function to take care of such things, which only has to be activated for the MorphOS compiler. Finally it’s fixed again. Thanks for the help.