Monday, August 27, 2012

Codename Revised

Spinning further on the idea of writing to the sdbuffer array directly instead of caching the data first, I decided to completely redesign the loading.

I made three major changes:

1) The SD card is now raw sectors. No filesystem, just packed bytes one after the other.
2) Ditched the Arduino based SPI library and went with Digilent's DSPI library.
3) Streamlined the DMD rendering to use Chipkit-specific latch registers instead.

Using raw sectors I have immediate access to whatever sector I need. The sectors of the SD card are conveniently stored in 512-byte chunks, and three chunks make up one frame of animation. The FAT filesystem requires me to look up filenames first and is really made for non-static data. By letting a script generate the SD card image for me, I know exactly where each file is placed on the card, which allows me to access it directly. There's also no cache, so if I ask for a byte - I get it. Besides direct access when needed, this allows me to pause, rewind and speed up animations on demand, for example.
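To make the addressing concrete, here is a minimal Python sketch of what such an image-generating script could look like. Only the 512-byte sector size and the three sectors per frame come from the post; the function names, the zero padding of short frames and the in-memory index are assumptions made for illustration, not the actual script.

```python
SECTOR_SIZE = 512          # bytes per SD sector (from the post)
SECTORS_PER_FRAME = 3      # three sectors make up one frame (from the post)
FRAME_SIZE = SECTOR_SIZE * SECTORS_PER_FRAME  # 1536 bytes


def pack_animations(animations):
    """Pack all frames back to back, with no filesystem in between.

    `animations` maps a (hypothetical) animation name to a list of frames,
    each a bytes object of at most FRAME_SIZE bytes.
    Returns (image_bytes, {name: (first_sector, frame_count)}).
    """
    image = bytearray()
    index = {}
    for name, frames in animations.items():
        # The image only ever grows by whole frames, so this divides evenly.
        index[name] = (len(image) // SECTOR_SIZE, len(frames))
        for frame in frames:
            if len(frame) > FRAME_SIZE:
                raise ValueError("frame too large for three sectors")
            # Assumption: short frames are zero-padded to a full frame.
            image += frame.ljust(FRAME_SIZE, b"\x00")
    return bytes(image), index


def frame_sector(first_sector, frame_number):
    """Sector where a given frame starts: plain arithmetic, no lookup."""
    return first_sector + frame_number * SECTORS_PER_FRAME


def read_frame(image, first_sector, frame_number):
    """Fetch one frame from the packed image by its sector address."""
    start = frame_sector(first_sector, frame_number) * SECTOR_SIZE
    return image[start:start + FRAME_SIZE]
```

Because every frame has the same size, jumping to any frame (for pausing, rewinding or skipping ahead) is a single multiplication rather than a directory search.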

By leaving the rather limited SPI library, I've managed to increase the speed of both the SD card reading and the DMD update. They're now also on two independent SPI channels, allowing them to run at the "same" time, that is - be active at the same time. The DSPI library also allows me to do interrupt transfers. That means I could download data from the card in the background while the program continues execution, kind of like multithreading on a PC. I have not yet enabled this feature, but I'm looking into it.

The DMD rendering was pretty tight previously, but by using the latch registers (LATxINV, LATxSET, LATxCLR) together with a higher speed I got an extremely nice performance boost. I did a little restructuring as well so that I could perform multiple operations at the same time, for instance - toggling the latch and row clock directly after each other instead of having to wait for the first one to finish entirely. As each of these calls is made 28 672 times, every nanosecond quickly adds up! I've also done a little manual loop unrolling to further maximize performance. I've spent quite a while reading the spec sheet for the display to make sure that I don't delay more than absolutely necessary. I've also added an exponential delay between each row update so that the overall image looks much better. With a fixed delay, the low end was not visually pleasing, since the steps between the darkest shades were very sharp. It simply looks better to have smaller differences in brightness in the low end and bigger differences in the brighter areas.
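The effect of the exponential delay can be illustrated with a small Python sketch. Everything numeric here is an assumption for illustration: the seven passes, the 10 µs base delay, the 1.5 growth ratio, and the model in which a pixel at shade k stays lit during the first k passes. None of these values is taken from the actual firmware.

```python
from itertools import accumulate


def pass_delays(passes, base_us, ratio):
    """Hold time after each pass; ratio 1.0 gives the old fixed delay."""
    return [base_us * ratio ** i for i in range(passes)]


def brightness_levels(delays):
    """Total lit time per shade, assuming shade k is lit for the first k passes."""
    return [0.0] + list(accumulate(delays))


def steps(levels):
    """Brightness difference between neighbouring shades."""
    return [b - a for a, b in zip(levels, levels[1:])]


# Hypothetical numbers, chosen only to show the shape of the curve.
fixed = brightness_levels(pass_delays(7, 10.0, 1.0))
expo = brightness_levels(pass_delays(7, 10.0, 1.5))

print("fixed steps:      ", steps(fixed))
print("exponential steps:", steps(expo))
```

With the fixed delay every step between shades is the same size, while with the growing delay the steps start small and widen toward the bright end, which matches the behaviour described above.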

With these major changes I've pushed the time for reading a full frame from the SD card from around 14 ms/frame to 7.5 ms/frame (previously 9 ms/frame), and the DMD refresh from 8 ms/frame to 7 ms/frame (previously 4 ms/frame; I've since inserted a larger delay to increase the contrast). These are done interleaved, so I get a rock solid ~70 Hz refresh and streaming video at 30 fps.
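As a quick sanity check on the ~70 Hz figure, the arithmetic can be sketched in Python. This assumes that one display cycle is simply one SD read (7.5 ms) followed by one DMD refresh (7 ms) with nothing else in between; the real interleaving may be arranged differently.

```python
SD_READ_MS = 7.5      # SD card read time per frame (from the post)
DMD_REFRESH_MS = 7.0  # DMD refresh time per frame (from the post)


def refresh_rate_hz(*phase_ms):
    """Cycles per second when the given phases run back to back."""
    return 1000.0 / sum(phase_ms)


cycle_ms = SD_READ_MS + DMD_REFRESH_MS
print(f"cycle: {cycle_ms} ms, about {refresh_rate_hz(SD_READ_MS, DMD_REFRESH_MS):.0f} Hz")
```

Under that assumption the cycle is 14.5 ms, which works out to roughly 69 Hz.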

The nice thing about this is that I've got 5 milliseconds of reserved time that can be used for any processor-heavy calculations, such as transitions, layer handling, transparency etc - without affecting the frame rate at all. As long as the total time is 5 ms or less, that is.

I could push the number of frames further, but the game logic and SD card reading must take exactly the same amount of time to execute; otherwise the display will change in brightness as soon as the SD card reading takes place. This is highly unwanted. I've seen this behavior in "proper" pinball machines, and it's not something I want in my machine.

If the SD card reading time is pushed down further, the overall refresh rate will only increase, until a breaking point is reached where the game logic takes longer to perform.

I'm quite satisfied with the progress so far!


  1. Not a word of that made sense to me! Do you have a background in electronics and programming? You seem very knowledgeable in that area.

  2. Hehe, yes - programming is my major field. :)
    Electronics and hardware programming are a relatively new field to me, however; I've only got a few years of experience in that area.
    It's quite nice to go back to basics and learn how I/O really works compared to just switching something on and "it just magically works". This project has made me dig deeper to understand WHAT makes it work etc.

    Much fun, in a masochistic sort of way! ;)

  3. Nice. I use my SD card raw too; my 'filesystem' is kind of like an old-style Doom WAD/pack file. Hacky Ruby scripts to the rescue: they also generate a .h file to embed in the C code, which gives the offsets and lengths of the files etc. The downside is that the code 'rom' is tied directly to the SD disk 'rom' image.

  4. Same here. :)
    Except I plan to do hacky python scripts to do the rescuing!

    I don't expect the code and disc being tied together to be a problem, since most of the time you'll edit the code anyway if you're adding animations etc. Besides, you could just leave the old animation in place and simply extend the rom by appending to the end of the file. It's not like there's no space on the SD cards these days... :)

  5. Also - if you got really creative, you could allocate the first two sectors or so to hold the animation start/stop offsets and parse them during startup. That way no data would be static. But then comes the problem of dynamically assigning animations to modes etc. It's more trouble than it's worth. ;)
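
For anyone curious what the index-in-the-first-sectors idea from the last comment might look like, here is a minimal Python sketch. The layout is entirely hypothetical: a 16-bit entry count followed by one (32-bit start sector, 16-bit frame count) pair per animation, little-endian, zero-padded to two 512-byte sectors. Nothing like this exists in the project as described.

```python
import struct

SECTOR_SIZE = 512
INDEX_SECTORS = 2                       # "the first two sectors or so"
INDEX_SIZE = SECTOR_SIZE * INDEX_SECTORS

# Hypothetical on-card layout, little-endian:
COUNT = struct.Struct("<H")             # number of entries
ENTRY = struct.Struct("<IH")            # start sector, frame count


def build_index(entries):
    """Serialize [(start_sector, frame_count), ...] into the index sectors."""
    if COUNT.size + len(entries) * ENTRY.size > INDEX_SIZE:
        raise ValueError("too many animations for the index sectors")
    blob = COUNT.pack(len(entries))
    blob += b"".join(ENTRY.pack(start, frames) for start, frames in entries)
    return blob.ljust(INDEX_SIZE, b"\x00")


def parse_index(blob):
    """What the startup code would do: read the entries back from the card."""
    (count,) = COUNT.unpack_from(blob, 0)
    return [
        ENTRY.unpack_from(blob, COUNT.size + i * ENTRY.size)
        for i in range(count)
    ]
```

With a layout like this the animation data would start at sector 2, and the code would no longer need compiled-in offsets, at the cost of the mode-assignment problem mentioned above.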