I found out that I can improve the speed of the display rendering by about 50%.
It's quite simple really.
Currently I'm running checks on every frame, after every color, after every pixel on/off...
But my machine will not feature 60 fps full motion video. Heck, even cinemas don't have that. You can easily see that this is a lot of overhead spent on largely redundant information.
By separating the draw buffer (front) from the videolayer (back), I can keep drawing the frontbuffer "no questions asked" at about 1 ms per color. All the screen updates, such as text and images, happen in the backbuffer, and when, say, a certain interval has passed, the buffers are swapped. Rinse and repeat.
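A minimal sketch of the front/back swap described above, written as plain C++ so it compiles off-target. The display size, the 1-bit-per-pixel layout with three color planes, and all names (`Frame`, `DoubleBuffer`, `swapIfDue`, `setPixel`, `getPixel`) are assumptions for illustration, not the actual code. On the Arduino the timestamp would presumably come from `millis()` and the includes would be `<stdint.h>` and `<stddef.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical display: 32x16 pixels, 1 bit per pixel, 3 color planes.
constexpr std::size_t kWidth = 32;
constexpr std::size_t kHeight = 16;
constexpr std::size_t kColors = 3;
constexpr std::size_t kPlaneBytes = kWidth * kHeight / 8;

struct Frame {
    std::uint8_t plane[kColors][kPlaneBytes];
};

// Set or clear one pixel in one color plane.
inline void setPixel(Frame& f, std::size_t color, std::size_t x, std::size_t y, bool on) {
    assert(color < kColors && x < kWidth && y < kHeight);
    const std::size_t bit = y * kWidth + x;
    const std::uint8_t mask = static_cast<std::uint8_t>(1u << (bit % 8));
    if (on) f.plane[color][bit / 8] |= mask;
    else    f.plane[color][bit / 8] &= static_cast<std::uint8_t>(~mask);
}

// Read one pixel from one color plane.
inline bool getPixel(const Frame& f, std::size_t color, std::size_t x, std::size_t y) {
    assert(color < kColors && x < kWidth && y < kHeight);
    const std::size_t bit = y * kWidth + x;
    return ((f.plane[color][bit / 8] >> (bit % 8)) & 1u) != 0;
}

class DoubleBuffer {
public:
    // The draw routine only ever reads this buffer, with no checks.
    const Frame& front() const { return buffers_[frontIndex_]; }

    // Text and image updates write here, out of the draw routine's way.
    Frame& back() { return buffers_[1 - frontIndex_]; }

    // Swap the two roles once intervalMs has elapsed since the last swap.
    // Unsigned subtraction keeps the comparison valid across timer wrap-around.
    bool swapIfDue(std::uint32_t nowMs, std::uint32_t intervalMs) {
        if (nowMs - lastSwapMs_ < intervalMs) return false;
        frontIndex_ = static_cast<std::uint8_t>(1 - frontIndex_);
        lastSwapMs_ = nowMs;
        return true;
    }

private:
    Frame buffers_[2] = {};
    std::uint8_t frontIndex_ = 0;
    std::uint32_t lastSwapMs_ = 0;
};
```

The swap only flips an index, so it costs the same no matter how large the frame is. The price is a second copy of the frame in RAM, which matters on a small MCU.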
The beauty of this is that I remove all the heavy checks from the draw routine and can spread out the work of building a frame over a "lot" of time. Every clock cycle is crucial in this build, since the Arduino is a single-core, single-process type of MCU.
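One way the building work could be spread out is to do a small, bounded slice of it between draw passes. The sketch below is an assumption about how that might look, not the actual implementation. `SlicedBuilder`, its row-based slicing, and the `buildRow` callback are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical incremental builder: each call to step() builds at most
// rowsPerSlice rows of the backbuffer, so no single call takes long.
class SlicedBuilder {
public:
    SlicedBuilder(std::size_t totalRows, std::size_t rowsPerSlice)
        : totalRows_(totalRows), rowsPerSlice_(rowsPerSlice) {
        assert(rowsPerSlice > 0);
    }

    // Build the next slice by calling buildRow(rowIndex) for each row in it.
    // Returns true once every row of the frame has been built.
    template <typename BuildRow>
    bool step(BuildRow buildRow) {
        std::size_t end = nextRow_ + rowsPerSlice_;
        if (end > totalRows_) end = totalRows_;
        for (; nextRow_ < end; ++nextRow_) buildRow(nextRow_);
        return nextRow_ == totalRows_;
    }

    // Start over for the next frame, typically right after a buffer swap.
    void restart() { nextRow_ = 0; }

private:
    std::size_t totalRows_;
    std::size_t rowsPerSlice_;
    std::size_t nextRow_ = 0;
};
```

In a main loop this would sit next to the draw call: draw the frontbuffer, call `step()` once, and swap the buffers only when `step()` has reported the frame complete and the interval has passed.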
A quick example (numbers are guesstimates, since I have not coded the backbuffer part yet):
Single frame (current approach): 18 ms (drawing) + 20 ms (building) = 38 ms, which gives around 26 fps if the frame is rebuilt every time it is drawn, such as during a video scene. This is very low, and flickering is noticeable. The normal case is around 52 fps, when the frame is built only once and drawn many times.
Single frame (with backbuffer): 8 ms (drawing) + 2 ms (building) = 10 ms, which gives around 100 fps if video is built at about 15 frames/sec. I could easily limit the drawing to 70 Hz to free up further resources without the player noticing. What they will notice, however, is the freed-up time making the game more responsive.
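The arithmetic behind these guesstimates can be checked with a couple of helper functions. The function names are made up for illustration, and the inputs are the same rough numbers as above, not measurements.

```cpp
#include <cassert>
#include <cmath>

// Frames per second when each frame costs drawMs + buildMs milliseconds.
inline double fpsFromMs(double drawMs, double buildMs) {
    assert(drawMs + buildMs > 0.0);
    return 1000.0 / (drawMs + buildMs);
}

// Milliseconds left over per frame when drawing is capped at capHz
// and one frame takes frameMs to draw and build.
inline double idleMsAtCap(double capHz, double frameMs) {
    assert(capHz > 0.0);
    return 1000.0 / capHz - frameMs;
}
```

With these, `fpsFromMs(18, 20)` is about 26.3 and `fpsFromMs(8, 2)` is 100. Capping at 70 Hz gives a frame period of about 14.3 ms, so `idleMsAtCap(70, 10)` leaves roughly 4.3 ms per frame free for game logic.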
I can't believe I've missed this.
I've done this a million times before in regular graphics programming...