Version 1.1.4 - 5x Faster startup

tldr: App launch is 5x faster in 1.1.4 than in 1.1.3

Measure Twice Cut Once (>2.0s)

Playing on device, I noticed that game startup was starting to feel a bit sluggish. Most of the time I work in the simulator, which is super fast (30-40ms startup), but while playing on-device the other day I found myself impatiently waiting for the app to resume. So I wrote a small wrapper function to time individual function calls.
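
For illustration, here is a minimal sketch of what a timing wrapper like this could look like. It is not the game's actual code: the names timed, clock and report are hypothetical, and the default clock uses os.clock from standard Lua. On device you would probably pass in a millisecond clock such as playdate.getCurrentTimeMilliseconds instead.

```lua
-- Hypothetical helper (not from the game's source): wraps fn so each call is timed.
-- clock must return milliseconds; the default is built on os.clock from standard Lua.
local unpack = table.unpack or unpack  -- works on Lua 5.1 and 5.2+

-- Collect varargs together with their count so trailing nils survive.
local function pack(...)
  return { n = select("#", ...), ... }
end

local function timed(label, fn, clock, report)
  clock = clock or function() return os.clock() * 1000 end
  report = report or function(name, ms)
    print(string.format("%s: %.1f ms", name, ms))
  end
  return function(...)
    local start = clock()
    -- Keep every return value so the wrapper is transparent to callers.
    local results = pack(fn(...))
    report(label, clock() - start)
    return unpack(results, 1, results.n)
  end
end

-- Usage: wrap a function once, then call it as usual.
local generateImages = timed("generateImages", function(count)
  local images = {}
  for i = 1, count do images[i] = i * i end  -- placeholder work, not real drawing
  return images
end)
generateImages(400)  -- prints one line with the label and the elapsed milliseconds
```

Because the wrapper returns a function with the same arguments and return values, it can be dropped around any startup step without touching the call sites.
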
Unsurprisingly, nearly the entire startup time is spent generating images. See, instead of generating 400-odd PNGs, storing them in-repo and compiling them into 400 pdi files, we draw the balls programmatically, generating the images we need at startup time. On the simulator this takes ~35ms (yay!) but on the actual device total startup takes >2000ms, of which image generation is ~1775ms (eek! 1.8 seconds!). So let's see if we can do better.

Just be smaller (2.15x faster) 

We have the Playdate draw images for sizes ranging from 9 pixels to 175 pixels, so the largest circle has a radius of 175 pixels. Wait, do we ever need a ball that big? We do not -- the height of the Playdate display is 240 pixels, so we definitely never need a ball with a radius bigger than 120 pixels. This immediately drops our load time from 1775ms to 825ms, a 2x speedup!

Only do threes (3.3x faster)

We're generating all the images we'll ever need at startup time; after that the game and animations run smoothly without the need to draw anything at runtime. Originally I was only drawing images for ball #3, ball #2 and ball #1 for each of 175 sizes. At some point I implemented drawing zeros (no circle) that fall away when the ball is cleared. The time we're most worried about smooth animation is while the ball is growing. It turns out that this only happens for the "3" ball; we never animate the growing of the other numbers. So let's pre-draw all the 3's -- and only draw-on-demand the images for 2, 1 and 0. Now our minimum setup is only 245ms (down from 825ms).
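
The pre-draw-the-3s, draw-the-rest-on-demand idea can be sketched as a small cache. This is an illustration under assumptions, not the game's implementation: drawBall, getBallImage and preloadGrowingBalls are hypothetical names, drawBall is a stand-in that returns a plain table rather than a real image, and one image per integer radius is assumed.

```lua
-- Hypothetical sketch: eager drawing for the "3" ball, lazy drawing for 2, 1 and 0.
local MIN_RADIUS, MAX_RADIUS = 9, 120  -- radii from the post at this stage

-- Stand-in for the real drawing routine (assumption); counts calls to expose the cost.
local drawCalls = 0
local function drawBall(hits, radius)
  drawCalls = drawCalls + 1
  return { hits = hits, radius = radius }  -- a real game would return an image here
end

local cache = {}  -- cache[hits][radius] -> image

-- Return the cached image, drawing it the first time it is requested.
local function getBallImage(hits, radius)
  local bySize = cache[hits]
  if not bySize then
    bySize = {}
    cache[hits] = bySize
  end
  local image = bySize[radius]
  if image == nil then
    image = drawBall(hits, radius)
    bySize[radius] = image
  end
  return image
end

-- Startup: warm the cache for the "3" ball only, since it is the one that grows.
local function preloadGrowingBalls()
  for radius = MIN_RADIUS, MAX_RADIUS do
    getBallImage(3, radius)
  end
end

preloadGrowingBalls()
print(drawCalls)  -- 112: one image per radius from 9 to 120
```

Every other number pays its drawing cost once, on the frame where it is first needed, and is a plain table lookup afterwards.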

We've shifted the cost to runtime though -- on frames where the shot bounces off another ball we've added 1-2ms on-device to that frame (you can only have one bounce per frame). 30fps is a ~33ms budget, so that should be safe. Of note, cold-loading a previously in-progress game now takes a variable amount of time: each additional ball with only 1 or 2 hits remaining adds a couple of milliseconds to load time. So a game with 20 balls on the board is still sub-300ms.

Temporarily be even smaller (1.6x faster)

Looking back at the 120-pixel choice above -- 120-pixel-radius balls are the biggest you could ever use, but 89-pixel-radius balls are the biggest we're currently using. I want to support a full-screen rotated mode in the future which could require 120-pixel radii, but there's no reason to make people wait today to support a feature of (maybe) tomorrow. This brings the best-case load time (empty board) down to a very respectable 153ms (down from 245ms).

Be the 11x (or 5x) programmer you want to be

Version 1.1.3 loads images in ~1775ms
Version 1.1.4 loads images in ~155ms (+1-2ms per ball on the board).
We've reduced image generation time by a factor of 11.
Even on the simulator we've dropped loading time from ~45ms total to ~7ms (5x)
Total device load time is now 400ms (down from >2000ms) -- roughly 5x faster.
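
As a sanity check on the arithmetic, the per-step and cumulative speedups can be recomputed from the image-generation timings quoted in this post. The stage labels and the speedups helper below are made up for illustration; only the millisecond figures come from the measurements above.

```lua
-- Image-generation timings (ms) quoted in the post, in the order the changes were made.
local stages = {
  { name = "1.1.3 baseline",       ms = 1775 },
  { name = "cap radius at 120px",  ms = 825 },
  { name = "pre-draw only the 3s", ms = 245 },
  { name = "cap radius at 89px",   ms = 153 },
}

-- For each stage after the first, compute the speedup over the previous
-- stage and the cumulative speedup over the baseline.
local function speedups(list)
  local rows = {}
  for i = 2, #list do
    rows[#rows + 1] = {
      name = list[i].name,
      step = list[i - 1].ms / list[i].ms,
      total = list[1].ms / list[i].ms,
    }
  end
  return rows
end

for _, row in ipairs(speedups(stages)) do
  print(string.format("%-22s step %.2fx  cumulative %.2fx", row.name, row.step, row.total))
end
```

The individual steps multiply together, which is how three modest changes add up to the roughly 11x reduction in image-generation time for an empty board.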

What's next

If I want to make additional speedups on load time, I'll have to attack various parts of the import of gimme.lua. This import takes ~230ms on device. There's not a lot of room: loading sound/music (190ms), loading two fonts (10ms), and loading four images and creating sprites for them (30ms). I could skip loading sound if the volume is muted (I have previously created a tiny 40KB build without music) or switch to the streaming sound player, which would load music async from flash. Similarly, I could lazy-load the splash logo and the game-over image, which might save 20ms. I think sub-200ms total load latency is possible, but no one is buying a game (or paying more after the fact) because it loads in 200ms and not 400ms.
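
The sub-200ms estimate can be checked with a quick back-of-the-envelope budget. This sketch assumes the quoted savings are fully realized and simply add up, which is optimistic; the table keys are hypothetical labels, and the sound saving only applies when the volume is muted.

```lua
-- Figures (ms) quoted in the post; the labels are made up for illustration.
local totalLoadMs = 400
local importCosts = { sound = 190, fonts = 10, images = 30 }
local possibleSavings = { skipSoundWhenMuted = 190, lazyLoadTwoImages = 20 }

-- Add up every value in a table.
local function sum(t)
  local total = 0
  for _, v in pairs(t) do
    total = total + v
  end
  return total
end

local importMs = sum(importCosts)                         -- cost of importing gimme.lua
local projectedMs = totalLoadMs - sum(possibleSavings)    -- best case if both ideas pan out
print(importMs, projectedMs)  -- 230 and 190
```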

See previous devlog: Implementing Classic Mode for more.
As always, you can see all the changes in the release on Github.

Files 1 MB
Version 1.1.4 May 29, 2022
