I took a break from writing Week Notes last week. It's been a busy few weeks at work and, well, something's gotta give. To be honest, things are still pretty busy, but in the spirit of “don't skip a routine twice”, I figured I'd write something for this week.

Twitch's 2.5 Million Emotes

This week was SGDQ2020 (“Summer Games Done Quick”, a speedrunning marathon / charity event), and as per usual I was running my GDQStats project, which displays real-time data about the event (viewership, donations, etc.). I've written about the infrastructure of GDQStats before, and it's been pretty stable for the past few years. To be honest, I've kinda lost interest in that project, but the site gets a lot of traffic each time GDQ rolls around and I get a fair amount of positive feedback about it, so I've kept it running.

Usually, I don't run into any technical issues with GDQStats, but I noticed at some point this week that the VM that was running the metrics collector for the site was running really low on memory. The VM had 4GB of memory, and nothing about the metrics collector should have needed that much RAM, so I started investigating.

As a bit of background, one of the (rather silly) metrics that I record about GDQ is the number of emotes used in the Twitch chat over time. For some of the more energetic runs, most of the chat consists of Twitch emotes, so it's kinda fun to see those spikes in the data. I was tracking emote usage by downloading the entire set of Twitch emotes (available via their API) upon service startup, keeping the emote list in memory, and then running a very basic token recognition algorithm over each incoming chat message. This approach worked fine for the past few years. However, Twitch has been increasing in popularity and, as I discovered, the number of custom emotes has ballooned to an astronomical ~2.5 million emotes, as of writing.
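For a rough idea of what that kind of token recognition can look like, here is a minimal sketch. It is not the actual GDQStats code: the function name `count_emotes` is made up for illustration, and it assumes emote codes show up in chat as whitespace-separated, case-sensitive words.

```python
from collections import Counter
from typing import AbstractSet


def count_emotes(message: str, emotes: AbstractSet[str]) -> Counter:
    """Count how often each known emote appears in one chat message."""
    # Assumption: an emote is a whole whitespace-separated token, so a plain
    # split plus a set lookup is enough for a rough tally.
    return Counter(token for token in message.split() if token in emotes)


# Tiny stand-in for the full ~2.5 million entry emote list.
known_emotes = {"Kappa", "PogChamp"}
counts = count_emotes("Kappa that run was wild PogChamp Kappa", known_emotes)
print(sum(counts.values()))  # 3
```

Each lookup is cheap because the emotes sit in a set, but the set itself has to hold every emote code in memory, which is where a list of this size starts to hurt.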

To give a sense of scale, the API endpoint that returns the list of custom emotes returns a >100MB JSON response. (To be fair to Twitch, they do warn that this endpoint “returns a large amount of data”.)

My metrics collector is dockerized and scaled horizontally for some redundancy, so this meant I had multiple instances of my service calling the Twitch API, which quickly saturated the VM's network connection each time I started up the service. Throw in some carelessness in memory usage, and… soon your servers start having a Bad Time™️. Admittedly, I don't know that much about tuning Python for production services, so there is probably some low-hanging fruit I am unaware of.

The “correct” solution would probably have been to cache the emote list ~once, and to use a more efficient mechanism to recognize emotes (perhaps offloading that task to my database, or using batching). But instead, I just axed the metric. It stopped being very interesting after GamesDoneQuick locked down their Twitch chat to paying subscribers.
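For the curious, the caching half of that idea could look something like the sketch below. Everything in it is an assumption made for illustration: `load_emotes` and its parameters are hypothetical names, `fetch` stands in for whatever function actually downloads the list, and the cache file is assumed to live somewhere all instances can reach (a shared Docker volume, say).

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Callable, List


def load_emotes(
    cache_path: Path,
    fetch: Callable[[], List[str]],
    max_age_seconds: float = 24 * 60 * 60,
) -> List[str]:
    """Return the emote list, calling fetch() only if the cache is missing or stale."""
    if cache_path.exists():
        age = time.time() - cache_path.stat().st_mtime
        if age < max_age_seconds:
            return json.loads(cache_path.read_text(encoding="utf-8"))

    emotes = fetch()  # the expensive download happens only on this path

    # Write to a temp file in the same directory, then rename it into place,
    # so another instance never reads a half-written cache file.
    fd, tmp_name = tempfile.mkstemp(dir=cache_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(emotes, tmp)
    os.replace(tmp_name, cache_path)
    return emotes
```

There is no locking here, so two instances that start at exactly the same moment could still both download the list. It only cuts the repeat downloads on later restarts, and it does nothing about the memory cost of holding the list once it is loaded.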

A quick site update later, and emotes have been removed. Also, my VM is back down from >95% memory utilization to ~25%.

Factorio 1.0

Factorio, the amazing factory building logistics game, finally hit its Version 1.0! 🥳🔧 It'd been in a (very playable, very fun) early access for ~8.5 years. I searched up my Factorio receipt, and found that I bought the game in December 2014. Time flies…

Anyways, I hadn't played in a while, and so I spun up Factorio this week and started a new factory. It's still an incredibly engrossing game. I'm impressed with how they managed to make it strictly better as time has gone on. In one of my favorite updates, they added a multiplayer mode, which was some of the most fun cooperative gaming I've ever experienced.

There's a joke that I make often, though I'm not sure of its original source, that lots of hobby programming is effectively 21st century model train building. Factorio is the epitome of this in gaming form – you can literally build an automated model train system (complete with schedules, rail signals, and automatic cargo loading).

My current save is still firmly in the “early-game”, and I think much of the change in the past few years has been to the mid- and late-game, so I'm looking forward to playing more.