New Tools

This month I put some time into writing a couple more Datasette-adjacent tools (see previous Datasette posts):

Instapaper: For a while, I’ve been using Instapaper’s RSS feed to export my saved articles. However, the RSS feeds that Instapaper exposes for one’s personal collections are limited to the latest ~10 articles, and don’t expose any service-specific metadata. instapaper-to-sqlite talks directly to the Instapaper API, so it can get additional information like when bookmarks were saved, how far you read into the article, and whether you starred the article.

Overcast: Overcast has a mechanism for exporting your podcast listening history as an OPML file. As a bonus, this OPML file also includes some player metadata for individual podcast episodes, like how far you listened into the episode, whether you completed the episode, etc. overcast-to-sqlite is a small utility that pulls the OPML from Overcast, finds the metadata for each feed and episode, and writes it all to a SQLite database.
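To make the OPML-to-SQLite step concrete, here is a minimal Python sketch of the idea. It is not how overcast-to-sqlite itself is implemented (that tool is written in Rust). The sample document, the attribute names (`overcastId`, `played`, `progress`), and the table layout are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample of an extended OPML export. The attribute names
# (overcastId, played, progress) are assumptions for illustration.
SAMPLE_OPML = """<opml version="1.0">
  <body>
    <outline type="rss" overcastId="1" title="Example Show"
             xmlUrl="https://example.com/feed.xml">
      <outline type="podcast-episode" overcastId="10"
               title="Episode One" played="1" progress="0"/>
      <outline type="podcast-episode" overcastId="11"
               title="Episode Two" progress="754"/>
    </outline>
  </body>
</opml>
"""


def save_opml(opml_text, conn):
    """Parse feeds and episodes out of OPML text and upsert them into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feeds "
        "(id INTEGER PRIMARY KEY, title TEXT, xml_url TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS episodes "
        "(id INTEGER PRIMARY KEY, feed_id INTEGER REFERENCES feeds(id), "
        "title TEXT, played INTEGER, progress INTEGER)"
    )
    root = ET.fromstring(opml_text)
    for feed in root.iter("outline"):
        if feed.get("type") != "rss":
            continue  # skip episode outlines and any grouping outlines
        feed_id = int(feed.get("overcastId"))
        conn.execute(
            "INSERT OR REPLACE INTO feeds VALUES (?, ?, ?)",
            (feed_id, feed.get("title"), feed.get("xmlUrl")),
        )
        # Episodes are assumed to be direct children of their feed outline.
        for ep in feed.findall("outline"):
            conn.execute(
                "INSERT OR REPLACE INTO episodes VALUES (?, ?, ?, ?, ?)",
                (
                    int(ep.get("overcastId")),
                    feed_id,
                    ep.get("title"),
                    int(ep.get("played", "0")),    # missing means not played
                    int(ep.get("progress", "0")),  # assumed to be seconds
                ),
            )
    conn.commit()
```

Pointing `sqlite3.connect()` at a file instead of `:memory:` would give you a database you can open directly in Datasette.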

I’m still a big fan of the Datasette / Dogsheep pattern for managing personal data: The idea is that you write a small export tool to download your personal data to a SQLite database, and then you can use Datasette as a generic data exploration/visualization tool. Each of these little export tools, of which there is a growing number, further increases the value of this approach. Also, this month, the Datasette author released a version of it as a standalone macOS app – pretty cool!

This month, I also wrote wayback-archiver, a CLI tool for archiving links to the Wayback Machine. See this blog post about how I use it to prevent link rot in my Obsidian vault.
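For a sense of what archiving a link involves, here is a small Python sketch built on the Wayback Machine's public "Save Page Now" URL form (`https://web.archive.org/save/<url>`). It is an illustration only, not the implementation of wayback-archiver (which is written in Rust). The function names and the injectable `fetch` parameter are hypothetical.

```python
import urllib.request

SAVE_ENDPOINT = "https://web.archive.org/save/"


def save_url_for(link):
    """Return the Save Page Now URL that asks the Wayback Machine to capture `link`."""
    return SAVE_ENDPOINT + link


def archive(links, fetch=None):
    """Request a capture for each link; return {link: HTTP status or error text}."""
    if fetch is None:
        # Default fetcher performs a real HTTP request.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=60) as resp:
                return resp.status

    results = {}
    for link in links:
        try:
            results[link] = fetch(save_url_for(link))
        except OSError as exc:  # URLError and HTTPError both subclass OSError
            results[link] = str(exc)
    return results
```

Passing a stub as `fetch` makes it possible to try the logic without hitting the network. A real tool would likely also want rate limiting and retries.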

I was looking for an excuse to write some more Rust, so I chose to write overcast-to-sqlite and wayback-archiver in Rust. The experience was quite positive. Sure, there’s more initial overhead to writing a CLI tool in Rust than in, say, Python. But once you get the boilerplate down, Rust is quite a productive language to work in. The VS Code + rust-analyzer developer experience has improved a lot over the past year or so.

GitHub Copilot

A few months ago, I applied for and got into the GitHub Copilot beta. GitHub Copilot is an AI developer tool that tries to be “even better autocomplete”, powered by OpenAI’s Codex, which is a descendant of GPT-3. Instead of tab-completing single variables or methods, Copilot can suggest entire code blocks based on the context of what you’re writing.

My experience with Copilot has ranged from “eerily good” to “essentially useless”. For very common, well-defined tasks, it’s quite useful. As an example, it’s quite good at recognizing rote tasks like opening a file and reading it into a variable. In domain-specific situations, it often falls flat. A general rule of thumb seems to be that if you could Google what you’re trying to do and copy/paste the first Stack Overflow answer verbatim into your code, Copilot is able to make that suggestion; anything more complicated, and the suggestions become less useful.

There’s also a bit of an art to “driving” the AI. You can nudge it into giving more useful suggestions by prepending code with a (plain English) comment like “// Read lines from a file”. This was the “eerily good” experience – when this works, it feels like magic to type a plain-language explanation of something and see the tool spit out working code.
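To illustrate the shape of that interaction, here is a hand-written Python example: a one-line comment acting as the prompt, followed by the kind of rote completion one would hope to get back. This is not actual Copilot output, and the function name `read_lines` is just a placeholder.

```python
# Read lines from a file
def read_lines(path):
    """Return the lines of the file at `path`, without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```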

I hope Copilot continues to evolve and become more useful, but currently it’s more of a curiosity than a “workhorse tool”.

As a fun aside, you can enable Copilot in non-code files, e.g. in Markdown files. So even though it’s billed as a code tool, you can also use it for writing READMEs or… blog posts. (I’ve had Copilot enabled for the full time I’ve been writing this post, but the suggestions have been so poor that it’s been more annoying than useful 😛).

  • Guesstimate
    • Spreadsheet-esque tool for modeling systems and producing estimates.
  • How Discord Stores Billions of Messages (2017)
    • Interesting technical writeup on Discord’s migration from MongoDB to Cassandra.
  • RustConf 2021
  • Problem Solving Instinct and Culture
    • A short piece on a programming competition between Donald Knuth and Doug McIlroy. It’s worth a full read, but the last line made me chuckle – McIlroy criticizing Knuth’s solution in favor of his composable Unix pipe approach:

      [Knuth] has fashioned a sort of industrial strength Faberge egg – intricate, wonderfully worked, refined beyond all ordinary desires, a museum piece from the start.

  • Circuit Breaker Pattern | Microsoft Docs
    • Handle faults that may take a variable amount of time to rectify when connecting to a remote service or resource. This pattern can improve the stability and resiliency of an application.

    • An interesting design pattern for handling transient errors in distributed systems. The whole Microsoft Docs site for Cloud Design Patterns is worth perusing.
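For readers who haven't met the circuit breaker pattern before, here is a minimal Python sketch of the core idea. It is a simplified illustration, not the implementation from the Microsoft Docs article. The class name, parameter names, and default thresholds are all assumptions. After a run of consecutive failures the breaker "opens" and rejects calls immediately. Once a timeout has passed, it lets a single trial call through to see whether the remote service has recovered.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before tripping
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip (or re-trip) the breaker
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping a flaky remote call as `breaker.call(fetch_something, arg)` means that repeated failures stop hammering the struggling service and surface quickly as `CircuitOpenError` instead.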