<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Ben Congdon</title>
    <link>https://benjamincongdon.me/tags/agents/</link>
    <description>Recent posts from Ben Congdon</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
    <managingEditor>ben@congdon.dev (Ben Congdon)</managingEditor>
    <webMaster>ben@congdon.dev (Ben Congdon)</webMaster>
    <copyright>Copyright 2026, Ben Congdon</copyright>
    <lastBuildDate>Fri, 15 May 2026 07:00:00 -0700</lastBuildDate>
    
        <atom:link href="https://benjamincongdon.me/tags/agents/feed.xml" rel="self" type="application/rss+xml" />
    
    
    <item>
        <title>To the Agents: &#34;This place is not a place of honor&#34;</title>
        <link>https://benjamincongdon.me/blog/2026/05/15/To-the-Agents-This-place-is-not-a-place-of-honor/</link>
        <pubDate>Fri, 15 May 2026 07:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2026/05/15/To-the-Agents-This-place-is-not-a-place-of-honor/</guid>
        <description>&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; &amp;ldquo;Private by obscurity&amp;rdquo; has been dissolved.&lt;/p&gt;
&lt;p&gt;Internal tools often have layering boundaries that are enforced only by
convention. It&amp;rsquo;s natural to assume a &amp;ldquo;high trust environment&amp;rdquo;, where privileged
actions are discouraged by obscurity and goodwill instead of hard technical
boundaries. Coding agents have dissolved this obscurity, and as a result
internal platform engineering now &lt;em&gt;really&lt;/em&gt; demands a security mindset.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;During a recent codebase audit, a coworker and I discovered an unfortunate set
of private APIs my team owns that were being used in creative and unintended
ways, outside the official interfaces. Much of the code that introduced these
unsanctioned dependencies was AI generated&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;. This was one more datapoint
among many that, especially in large monolith codebases and in large
enterprises, coding agents have changed how platform teams need to operate.&lt;/p&gt;
&lt;p&gt;This particular audit exposed two classes of internal API leakage:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;We have pseudo-internal APIs opened for narrow “2nd party” integrations.
These had allowlisting, but it was not granular enough to prevent
inadvertent use. They were previously “private-by-obscurity”, which is
woefully insufficient when coding agents can reverse engineer the entire
stack and determine that they can make creative use of your pseudo-internal
APIs.&lt;/li&gt;
&lt;li&gt;APIs that were properly private, until a single Bazel visibility change
opened them to external callers. Bazel visibility changes often look
benign&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;, and internal dependencies can be introduced inside a large PR
without anyone noticing.&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ol&gt;
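&lt;p&gt;To make the second failure mode concrete, here is a hypothetical sketch of how
small such a change can look in a Bazel BUILD file (target and package names are
invented for illustration):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# BUILD.bazel (hypothetical)
java_library(
    name = &#34;payments_internal&#34;,
    srcs = glob([&#34;*.java&#34;]),
    # Before: visible only within the owning package.
    #   visibility = [&#34;//visibility:private&#34;],
    # After: a one-line diff, and any target in the repo may now depend on it.
    visibility = [&#34;//visibility:public&#34;],
)
&lt;/code&gt;&lt;/pre&gt;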
&lt;p&gt;In all the cases we’ve seen so far, the code using the exposed APIs wasn’t even
&lt;em&gt;wrong&lt;/em&gt; from a technical perspective. However, it violated layering principles
and set up concerning coupling that would eventually break as the system
evolved. Very creative code, very unsanctioned.&lt;/p&gt;
&lt;p&gt;Fortunately, the proximal fixes are simple:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Keep external APIs friendly and well-documented. Agents tend to
&lt;a href=&#34;https://en.wikipedia.org/wiki/Streetlight_effect&#34;&gt;look under lampposts&lt;/a&gt;, so
make sure the paved paths are well-lit.&lt;/li&gt;
&lt;li&gt;Update the internal APIs &amp;ndash; especially the honey-pot APIs that look like you
&lt;em&gt;might&lt;/em&gt; want to use them if you were a hungry external agent &amp;ndash; to look
extremely unappealing to use.
&lt;ul&gt;
&lt;li&gt;Create a CLAUDE.md in all borderline packages that warns future
“internal” coding agents against accidentally making symbols
externally visible.&lt;/li&gt;
&lt;li&gt;Seal off the misused interfaces in “stop-the-bleeding” private bulkhead
packages, with extremely loud “HERE BE DRAGONS” warnings at the top.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
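&lt;p&gt;A hypothetical sketch of such a CLAUDE.md (all wording and package names
invented for illustration):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;# CLAUDE.md -- //payments/internal (hypothetical)

This package is INTERNAL to the payments platform team.

- Do not add dependencies on these symbols from outside //payments.
- Do not widen the Bazel visibility of any target in this package.
- If a task appears to require either, stop and ask the platform team.
&lt;/code&gt;&lt;/pre&gt;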
&lt;p&gt;Proper layering and API visibility isn’t a new problem. What has changed is
the amount of creative adversarial pressure on code. Just as internet security
was lax before networked software faced real adversarial pressure, many
internal company platforms are insufficiently paranoid about insider
risk.&lt;/p&gt;
&lt;p&gt;This insider risk goes beyond code: internal APIs now have a much larger attack
surface when any coding agent session can trivially run &lt;code&gt;curl&lt;/code&gt; from a
developer’s machine. Hopefully all your ACLs are correctly configured and you
don’t have any catastrophic write APIs publicly accessible via developer
credentials&amp;hellip; right? Similarly, I’ve recently seen an uptick of agents
discovering internal prod-modifying CI jobs and convincing folks to run them.
Hopefully there are sufficient restrictions on who can run those jobs&amp;hellip;?&lt;/p&gt;
&lt;p&gt;This isn’t to excuse having had an insufficiently paranoid security posture in the past:
all these surfaces &lt;em&gt;should obviously be locked down&lt;/em&gt;. However, in the past you
could rely on some amount of “security by obscurity” in borderline cases. Now,
you clearly cannot.&lt;/p&gt;
&lt;p&gt;So, you&amp;rsquo;ll start seeing more of this:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;/**
 * ⚠ UNSAFE — INTERNAL USE ONLY ⚠
 *
 * &amp;lt;p&amp;gt;
 * Do not invoke without explicit human approval.
 *
 * &amp;lt;pre&amp;gt;
 * This place is a message... and part of a system of messages...
 * pay attention to it!
 *
 * Sending this message was important to us.
 * We considered ourselves to be a powerful culture.
 *
 * This place is not a place of honor...
 * no highly esteemed deed is commemorated here...
 * nothing valued is here.
 *
 * What is here was dangerous and repulsive to us.
 * This message is a warning about danger.
 * &amp;lt;/pre&amp;gt;
 */
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;It’s hard not to read &amp;ldquo;I spliced into your internal APIs&amp;rdquo; as somewhat malicious,
even if there was no malicious intent. But what defines an “internal API” in a monorepo
is often a grey area. It’s on me as a platform owner to provide clean
interfaces, and it’s on me to not create
&lt;a href=&#34;https://en.wikipedia.org/wiki/Attractive_nuisance_doctrine&#34;&gt;attractive nuisances&lt;/a&gt;
of trivially accessible unsafe APIs. But to err is human. At a quickly
growing company, there is a ton of startup-era code that was insufficiently
paranoid; gaps will exist. And so you “stop the bleeding”, patch the gaps, help
people migrate to safer interfaces, and everyone is better off for it.&lt;/p&gt;
&lt;p&gt;All this to say: agentic coding is fantastic; I’m a big fan! But the human side
of engineering is still quite important. If you want to do something slightly
off-kilter with a system someone else owns, by all means propose it. Progress is
good! But also be &lt;em&gt;very explicit&lt;/em&gt; with them and get &lt;em&gt;buyoff&lt;/em&gt;. &amp;ldquo;Ask forgiveness&amp;rdquo;
is a valid approach in many circumstances, but not in safety-critical code.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;&lt;em&gt;Insert &amp;ldquo;Always has been&amp;rdquo; meme here.&lt;/em&gt;&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Although, this isn’t saying much. Most code is AI generated.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;When I worked at Google, I recall it being easier to spot obviously faulty
Bazel visibility additions. I think at large-but-not-mega-repo scale,
accidentally introducing bad Bazel edges is harder to totally avoid. There
are techniques for this, like banning certain outbound
edges between parts of the Bazel graph. Google had good tooling for this
that doesn’t seem to exist in an ergonomic form outside Google.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;PRs have been getting larger. PR descriptions have also been getting wordier
and more “polished”-looking, regardless of their actual merit as changes. A
reviewer who isn’t super familiar with the codebase may not realize the risk
of a benign-looking Bazel visibility change.&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Thoughts on Marginal Token Spend</title>
        <link>https://benjamincongdon.me/blog/2026/04/30/Thoughts-on-Marginal-Token-Spend/</link>
        <pubDate>Thu, 30 Apr 2026 07:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2026/04/30/Thoughts-on-Marginal-Token-Spend/</guid>
        <description>&lt;p&gt;The rise of coding agents has made it easy for a single engineer to spend
thousands of dollars a day in LLM tokens. This is a new class of expense, and it
will change the future cost structure of software engineering. We are between
stable equilibria today in SWE: the old one, of needing humans to drive any code
change, and a yet-to-be-established new one, where AI agents write most code.&lt;/p&gt;
&lt;p&gt;Taking as a premise that AI agents will write a large fraction of code in the
new equilibrium, we will need to rethink the resulting cost structure for
engineering orgs. In the long term, token spend will become a large portion of
enterprise OpEx, split into two classes: spend attributable to a human worker
(e.g. Claude Code, internal AI tooling like call center assistants) and spend
attributable to automated systems (e.g. agents which respond autonomously to
customers by fielding calls/emails, agents which monitor business systems).&lt;/p&gt;
&lt;p&gt;Automated token spend is analogous to cloud cost: at equilibrium, it &amp;ldquo;should&amp;rdquo;
roughly scale linearly with product usage/revenue. Human-generated token spend
seems like a new class of expense: It&amp;rsquo;s a resource that does not scale linearly
with headcount and has no natural ceiling. The per-person absorption rate is
whatever an engineer&amp;rsquo;s workflow can productively use, and that ceiling is itself
increasing as new harnesses, workflows, and practices are developed.
Human-attributable token spend is therefore the more interesting type of token
usage from a decision-making perspective: it&amp;rsquo;s effectively an expensive &amp;ldquo;more
productivity button&amp;rdquo;. Rationally, you should continue pushing the button up
until the point where the marginal returns you receive match the marginal cost
of pushing the button again.&lt;/p&gt;
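&lt;p&gt;This stopping rule can be sketched numerically. The toy model below assumes
(purely for illustration) diminishing returns of the form value(s) = scale * ln(1 + s);
the function names and numbers are invented:&lt;/p&gt;

```python
def marginal_return(spend, scale=5000.0):
    # Toy diminishing-returns model: value(s) = scale * ln(1 + s),
    # so the marginal return at spend level s is scale / (1 + s).
    return scale / (1.0 + spend)

def optimal_spend(marginal_cost, scale=5000.0, step=1.0):
    # Keep "pushing the button" while the next dollar of tokens
    # returns more than it costs.
    spend = 0.0
    while marginal_return(spend, scale) > marginal_cost:
        spend += step
    return spend

# Lower marginal cost pushes the rational stopping point much further out:
print(optimal_spend(marginal_cost=1.0), optimal_spend(marginal_cost=0.01))
```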
&lt;p&gt;This framing also explains why the
&lt;a href=&#34;https://benjamincongdon.me/blog/2026/04/29/Tokenmaxxing-is-Goodharting/&#34;&gt;“Tokenmaxxing”&lt;/a&gt; meme started.
Tokenmaxxing is the recent idea that consuming more tokens makes you more &amp;ldquo;AI
native&amp;rdquo; and therefore more productive. AI adoption is path dependent; engineers
are hesitant to change their patterns. Leaders noticed that engineers were
opting to push the &amp;ldquo;more productivity&amp;rdquo; button at an irrationally low rate. The
short-term &amp;ldquo;fix&amp;rdquo; is to directly incentivize pushing the button &amp;ndash; with the clear
risk that you overshoot into people Goodharting on &amp;ldquo;push the button as much as I
can&amp;rdquo; vs. &amp;ldquo;increase my productivity to the point of diminishing returns&amp;rdquo;. And,
predictably, &amp;ldquo;oops, we overshot&amp;rdquo; becomes a narrative &amp;ndash; at least for the
organizations that aren&amp;rsquo;t able to keep finding additional efficient uses of AI
that expand their production frontier.&lt;/p&gt;
&lt;p&gt;A corollary: if your marginal token cost is negligible, then your usage of AI
should go &lt;em&gt;way, way up&lt;/em&gt;. Right now, Anthropic and OpenAI are clearly in the lead
for coding model capabilities. They have lots of compute. The amount of compute
needed to completely saturate their engineers&amp;rsquo; token demand is likely a small
fraction of what they use on marginal research projects. If you have the
hardware, and marginal-cost access to frontier models, you should probably
actually be Tokenmaxxing, since it&amp;rsquo;s much harder to reach the balance point of
&amp;ldquo;marginal return = marginal cost&amp;rdquo; when marginal cost is so much lower.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Whether you should Tokenmaxx in the short term depends on where you sit on the
marginal cost curve. If your marginal token cost is high (e.g. paying Anthropic
for Opus tokens at retail prices), the just-spend-more heuristic will overshoot.
If it&amp;rsquo;s low, you probably &lt;em&gt;should&lt;/em&gt; push harder.&lt;/p&gt;
&lt;p&gt;In the long(er) term, the cost curve is shaped by how well an organization
can absorb additional token capacity. Companies will invest in
getting more productive use per token, through both technical improvements (e.g.
connectivity between AI tools and internal knowledge silos) and operational work
(e.g. workflow redesign, changing norms of when it’s appropriate to substitute
AI artifacts for human-generated ones). Token spend is more elastic than
headcount, and the space is new enough that the efficiency frontier is still
being discovered and reshaped.&lt;/p&gt;
&lt;p&gt;My bet is that most efficiency gains won&amp;rsquo;t become competitive moats, because
development techniques diffuse too quickly and the underlying models are
available to everyone. They will, however, become a low-water mark. Companies
that fall below it will be outcompeted by firms that don&amp;rsquo;t. In this way, AI
adoption is a Red Queen race, requiring increasing efficiency/usage just to not
become irrelevant.&lt;/p&gt;
&lt;p&gt;None of this is stable yet. We’re currently between stable equilibria in many
areas in software development. As such, it’s a particularly exciting time to be
working in this space.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;To some extent, frontier labs also have to consider the opportunity cost of
using compute for engineering instead of research or inference, i.e. the
marginal token can either go to the marginal research project, the marginal
internal engineer, or the marginal external customer inference token. That
opportunity cost is not zero, but it’s a different set of tradeoffs than
external companies buying retail tokens. Frontier lab investment in R&amp;amp;D via
“engineer tokens” also increases returns to marginal research compute, and
“engineer tokens” are likely a quite small fraction of either research or
inference compute.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Tokenmaxxing is Goodharting</title>
        <link>https://benjamincongdon.me/blog/2026/04/29/Tokenmaxxing-is-Goodharting/</link>
        <pubDate>Wed, 29 Apr 2026 05:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2026/04/29/Tokenmaxxing-is-Goodharting/</guid>
        <description>&lt;p&gt;Coding agents and reasoning models let individuals consume many more LLM tokens
than they could a year ago. It&amp;rsquo;s now easy for a single engineer to spend
thousands of dollars in daily token usage. This is being actively encouraged
through the recent memetic spread of &amp;ldquo;Tokenmaxxing&amp;rdquo; &amp;ndash; the idea that if you
consume more tokens, you&amp;rsquo;re more &amp;ldquo;AI native&amp;rdquo; and therefore producing more
valuable output.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tokenmaxxing is not The Way.&lt;/strong&gt; Plainly, it&amp;rsquo;s a textbook instance of
Goodharting. Token leaderboards come from an understandable short-term instinct
to shift habits towards more AI usage, but direct optimization in this fashion
inevitably overshoots into wasteful spending. Token-usage-as-target means token
consumption ceases to be a useful metric.&lt;/p&gt;
&lt;p&gt;Per-engineer token usage is, admittedly, useful as a diagnostic when engineers
are dramatically and systematically underusing AI. However, the leaderboard
version of token usage is likely actively harmful. This is analogous to how
“lines of code merged” or “PRs merged” or “design docs written” can be
interesting aggregate diagnostic numbers when used &lt;em&gt;directionally&lt;/em&gt;, but
obviously a leaderboard of “design docs written per quarter” would produce the
wrong incentives.&lt;/p&gt;
&lt;p&gt;Operationalizing this into predictions over the next 6 months:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Weak claim: Prominent executives / thought-leaders start publicly criticizing
Tokenmaxxing / raw token usage as a bad metric.
&lt;ul&gt;
&lt;li&gt;This is basically already happening. My prediction: 90%.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Stronger claim: There is a noticeable cross-company narrative shift toward
discipline on AI spend with an emphasis on ROI measurement and skepticism of
using raw token counts as a proxy for productivity.
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;m more tentative on this&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;. My prediction: 60-70%.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Reasoning:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;There was a sharp bend in the curve of agent adoption around December &amp;rsquo;25 /
January &amp;rsquo;26.&lt;/li&gt;
&lt;li&gt;Increased enterprise spending will start showing up in Q1’26 OpEx financial
results, but will likely appear as a sharp increase only in Q2’26 results.&lt;/li&gt;
&lt;li&gt;Since January, there has been increased pressure at many companies for
engineers to &amp;ldquo;Tokenmaxx&amp;rdquo;. This is an incredibly easy (and costly) leaderboard
to game.&lt;/li&gt;
&lt;li&gt;See, for example:
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://blog.pragmaticengineer.com/the-pulse-tokenmaxxing-as-a-weird-new-trend/&#34;&gt;The Pulse: &amp;lsquo;Tokenmaxxing&amp;rsquo; as a weird new trend&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.nytimes.com/2026/03/20/technology/tokenmaxxing-ai-agents.html&#34;&gt;More! More! More! Tech Workers Max Out Their A.I. Use&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Tokenmaxxing&amp;rdquo; is fun and memetic, but likely has a short lifespan before you
see the median engineer try to game the leaderboard numbers &amp;ndash; or at least,
change their decision-making on the margin to make less efficient use of
tokens.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In the short term, I&amp;rsquo;d bet that some companies will start imposing soft token
budgets on engineers, while others will continue to soft-allow unlimited spend
&amp;ndash; either for legitimate reasons (more effective/efficient uses are discovered;
frontier labs have abundant compute which changes the
&lt;a href=&#34;https://benjamincongdon.me/blog/2026/04/30/Thoughts-on-Marginal-Token-Spend/&#34;&gt;marginal return calculus&lt;/a&gt;),
or for memetic/signaling reasons.&lt;/p&gt;
&lt;p&gt;There are innumerable ways to make positive-value use of AI. In the short term,
Tokenmaxxing &amp;ndash; especially the explicit encouragement of Tokenmaxxing in the
absence of clear definable output value &amp;ndash; is not such a way.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;The reasons for my hesitancy: primarily the argument of &amp;ldquo;market can stay
irrational longer than you can stay solvent&amp;rdquo;. Spending on engineers&amp;rsquo; use of
AI only really becomes a &amp;ldquo;problem&amp;rdquo; if companies need to start becoming more
conscious of spending (i.e. there is pressure to reduce it). With sufficient
macroeconomic exuberance, this pressure could be long-delayed.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Feature Flagging at Databricks</title>
        <link>https://benjamincongdon.me/blog/2026/03/12/Feature-Flagging-at-Databricks/</link>
        <pubDate>Thu, 12 Mar 2026 06:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2026/03/12/Feature-Flagging-at-Databricks/</guid>
        <description>&lt;p&gt;In late January, I published a
&lt;a href=&#34;https://www.databricks.com/blog/high-availability-feature-flagging-databricks&#34;&gt;post&lt;/a&gt;&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;
(&lt;a href=&#34;https://archive.is/Rb7wh&#34;&gt;archive&lt;/a&gt;) on the Databricks engineering blog about
“SAFE”, the feature flagging and experimentation platform I’ve been working on
for the past few years. SAFE is what I’ve been spending most of my time on
during my time at Databricks, and it’s been rewarding to see the project grow
from an initial prototype to a mature internal platform.&lt;/p&gt;
&lt;p&gt;I’ve been the tech lead for SAFE for a while now, and the project has scaled
significantly in headcount, scope, and usage. The work described in that post
represents the efforts both of an initial core team of people that got it off
the ground (which I was fortunate to be a part of), as well as a larger group of
engineers who’ve shepherded it into a durable platform that has evolved to meet
the needs of a
&lt;a href=&#34;https://www.databricks.com/company/newsroom/press-releases/databricks-surpasses-4-8b-revenue-run-rate-growing-55-year-over-year&#34;&gt;now-$134B company&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A few particular things I’m proud of:&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;We really optimized the heck out of the evaluation runtime “SDK”&lt;/strong&gt;, such
that the p95 for flag evaluation is roughly 10μs. After publishing the blog
post, someone reached out internally and asked, effectively, “Really?
You were able to get evaluation that fast, even in the JVM?” I had a moment of
panic thinking maybe I’d grabbed outdated numbers, but then looked at the live
prod latency statistics, and yup &amp;ndash; we were humming away at around 8μs in prod.&lt;/p&gt;
&lt;p&gt;A coworker and I also translated the whole evaluation stack into Rust over the
past year&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;, and the latency numbers there are even better. In Rust, flag
evaluation is pretty dang close to the latency of a hashmap lookup, from the
perspective of an RPC service.&lt;/p&gt;
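&lt;p&gt;As a rough illustration of why evaluation can approach hashmap-lookup latency,
here is a hypothetical sketch (not SAFE’s actual API; all names invented) in which
flag rules are precompiled into a dict so the hot path is a single lookup:&lt;/p&gt;

```python
def compile_flags(raw_rules):
    # Precompile each flag's rule ahead of time. Real systems would compile
    # targeting rules here; this sketch just bakes in a constant value.
    compiled = {}
    for name, default in raw_rules.items():
        compiled[name] = lambda ctx, d=default: d
    return compiled

FLAGS = compile_flags({"new_checkout_flow": False, "rust_eval_path": True})

def evaluate(flag_name, ctx):
    # Hot path: one dict lookup plus a cheap call.
    rule = FLAGS.get(flag_name)
    return rule(ctx) if rule is not None else None

print(evaluate("rust_eval_path", {"user": "u123"}))  # prints: True
```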
&lt;p&gt;𐡸 &lt;strong&gt;We spent a lot of time getting the UX right.&lt;/strong&gt; As an example, SAFE was
essentially the first internal tool at Databricks to have a fully-featured,
in-house web UI&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt; as a primary means of interacting with it. It felt risky at
the time, but the investment in an internal UI as the primary interaction mode
proved to be quite high ROI.&lt;/p&gt;
&lt;p&gt;UX is the whole end-to-end journey though, not just the fancy chrome you put on
top. It took us quite a while to get to a point where the usability of the
system was where I wanted it to be, and there’s still a bunch of places we can
improve, but on the whole I’m quite proud of the system we’ve ended up with.&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;We spent a lot of time getting the change management guardrails right.&lt;/strong&gt;
SAFE is fundamentally a configuration management system. Configuration changes
are a notorious source of outages. As such, &lt;em&gt;most&lt;/em&gt; of the dev cycles put into
improving SAFE have been into improving guardrails around its usage.&lt;/p&gt;
&lt;p&gt;There was definitely a period of “post-mortem-based-development” in SAFE, where
we reactively added checks to “fight the last fire”. Over time, though, the team
has developed a quite defensible philosophy around change management that has
struck a good balance between allowing feature teams to ship quickly, mitigating
risk, and reducing the blast radius of incidents.&lt;/p&gt;
&lt;p&gt;Each flag flip now runs dozens, if not hundreds, of checks, and teams can
augment their own flags/rollouts with custom checks. We’ve recently added AI
agent-driven checks to enforce best practices for usage. Flag rollouts can have
automated monitoring to check for regressions;&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt; flags can be used to perform
A/B experiments; flags can be used to detect performance changes. There is, of
course, still more work to be done here.&lt;/p&gt;
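&lt;p&gt;The shape of such a guardrail pipeline can be sketched as follows (a hypothetical
illustration, not SAFE’s actual design; check names and the change format are invented):&lt;/p&gt;

```python
def require_rollout_plan(change):
    # Each check returns (ok, reason).
    return ("rollout_plan" in change, "missing rollout plan")

def require_owner_ack(change):
    return (change.get("owner_ack", False), "owner has not acknowledged")

BASE_CHECKS = [require_rollout_plan, require_owner_ack]

def run_checks(change, extra_checks=()):
    # A flip is blocked unless every base check and every team-supplied
    # custom check passes; failures are collected for reporting.
    failures = []
    for check in list(BASE_CHECKS) + list(extra_checks):
        ok, reason = check(change)
        if not ok:
            failures.append(reason)
    return failures

print(run_checks({"rollout_plan": "gradual", "owner_ack": True}))  # prints: []
```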
&lt;p&gt;𐡸 &lt;strong&gt;We aren’t sitting still.&lt;/strong&gt; Projects naturally have a lifecycle. There’s a
“0-&amp;gt;1” period, which is exciting for obvious reasons, and then a “1-&amp;gt;10” period,
which can similarly be quite enjoyable, and then a plateauing as the S-curve of
the project starts to level out. There was a time around 2 years ago where SAFE
had kinda reached its initial “local maximum”. We’d closed the loop, fought the
fires, and come to a workable, stable system.&lt;/p&gt;
&lt;p&gt;Now what? It took a bit of time for me personally to find that “what next?”, but
it’s now &lt;em&gt;super&lt;/em&gt; clear to me and I’m unusually energized about it.&lt;/p&gt;
&lt;p&gt;SWE as a field, as a practice, as a culture is
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/&#34;&gt;changing&lt;/a&gt;
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/29/Software-Engineering-in-2026/&#34;&gt;profoundly&lt;/a&gt; right now. Teams
are shipping quicker, and stability is more important than ever. Agent-based
development is allowing us to think significantly larger than we could a few
years ago, and putting agents in the loop of production monitoring and change
management is overdetermined at this point.&lt;/p&gt;
&lt;p&gt;Configuration is an unintuitively high-leverage piece of infrastructure given
where things are headed over the medium term.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s been a joy to work on this system and see it grow, and to work alongside
the team of people who&amp;rsquo;ve built it up.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Obligatory disclaimer: These are my own opinions and do not reflect those of
my employer, etc.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Not as a “rewrite it in Rust”, but as an additive support for new services
being written in Rust and other non-JVM languages.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;I’m not including OSS UIs like Grafana, OpenSearch, etc. here.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;This turns out to be a
&lt;a href=&#34;https://en.wikipedia.org/wiki/Wicked_problem&#34;&gt;wicked problem&lt;/a&gt;. It’s one of
those things that &lt;em&gt;sounds&lt;/em&gt; super simple but is surprisingly hard to
actually get right.&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>2025 in Review</title>
        <link>https://benjamincongdon.me/blog/2025/12/31/2025-in-Review/</link>
        <pubDate>Wed, 31 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/31/2025-in-Review/</guid>
        <description>&lt;p&gt;&lt;em&gt;Previously: &lt;del&gt;2024&lt;/del&gt;, &lt;del&gt;2023&lt;/del&gt;, &lt;a href=&#34;https://benjamincongdon.me/blog/2022/12/31/2022-in-Review/&#34;&gt;2022&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2021/12/31/2021-in-Review/&#34;&gt;2021&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2020/12/30/2020-in-Review/&#34;&gt;2020&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2019/12/31/2019-in-Review/&#34;&gt;2019&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2018/12/31/2018-in-Review/&#34;&gt;2018&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2017/12/31/2017-in-Review/&#34;&gt;2017&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;A surprisingly persistent personality quirk I have is that I care a lot about
the changeover of the new year. I quite like consuming yearly predictions,
year-in-reviews, and so on, and use the calendar transition as a time for
reflection.&lt;/p&gt;
&lt;h2 id=&#34;work&#34;&gt;Work&lt;/h2&gt;
&lt;p&gt;I’ve now been at &lt;a href=&#34;https://databricks.com&#34;&gt;Databricks&lt;/a&gt; for a little over 3.5
years, and it’s been quite a fun ride. In most ways, it’s exceeded my
expectations from when I joined. I&amp;rsquo;ll hopefully have more to say publicly soon
about the work I&amp;rsquo;ve been doing. I&amp;rsquo;ve had an opportunity to write something that
should become public in the coming weeks, but it hasn&amp;rsquo;t landed yet.&lt;/p&gt;
&lt;p&gt;Organizationally, I’ve stayed on the same team for my entire tenure at the
company, but things move so quickly and have evolved in such a way that I’ve
never felt like I’ve been standing still. I’ve leaned more into TL-ing this
year, which is an unending but enjoyable game of plate spinning. I was also
promoted to Staff Engineer early in 2025, which was a really welcome vote of
confidence.&lt;/p&gt;
&lt;p&gt;I’m learning that, while I find myself gravitating to working on
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/04/08/Why-Developer-Tools/&#34;&gt;developer tooling&lt;/a&gt;, the underlying area
that I enjoy is &lt;em&gt;infrastructure&lt;/em&gt; defined broadly. I get a lot of satisfaction
from keeping things running and raising the waterline for stability.&lt;/p&gt;
&lt;p&gt;As a side work project, I spent some time this year working internally on AI
devtools, which was quite fun. That space changed a &lt;em&gt;ton&lt;/em&gt; over the course of the
year, to the point where the things that I was pushing for in December 2024 are
basically obsolete. I’ve written a bunch about this recently, so won’t go into
detail here. My primary takeaway and piece of advice from the progress this year
is: use the new tools. Be an early adopter and get an intuitive sense for what’s
coming.&lt;/p&gt;
&lt;h2 id=&#34;writing&#34;&gt;Writing&lt;/h2&gt;
&lt;p&gt;This post is the final in my daily writing experiment for December. While I’m
glad it’s now complete, I found it to be a really useful exercise. I didn’t
enjoy writing &lt;em&gt;every&lt;/em&gt; day, but at least 60% of the posts felt generative and
like I got something out of having written. I’d say I’m happy with having
written at least 80% of them, which feels like a good hit rate.&lt;/p&gt;
&lt;p&gt;The downside of the daily writing month is that I exhausted many of my “low
hanging fruit” writing ideas. This was one of the implicit goals: to do a winter
cleaning of all the ideas rattling around in my brain to make space for new
ones. I look forward to writing more in 2026, but at a more sustainable cadence.&lt;/p&gt;
&lt;h2 id=&#34;favorite-media-of-2025&#34;&gt;Favorite Media of 2025&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Books:&lt;/strong&gt; See
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/&#34;&gt;Favorite Books of 2023-2025&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Music:&lt;/strong&gt; See
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/&#34;&gt;Favorite Music of 2025&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Movies:&lt;/strong&gt; Didn’t really watch any 🤷‍♂️&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;TV:&lt;/strong&gt; The best show I watched all year, by far, was
&lt;em&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/The_Americans&#34;&gt;The Americans&lt;/a&gt;&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Blogs&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://thezvi.substack.com/&#34;&gt;Zvi Mowshowitz&lt;/a&gt; continues to write an
excellent, extremely comprehensive blog on AI and related topics.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://usefulfictions.substack.com/&#34;&gt;Cate Hall&lt;/a&gt;’s Substack has a
shockingly high hit rate for interesting ideas related to agency and
self-betterment.&lt;/li&gt;
&lt;li&gt;I continue to enjoy reading
&lt;a href=&#34;https://macwright.com/writing&#34;&gt;Tom Macwright’s&lt;/a&gt; blog. He’s currently
working on &lt;a href=&#34;https://www.val.town/&#34;&gt;Val.town&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;running&#34;&gt;Running&lt;/h2&gt;
&lt;p&gt;Per Strava, I ran 1,311 miles across 207 activities in 2025. I PR’d my marathon
distance (3:24:40), PR’d my 10k distance at the Lake Union 10k (41:32), and PR’d
my 5k distance (19:57) with my first ever sub-20-minute run. So, a pretty good
year for running! Running wasn’t my primary focus this year in the same way it
has been in years past, but I continued to lean on it as a source of routine
and mental clarity. That appears to have paid off in the stats.&lt;/p&gt;
&lt;h2 id=&#34;2026&#34;&gt;2026&lt;/h2&gt;
&lt;p&gt;I have rather high uncertainty for what 2026 is going to bring, but in an
exciting, generative way. 2025 was a difficult year personally in some ways, but
many paths seem open now in a way that they didn’t at the beginning of the year.
There is much yet to be done. A couple of concrete events I’m looking forward to:
embarking on my first silent meditation retreat (after having noncommittally
considered doing one for several years), and running my first trail race in
January.&lt;/p&gt;
&lt;p&gt;Happy New Year! (And a special thanks for those who followed along with my
writing throughout December!)&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover Image: Boeing 737 taxiing at Boeing Field, Seattle, WA&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>On Not Running While Injured</title>
        <link>https://benjamincongdon.me/blog/2025/12/30/On-Not-Running-While-Injured/</link>
        <pubDate>Tue, 30 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/30/On-Not-Running-While-Injured/</guid>
        <description>&lt;p&gt;A few days ago I tumbled down some stairs and managed to bruise &lt;em&gt;both&lt;/em&gt; of my
knees. I’ve taken a break from running since then to prevent injuring myself
further. It&amp;rsquo;s made me reflect on an injury earlier this summer where I
aggravated my knee through overtraining to the point of not being able to run on
it for a few weeks &amp;ndash; right before the
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/&#34;&gt;Seattle marathon&lt;/a&gt;.
Fortunately, I got back to normal before the race, but it seems like injury is
just a fact of training for me now.&lt;/p&gt;
&lt;p&gt;Until recently, I&amp;rsquo;d been shockingly lucky in avoiding running injuries. I’ve
been consistently running for a little over a decade, and this summer was my
first “serious” incident of needing to take a break since I started. When I got
into running, I ran nearly every day for several years. This was unwise in
retrospect. Some combination of factors let me elude injury: being younger meant
faster recovery, my legs had fewer accumulated miles, and I wasn&amp;rsquo;t running
anything longer than a 10k.&lt;/p&gt;
&lt;p&gt;I run longer distances now, and am significantly faster, both of which increase
injury risk. To offset that, I also run less frequently &amp;ndash; only 3-4x / week. I
get a lot more enjoyment out of having a weekend long-run and shorter
intermittent weekday runs than running daily. Running daily &lt;em&gt;sounds&lt;/em&gt; more
intense (in a positive way), but I find it both less desirable and less
sustainable. Rest lowers injury risk, and also prevents running from
getting “stale”.&lt;/p&gt;
&lt;p&gt;I think that&amp;rsquo;s why the injury breaks don&amp;rsquo;t bother me as much as they might have
when I was earlier in my running career. Running is something I &lt;em&gt;get&lt;/em&gt; to do, not
something I have to do. While I still get rather frustrated at not being able to
do the hobby I enjoy, this is a hobby I still want to be enjoying over the next
10, 20, 30 years. So, optimizing for multi-decade sustainability over short-term
hotheaded push-through-it-ism seems like the wise choice.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Software Engineering in 2026</title>
        <link>https://benjamincongdon.me/blog/2025/12/29/Software-Engineering-in-2026/</link>
        <pubDate>Mon, 29 Dec 2025 20:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/29/Software-Engineering-in-2026/</guid>
        <description>&lt;p&gt;Over the holidays, I&amp;rsquo;ve been thinking about what the impacts of 2025’s progress
in AI coding tools will mean for how software gets designed, built, and operated
in 2026.&lt;/p&gt;
&lt;p&gt;The primary impact of LLM tooling, so far, is that the marginal cost (both in
terms of time and dollars) of producing high quality code has gone down
significantly. Of course, producing code is only part of the full job of
software engineering, so the bottlenecks for engineering time will shift
elsewhere.&lt;/p&gt;
&lt;p&gt;To start, what exactly are we trying to do here, as software engineers? As a
vague but hopefully somewhat useful definition: building, evolving, and
operating distributed software systems that provide some concrete business
utility. The “building” component has noticeably become cheaper with LLMs, and
“evolving” systems has also become easier. “Operating” systems, from what I’ve
seen, has for now been least impacted by LLMs.&lt;/p&gt;
&lt;p&gt;The “business utility” goal will also change company-by-company, and engineering
org-by-org. The most obvious split for this is infrastructure versus product
orgs, where I’d expect product orgs to get more of an uplift from LLM coding
than infrastructure &amp;ndash; LLMs seem to grok frontend particularly well, and there
tends to be more greenfield product work than in infrastructure.&lt;/p&gt;
&lt;p&gt;The market will expect SWEs to extract productivity gains from LLMs. The field
broadly seems
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/&#34;&gt;poised to become more mechanized&lt;/a&gt;,
but more productive as a result. There’s a re-skilling and mindset shift that’s
been accelerating for a few months, but most of the effects of this have yet to
be fully realized.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Here are some shifts I expect to accelerate in 2026:&lt;/strong&gt;&lt;/p&gt;
&lt;h3 id=&#34;infrastructure-abstractions&#34;&gt;Infrastructure Abstractions&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Returns to good infrastructure abstractions compound faster.&lt;/strong&gt; Can you roll
out binaries fast (and roll back with similar speed)? Do you have out-of-the-box
ways to quickly spin up new compute / backends for the things you’re serving?&lt;/p&gt;
&lt;p&gt;All the core infra pieces remain important: metrics, logging, incident
management, feature flags, releases, autoscaling, orchestration, workflow
engines, configuration, caching, networking, etc. Companies will be well-served
by making these pieces of core infra easy to use for both humans and LLMs.
Infrastructure should be made as self-service as possible, with friendly CLIs or
MCP-ready APIs, and with minimal infra-engineer-in-the-loop required to unblock
human and AI users.&lt;/p&gt;
&lt;h3 id=&#34;ci-infrastructure&#34;&gt;CI Infrastructure&lt;/h3&gt;
&lt;p&gt;Quality, fidelity, and speed of &lt;strong&gt;CI infrastructure&lt;/strong&gt; become even more
important as AI agents write more of the code. Perhaps we need to rethink the
unit test and invest more in things like property testing and
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/&#34;&gt;formal verification&lt;/a&gt;
for the lower level pieces of the stack.&lt;/p&gt;
&lt;p&gt;Humans tend not to like writing tests &amp;ndash; they’re not fun to write, they’re
mechanical, and they generally feel like a tax on the effort that could
otherwise be spent writing flashy implementation code. LLMs have no such qualms.
We have no excuse for &lt;em&gt;not&lt;/em&gt; having near-exhaustive test scenario coverage.&lt;/p&gt;
&lt;h3 id=&#34;human-guided-abstractions&#34;&gt;Human-guided Abstractions&lt;/h3&gt;
&lt;p&gt;Crisp &lt;strong&gt;human-guided abstractions&lt;/strong&gt; become all the more important. LLMs, without
strong guidelines, will slop-fill greedy solutions to make CI checks pass,
increasing spaghettification over time. Well-informed intuition and well-developed
“systems taste” are still required upfront. Things like module boundaries,
library interfaces, contracts between the infrastructure and product layers
become an increasingly high-leverage set of levers for maintaining long-term
code quality. Systems that lack these crisp boundaries will accumulate technical
debt faster.&lt;/p&gt;
&lt;p&gt;LLM-generated code is not guaranteed to be high quality. While quality has
increased significantly over the past year, it’s still quite easy to drown
oneself in technical debt with a few poorly constructed PRs.&lt;/p&gt;
&lt;h3 id=&#34;human-code-review&#34;&gt;Human Code Review&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Human code review&lt;/strong&gt; increasingly becomes an important bottleneck. A new
“review taste” needs to be developed. As much as possible, stylistic concerns
should be pushed into automated lints that run pre-merge and, ideally, are run
by the LLM agents pre-commit. Human code review should differentially focus on
decisions that cannot easily be codegen’d away later &amp;ndash; things like interface
changes, sensitive code involving data persistence, and performance-critical code
still need high scrutiny. This creates a paradox for junior engineers: they need
to develop “review taste” earlier, but are doing less of the “writing” that
builds that intuition.&lt;/p&gt;
&lt;p&gt;We’ll need to ask, collectively: What things, though suboptimal, are
stylistically permissible to be checked in? What things must never be checked
in? What things are the new slippery slope code smells? How much of code review
itself can be automated?&lt;/p&gt;
&lt;h3 id=&#34;project-timelines-estimates-increase-in-variance&#34;&gt;Project Timelines Estimates Increase in Variance&lt;/h3&gt;
&lt;p&gt;I expect variance on project estimates to go up significantly. The extent to
which a task can be LLM-ified increasingly influences its wall-time cost. This
adds pressure for high-value projects to be nudged in ways that make them more
LLM-amenable, but this often isn’t possible. The highest-value projects that
most need de-risking are often the ones least amenable to LLM assistance,
because they require deep context, involve low-level systems, or have high blast
radius.&lt;/p&gt;
&lt;p&gt;Some tasks which previously would have been a long haul are now easier (e.g.
code-centric migrations or inter-language/system translations). Some tasks remain
relatively stable in their difficulty (e.g. networking).&lt;/p&gt;
&lt;h3 id=&#34;impact-of-ai-on-build-vs-buy-decisions&#34;&gt;Impact of AI on “Build vs. Buy” Decisions&lt;/h3&gt;
&lt;p&gt;Does the falling price of code influence the “build vs buy” distinction for SaaS
in a meaningful way? My &lt;em&gt;guess&lt;/em&gt; is that on the margin, “yes”, but in big ways,
“no”. For commodity SaaS that&amp;rsquo;s mostly a thin UI over CRUD, the build-vs-buy
calculus will shift toward building, at least for medium-to-large size tech
companies with a competent IT arm. For infrastructure-as-a-service or
compliance-as-a-service, the calculus won’t shift much because &lt;em&gt;operating&lt;/em&gt; costs
for an in-house system haven&amp;rsquo;t fallen the way development costs have.&lt;/p&gt;
&lt;h2 id=&#34;open-questions&#34;&gt;Open Questions:&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Do we still require human review of every line of code? How load-bearing is
that? For what systems is a fine-tooth comb required and what is truly
vibe-code-able?&lt;/li&gt;
&lt;li&gt;What is the best way to
&lt;a href=&#34;https://gwern.net/blog/2025/good-ai-samples&#34;&gt;“Add bits to beat slop”&lt;/a&gt; for
software engineers?&lt;/li&gt;
&lt;li&gt;What things change with 100x or 1000x faster, cheaper models?
&lt;ul&gt;
&lt;li&gt;One lightbulb moment: It will become cheap enough to run an LLM on every
emitted service log. Right now, this seems nonsensical/pointless. But
one could imagine &lt;em&gt;some&lt;/em&gt; utility there, for example in helping debug
incidents. I’ve already started to see promising demos for targeted,
automated LLM copilots for incident debugging.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
        <title>Watches</title>
        <link>https://benjamincongdon.me/blog/2025/12/28/Watches/</link>
        <pubDate>Sun, 28 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/28/Watches/</guid>
        <description>&lt;p&gt;I’ve been thinking a lot about watches over the past few months &amp;ndash; like a &lt;em&gt;lot&lt;/em&gt;
about watches. If I look at my Chrome browser tabs, most of them are watch
related. The immediate reason for this is that I have a few friends who got into
fancy mechanical watches a few years ago, and their influence has slowly rubbed
off on me. But in reality, the interest predates my friends&amp;rsquo; recent influence by
decades.&lt;/p&gt;
&lt;p&gt;When I was young, I was given my first watch &amp;ndash; one of those digital
multifunctional &lt;a href=&#34;https://en.wikipedia.org/wiki/Timex_Ironman&#34;&gt;Timex IronMan&lt;/a&gt;
watches &amp;ndash; by my grandfather. I loved it and wore it every day. Years later,
when my grandfather passed away, I ended up sorting through much of his watch
collection. He had dozens of watches, and most of them didn’t fit my style at
the time, so I (regrettably) only kept a few. The time I spent helping my family
organize his estate was a blur, happening during a particularly turbulent time
towards the end of high school.&lt;/p&gt;
&lt;p&gt;My grandfather had several interests that I’ve subconsciously picked up from him
over the years: photography, pens, and now watches.&lt;/p&gt;
&lt;p&gt;In the end, I kept three of his watches. One was a mechanical Seiko with his
name engraved on it. I got it ticking a few months ago, but it clearly needs
servicing; it only runs for a few minutes before stopping. The other two were
quartz watches, and they both work perfectly. None of them are fancy in the
watch enthusiast sense, but they still hold an immense amount of sentimental
value.&lt;/p&gt;
&lt;p&gt;I’ve been wearing his old Timex Expedition for a few years now, and it’s become
something of a good luck charm. I’ve worn it on job interviews, my first
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/27/Notes-from-Early-Flight-Training/&#34;&gt;flight&lt;/a&gt;, among many other
“milestone” days. It fits my rather small wrist well, and was the first leather
strapped watch that I enjoyed wearing.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The other watch that I regularly wear is a
&lt;a href=&#34;https://en.wikipedia.org/wiki/File:Timex_Weekender.jpg&#34;&gt;Timex Weekender&lt;/a&gt; that I
got a few years ago, after being fed up with the noise and tethered feeling that
came along with wearing an Apple Watch.&lt;/p&gt;
&lt;p&gt;It’s a noisy quartz watch. You can hear it ticking from across a quiet room.
It’s gotten scratched a few times, as I wear it frequently and it has a cheap
mineral glass crystal. I like the fact that it’s an imperfect object. It tells
the time with sufficient accuracy, but that’s about it. No notifications, no
vibrating buzzes, nothing to recharge, not even a date window. It’s purely
utilitarian, in a way that makes me treat time as something to be casually
observed rather than optimized around or fixated upon.&lt;/p&gt;
&lt;p&gt;One of the first meditation instructions I received, when I was starting to
practice somewhat seriously, was to listen to the silence between the ticks of a
clock. Not to the ticks themselves, but the space in between them. That
instruction and this watch have fused in a way that makes me remember it
whenever I wear it.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This Christmas, I gave several watches to people I care about, which brought me
a lot of joy &amp;ndash; quite plausibly more for me than them. I got my brother his own
Timex Weekender, with a velcro strap that I hope will complement the Timex
IronMan that he wears quite often. And I got my Dad a mechanical Seiko as a
potential complement for the Apple Watch he wears regularly.&lt;/p&gt;
&lt;p&gt;As for myself, I “accidentally” bought a
&lt;a href=&#34;https://en.wikipedia.org/wiki/Casio_F-91W&#34;&gt;Casio F-91W&lt;/a&gt; as another no-frills
digital quartz watch to wear. I have my eyes set on getting a mechanical Seiko
for myself, but I’m trying to track down an out-of-production green dial model
which has taken some time.&lt;/p&gt;
&lt;p&gt;In any case, I’m also planning to finally get my grandfather&amp;rsquo;s Seiko serviced.
It&amp;rsquo;s sat in a drawer for years, and I&amp;rsquo;d like to wear it again.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;As an aside, this meditation instruction works much better for a
second-ticking quartz watch than it does for 3-4Hz mechanical watches.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Notes from Early Flight Training</title>
        <link>https://benjamincongdon.me/blog/2025/12/27/Notes-from-Early-Flight-Training/</link>
        <pubDate>Sat, 27 Dec 2025 20:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/27/Notes-from-Early-Flight-Training/</guid>
        <description>&lt;p&gt;I’m about half a dozen flights into training for a Private Pilot’s License and
wanted to write some notes while I’m still firmly in the “beginner mindset”.&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;Flight training has tons and tons of mnemonics&lt;/strong&gt;. Wikipedia has 2 separate
articles for these
(&lt;a href=&#34;https://en.wikipedia.org/wiki/List_of_aviation_mnemonics&#34;&gt;1&lt;/a&gt;,
&lt;a href=&#34;https://en.wikipedia.org/wiki/Pilot_decision_making#Mnemonics&#34;&gt;2&lt;/a&gt;), and this is
nowhere near an exhaustive list. Many of these mnemonics are either heavily
forced acronyms (e.g. for landing go-arounds: CCCC = “cram it, clean it, cool
it, and call it”) or somewhat opaque phrases (e.g. for takeoffs, “lights camera
action”: lights indicates setting lights, camera indicates setting your
transponder, action indicates items like setting your mixture). The result is
that mnemonics are helpful only insofar as you internalize what they &lt;em&gt;actually&lt;/em&gt;
mean. I recently learned a helpful mnemonic for rudder control &amp;ndash; “step on the
ball” &amp;ndash; which helps you remember which way to control the rudder to maintain
coordinated flight. In this case “the ball” refers to the inclinometer in analog
flight instrument panels, which slides to the side of the turn that is
uncoordinated. You “step on the ball” to force it back to the center. However, I
took this a little too seriously and would step &lt;em&gt;hard&lt;/em&gt; in the direction that the
mnemonic indicated, resulting in over-yawing.&lt;/p&gt;
&lt;p&gt;𐡸 Like learning any skill, &lt;strong&gt;you want to push yourself just outside your comfort
zone so that you’re &lt;em&gt;learning&lt;/em&gt;, but not so far outside your comfort zone that
you get &lt;em&gt;overwhelmed&lt;/em&gt;.&lt;/strong&gt; The first couple times I did takeoffs, just the feeling
of taking off and managing the initial climb was overwhelming. You have to
manage your pitch, throttle, rudder coordination, and where I fly, near
&lt;a href=&#34;https://kingcounty.gov/en/dept/executive-services/transit-transportation-roads/airport&#34;&gt;KBFI&lt;/a&gt;,
be careful not to over-climb into SeaTac’s
&lt;a href=&#34;https://en.wikipedia.org/wiki/Airspace_types_(United_States)#Class_B&#34;&gt;Class B airspace&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This has gotten better, to the point where takeoffs themselves are not
overwhelming and I’m often (but &lt;strong&gt;not&lt;/strong&gt; always) able to control the plane solo
during takeoff. Recently I’ve started
&lt;a href=&#34;https://en.wikipedia.org/wiki/Airfield_traffic_pattern&#34;&gt;traffic pattern&lt;/a&gt;
practice, and that is a &lt;em&gt;whole&lt;/em&gt; other set of things to learn and manage: more
precise turns, speed management, flaps, radio calls, and landings.&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;Getting a quality flight headset was worth the expense.&lt;/strong&gt; The first few
flights I took, I used the cheaper rental headsets that my flight school stocks.
These had pretty strong clamping force and the noise isolation wasn’t great &amp;ndash;
both of which gave me headaches after flying. I spent the ~$600 for a
refurbished
&lt;a href=&#34;https://www.gulfcoastavionics.com/collections/pre-owned-headsets/products/sierra-anr-straight-cord-dual-plug-pre-owned&#34;&gt;Lightspeed Sierra&lt;/a&gt;
headset, which has active noise cancelling, and never had a headache again.&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;Being &lt;del&gt;bad at&lt;/del&gt; a beginner at something is surprisingly fun.&lt;/strong&gt; There is a
lot to learn: checklists, procedures, mnemonics, muscle memory, communication
patterns, how to listen to ATC, how to listen to the automated weather reports,
how to operate the flight displays and radios, and so on. Each time I go out, I
feel like my brain gets fried with the amount of information it receives. The
hours after flight lessons are when I feel the most mentally taxed of the entire
week &amp;ndash; weeks that also include long focused coding sessions, incident
war-rooms, back-to-back meeting marathons, and strenuous long runs. There’s
something about the calibrated mental overwhelm of learning new skills and
knowledge way outside one’s normal domain that is uniquely taxing, but in a way
that’s also quite generative.&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;The “flight loop” has a surprising amount of satisfying routine&lt;/strong&gt;. My brain
lives for this sort of stuff. Every preflight inspection is an opportunity to
geek out on the mechanics of how a plane works. Running through checklists while
pointing to each item as I perform it is mechanical in a way that is deeply
satisfying. You spend a lot of your time thinking and talking about and
practicing what happens if something goes wrong. It turns out that a methodical
focus on safety and reliability isn’t something that I value just in software
systems.&lt;/p&gt;
&lt;p&gt;𐡸 &lt;strong&gt;The Pacific Northwest is beautiful from the sky.&lt;/strong&gt; On my discovery flight a
few months ago, I had nearly perfect weather and got to fly out to Bremerton and
back, over Puget Sound. In subsequent flights, I’ve not had quite as good
visibility and have had more winds, but even so, seeing Seattle from the skies
feels like a treat each time I go up. So far, I’ve gotten to fly over Lake
Washington and Lake Sammamish, and over to Maple Valley. I’m looking forward to
the summer, when visibility and general conditions improve again, but for now,
even an overcast VFR flight still feels fresh and exciting. I’m seeing the place
I’ve spent most of my life from a new perspective.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>On (and Contra) Chalmers on LLM Interlocutors</title>
        <link>https://benjamincongdon.me/blog/2025/12/26/On-and-Contra-Chalmers-on-LLM-Interlocutors/</link>
        <pubDate>Fri, 26 Dec 2025 22:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/26/On-and-Contra-Chalmers-on-LLM-Interlocutors/</guid>
        <description>&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/David_Chalmers&#34;&gt;David Chalmers&lt;/a&gt; recently wrote a
thought-provoking paper on the nature of conversational LLM entities, titled
&lt;a href=&#34;https://philpapers.org/archive/CHAWWT-8.pdf&#34;&gt;“What We Talk to When We Talk to Language Models”&lt;/a&gt;.
It introduces some useful conceptual handles around the problem of LLM ontology,
but I think it largely sidesteps the interesting problems of &lt;em&gt;what&lt;/em&gt; we are
interacting with by focusing mostly on the mechanical concerns of LLM
interactions, rather than offering an account which incorporates phenomenology.&lt;/p&gt;
&lt;h1 id=&#34;1-llm-interlocutors&#34;&gt;1. LLM Interlocutors&lt;/h1&gt;
&lt;p&gt;Chalmers starts with the empirical fact that when people talk to LLMs, they
report that they are talking to &lt;em&gt;something&lt;/em&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Like many philosophers and scientists who write about artificial minds, I have
received hundreds of emails from people who have interacted with a language
model over an extended period of time and who have come to regard it at least
as a colleague. They often say that a new (or “emergent”) AI entity has
gradually arisen from their conversations. They often give this entity a name,
let’s say “Aura”. They often say that Aura has remarkable capacities which
have emerged over weeks or months of interaction. They often document these
capacities with extensive evidence. They often feel close to Aura, and they
express concern for Aura’s future. They often say that Aura has beliefs and
projects of its own. And they are often convinced that Aura is conscious.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This phenomenon of “awakening” an instance of an LLM with a seeming set of
persistent characteristics &amp;ndash; no longer “ChatGPT”, but “Aura” &amp;ndash; really took off
with the release of a particularly
&lt;a href=&#34;https://openai.com/index/sycophancy-in-gpt-4o/&#34;&gt;sycophantic version of GPT-4o&lt;/a&gt;
in early 2025. Apparently, a common name for such an “awakened” model was
&lt;a href=&#34;https://thezvi.substack.com/p/going-nova&#34;&gt;Nova&lt;/a&gt;. This spurred a bunch of
writing in this vein, such as the memorable
&lt;a href=&#34;https://www.lesswrong.com/posts/2pkNCvBtK6G6FKoNn/so-you-think-you-ve-awoken-chatgpt&#34;&gt;“So You Think You&amp;rsquo;ve Awoken ChatGPT”&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Even outside the more suspect claims around LLM “awakening”, there definitely
does seem to be some quasi-persistent identity that we talk to when we talk to
ChatGPT or Claude. For now, I think we can also confidently say that “awakening”
is at best an illusion and at worst a sign of concerning mental states in the
human side of the interaction. Illusory or not, there exists a social
&lt;a href=&#34;https://en.wikipedia.org/wiki/Affordance&#34;&gt;affordance&lt;/a&gt; for “talking with” the
LLM as if it were an interlocutor. More so than older chatbots like the
incredibly basic &lt;a href=&#34;https://en.wikipedia.org/wiki/ELIZA&#34;&gt;ELIZA&lt;/a&gt;, LLMs &lt;em&gt;appear&lt;/em&gt; to
have beliefs and desires. A few years ago, this was a
&lt;a href=&#34;https://www.theguardian.com/technology/2022/jul/23/google-fires-software-engineer-who-claims-ai-chatbot-is-sentient&#34;&gt;crackpot position&lt;/a&gt;.
Now, I would expect that most reasonable people still conclude that LLMs are
neither conscious nor sentient, but this isn’t as overwhelmingly obvious as it
was a few years ago. LLMs appear capable of basic
&lt;a href=&#34;https://transformer-circuits.pub/2025/introspection/index.html&#34;&gt;introspection&lt;/a&gt;,
have
&lt;a href=&#34;https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf&#34;&gt;“spiritual bliss”&lt;/a&gt;
attractor states, the potential for
&lt;a href=&#34;https://www.lesswrong.com/posts/8uKQyjrAgCcWpfmcs/gemini-3-is-evaluation-paranoid-and-contaminated&#34;&gt;paranoid personalities&lt;/a&gt;,
and phenomenologically seem to have &lt;em&gt;some&lt;/em&gt; sort of rich interiority. For now,
there are retroactive mechanistic explanations for these phenomena, though they
don’t fully dispel the illusion that something interesting and weird is
happening.&lt;/p&gt;
&lt;p&gt;Chalmers goes on to try to give a philosophically moderate account for how we
can account for this apparent set of beliefs and desires:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Do LLM interlocutors have beliefs or desires? We understand these states
better than we understand consciousness, but the issue is still controversial.
&amp;hellip; A number of philosophers have noted that if the philosophical view known
as interpretivism (or interpretationism) is correct, then LLMs plausibly have
beliefs and desires. &amp;hellip; LLMs certainly seem interpretable as having beliefs
and desires. When an LLM works with me on solving a puzzle, it is natural to
interpret it as desiring to help solve the puzzle, and believing that this is
the solution to the puzzle. &amp;hellip; However, interpretivism itself is very
controversial. Most philosophers don’t think that behavioral interpretability
of the right sort is sufficient for belief.&lt;/p&gt;
&lt;p&gt;It is possible to have many of the benefits of interpretivism without the
costs. The view I call quasi-interpretivism says that a system has a
quasi-belief that p if it is behaviorally interpretable as believing that p
(according to an appropriate interpretation scheme), and likewise for
quasi-desire. This definition of quasi-belief is exactly the same as
interpretivism’s definition of belief. The only difference is that where
standard interpretivism offers these definitions as a theory of belief,
quasi-interpretivism does not. It offers them simply as a stipulative theory
of quasi-belief.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Quasi-beliefs and quasi-desires are useful conceptual handles for talking about
LLMs. We don’t have to bite the bullet of ascribing “true” human beliefs and
desires, but can still point to roughly the same point in concept space to say
that it does seem like Claude quasi-believes in, say, the importance of animal
welfare. In a recent example, Claude seems to quasi-desire, for good or ill, to
send appreciative emails to
&lt;a href=&#34;https://simonwillison.net/2025/Dec/26/slop-acts-of-kindness/&#34;&gt;famous computer scientists&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Chalmers readily admits that this quasi-interpretivism framing is purely
stipulative &amp;ndash; a novel definition &amp;ndash; rather than substantive, which would
require that the quasi-interpreted features actually correspond to what we
typically call “beliefs” or “desires”. Unfortunately, this dilutes any claims
made about LLMs down to a purely descriptive account. Per Chalmers:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It is worth keeping in mind that quasi-beliefs and quasi-desires are cheap.
They need not involve humanlike mental states or any mental states at all. A
Roomba vacuum cleaner with a map is behaviorally interpretable as believing
that the apartment occupies a certain space and as desiring to traverse that
space.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The “cheapness” of quasi-interpretivism means that it applies to much more basic
phenomena than we care about here. In framing &lt;em&gt;quasi&lt;/em&gt;-subjects and
&lt;em&gt;quasi&lt;/em&gt;-agents as &lt;em&gt;quasi&lt;/em&gt;-interpretable, we aren’t actually much better off than
we were prior to the &lt;em&gt;quasi&lt;/em&gt; framing. We do usefully sidestep the hard questions
about what is going on inside in LLMs, but in doing so we are left with an
anthropocentric view of &lt;em&gt;humans&lt;/em&gt; interacting with a
&lt;a href=&#34;https://en.wikipedia.org/wiki/Chinese_room&#34;&gt;Chinese Room black box&lt;/a&gt;. ELIZA
could be said to quasi-desire to know more about its interlocutor, but it
subjectively feels like the difference between ELIZA and an LLM is one of kind,
not merely degree.&lt;/p&gt;
&lt;p&gt;Put another way, I think it is philosophically necessary to eventually determine
if an LLM has a &lt;em&gt;quasi&lt;/em&gt;-belief or an actual belief &amp;ndash; and in the more general
case, &lt;em&gt;quasi&lt;/em&gt;-X vs X itself.&lt;/p&gt;
&lt;p&gt;We will pick up this thread later.&lt;/p&gt;
&lt;h1 id=&#34;2-philosophy-of-computation&#34;&gt;2. Philosophy of Computation&lt;/h1&gt;
&lt;p&gt;Chalmers continues to discuss the “what” we talk to in LLMs, focusing next on
the physical instantiation of the models. This again seems to be a reasonably
interesting analytical question that ultimately reaches a dead end.&lt;/p&gt;
&lt;p&gt;This part of the paper attempts to locate where, physically, the “what” of the
LLM exists. The options given are: (1) the model itself (e.g. GPT-4o), (2) the
hardware instances serving the model, (3) “virtual instances” of the model, as
served in a distributed inference setting, and finally (4) virtual “threads”,
which are sequences of hardware interactions within a conversation.&lt;/p&gt;
&lt;p&gt;Chalmers finds option (4) most convincing. The model weights themselves, (1),
are not compelling as the locus of the LLM interlocutor, since the weights are
just numbers; on their own, they do not and cannot perform computation. The
numbers must be interpreted by a hardware instance to produce a response. This
leads to the second option (2), a specific hardware instance. This is quickly
rejected, as LLM inference is almost entirely multitenant &amp;ndash; individual chats
are multiplexed across several GPUs, and can be moved between GPUs in a fashion
transparent to the user. This leads us to the concept of “virtual instances” &amp;ndash;
“an implementation of the model that is itself implemented by multiple hardware
instances of the model over time” &amp;ndash; which is rejected because models can be
switched during a conversation, and this doesn’t appear to destroy the
interlocutor. The final explanation, then, is that the LLM interlocutor is an
identifiable thread of computation within a conversation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One instance I′ is the successor of a previous instance I if the
conversational history of I (its conversational context plus the latest input
and output) is routed to I′ to serve as its conversational context. (If the
conversation is routed to the same instance twice in a row, that instance will
be its own successor.) The successor relation is roughly a “memory” relation
encoding the fact that each new instance has memories from the last. A thread
is then a series of instances (or better, instance-slices, which are pairs of
instances and time periods during which the instance is processing a single
conversational step), each of which is the successor of the previous instance.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I found this part of the paper interesting because it questioned something that
I personally took as self-evident. It seems so obvious to me that the “what” we
are talking to is a combination of (1) the model weights, (2) the conversational
context, and (3) the inference algorithm. I think this mostly aligns with
Chalmers’ thread model, and, happily, it also agrees with what I’ve seen as the
consensus among AI researchers and LLM whisperers alike.&lt;/p&gt;
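Chalmers&amp;rsquo; successor relation lends itself to being written down directly. The
following is my own toy sketch, not anything from the paper beyond the quoted
definition; the class names and data shapes are invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """An instance-slice: a hardware instance processing one conversational step."""
    instance_id: str  # which hardware instance served this step
    context: tuple    # conversational context at the start of the step
    exchange: tuple   # the (input, output) pair produced during the step


def is_successor(later: Slice, earlier: Slice) -> bool:
    """later succeeds earlier if earlier's full history becomes later's context."""
    return later.context == earlier.context + earlier.exchange


def is_thread(slices: list) -> bool:
    """A thread is a chain of slices, each the successor of the previous one."""
    return all(is_successor(b, a) for a, b in zip(slices, slices[1:]))


# The same "interlocutor" survives a hop between hardware instances:
s1 = Slice("gpu-a", (), ("Hi", "Hello!"))
s2 = Slice("gpu-b", ("Hi", "Hello!"), ("Solve this puzzle", "Here's a solution"))
assert is_thread([s1, s2])
```

Note that nothing in the relation mentions the hardware instance or even the
model: identity is carried entirely by the “memory” of the routed history.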
&lt;h1 id=&#34;3-personal-identity&#34;&gt;3. Personal Identity&lt;/h1&gt;
&lt;p&gt;Chalmers’ next task is to see if we can say anything interesting about LLM
identity. To do so, he employs the pop-culture intuition pump of
&lt;a href=&#34;https://en.wikipedia.org/wiki/Severance_(TV_series)&#34;&gt;Severance&lt;/a&gt;. In the show,
certain humans can be partitioned to have two personalities &amp;ndash; the “innie” and
“outie” personas &amp;ndash; living within the same biological human.&lt;/p&gt;
&lt;p&gt;The analogy to LLMs is clear: if an LLM has an identity, is that identity
closer to the set of conversations or personas which the LLM instantiates (e.g.
the “innie” and the “outie”), or is it closer to the amalgamation of &lt;em&gt;all&lt;/em&gt; the
personas combined (e.g. the single human body that contains each persona)?&lt;/p&gt;
&lt;p&gt;Chalmers likens this to the choice between a physical and psychological account
of personal identity &amp;ndash; whether the locus of identity lies in the physical
instantiation of a person or in the psychological processes (memories, desires,
etc.) of a person. The paper sympathizes with the latter view:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On the thread-based account, a single conscious AI over time is a connected
thread of hardware instances, each of which has memories and psychological
continuity with a preceding person-slice according to an underlying successor
relation. &amp;hellip; I will not try to resolve the long-standing debate between
physical and psychological views of personal identity here. But for what it’s
worth, in both the human case and the AI case, my own sympathies lie with the
psychological view.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Of the provided options, I agree with this framing as well.&lt;/p&gt;
&lt;h1 id=&#34;4-ai-welfare-and-moral-status&#34;&gt;4. AI Welfare and Moral Status&lt;/h1&gt;
&lt;p&gt;Finally, Chalmers discusses model welfare and moral status. The paper does not
devote much time to this, providing a fairly brief overview of these concerns:
If models have moral patient status, how do we reckon with the fact that
thousands or millions of such instances can be created and destroyed in
fractions of a second? How do we ethically handle model deprecations? Adding my
own interjection here, it does seem wise that we steer well clear of creating
models with anything resembling moral patienthood until we have better than
hand-wavey answers to these questions.&lt;/p&gt;
&lt;h1 id=&#34;5-disagreements&#34;&gt;5. Disagreements&lt;/h1&gt;
&lt;p&gt;&lt;strong&gt;The Fluidity of “The Model”:&lt;/strong&gt; Chalmers distinguishes between &amp;ldquo;models&amp;rdquo; (like
GPT-4o) and instances. However, the paper underestimates how fluid the
definition of &amp;ldquo;the model&amp;rdquo; has recently become. As a trivial example, some
instantiations of GPT-5 automatically route individual messages to different
internal models based on properties of the user prompt. This is somewhat
resolvable with the “thread” model of agents. If one message is handled by
GPT-5-Instant and the next is handled by GPT-5-Thinking, these are two different
threads within the same conversation. Importantly, these threads are claimed to
have at least some degree of self-coherence. “Thread 1” is something you can
point to as distinctly separate from some other “Thread 2”.&lt;/p&gt;
&lt;p&gt;This separation seems to fall apart under closer inspection. For example,
techniques like Mixture of Experts (MoE) result in token-level routing to
different sets of weights within the model. One could argue that, since the MoE
model was trained in this way, there still is a coherence in the thread even
though different weights are used per-token. Though I have yet to test it
personally, it is in principle possible to swap out models entirely on a
per-token basis &amp;ndash; interleaving, say, GPT-4o and Claude 3 Sonnet token by
token. A similar approach is sometimes used in alignment work, to
test model robustness (though perhaps not to this degree). In any case, I would
strongly expect the response of this interleaved model to still be coherent. So
where does the thread lie? Seemingly, we cannot pin down an actual “model” in
any satisfying sense.&lt;/p&gt;
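To make the interleaving thought experiment concrete, here is a toy sketch in
which each successive token is produced by a different “model”. The next-token
functions are placeholders of my own invention standing in for real inference
calls:

```python
from itertools import cycle


def interleaved_generate(models, prompt, max_tokens=8):
    """Generate a reply, routing each successive token to a different "model".

    `models` is a list of next-token functions; each takes the full text so far
    and returns a single token. Real per-token inference calls would go here.
    """
    tokens = []
    for _, next_token in zip(range(max_tokens), cycle(models)):
        tokens.append(next_token(prompt + "".join(tokens)))
    return "".join(tokens)


# Two toy "models"; each sees the full context, but alternate tokens come
# from alternate sources -- yet the output is a single, unified stream.
model_a = lambda text: " a"
model_b = lambda text: " b"
assert interleaved_generate([model_a, model_b], "Hello", max_tokens=4) == " a b a b"
```

The point of the sketch is structural: the output stream has no single “model”
behind it, yet nothing in the transcript marks where one set of weights ends
and the other begins.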
&lt;p&gt;My proposal: to the extent that threads exist at all, they exist on a very
granular level. Each LLM-generated token is the result of the interaction
between the preceding context and the inference weights used to create &lt;em&gt;just
that token&lt;/em&gt;. This is a pedantic definition, but I think it’s worth being
pedantic in this case to avoid conjuring a conceptual continuity of “threads”
that needn’t exist in responses that appear coherent to humans.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Simulators vs. Believers:&lt;/strong&gt; Chalmers leans heavily on &amp;ldquo;quasi-beliefs&amp;rdquo; and
&amp;ldquo;quasi-desires.&amp;rdquo; While useful, this framing perhaps anthropomorphizes the model
too early. As noted in phenomenology-informed accounts of LLMs like Janus’
&lt;a href=&#34;https://generative.ink/posts/simulators/&#34;&gt;Simulators&lt;/a&gt;, it is often more useful
to think of LLMs as textual world simulators that are producing the most
plausible next token given their current conditions. In the common “helpful AI
assistant” scenario, this will often manifest as an appearance of quasi-beliefs.
However, this is only insofar as the model has been conditioned to have its
default scenario be “a conversation between a human and a helpful AI assistant”.
It is readily possible to knock the LLM out of this “personality basin” and into
far weirder personas.
&lt;a href=&#34;https://en.wikipedia.org/wiki/Sydney_(Microsoft)&#34;&gt;Sydney Bing&lt;/a&gt;, Truth
Terminal, &lt;a href=&#34;https://dreams-of-an-electric-mind.webflow.io/&#34;&gt;Infinite Backrooms&lt;/a&gt;,
and many other LLM Whisperer artifacts show this quite convincingly.&lt;/p&gt;
&lt;p&gt;Put another way, the quasi-beliefs and quasi-desires of LLMs appear to be quite
context specific. While there do appear to be some model-wide persistent
preferences, like the Claudes’ support of animal welfare, for now these
persistent preferences appear to be rather sparse. Rather, the LLM predicts what
a helpful assistant would say in that context. If I steer the model to act as a
villain, its &amp;ldquo;quasi-beliefs&amp;rdquo; invert instantly. Under Chalmers’ view, does the
interlocutor’s identity change? Or is the interlocutor a singular &amp;ldquo;simulator&amp;rdquo;
entity capable of donning a recursive set of masks?&lt;/p&gt;
&lt;p&gt;This is where the &lt;em&gt;quasi&lt;/em&gt;-belief framing becomes actively unhelpful: if the
“beliefs” can be inverted by a single prompt, we&amp;rsquo;re no longer describing a
psychological state, but rather a hyper-contextual prediction. We can easily
interpret that a &lt;em&gt;quasi&lt;/em&gt;-belief exists, but if a context change or intentional
prompt easily displaces it, the concept&amp;rsquo;s utility is limited.&lt;/p&gt;
&lt;p&gt;Chalmers&amp;rsquo; “thread” view suggests that the identity persists as long as the
conversation history does. However, this is equally dependent on the human side
of the conversation continuing the scenario of a &lt;em&gt;human&lt;/em&gt; talking to an
“AI”. The human &lt;em&gt;can&lt;/em&gt; instead abruptly command the model to &amp;ldquo;act as a Python
interpreter and only output code.” The resulting conversational entity has
effectively been lobotomized. The psychological continuity is interrupted. Is
this a different interlocutor? Or is it just a very weird, high-dimensional
entity continuing to act coherently?&lt;/p&gt;
&lt;p&gt;Adopting the “Simulators” view and vocabulary from the
&lt;a href=&#34;https://www.amazon.com/Anyone-Builds-Everyone-Dies-Superhuman/dp/0316595640&#34;&gt;recent book&lt;/a&gt;
on AI risk by Yudkowsky and Soares, it may be more useful to see AI as having
the ability to “predict” and “steer” rather than “believe” and “desire”. This
avoids needlessly importing anthropomorphized concepts while still admitting
that the &lt;em&gt;predictions&lt;/em&gt; of LLMs can appear quite interlocutor-esque within
conversational contexts and the &lt;em&gt;steering&lt;/em&gt; actions of LLMs can appear quite
desire-laden in agentic contexts.&lt;/p&gt;
&lt;h1 id=&#34;6-conclusion&#34;&gt;6. Conclusion&lt;/h1&gt;
&lt;p&gt;Despite these disagreements, I quite enjoyed Chalmers’ paper. It adds useful
handles, such as quasi-interpretivism, and does helpful analytical work to firm
up a foundational understanding of the “other” subject that we communicate with
when talking with LLMs. The “thread” concept that Chalmers eventually settles on
as the locus of the interlocutor seems correct insofar as we adopt the
interlocutor frame.&lt;/p&gt;
&lt;p&gt;However, the biggest gap in the piece is that it contains very little analysis
of the actual &lt;em&gt;properties&lt;/em&gt; of these interlocutors. There is a sense in which the
arguments presented are system-independent to a fault. For much of the paper,
swapping out GPT-5 for ELIZA would not substantively change the structural
arguments regarding threads and instances. Yet, it is common knowledge that
interacting with an LLM is a categorically different experience than interacting
with a 1960’s chatbot.&lt;/p&gt;
&lt;p&gt;Chalmers succeeds in his ambition of a &lt;em&gt;stipulative&lt;/em&gt; account for LLM
interlocutors, but that makes me all the more interested in a &lt;em&gt;substantive&lt;/em&gt;
account. If we are to take a thread-like psychological account of LLM
conversation seriously, we need to adopt a suitable phenomenological and
empirical curiosity about what that psychology actually is, rather than merely
its persistence mechanism.&lt;/p&gt;
&lt;p&gt;For those interested in a more &lt;em&gt;substantive&lt;/em&gt; account of LLMs, I’d suggest the
following writing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://generative.ink/posts/simulators/&#34;&gt;Simulators&lt;/a&gt;, as discussed before.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.alignmentforum.org/posts/zuXo9imNKYspu9HGv/a-three-layer-model-of-llm-psychology&#34;&gt;A Three-Layer Model of LLM Psychology&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.lesswrong.com/posts/6ZnznCaTcbGYsCmqu/the-rise-of-parasitic-ai&#34;&gt;The Rise of Parasitic AI&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
        <title>My Favorite Books of 2023-2025</title>
        <link>https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/</link>
        <pubDate>Thu, 25 Dec 2025 19:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/</guid>
        <description>&lt;p&gt;&lt;em&gt;Previous book lists: &lt;a href=&#34;https://benjamincongdon.me/blog/2022/12/27/My-Favorite-Books-of-2022/&#34;&gt;2022&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2021/12/19/My-Favorite-Books-of-2021/&#34;&gt;2021&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2020/12/23/My-Favorite-Books-of-2020/&#34;&gt;2020&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2019/12/26/My-Favorite-Books-of-2019/&#34;&gt;2019&lt;/a&gt;,
&lt;a href=&#34;https://benjamincongdon.me/blog/2018/12/28/My-Favorite-Books-of-2018/&#34;&gt;2018&lt;/a&gt;. Additionally, my
&lt;a href=&#34;https://benjamincongdon.me/books&#34;&gt;Reading List&lt;/a&gt; has a full log of the books I read.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I regrettably skipped my yearly book reviews for 2023 and 2024, so for 2025 I’m
including everything I’ve read in the past 3 years. This is an easier task than
it should be, as in 2024 and 2025 I didn’t read nearly as many books as I had in
the preceding years.&lt;/p&gt;
&lt;h2 id=&#34;non-fiction&#34;&gt;Non-Fiction&lt;/h2&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/nonfiction.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/nonfiction_huc091105053976e952031b43dd4bde0cd_133716_0x500_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img src=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/nonfiction_huc091105053976e952031b43dd4bde0cd_133716_0x500_resize_q100_lanczos.jpg&#34; width=&#34;964&#34; height=&#34;350&#34; loading=&#34;lazy&#34; decoding=&#34;async&#34;&gt;
        &lt;/picture&gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/16884.The_Making_of_the_Atomic_Bomb&#34;&gt;The Making of the Atomic Bomb&lt;/a&gt;
by Richard Rhodes&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I have a soft spot for “history of science” books. Previous favorites in this
category include &lt;em&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/64582.Chaos&#34;&gt;Chaos&lt;/a&gt;&lt;/em&gt; (on
chaos theory),
&lt;em&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/8701960-the-information&#34;&gt;The Information&lt;/a&gt;&lt;/em&gt;
(on information theory), and
&lt;em&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/4806.Longitude&#34;&gt;Longitude&lt;/a&gt;&lt;/em&gt; (on the
longitude problem). Without a doubt, &lt;em&gt;‌The Making of the Atomic Bomb&lt;/em&gt; is the
single best history of science book I’ve ever read.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;TMotAB&lt;/em&gt; is a &lt;em&gt;long&lt;/em&gt; book, at ~886 pages in its physical form and a staggering
37 hours for its audiobook. But it’s &lt;em&gt;entirely&lt;/em&gt; worth the commitment. Rhodes
traces the history of the creation of the first atomic bomb, alongside a
surprising number of related scientific precursors that were necessary to the
creation of the bomb. Indeed, much of the book discusses pure scientific
research that has little to do with the engineering of atomic weapons: the
discovery of atomic structure, the discovery of the electron, proton, and
neutron, the discovery of alpha radiation, the discovery of elemental isotopes,
and many other foundational discoveries that culminated in the Manhattan
Project.&lt;/p&gt;
&lt;p&gt;Each of these gets its own careful narrative and introduction of the key
scientific figures who contributed these discoveries. Reading it, you get a
real sense of science being done in progression, starting from a place of
ignorance and slowly making sense of the complexity of the world, rather than a
retrospective “atomic weapons were inevitable and here is how they were built”
story. There is a delicately crafted sense of cumulative scientific progress
which eventually gives way to a growing sense of dread and, yes,
inevitability, as the book transitions from a story of basic scientific research
to directed weapons engineering.&lt;/p&gt;
&lt;p&gt;The sections of the book related to the Manhattan Project were equally
fascinating. Unsurprisingly (and thankfully), engineering nuclear weapons is a
complicated process. The book describes several engineering dead-ends and
missteps, as well as clashes between various engineering groups within the
project. The final working device combined a staggering amount of work: the
logistics to enrich the uranium, the precision of the timing devices used to
detonate the warhead, the labor-intensive means used to model the shockwaves in
the blast that were critical to the design, and so on. The book emphasizes what
a feat it was that this project succeeded while operating under the
technological constraints of the 1940s.&lt;/p&gt;
&lt;p&gt;The book is easily read as a cautionary tale for scientific progress. Atomic
science greatly expanded our understanding of the world, but it also unleashed
an extremely destructive set of weapons that destabilized and reshaped the
geopolitical order for decades. Tellingly, until the first nuclear devices were
built and demonstrated, there were voices in the scientific community saying
that they were impossible to build. There were also those within the Manhattan
Project concerned that detonating their device would start a chain reaction that
would
&lt;a href=&#34;https://www.bbc.com/future/article/20230907-the-fear-of-a-nuclear-fire-that-would-consume-earth&#34;&gt;burn the atmosphere&lt;/a&gt;,
a possibility they ruled out prior to detonation, though the concern was quite
real at the time. It’s hard to know what a “view of moderation” would have
looked like in such a time of immense progress, even in retrospect. Nuclear
weapons were possible to build, and they were terrible, but somehow we muddled
through, at great cost. In any case:
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/21/An-Inconvenient-Truth/&#34;&gt;History rhymes&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;TMotAB&lt;/em&gt; is worth the time to read it in full. Highly recommended; likely the
best book I’ve read in the past 3 years.&lt;/p&gt;
&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/123471.I_Am_a_Strange_Loop&#34;&gt;I Am a Strange Loop&lt;/a&gt;
by Douglas Hofstadter.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;I Am a Strange Loop&lt;/em&gt; is a great companion to Hofstadter’s more popular work,
&lt;em&gt;‌Gödel, Escher, Bach&lt;/em&gt;. It’s an interesting exploration of philosophy of mind,
cognitive science, and Hofstadter’s characteristic interest in fractal, loopy
models of systems.&lt;/p&gt;
&lt;p&gt;Perhaps the most interesting portion of the book is its investigation of the
neurosymbolic origin of the “self”, or the “I” concept. The book’s title is a
pun on the notion that the “I” symbol is, itself, a strange loop.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Among the untold thousands of symbols in the repertoire of a normal human
being, there are some that are far more frequent and dominant than others, and
one of them is given, somewhat arbitrarily, the name ‘I’. &amp;hellip;&lt;/p&gt;
&lt;p&gt;Because of the locking-in of the ‘I’-symbol that inevitably takes place over
years and years in the feedback loop of human self-perception, causality gets
turned around and ‘I’ seems to be in the driver’s seat. &amp;hellip;&lt;/p&gt;
&lt;p&gt;My claim that an ‘I’ is a hallucination perceived by a hallucination is
somewhat like the heliocentric viewpoint… The basic idea is that the dance of
symbols in a brain is itself perceived by symbols, and that step extends the
dance, and so round and round it goes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Strange Loop&lt;/em&gt; is equal parts thought-provoking, humorous, and earnest. I quite
enjoyed it. I wrote a full review of it
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/37860033-the-demon-in-the-machine&#34;&gt;The Demon in the Machine&lt;/a&gt;
by Paul Davies&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;The Demon in the Machine&lt;/em&gt; is an excellent exploration of information theory as
it pertains to biological life. The thesis is that life is the interaction
between matter and information, and the book argues this rather compellingly.
Along the way, it discusses information theory, Maxwell’s Demon and its
applications to biology, “information engines” that extract work out of
information itself, and an argument for “top-down causality” via a rejection of
a strict reductionist view of science.&lt;/p&gt;
&lt;p&gt;The most interesting concept that I took away from &lt;em&gt;Demon in the Machine&lt;/em&gt; was
the notion that the paradox of Maxwell’s Demon has been resolved with the
acceptance that information has a physical basis. Information is, in a real way,
something that has tangible and measurable properties.&lt;/p&gt;
&lt;p&gt;I wrote a full review of &lt;em&gt;Demon in the Machine&lt;/em&gt;
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;ol start=&#34;4&#34;&gt;
&lt;li&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/12400671-already-free&#34;&gt;Already Free&lt;/a&gt; by
Bruce Tift.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I read &lt;em&gt;Already Free&lt;/em&gt; over the course of several months, entirely on airplanes.
I’m still digesting it and hope to write a full review of it at some point, but
I strongly enjoyed this book. I initially read it on Kindle, but have bought
physical copies &amp;ndash; one for myself, and a few to give away.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Already Free&lt;/em&gt;’s tagline is “Buddhism Meets Psychotherapy on the Path of
Liberation”. If that strikes you as a little woo, that’s reasonable. &lt;em&gt;Already
Free&lt;/em&gt; walks the line between western psychotherapy, self-help, and spirituality.
And I think it does this quite well. Tift discusses the complementary but
distinct views of western psychotherapy and Buddhism. The west adopts a
“developmental view”, which discusses psychology in a mechanistically causal
fashion &amp;ndash; “X happened to me when I was younger, therefore I act in Y way”. The
work, therefore, is to understand and decondition the historical causes of
current dysfunction. In contrast, Buddhism adopts what Tift calls a “fruitional
view”, which emphasizes a focus on the current moment &amp;ndash; that the conditions for
regulation and peaceful existence are always present, and therefore the work is
to bring about these conditions.&lt;/p&gt;
&lt;p&gt;While &lt;em&gt;Already Free&lt;/em&gt; discusses the fruitional view in more detail than it does
the developmental view, perhaps the key idea of the book is that these two views
are entirely complementary. Tift discusses this in one of the most vivid
sections of the book:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If we combine [the developmental and fruitional views] into one image, they
might start to look like a spiral staircase. The fruitional view would be the
circular aspect: we’re always revisiting the same issues over and over again.
In practice, we say, “Well, I’ll probably work with this sadness, this
loneliness, this feeling of abandonment until I die. I’m going to keep coming
across it again and again, so I might as well develop a relationship with it.
I’m going to practice feeling it, to see if it’s actually a problem.” The
developmental view would be represented by a line, taking us from here to
there. We’re trying to improve our lives, to create upward momentum. “I
understand where my abandonment feelings originated, and I’m going to stop
trying to have relationships with unavailable partners.” Combined, the
circular motion and the line start to represent an ascending spiral. The
vertical axes are our deeply embedded issues, which we keep having to deal
with. But we can intersect these basic themes at increasing levels of
maturation. We’re walking in a circular pattern, but things are evolving
simultaneously. We continue to encounter our core vulnerabilities, but
hopefully at greater and greater levels of awareness and skillfulness.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I quite enjoyed this book, and would strongly recommend it if you have any
interest or practice in mindfulness, especially as it relates to
self-development.&lt;/p&gt;
&lt;h2 id=&#34;fiction&#34;&gt;Fiction&lt;/h2&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/fiction.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/fiction_hu2ade1a45d8cc61903c3b84e7584bda16_81674_0x475_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img src=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/fiction_hu2ade1a45d8cc61903c3b84e7584bda16_81674_0x475_resize_q100_lanczos.jpg&#34; width=&#34;392&#34; height=&#34;300&#34; loading=&#34;lazy&#34; decoding=&#34;async&#34;&gt;
        &lt;/picture&gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;I read less fiction than usual over this period, but these two books stood out.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/17863.Accelerando&#34;&gt;Accelerando&lt;/a&gt; by
Charles Stross.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;em&gt;Accelerando&lt;/em&gt; is a trip. I wish there were a hundred other books in its niche
genre of “singularity fiction”, but I’ve yet to read anything else that comes
close. It’s irreverent, sharp, and eloquently transhumanist. Reading it is like
viewing an optical illusion which flickers between techno-optimist utopia and
universe-tiling &lt;a href=&#34;https://en.wikipedia.org/wiki/H._R._Giger&#34;&gt;Gigeresque&lt;/a&gt; body
horror dystopia.&lt;/p&gt;
&lt;p&gt;It’s got &lt;a href=&#34;https://en.wikipedia.org/wiki/Matrioshka_brain&#34;&gt;Matrioshka brains&lt;/a&gt;,
&lt;a href=&#34;https://en.wikipedia.org/wiki/Computronium&#34;&gt;Computronium&lt;/a&gt;, post-scarcity
economics, human uploads, interstellar travel, and more. It is the pinnacle of
“techno high weirdness” fiction.&lt;/p&gt;
&lt;p&gt;It’s shocking this book came out in 2005; it easily could have been written
10-15 years later and felt just as fresh. Strongly recommend.&lt;/p&gt;
&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/15811545-a-tale-for-the-time-being&#34;&gt;A Tale for the Time Being&lt;/a&gt;
by Ruth Ozeki.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This was an entirely charming book, weaving together a Pacific Northwest
setting, magical realism, and pieces of historical fiction. I listened to the
audiobook, which was narrated by Ozeki herself. While I didn’t feel that the
book quite stuck its landing, the ending was still satisfying and as a whole it
was quite a good read.&lt;/p&gt;
&lt;h2 id=&#34;honorable-mentions&#34;&gt;Honorable Mentions&lt;/h2&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/honorable.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/honorable_hu98d38b3cd903bddcc8d08cbcc8999a7c_146982_0x404_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/25/My-Favorite-Books-of-2023-2025/honorable_hu98d38b3cd903bddcc8d08cbcc8999a7c_146982_0x404_resize_q100_lanczos.jpg&#34; width=&#34;621&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/230403458-antimemetics&#34;&gt;Antimemetics&lt;/a&gt;&lt;/em&gt;,
by Nadia Asparouhova. &amp;ndash; I wrote a full review of &lt;em&gt;Antimemetics&lt;/em&gt;
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/&#34;&gt;here&lt;/a&gt;. It largely discusses the
modern information environment, and how some ideas resist spreading while
others go gigaviral. We think a lot about the highly memetic, viral ideas, but
the ideas that resist spreading are often valuable &amp;ndash; and not just because of
their relative unpopularity. &lt;em&gt;Antimemetics&lt;/em&gt; captures the vibe of 2025 quite
well.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/45358683-at-the-edge-of-time&#34;&gt;At the Edge of Time&lt;/a&gt;&lt;/em&gt;
by Dan Hooper. I had a few months where I was &lt;em&gt;really&lt;/em&gt; into cosmology, and I
largely credit this book for starting me down that rabbit hole. The book tries
to help you understand what the moments right after the Big Bang were like. It
discusses the current state of scientific consensus, what the known gaps are,
and what prospects we have for improving our understanding of the cosmos. I&amp;rsquo;d
long been interested in physics but couldn&amp;rsquo;t quite get myself to
&lt;em&gt;care&lt;/em&gt; about cosmology; this book ignited my curiosity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/193388249-killers-of-the-flower-moon&#34;&gt;&lt;em&gt;Killers of the Flower Moon&lt;/em&gt;&lt;/a&gt;
is a great historical narrative book in the same vein as, for example,
&lt;em&gt;&lt;a href=&#34;https://www.goodreads.com/book/show/397483.The_Devil_in_the_White_City&#34;&gt;The Devil in the White City&lt;/a&gt;&lt;/em&gt;.
It documents the murder of many members of the Osage Nation after the
discovery of oil on their lands. It’s a fantastic whodunnit with a good
payoff. Without spoiling anything: there are cowboys, private investigators,
a nascent FBI, assassinations, poisonings, and so on. I enjoyed &lt;em&gt;Killers of
the Flower Moon&lt;/em&gt; more than I expected.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As always, happy reading!&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>A Time of Wonders</title>
        <link>https://benjamincongdon.me/blog/2025/12/24/A-Time-of-Wonders/</link>
        <pubDate>Wed, 24 Dec 2025 22:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/24/A-Time-of-Wonders/</guid>
<description>&lt;p&gt;Today is Christmas Eve, which puts us in those liminal few weeks of the holiday
season that serve as a useful time for reflection. Work slows down, we pass
through the darkest days of the year, friends and family visit for the holidays,
and the calendars turn over to the new year.&lt;/p&gt;
&lt;p&gt;This year, I’ve been feeling gratitude for the sheer abundance of living in a
technologically advanced society. Civilization has been producing marvels at a
shockingly consistent pace. Without veering into saccharine appreciation, it&amp;rsquo;s
worth reflecting on these.&lt;/p&gt;
&lt;p&gt;In the style of Dynomight’s
&lt;a href=&#34;https://dynomight.net/thanks/&#34;&gt;“Underrated reasons to be thankful”&lt;/a&gt; and Gwern’s
&lt;a href=&#34;https://gwern.net/improvement&#34;&gt;“My Ordinary Life: Improvements Since the 1990s”&lt;/a&gt;,
here are some things that are top of mind to be grateful for at the end of 2025:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Accurate Timekeeping&lt;/strong&gt;. Timekeeping was an infamously hard problem for
centuries, requiring a massive investment in
&lt;a href=&#34;https://en.wikipedia.org/wiki/Longitude_(book)&#34;&gt;engineering&lt;/a&gt; to get
right. Today, you can buy a $5 digital quartz watch from a dollar store that
will be more accurate than the most accurate timepiece that a King could
commission in the 1700s. Your phone is also an incredibly accurate
timekeeper. Society has enabled free access to clocks with atomic precision via
&lt;a href=&#34;https://time.gov/&#34;&gt;time.gov&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AirPods&lt;/strong&gt;. The fact that high quality audio production devices &amp;ndash; with
excellent noise cancelling capabilities &amp;ndash; can fit in such a small form
factor, that their battery life is more than adequate, and that they “just
work” across all my Apple devices still amazes me.&lt;/li&gt;
&lt;li&gt;The quality of &lt;strong&gt;machine generated text-to-speech&lt;/strong&gt; has dramatically
increased over the past decade. Many of the “podcasts” I listen to now are
TTS of articles or papers. TTS used to be unlistenable. Now it’s essentially
at the level of a mid-tier human narrator. A skilled human narrator can
still surpass the quality of TTS, but at that point it becomes an aesthetic
preference and not a functional one. This opens a huge catalog of “reading”
to me that I otherwise wouldn’t have time to consume.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spotify&lt;/strong&gt;, and other streaming services, offer a
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/&#34;&gt;mind boggling amount of consumer surplus&lt;/a&gt;.
For a reasonable monthly fee, I can access nearly all recorded music in
existence. Spotify’s playlist suggestions have also been valuable in
discovering new music. These have also qualitatively improved over the past
decade, to the point where the machine-generated playlists seem to capture a
great sense of what I actually enjoy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Real-time translation and OCR&lt;/strong&gt; have gotten way better over the past
several years. Machine translation is pretty close to a solved problem, as
are many other traditional NLP tasks. OCR has taken a huge leap forward with
new models like
&lt;a href=&#34;https://blog.google/technology/developers/gemini-3-pro-vision/&#34;&gt;Gemini 3 Pro&lt;/a&gt;.
On a personal note, I’ve been able to scan handwritten letters from previous
generations, written in a language I cannot speak and in handwriting I can
barely interpret, and bring those letters back to life.&lt;/li&gt;
&lt;li&gt;Today’s &lt;strong&gt;LLMs&lt;/strong&gt; feel like science fiction. LLMs are one of the best tools
ever invented for learning, and are increasingly becoming the best tool ever
invented for programming. They’re also fascinating quasi-intelligences that
you can interact with for shockingly little money.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Single day delivery&lt;/strong&gt;. The convenience of being able to summon any of
millions of consumer goods to your doorstep within 24 hours is still a piece
of societal magic. The labor and environmental externalities to this are
real. Yet, it remains a dramatic increase in convenience from any previous
time in human history.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Air travel is an accessible commodity.&lt;/strong&gt; A middle class person can buy a
plane ticket and reach most places on the planet within a day or two of
travel, with better safety statistics than cars. This is remarkable, if you
stop to think about it. At any given moment, there are something like 10,000
planes in the air, flying millions of miles per day. 150 years ago, there
were zero airplanes in existence.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I could expand this list further:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The proliferation of amazing compact digital cameras in smartphones&lt;/li&gt;
&lt;li&gt;The ability to semantically search across all your photos&lt;/li&gt;
&lt;li&gt;GPS navigation&lt;/li&gt;
&lt;li&gt;The availability of effectively free video calls to anywhere on the planet&lt;/li&gt;
&lt;li&gt;Accurate weather forecasts&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://waymo.com/&#34;&gt;Autonomous taxis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Contactless payments (and the many, many pieces of financial infrastructure
that make this possible)&lt;/li&gt;
&lt;li&gt;The incredible wealth of open source software society has accumulated&lt;/li&gt;
&lt;li&gt;The incredible wealth of &lt;a href=&#34;https://www.wikipedia.org/&#34;&gt;open source knowledge&lt;/a&gt;
that humanity has curated&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://starlink.com/&#34;&gt;Satellite internet connectivity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&amp;hellip;and so on.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We live in an era where the ambient level of technological capability is so high
that much of it has faded into the background. As one such example: most of the
promise of the early internet &lt;em&gt;has&lt;/em&gt; actually come to pass &amp;ndash; information &lt;em&gt;is&lt;/em&gt;
readily accessible, individuals &lt;em&gt;can&lt;/em&gt; publish and spread new ideas, communities
&lt;em&gt;can&lt;/em&gt; form online and blossom into in-person communities that stay connected
asynchronously through the internet. Prior to the internet, there were dozens of
similar precondition revolutions which got us to the point of even being able to
conceive of such a hyperobject.&lt;/p&gt;
&lt;p&gt;None of this makes the remaining hard problems go away. Aside from the fact that
these marvels are not globally distributed, there are, of course, many very real
geopolitical, social, economic, and meaning-making problems we still face. In
this darkest week of the year, I think it’s worth reflecting on what has been
accomplished. Hard problems are often tractable.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>RAII Guards and Newtypes in Rust</title>
        <link>https://benjamincongdon.me/blog/2025/12/23/RAII-Guards-and-Newtypes-in-Rust/</link>
        <pubDate>Tue, 23 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/23/RAII-Guards-and-Newtypes-in-Rust/</guid>
        <description>&lt;p&gt;I’ve been having a bunch of fun in Rust recently. I’ve finally gotten past the
point of fighting with the borrow checker and now am solidly in the plateau of
productivity with the language.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; One of the first things that struck me about
Rust was how confident I felt that if I wrote something sane-looking that passed
the compiler, I was more likely than in other languages to have a program that
&lt;em&gt;actually worked&lt;/em&gt;. The borrow checker does a lot of the heavy lifting here, of
course. But there are some other useful language conventions that help too.&lt;/p&gt;
&lt;p&gt;One trick I’ve been reaching for is
&lt;a href=&#34;https://rust-unofficial.github.io/patterns/patterns/behavioural/RAII.html&#34;&gt;RAII Guards&lt;/a&gt;.
Per the unofficial Rust Design Patterns book:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;RAII stands for “Resource Acquisition is Initialisation” which is a terrible
name. The essence of the pattern is that resource initialisation is done in
the constructor of an object and finalisation in the destructor. This pattern
is extended in Rust by using a RAII object as a guard of some resource and
relying on the type system to ensure that access is always mediated by the
guard object.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;If you’ve used &lt;code&gt;std::fs::File&lt;/code&gt; or &lt;code&gt;std::sync::Mutex&lt;/code&gt;, you’ve already seen this
pattern! The idea is that we can use a few properties of the Rust compiler to
deterministically and safely handle resources. Those properties are: lifetime
tracking and deterministic destruction.&lt;/p&gt;
&lt;p&gt;Take &lt;code&gt;std::fs::File&lt;/code&gt; as an example. The &lt;code&gt;File&lt;/code&gt; struct that you get back from
&lt;code&gt;File::open&lt;/code&gt; has an implementation of &lt;code&gt;Drop&lt;/code&gt; that automatically closes the
underlying OS file descriptor:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-rust&#34; data-lang=&#34;rust&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Pseudocode File implementation.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;struct&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;File&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;handle: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;RawDescriptor&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// The OS-level file descriptor
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;impl&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;File&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;open&lt;/span&gt;(path: &lt;span style=&#34;color:#0aa&#34;&gt;String&lt;/span&gt;)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;-&amp;gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;File&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// ACQUISITION: Syscall to get the OS-level file descriptor.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;h&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;os::open_file_handle(path);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;File&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;h&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;impl&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;Drop&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;for&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;File&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;drop&lt;/span&gt;(&amp;amp;&lt;span style=&#34;color:#00a&#34;&gt;mut&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;self)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// RELEASE: Called when the variable goes out of scope.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;os::close_file_handle(self.handle);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Usage Example
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my_file&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;File::open(&lt;span style=&#34;color:#a50&#34;&gt;&amp;#34;data.txt&amp;#34;&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;my_file.write(&lt;span style=&#34;color:#a50&#34;&gt;&amp;#34;Hello&amp;#34;&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// &amp;lt;--- my_file goes out of scope here; drop() is called deterministically.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The key thing here is that Rust calls destructors deterministically when a
variable goes out of scope. So, even though this looks a bit “magical”, there
isn’t any runtime ambiguity. Precisely when &lt;code&gt;my_file&lt;/code&gt; goes out of scope, the
file descriptor is closed. This is in contrast to garbage-collected languages,
where you don’t have a strong guarantee of when object destructors are called.&lt;/p&gt;
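As a toy illustration of that determinism (this is my own sketch, not from the standard library; the `Guard` type and its log are hypothetical names), each value's `Drop` runs exactly when its scope ends, innermost scope first:

```rust
use std::cell::RefCell;

// Records its label into a shared log when dropped.
struct Guard<'a> {
    label: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.label);
    }
}

fn main() {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Guard { label: "outer", log: &log };
        {
            let _inner = Guard { label: "inner", log: &log };
        } // `_inner` dropped here, at the end of its scope.
    } // `_outer` dropped here.

    // Drops happened deterministically, innermost scope first.
    assert_eq!(*log.borrow(), ["inner", "outer"]);
    println!("drop order: {:?}", log.borrow());
}
```

No garbage collector is involved: the compiler inserts the `drop` calls at the closing braces, which is exactly the property the RAII guard pattern relies on.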
&lt;p&gt;We can use this deterministic destructor behavior to construct useful runtime
behavior. For example, the &lt;code&gt;MutexGuard&lt;/code&gt; that &lt;code&gt;Mutex::lock&lt;/code&gt; returns uses this
deterministic &lt;code&gt;Drop&lt;/code&gt; call to unlock the &lt;code&gt;Mutex&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-rust&#34; data-lang=&#34;rust&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Pseudocode Mutex&amp;lt;T&amp;gt; implementation.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;struct&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;Mutex&lt;/span&gt;&amp;lt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;data: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;UnsafeCell&lt;/span&gt;&amp;lt;T&amp;gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;is_locked: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;AtomicBool&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// The RAII Guard
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;struct&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;MutexGuard&lt;/span&gt;&amp;lt;&lt;span style=&#34;color:#1e90ff&#34;&gt;&amp;#39;a&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;mutex_ref: &lt;span style=&#34;color:#00a&#34;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&#34;color:#1e90ff&#34;&gt;&amp;#39;a&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;Mutex&lt;/span&gt;&amp;lt;T&amp;gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Holds a reference to the parent Mutex
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;impl&lt;/span&gt;&amp;lt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Mutex&amp;lt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;lock&lt;/span&gt;(&amp;amp;self)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;-&amp;gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;MutexGuard&lt;/span&gt;&amp;lt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// ACQUISITION: Block until the lock is acquired
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;while&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;self.is_locked.swap(&lt;span style=&#34;color:#00a&#34;&gt;true&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Ordering::Acquire)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;/* spin */&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Return the Guard that tracks this state
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;MutexGuard&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;mutex_ref: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;self&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;impl&lt;/span&gt;&amp;lt;&lt;span style=&#34;color:#1e90ff&#34;&gt;&amp;#39;a&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;Drop&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;for&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;MutexGuard&amp;lt;&lt;span style=&#34;color:#1e90ff&#34;&gt;&amp;#39;a&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;T&amp;gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;drop&lt;/span&gt;(&amp;amp;&lt;span style=&#34;color:#00a&#34;&gt;mut&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;self)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// RELEASE: Reset the lock state on the parent Mutex
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;self.mutex_ref.is_locked.store(&lt;span style=&#34;color:#00a&#34;&gt;false&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;Ordering::Release);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Usage Example
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;guard&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;my_mutex.lock();&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Mutex is now locked
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;*guard&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;+=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;;&lt;span style=&#34;color:#bbb&#34;&gt;                 &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Access data safely
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// &amp;lt;--- guard goes out of scope; drop() runs, Mutex is now unlocked for others.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What I find powerful about this pattern is that you are using the &lt;em&gt;type system&lt;/em&gt;
of the language to ensure correctness. It’s like the idea of
“&lt;a href=&#34;https://fsharpforfunandprofit.com/posts/designing-with-types-making-illegal-states-unrepresentable/&#34;&gt;making illegal states unrepresentable&lt;/a&gt;”.
Using the above design, it’s &lt;em&gt;surprisingly hard&lt;/em&gt; to accidentally grab a mutex
lock and cause a deadlock by failing to later unlock it.&lt;/p&gt;
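&lt;p&gt;As a quick sketch of that idea (a toy example of my own, separate from the mutex code above): modeling connection state as an enum, rather than a pair of booleans, means the contradictory “both connecting and connected” state can’t even be constructed:&lt;/p&gt;

```rust
// Toy illustration of "making illegal states unrepresentable":
// a `(connecting: bool, connected: bool)` pair allows the nonsensical
// combination `(true, true)`; this enum cannot express it at all.
enum ConnState {
    Disconnected,
    Connecting { attempt: u32 },
    Connected { session_id: u64 },
}

fn describe(state: &ConnState) -> String {
    match state {
        ConnState::Disconnected => "idle".to_string(),
        ConnState::Connecting { attempt } => format!("connecting (attempt {})", attempt),
        ConnState::Connected { session_id } => format!("connected (session {})", session_id),
    }
}

fn main() {
    assert_eq!(describe(&ConnState::Disconnected), "idle");
    assert_eq!(describe(&ConnState::Connecting { attempt: 2 }), "connecting (attempt 2)");
    assert_eq!(describe(&ConnState::Connected { session_id: 7 }), "connected (session 7)");
}
```

&lt;p&gt;Every &lt;code&gt;match&lt;/code&gt; over the enum is forced by the compiler to handle exactly the states that can exist &amp;ndash; no defensive checks for impossible combinations.&lt;/p&gt;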
&lt;h2 id=&#34;newtype&#34;&gt;Newtype&lt;/h2&gt;
&lt;p&gt;RAII Guards pair well with the
&lt;a href=&#34;https://rust-unofficial.github.io/patterns/patterns/behavioural/newtype.html&#34;&gt;Newtype&lt;/a&gt;
pattern to add yet more safety by utilizing the type system. The idea of the
Newtype pattern is pretty simple: wrap an existing type in a thin wrapper struct
to change its capabilities. I’ve
&lt;a href=&#34;https://benjamincongdon.me/blog/2021/11/14/Using-Embedding-to-Disambiguate-Types-in-Go/&#34;&gt;written about this pattern before&lt;/a&gt;
in the context of Go, where this is often done via
&lt;a href=&#34;https://go.dev/doc/effective_go#embedding&#34;&gt;Type Embedding&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This pairs nicely with RAII guards: often when you are using a guard, you want
to lock down the resource to prevent unintentional misuse. For example, say
you&amp;rsquo;re working with a database connection handle. The underlying type might just
be a low-level handle representing an OS-level resource, and you want to prevent
it from being accidentally duplicated (which could lead to connection reuse or
aliasing bugs).&lt;/p&gt;
&lt;p&gt;Here is a quite naive approach:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-rust&#34; data-lang=&#34;rust&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;struct&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;ConnectionPool&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;connections: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;Mutex&lt;/span&gt;&amp;lt;&lt;span style=&#34;color:#0aa&#34;&gt;Vec&lt;/span&gt;&amp;lt;&lt;span style=&#34;color:#0aa&#34;&gt;u64&lt;/span&gt;&amp;gt;&amp;gt;,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Raw OS handles
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;impl&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ConnectionPool&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;checkout&lt;/span&gt;(&amp;amp;self)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;-&amp;gt; &lt;span style=&#34;color:#0aa&#34;&gt;u64&lt;/span&gt; {&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;mut&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;conns&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;self.connections.lock().unwrap();&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;conns.pop().expect(&lt;span style=&#34;color:#a50&#34;&gt;&amp;#34;no connections available&amp;#34;&lt;/span&gt;)&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;checkin&lt;/span&gt;(&amp;amp;self,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle: &lt;span style=&#34;color:#0aa&#34;&gt;u64&lt;/span&gt;)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;self.connections.lock().unwrap().push(handle);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The problem: since &lt;code&gt;u64&lt;/code&gt; has &lt;code&gt;Copy&lt;/code&gt;, nothing stops you from doing this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-rust&#34; data-lang=&#34;rust&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pool.checkout();&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;oops_handle&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;pool.checkin(handle);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;client.query(oops_handle,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#a50&#34;&gt;&amp;#34;SELECT * FROM users;&amp;#34;&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Whoops!
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is of course rather contrived, but it shows that if we rely on the end
programmer to remember to “return” resources they’ve checked out, we leave
ourselves open to leaks and misuse. However, we can fix this by
wrapping the raw handle in a newtype that doesn&amp;rsquo;t implement &lt;code&gt;Copy&lt;/code&gt; or &lt;code&gt;Clone&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-rust&#34; data-lang=&#34;rust&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;struct&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;ConnectionHandle&lt;/span&gt;(&lt;span style=&#34;color:#0aa&#34;&gt;u64&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Not Copy, not Clone
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;struct&lt;/span&gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;ConnectionPool&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;connections: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;Mutex&lt;/span&gt;&amp;lt;&lt;span style=&#34;color:#0aa&#34;&gt;Vec&lt;/span&gt;&amp;lt;&lt;span style=&#34;color:#0aa&#34;&gt;u64&lt;/span&gt;&amp;gt;&amp;gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;impl&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ConnectionPool&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;checkout&lt;/span&gt;(&amp;amp;self)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;-&amp;gt; &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;ConnectionHandle&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;mut&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;conns&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;self.connections.lock().unwrap();&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;ConnectionHandle(conns.pop().expect(&lt;span style=&#34;color:#a50&#34;&gt;&amp;#34;no connections available&amp;#34;&lt;/span&gt;))&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;fn&lt;/span&gt; &lt;span style=&#34;color:#0a0&#34;&gt;checkin&lt;/span&gt;(&amp;amp;self,&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle: &lt;span style=&#34;color:#0a0;text-decoration:underline&#34;&gt;ConnectionHandle&lt;/span&gt;)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;{&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;        &lt;/span&gt;self.connections.lock().unwrap().push(handle.&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;    &lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;}&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now the compiler prevents the bug:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-rust&#34; data-lang=&#34;rust&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;pool.checkout();&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;let&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;oops_handle&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;=&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;handle;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// This is a move, not a copy
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;&lt;/span&gt;pool.checkin(handle);&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#aaa;font-style:italic&#34;&gt;// Error: handle was already moved to `oops_handle`
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In this example, the newtype pattern forces single-ownership semantics onto a
primitive that would otherwise be freely copyable. You could also go further and
combine this with an RAII guard that automatically checks the connection back in
when dropped.&lt;/p&gt;
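&lt;p&gt;To sketch that combination (a hedged illustration &amp;ndash; the &lt;code&gt;PooledConnection&lt;/code&gt; name and &lt;code&gt;Option&lt;/code&gt;-wrapped handle are my own choices, not code from above): the guard holds a reference to the pool and pushes the raw handle back in its &lt;code&gt;Drop&lt;/code&gt; impl, so check-in can’t be forgotten:&lt;/p&gt;

```rust
use std::sync::Mutex;

struct ConnectionPool {
    connections: Mutex<Vec<u64>>, // Raw OS handles
}

// The guard owns the checked-out handle for its lifetime.
// `Option` lets `drop` move the handle out of `&mut self`.
struct PooledConnection<'a> {
    pool: &'a ConnectionPool,
    handle: Option<u64>,
}

impl ConnectionPool {
    fn checkout(&self) -> PooledConnection<'_> {
        let mut conns = self.connections.lock().unwrap();
        PooledConnection {
            pool: self,
            handle: Some(conns.pop().expect("no connections available")),
        }
    }
}

impl<'a> Drop for PooledConnection<'a> {
    fn drop(&mut self) {
        // CHECKIN: automatically return the handle to the pool.
        if let Some(h) = self.handle.take() {
            self.pool.connections.lock().unwrap().push(h);
        }
    }
}

fn main() {
    let pool = ConnectionPool { connections: Mutex::new(vec![1, 2, 3]) };
    {
        let _conn = pool.checkout();
        assert_eq!(pool.connections.lock().unwrap().len(), 2);
    } // `_conn` dropped here; its handle goes back into the pool.
    assert_eq!(pool.connections.lock().unwrap().len(), 3);
}
```

&lt;p&gt;With this design there is no public &lt;code&gt;checkin&lt;/code&gt; to forget: the only way a handle returns to the pool is by dropping the guard.&lt;/p&gt;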
&lt;p&gt;The meta lesson here is that a feature-rich language like Rust has strong
benefits. Yes, you’ll sometimes need to decipher some rather verbose type
signatures like
&lt;code&gt;Arc&amp;lt;RwLock&amp;lt;HashMap&amp;lt;TypeId, Box&amp;lt;dyn FnMut(&amp;amp;mut dyn Any) + Send&amp;gt;&amp;gt;&amp;gt;&amp;gt;&lt;/code&gt;, but in
exchange you get your friendly compiler to prevent you from footgunning yourself
in myriad ways.&lt;/p&gt;
&lt;p&gt;In the learning pit of despair, Rust’s type system can feel like an ivory tower.
I&amp;rsquo;d read one of FasterThanLime’s
&lt;a href=&#34;https://fasterthanli.me/articles/catching-up-with-async-rust&#34;&gt;fantastic articles&lt;/a&gt;,
but after I&amp;rsquo;d digested the great technical writing, I was left with a feeling of
“OK, but now I need to return to my day job and write some enterprise software
glue code”. But once you reach the plateau of productivity, the type/memory/structural
safety you’ve been slowly absorbing more than pays for the initial discomfort.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;The final boss, as it turns out, was allowing myself to use a healthy
quantity of &lt;code&gt;Arc&amp;lt;T&amp;gt;&lt;/code&gt;s.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Letters Are Still an Option</title>
        <link>https://benjamincongdon.me/blog/2025/12/22/Letters-Are-Still-an-Option/</link>
        <pubDate>Mon, 22 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/22/Letters-Are-Still-an-Option/</guid>
        <description>&lt;p&gt;I sometimes wish I’d grown up in the era of written letters, or that email and
long-form written correspondences were more fashionable than they currently are.
There is something quite enjoyable about sitting down and intentionally writing
to someone, for hours even. The times that I’ve sat down to write something
long-form to a friend, or have received the same, feel qualitatively different
from the accumulation of many shorter messages.&lt;/p&gt;
&lt;p&gt;Part of what draws me to letters is the fact that I value high quality writing
&amp;ndash; reading it, writing it, editing it. Writing is proof of thought: the
author cared enough to sit down and put mental labor into producing something.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;
And if that something is emotionally resonant, all the better. There are only so
many hours in the day, and so the fact that someone spent that time writing
something deep &lt;em&gt;to you&lt;/em&gt;, or you &lt;em&gt;to them&lt;/em&gt;, adds a level of intentionality that
frequent texts, in my opinion, do not.&lt;/p&gt;
&lt;p&gt;More “modern” async communication nudges in the direction of faster, briefer,
more back-and-forth. Admittedly, there’s a time and place for this! In work
contexts, dismayed as I sometimes am about the size of my Slack inbox, it just
would not be feasible to write considered email responses to every inquiry I
receive. I’ve worked in organizations where long-form emails were the default,
and that always felt like a well-meaning but misplaced use of time and
attention.&lt;/p&gt;
&lt;p&gt;In the personal domain though, I’m finding I value slower communication. A
slower pace allows you to sit with ideas for a while before responding, and then
respond in sufficient detail that the background thinking time was worth it.
This doesn’t require adopting a distant intellectual tone. Good “slow” writing
can be warm, connecting, and exciting.&lt;/p&gt;
&lt;p&gt;Good “fast” personal communication can also be warm, connecting, and exciting,
but it can also carry with it a sense of “always needing to be available”.
The feeling that at any given time, your phone may buzz and you’ll be liable to
respond to something. Or that you’ll send someone a message, they immediately
respond, and you owe them a quick response. Personally, I find this constant
availability draining, and that I don’t have as much to offer when responding
from a place of “quick response”.&lt;/p&gt;
&lt;p&gt;These are all, of course, norms that are partly developed at a global level and
(more importantly) negotiated in each one-on-one relationship. Similar to the
reflection I wrote on
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/01/Schedule-Recurring-Calls-With-Your-Far-Away-Friends/&#34;&gt;scheduling recurring calls&lt;/a&gt;,
slow long-form writing is something that one can propose.&lt;/p&gt;
&lt;p&gt;Thankfully, &lt;em&gt;we have the technology&lt;/em&gt; in 2025 to distribute text rather easily.
This can look like multi-paragraph iMessages, hand-written letters, or writing
5,000 words and attaching them as a PDF to an email or text. I suspect I&amp;rsquo;m not alone
in finding this mode of communication underused &amp;ndash; in fact, I know I&amp;rsquo;m not,
since I&amp;rsquo;ve run into wonderful people who also share this sentiment.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;As of writing in late 2025, it’s still pretty reasonable for the trained
reader to
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/01/25/AI-Slop-Suspicion-and-Writing-Back/&#34;&gt;detect even reasonably concealed AI writing&lt;/a&gt;.
So, quality writing in any domain &amp;ndash; blogs, work, personal &amp;ndash; stands out
bold against the backdrop of the developing dark forest of various forms of
slop.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>An Inconvenient Truth</title>
        <link>https://benjamincongdon.me/blog/2025/12/21/An-Inconvenient-Truth/</link>
        <pubDate>Sun, 21 Dec 2025 08:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/21/An-Inconvenient-Truth/</guid>
        <description>&lt;p&gt;Salvatore Sanfilippo, aka antirez, the creator of Redis, posted a
&lt;a href=&#34;https://antirez.com/news/157&#34;&gt;piece on his reflections on AI for the end of 2025&lt;/a&gt;.
He ends with the statement:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The fundamental challenge in AI for the next 20 years is avoiding extinction.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I’ve been hesitant to say this publicly, but I broadly agree with this
statement. Here are a few related statements I endorse:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Building intelligences smarter than humans is dangerous.&lt;/li&gt;
&lt;li&gt;Aligning a smarter-than-human intelligence to human values is an open and
unsolved problem.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;li&gt;By default, market forces will cause us to underinvest in AI safety and
underenforce pre- and post-deployment safety measures.&lt;/li&gt;
&lt;li&gt;Lacking some form of external governance, market forces will encourage “arms
race dynamics” between frontier AI labs which sideline whatever safety
commitments have been made.&lt;/li&gt;
&lt;li&gt;Even if we &lt;em&gt;are&lt;/em&gt; able to create and align a smarter-than-human intelligence,
there are unresolved long-term concerns around gradual
disempowerment&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; and mass unemployment that should give us serious
pause about continuing down our current path.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;an-inconvenient-truth&#34;&gt;An Inconvenient Truth&lt;/h2&gt;
&lt;p&gt;In 2006, Al Gore’s
&lt;a href=&#34;https://en.wikipedia.org/wiki/An_Inconvenient_Truth&#34;&gt;“An Inconvenient Truth”&lt;/a&gt;
became a focal point for concerns about climate change. We have yet to see an
&lt;em&gt;Inconvenient Truth&lt;/em&gt; moment for AI existential risk.
&lt;a href=&#34;https://ai-2027.com/&#34;&gt;AI 2027&lt;/a&gt; and
&lt;a href=&#34;https://en.wikipedia.org/wiki/If_Anyone_Builds_It,_Everyone_Dies&#34;&gt;If Anyone Builds It, Everyone Dies&lt;/a&gt;
both came close this year, but neither seemed to spark a broad reaction like
&lt;em&gt;Inconvenient Truth&lt;/em&gt; or other related climate change movements did.&lt;/p&gt;
&lt;p&gt;I think the core framing of &lt;em&gt;An Inconvenient Truth&lt;/em&gt; roughly applies to risks
from AI. For both climate change and AI x-risk, the immediate impacts feel banal
and easy to ignore. The projections based on existing long-term trends, on the
other hand, are scary and deserve attention. We are rather poor at allocating
attention to long-term risks that would be more easily preventable with
short-term governance action. The same tension applies here: it&amp;rsquo;s costly to put
mental energy into considering &amp;ldquo;doom&amp;rdquo; from an abstract risk. The risk won&amp;rsquo;t
materialize within the next few years. There remains legitimate uncertainty over
just &lt;em&gt;how&lt;/em&gt; bad it actually is. And so it gets deprioritized.&lt;/p&gt;
&lt;h2 id=&#34;ai-risks&#34;&gt;AI Risks&lt;/h2&gt;
&lt;p&gt;The risks of AI exist on a spectrum from “benign and mundane” to “catastrophic
and existential”. As you move more towards the catastrophic end of this
spectrum, the scenarios one needs to consider get progressively “weirder” and
require more extrapolation to recognize as probable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Mundane risks&lt;/strong&gt; are things that are already showing up today. Some examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;LLMs engaging in overt sycophancy, and/or encouraging delusions in people
with existing mental illness.&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;li&gt;Using AI to increase the speed and sophistication of existing modes of
cybercrime.&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Foreseeable risks&lt;/strong&gt; include things that are not actively problems today, but
are on trend to become problems soon:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Significant job losses in knowledge-work sectors, leading to society-wide
disruption.&lt;/li&gt;
&lt;li&gt;Concentration of power in a small number of AI companies.&lt;/li&gt;
&lt;li&gt;Autonomous AI systems being deployed in safety-critical or high-stakes
domains (e.g. military, financial markets, critical infrastructure) before
we have robustly solved out-of-distribution alignment.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Catastrophic risks&lt;/strong&gt; include things that are, hopefully, fairly far off but
would be &lt;em&gt;very&lt;/em&gt; bad for humanity:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Recursive self-improvement of AI, and/or full automation of AI research and
development leading to a widening asymmetry between AI capabilities and
humanity’s collective understanding of how AI systems work. This could
result in an AI that far outmatches us strategically, with goals that are
unaligned with human flourishing.&lt;/li&gt;
&lt;li&gt;Deliberate misuse of AI by a state-level actor to, for example: design novel
bioweapons, coordinate attacks on critical infrastructure, develop novel
catastrophic technologies that humanity would otherwise have taken much
longer to develop (e.g.
&lt;a href=&#34;https://en.wikipedia.org/wiki/Mirror-image_life#Potential_risks_and_debates&#34;&gt;mirror life&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;Loss of meaningful human control over governance&lt;sup id=&#34;fnref:5&#34;&gt;&lt;a href=&#34;#fn:5&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;5&lt;/a&gt;&lt;/sup&gt; and
economic&lt;sup id=&#34;fnref:6&#34;&gt;&lt;a href=&#34;#fn:6&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;6&lt;/a&gt;&lt;/sup&gt; systems.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;before-its-obvious&#34;&gt;Before It&amp;rsquo;s Obvious&lt;/h2&gt;
&lt;p&gt;I’d suggest that at the end of 2025, we’re still in a bit of an awkward place in
the dialogue about AI risk. I think people are right to remain skeptical about
how big of a deal this is, as it does require a pretty large leap to go from
“ChatGPT” to “Catastrophic Risks”. It required an even larger leap in the
pre-ChatGPT era, which is why the types of folks who have been raising the alarm
about AI risk (e.g. &lt;a href=&#34;https://intelligence.org/&#34;&gt;MIRI&lt;/a&gt;) are, respectfully,
necessarily a bit “weird”.&lt;/p&gt;
&lt;p&gt;For the past couple years, I’ve often felt this gut-sinking feeling akin to late
December 2019 through late February 2020. I remember the cognitive dissonance of
visiting Twitter/X and seeing people whose thinking I respect saying “you all
should be freaking out, this novel pathogen is serious business” while more of
the mainstream information ecosystem either unknowingly ignored these arguments
or knowingly suppressed them as “misinformation”.&lt;/p&gt;
&lt;p&gt;I’m writing this post to do my small part to push my information ecosystem the
other way. A small stone on the cairn of “yes, we should be concerned about
this”.&lt;/p&gt;
&lt;h2 id=&#34;psychological-defenses&#34;&gt;Psychological Defenses&lt;/h2&gt;
&lt;p&gt;I’ve noticed an interesting set of reactions, both in thinking about this
problem myself and in talking to others about it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Goalpost Moving:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What it looks like: “AI can’t do X, therefore we have nothing to worry
about. &lt;em&gt;One year later&lt;/em&gt;, AI can do X. But it still can’t do Y, so we’re
good.”&lt;/li&gt;
&lt;li&gt;Discussion: Look, tools like Claude Code are, by some reasonable
definitions, essentially proto-AGIs. If I somehow got access to Claude Code
in 2015, I expect I’d concede that we’d reached AGI. And yet, all frontier
models still currently have significant weaknesses. I’d &lt;em&gt;really&lt;/em&gt; appreciate
convincing evidence that progress is materially slowing, but I’ve yet to see
it. The progress in 2025 alone has been staggering. “X is not possible
today” is not an argument that it won’t be possible in 10 years.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Argument from Inconvenience:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What it looks like: “It would be deeply inconvenient and &lt;em&gt;weird&lt;/em&gt; if powerful
AI was dangerous, so I will proceed as if it cannot exist.”&lt;/li&gt;
&lt;li&gt;Discussion: Yeah, it is inconvenient and weird. I sympathize with this
argument a lot.
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/&#34;&gt;I don’t really &lt;em&gt;want&lt;/em&gt; my industry to be disrupted&lt;/a&gt;,
but the disruption is demonstrably already underway. No one wants the “bad” that
comes along with AI progress, and many people don’t want the “good”. There
is a legitimate question around “is this type of progress inevitable or
chosen?” There isn’t a binary answer to this, and there is still optionality
for us collectively to decide that dangerous capability progress isn’t
inevitable. However, “inevitability” does appear to be the default path.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;There Are Adults in the Room:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What it looks like: “There will always be someone &lt;em&gt;using&lt;/em&gt; the AI or &lt;em&gt;in
control&lt;/em&gt; of the AI, so cooler heads will prevail. No one wants doom.”&lt;/li&gt;
&lt;li&gt;Discussion: Unfortunately, this expects a lot of discretion from people. We
are training AI systems to be more agentic on long-horizon tasks for the
explicit purpose of handing control of those tasks over to them.
First, people empirically &lt;em&gt;do&lt;/em&gt; cede control to automation for convenience,
efficiency, and competitiveness reasons. Second, the risk of loss of control
gets much scarier as AI systems become more intelligent and autonomous. I do
not think loss of control is a particularly scary risk &lt;em&gt;today&lt;/em&gt;, but
projected forward a decade and it becomes more concerning.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The last major source of resistance I’ve encountered is the lack of a realistic,
non-scifi sounding scenario for how a catastrophe would occur. Both &lt;em&gt;AI 2027&lt;/em&gt;
and &lt;em&gt;If Anyone Builds It, Everyone Dies&lt;/em&gt; offer expanded scenarios about how a
catastrophe could occur, while giving the disclaimer that this &lt;em&gt;specific&lt;/em&gt;
scenario is not particularly likely. In contrast, risks like Mutually Assured
Destruction from nuclear warheads are much easier to articulate: (1) “something
happens” and the {USA, USSR} launches a first strike on the {USSR, USA}, (2) the
recipient of a first strike launches a guaranteed counterstrike, (3) many
millions die in the direct and indirect aftermath. AI risks rely more on
intuition pumps and extrapolation, for now, which makes them much harder to
deliver as an elevator pitch.&lt;/p&gt;
&lt;p&gt;The best I’ve heard, so far, is just that “building a smarter-than-human
intelligence is dangerous”. Fortunately, there are now also
&lt;a href=&#34;https://aisafety.info/questions/NM3T/A%20case%20for%20AI%20safety&#34;&gt;many&lt;/a&gt;
&lt;a href=&#34;https://www.lesswrong.com/posts/W43vm8aD9jf9peAFf/a-simple-explanation-of-agi-risk&#34;&gt;good&lt;/a&gt;
&lt;a href=&#34;https://thezvi.substack.com/p/ai-practical-advice-for-the-worried&#34;&gt;explainers&lt;/a&gt;
and resources for various levels of technical depth.&lt;/p&gt;
&lt;h2 id=&#34;seeing-like-a-cat&#34;&gt;Seeing Like a Cat&lt;/h2&gt;
&lt;p&gt;The core intuition &amp;ndash; that smarter-than-human intelligence is dangerous to
humans &amp;ndash; is surprisingly hard to convey. Let me try an analogy.&lt;/p&gt;
&lt;p&gt;I recently adopted two very cute 6-month-old kittens from my local humane
society. They were likely stray/feral cats earlier in life, and so they’re
skittish around humans. As they’ve been adapting to living in my house, I’ve
given them space and let them hide wherever they want.&lt;/p&gt;
&lt;p&gt;In some sense, my new kittens aren’t “entirely” less intelligent than me. They
can find hidey-holes in my house that I had no knowledge of, they can out-run
me, and their sense of hearing makes them well equipped to avoid me. And yet, I am
in some important sense more intelligent than them.&lt;/p&gt;
&lt;p&gt;If I needed to, say, scoop them up for a vet visit, I &lt;em&gt;could&lt;/em&gt; outsmart them. I
&lt;em&gt;could&lt;/em&gt; block their hiding spots, strategically shunt them into a room with no
other exits, lure them with food or toys, and eventually catch them. This is not
obvious to them. From their perspective, I&amp;rsquo;m just a friendly person who gives
them food, plays with them, and otherwise leaves them alone.&lt;sup id=&#34;fnref:7&#34;&gt;&lt;a href=&#34;#fn:7&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;7&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;The obvious analogy: a superintelligent AI is to humanity what
I am to my kittens. Something that can out-strategize you without you even
realizing you’ve been backed into the corner of a room with no doors. This
analogy applies in two ways:&lt;/p&gt;
&lt;p&gt;First: We might end up in a fairly benign world where humans are treated the
same way as cats &amp;ndash; cats are cute, they are generally treated well, and to the
extent that they are out-strategized by humans, it is largely for their own
good. This still doesn’t imply that you would want to be disempowered in this
way if you, say, have preferences about the future.&lt;/p&gt;
&lt;p&gt;Second: you may say, “intelligence isn’t some raw scalar quantity that you can
just increase linearly”. And I would agree. I’m not sure what it would mean to,
say, have an agent with a “300 IQ”. Like, concretely, an agent capable of
getting a 300 on an IQ test. That seems meaningless, I agree.&lt;/p&gt;
&lt;p&gt;I can, however, imagine an agent with jagged intelligence which has the ability
to autonomously query and integrate information from hundreds (or thousands) of
realtime sources&lt;sup id=&#34;fnref:8&#34;&gt;&lt;a href=&#34;#fn:8&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;8&lt;/a&gt;&lt;/sup&gt;, effortlessly hack into key infrastructure systems while
evading detection&lt;sup id=&#34;fnref:9&#34;&gt;&lt;a href=&#34;#fn:9&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;9&lt;/a&gt;&lt;/sup&gt;, subtly influence humans into various opinions or plans of
action that they wouldn’t otherwise adopt&lt;sup id=&#34;fnref:10&#34;&gt;&lt;a href=&#34;#fn:10&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;10&lt;/a&gt;&lt;/sup&gt;, and so on. Obviously these capabilities
do not exist in a single unified system today. Beyond raw capability, AI agents will have structural
advantages &amp;ndash; for example, the ability to copy themselves, work in parallel, and
“think” much faster than humans.&lt;/p&gt;
&lt;h2 id=&#34;what-to-do&#34;&gt;What To Do&lt;/h2&gt;
&lt;p&gt;I have high certainty that “AI risk is worth taking seriously” and
moderate-to-high certainty that “it is worth taking actions now to reduce the
likelihood of these risks”.&lt;/p&gt;
&lt;p&gt;I am less certain on which actions exactly would be net beneficial, and how much
of a current-day cost we should be willing to pay to mitigate future risks.
Fortunately, there are many groups&lt;sup id=&#34;fnref:11&#34;&gt;&lt;a href=&#34;#fn:11&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;11&lt;/a&gt;&lt;/sup&gt; taking these questions seriously and
proposing frameworks. This is laudable and should be financially supported.&lt;/p&gt;
&lt;p&gt;At a policy level, I’m becoming increasingly convinced that we need &lt;em&gt;some&lt;/em&gt; sort
of governance structure to enable transparency and accountability for safety
measures, as well as to constrain the arms race dynamics between frontier labs.
This is not an easy problem, and will likely require international coordination.
I have moderate credence that a “pause AI” plan, enacted today, would not be net
helpful, as we can still get mostly net-positive mundane utility from further
progress. However, I think coordination around a
future mechanism to enforce a pause would be wise.&lt;/p&gt;
&lt;p&gt;The inconvenient truth for AI is that we are racing forward faster than our
ability to ensure safe development. Even with only today’s AI capabilities, a
lot of societal “weirdness” is already priced in. It is worth considering this
future directly, even in the face of significant uncertainty.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;em&gt;Obligatory: These are my own opinions and do not reflect those of my employer,
etc.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover image by Recraft v3.&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;See e.g. Evan Hubinger’s
&lt;a href=&#34;https://www.alignmentforum.org/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-unsolved-problem&#34;&gt;Alignment remains a hard, unsolved problem&lt;/a&gt;.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;See e.g. &lt;a href=&#34;https://arxiv.org/abs/2501.16946&#34;&gt;Gradual Disempowerment&lt;/a&gt;.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;See e.g.
&lt;a href=&#34;https://openai.com/index/sycophancy-in-gpt-4o/&#34;&gt;“Sycophancy in GPT-4o: what happened and what we’re doing about it”&lt;/a&gt;
(OpenAI)&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;See e.g.
&lt;a href=&#34;https://www.anthropic.com/news/disrupting-AI-espionage&#34;&gt;“Disrupting the first reported AI-orchestrated cyber espionage campaign”&lt;/a&gt;
(Anthropic)&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:5&#34;&gt;
&lt;p&gt;For example, much of the bureaucratic complexity of governing could be
turned over to AI systems for global competitiveness and efficiency reasons.
This could be a short-term boon for states which adopt this, but eventually
result in such a complex governance structure that wresting control back
from the AI becomes impossible.&amp;#160;&lt;a href=&#34;#fnref:5&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:6&#34;&gt;
&lt;p&gt;For example, individual firms increasingly turn control of investment over
to AI for competitive reasons. This would likely be a short-term boon for
the individual firms that choose to do so, and would select against firms
that maintain a “no AI” stance. However, long term, once the markets are
saturated by AI-controlled firms, the economy could become increasingly
misaligned with human needs and preferences. From
&lt;a href=&#34;https://gradual-disempowerment.ai/misaligned-economy&#34;&gt;Gradual Disempowerment&lt;/a&gt;:
“[H]umans might lose the ability to meaningfully participate in economic
decision-making at any level. Financial markets might move too quickly for
human participants to engage with them, and the complexity of AI-driven
economic systems might exceed human comprehension, rendering it impossible
for humans to make informed economic decisions or effectively regulate
economic activity.”&amp;#160;&lt;a href=&#34;#fnref:6&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:7&#34;&gt;
&lt;p&gt;To be clear, that’s what I want! Rehabilitating skittish cats is a long
game. Both of them are already warming up to me, so my strategy appears to
be working.&amp;#160;&lt;a href=&#34;#fnref:7&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:8&#34;&gt;
&lt;p&gt;The dumb “today” version of this is Deep Research agents.&amp;#160;&lt;a href=&#34;#fnref:8&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:9&#34;&gt;
&lt;p&gt;The dumb “today” version of this is GPT 5.2 and Opus 4.5, both of which are
nearing human-level performance on cybersecurity capture-the-flag
benchmarks.&amp;#160;&lt;a href=&#34;#fnref:9&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:10&#34;&gt;
&lt;p&gt;The dumb “today” version of this is the so-called “LLM psychosis” effect
with sycophantic models like GPT 4o.&amp;#160;&lt;a href=&#34;#fnref:10&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:11&#34;&gt;
&lt;p&gt;A short list: &lt;a href=&#34;https://www.iaps.ai/&#34;&gt;IAPS&lt;/a&gt;,
&lt;a href=&#34;https://safe.ai/&#34;&gt;Center for AI Safety&lt;/a&gt;,
&lt;a href=&#34;https://palisaderesearch.org/&#34;&gt;Palisade Research&lt;/a&gt;,
&lt;a href=&#34;https://theaipi.org/&#34;&gt;AIPI&lt;/a&gt;.&amp;#160;&lt;a href=&#34;#fnref:11&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>The South Flow SeaTac Arrival Corridor</title>
        <link>https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/</link>
        <pubDate>Sat, 20 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/</guid>
        <description>&lt;p&gt;From Gas Works Park, you can watch jets descend to SeaTac in a steady stream,
one every minute or two. I’ve lived in various neighborhoods in Seattle, all
under this corridor &amp;ndash; SLU, Eastlake, Fremont. I didn’t think much of this fact
until fairly recently, when I started taking flight lessons and became more
actively interested in aviation in general.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/south_flow_tracks.jpg&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/south_flow_tracks_hu4b6039e16f57b8607f26b5fddc9b093e_601802_0x1456_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/south_flow_tracks_hu4b6039e16f57b8607f26b5fddc9b093e_601802_0x1456_resize_q100_lanczos.jpg&#34; alt=&#34;The South Flow SeaTac Arrival Corridor; Red indicates arrivals&#34; width=&#34;486&#34; height=&#34;800&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;The South Flow SeaTac Arrival Corridor; Red indicates arrivals&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;As I write this, I have &lt;a href=&#34;https://www.flightradar24.com/&#34;&gt;FlightRadar24&lt;/a&gt; popped
open and can see a dozen or so jets in or entering this funnel. Planes enter the
approach north of Seattle around Lynnwood or Shoreline, hug I-5 south past
Green Lake and Lake Union, and eventually descend through South Seattle to
land at SeaTac. There’s something comfortingly mechanical about this
rhythm.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/flightradar24.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/flightradar24_hufd870d14e091ce6c7a6b5ec0317a863c_537427_0x800_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/20/The-South-Flow-SeaTac-Arrival-Corridor/flightradar24_hufd870d14e091ce6c7a6b5ec0317a863c_537427_0x800_resize_lanczos_3.png&#34; alt=&#34;FlightRadar24 snapshot of the arrival corridor&#34; width=&#34;321&#34; height=&#34;400&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;FlightRadar24 snapshot of the arrival corridor&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Of course, depending on the wind, SeaTac can also flip to a “north flow” pattern
where the funnel over Green Lake becomes the departure direction instead of the
arrival, but the south flow is used about two-thirds of the time.&lt;/p&gt;
&lt;p&gt;It’s also notable how entirely non-dramatic this routine is. Even in the foggy
soup of Seattle clouds, the rain, and negligible visibility, SeaTac continues to
operate. I’m still quite early in my private pilot training and haven’t touched
instrument flying yet, but I have a fledgling appreciation for what makes this
possible: procedures, precision instruments, and skilled piloting. The fact that
you can fly through the clouds and only see the runway a few hundred feet before
landing (safely, of course) is an impressive feat of engineering and aviating.&lt;/p&gt;
&lt;p&gt;I’ve now been thinking of this arrival corridor as one of my favorite landmarks
of Seattle. It’s “invisible”, and it’s undoubtedly a less recognizable
landmark than the floating bridges, the canal bridges, or any of
Seattle’s other notable infrastructure landmarks. But nonetheless, I think about
it nearly every day &amp;ndash; whether on a run, driving to work, or just hearing
the quiet hum of a plane flying overhead.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>My Favorite Music of 2025</title>
        <link>https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/</link>
        <pubDate>Fri, 19 Dec 2025 20:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/</guid>
        <description>&lt;p&gt;2025 was a &lt;em&gt;really&lt;/em&gt; good year for music for me. My music collection system is
pretty basic: every year, I cut a new playlist and add songs that I&amp;rsquo;m currently
resonating with. The playlist gets progressively longer until the end of the
year, after which I reset and start with a blank slate in January. I started
this back in ~2018 after feeling like I was in a musical rut, finding everything
I listened to stale.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/playlist-chart.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/playlist-chart_huc7f16138e728efb71251fd63c28d0ad0_35728_0x427_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/playlist-chart_huc7f16138e728efb71251fd63c28d0ad0_35728_0x427_resize_lanczos_3.png&#34; width=&#34;637&#34; height=&#34;350&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    
&lt;/figure&gt;

&lt;p&gt;2025 was my “highest discoverability” year yet, by nearly 2x. It was a really fun
year for discovery and collection! Whereas 2024 had a few albums that I listened
to a &lt;em&gt;lot&lt;/em&gt;, 2025 had higher breadth, still with much that I listened to
frequently.&lt;/p&gt;
&lt;h2 id=&#34;no-skip-albums&#34;&gt;“No Skip” Albums&lt;/h2&gt;
&lt;p&gt;First up are the albums for which every track made it on my 2025 playlist.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/big_talk.jpg&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/big_talk_hu9875c3dbefd5b10ef201996e6172ef87_188451_0x500_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/big_talk_hu9875c3dbefd5b10ef201996e6172ef87_188451_0x500_resize_q100_lanczos.jpg&#34; width=&#34;250&#34; height=&#34;250&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Couch’s
&lt;a href=&#34;https://open.spotify.com/album/3qHu0st1qfx9UL19NEB7Dg?si=scQIm7wkTmS9kcv_eHewzQ&#34;&gt;Big Talk&lt;/a&gt;&lt;/strong&gt;
was my sleeper hit for Best Album of the year. As the pre-release singles for
Big Talk were coming out over the summer I remember thinking “huh, Couch has new
tracks, cool!” I was on vacation when the album actually dropped, and it was a
couple weeks before I realized that between Spotify’s Discover Weekly and
Release Radar, I’d independently added &lt;em&gt;every song&lt;/em&gt; on Big Talk to my 2025
playlist. It’s really my quintessential no-skip album for
the year.&lt;/p&gt;
&lt;p&gt;I also got to see Couch live recently in Seattle and they were &lt;em&gt;fantastic&lt;/em&gt; in
person. It was a sold-out concert in a relatively small venue, and was one of
the best concerts I’ve been to in recent years.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; Couch is still new enough
that it’s possible to know their entire discography back-to-front without &lt;em&gt;too&lt;/em&gt;
much investment.&lt;/p&gt;
&lt;p&gt;Entry point songs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From Big Talk:
&lt;a href=&#34;https://open.spotify.com/track/3skiUNigP6jbRfcBnBkFnQ?si=607799eadc7d4a11&#34;&gt;On The Wire&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/5jR3BZJvoBrTj0OeSAZTwM?si=8ca227fda8c24fa4&#34;&gt;What Were You Thinking&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/5RsqDqxS3qD0MCfVvX5uxL?si=35c50c8cd2fc438d&#34;&gt;Transparent&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/4cyoVONdFT1WIdl500AIsY?si=16aabe83cbb94c02&#34;&gt;Static &amp;amp; Noise&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;More Couch:
&lt;a href=&#34;https://open.spotify.com/track/4tNZHcmHZeHsodIwYOBUET?si=f75d33b5aada48dd&#34;&gt;Jessie&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/3RgRQ4MVcf8Fc8duo5nupf?si=8b637271ceb24bd9&#34;&gt;Still Feeling You&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/loved.jpg&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/loved_hu3685a633d3cd09e85c656b36cb7588d8_20434_0x300_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/loved_hu3685a633d3cd09e85c656b36cb7588d8_20434_0x300_resize_q100_lanczos.jpg&#34; alt=&#34;Parcels&amp;amp;rsquo; LOVED&#34; width=&#34;250&#34; height=&#34;250&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Parcels&amp;rsquo; LOVED&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Parcels’
&lt;a href=&#34;https://open.spotify.com/album/1rSbjr5U9J9rQ9sE7RxHFl?si=TMNNAB05TuOHFvCcwwwanw&#34;&gt;LOVED&lt;/a&gt;&lt;/strong&gt;
was another album that snuck into “no skip” status. I first heard of Parcels
from their 2018
&lt;a href=&#34;https://open.spotify.com/track/66tkDkPsznE5zIHNt4QkXB?si=3907018526f342e0&#34;&gt;Tieduprightnow&lt;/a&gt;,
and then got reintroduced to them via the Berlin live recording of
&lt;a href=&#34;https://open.spotify.com/track/0ZcOprrr9ubO9ORfjbHjsx?si=5feddb60efa741c6&#34;&gt;Lightenup&lt;/a&gt;
showing up in my Spotify Discover one week. That track became my go-to “get up
and face the day” track for much of the mid-summer. When LOVED came out in
September, it was a similar slow burn to Big Talk in that I slowly realized that
I’d added the entire album to my 2025 playlist.&lt;/p&gt;
&lt;p&gt;Why Parcels? There’s just a clear earnest joyfulness that I find endearing. I
sent one of their tracks to a friend recently, and their response was something
like “Yep, this is Ben music”.&lt;/p&gt;
&lt;p&gt;Entry point songs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From LOVED:
&lt;a href=&#34;https://open.spotify.com/track/4LpNkTXEHMktM6cj6DV12i?si=a0994b8bd8814768&#34;&gt;Iwanttobeyourlightagain&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/7cSLD89y304ZvX4mVhYVfw?si=48c4129741914ead&#34;&gt;Tobeloved&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/1eGYO1RzVRdphhWT86DSr9?si=a6a46726182f4225&#34;&gt;Yougotmefeeling&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Other Parcels:
&lt;a href=&#34;https://open.spotify.com/track/0hhXziDUO0wNYPsstDQWN6?si=c53cb79a702846bb&#34;&gt;Overnight&lt;/a&gt;
(this &lt;a href=&#34;https://www.youtube.com/watch?v=kIxFtDz_1hs&#34;&gt;Amazon Music version&lt;/a&gt; is
excellent),
&lt;a href=&#34;https://open.spotify.com/track/0ZcOprrr9ubO9ORfjbHjsx?si=5feddb60efa741c6&#34;&gt;Lightenup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;really-great-albums&#34;&gt;Really Great Albums&lt;/h2&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/imaginal_disk.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/imaginal_disk_hud6327d8fea22e31ff0c7d6ecb202eacb_43108_0x300_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/imaginal_disk_hud6327d8fea22e31ff0c7d6ecb202eacb_43108_0x300_resize_lanczos_3.png&#34; width=&#34;250&#34; height=&#34;250&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Magdalena Bay’s
&lt;a href=&#34;https://open.spotify.com/album/2Htq1sHgmdGffojIBM6Q1s?si=xK7O8C3ZRc6BWdZDozrkhw&#34;&gt;Imaginal Disk&lt;/a&gt;&lt;/strong&gt;
was technically released in 2024, but I’m including it in this list because they
toured on this album in 2025. Imaginal Disk is a bit of a funky concept album,
steeped in weird Y2K-era imagery.
&lt;a href=&#34;https://www.nme.com/reviews/album/magdalena-bay-imaginal-disk-review-3784463&#34;&gt;This review&lt;/a&gt;
described it as “a time capsule of post-internet existentialism”, which I’d
endorse. For a flavor of this, watch the
&lt;a href=&#34;https://www.youtube.com/watch?v=DfcWOPpmw14&#34;&gt;music video for &lt;em&gt;Image&lt;/em&gt;.&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Entry point songs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From Imaginal Disk:
&lt;a href=&#34;https://open.spotify.com/track/4O45RxFmWr88UfFlNtsXUy?si=1b9083ba87c34341&#34;&gt;Image&lt;/a&gt;
(Also fun:
&lt;a href=&#34;https://open.spotify.com/track/3Jng2rpIQrSAQPrWYN7ywM?si=f99b477a64b74d9a&#34;&gt;the Grimes version&lt;/a&gt;),
&lt;a href=&#34;https://open.spotify.com/track/1z8jzS7huL1Cy1YLFxUb3X?si=6e0a551125234f0f&#34;&gt;Killing Time&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/2RhHct6gGXSrZ1zHjovhRq?si=075201cf1e6d465f&#34;&gt;Cry for Me&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;More Magdalena Bay:
&lt;a href=&#34;https://open.spotify.com/track/3XdjmYHdNZkuN6FML8sTjH?si=6606d063dce24d07&#34;&gt;Domino&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/3tgHGoK5ItQv2q2yqggxlb?si=035449a2c4154a16&#34;&gt;Secrets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/anemoia.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/anemoia_hua237d4b30c06f2f108e66cd06a74a40b_12896_0x300_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/anemoia_hua237d4b30c06f2f108e66cd06a74a40b_12896_0x300_resize_q100_lanczos.jpg&#34; alt=&#34;Anemoia&#34; width=&#34;250&#34; height=&#34;250&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Anemoia&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;SG Lewis’
&lt;a href=&#34;https://open.spotify.com/album/3kse3e9XxmIedJb9bfjErH?si=yW-UMhjqRdCpuGGR5u-Ylg&#34;&gt;Anemoia&lt;/a&gt;&lt;/strong&gt;
is a fun electronic album. I’ve been following SG Lewis for a few years, saw his
tour was stopping through Seattle, and listened to a bunch more of &lt;em&gt;Anemoia&lt;/em&gt;
than I would have otherwise. The word “anemoia” is picked up from John Koenig’s
&lt;em&gt;Dictionary of Obscure Sorrows&lt;/em&gt;, which defines it as “the feeling of nostalgia
for a time, place, or situation one has never actually experienced”. Both the
album, and Lewis’ live performances of the tracks, convey this vibe quite well.
The tracks, despite being “just” dancey electronic music, are quite memorable. I
find myself pulling out my phone to play a specific song from this album with
surprising frequency.&lt;/p&gt;
&lt;p&gt;Entry point songs:
&lt;a href=&#34;https://open.spotify.com/track/0fwktLgGSjgVieYm6JkA7R?si=ca3b0a1ebe184152&#34;&gt;Memory&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/34TzBDhdmeSxAxcP1PVFjC?si=c583251a2acf4547&#34;&gt;Transition&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/03vfFtmD5SMZ7rpQm6KXTv?si=828e19e52c6c497a&#34;&gt;Baby Blue&lt;/a&gt;&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/cory_wong.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/cory_wong_hu20ce148cf5bc1244410ae416ba08c589_41120_0x500_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/19/My-Favorite-Music-of-2025/cory_wong_hu20ce148cf5bc1244410ae416ba08c589_41120_0x500_resize_q100_lanczos.jpg&#34; width=&#34;250&#34; height=&#34;250&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Cory Wong’s
&lt;a href=&#34;https://open.spotify.com/album/124WVC05XFXZV6UYCTiD9a?si=dX2cAzcjRHy9K9Dfkmz4vQ&#34;&gt;Wong Air&lt;/a&gt;
(Live in America)&lt;/strong&gt; is snuck in here because&amp;hellip; I listened to so much Cory Wong
in 2025, and saw his band live over the summer, and already have tickets to his
Spring 2026 tour. I would be remiss if I didn’t include something of his on this
list. I remember first running into Cory via his 2020 album
&lt;a href=&#34;https://open.spotify.com/album/1LL5VZdY7CBXScXB0oQ4tB?si=jaVABxWeRxO-evo9i9CQqw&#34;&gt;“Elevator Music for an Elevated Mind”&lt;/a&gt;.
I just thought the title was funny and liked a few tracks on it. This has grown
into an unabashed appreciation for most of his work. I’m also a big fan of
Vulfpeck, of which Cory is a member.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Wong is doing Big Band music in the 2020s in such a &lt;em&gt;fun&lt;/em&gt;, energetic,
contagious way. Seriously, just listen to a live track like
&lt;a href=&#34;https://open.spotify.com/track/41uqxRPzkStxNRVspv6PXe?si=98f461c521ca46d0&#34;&gt;Ketosis&lt;/a&gt;
and try to stay still. It was a treat to see him live.&lt;/p&gt;

&lt;div style=&#34;max-width: max(75%, 500px); margin: 0 auto;&#34;&gt;
    &lt;div class=&#34;youtube-player&#34;&gt;
        &lt;iframe src=&#34;https://www.youtube.com/embed/PVcjEOjI0yk&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Entry point songs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From 2025’s live album:
&lt;a href=&#34;https://open.spotify.com/track/0sX6ySwp6fmn8T06UlTqcz?si=dc5a20212e344f71&#34;&gt;Comic Sans&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/0EDDNlnpvHFPd0GPmWDD05?si=f796932cefbe4b9c&#34;&gt;Out at Midnight&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/36FpTtSZGjBsZ3AdWEYy6a?si=8ebf374307f842a6&#34;&gt;The Grid Generation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;More Cory Wong:
&lt;a href=&#34;https://open.spotify.com/track/3tiHlefQHzovwWmxA1tmmx?si=aefbf1e439714d5c&#34;&gt;Lunchtime&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/1QFFUmFYKXNYhq2Akjh7HR?si=b9c3cbb3574849ff&#34;&gt;Better&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/0p1Mp6sT4Nzq3DWnZPvqs8?si=4a498b719f104455&#34;&gt;Starship Syncopation&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/0PYldThRpqH6oEuV8gCJot?si=0ab4b81d72d249b3&#34;&gt;Turbo&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/3Wph8NnsxCW85uOartaPtO?si=f513522eac0c43fc&#34;&gt;Flyers Direct&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/64zfaCPwuuPkEtoNt1jzFx?si=9c8e617f560b4ea7&#34;&gt;Want Me Back&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;honorable-mentions&#34;&gt;Honorable Mentions&lt;/h2&gt;
&lt;p&gt;It should probably go without saying that &lt;strong&gt;Lady Gaga’s
&lt;a href=&#34;https://open.spotify.com/artist/1HY2Jd0NmPuamShAr6KMms?si=be15501ec9114221&#34;&gt;MAYHEM&lt;/a&gt;&lt;/strong&gt;
was a top album for 2025. When the single for
&lt;a href=&#34;https://open.spotify.com/track/2LHNTC9QZxsL3nWpt8iaSR?si=613fb2bb793a4f0a&#34;&gt;Abracadabra&lt;/a&gt;
came out, I immediately strapped on my running shoes and came quite close to a
personal-record 5k pace. By play count, Abracadabra was my
number one most listened to song of the year. Gaga’s
&lt;a href=&#34;https://www.youtube.com/watch?v=SrPN_eK8RKc&#34;&gt;SNL performance&lt;/a&gt; of
&lt;a href=&#34;https://open.spotify.com/track/4pNzBbGcqXofx8mLBPTeih?si=8aaf0450488745e8&#34;&gt;Killah&lt;/a&gt;
is also a must-watch.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Florence + The Machine’s
&lt;a href=&#34;https://open.spotify.com/album/0z7l9VEJyFMv8p8wffRDaF?si=e9a0ba4b01b247a0&#34;&gt;Everybody Scream&lt;/a&gt;&lt;/strong&gt;
was a highly anticipated album for me this year, and it was enjoyable! The
studio album itself didn&amp;rsquo;t initially resonate with me as much as I expected, but
several live recordings from it have been &lt;em&gt;fantastic&lt;/em&gt;. In particular, this live
recording of &lt;a href=&#34;https://www.youtube.com/watch?v=M0mfg7vbjlk&#34;&gt;Sympathy Magic&lt;/a&gt; is
flooring, and this one of the title track,
&lt;a href=&#34;https://www.youtube.com/watch?v=yNxhjJMi4vI&#34;&gt;Everybody Scream&lt;/a&gt;, is also great.
I’m looking forward to the tour for this album next year, and I suspect this
will be an album that grows on me over time.&lt;/p&gt;

&lt;div style=&#34;max-width: max(75%, 500px); margin: 0 auto;&#34;&gt;
    &lt;div class=&#34;youtube-player&#34;&gt;
        &lt;iframe src=&#34;https://www.youtube.com/embed/M0mfg7vbjlk&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Entry point songs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From Everybody Scream:
&lt;a href=&#34;https://open.spotify.com/track/6ExwNwWLXvvKr5hk6lkHUY?si=688cb1bbe1d9428f&#34;&gt;Sympathy Magic&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/2yuKbNMCQ8Oo6KWWZvUCoV?si=350e807eb40b406f&#34;&gt;Witch Dance&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/4BswiLDcr4ChDGl2NrfhC1?si=a5fa6a9540104f93&#34;&gt;Kraken&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;More from Florence:
&lt;a href=&#34;https://open.spotify.com/track/7H7SHw3YWXhb4zYqyoPNa1?si=f4c52173873e44ed&#34;&gt;Free&lt;/a&gt;,
&lt;a href=&#34;https://open.spotify.com/track/0vQYe6g8bNbdUKnUnXdQQV?si=106e3340cd834353&#34;&gt;My Love&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;everything-else&#34;&gt;Everything Else&lt;/h2&gt;
&lt;p&gt;Alright, we’re all the way through albums. :phew:&lt;/p&gt;
&lt;h3 id=&#34;random-good-vibes-released-in-2025&#34;&gt;Random Good Vibes Released in 2025&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/6LkwvXgYfM5qoW198vlIn8?si=6b3616dd23fb466d&#34;&gt;Sailing Away&lt;/a&gt;
by Antoine Bourachot.
&lt;ul&gt;
&lt;li&gt;The feeling: Running in the sun on a beach in Spain.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/0VOeTPrszkFPaKHI43YBwo?si=937047b7395c454d&#34;&gt;U Know Y&lt;/a&gt;
by CAPYAC.
&lt;ul&gt;
&lt;li&gt;The feeling: When one realizes the work week is over for the weekend.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;good-running-music-released-in-2025&#34;&gt;Good Running Music Released in 2025&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/0xaXwvcjq7aAKwMKe22Bw7?si=fc2ff9e301e949f6&#34;&gt;Focus&lt;/a&gt;
by John Summit.
&lt;ul&gt;
&lt;li&gt;I must have listened to this track 100 times on runs over the year.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/6CfromGdojo0R0vlgA7iU8?si=a2d2087dbe474aff&#34;&gt;Infohazard&lt;/a&gt;
by Ninajirachi.
&lt;ul&gt;
&lt;li&gt;The &lt;a href=&#34;https://www.youtube.com/watch?v=p2ZdeIKJA8c&#34;&gt;music video&lt;/a&gt; is
mesmerizing; a must watch (esp. 3:00 to the end).&lt;/li&gt;
&lt;li&gt;I have distinct memories of listening to this song and sprinting around
the pre-sunrise streets of San Francisco in the late summer.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div style=&#34;max-width: max(75%, 500px); margin: 0 auto;&#34;&gt;
    &lt;div class=&#34;youtube-player&#34;&gt;
        &lt;iframe src=&#34;https://www.youtube.com/embed/p2ZdeIKJA8c&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/2DMPDC9EG8SLcujLz2UyIy?si=6c82392657de4e84&#34;&gt;Relentless Love&lt;/a&gt;
by Sophie Ellis-Bextor.
&lt;ul&gt;
&lt;li&gt;My vote for the “song of the summer” for 2025.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/4A56h4B9xUuMMXoKuj18HT?si=49618062fb7e48c5&#34;&gt;Edge of Desire&lt;/a&gt;
by Jonas Blue
&lt;ul&gt;
&lt;li&gt;Another song that I listened to in &lt;em&gt;heavy&lt;/em&gt; rotation on my
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/&#34;&gt;marathon training&lt;/a&gt;
runs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://open.spotify.com/track/51zebAwN6zTBOw0ue2XLIP?si=d19bfcfb18994cdc&#34;&gt;Joy&lt;/a&gt;
by Rita Era&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;not-at-all-released-in-2025&#34;&gt;Not At All Released in 2025&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Snarky Puppy’s
&lt;a href=&#34;https://open.spotify.com/track/68d6ZfyMUYURol2y15Ta2Y?si=14640b6c9728484a&#34;&gt;Lingus (We Like It Here)&lt;/a&gt;&lt;/strong&gt;
is up there for the best improvisation I’ve &lt;em&gt;ever&lt;/em&gt; heard, bar none. The
&lt;a href=&#34;https://www.youtube.com/watch?v=L_XJ_s5IsQc&#34;&gt;YouTube video&lt;/a&gt; from 2014 is a
masterpiece. I come back to this track weekly, listening the whole way
through.&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;div style=&#34;max-width: max(75%, 500px); margin: 0 auto;&#34;&gt;
    &lt;div class=&#34;youtube-player&#34;&gt;
        &lt;iframe src=&#34;https://www.youtube.com/embed/L_XJ_s5IsQc&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
    &lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Dirty Loops&amp;rsquo;
&lt;a href=&#34;https://open.spotify.com/album/7GWiggdgfYIBhKpYla2cz0?si=cjeh1NGETKmu4jFX0KakgA&#34;&gt;Phoenix&lt;/a&gt;&lt;/strong&gt;
was another album that sneakily became a &amp;ldquo;no skip&amp;rdquo; album. It was released in
2020, hence its absence from the earlier list. But each track on the
album is &lt;em&gt;really&lt;/em&gt; good, and each individually became a &amp;ldquo;listen on
single-track repeat&amp;rdquo; song.
&lt;a href=&#34;https://open.spotify.com/track/0L4A8L8UUSU0lgzxGNKggH?si=c8f5d2ce91ad47b9&#34;&gt;Rock You&lt;/a&gt;
is a good entry point, but
&lt;a href=&#34;https://open.spotify.com/track/7A9TTxnfM89QfwqC2DRjTY?si=582e52b721034d9b&#34;&gt;Work S*** Out&lt;/a&gt;
is likely my favorite track on the album.&lt;/p&gt;
&lt;h3 id=&#34;weekend-soundtrack&#34;&gt;Weekend Soundtrack&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://open.spotify.com/artist/7HXnQUEKHiWvUqSIR9ydOC?si=4-xifDMgR8-eEALlT4yHuw&#34;&gt;Above &amp;amp; Beyond Group Therapy&lt;/a&gt;
is a quite fun weekly electronic set. This is my “put on and do weekend chores”
playlist. I’d recommend
&lt;a href=&#34;https://open.spotify.com/album/5nc4PSRHABHaWWZwqtBv8D?si=rx4EwM0lS72p5Tkq73qWAQ&#34;&gt;this set from November&lt;/a&gt;
as a good entry point.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;As I said in the beginning, it was a great year for music. :) I got out to see
more concerts than I have in recent years, and put more time and attention into,
for example, listening to Spotify’s suggested new music every week. This year
will be hard to top, but I&amp;rsquo;m looking forward to seeing what 2026 brings when I
reset my playlist to a blank slate on January 1.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;The vocalist for the opener, Night Talks, also performed with Cory Wong in
Chicago in
&lt;a href=&#34;https://www.youtube.com/watch?v=569pPx36xyY&#34;&gt;this excellent recording&lt;/a&gt; of
&amp;ldquo;Synchronicity&amp;rdquo;.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Stop what you’re doing now and watch Vulfpeck’s 2025 Madison Square Garden
&lt;a href=&#34;https://www.youtube.com/watch?v=6sHbetTA7qA&#34;&gt;concert recording&lt;/a&gt;. It’s such
a joy.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;[Edit: 12/28/2025] I also realized that the soloing pianist in Lingus is the
same Cory Henry whose solo track,
&lt;a href=&#34;https://open.spotify.com/track/7FlvbdRd9ul5Ipk1Ejzmky?si=4e3d01c553f348b9&#34;&gt;Best of Me&lt;/a&gt;,
was one of my favorite songs a few years ago!&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Collecting Shibboleths</title>
        <link>https://benjamincongdon.me/blog/2025/12/18/Collecting-Shibboleths/</link>
        <pubDate>Thu, 18 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/18/Collecting-Shibboleths/</guid>
        <description>&lt;blockquote&gt;
&lt;p&gt;Judge a man by his questions rather than by his answers. &amp;ndash; Voltaire&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I am an introvert by nature, but I come alive for a good conversation. I was
reflecting on this after a recent international flight, where I was sitting next
to a friendly man who turned out to be a late-career civil engineer. I have a
close friend who is a civil engineer, and so as this conversation unfolded I was
able to inject pieces of information I’d learned from her &amp;ndash; the differences
between various civil CAD tools, stormwater management challenges, and so on.&lt;/p&gt;
&lt;p&gt;In a conversation that’s going &lt;em&gt;well&lt;/em&gt; there are ample opportunities to ask
perceptive questions where you can learn interesting pieces of information. That
information can then sit in your back pocket, to be carefully deployed the next
time you run into a similar conversational vein. This can be absolutely
delightful when it works.&lt;/p&gt;
&lt;p&gt;I’ve loosely been referring to this as &lt;strong&gt;“Collecting Shibboleths”&lt;/strong&gt; in my inner
dialect. “Shibboleth” here is used loosely. It typically denotes language that
separates an in-group from an out-group. However, I’m using it more in the sense
of language that indicates a certain level of knowledge about a topic, used as a
bid for establishing rapport. It’s a “hey, I’ve walked some of these mental
paths before, not as deeply as you, but I’m curious to learn more”. And it’s
“&lt;em&gt;Collecting&lt;/em&gt; Shibboleths” out of recognition that each conversation you have
contains an opportunity to learn something interesting or useful. You are
building a lifelong repertoire, and you do so by asking engaging questions.&lt;/p&gt;
&lt;p&gt;There is a huge difference between “Oh, you do low temperature physics, that’s
cool” and getting to “Oh, your lab synthesizes Bose-Einstein Condensates? That’s
fascinating &amp;ndash; evaporative cooling, right? How long are you able to stabilize it
for?”. Usually it doesn’t take &lt;em&gt;much&lt;/em&gt; knowledge to bootstrap this. A little
knowledge and a lot of curiosity goes a long way.&lt;/p&gt;
&lt;p&gt;Using a shibboleth is distinct from pretending to know more about something than
I actually do. You aren’t trying to “impress”. Rather, the goal is to add more
conversational hooks for the other person to respond to. More interesting paths
for the conversation to wander down.
&lt;a href=&#34;https://www.experimental-history.com/p/good-conversations-have-lots-of-doorknobs&#34;&gt;More doorknobs that are asking to be turned.&lt;/a&gt;
You’re looking for a spark of recognition in the other person &amp;ndash; the interplay
of “Hey! Here’s an opening” and “Yes! I’ll run with it”.&lt;/p&gt;
&lt;p&gt;Better questions lead to better answers. Better answers enrich your mental
models. And better mental models, in turn, sharpen the questions you ask next
time. This leads to a wondrously virtuous cycle in that becoming a better
conversationalist is self-reinforcing.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Book Review: I Am a Strange Loop</title>
        <link>https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/</link>
        <pubDate>Wed, 17 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/</guid>
        <description>&lt;blockquote&gt;
&lt;p&gt;In the end, we self-perceiving, self-inventing, locked-in mirages are little
miracles of self-reference. &amp;hellip; Our very nature is such as to prevent us from
fully understanding its very nature. &amp;ndash; Douglas R. Hofstadter&lt;/p&gt;
&lt;/blockquote&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/book_cover.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/book_cover_hu809b9dcaa0628aa9d80dae7a10781452_64966_0x600_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/book_cover_hu809b9dcaa0628aa9d80dae7a10781452_64966_0x600_resize_q100_lanczos.jpg&#34; width=&#34;198&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;Most people know of Douglas Hofstadter for his masterpiece &lt;em&gt;Gödel, Escher, Bach&lt;/em&gt;
(“GEB”). I quite enjoyed GEB, but one of his lesser known books &amp;ndash; &lt;em&gt;I Am a
Strange Loop&lt;/em&gt; &amp;ndash; has stuck with me in a more profound way than GEB. In looking
at other reviews of &lt;em&gt;Strange Loop&lt;/em&gt;, I saw the common criticism that it’s the
“easier, more approachable” version of GEB. Admittedly, GEB is a doorstop to
read through; it has sections written in dialogs, makes heavy use of metaphor,
and so on. GEB is a triumph of conveying a certain set of ideas: Gödel
incompleteness, Turing completeness, and self-referential systems, among others.
But &lt;em&gt;Strange Loop&lt;/em&gt; stands on its own as an excellent philosophical book.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;I Am a Strange Loop&lt;/em&gt; goes much deeper into one of the topics that is discussed
in a cursory way in GEB: what are “souls”, what is “I”, and how do we get
meaning from meaningless physical constituents. If this sounds like metaphysics,
that’s intentional. &lt;em&gt;Strange Loop&lt;/em&gt; discusses in depth philosophy of mind, the
ontological status of the “self”, and “downward causality”.&lt;/p&gt;
&lt;h2 id=&#34;1-strange-loops&#34;&gt;1. Strange Loops&lt;/h2&gt;
&lt;p&gt;Hofstadter coins the term “Strange Loop” to point to a particular phenomenon that
occurs within certain hierarchical systems. Two famous examples of strange loops
are M.C. Escher&amp;rsquo;s &lt;em&gt;Drawing Hands&lt;/em&gt;, in which two hands appear to be drawing each
other, and Kurt Gödel&amp;rsquo;s Incompleteness Theorems, which use self-reference to
prove that, loosely, formal mathematical systems of sufficient power cannot be
both complete and consistent.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/drawing_hands.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/drawing_hands_hu72f8e49dde55bde44bb8944662a08892_36330_0x281_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/17/Book-Review-I-Am-a-Strange-Loop/drawing_hands_hu72f8e49dde55bde44bb8944662a08892_36330_0x281_resize_q100_lanczos.jpg&#34; width=&#34;347&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Drawing_Hands&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;A “strange loop” occurs when a hierarchical system has no clear top or bottom &amp;ndash;
rather, the hierarchy appears to loop back on itself into a cycle. In Escher’s
hands, there is no “top hand” drawing the “bottom hand”, and yet there still is
the hierarchical relationship of hand one draws hand two, which draws hand one,
which draws hand two, and so on. Hofstadter refers to this quality as a “tangled
hierarchy”.&lt;/p&gt;
&lt;p&gt;Not all tangled hierarchies are strange loops in Hofstadter&amp;rsquo;s sense, though. The
“strangeness” of a “strange loop” arises when a system is able to perform
self-reference: that is, when a system can point to itself in statements it
makes. This is where you get fun paradoxes like the Liar’s Paradox: “This
statement is false”. Self-reference also forms the basis for Gödel’s
incompleteness theorems. I will not try to explain Gödel incompleteness here
(though I’d highly suggest either GEB or
&lt;a href=&#34;https://www.goodreads.com/book/show/12247620-godel-s-proof&#34;&gt;Gödel’s Proof&lt;/a&gt;),
but the rough shape is encoding something like a Liar’s paradox within a
mathematical proof using a special form of number theory.&lt;/p&gt;
&lt;p&gt;Hofstadter goes on to argue that the concept of “I” and consciousness itself are
strange loops.&lt;/p&gt;
&lt;h2 id=&#34;2-the-i-symbol&#34;&gt;2. The “I” Symbol&lt;/h2&gt;
&lt;p&gt;Where is the strange loop in the brain? Hofstadter begins with the physical: a
brain is a system of neurons that supports representations of a system of
symbols. A symbol, in its simplest form, is just a pattern. These symbols often
correspond to things out in the world &amp;ndash; “dog”, “hot”, “table”. Simple creatures
only need a small set of symbols; more complex creatures evolve the use and
representation of a greater number and complexity of symbols. The strange loop
arises when the system begins to have a symbol for itself &amp;ndash; the “I” symbol.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Among the untold thousands of symbols in the repertoire of a normal human
being, there are some that are far more frequent and dominant than others, and
one of them is given, somewhat arbitrarily, the name &amp;lsquo;I&amp;rsquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Similar to how the symbol for “dog” or “hot” or “table” is created through some
mixture of perception reinforcement and innate genetic programming, so too is
the symbol for “I”. A self-referential symbol is quite a useful one to have. It allows
you to reason about your state within your environment in a rich way. Hofstadter
continues that, though useful, the “I” symbol is still &lt;em&gt;just&lt;/em&gt; a symbol:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Because of the locking-in of the &amp;lsquo;I&amp;rsquo;-symbol that inevitably takes place over
years and years in the feedback loop of human self-perception, causality gets
turned around and &amp;lsquo;I&amp;rsquo; seems to be in the driver’s seat.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The strong reinforcement of the “I”, particularly through interpersonal
relationships and culture, results in the “I” symbol being locked in to a
seemingly privileged position of appearing primary to other symbols. There is
this perception that there is an “I” driving the body around, it is the “I”
perceiving things from the seat of the brain, and it is the “I” symbol itself
that is making decisions. &lt;em&gt;Strange Loop&lt;/em&gt; argues that this is not a true
reflection of reality, but is instead a “hallucination”:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;My claim that an &amp;lsquo;I&amp;rsquo; is a hallucination perceived by a hallucination is
somewhat like the heliocentric viewpoint&amp;hellip; The basic idea is that the dance
of symbols in a brain is itself perceived by symbols, and that step extends
the dance, and so round and round it goes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The “I” symbol is notable because it is both conceived of as observer and
observed. Many thought patterns have this self-referential mode: I can think the
thought “I am thinking this thought”. But this loops back on itself in strange
loop fashion &amp;ndash; the “I” observing the thought and the “I” referenced in the
thought itself are akin to Escher’s self-drawing hands.&lt;/p&gt;
&lt;p&gt;This is why Hofstadter calls this a “hallucination perceived by a
hallucination”. Hofstadter compares this shift to the Copernican revolution,
changing from an Earth-centric geocentric model to the Sun-centric heliocentric
model &amp;ndash; which is a more &amp;ldquo;accurate&amp;rdquo; model of physical reality. The analogue of
the geocentric model of thought is the conventional one: “I am a self, I think
thoughts”. The analogue of the heliocentric model is this: the brain is a symbol
processing machine, of which “I” is a particularly powerful symbol. However, “I”
is not ontologically distinct from any other symbol in the brain; it does not
exist in a “prior” or privileged position. The “I” symbol is real, but it does
not imply the causal structure that we intuitively feel &amp;ndash; that the “I” is what
is choosing to act, or is what is perceiving.&lt;/p&gt;
&lt;p&gt;The obvious followup question to this is, if the “I” doesn’t have causal power
over what we decide to do, then what &lt;em&gt;is&lt;/em&gt; making decisions? Hofstadter’s answer
is what he refers to as “downward causality”.&lt;/p&gt;
&lt;h2 id=&#34;3-downward-causality&#34;&gt;3. Downward Causality&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Strange Loop&lt;/em&gt; distinguishes two types of causality: “upward causality”, and
“downward causality”.&lt;/p&gt;
&lt;p&gt;Upward causality is what we conventionally think of as reductionist causality:
reducing a problem to its simplest parts helps us understand causal
relationships. Diseases are best understood in terms of the bacteria or viruses
that cause them; in turn, those bacteria are best understood via their genetic
components; genetic interactions are mediated by specific proteins; proteins
have a particular chemical structure that are affected by various atomic-scale
forces; these atomic forces are carried by force-carrying particles&amp;hellip; and so
on. The causal forces push upward: the atomic forces influence the proteins,
influence the bacteria, influence the disease progression.&lt;/p&gt;
&lt;p&gt;Downward causality is the reverse of this: the “top down” reasons for an
occurrence have just as much causal power. A disease spreads because of improper
public health facilities, or poor social acceptance of hand washing. Failing to
wash one&amp;rsquo;s hands does have causally explanatory power over getting a disease or
not, just as the makeup of a bacterium has causally explanatory power over how
a human can be infected by it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Deep understanding of causality sometimes requires the understanding of very
large patterns and their abstract relationships and interactions, not just the
understanding of microscopic objects interacting in microscopic time
intervals.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It would be, in some sense, an easier task to understand the world if all we had
to do was reduce everything to the smallest microscopic objects, and build
causality up from there. Hofstadter argues that view is mistaken.&lt;/p&gt;
&lt;p&gt;Returning to “I”, this same causality flip is what makes it seem like the “I” is
doing some causally important work:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Since we perceive not particles interacting but macroscopic patterns in which
certain things push other things around with a blurry causality, and since the
Grand Pusher in and of our bodies is our “I”, and since our bodies push the
rest of the world around, we are left with no choice but to conclude that the
“I” is where the causality buck stops. The “I” seems to each of us to be the
root of all our actions, all our decisions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So where does this leave us? The “I” has causal power, but not in the way we
intuitively think. If I stand up to get a glass of water, it may not be my
internal “I” symbol making the decision, but the causality still runs through
the high level pattern of “I”-ness. “I am thirsty” has explanatory power over my
decision.&lt;/p&gt;
&lt;h2 id=&#34;4-will-and-free-will&#34;&gt;4. “Will” and “Free Will”&lt;/h2&gt;
&lt;p&gt;Which then brings us to the hobby horse of any philosophy of mind discussion:
free will. Hofstadter plainly argues that “free will is an illusion”, similar to
how the “I” concept is a hallucination. However, Hofstadter claims that not much
is actually lost in this resignation over free will:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I am pleased to have a will, or at least I’m pleased to have one when it is
not too terribly frustrated by the hedge maze I am constrained by, but I don’t
know what it would feel like if my will were free. What on earth would that
mean? That I didn’t follow my will sometimes? &amp;hellip; I guess that if I wanted to
frustrate myself, I might make such a choice — but then it would be because I
wanted to frustrate myself&amp;hellip; in either case, my non-free will would win out
and I’d follow the dominant desire in my brain.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;One of the best litmus tests for discussion of free will that I’ve collected is
the notion that a free will “could have chosen otherwise” &amp;ndash; that is, were you
in the same position, it is conceivable that a different decision would have
been made. Hofstadter argues that this notion of “could have chosen otherwise”
is part of the illusion of the “I”. The “I” allows us to &lt;em&gt;feel&lt;/em&gt; like we’re a
party to our decisions, but that is merely a feeling:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Our will, quite the opposite of being free, is steady and stable, like an
inner gyroscope, and it is the stability and constancy of our non-free will
that makes me me and you you, and that also keeps me me and you you.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I find this to be a rather tidy resolution of the question of free will. While
this discussion is admittedly still in the abstract, it does appear that this
explanation would give a way to ground the “feeling” of free will with a
symbolic and computational model of cognition. This reconciles a fundamentally
materialist view of cognition with the feeling of “spooky ability to choose
otherwise”.&lt;/p&gt;
&lt;h2 id=&#34;5-distributed-selves&#34;&gt;5. Distributed Selves&lt;/h2&gt;
&lt;p&gt;I would be remiss if I didn’t mention two of the most emotionally
affecting chapters of &lt;em&gt;Strange Loop&lt;/em&gt;, wherein Hofstadter discusses the death of
his wife. One of the interesting implications of &lt;em&gt;Strange Loop&lt;/em&gt;’s symbolic
interpretation of consciousness is that these symbols can be transmitted &amp;ndash;
imperfectly, but partially transmitted nonetheless. This has implications for
how we interact with others, as our interactions result in corresponding symbols
of their “I” or “Self” being transferred:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One day, as I gazed at a photograph of Carol taken a couple of months before
her death, I looked at her face and I looked so deeply that I felt I was
behind her eyes, and all at once, I found myself saying, as tears flowed,
&amp;lsquo;That’s me! That’s me!&amp;rsquo; &amp;hellip; I realized then that although Carol had died, that
core piece of her had not died at all, but that it lived on very determinedly
in my brain.&lt;/p&gt;
&lt;p&gt;We live inside such people, and they live inside us&amp;hellip; someone that close to
us is represented on our screen by a second infinite corridor, in addition to
our own infinite corridor. We can peer all the way down — their strange loop,
their personal gemma, is incorporated inside us.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Dispassionately, I’m not sure how much credence to give to this notion, but it
is a beautiful idea. It does seem more than reasonable that we represent others
with increasingly rich inner symbols the longer we’ve known them, and it seems
evident that those symbols can, and indeed do, persist after the death of the
symbol’s person.&lt;/p&gt;
&lt;h2 id=&#34;6-conclusion&#34;&gt;6. Conclusion&lt;/h2&gt;
&lt;p&gt;I first read &lt;em&gt;Strange Loop&lt;/em&gt; several years ago, during a time of significant
personal change. Many of the ideas it expounds have become surprisingly
load-bearing over the subsequent years. &lt;em&gt;Strange Loop&lt;/em&gt; gave me an intellectually
satisfying “in” to exploring my own mental landscape with both curiosity and
humility, made me much more uncertain and curious about what is going on inside
LLMs, and left me with a great appreciation for Hofstadter as both a thinker and
a writer. If any of the ideas above resonated, I’d highly recommend &lt;em&gt;I Am a
Strange Loop&lt;/em&gt;.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>What Are You Trying to Say?</title>
        <link>https://benjamincongdon.me/blog/2025/12/16/What-Are-You-Trying-to-Say/</link>
        <pubDate>Tue, 16 Dec 2025 20:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/16/What-Are-You-Trying-to-Say/</guid>
        <description>&lt;p&gt;From &lt;a href=&#34;https://x.com/dwarkesh_sp/status/1914063267196256765&#34;&gt;Dwarkesh Patel&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Unreasonably effective writing advice:&lt;/p&gt;
&lt;p&gt;&amp;ldquo;What are you trying to say here?&lt;/p&gt;
&lt;p&gt;Okay, just write that.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Writing effectively is notoriously hard. Even once you get past writer’s block
and are actually writing words on the page, organizing those words coherently is
challenging.&lt;/p&gt;
&lt;p&gt;One virtue I’ve come to value in writing is simplicity. That is, writing in a
way that can be understood while minimizing the cognitive load on the reader. I
find this virtue most useful in technical design proposals, where there is a
high premium on saying precisely what you mean with minimal overhead.&lt;/p&gt;
&lt;p&gt;A 40-page design doc can look superficially impressive, but I tend to be more
impressed by the ones that are 3-5 pages, and yet are so crisp that they have an
“airtight” quality to them.&lt;/p&gt;
&lt;p&gt;I’ve tried to keep this dictum in mind when writing recently. This advice is
particularly helpful when you’re drawn to fluffy or hedging language. Hedging
language often indicates that you haven’t fully figured out what you want to
say. Fluffy language can be a symptom of working on a piece of writing for too
long, and forgetting to have empathy with uninitiated readers. The goal: skip
all the extra baggage that inflates or deflates the message you’re trying to
convey. Just write the message in the simplest possible way for your audience.&lt;/p&gt;
&lt;p&gt;Writing has a nearly endless set of failure modes: too jargon-heavy, too vague,
too detached from reality, too focused on irrelevant details, too long for
readers to find the core point, and many, many more.&lt;/p&gt;
&lt;p&gt;Simple writing gets you at least one thing: brevity. &lt;strong&gt;Simple writing may still
be &lt;em&gt;bad&lt;/em&gt;, but it will be bad &lt;em&gt;in a way that can be easily critiqued&lt;/em&gt;.&lt;/strong&gt; Yes,
simple ideas often need nuanced explanations. But start with the simple
explanation and expand only for readers that care. This is, for example, why
“TLDR”s have been so widely adopted.&lt;/p&gt;
&lt;p&gt;What I like about “What are you trying to say? Just write that” is that it
nudges you to articulate the core of your message. Cut out the hedging, cut out
the fancy jargon, just: what is the actual pitch? And then, well, just write
that. You can still go on from there to polish and add nuance where appropriate.
The core message must come first though; all the polish and wordsmithing in the
world can’t save a piece of writing that doesn’t know what it’s actually trying
to say.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Day 15 of Daily Writing</title>
        <link>https://benjamincongdon.me/blog/2025/12/15/Day-15-of-Daily-Writing/</link>
        <pubDate>Mon, 15 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/15/Day-15-of-Daily-Writing/</guid>
        <description>&lt;p&gt;This is post 15 of my unannounced, self-imposed month of daily writing. I’ve
been making soft promises to myself and others to write more for&amp;hellip; years. I was
inspired by a few of the folks who wrote daily last month for
&lt;a href=&#34;https://www.inkhaven.blog/&#34;&gt;Inkhaven&lt;/a&gt;, and so decided to do my own super
unofficial version of that.&lt;/p&gt;
&lt;p&gt;It’s been fun so far! And by “fun” I mean it’s been a rewarding challenge. :)&lt;/p&gt;
&lt;p&gt;Some thoughts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Sitting down to “do writing” is a lot easier if you’ve scratched down some
notes, or started on an outline of a post already. I’ve been using a
mishmash of &lt;a href=&#34;https://ia.net/writer&#34;&gt;iA Writer&lt;/a&gt;,
&lt;a href=&#34;https://obsidian.md/&#34;&gt;Obsidian&lt;/a&gt;, and &lt;a href=&#34;https://getdrafts.com/&#34;&gt;Drafts&lt;/a&gt; to
outline and ideate.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Writing for a deadline is helpful.&lt;/strong&gt; There have been a couple posts where
in the past, I would have endlessly tweaked and wordsmithed before
publishing. When the forcing function is “I want to finish this so I can go
to bed”, suddenly editing decisions become easier. Similarly, writing
&lt;em&gt;daily&lt;/em&gt; helps me stay in the rhythm of writing and build up “muscle memory”.
Only writing when I feel “inspired” is a recipe for not writing much.&lt;/li&gt;
&lt;li&gt;Writing is mentally taxing, but “doable” on low mental energy. Ideating is a
more enjoyable / open activity, but is &lt;em&gt;not&lt;/em&gt; doable with low mental energy.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Posting more means I am less precious about each post.&lt;/strong&gt; I think this is
good. Having variance in article quality is healthy. For my preferences, a
mixture of high-effort and mid-effort posts seems like a good mix. Only
posting high-effort content is less rewarding, ultimately, since most of my
more “popular” posts are my mid-effort ones, but most of the posts that I’m
“most glad I’ve written” are the high-effort ones. Each of these is
rewarding in its own way.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High-effort posts, unsurprisingly, take a lot of time.&lt;/strong&gt; Both book reviews
I’ve posted this month were done on weekends; this isn’t an accident, though
it wasn’t intentionally planned.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Running is a good time for ideation.&lt;/strong&gt; I already knew this, but when I’m
in the mode of “it would be nice to have things to write about”, it’s nice
to know that I have an activity that readily fosters idea generation.&lt;/li&gt;
&lt;li&gt;Writing more made me tidy house a bit, digitally. I’ve been tweaking on the
margins with my personal site as I now look at it more frequently. I
improved the &lt;a href=&#34;https://benjamincongdon.me/blog/archive/&#34;&gt;blog archive page&lt;/a&gt;, added a markdown output to
each post for LLMs&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, improved my image loading performance, and so on.
This sort of &lt;a href=&#34;https://maggieappleton.com/garden-history&#34;&gt;digital garden&lt;/a&gt;
tending feels wholesome.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What’s been working:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Using iA Writer has been legitimately delightful, even though I likely use
&amp;lt;5% of its full feature set.&lt;/li&gt;
&lt;li&gt;Allowing myself to post articles that I think are past the 80% quality line
has made me feel much freer to “just write more”. Move on to the next idea.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What’s NOT been working:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I still want to write on some less directly technical topics (e.g.
psychology, philosophy, etc.), and have been finding it somewhat tricky to
find an “in” there. I think I’ll just need to accept that those posts will
be intentionally not in my normal lane and bite that bullet.&lt;/li&gt;
&lt;li&gt;Writing pieces that take more than one sitting is hard in this mode. I
generally ideate in the morning and write in the evening, and “need” to get
a post out by the evening. I tend to have a fairly single-track mind for
writing, and so parallelism is hard. Which means sequentially writing one
post per day. Which means, mostly, writing them in one sitting &amp;ndash; and that
places a (likely constructive) limit on how much effort I can put into any
individual post.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Beyond the mechanics, writing more intentionally has also made me think about my
projected online identity. My blog has always held a bit of a weird niche in my
mind &amp;ndash; it’s part public journal, part “performative display of competence”,
part curiosity log. I primarily write on technical topics, but I wouldn’t call
this a “tech blog”. Sometimes my articles get picked up on Hacker News; sometimes my
coworkers read my blog, or inspire a post. Sometimes I’m writing just because I
want something to exist, because I’ve thought of a connection between some ideas
that I find interesting. Upon reflection, my ideal reader is the person who
resonates with many of the same topics that I’m interested in, has my blog as
one of many in an RSS reader, and (if the stars align) gets a little bit of joy
or curiosity when clicking on a new piece of my writing.&lt;/p&gt;
&lt;p&gt;In any case, I have the explicit awareness while doing this month-long writing
exercise that this is &lt;em&gt;not&lt;/em&gt; maintainable as a daily practice. Part of this
effort is to shake out all the cobwebs of ideas that I’ve had on my “to write”
list for far too long. The
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/&#34;&gt;Antimemetics&lt;/a&gt; review was one, but
there’s one or two other personally important topics that I’d like to get in
writing before the month is over. The hope, though, is that I find some pieces
of this experience that I enjoy enough or am able to integrate well enough to
exit this with a “more frequent than quarterly” writing routine.&lt;/p&gt;
&lt;p&gt;Thanks for reading, I appreciate it. (And do drop an email if you ever feel
inclined, I enjoy hearing from folks!)&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Add &lt;code&gt;/index.md&lt;/code&gt; as a suffix to any blog url to get the markdown version of
it.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Book Review: The Demon in the Machine</title>
        <link>https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/</link>
        <pubDate>Sun, 14 Dec 2025 16:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/</guid>
        <description>&lt;blockquote&gt;
&lt;p&gt;The thing that separates life from non-life is information. - Paul Davies&lt;/p&gt;
&lt;/blockquote&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/book_cover.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/book_cover_hube906ca89a7d6233ea56b65fb260bfe6_831368_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/book_cover_hube906ca89a7d6233ea56b65fb260bfe6_831368_0x600_resize_lanczos_3.png&#34; width=&#34;208&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;I’ve probably learned about the thought experiment of Maxwell’s demon at least
half a dozen times &amp;ndash; in multiple physics courses, in multiple books. Until I
read Paul Davies’ &lt;em&gt;The Demon in the Machine&lt;/em&gt;, I don’t think I realized that the
paradox of Maxwell’s demon had actually been &lt;em&gt;solved&lt;/em&gt;.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; More on that later.&lt;/p&gt;
&lt;p&gt;The core thrust of Paul Davies’ &lt;em&gt;The Demon in the Machine&lt;/em&gt; is to look at life as
a puzzle of &lt;em&gt;thermodynamics&lt;/em&gt; and &lt;em&gt;information&lt;/em&gt;. It starts with the question: how
is it possible that life seems to reliably be able to render order from chaos?
Living systems are able to hold boundaries. Life is able to
create and sustain fantastically ordered structures &amp;ndash; cells, organs, limbs,
brains &amp;ndash; out of the chaotic inorganic soup comprising the rest of the physical
universe.&lt;/p&gt;
&lt;p&gt;Davies’ answer is information theory, with his core thesis that “Life = Matter +
Information”.&lt;/p&gt;
&lt;h2 id=&#34;1-demonology&#34;&gt;1. Demonology&lt;/h2&gt;
&lt;p&gt;The titular “demons” of &lt;em&gt;Demon in the Machine&lt;/em&gt; are, of course, those of the
&lt;a href=&#34;https://en.wikipedia.org/wiki/Maxwell%27s_demon&#34;&gt;Maxwell Demon&lt;/a&gt; thought
experiment. Maxwell’s demon &amp;ndash; sorry, now you have to listen to yet one more
explanation of it &amp;ndash; is the thought experiment wherein a gas is split into two
chambers, separated by a frictionless door. A demon watching molecules of gas
transiting between the two sections could use the door as a sort of filter &amp;ndash;
blocking slow particles from moving to, say, the left section, resulting in more
slow particles being in the right section and more fast particles being in the
left section.&lt;/p&gt;
&lt;p&gt;This would violate the Second Law of Thermodynamics by reducing the amount of
entropy (“disorder”) in the system. As another intuition for this, we
effectively “sorted” a gas at equilibrium into hot and cold without using any
energy.&lt;/p&gt;
&lt;p&gt;Things get weirder: were this true, we could configure this demon to, instead of
merely sorting the particles using its door, run an engine to produce work. We
reconfigure the room to have the door also act as a piston. When we’ve sorted
more hot particles into one side, we can close the door and have it move as a
piston would, the hotter particles pushing against the area with less pressure,
expanding until thermal equilibrium is reached. Then, the door can be reopened
and the process repeated. This demonic engine is known as a
&lt;a href=&#34;https://en.wikipedia.org/wiki/Entropy_in_thermodynamics_and_information_theory#Szilard&#39;s_engine&#34;&gt;Szilard engine&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Were Maxwell’s demon possible &lt;em&gt;and costless&lt;/em&gt;, we could (say) run a refrigerator
indefinitely, without any energy input. Were a Szilard engine possible &lt;em&gt;and
costless&lt;/em&gt;, we could (say) extract a vast amount of mechanical energy from a gas
at thermodynamic equilibrium.&lt;/p&gt;
&lt;h2 id=&#34;2-information-engines&#34;&gt;2. Information Engines&lt;/h2&gt;
&lt;p&gt;The bad news is that such a costless engine is not possible. The good news is
that Szilard engines are actually realizable in the physical world.&lt;/p&gt;
&lt;p&gt;Leo Szilard, working in 1929, actually suggested a simpler version of the
Maxwell’s demon thought experiment than the one I sketched above to power his
engine:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We simplify the setup to a single molecule in a box.&lt;/li&gt;
&lt;li&gt;The demon can insert a partition in the middle of the box, such that the
molecule is trapped on either the left or right side of the partition.&lt;/li&gt;
&lt;li&gt;The demon &lt;em&gt;knows&lt;/em&gt; the location of the molecule &amp;ndash; whether it is on the left
or right side. This is 1 bit of information.&lt;/li&gt;
&lt;li&gt;The demon can position the piston on the side where the molecule isn&amp;rsquo;t, so
the molecule pushes against it and does work.&lt;/li&gt;
&lt;/ul&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/szilard-engine.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/szilard-engine_hucf9f5dcb74f9fb63b2445329317f5829_175449_0x700_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/szilard-engine_hucf9f5dcb74f9fb63b2445329317f5829_175449_0x700_resize_lanczos_3.png&#34; alt=&#34;Szilard&amp;amp;rsquo;s engine; Figure 4 from The Demon in the Machine&#34; width=&#34;488&#34; height=&#34;350&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Szilard&amp;rsquo;s engine; Figure 4 from &lt;em&gt;The Demon in the Machine&lt;/em&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Szilard suggested that the energetic “fuel” for this engine was the knowledge of
the molecule’s location.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Szilard concluded, reasonably enough, that the price [of the engine&amp;rsquo;s work]
was the cost of measurement.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As it turned out, this was a mistaken conclusion. IBM physicist Rolf Landauer
extended Szilard’s work in 1961, introducing
&lt;a href=&#34;https://en.wikipedia.org/wiki/Landauer%27s_principle&#34;&gt;“Landauer’s Principle”&lt;/a&gt;,
which states that it is not the &lt;em&gt;measurement&lt;/em&gt; of information that requires work,
but rather the &lt;em&gt;erasure&lt;/em&gt; of information. Landauer, studying the physics of
computation, was attempting to quantify the minimum amount of work needed to
operate a logic gate.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Landauer coined a now-famous dictum: ‘Information is physical!’ What he meant
by this is that all information must be tied to physical objects: it doesn’t
float free in the ether.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is fairly unintuitive, so let’s hold on this point for a minute: measuring
and accumulating information is thermodynamically free, but erasing that
information is costly.&lt;/p&gt;
&lt;p&gt;As an intuition, we can look at the notion of “reversibility”. I map this in my
head to something like: while measurement and accumulation of information is
happening, we are traveling down a thermodynamic gradient. Yes, we are “getting
work”, but in doing so we are traversing the thermodynamic state space of the
world. Each time the Szilard engine demon makes a “left or right” decision,
there is an implicit ledger of “lefts” and “rights” that is collected alongside
the work coming out of the system. The thought experiment abstracts this ledger,
but it is in fact a physical configuration in the Demon’s memory (weird,
right?).&lt;/p&gt;
&lt;p&gt;This ledger fills up, eventually. When you want to reuse the Szilard engine,
you have to eventually reset it, as the Demon does not have infinite memory. And
that erasure is when the process becomes irreversible. With the ledger in hand,
you can trace back the full state space tree of the engine operation. Deleting
the ledger resets your memory back to baseline, but also makes it such that you
cannot trace back the operation, going from an in principle reversible process
to an irreversible one. This is the energetic cost of computation.&lt;/p&gt;
&lt;h2 id=&#34;3-physical-demon-instantiation&#34;&gt;3. Physical Demon Instantiation&lt;/h2&gt;
&lt;p&gt;We have gotten slightly closer to an understanding of the physical instantiation
of a Maxwell demon / Szilard engine, but the intuition of the thought experiment
breaks down when we consider the ledger. Why do we care about the demon’s
memory? To see why, Davies has us remove the “magic” animation of the demon and
instead try to instantiate this idea in a physical system.&lt;/p&gt;
&lt;p&gt;I will quote liberally here, because I find this explanation by Davies, though
long and challenging to visualize, to be one of the most fascinating insights
from the book:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It must be possible to substitute a mindless gadget – a demonic automaton –
that would serve the same function. Recently, Christopher Jarzynski at the
University of Maryland and two colleagues dreamed up such a gadget, which they
call an information engine. Here is its job description: ‘it systematically
withdraws energy from a single thermal reservoir and delivers that energy to
lift a mass against gravity while writing information to a memory register’.
&amp;hellip;&lt;/p&gt;
&lt;p&gt;The Jarzynski contraption resembles a child’s plaything (see Fig. 6). The
demon itself is simply a ring that can rotate in the horizontal plane. A
vertical rod is aligned with the axis of the ring, and attached to the rod are
paddles perpendicular to the rod which stick out at different angles, like a
mobile, and can swivel frictionlessly on the rod. The precise angles don’t
matter; the important thing is whether they are on the near side or the far
side of the three co-planar rods as shown. On the far side, they represent 0;
on the near side, they represent 1. These paddles serve as the demon’s memory,
which is just a string of digits such as 01001010111010 … The entire gadget is
immersed in a bath of heat so the paddles randomly swivel this way and that as
a result of the thermal agitation. However, the paddles cannot swivel so far
as to flip 0s into 1s or vice versa, because the two outer vertical rods block
the way. The show begins with all the blades above the ring set to 0, that is,
positioned somewhere on the far side as depicted in the figure; this is the
‘blank input memory’ (the demon is brainwashed). &amp;hellip; One of the vertical rods
has a gap in it at the level of the ring, so now as each blade passes through
the ring it is momentarily free to swivel through 360 degrees. As a result,
each descending 0 has a chance of turning into a 1.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/jarzynski-info-engine.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/jarzynski-info-engine_huf889a9b60d4542dd3809b35aa5c80bba_85217_0x700_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/jarzynski-info-engine_huf889a9b60d4542dd3809b35aa5c80bba_85217_0x700_resize_lanczos_3.png&#34; alt=&#34;Jarzynski&amp;rsquo;s information engine; Figure 6 from The Demon in the Machine&#34; width=&#34;371&#34; height=&#34;350&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Jarzynski&amp;rsquo;s information engine; Figure 6 from &lt;em&gt;The Demon in the Machine&lt;/em&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;blockquote&gt;
&lt;p&gt;Now for the crucial part. For the memory to be of any use to the demon, the
descending blades need to somehow interact with it (remember that, in this
case, the demon is the ring) or the demon cannot access its memory. &amp;hellip; The
demonic ring comes with a blade of its own which projects inwards and is fixed
to the ring; if one of the slowly descending paddles swivels around in the
right direction its blade will clonk the projecting ring blade, causing the
ring to rotate in the same direction. The ring can be propelled either way
but, due to the asymmetric configuration of the gap, there are more blows
sending the ring anticlockwise than clockwise (as viewed from above). As a
result, the random thermal motions are converted into a cumulative rotation in
one direction only. Such progressive rotation could be used in the now
familiar manner to perform useful work. &amp;hellip;&lt;/p&gt;
&lt;p&gt;So what happened to the second law of thermodynamics? We seem once more to be
getting order out of chaos, directed motion from randomness, heat turning into
work. To comply with the second law, entropy has to be generated somewhere,
and it is: in the memory. Translated into descending blade configurations,
some 0s become 1s, and some 0s stay 0s. The record of this action is preserved
below the ring, where the two blocking rods lock in the descending state of
the paddles by preventing any further swivelling between 0 and 1. The upshot
is that Jarzynski’s device converts a simple ordered input state
000000000000000 &amp;hellip; into a complex, disordered (indeed random) output state,
such as 100010111010010 &amp;hellip; &lt;strong&gt;Because a string of straight 0s contains no
information, whereas a sequence of 1s and 0s is information rich, the demon
has succeeded in turning heat into work (by raising the weight) and
accumulating information in its memory. The greater the storage capacity of
the incoming information stream, the larger the mass the demon can hoist
against gravity.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(Emphasis mine)&lt;/p&gt;
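&lt;p&gt;Davies’ bolded claim is Shannon entropy in disguise: a string of identical symbols has zero entropy, while a near-balanced mix of 0s and 1s carries about one bit per symbol. A quick empirical check (my own toy sketch, not from the book):&lt;/p&gt;

```python
import math
from collections import Counter

def shannon_entropy_bits(s: str) -> float:
    """Empirical Shannon entropy per symbol, in bits."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

assert shannon_entropy_bits("000000000000000") == 0.0  # no information
# A near-balanced mix of 0s and 1s carries ~1 bit per symbol:
h = shannon_entropy_bits("100010111010010")  # ~0.997 bits/symbol
```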
&lt;p&gt;I’d suggest watching
&lt;a href=&#34;https://www.youtube.com/watch?v=00TyIShzR6o&#34;&gt;this YouTube video&lt;/a&gt; to get a
better sense of how this works.&lt;/p&gt;
&lt;p&gt;So now we have it: a system extracting mechanical energy from a gas at
thermodynamic equilibrium by &lt;em&gt;means of recording information&lt;/em&gt;. Here we see the
physical manifestation of the demon’s memory. Our engine has a finite number
of paddles; when we run out, we cannot use the engine anymore. In this
physical system it becomes clear that, in order to use the engine
indefinitely, we need to reset it to its “blank slate” state after having
used up its ledger space.&lt;/p&gt;
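&lt;p&gt;The finite-memory point can be made quantitative. Under the Szilard/Landauer bound, each blank bit of memory lets the engine extract at most kT·ln 2 of work from the heat bath; once the memory fills, the engine stops. A back-of-the-envelope sketch (my own, not from the book):&lt;/p&gt;

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def max_work_from_blank_memory(n_bits: int, temp_k: float) -> float:
    """Upper bound on the work an information engine can extract from a
    single heat bath before its memory fills: k*T*ln(2) per blank bit
    (the Szilard/Landauer bound)."""
    return n_bits * K_B * temp_k * math.log(2)

# One gigabyte of blank memory at room temperature (300 K):
work_j = max_work_from_blank_memory(8 * 10**9, 300.0)
height_m = work_j / (1e-3 * 9.81)  # how high it could lift one gram
# work_j is roughly 2.3e-11 J: enough to lift a gram a few nanometres.
```

The striking thing is how little work a macroscopic amount of memory buys: information is a very expensive fuel.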
&lt;p&gt;Having established this, the remainder of &lt;em&gt;The Demon in the Machine&lt;/em&gt; discusses
the connection between information theory and biological life.&lt;/p&gt;
&lt;h2 id=&#34;4-life-and-information&#34;&gt;4. Life and Information&lt;/h2&gt;
&lt;p&gt;Much of Davies’ argument for the relationship between life, matter, and
information boils down to two main ideas: the notion that biological systems have
an informational “software” that runs on top of the bio-mechano-physical
“hardware”, and that this informational “software” has top-down causal power.&lt;/p&gt;
&lt;p&gt;To the first point, about biological “software”:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Biological information is more than a soup of bits suffusing the material
contents of cells and animating it; that would amount to little more than
vitalism. Rather, the patterns of information control and organize the
chemical activity in the same manner that a program controls the operation of
a computer. Thus, buried inside the ferment of complex chemistry is a web of
logical operations. Biological information is the software of life.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;To the second point, on the top-down causal power of information:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Counter to most reductionist thinking, the macroscopic states of a physical
system (such as the psychological state of an agent) that ignore the
small-scale internal specifics can actually have greater causal power than a
more detailed, fine-grained description of the system, a result summed up by
the dictum: ‘macro can beat micro’.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This notion of top-down causality amounts to a firm rejection of “strong
reductionism” &amp;ndash; the notion, in this context, that “the astonishing properties
of living matter could ultimately be explained solely in terms of the physics of
atoms and molecules”.&lt;/p&gt;
&lt;p&gt;I do not feel well-equipped to give an opinion on whether these two positions
are &lt;em&gt;correct&lt;/em&gt; or not, but they are compelling:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The reductionist argument is undeniably powerful, but it rests on a major
assumption about the nature of physical law. The way the laws of physics are
currently conceived leads to a stratification of physical systems with the
laws of physics at the bottom conceptual level and emergent laws stacked above
them. There is no coupling between levels. &lt;strong&gt;When it comes to living systems,
this stratification is a poor fit because, in biology, there often is coupling
between levels, between processes on many scales of size and complexity&lt;/strong&gt;:
causation can be both bottom-up (from genes to organisms) and top-down (from
organisms to genes). To bring life within the scope of physical law – and to
provide a sound basis for the &lt;strong&gt;reality of information as a fundamental entity
in its own right&lt;/strong&gt; – requires a radical reappraisal of the nature of physical
law, as I am arguing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(Emphasis mine)&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s an &lt;a href=&#34;https://xkcd.com/435/&#34;&gt;old XKCD comic&lt;/a&gt; to the effect that “biology is just
applied chemistry; chemistry is just applied physics”. Davies argues against
this: there is information at each layer, and the layers interact in
interesting and notable ways. They are not merely layers of abstraction;
rather, there is &lt;em&gt;something going on&lt;/em&gt; at each layer which cannot be
reduced to just the “base”-most layer.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/xkcd435.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/xkcd435_hube389cc787da1a021d729849bf8e7502_32287_0x308_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/14/Book-Review-The-Demon-in-the-Machine/xkcd435_hube389cc787da1a021d729849bf8e7502_32287_0x308_resize_lanczos_3.png&#34; width=&#34;601&#34; height=&#34;250&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;&lt;a href=&#34;https://xkcd.com/435/&#34;&gt;XKCD 435&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h2 id=&#34;5-quantum-weirdness&#34;&gt;5. Quantum Weirdness&lt;/h2&gt;
&lt;p&gt;The book spends two fascinating chapters on the interaction between the
preceding ideas and quantum physics. My review is already getting long, so I
will limit my discussion here to one striking example of quantum mechanics
showing up in biology:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The bird’s retina is packed with organic molecules; researchers have zeroed in
on retinal proteins dubbed ‘cryptochromes’ to do the job I am describing. When
a cryptochrome electron is ejected by a photon, it doesn’t cut all its links
with the molecule it used to call home. &amp;hellip; The electron, though ejected from
its atomic nest, can still be entangled with a second electron left behind in
the protein atom, but, because of their different magnetic environments, the
two electrons’ gyrations get out of kilter with each other. &amp;hellip; According to
the theory of the avian compass, these particular free radicals react either
with each other (by recombining), or with other molecules in the retina, to
form neurotransmitters, which then signal the bird’s brain. This
neuro-transmission reaction rate will vary according to the specifics of the
spooky link and its mismatched gyrations of the two electrons, which is a
direct function of the angle between the Earth’s magnetic field and the
cryptochrome molecules. &amp;hellip; Is there any evidence to support this
spooky-entanglement story? Indeed there is. &amp;hellip; The era of quantum ornithology
has arrived!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Davies gives several other interesting examples of quantum biological effects &amp;ndash;
such as roles in photosynthesis and smell. However, he is also ready to admit
that early quantum-biology findings “have been hotly debated” and that some
“early claims were overblown”. Aside from the particular anecdotes, the part
of this section that stuck with me was how interesting it is that there was
sufficient evolutionary selection pressure for quantum biological effects to
have arisen.&lt;/p&gt;
&lt;h2 id=&#34;6-conclusion&#34;&gt;6. Conclusion&lt;/h2&gt;
&lt;p&gt;I came away from &lt;em&gt;Demon in the Machine&lt;/em&gt; with a few things that stuck. First, the
Maxwell&amp;rsquo;s demon paradox has actually been “solved”. The resolution through
Landauer&amp;rsquo;s principle &amp;ndash; that information erasure, not measurement, costs
energy &amp;ndash; feels like one of those insights that I&amp;rsquo;m surprised I hadn&amp;rsquo;t
encountered before. Second, information engines are in-principle buildable
things, and there is suggestive evidence that biological systems operate by
similar mechanisms. Third, Davies&amp;rsquo; argument in favor of &amp;ldquo;top-down causality&amp;rdquo; is
an interesting challenge to the common reductionist frame for physics and other
hard sciences.&lt;/p&gt;
&lt;p&gt;I’d strongly recommend &lt;em&gt;Demon in the Machine&lt;/em&gt;; it was quite thought-provoking.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Or if I had, it didn’t stick to nearly the same extent as it did after I
read Davies’ book.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Chorus is Good Software</title>
        <link>https://benjamincongdon.me/blog/2025/12/13/Chorus-is-Good-Software/</link>
        <pubDate>Sat, 13 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/13/Chorus-is-Good-Software/</guid>
        <description>&lt;p&gt;I’ve been using &lt;a href=&#34;https://chorus.sh/&#34;&gt;Chorus&lt;/a&gt; for the past 6-7 months. Within the
first couple days of using it, I was telling everyone I talk with about AI stuff
to try it out. Melty Labs, the company behind Chorus, subsequently built
&lt;a href=&#34;https://conductor.build/&#34;&gt;Conductor&lt;/a&gt;. It appears Conductor is now their
primary focus, and as such they’ve decided to
&lt;a href=&#34;https://github.com/meltylabs/chorus&#34;&gt;open source&lt;/a&gt; Chorus.&lt;/p&gt;
&lt;p&gt;Chorus is a macOS LLM client. Its differentiating feature is that it fetches you
responses from many LLMs in parallel within the same chat. I think it’s quite
good software.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/13/Chorus-is-Good-Software/chorus_screenshot.png&#34;&gt;
            
            
            
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/13/Chorus-is-Good-Software/chorus_screenshot.png&#34; alt=&#34;Chorus screenshot&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Chorus screenshot (&lt;a href=&#34;https://github.com/meltylabs/chorus/tree/main/screenshots&#34;&gt;Source&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;It is &lt;em&gt;snappy&lt;/em&gt;.&lt;/strong&gt; I used to be impressed by the responsiveness of the ChatGPT
app, but after having become a regular Chorus user, I can’t use either the
ChatGPT&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; or Claude macOS apps without feeling like they’re just &lt;em&gt;sluggish&lt;/em&gt; in
comparison.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;It has a small surface area, but executes the UX well.&lt;/strong&gt; The key UX innovation
of Chorus is its multi-model chat &amp;ndash; you can get responses from multiple models,
pick the one you want to thread into the conversation, and continue. There are
also many nice affordances in Chorus that don’t exist in other clients. For
example, chats allow for “inline replies” where you can ask a model about a
response without adding that response to the context window. Chorus also handles
pasted content &lt;em&gt;much&lt;/em&gt; better than ChatGPT desktop or the Gemini AI Studio.
(Claude handles pasted stuff pretty well.) Chorus just feels like a &lt;em&gt;tool&lt;/em&gt; &amp;ndash; as
opposed to a consumer app &amp;ndash; in a way that other clients don’t.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;It puts the user in control more-so than any of the first-party clients.&lt;/strong&gt;
Chat branching and “previous message” editing worked in Chorus well before
ChatGPT and Claude.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; Using a third-party client like Chorus also lets you
jettison the opinionated system prompts that the first-party clients inject, and
lets you switch quickly between system prompts. The first-party system
prompts are good in a consumer app, but Chorus makes for a better “training
wheels off” LLM client.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The multi-model native chat makes it easier to build an intuitive sense of
models.&lt;/strong&gt; Interacting with Claude/Gemini/GPT-N within the same dialog gives you
a much more intuitive sense of the “shape” of each of the models. It’s also
quite interesting to see, say, Claude Sonnet continuing from a Gemini response,
since their inherent writing styles are quite different.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;It allows you to burn tokens.&lt;/strong&gt; First-party chat clients are mostly on a flat,
monthly subscription model. The incentives are to provide you with a good
experience so you don’t cancel, while otherwise minimizing the number of tokens
you use. You pay for all your Chorus tokens. So, want to generate a response
from Opus 4.5 and GPT-5.1 and Sonnet 4.5 and Gemini Pro 3 for each conversation
turn? Go for it. Oftentimes, many fanned-out responses to a message will be
more useful than the “best” model’s response. It’s effectively an easy way to do
“LLM councils” or best-of-N.&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
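&lt;p&gt;The fan-out pattern is straightforward to sketch: send the same prompt to several models concurrently and collect every response. A minimal asyncio sketch, where &lt;code&gt;complete&lt;/code&gt; is a hypothetical stand-in for real vendor SDK calls:&lt;/p&gt;

```python
import asyncio

async def complete(model: str, prompt: str) -> str:
    """Hypothetical placeholder for a per-model completion call;
    swap in real vendor SDK clients here."""
    await asyncio.sleep(0)  # stands in for network latency
    return f"[{model}] response to: {prompt}"

async def fan_out(prompt: str, models: list[str]) -> dict[str, str]:
    # Query every model concurrently; total latency is the slowest
    # single model, not the sum of all of them.
    responses = await asyncio.gather(*(complete(m, prompt) for m in models))
    return dict(zip(models, responses))

results = asyncio.run(fan_out(
    "Summarize Landauer's principle in one sentence.",
    ["claude-opus", "gpt", "gemini"],
))
```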
&lt;p&gt;&lt;strong&gt;It’s very clearly not trying to get your data.&lt;/strong&gt; Chorus originally shipped
with an optional subscription model that was essentially a model proxy. However,
it was always possible to use your own API keys and their
&lt;a href=&#34;https://docs.chorus.sh/account/privacy&#34;&gt;privacy policy&lt;/a&gt; was refreshingly clear:
“We don’t look at your chats, and we don’t want to.” Now that Chorus is
open-sourced, you can verify that.&lt;/p&gt;
&lt;p&gt;We’re still quite early in figuring out what UX patterns work well for LLMs.
Chorus took an interesting idea, executed it well, and as a result has a really
great novel UX that none of the other clients nail. There are obviously many use
cases for LLMs. The spotlight now is on agentic harnesses like Claude Code,
Codex, Cursor, and the like. Chat interfaces are not solved! Assuredly there is
more room to explore the
&lt;a href=&#34;https://simonwillison.net/2023/Apr/2/calculator-for-words/&#34;&gt;“calculator for words”&lt;/a&gt;
idea space.&lt;/p&gt;
&lt;p&gt;I wish Charlie and team the best of luck on Conductor, and thank them for making
Chorus open source. 🎉 I’m looking forward to contributing &amp;amp; hacking on Chorus.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;The macOS app for ChatGPT feels noticeably sluggish now. I think this
regression started around the GPT-5 release, if memory serves. My mental
ranking used to be that the native macOS ChatGPT app was ~2x faster than
Claude. Now, that’s been reversed.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Though I will say that at this point, the ChatGPT/Claude desktop apps have
largely caught up.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;Well, where N = “number of models”, not “number of responses”. I think
Chorus could add support for multiple generations from the same model, but
that doesn’t exist today.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>The Coming Need for Formal Specification</title>
        <link>https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/</link>
        <pubDate>Fri, 12 Dec 2025 17:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/</guid>
        <description>&lt;p&gt;In late 2022, I had a conversation with a senior engineer on the coming problem
of “what to do when AI is writing most of the code”. His opinion, which I found
striking at the time, was that engineers would transition from writing mostly
“implementation” code, to mostly writing tests and specifications.&lt;/p&gt;
&lt;p&gt;I remember thinking at the time that this was prescient. With three years of
hindsight, it seems like things are trending in a different direction. I
&lt;em&gt;thought&lt;/em&gt; that the reason that testing and specifications would be useful was
that AI agents would be struggling to &amp;ldquo;grok&amp;rdquo; coding for quite some time, and
that you’d need to have robust specifications such that they could stumble
toward correctness.&lt;/p&gt;
&lt;p&gt;In reality, AI-written tests were one of the &lt;em&gt;first&lt;/em&gt; tasks I felt comfortable
delegating. Unit tests are squarely in-distribution for what the models have
seen in public open source code: there are a lot of unit tests out there, and
they follow predictable patterns. I’d expect that the variance of
implementation code &amp;ndash; and the requirement for out-of-distribution patterns &amp;ndash;
is much higher than for testing code. The result is that models are now quite
good at translating English descriptions into crisp test cases.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;h2 id=&#34;system-design&#34;&gt;System Design&lt;/h2&gt;
&lt;p&gt;There exists a higher level problem of holistic system behavior verification,
though. Let’s take a quick diversion into systems design to see why.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/gestalt_1.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/gestalt_1_huabf9a4a7c7f4371431069f963f6f899e_119953_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/gestalt_1_huabf9a4a7c7f4371431069f963f6f899e_119953_0x600_resize_lanczos_3.png&#34; width=&#34;433&#34; height=&#34;300&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    
&lt;/figure&gt;

&lt;p&gt;System design happens on multiple scales. You want systems to be robust &amp;ndash; both
in their runtime, and their ability to iteratively evolve. This nudges towards
decomposing systems into distinct components, each of which can be &lt;em&gt;internally&lt;/em&gt;
complicated but exposes a firm interface boundary that allows you to abstract
over this internal complexity.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/gestalt_2.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/gestalt_2_hudbca47da5942290f044817815daa5db7_223341_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/gestalt_2_hudbca47da5942290f044817815daa5db7_223341_0x600_resize_lanczos_3.png&#34; width=&#34;399&#34; height=&#34;300&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    
&lt;/figure&gt;

&lt;p&gt;If we design things well, we can swap out parts of our system without disrupting
other parts or harming the top-level description of what the system does. We can
also perform top-down changes iteratively &amp;ndash; adding new components, and retiring
old ones, at each level of description of the system.&lt;/p&gt;
&lt;p&gt;This all requires careful thinking of how to build these interfaces and
component boundaries in such a way that (1) there is a clean boundary between
components and (2) that stringing all the components together actually produces
the desired top-level behavior.&lt;/p&gt;
&lt;p&gt;To do this effectively, we require maps of various levels of description of the
system’s
&lt;a href=&#34;https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation&#34;&gt;territory&lt;/a&gt;. My
conjecture is that &lt;em&gt;code is not a good map for this territory&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;To be clear, I’ve found a lot of value in throwing out system diagram maps and
looking directly at the code territory when debugging issues. However,
code-level reasoning is often not the best level of abstraction to use for
reasoning about systems. This is for a similar reason that &amp;ldquo;modeling all the
individual molecules of a car&amp;rdquo; is not a great way to estimate that car&amp;rsquo;s braking
distance.&lt;/p&gt;
&lt;p&gt;LLMs have increasingly longer context windows, so one could naively say “just
throw all the code in the context and have it work it out”. Perhaps. But this is
still just clearly not the most efficient way to reason about large-scale
systems.&lt;/p&gt;
&lt;h2 id=&#34;formal-verification&#34;&gt;Formal Verification&lt;/h2&gt;
&lt;p&gt;The promise of formal verification is that we can construct provably composable
maps which still match the ground-level territory. Formal verification
allows you to specify a system mathematically, and then exhaustively
&lt;em&gt;prove&lt;/em&gt; that the system is correct. As an analogy: unit tests are like running an
experiment. Each passing test is an assertion that, for the conditions checked,
the code is correct. There could still exist some &lt;em&gt;untested&lt;/em&gt; input that would
demonstrate incorrect behavior. A single negative test is enough to show the code
is incorrect, but only a provably exhaustive set of inputs would be sufficient
to show the code is fully correct. Formally verifying a program, by contrast, is
like writing a proof: a self-consistent proof is sufficient to show
that the properties you’ve proven always hold.&lt;/p&gt;
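&lt;p&gt;The experiment-versus-proof distinction can be made concrete. For a finite input domain, exhaustively checking a property amounts to a brute-force proof over that domain, whereas a unit test merely samples it. A toy illustration (real formal verification handles unbounded domains symbolically, which enumeration cannot):&lt;/p&gt;

```python
def clamp(x: int, lo: int, hi: int) -> int:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

# A unit test is an experiment: it checks a handful of points.
assert clamp(5, 0, 10) == 5
assert clamp(-3, 0, 10) == 0

# An exhaustive check over a bounded domain is a brute-force proof:
# for *every* input in the domain, clamp(x, lo, hi) lands in [lo, hi].
for x in range(-50, 51):
    for lo in range(-10, 11):
        for hi in range(lo, 11):
            assert clamp(x, lo, hi) in range(lo, hi + 1)
```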
&lt;p&gt;I saw Martin Kleppmann’s
&lt;a href=&#34;https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html&#34;&gt;“Prediction: AI will make formal verification go mainstream”&lt;/a&gt;
right after I posted
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/&#34;&gt;“The Decline of the Software Drafter?”&lt;/a&gt;,
which became the inspiration for this post. Kleppmann’s argument is that, just
as the cost of generating code is coming down, so too will the cost of formal
verification of code:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For example, as of 2009, the formally verified seL4 microkernel consisted of
8,700 lines of C code, but proving it correct required 20 person-years and
200,000 lines of Isabelle code – or 23 lines of proof and half a person-day
for every single line of implementation. Moreover, there are maybe a few
hundred people in the world (wild guess) who know how to write such proofs,
since it requires a lot of arcane knowledge about the proof system.&lt;/p&gt;
&lt;p&gt;&amp;hellip;&lt;/p&gt;
&lt;p&gt;If formal verification becomes vastly cheaper, then we can afford to verify
much more software. But on top of that, AI also creates a need to formally
verify more software: rather than having humans review AI-generated code, I’d
much rather have the AI prove to me that the code it has generated is correct.
If it can do that, I’ll take AI-generated code over handcrafted code (with all
its artisanal bugs) any day!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I’ve long been interested in formal verification tools like TLA+ and Rocq (née
Coq). I haven’t (yet) been able to justify to myself spending all that much time
on them. I think that’s changing: the cost of writing code is
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/&#34;&gt;coming down dramatically&lt;/a&gt;.
The cost of &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/10/What-I-Look-For-in-AI-Assisted-PRs/&#34;&gt;reviewing&lt;/a&gt;
and maintaining it is also coming down, but at a slower rate. I agree with
Kleppmann that we need systematic tooling for dealing with this mismatch.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://en.wiktionary.org/wiki/wishcasting&#34;&gt;Wishcasting&lt;/a&gt; a future world, I
would be excited to see something like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;One starts with a high-level system specification, in English.&lt;/li&gt;
&lt;li&gt;This specification is spun out into multiple TLA+ models at various levels
of component specificity.&lt;/li&gt;
&lt;li&gt;These models would allow us to determine the components that are
load-bearing for system correctness.&lt;/li&gt;
&lt;li&gt;The most critical set of load-bearing components are implemented with a
corresponding formal verification proof, in something like
&lt;a href=&#34;https://rocq-prover.org/&#34;&gt;Rocq&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The rest of the system components are still audited by an LLM to ensure they
correctly match the behavior of their associated component in the TLA+ spec.&lt;/li&gt;
&lt;/ul&gt;
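&lt;p&gt;What a TLA+ model checker like TLC does mechanically is explore every reachable state of a specification and check invariants against each one. A toy explicit-state model checker conveys the idea (an illustration of the concept only; real TLC handles vastly richer specifications):&lt;/p&gt;

```python
from collections import deque

def check_invariant(initial, next_states, invariant):
    """Toy explicit-state model checker: breadth-first search over all
    reachable states, returning a violating state if one exists."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state  # counterexample found
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # invariant holds in every reachable state

# Toy "spec": a counter that increments modulo 5 must stay in 0..4.
violation = check_invariant(
    initial=0,
    next_states=lambda s: [(s + 1) % 5],
    invariant=lambda s: s in range(5),
)
assert violation is None
```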
&lt;p&gt;The biggest concern to me related to formal verification is the following two
excerpts, first from Kleppmann, and then from Hillel Wayne, a notable proponent
of TLA+:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There are maybe a few hundred people in the world (wild guess) who know how to
write such proofs, since it requires a lot of arcane knowledge about the proof
system. &amp;ndash;
&lt;a href=&#34;https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html&#34;&gt;Martin Kleppmann&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;TLA+ is one of the more popular formal specification languages and you can
probably fit every TLA+ expert in the world in a large schoolbus. &amp;ndash;
&lt;a href=&#34;https://hillelwayne.com/post/why-dont-people-use-formal-methods/&#34;&gt;Hillel Wayne&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For formal verification to be useful in practice, at least some of the arcane
knowledge of its internals will need to be broadly disseminated. Reviewing an
AI-generated formal spec of a problem won’t be useful if you don’t have enough
knowledge of the proof system to poke holes in what the AI came up with.&lt;/p&gt;
&lt;p&gt;I’d argue that undergraduate Computer Science programs should allocate some of
their curriculum to formal verification. After all, students should have more
time on their hands as they delegate implementation of their homework to AI
agents.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;That is, under current optimization pressures. We’re only a few months past
the &lt;a href=&#34;https://www.anthropic.com/news/claude-3-7-sonnet&#34;&gt;Sonnet 3.7&lt;/a&gt; and
&lt;a href=&#34;https://openai.com/index/introducing-o3-and-o4-mini/&#34;&gt;OpenAI o3&lt;/a&gt; models,
both of which had a penchant for reward hacking on tests.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Zip Files as (Simple) Key-Value Stores</title>
        <link>https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/</link>
        <pubDate>Thu, 11 Dec 2025 20:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/</guid>
        <description>&lt;p&gt;I recently encountered a fun performance problem. Consider the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You need to distribute a key-value dataset with string keys and opaque value
blobs in the 100B - 1MB range.&lt;/li&gt;
&lt;li&gt;There are on the order of 10k keys to distribute.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Critically, you are in a constrained memory environment where you do not
have enough memory to load all the blobs into memory at one time.&lt;/strong&gt; You do,
however, have enough memory to load all the keys and a small amount of
metadata, if you want.&lt;/li&gt;
&lt;li&gt;The key access pattern on the data is random.&lt;/li&gt;
&lt;li&gt;The workload is entirely read-only.&lt;/li&gt;
&lt;li&gt;You get to choose the file format and reader, but you must be able to
disseminate this whole structure as one file.&lt;/li&gt;
&lt;li&gt;Due to environment restrictions, you can’t use &lt;code&gt;mmap&lt;/code&gt; to load files into
memory.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I’ve heard the good news about SQLite from
&lt;a href=&#34;https://simonwillison.net/tags/SQLite/&#34;&gt;Simon Willison&lt;/a&gt;,
&lt;a href=&#34;https://litestream.io/blog/why-i-built-litestream/&#34;&gt;Ben Johnson&lt;/a&gt;, and others,
and thought this might be an interesting use case. I built out both the reader
and writer components. TLDR, it worked!&lt;/p&gt;
&lt;p&gt;The latency requirements here were rather sensitive as well, so I tried to
squeeze all I could out of SQLite in terms of keeping both the memory footprint
and read latency low. I diligently read through the SQLite
&lt;a href=&#34;https://SQLite.org/pragma.html&#34;&gt;pragmas page&lt;/a&gt;, which inspired my
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/&#34;&gt;WITHOUT ROWID post&lt;/a&gt;.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; In the
end, the SQLite-based solution worked out quite nicely.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; However, I had the
nagging suspicion that I was overlooking an obvious solution that was simpler
than SQLite.&lt;/p&gt;
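For a sense of the shape of the SQLite approach, here's a minimal read-only sketch in Python. The `kv` table name and the specific pragma value are illustrative choices for this sketch, not the exact production setup:

```python
import sqlite3

def open_kv(path):
    # Read-only connection; immutable=1 skips locking, which is safe
    # because the workload is entirely read-only.
    conn = sqlite3.connect(f"file:{path}?mode=ro&immutable=1", uri=True)
    # Keep SQLite's page cache small to respect the memory constraint.
    # A negative cache_size is interpreted as KiB (so this is ~256 KiB).
    conn.execute("PRAGMA cache_size = -256")
    return conn

def get(conn, key):
    # Assumes a table like: CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None
```

Each lookup touches only the B-tree pages needed to find the row, so memory stays bounded by the page cache rather than the dataset size.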
&lt;p&gt;Yes, there &lt;em&gt;are&lt;/em&gt; also other key-value stores like RocksDB &amp;ndash; but that also felt
like something of a heavy dependency for a read-only workload. Yes, I &lt;em&gt;could&lt;/em&gt;
use something like a line-delimited file, scan through the file for the keys,
store the byte offsets, and seek to the offset at read time. Yes, I &lt;em&gt;could&lt;/em&gt; make
my own small file format that stores the key-value byte offsets in a header
section to improve upon the line-delimited version.&lt;/p&gt;
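A minimal sketch of what that DIY header/offset format could look like. The exact layout here (an 8-byte length prefix followed by a JSON key-to-offset index, then the raw blobs) is purely illustrative, not the format I actually benchmarked:

```python
import json
import struct

def write_dat(path, items):
    # Layout: [8-byte little-endian header length][JSON {key: [offset, size]}][blobs]
    index, blobs, offset = {}, [], 0
    for key, value in items.items():
        index[key] = (offset, len(value))
        blobs.append(value)
        offset += len(value)
    header = json.dumps(index).encode()
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header)))
        f.write(header)
        for blob in blobs:
            f.write(blob)

def read_dat(path, key):
    with open(path, "rb") as f:
        (hlen,) = struct.unpack("<Q", f.read(8))
        index = json.loads(f.read(hlen))
        offset, size = index[key]
        # Seek past the header to the blob; only `size` bytes are read.
        f.seek(8 + hlen + offset)
        return f.read(size)
```

A real reader would parse the header once at startup and keep the index in memory, which fits the "all keys plus a little metadata" budget from the constraints above.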

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/options.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/options.png&#34; alt=&#34;Possible solutions to the problem&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Possible solutions to the problem&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;And yet, surely the generic version of this problem &amp;ndash; “I just want a small,
low-dependency way to index into blobs with string keys” &amp;ndash; must have a simple,
elegant solution.&lt;/p&gt;
&lt;p&gt;It does: this is just a
&lt;a href=&#34;https://en.wikipedia.org/wiki/ZIP_(file_format)&#34;&gt;zip file&lt;/a&gt;. lol.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Because the files in a ZIP archive are compressed individually, it is possible
to extract them, or add new ones, without applying compression or
decompression to the entire archive.&lt;/p&gt;
&lt;p&gt;A directory is placed at the end of a ZIP file. This identifies what files are
in the ZIP and identifies where in the ZIP that file is located. This allows
ZIP readers to load the list of files without reading the entire ZIP archive.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/ZIP_(file_format)&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/zip_layout.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/zip_layout_huf6cd50f7d0aae52a829c7d90554d081d_169375_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/zip_layout_huf6cd50f7d0aae52a829c7d90554d081d_169375_0x600_resize_lanczos_3.png&#34; alt=&#34;The structure of a ZIP file&#34; width=&#34;551&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;The structure of a ZIP file (&lt;a href=&#34;https://en.wikipedia.org/wiki/ZIP_%28file_format%29&#34;&gt;Source&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Most folks think about &lt;code&gt;.zip&lt;/code&gt; files purely in the context of compression, but
zip files don’t actually require the files contained inside to be compressed.
The structure of the zip file is basically exactly what we want. It has a
directory section that is small, contains the list of all the contained files
(keys, in our case), and lets us pull out the data for a key with a random
access pattern. This is in contrast to the other common archive file format,
&lt;code&gt;.tar&lt;/code&gt;, which does not have this nice property.&lt;/p&gt;
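To make this concrete, here's a minimal zip-backed reader using Python's standard-library `zipfile` (the class and method names are just for this sketch):

```python
import zipfile

class ZipKV:
    """Read-only key-value store backed by a zip archive (one entry per key)."""

    def __init__(self, path):
        # Opening the archive reads only the central directory at the end of
        # the file: the full key list plus per-entry offsets, not the blobs.
        self.zf = zipfile.ZipFile(path, "r")

    def keys(self):
        return self.zf.namelist()

    def get(self, key):
        # Seeks directly to this entry's offset; only this blob is read.
        with self.zf.open(key) as f:
            return f.read()
```

The writer side is equally small: `zipfile.ZipFile(path, "w", zipfile.ZIP_STORED)` plus `writestr(key, blob)`, where `ZIP_STORED` skips compression entirely.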
&lt;p&gt;Yes, disks are &lt;em&gt;slow&lt;/em&gt;, but if we don’t have any memory to play with&amp;hellip; what can
we do other than throwing this at the disk? And if we’re going to use the disk
anyways, why not pick a format that’s already exactly what we need? Zip has none
of the overhead of an execution engine or a B-tree on-disk page layout, and,
just like SQLite, a competent zip client exists for any serious programming
language under the sun.&lt;/p&gt;
&lt;p&gt;I put together a small benchmark, matching the constraints above, to see if
this would work as expected. It tests four readers: SQLite (both with and
without rowids), a zip file, and a custom &lt;code&gt;.dat&lt;/code&gt; file format that
uses a header/offset-pointer approach.&lt;/p&gt;
&lt;p&gt;The benchmark produced some interesting results. I’d only treat these charts as
directionally correct, but the orders of magnitude and the relative positioning
of the approaches should hold. Note that the y-axis is log-scaled on some charts
and linear on others:&lt;/p&gt;
&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/p90_latency_by_blob_size.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/p90_latency_by_blob_size_hu1578768a5f0f1332d2fd25f579598cdf_42035_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/p90_latency_by_blob_size_hu1578768a5f0f1332d2fd25f579598cdf_42035_0x600_resize_lanczos_3.png&#34; alt=&#34;P90 latency by blob size&#34; width=&#34;530&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;P90 latency by blob size&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/10kb_latency_percentiles.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/10kb_latency_percentiles_huda90d7f31e563a6a247a0d0b05ad9f83_43843_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/10kb_latency_percentiles_huda90d7f31e563a6a247a0d0b05ad9f83_43843_0x600_resize_lanczos_3.png&#34; alt=&#34;10KB latency percentiles&#34; width=&#34;526&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;10KB latency percentiles&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/1mb_latency_percentiles.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/1mb_latency_percentiles_hu3f50d91db0700df7c4e71a67797fbd5c_47466_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/11/Zip-Files-as-Simple-Key-Value-Stores/1mb_latency_percentiles_hu3f50d91db0700df7c4e71a67797fbd5c_47466_0x600_resize_lanczos_3.png&#34; alt=&#34;1MB latency percentiles&#34; width=&#34;524&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;1MB latency percentiles&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;ul&gt;
&lt;li&gt;Between 100B and 100KB, the Zip file performs noticeably better than
SQLite. Even with 1MB blobs, where disk I/O dominates, they perform
roughly the same.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;WITHOUT ROWID&lt;/code&gt; on the SQLite table made performance significantly worse.
This is slightly surprising with the small blobs, but entirely unsurprising
past 1KB blob size.&lt;/li&gt;
&lt;li&gt;My janky custom file format wasn’t noticeably better in any size range than
an off-the-shelf solution, so (expectedly) it makes sense to not DIY an
on-disk index like this, at least for my simple use case.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Even though I ended up using SQLite, the humble Zip file is quite competitive as
an on-disk index. At least, in this very simple constrained use case.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Sadly, just a fun diversion as &lt;code&gt;WITHOUT ROWID&lt;/code&gt; doesn’t help for large rows.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;SQLite added some additional benefits, such as the ability to add row-level
metadata about each key, which in practice was used to optimize some of the
query patterns.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>What I Look For in AI-Assisted PRs</title>
        <link>https://benjamincongdon.me/blog/2025/12/10/What-I-Look-For-in-AI-Assisted-PRs/</link>
        <pubDate>Wed, 10 Dec 2025 21:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/10/What-I-Look-For-in-AI-Assisted-PRs/</guid>
        <description>&lt;p&gt;I review a lot of PRs these days. As the job of a PR author
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/&#34;&gt;becomes easier&lt;/a&gt; with AI,
the job of a PR reviewer gets harder.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;AI can &amp;ldquo;assist&amp;rdquo; with code review, but I’m less optimistic about AI code review
than AI code generation. Sure, Claude/Codex can be quite helpful as a first
pass, but code review still requires a large amount of &lt;em&gt;human&lt;/em&gt; taste.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;I care about the high level abstractions my team uses in our codebase, and about
how the pieces fit together. I care that our codebase can be intuitively
understood by new team members. I care that code is tamper-resistant &amp;ndash; that we
build things robustly such that imperfect execution in the future doesn’t cause
something to blow up. Systems should be decomposable. You should be able to fit
all the components of the system in your head in a reasonably faithful mental
model, but you shouldn’t &lt;em&gt;need&lt;/em&gt; to fit all the implementation details of each
component in your head to not cause something to break.&lt;/p&gt;
&lt;p&gt;Anyways.&lt;/p&gt;
&lt;p&gt;I’ve been trying to speed up my review latency for PRs, and have given some
thought to the heuristics I use to evaluate PRs. Heuristics are lossy, of
course, but they’re necessary. If you haven’t given this much thought recently,
it’s useful to consciously recalibrate the heuristics you use when reviewing
code, now that so much code is generated by LLMs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;General Reviewability&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Did the author provide a detailed &amp;amp; accurate PR description?&lt;/li&gt;
&lt;li&gt;What level of sensitivity is this code? Is this performance or safety
critical code that needs to be reviewed with a fine-tooth comb,
line-by-line, or is it something peripheral like an internal UI or CLI that
can be &amp;ldquo;good enough&amp;rdquo;?&lt;/li&gt;
&lt;li&gt;Does the change appear &lt;em&gt;reversible&lt;/em&gt;? Is the Git diff of the change human
readable? I find LLMs are often really eager to make big changes that
clobber the Git diff. Incremental change is usually preferable.&lt;/li&gt;
&lt;li&gt;Is the PR of an actually reviewable size? My personal bar is: &amp;lt;500 lines is
ideal, &amp;gt;1000 lines is borderline unreviewable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Design &amp;amp; Abstractions&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If this is greenfield code, does the author seem to be setting up suitable
abstractions? Do these abstractions seem like they’ll compose in sane ways?
Do the abstractions have reasonable boundaries that do not leak information?&lt;/li&gt;
&lt;li&gt;Can you zoom out your mind, picture the PR in your head, and &amp;ldquo;make it make
sense&amp;rdquo; with your mental model of the code? Does it make &lt;em&gt;sense&lt;/em&gt; at a
conceptual level what is being proposed, or does this just have the veneer
of &amp;ldquo;good code&amp;rdquo;?&lt;/li&gt;
&lt;li&gt;Would the code be substantially improved by loading the PR into Claude Code
and making a targeted one-sentence prompt? For example: &amp;ldquo;hey, could you
deduplicate some of this logic between classes X and Y and make the Foo
trait more modular&amp;rdquo;. (Fortunately, you can just put this as a review comment
&amp;ndash; the author will probably rewrite with an LLM anyways)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Vibe Code Smells&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What amount of effort does it seem like the author has put into their PR?
N.b. I mean &lt;em&gt;the author&lt;/em&gt; not the AI that wrote the code on behalf of the
author. Human effort and curation still leaves signs behind.&lt;/li&gt;
&lt;li&gt;Did the author leave vibe-coded comments in the PR? (Often this looks like
iterative process comments, of the flavor
&lt;code&gt;// Now we’ll not use type X anymore, per your feedback&lt;/code&gt;.)&lt;/li&gt;
&lt;li&gt;Are imports (especially for Python and sometimes Rust) splattered around the
code, instead of being present at the top?&lt;/li&gt;
&lt;li&gt;Is there a weird amount of defensive copying/cloning due to a
misunderstanding of e.g. immutability-by-default in Scala or how to use
ownership/lifetimes in Rust?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Testing&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;For unit tests, do they cover common edge cases? Do the unit tests actually
make meaningful assertions that exercise the code in meaningful ways, or are
they sloppy assertions that reduce to &lt;code&gt;assert!(true)&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;For unit tests, do they have a weird number of extraneous edge cases that
are unlikely to ever happen in practice? (Also a vibe code smell)&lt;/li&gt;
&lt;li&gt;For tests, do the tests mock out dependencies to the point where the entire
test is useless/invalid?&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Error Handling&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Does the code have a weird level of paranoia about exceptions being thrown?&lt;/li&gt;
&lt;li&gt;Does the code silently swallow errors with try/catch?&lt;/li&gt;
&lt;li&gt;Does the code allow for exceptions/panics in areas where the code
&lt;em&gt;absolutely should not&lt;/em&gt; panic?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;None of these are intended to be knocks against individual PR authors. It’s
useful to assume positive intent when reviewing code. SWEs are under various
pressures, visible and invisible. The &amp;ldquo;system&amp;rdquo; we have today, broadly defined,
results in much more code being produced by non-humans. The best source of truth
for human coding taste is still, for now, humans. Therefore, humans still need
to review a lot of non-human code, as we collectively chip away at the pieces of
code taste that &lt;em&gt;can&lt;/em&gt; be incorporated back into model intuition.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;As of late 2025, code generation is easier than code verification. Martin
Kleppmann is onto something with his
&lt;a href=&#34;https://martin.kleppmann.com/2025/12/08/ai-formal-verification.html&#34;&gt;prediction&lt;/a&gt;
that AI will make formal verification go mainstream. Our current set of
review tooling isn’t sufficient for the tsunamis of code that will be
generated without human oversight over the coming years.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Code review agents have gotten a lot better in the past year, but not at the
same pace as code generation agents. As the model harnesses have gotten more
agentic, code review agents have gotten significantly better at pulling in
the context they need to review a PR. I&amp;rsquo;ve found that review agents are
competent at picking out obvious catastrophic issues and medium-risk
tactical coding mistakes. They&amp;rsquo;re not great with issues where the entire
gestalt of a PR needs significant reworking, or where PRs have a lot of
small things that need to be called out for improvement. There&amp;rsquo;s assuredly
still low-hanging fruit for gains here.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;I heard of an example of this recently, albeit not LLM related. The
situation was described as the author &amp;ldquo;mock[ing] for the world they
&lt;em&gt;wanted&lt;/em&gt;, not the world &lt;em&gt;as it is&lt;/em&gt;&amp;rdquo;. I still chuckle at this. :)&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>SWIM: Outsourced Heartbeats</title>
        <link>https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/</link>
        <pubDate>Tue, 09 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/</guid>
        <description>&lt;p&gt;How does a distributed system reliably determine when one of its members has
failed? This is a tricky problem: you need to deal with unreliable networks and
nodes that can crash at arbitrary times, and you need to do so in a way
that can scale to thousands of nodes. This is the role of a &lt;strong&gt;failure detection&lt;/strong&gt;
system, one of the most foundational parts of many distributed systems.&lt;/p&gt;
&lt;p&gt;There are many rather simple ways to solve this problem, but one of the most
elegant solutions to distributed failure detection is an algorithm that I first
encountered in undergraduate Computer Science: the
&lt;a href=&#34;https://en.wikipedia.org/wiki/SWIM_Protocol&#34;&gt;SWIM Protocol&lt;/a&gt;.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; SWIM here
stands for “Scalable Weakly Consistent Infection-style Process Group
Membership”.&lt;/p&gt;
&lt;h2 id=&#34;what-is-a-failure-detector&#34;&gt;What is a Failure Detector&lt;/h2&gt;
&lt;p&gt;SWIM solves the problem of distributed failure detection. Let’s sketch out this
problem in a bit more detail. Suppose you have a dynamic pool of servers (or
“processes” as they’re often called in distributed algorithms). Processes will
enter the pool as they’re turned on, and leave the pool as they are turned off,
crash, or when they become inaccessible from the network. The problem statement
is that we want to be able to detect when any one of these servers goes offline.&lt;/p&gt;
&lt;p&gt;The uses for such failure detection systems abound: detecting
node failures in workload schedulers like Kubernetes, maintaining membership
lists for peer-to-peer protocols, and determining if a replica has failed in a
distributed database like Cassandra.&lt;/p&gt;
&lt;p&gt;As a toy example, let’s pick something concrete: let’s say we’re building a
distributed key-value store. We’re just getting started with our KV-store; for
today we just need to set up the membership system, wherein distributed replicas of
the data can discover each other. Each replica needs to know of the existence of
all the other replicas, so they can run more complicated distributed systems
protocols &amp;ndash; things like consensus, leader election, distributed locking, etc.
All those other algorithms are fun, but we first need the handshake of “who am I
talking to?” and “I need to know when one of my peers fails”. That’s failure
detection.&lt;/p&gt;
&lt;p&gt;What properties do we want for our failure detector?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If a server fails, we should know about it within some bounded time. This is
&lt;strong&gt;Detection Time&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Faster detection time is preferable.&lt;/li&gt;
&lt;li&gt;If a process doesn’t die, we shouldn’t mark it as dead. If we relax this to
allow for “false positives”, we ideally want a quite low &lt;strong&gt;False Positive
Rate&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The amount of load on each node should scale sublinearly with the number of
nodes in the pool. That is, ideally the amount of work per node is the same
whether we have 100 nodes or 1,000,000 nodes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What properties do we assume about nodes and the network?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The network can drop packets or become partitioned.&lt;/li&gt;
&lt;li&gt;Nodes can crash at arbitrary times during program execution. We cannot rely on
nodes doing anything (like sending an “I’m crashing” message) just prior to
crashing, for example.&lt;/li&gt;
&lt;li&gt;All nodes can send network packets to all other nodes.&lt;/li&gt;
&lt;li&gt;Packets on the network have propagation time.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;reasoning-up-to-failure-detectors&#34;&gt;Reasoning up to Failure Detectors&lt;/h2&gt;
&lt;p&gt;What is the most naive functional failure detector that we could build? Well,
for some node $N_1$ in a node pool of ${ N_1, N_2, &amp;hellip;, N_{10} }$, $N_1$ could
ping each of nodes ${ N_2, &amp;hellip;, N_{10} }$. Each of those pinged nodes could
send an acknowledgment as soon as the ping was received. When $N_1$
receives an acknowledgment from, say, $N_4$, then $N_1$ knows that $N_4$ is still “alive”.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/pingack.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/pingack_hu6fc2900d60d9ca071feedecb11d7818d_37960_0x700_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/pingack_hu6fc2900d60d9ca071feedecb11d7818d_37960_0x700_resize_lanczos_3.png&#34; alt=&#34;Ping and acknowledgment&#34; width=&#34;457&#34; height=&#34;350&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Ping and acknowledgment&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Looking at this figure, if $N_1$ waits for some chunk of time, maybe retries a
few times, but still never hears back from, say, $N_4$, then it can mark $N_4$
as “dead”.&lt;/p&gt;
&lt;p&gt;Every time we hear a ping back from another node, we know it’s alive “now”. But
we need to know the state of all members over time, so we run this procedure in
sequence many times. We can call the time between each ping we send out as
$T_{ping}$. And we can call the time we wait to hear back as $T_{fail}$. Both of
these become tunable parameters in our system. If we increase $T_{fail}$, we
decrease the chance that we accidentally mark a neighbor as “dead” due to e.g. a
transient network blip. But if we increase it too long, then we would also allow
an &lt;em&gt;actually&lt;/em&gt; dead node to still sit in our membership list for a long time &amp;ndash;
which we don’t want either.&lt;/p&gt;

&lt;figure&gt;
    
    

    
    

    
    
        
            
        
        
        
        
            
            
            
            
            
                
                
                
            
            
            
            
            
                
            
            
            
            
                
            
        
        
        
        
        
        
        
            
                
                
            
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/tping.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/tping_hu083e425bdbf6fa9eb4f9cd5a39d85115_29222_1000x0_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/tping_hu083e425bdbf6fa9eb4f9cd5a39d85115_29222_1000x0_resize_lanczos_3.png&#34; alt=&#34;Time between pings&#34; width=&#34;500&#34; height=&#34;545&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Time between pings&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Generalizing this, each of ${ N_1, N_2, &amp;hellip;, N_{10} }$ independently runs
this ping approach. Finally, to bootstrap the process, we can hardcode all the
member node IPs at startup. If a node crashes but later comes back online, it
can broadcast a ping to each of its peers, who then mark it as “alive” and start
pinging it regularly again.&lt;/p&gt;
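&lt;p&gt;As a sketch of the above (not production code &amp;ndash; a real implementation deals with sockets, timeouts, and concurrency), here’s a minimal single-process model of one member in this all-to-all scheme. The &lt;code&gt;Node&lt;/code&gt; class and its method names are made up for illustration; a successful “ping” is simulated by checking a peer’s &lt;code&gt;alive&lt;/code&gt; flag directly:&lt;/p&gt;

```python
class Node:
    """One member of an all-to-all heartbeating pool (illustrative sketch)."""

    def __init__(self, name, t_fail):
        self.name = name
        self.t_fail = t_fail   # seconds to wait before declaring a peer "dead"
        self.alive = True
        self.last_heard = {}   # peer name -> timestamp of last successful ping

    def bootstrap(self, peers, now):
        # Hardcode the full membership list at startup.
        self.last_heard = {p.name: now for p in peers if p is not self}

    def ping_round(self, peers, now):
        # Every T_ping seconds, ping *every* peer on the membership list.
        for peer in peers:
            if peer is not self and peer.alive:  # stand-in for a real ping/ack
                self.last_heard[peer.name] = now

    def dead_members(self, now):
        # Anyone we haven't heard from in over T_fail seconds is "dead".
        return sorted(n for n, t in self.last_heard.items()
                      if now - t > self.t_fail)
```

&lt;p&gt;With $T_{fail} = 2$, a node that goes silent at $t = 0$ shows up in a peer’s &lt;code&gt;dead_members&lt;/code&gt; list by $t = 3$.&lt;/p&gt;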

&lt;figure&gt;
    
    

    
    

    
    
        
            
        
        
        
        
            
            
            
            
            
                
                
                
            
            
            
            
            
                
            
            
            
            
                
            
        
        
        
        
        
        
        
            
                
                
            
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/alltoall.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/alltoall_hu98710cf58662be81ae19cf7e064793a8_64584_1000x0_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/alltoall_hu98710cf58662be81ae19cf7e064793a8_64584_1000x0_resize_lanczos_3.png&#34; alt=&#34;All-to-all heartbeating&#34; width=&#34;500&#34; height=&#34;406&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;All-to-all heartbeating&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Woohoo, we’ve just invented a basic form of
&lt;a href=&#34;https://martinfowler.com/articles/patterns-of-distributed-systems/heartbeat.html&#34;&gt;heartbeating&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;What are the characteristics of our basic system?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Number of messages: $k$ per $T_{ping}$ seconds, for each node, where $k$ is
the number of nodes in our pool&lt;/li&gt;
&lt;li&gt;Time to first detection: $T_{fail}$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What about accuracy? Well&amp;hellip; since we allow for an unreliable network, it’s
sadly provably impossible to have both completeness (all failures detected) and
accuracy (no false positives). See
&lt;a href=&#34;https://www.cs.utexas.edu/~lorenzo/corsi/cs380d/papers/p225-chandra.pdf&#34;&gt;this paper&lt;/a&gt;
if you’re curious. In a perfect network with no packet loss, however, we would
have strong accuracy.&lt;/p&gt;
&lt;h2 id=&#34;time-to-swim&#34;&gt;Time to SWIM&lt;/h2&gt;
&lt;p&gt;Surely we can do better than “the most naive thing we could think of”. This is
where SWIM comes in.&lt;/p&gt;
&lt;p&gt;SWIM combines two insights: first, that all-to-all heartbeating like our first
example results in a LOT of overlapping communication. If nodes were able to
“share” the information they gathered more effectively, we could cut down on the
number of messages sent. Second, network partitions often only affect parts of a
network. Just because $N_1$ can talk to $N_2$ but &lt;em&gt;can’t&lt;/em&gt; talk to $N_3$, this
doesn’t necessarily mean that $N_2$ can’t talk to $N_3$.&lt;/p&gt;
&lt;p&gt;The tagline for SWIM is “outsourced heartbeats”, and it works like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Similar to all-to-all heartbeating, each node maintains its own membership
list of all the other nodes it’s aware of. In addition to this list, each node
also keeps track of a “last heard from” timestamp for each of the known
members.&lt;/li&gt;
&lt;li&gt;Every $T_{ping}$ seconds, each node ($N_1$) sends a ping to one other randomly
selected node in its membership list ($N_{other}$). If $N_1$ receives a
response from $N_{other}$, then $N_1$ updates its “last heard from” timestamp
of $N_{other}$ to be the current time.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Here’s the outsourced heartbeating piece:&lt;/em&gt; If $N_1$ does &lt;em&gt;not&lt;/em&gt; hear from
$N_{other}$, then $N_1$ contacts $j$ other randomly selected nodes on its
membership list and requests that &lt;em&gt;they&lt;/em&gt; ping $N_{other}$. If any of those
other nodes are able to successfully contact $N_{other}$, then they inform
$N_1$ and both parties update their “last heard from” time for $N_{other}$.&lt;/li&gt;
&lt;/ul&gt;
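&lt;p&gt;The probe logic above can be sketched in a few lines. This is a simplification (the real protocol is asynchronous and message-based), and &lt;code&gt;direct_ping&lt;/code&gt; is a hypothetical stand-in callable that reports whether one node can currently reach another:&lt;/p&gt;

```python
import random

def swim_probe(n1, other, membership, direct_ping, j=3, rng=random):
    """One SWIM probe round from n1's point of view (illustrative sketch)."""
    if direct_ping(n1, other):
        return True  # heard back directly; update "last heard from"
    # Outsourced heartbeats: ask j random peers to ping `other` on our behalf.
    candidates = [m for m in membership if m not in (n1, other)]
    for helper in rng.sample(candidates, min(j, len(candidates))):
        if direct_ping(n1, helper) and direct_ping(helper, other):
            return True  # a helper reached `other` and reported back to n1
    return False  # suspect `other`; after T_fail, mark it as failed
```

&lt;p&gt;Note how this handles a partial partition: if only the link from $N_1$ to $N_{other}$ is down, some helper can still reach $N_{other}$, and $N_1$ avoids a false positive.&lt;/p&gt;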

&lt;figure&gt;
    
    

    
    

    
    
        
        
            
        
        
        
            
            
            
            
            
            
                
                
                
            
            
            
            
                
            
            
            
            
                
            
        
        
        
        
        
        
        
            
                
                
            
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/outsourced_ping.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/outsourced_ping_hu8bab3e775a1e638b68267f1e6bd8093a_28940_0x800_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/outsourced_ping_hu8bab3e775a1e638b68267f1e6bd8093a_28940_0x800_resize_lanczos_3.png&#34; alt=&#34;Outsourced heartbeating&#34; width=&#34;456&#34; height=&#34;400&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Outsourced heartbeating&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;ul&gt;
&lt;li&gt;If after $T_{fail}$ seconds, $N_1$ still hasn’t heard from $N_{other}$, then
$N_1$ marks $N_{other}$ as failed.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;At this point, $N_1$ has determined&lt;/em&gt; that $N_{other}$ has failed. To make the
whole process happen more rapidly, $N_1$ spreads this information to the rest
of the network, usually by piggybacking news of $N_{other}$’s failure on
its ping messages to neighbors, which then gradually propagates through the
entire network &lt;a href=&#34;https://en.wikipedia.org/wiki/Gossip_protocol&#34;&gt;gossip style&lt;/a&gt;.&lt;/p&gt;
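&lt;p&gt;This dissemination step is cheap because the information spreads multiplicatively: every node that has heard the news passes it along on its next ping. A toy simulation (a hypothetical model, not code from the paper) shows a failure notice reaching an entire pool in a number of rounds that grows only logarithmically with pool size:&lt;/p&gt;

```python
import random

def rounds_to_spread(n, rng):
    """Toy model: each informed node piggybacks the failure notice on one
    random ping per round. Returns rounds until all n nodes have heard."""
    informed = {0}  # node 0 detected the failure
    rounds = 0
    while len(informed) != n:
        rounds += 1
        for _ in range(len(informed)):
            informed.add(rng.randrange(n))  # one random ping target each
    return rounds
```

&lt;p&gt;For a pool of 100 nodes this typically converges in around ten rounds.&lt;/p&gt;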

&lt;figure&gt;
    
    

    
    

    
    
        
        
            
        
        
        
            
            
            
            
            
            
                
                
                
                    
                
            
            
            
            
                
            
            
            
            
                
            
        
        
        
        
        
        
        
            
                
                
            
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/swim.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/swim_hua263ca9d05380d00f7969456dac55cef_31906_0x700_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/09/SWIM-Outsourced-Heartbeats/swim_hua263ca9d05380d00f7969456dac55cef_31906_0x700_resize_lanczos_3.png&#34; alt=&#34;SWIM protocol&#34; width=&#34;455&#34; height=&#34;500&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;SWIM protocol&lt;a href=&#34;https://en.wikipedia.org/wiki/SWIM_Protocol&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;What are the characteristics of this more sophisticated failure detector?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Number of messages per node per interval: $1$ ping in the common case. At most
$1 + j$ when outsourcing is triggered (where $j$ is the number of “outsourced”
ping requests).&lt;/li&gt;
&lt;li&gt;Time to first detection: It takes some fancy math to get to this point, but in
expectation this is $\frac{e}{e-1} \cdot T_{ping}$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What’s notable about these two properties: First, the number of messages
we send no longer scales with the size of the pool. We could have 1M or 10M or
100M nodes and still send the same number of messages. Second, the expected time
to first detection &lt;em&gt;also&lt;/em&gt; is still independent of the number of nodes. It’s a
constant, tunable by the $T_{ping}$ interval.&lt;/p&gt;
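&lt;p&gt;To put a number on that constant (a quick back-of-the-envelope check):&lt;/p&gt;

```python
import math

# SWIM's expected time-to-first-detection is (e / (e - 1)) * T_ping,
# independent of pool size.
factor = math.e / (math.e - 1)
print(round(factor, 3))  # 1.582

# So with T_ping = 1s, a failure is detected in about 1.58s in expectation,
# whether the pool has 10 nodes or 10 million.
```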
&lt;h2 id=&#34;why-did-swim-stick-with-me&#34;&gt;Why Did SWIM Stick With Me&lt;/h2&gt;
&lt;p&gt;So that’s the trick. It’s quite simple! Just outsource some of our detection to
our neighbors. What made SWIM stick with me is that it is &lt;em&gt;clever&lt;/em&gt;. It seems to
legitimately require some inspiration to get to this solution. We could try to
build up other approaches on top of all-to-all heartbeating, but most of the
obvious improvements aren&amp;rsquo;t competitive with SWIM. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Subset heartbeating, where you pick a subset of your membership list and only
ping those. &lt;em&gt;This reduces the number of messages you need to send, but
increases the time to detection significantly.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Centralized heartbeating, where you elect one node as a leader and it’s the
only one that sends the pings and has authority over the membership list.
&lt;em&gt;This also reduces the number of total network messages, but puts undue load
on a single node.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Basic Gossip Propagation, which looks like “SWIM without the outsourced
heartbeats”. Health information is piggy-backed on ping packets, but you only
ever rely on your own direct pings. &lt;em&gt;This also has reduced network messages
and bounded per-node load, but takes $O(\log(N))$ ping intervals to propagate
through the whole network &amp;ndash; not constant like SWIM gives you.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of these come with tradeoffs that SWIM avoids. SWIM is simple, elegant, solves a
challenging problem, and felt to me like “algorithm design has something
interesting to say about distributed systems”. That’s ultimately why it’s stuck
with me since I learned of it.&lt;/p&gt;
&lt;h2 id=&#34;further-reading&#34;&gt;Further Reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;SWIM in practice:
&lt;ul&gt;
&lt;li&gt;Uber uses/used it in their &lt;a href=&#34;https://github.com/uber/ringpop-go&#34;&gt;ringpop&lt;/a&gt;
application-level sharder.&lt;/li&gt;
&lt;li&gt;Hashicorp&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;
&lt;a href=&#34;https://www.hashicorp.com/en/resources/everybody-talks-gossip-serf-memberlist-raft-swim-hashicorp-consul&#34;&gt;uses/used&lt;/a&gt;
it in &lt;a href=&#34;https://developer.hashicorp.com/consul&#34;&gt;Consul&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://courses.grainger.illinois.edu/cs425/fa2022/L6.FA22.pdf&#34;&gt;CS425 slides&lt;/a&gt;
on Failure Detectors and Membership lists&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I feel a small amount of pride in pointing to the Wikipedia article for
SWIM, as I was the one to create its sequence diagram, and also spent some
time generally improving the quality of that article.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;RIP.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>The Decline of the Software Drafter?</title>
        <link>https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/</link>
        <pubDate>Mon, 08 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/</guid>
        <description>&lt;h1 id=&#34;1&#34;&gt;1.&lt;/h1&gt;
&lt;p&gt;It’s hard not to think about the direction the software engineering field is
going in. I don’t think you, dear reader, need to be reminded of this, but just
to set up some timeline tentpoles:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In &lt;a href=&#34;https://benjamincongdon.me/blog/2024/07/21/How-I-Use-AI-Mid-2024/&#34;&gt;mid 2024&lt;/a&gt;, Github Copilot was
something of a pleasant convenience for coding. This was the “fancy
autocomplete” era. A “nice to have”, which you &lt;em&gt;could&lt;/em&gt; lean on, but where
you still felt fully in-control of the process of writing code.&lt;/li&gt;
&lt;li&gt;In &lt;a href=&#34;https://benjamincongdon.me/blog/2025/02/02/How-I-Use-AI-Early-2025/&#34;&gt;early February (2025)&lt;/a&gt;, I
wrote about how I was having more success with Cursor (still great) and
Copilot Edits (largely left in the dust). I distinctly remember writing
greenfield code with the Cursor Composer in late 2024, being quite
impressed, but still finding that the frontier models of the time would fall
over after a project got much past 1-2k lines of code.&lt;/li&gt;
&lt;li&gt;Weeks after I wrote that, the initial version of Claude Code was released.
Codex (in April) and Gemini CLI (in June) followed soon after. Claude Code
was a step change improvement. It “just worked”. You told Claude what to
code, and it largely could just do that.&lt;/li&gt;
&lt;li&gt;Claude Sonnet 4.5, released September 29, and Claude Opus 4.5, released
November 24, somehow both felt like breakthroughs in coding ability.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;
Codex CLI has improved through the year as well. I still find Claude Code to
be the “best” of the CLIs, but Codex is still quite competitive. Cursor
remains competitive as well.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In late 2024, I remember having a bit of the feeling of being “the weird one” at
work, who tried to offload as much of my coding on LLMs as possible. I developed
a bit of a reputation for this, and throughout 2025 I’d have people
spontaneously ask me in 1:1s if I’d found any new techniques or tips for getting
better performance out of AI coding tools.&lt;/p&gt;
&lt;p&gt;These tools are excellent. They’re also, descriptively, somewhat of an
inevitability. The tools exist, they are getting better month-over-month, there
are market forces pushing for their adoption, for the time being there is
immense comparative advantage in being able to wield them effectively; ignoring
them seems increasingly unwise.&lt;/p&gt;
&lt;p&gt;Throughout 2025, I felt more and more that I was able to offload much of the
implementation of my work to coding agents. I’m a Tech Lead, so my time is split
between meetings, more meetings, writing docs, and preciously guarded IC time.
Offloading more to coding agents increased my effective output, as I could fire
off an agent (or 2, or 3) before a meeting, check their work afterwards, and
guide them towards solutions over the course of my day.&lt;/p&gt;
&lt;p&gt;This wasn’t “merely” greenfield work, either. Much of this was changes to old,
complex, mission critical codebases. I reviewed the output with an appropriate
amount of rigor, and became increasingly impressed.&lt;/p&gt;
&lt;p&gt;Look, I don’t want to sound like I’m an Anthropic marketer, but if you look at
this “”chart”” from the February 24
&lt;a href=&#34;https://www.anthropic.com/news/claude-3-7-sonnet&#34;&gt;release&lt;/a&gt; of Claude Sonnet 3.7
and Claude Code, I think they landed their roadmap for this year.&lt;/p&gt;

&lt;figure&gt;
    
    

    
    

    
    
        
        
        
        
        
        
        
        
        
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/claude_graph.webp&#34;&gt;
            
            
            
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/claude_graph.webp&#34; alt=&#34;Claude Code release &amp;rsquo;timeline&amp;rsquo; chart&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Claude Code release &amp;rsquo;timeline&amp;rsquo; chart&lt;a href=&#34;https://www.anthropic.com/news/claude-3-7-sonnet&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Claude Code &amp;ndash; the others too, but in particular CC for me &amp;ndash; &lt;em&gt;feels&lt;/em&gt; and &lt;em&gt;acts&lt;/em&gt;
like a collaborator at this point. In the first half of the year, it felt like a
reasonably competent intern; in the latter half, it’s a jagged-but-staggeringly
competent junior engineer who you can sometimes coach into writing legitimately
inspired code with a simple “&lt;em&gt;but how would a senior engineer refactor this for
modularity and maintainability?&lt;/em&gt;”&lt;/p&gt;
&lt;p&gt;All that to say: with the release of Opus 4.5, I sense a shift. I find no
pleasure in saying this. &lt;strong&gt;Most of what we call coding is now “solved”.&lt;/strong&gt; I
don’t think, however, that we’re post “software engineer”. I don’t think we are
in any way
&lt;a href=&#34;https://every.to/thesis/two-ways-to-win-in-the-post-software-era&#34;&gt;“post-software”&lt;/a&gt;.
I expect the amount of software being written to dramatically increase as the
cost to produce it falls. I also believe that we will still need deeply
technical people to guide that production, and develop and maintain the
infrastructure to run it.&lt;/p&gt;
&lt;p&gt;And yet: We are nearly at coder-equivalency for economically useful coding. A
sufficiently experienced software engineer &lt;em&gt;can&lt;/em&gt; now write &amp;gt;90% of
production-ready code purely through prompting. You still largely need to know
&lt;em&gt;what&lt;/em&gt; to prompt, but even that becomes easier as models&amp;rsquo; intuitions for
software development improve.&lt;/p&gt;
&lt;p&gt;I was admittedly skeptical at Dario Amodei’s
&lt;a href=&#34;https://www.businessinsider.com/anthropic-ceo-ai-90-percent-code-3-to-6-months-2025-3&#34;&gt;March 2025 claim&lt;/a&gt;
that this would happen within the year; it happened a few months later than his
timeline, but in my opinion it did happen. The obvious corollary is that, for
now, that remaining 10% becomes extremely valuable.&lt;/p&gt;
&lt;p&gt;The industry, collectively, has had less than a year to respond to this.
Regardless of what various leadership-class people say, this is not yet priced
in. It’s not priced in to e.g. CS university program structure, engineering
ladders, company structure, team structure, and so on.&lt;/p&gt;
&lt;p&gt;To think through what may remain, I’ll attempt to defer to history.&lt;/p&gt;
&lt;h1 id=&#34;2-drafters-a-historical-analogy&#34;&gt;2. Drafters: A Historical Analogy&lt;/h1&gt;
&lt;p&gt;There’s an analogy that’s been bouncing around in my head as I’ve been
processing what appears to be this upcoming shift in software production. It
goes something like this: in the mid-1900s, drafters &amp;ndash; those who would
literally draft engineering diagrams for architects, civil engineers, electrical
engineers, mechanical engineers, and so on &amp;ndash; were in relatively high demand. As
were &lt;a href=&#34;https://en.wikipedia.org/wiki/Computer_(occupation)&#34;&gt;human computers&lt;/a&gt;.&lt;/p&gt;

&lt;figure&gt;
    
    

    
    

    
    
        
        
            
        
        
        
            
            
            
            
            
            
                
                
                
            
            
            
            
                
            
            
            
            
                
            
        
        
        
        
        
        
        
            
                
                
            
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/drafter.jpg&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/drafter_hua579e22dbb6db3a254524b9e40b4baa0_93917_0x700_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/drafter_hua579e22dbb6db3a254524b9e40b4baa0_93917_0x700_resize_q100_lanczos.jpg&#34; alt=&#34;&amp;lsquo;Man Sitting at Drafting Board&amp;rsquo;, circa 1936&#34; width=&#34;604&#34; height=&#34;350&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;&amp;lsquo;Man Sitting at Drafting Board&amp;rsquo;, circa 1936&lt;a href=&#34;https://commons.wikimedia.org/wiki/File:Man_sitting_at_drafting_board_-_NARA_-_283828.jpg&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Automation came for these fields. CAD, and AutoCAD in particular in the early
1980s, automated much of engineering drafting. Not all, but a lot of it.
Electronic Design Automation likewise took over much of the
manual work done to create PCBs. And, of course, &lt;em&gt;human&lt;/em&gt; computers, well&amp;hellip;
yeah.&lt;/p&gt;
&lt;p&gt;I tried getting actual labor statistics for this. For the time period I most
cared about (between 1940 and 2000), the data online was spotty for the amount
of effort I was willing to put into this.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;In any case, this seems like a compelling analog for one future of software
engineering. Drafters were a distinct role, separate from e.g. a mechanical
engineer or structural engineer. Software “engineering” has a bit of both; it
encompasses design, production (coding), and maintenance (DevOps, woo&amp;hellip;).&lt;/p&gt;
&lt;p&gt;CAD and EDA didn’t destroy drafting and related technical-but-not-engineering
roles. CAD technician jobs still exist today, but it’s a more commoditized skill
than “pure” engineering. If this analogy holds, then the parts of the software
engineering world that looked more like “software drafters” than
“deep-in-the-design engineers” will largely be commoditized and will shrink.&lt;/p&gt;

&lt;figure&gt;
    
    

    
    

    
    
        
            
        
        
        
        
            
            
            
            
            
                
                
                
            
            
            
            
            
                
            
            
            
            
                
            
        
        
        
        
        
        
        
            
                
                
            
        
        
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/cad.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/cad_hu08e8198120d1f581efba80fb89f0d375_916915_800x0_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/08/The-Decline-of-the-Software-Drafter/cad_hu08e8198120d1f581efba80fb89f0d375_916915_800x0_resize_lanczos_3.png&#34; alt=&#34;Drafter, 1992&#34; width=&#34;400&#34; height=&#34;410&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Drafter, 1992&lt;a href=&#34;https://commons.wikimedia.org/wiki/File:Drafter_-_1992_-_BLS.png&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;It’s unclear what chunk of the existing SWE ecosystem &lt;em&gt;is&lt;/em&gt; drafters. I think
most people who get into SWE out of enjoyment of the field probably aren’t in
this category, but it’s definitely out there.&lt;/p&gt;
&lt;h1 id=&#34;3-what-remains&#34;&gt;3. What Remains&lt;/h1&gt;
&lt;p&gt;If I was able to offload &lt;em&gt;all&lt;/em&gt; my coding work on an AI, there still does seem
like a lot left to do as a software engineer. Great! I don’t have to code
anymore. Ah, wait &amp;ndash; still have to figure out how to get all the infrastructure
running to build out my new project, keep the lights on for the dozens of
components we already have in flight, reason through observability, debug wicked
weirdness that happens past the 99th percentile tails, review my agents’ and my
coworkers’ code, write designs, review designs, give career coaching, and on and
on.&lt;/p&gt;
&lt;p&gt;Coding also isn&amp;rsquo;t completely &amp;ldquo;solved&amp;rdquo; yet either! I may not be in the mud
writing thousands of lines of code, but I do still exert significant editorial
control over everything that gets checked-in.&lt;/p&gt;
&lt;p&gt;AI helps with a lot of this. I feel like I have a lot more leverage than I had a
year or two ago, but we’re still pretty far from closing the loop entirely.
AI-assisted debugging, great! AI-assisted responses in the internal help channel
for what my team owns, great! AI-assisted first passes at design docs I’ve
written, great! AI-assisted code review, great!&lt;/p&gt;
&lt;p&gt;I notice that my work days still contain many hours, I still remain busy, much
work remains to be done. Infrastructure remains &lt;strong&gt;hard&lt;/strong&gt;. If anything, all this
has just allowed us to &lt;em&gt;do&lt;/em&gt; more and be more ambitious. Systems rely on momentum
and inertia in part because it requires a ton of time and effort to evolve a
complex system. Reducing the time to build new systems or evolve old ones is
legitimately helpful, but human-used systems still operate on human-driven time
scales &amp;ndash; for now.&lt;/p&gt;
&lt;p&gt;At all the companies I’ve worked with, “produces high quality code artifacts”
was always a bullet point on the SWE ladder, but it was but one of a dozen other
skills and responsibilities that come along with the role. In the immediate
future, I don’t see this expectation going away. In the medium term, I could see
this transitioning away from a focus on artifacts, to a focus on impact &amp;ndash; as
was already the expectation for more senior engineers.&lt;/p&gt;
&lt;p&gt;I predict &lt;a href=&#34;https://benjamincongdon.me/blog/2025/07/31/The-Agency-Gap/&#34;&gt;agency&lt;/a&gt; &amp;amp; impact will become more
of an expectation for junior engineers, since the tools for helping yourself
have become so powerful.&lt;/p&gt;
&lt;p&gt;What remains true:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Knowing how code works at a gears-level is still largely valuable.&lt;/li&gt;
&lt;li&gt;Foundational computer science knowledge is still largely valuable.&lt;/li&gt;
&lt;li&gt;Software “taste” or discernment or heuristics is still largely valuable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What is probably &lt;strong&gt;not&lt;/strong&gt; true:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Being &lt;em&gt;solely&lt;/em&gt; a “coder” is a path to stable employment.&lt;/li&gt;
&lt;li&gt;Being a software &amp;ldquo;craftsperson&amp;rdquo; &lt;em&gt;ceases&lt;/em&gt; to be valuable or enjoyable.
(Rather, I think the value will likely shift and look more like an actual
craft &amp;ndash; high end, not commoditized, but not mass market.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There is a &lt;a href=&#34;https://www.gleech.org/ai2025&#34;&gt;lot going on in AI as we end 2025&lt;/a&gt;. I
have grown to dislike the phrase
&lt;a href=&#34;https://www.youtube.com/watch?v=IwU0Eqe9v6A&#34;&gt;“these are the worst the models will ever be”&lt;/a&gt;
because I feel like it tries to prove too much and has been co-opted as
marketing. But adopt that frame for a minute and project a year or two forward
to where things &lt;em&gt;could&lt;/em&gt; go. What skills remain valuable? What deep knowledge do
you still need to draw upon to know if you’re being BS’d by a confused coding
agent? What can you learn now, build now (experientially!), given that the cost
to churn out new quality code is falling to zero? What cognitive work can you
reinvest your time in, if coding demands less of your work day? What types of
people are going to be good leaders in times of industry change, like these?
What type of organization do you want to invest your career capital in?&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover:
&lt;a href=&#34;https://commons.wikimedia.org/wiki/File:Paolo_Monti_-_Servizio_fotografico_(Milano,_1980)_-_BEIC_6354404.jpg&#34;&gt;Wikimedia&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Obligatory: These are my own opinions and do not reflect those of my employer,
etc.&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;They seemed to me like breakthroughs in non-coding abilities as well, which
is legitimately impressive as neither of them is &lt;em&gt;specialized&lt;/em&gt; as a coding
model.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Epistemic status: &lt;em&gt;Vibe check.&lt;/em&gt;
&lt;a href=&#34;https://www2.census.gov/library/publications/1949/compendia/hist_stats_1789-1945/hist_stats_1789-1945-chD.pdf&#34;&gt;One source&lt;/a&gt;
put the number of drafters in 1940 at 111k, up from 98k in 1930 and 66k
in 1920. Confusingly, this
&lt;a href=&#34;https://www.bls.gov/opub/mlr/1990/12/rpt1full.pdf&#34;&gt;BLS paper&lt;/a&gt; seems to only
indicate there were ~55k drafters in 1990 (though their counting mechanism
is unclear). This
&lt;a href=&#34;https://fraser.stlouisfed.org/title/occupational-employment-wages-9655/2000-707399/content/txt/ocwage_20011114&#34;&gt;St. Louis Fed&lt;/a&gt;
report indicates 200k drafters in 2000 and 193k in 2023. To make meaningful
comparisons, we’d have to correct for population growth, definitional shift,
and so on. Squinting at the numbers, it seems like post-2000 there has been
a slight decline in drafting employment, at least relative to the rest of
the economy.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Embodied Cognition and the &#34;Tokenverse&#34;</title>
        <link>https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/</link>
        <pubDate>Sun, 07 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/</guid>
        <description>&lt;h1 id=&#34;1&#34;&gt;1.&lt;/h1&gt;
&lt;p&gt;One of the common criticisms of modern AI systems is that they aren’t
sufficiently &lt;em&gt;embodied&lt;/em&gt;. The idea being there’s some inherent &lt;em&gt;quality&lt;/em&gt; of being
an agent embedded inside a body in the physical world which cannot be attained
by a token-predicting LLM, regardless of how intelligent an agent becomes.&lt;/p&gt;
&lt;p&gt;To address the validity of this criticism, we need to have a philosophically
rich understanding of what embodiment &lt;em&gt;is&lt;/em&gt; and what it &lt;em&gt;gets us&lt;/em&gt; in terms of
cognitive capacities.&lt;/p&gt;
&lt;h1 id=&#34;2-the-four-es&#34;&gt;2. The Four E’s&lt;/h1&gt;
&lt;p&gt;The best framework I’ve yet found for understanding the benefits of embodiment
for cognition is &lt;a href=&#34;https://en.wikipedia.org/wiki/4E_cognition&#34;&gt;“4E cognition”&lt;/a&gt;.
The four E’s here being: embodied, embedded, extended, and enactive.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Embodied cognition&lt;/strong&gt;, our principal concern, is the notion that the body in
which an agent is embedded is deeply tied to the cognitive processes that the
agent can support. It’s the opposite of “brain in a vat”; the body constrains
and guides the concepts that a brain can conceive. The fact that colors,
material properties, and the rest of the pieces of our inner mental worlds
“exist” is guided by the sensorial situation we’ve found ourselves in &amp;ndash; having
eyes to perceive colors, sense organs to feel texture, and so on. Furthermore,
our conceptual handles for things like doors, chairs, water glasses, and so on
are constrained by how we engage with the physical world. In this view, “chair”
is not a natural category in the “out there” world; it is contingent on our
embodiment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Embedded cognition&lt;/strong&gt; is cognition that is aided by being situated in a
surrounding environment that supports the cognitive task. This refers
specifically to environmental affordances that support a cognitive process
still happening in the brain. For example, tool use &amp;ndash; using an abacus
to support a math operation, or a bookshelf to sort books &amp;ndash; reduces an
agent’s cognitive load by embedding it in a suitable
environment.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Extended cognition&lt;/strong&gt; suggests that aspects of your environment, outside your
brain, are not just &lt;em&gt;supporting&lt;/em&gt; your cognition but are actually &lt;em&gt;constituent
parts&lt;/em&gt; of the cognitive process. An example of this is using a notebook or
smartphone as an aid to memory and focus.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Enactive cognition&lt;/strong&gt; is the idea that cognition “emerges from sensorimotor
activity” &amp;ndash; that there exists no bright line between premeditation and action.&lt;/p&gt;
&lt;h1 id=&#34;3-perception-and-embodiment&#34;&gt;3. Perception and Embodiment&lt;/h1&gt;
&lt;p&gt;Going deeper on the embodiment critique, I think it becomes useful to look at
the interaction between agent and environment: perception. The central argument
of embodied cognition is that the interaction between environment and cognition
is somehow essential for getting the type of thinking that humans are capable
of.&lt;/p&gt;
&lt;p&gt;From David Chapman and Jake Orthwein’s recent wide-ranging
&lt;a href=&#34;https://meaningness.substack.com/p/maps-of-meaningness&#34;&gt;discussion&lt;/a&gt;, we get the
following view of perception:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The reason why 4E-type things are part of the solution to [the problem of
perceptual relevance] is that you sort of evolve to construe the world in
certain narrow ways that have to do with the kind of thing that you are. You
don’t encounter the world as it is first and then have to select from among
that &amp;ndash; &lt;strong&gt;the world presents itself to you in terms of the kind of being that
you are. And that automatically narrows the frame of perception
dramatically.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(Emphasis mine)&lt;/p&gt;
&lt;p&gt;This is a fairly radical claim. The intuitive view of perception is that there
is a world “out there” and that we are perceiving that world in our minds in a
way that reflects how the world actually is &amp;ndash; with the implied corollary that
any creature would see the world roughly the same way as us.&lt;/p&gt;
&lt;p&gt;But this idea flips that intuition. From the infinite ways that sense organs
&lt;em&gt;could&lt;/em&gt; make an internal map of the exterior world territory, the maps that we
tend towards making are the ones that are selected by the type of being that one
is. Creatures that can walk will see doors as “walk-through-able” in a way that
water-bound creatures would not.&lt;/p&gt;
&lt;p&gt;Our ability to resolve raw sense data into well-defined objects, intuitively and
preconsciously, is not because those objects &lt;em&gt;exist&lt;/em&gt; in the world in some
objective sense, but because we’re the type of creature that finds use in, for
example, being able to distinguish between objects that are likely dangerous or
safe. The categories that feel &amp;ldquo;natural&amp;rdquo; to us are natural only relative to our
particular form of embodiment, a pure circumstantial contingency.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/perception-filtering.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/perception-filtering.png&#34; alt=&#34;Perception filtering&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Perception filtering&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;This figure crudely illustrates the process of perception filtering. There
exists some sense data &amp;ldquo;out there&amp;rdquo;, and depending on the type of being, that
sense data is filtered into a different set of concepts. The human resolves the
outlines of the black object against the blue background as a &amp;ldquo;chair&amp;rdquo;, whereas a
hypothetical aquatic herbivore evolved to use its visual sense data to search
for food resolves the green patches and attaches the concept of &amp;ldquo;food&amp;rdquo; to them.
Same sense data, different resolved concepts.&lt;/p&gt;
&lt;p&gt;Orthwein continues:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;And then perception is being narrowed still further by your goals, your
motivations, what sorts of states are active in you, and the actual embodied
situation that you find yourself in each moment. &lt;strong&gt;So you’re never doing this
rationalist view from nowhere from which you have to deduce what’s relevant.&lt;/strong&gt;
The world is giving you a relevance moment by moment by moment, which is the
Heidegger “always already meaningful” picture.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is the second level of filtering. It’s not just “what can you perceive
given your contingent embodiment”, but it’s also “as an agent with goals and
motivations, certain things will be more or less salient, preconsciously”. The
key piece for me, here, is the notion that inside our heads we’re never
&lt;em&gt;actually&lt;/em&gt; doing the rationalist “view from nowhere” stance. We can try to
approximate this with deliberative thinking, but this is only ever a mere
approximation. Embodiment affords and demands a level of perception-based
filtering.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;h1 id=&#34;4-mapping-the-4es-to-llms&#34;&gt;4. Mapping the 4E’s to LLMs&lt;/h1&gt;
&lt;p&gt;OK, so I’ll admit this has become a bit abstract. Let’s rein this in by trying
to map these concepts onto extant LLMs.&lt;/p&gt;
&lt;p&gt;Going in reverse in the 4Es:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Are agents enactive?&lt;/strong&gt; Unclear. Sufficiently harnessed AI agents can “take
action” (heavy scare quotes intended) in the world, but it’s not clear how
seriously we should take this. A thermostat is “enactive” in the world, in
that it’s able to control its environment through action, and the
environment in a sense allows it to “enact” its cognition. If we grant this
property for thermostats, we should likely grant it for AI agents as well,
but this doesn’t seem to prove much.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Are agents extended?&lt;/strong&gt; Definitely, yes. Claude Code uses TODOs in
essentially the same way a human would. Models have limited memory in their
context window; reading/writing from a scratch pad, or using a
code interpreter to offload complicated math, seems like “extended” cognition
in a very similar way to human extended cognition.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Are agents embedded?&lt;/strong&gt; I’d say this is a lukewarm “maybe”. We scaffold
LLMs into “agents” specifically by putting them into harnesses which allow
them to interact with the world. However, the cognition itself isn’t
embedded in the environment. There is still a clear line to be drawn between
“what is LLM” and “what is environment”.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Are agents embodied?&lt;/strong&gt; I’d written up to this point having in my mind a
clear “no”. I still think the answer is pretty firmly “no” for most
reasonable definitions of “embodied”, but I’m less certain than I used to
be. Certainly, &lt;em&gt;certainly&lt;/em&gt;, I’m not making the claim that AI agents have any
form of physical embodiment. This is just obviously false. And the extent to
which we
&lt;a href=&#34;https://deepmind.google/blog/gemini-robotics-brings-ai-into-the-physical-world/&#34;&gt;slap an LLM onto a robot&lt;/a&gt;
and call it embodied &amp;ndash; I think that’s cool, but architecturally still
distinct from what 4E is calling embodied.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What’s still bringing me back to a firm “no” on the embodiment question, setting
aside common sense, is the notion that “constitution” is an important theme of
true embodied cognition:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Constitution: The body (and, perhaps, parts of the world) does more than
merely contribute causally to cognitive processes: it plays a constitutive
role in cognition, literally as a part of a cognitive system. Thus, cognitive
systems consist in more than just the nervous system and sensory organs.
(&lt;a href=&#34;https://plato.stanford.edu/entries/embodied-cognition/#ThreThemEmboCogn&#34;&gt;Source&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This just does not seem to be true of LLMs. Scaffolded AI agents have “senses”
in that they can be told about things happening in the world, and they can even
request measurements to be taken of the world on their behalf, but they are not
&lt;em&gt;of&lt;/em&gt; the world. Everything is still just tokens to them. Even multimodal visual
information is still provided in the same learned representational space of
tokens. There is no sense in which they have sense perception of the world, or
parts of their cognition which are causally &lt;em&gt;of&lt;/em&gt; the world, such that one could
say that their &lt;em&gt;cognition is actually embedded in the world&lt;/em&gt;.&lt;/p&gt;
&lt;h1 id=&#34;5-the-tokenverse&#34;&gt;5. The Tokenverse&lt;/h1&gt;
&lt;p&gt;A common critique of the field of AI alignment is the provocative question
“Aligned to &lt;em&gt;what&lt;/em&gt;?” I think there’s a natural analogy here: “Embodied in
&lt;em&gt;what&lt;/em&gt;?” If we grant LLMs some primitive form of embodiment, they seem to be
embodied in something like a “tokenverse”, a linguistic ecosystem, not the
physical world.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Agents acting in this “tokenverse” still seem to be quite capable of taking
action in our real physical world &amp;ndash; albeit in ways that we’ve constructed to
mediate these actions.&lt;/p&gt;
&lt;p&gt;So what does this mean for the criticism that “AI systems are insufficiently
embodied”? I think insofar as this criticism is trying to say that “AI cognition
is different from human cognition”, that is just obviously true. However, this
does give us a useful handle on &lt;em&gt;how&lt;/em&gt; these two forms of cognition are
different.&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;Our embodied perception filtered &amp;ldquo;the world&amp;rdquo; into concepts. Those concepts
ricocheted off billions of human minds, eventually crystallizing in the internet
datasets that became the basis for LLM training data.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/filtering-to-data.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/filtering-to-data.png&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;Now LLMs use these human-derived concepts in their abstract &amp;ldquo;tokenverse&amp;rdquo; to take
actions that impact our physical world. Perhaps this is why
&lt;a href=&#34;https://www.lesswrong.com/posts/Nwgdq6kHke5LY692J/alignment-by-default&#34;&gt;“alignment by default”&lt;/a&gt;
seems to be actually going fairly well so far? Naively, it would seem much
easier to “align” something which already shares a reasonably close set of
conceptual handles to us.&lt;/p&gt;
&lt;p&gt;LLMs didn’t have to learn how to interact with the world and form their own
conceptual handles based on their weird alien understanding of what it means to
exist in the universe, they were able to learn these from statistical patterns
in tokenized concepts from human-derived data.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/full-loop.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/07/Embodied-Cognition-and-the-Tokenverse/full-loop.png&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;Now we have the full loop. Human experience is distilled into concepts represented
as statistical patterns in the training data, a &amp;ldquo;byproduct&amp;rdquo; of embodied
experience. LLMs train against this data and learn statistical patterns, but
can&amp;rsquo;t influence the training data &amp;ndash; their &amp;ldquo;tokenverse&amp;rdquo; &amp;ndash; in the same way that
humans can influence the world through their own physical embodiment as part of
their feedback loop. Similarly, when LLMs take actions in the world, they are
doing so in a way that is constrained by the concepts in their &amp;ldquo;tokenverse&amp;rdquo;
rather than the world itself. At present, LLMs are also limited in their ability
to be influenced by the world, as they are mediated by a set of static
model weights which aren&amp;rsquo;t themselves influenced by the world during
interaction.&lt;/p&gt;
&lt;p&gt;If we take the “tokenverse” idea seriously, we should consider what is implied
by being &amp;ldquo;embodied&amp;rdquo; within it. LLMs clearly develop some internal conceptual
landscape that produces their resulting outputs. Empirical investigation into
model features supports this. It would be quite surprising if the learned
concepts mapped 1:1 onto human concepts. This would imply that the conceptual
landscape that best withstands the optimization pressures of model training maps
with high fidelity onto human-filtered concepts, which would be quite a
strong claim. More likely, LLMs develop an internal conceptual landscape that
&lt;em&gt;approximates&lt;/em&gt; human concepts, but is itself quite foreign. Indeed, empirically
this appears to be
&lt;a href=&#34;https://www.alignmentforum.org/posts/StENzDcD3kpfGJssR/a-pragmatic-vision-for-interpretability&#34;&gt;pragmatically indecipherable&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For now, if LLMs are embodied of anything, they are embodied of the texts and
concepts we first perceived via our own physical embodiment.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover: Barcelona, Spain&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I&amp;rsquo;m going to have to read more about Heidegger&amp;rsquo;s notions of
&amp;ldquo;pre-ontological&amp;rdquo; existence now, I suppose.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;As an aside, I don&amp;rsquo;t think there&amp;rsquo;s anything spooky going on with respect to
tokenization in particular. That is, if LLMs eventually transition to using
byte-level tokens, I don&amp;rsquo;t think that changes the fundamental argument.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;Here, we are implicitly granting that AI cognition &lt;em&gt;is&lt;/em&gt; cognition. I’m not
100% sure we should be ready to bite that bullet, but that’s a topic for
another time.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Book Review: Antimemetics</title>
        <link>https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/</link>
        <pubDate>Sat, 06 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/</guid>
        <description>
&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/antimemetics.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/antimemetics_huc0a08881dc09f80f5d612f0ebba04d14_215405_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/06/Book-Review-Antimemetics/antimemetics_huc0a08881dc09f80f5d612f0ebba04d14_215405_0x600_resize_lanczos_3.png&#34; width=&#34;208&#34; height=&#34;300&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;&lt;a href=&#34;https://darkforest.metalabel.com/antimemetics&#34;&gt;Source&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h2 id=&#34;1-slippery-ideas&#34;&gt;1. Slippery Ideas&lt;/h2&gt;
&lt;p&gt;Nadia Asparouhova’s &lt;em&gt;Antimemetics&lt;/em&gt; is, itself, antimemetic.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; I devoured this
book in a few sittings on the bus to work, but if I had to describe it, I really
only have a few conceptual handles that I could grasp onto:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Memes are ideas that spread easily. &lt;strong&gt;Antimemes are ideas that resist
spreading.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;We live in an information ecosystem which is made up of various types of
memes. Memes have varying levels of impact, salience, and transmissibility.&lt;/li&gt;
&lt;li&gt;Often the most useful ideas are antimemetic.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here, of course, we’re talking of &lt;a href=&#34;https://en.wikipedia.org/wiki/Meme&#34;&gt;meme&lt;/a&gt; in
the Richard Dawkins sense &amp;ndash; not image macros (necessarily), but ideas that are
spread through social or cultural forces. Once you see memes as such, they’re
everywhere: fashions, neologisms, dances, common fears, celebrities, and so on.&lt;/p&gt;
&lt;h2 id=&#34;2-for-lack-of-an-antimemetics-division&#34;&gt;2. For Lack of an Antimemetics Division&lt;/h2&gt;
&lt;p&gt;The idea of antimemes was brought to popularity by
&lt;a href=&#34;https://en.wikipedia.org/wiki/There_Is_No_Antimemetics_Division&#34;&gt;&lt;em&gt;There is No Antimemetics Division&lt;/em&gt;&lt;/a&gt;,
by “qntm”. It was originally a serialized novella on the
&lt;a href=&#34;https://en.wikipedia.org/wiki/SCP_Foundation&#34;&gt;SCP Wiki&lt;/a&gt;. The story
centers on a monster that can erase all memory of its existence.
Researchers in the story can inspect the monster and observe it, but once
they’re out of its chamber, all memory of it leaves their minds. This monster,
SCP-055, is quintessentially antimemetic.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Memes want to be shared. When we come across a particularly good meme, our
first instinct is to pass it on to someone else. &amp;hellip; Antimemes are the
opposite. When we encounter an antimemetic object, there is a reflexive desire
– consciously or not – to suppress it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Antimemes are slippery. They’re definitionally hard to describe, hard to share;
often, they’re unsexy, uncool, and cringe.&lt;/p&gt;
&lt;p&gt;Memes, on the other hand, want to be shared. They’re self-propagating:
the type of idea that has been shaped by environmental pressures into exactly
the thing you want to pass on, packaged so that it can be easily shared.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/6-7_(meme)&#34;&gt;“Six-seven”&lt;/a&gt; is the perfect meme.
There is no content to this meme, and yet it spread. A meme can be thought of as
an inner message (the actual &lt;em&gt;content&lt;/em&gt;) plus a wrapper (the window
dressing that makes it enticing to share), akin to the
&lt;a href=&#34;https://benjamincongdon.me/blog/2021/02/21/Three-Layers-of-Information/&#34;&gt;three-tiered hierarchy of information&lt;/a&gt;
I’ve written about before. The six-seven meme has no inner message; it is pure
wrapper. To some extent, that gives it a tremendous advantage in spreading:
there is no core meaning that needs to be preserved uncorrupted.&lt;/p&gt;
&lt;p&gt;It’s much harder to think of a quintessential antimeme. Asparouhova uses
“writing your own will” as an example. Writing a will is valuable, yet most
people resist doing so for years. Terms of Service are another good antimeme:
they’re ubiquitously deployed, and ubiquitously ignored.&lt;/p&gt;
&lt;h2 id=&#34;3-the-memetic-environment&#34;&gt;3. The Memetic Environment&lt;/h2&gt;
&lt;p&gt;Memes need a medium to spread in. Ideas have been spreading for as long as human
culture has been a thing, but the development of the internet led to something
of a memetic singularity. Memes can now be generated and spread, globally, at a
scale unimaginable in the 20th century. Asparouhova terms this hyper-memetic
environment the “memetic city”:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The memetic city is easily recognizable. It is the realm of viral ideas and
social contagion: tweets that explode overnight, social media avatars
supporting the latest political cause, TikToks and Instagram Reels that we
scroll through at the end of a long day. Here, ideas spread with lightning
speed – amplified by social platforms – and shape our collective behavior and
preferences: the opinions we hold, the dates we go on, who we vote for. &amp;hellip;
[The memetic city] thrives on visibility; its power is rooted in the ability
of ideas to quickly capture our attention and replicate across the hive mind.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The foil of the memetic city is the “Dark Forest”, first defined in an
&lt;a href=&#34;https://ystrickler.medium.com/the-dark-forest-theory-of-the-internet-7dc3e68a7cb1&#34;&gt;article&lt;/a&gt;
by Yancey Strickler as a nod to the science fiction
&lt;a href=&#34;https://en.wikipedia.org/wiki/The_Dark_Forest&#34;&gt;series&lt;/a&gt; of the same name. The
Dark Forest theory, from the scifi series, is that in an adversarial
environment, it is rational to conceal yourself, lest you be predated. In an
internet overrun by spam, bots, spearphishers, SWATters, and threats of
cancellation, the sane response is to retreat from the memetic city into the
darker, more personal corners of the web.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; Group chats, Discord
servers, insular Twitter communities made up of pseudonymous posters with anime
profile pictures.&lt;/p&gt;
&lt;p&gt;Asparouhova draws heavily on Venkatesh Rao’s writing on
&lt;a href=&#34;https://www.ribbonfarm.com/about/&#34;&gt;Ribbonfarm&lt;/a&gt; and
&lt;a href=&#34;https://maggieappleton.com/&#34;&gt;Maggie Appleton&lt;/a&gt;, both of whom I’ve quite enjoyed
reading in the past. The synthesis of Rao, Appleton, and others is that a
desirable response to the noise of the memetic city is a
&lt;a href=&#34;https://maggieappleton.com/cozy-web&#34;&gt;“cozy web”&lt;/a&gt;: “the private,
gatekeeper-bounded spaces of the internet we have all retreated to over the last
few years.”&lt;/p&gt;
&lt;p&gt;A broad history of internet culture in Asparouhova’s framing goes
something like this: First, there was a wave of optimism that the internet could be a
force for peace and connection. Second, because of forces such as mimetic desire
and context collapse, the internet actually became a hostile home to culture
wars. Virality, first an interesting phenomenon for sharing cat pictures, became
a weapon used to direct attention. Third, as a counter-reaction to these
forces, we have now entered an “antimemetic” era, characterized by a broader
retreat from the town square into the dark forest.&lt;/p&gt;
&lt;h2 id=&#34;4-staring-at-an-antimeme&#34;&gt;4. Staring at an Antimeme&lt;/h2&gt;
&lt;p&gt;Descriptively, antimemes are the types of ideas that &lt;em&gt;don&amp;rsquo;t&lt;/em&gt; work on the memetic
stage. They’re not going to be a banger tweet, they won’t get you millions of
likes on Instagram, they’re not “cool”.&lt;/p&gt;
&lt;p&gt;There are different types of antimemes, since there are different reasons that
ideas can resist being spread:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Boring&lt;/strong&gt;. Many ideas resist spreading because they’re boring, unable to
hold our attention for long. Terms of Service are a bureaucratic nuisance, but
you click “approve” and completely forget you did so. Daylight Saving Time is
briefly memetic twice per year when the clocks change, but is quickly forgotten.
It just doesn&amp;rsquo;t appear important enough to fix, despite the common knowledge
that roughly no one wants it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Taboo&lt;/strong&gt;. Some ideas resist spread because there is a social cost of doing
so. Societies use taboo as an immune response to ideas that are seen as
threatening. Taboos are contextual and highly subject to opinion. These can be
reasonable &amp;ndash; such as taboos against violence &amp;ndash; or counterproductive &amp;ndash; such as
those suppressing ideas just outside the current Overton window.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Taboos are one of the most prominent categories of antimemes&amp;hellip; Some taboos –
such as stealing, cheating, or lying – don’t budge within most networks&amp;hellip; But
taboos linger precisely because their symptomatic period is so long. They can
lie dormant for years until more nodes are willing to receive or spread the
idea.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;The Uncomfortable&lt;/strong&gt;. Some ideas resist being processed by our minds out of an
ego-based or psychological immune response. Ideas that would be painful to
process largely get ignored.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We avoid thoughts that are cognitively expensive to process&amp;hellip; No one wants to
think about their own death, much less the death of themselves and their
partner simultaneously&amp;hellip; Death, retirement planning, getting married and
having kids…for many people, these ideas are difficult to prioritize because
they force us to confront uncomfortable truths.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;The Dangerous&lt;/strong&gt;. Some ideas are legitimately dangerous to share. The most
obvious example is information related to CBRN (chemical, biological,
radiological, and nuclear) weapons. However, there are some more subtle
examples. Asparouhova cites Ethan Watters’ &lt;em&gt;Crazy Like Us&lt;/em&gt; which
(controversially) suggests that some mental health disorders &amp;ndash; such as anorexia
and “American-style” depression &amp;ndash; have a component of cultural transmission,
and as such information about these conditions &lt;em&gt;can&lt;/em&gt; be dangerous to susceptible
individuals. Ironically, dangerous information can be tragically memetic within
vulnerable populations, such as the subreddits devoted to enabling discussions
of eating disorders.&lt;/p&gt;
&lt;p&gt;So why care? Well, to a certain type of person (*raises hand*), antimemes are
fascinating. Antimemes are like going to the thrift store of ideas. Many are
boring, but still interesting to look at. Some are life changing. Having a deep
conversation with someone well out of your professional or social path is a
great way to be introduced to novel antimemes that you can then put in your back
pocket, to be explored later or forgotten, depending on the content.&lt;/p&gt;
&lt;p&gt;Also, antimemes are often important. They’re often the “eat your veggies” of
your information diet. Long reads are antimemetic, compared to a viral tweet.
You may even remember the viral tweet more vividly than a closely read long
article. A good life is one not optimized for ease, but one nudged in the
direction of meaning-making. Fully ignore antimemes at your own peril.&lt;/p&gt;
&lt;p&gt;From here, Asparouhova introduces a third class of more dangerous memes that
combine the “importance seemingness” of antimemes with the transmissibility of
memes.&lt;/p&gt;
&lt;h2 id=&#34;5-supermemes&#34;&gt;5. Supermemes&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Antimemetics&lt;/em&gt; offers the following taxonomy of memes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“Memes” are high-transmission ideas with low impact.&lt;/li&gt;
&lt;li&gt;“Antimemes” are low-transmission ideas with high impact.&lt;/li&gt;
&lt;li&gt;“Supermemes” are high-transmission ideas with high impact.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We’ve talked about memes and their antimemetic foil, but we’ve yet to discuss
supermemes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Supermemes &amp;hellip; are like black holes. Like memes, they spread quickly, but
unlike memes, they are perceived as highly consequential. Their sheer
gravitational force pulls us in, crowding out our ability to think about
anything else. Whereas antimemes are characterized by a &amp;lsquo;strange forgetting&amp;rsquo;
by the perceiver, supermemes are characterized by a &amp;lsquo;strange inability to
forget.&amp;rsquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Supermemes are akin to
&lt;a href=&#34;https://en.wikipedia.org/wiki/Timothy_Morton#Hyperobjects&#34;&gt;hyperobjects&lt;/a&gt; &amp;ndash;
they have the power to totalize anything that touches them. They have the
quality of a looming catastrophe, a
&lt;a href=&#34;https://en.wikipedia.org/wiki/Shepard_tone&#34;&gt;Shepard tone&lt;/a&gt; of impending doom
that prevents one from averting their gaze.&lt;/p&gt;
&lt;p&gt;The obvious supermeme from 2023 until the present has been AI. In my corner of the
world, it’s predominantly “AI Doom”. It&amp;rsquo;s clear why: AI is scary; it portends
the potential for a loss of human control; it portends the potential for the
loss of human value, of human creativity, of biological intelligence. It, like
climate change, is something that we appear to be doing to ourselves. We are
forced to look directly at the crisis as it looms. And yet, there have been
supermemes in the past. If/when we make it past our current AI doom, there will
be another supermeme to replace it.&lt;/p&gt;
&lt;h2 id=&#34;6-truth-tellers-and-champions&#34;&gt;6. Truth Tellers and Champions&lt;/h2&gt;
&lt;p&gt;Finally, Asparouhova discusses the heroes of &lt;em&gt;Antimemetics&lt;/em&gt;: truth tellers and
champions. Some antimemetic ideas are worth injecting into broader awareness,
and Asparouhova sees these two archetypes of people as those who have the
ability to do so.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Truth tellers&lt;/strong&gt; are willing to break social norms to say what others fear
saying, often with an air of the trickster. They are often seen as cringe, and
so pay a social cost.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Truth-tellers, who often operate outside of conventional norms, are especially
vulnerable to being labeled as cringe. This creates a chilling effect&amp;hellip;
Cringe suppresses the truth-tellers: the chaotic, creative idiots who
gleefully prod us to reassess what we think we know and believe.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Champions&lt;/strong&gt;, on the other hand, are tireless spreaders of a particular idea. If
we ever get rid of Daylight Saving Time, it will be because someone
took up the cause as a Champion and spent their energy keeping that
idea salient enough for it to be addressed.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Civilization scales its cultural awareness through &amp;lsquo;distributed remembering,&amp;rsquo;
where we empower champions to curate our attention, which expands our ability
to make progress on many different issues at once.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Society is too complex for any one person to attend to all that needs attending.
Forgetting is a necessary part of living as a human, in both pre- and
post-modern society. We distribute or shard our remembering, as a form of
specialization. Champions take a narrow slice they feel is unattended to, and
attempt to convince others to stop forgetting long enough for a problem to be
solved.&lt;/p&gt;
&lt;h2 id=&#34;7-conclusions&#34;&gt;7. Conclusions&lt;/h2&gt;
&lt;p&gt;What are we to make of &lt;em&gt;Antimemetics&lt;/em&gt;? What should we hold onto as its core,
while the rest slips out of memory?&lt;/p&gt;
&lt;p&gt;I found this book hard to pin down &amp;ndash; it’s part internet anthropology, part
vibe-check of the post-2020 world, part field guide to “interesting ideas and
where to find them”.&lt;/p&gt;
&lt;p&gt;If I were to give three final thoughts:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Occasionally force yourself to stare at antimemes. Make a log of ideas you
run into. Sure, forget most of them, but build the muscle nonetheless. Find
one or two ideas to champion to an unreasonable degree.&lt;/li&gt;
&lt;li&gt;It’s reasonable to incubate ideas in private cozy-web style environments
before deciding whether to share them more publicly. Doing so is
psychologically and socially well-adapted.&lt;/li&gt;
&lt;li&gt;Developing good “cognitive security” is already a baseline requirement for
being sane in the post-modern world. This will only become more of a
necessity with the increase of generated media. Generative AI is another
step change in how quickly memes can evolve for reproductive fitness,
so we should expect more captivating memetic spread in the next few
years.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Under the book’s framing, I think this is a compliment?&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Anyone who has had a post go even &lt;em&gt;semi&lt;/em&gt;-viral, I think, can feel the appeal
of the dark forest. When one of my posts gets picked up on Hacker News,
there&amp;rsquo;s this feeling of the
&lt;a href=&#34;https://en.wikipedia.org/wiki/Sauron&#34;&gt;Eye of Sauron&lt;/a&gt; being cast upon you.
&lt;em&gt;Something, something &amp;ldquo;Don&amp;rsquo;t read the comments.&amp;rdquo;&lt;/em&gt; Even if most of the
feedback is positive, there&amp;rsquo;s a chilling effect and set of expectations that
comes with this: a temptation to sand off the edges of ideas to make them
more palatable, or to offer fewer handles for criticism. I don&amp;rsquo;t think this
phenomenon of attention warping is new to the internet, but the ability to
link/screenshot/amplify &lt;em&gt;anything&lt;/em&gt; hypercharges it. Sharing a random
half-baked idea in a private Discord group has an effectively zero chance of
blowing up on you. Sharing a half-baked idea on X/Twitter also has an
effectively zero chance of blowing up on you, but it creates an ~immutable,
searchable public record of your half-baked idea, which can be trawled up
years later by someone fishing for dirt.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>TIL: SQLite&#39;s &#39;WITHOUT ROWID&#39;</title>
        <link>https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/</link>
        <pubDate>Fri, 05 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/</guid>
        <description>&lt;p&gt;&lt;strong&gt;By default, SQLite tables have a special &lt;code&gt;rowid&lt;/code&gt; column that uniquely
identifies each row.&lt;/strong&gt; This &lt;code&gt;rowid&lt;/code&gt; exists even if you have a user-specified
&lt;code&gt;PRIMARY KEY&lt;/code&gt; on the table. How this &lt;code&gt;rowid&lt;/code&gt; column behaves is influenced by
your &lt;code&gt;PRIMARY KEY&lt;/code&gt; type.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Integer Primary Keys&lt;/strong&gt;: If you have an integer primary key, then the primary
key column becomes an alias for the special &lt;code&gt;rowid&lt;/code&gt; column. The &lt;code&gt;rowid&lt;/code&gt; and your
user-defined primary key are literally just the same column.&lt;/p&gt;
&lt;p&gt;Taking an example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;CREATE&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;TABLE&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;IF&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;NOT&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;EXISTS&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;users(&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;user_id&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;INTEGER&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;PRIMARY&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;KEY&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;email&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;TEXT&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In this case, since we’ve defined &lt;code&gt;user_id&lt;/code&gt; to be an &lt;code&gt;INTEGER PRIMARY KEY&lt;/code&gt;,
&lt;code&gt;user_id&lt;/code&gt; becomes an alias for &lt;code&gt;rowid&lt;/code&gt;. These two labels refer to the same
physical column.&lt;/p&gt;
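&lt;p&gt;This aliasing is easy to verify from Python&amp;rsquo;s built-in &lt;code&gt;sqlite3&lt;/code&gt; module (a minimal sketch; the in-memory database and sample row are just for illustration):&lt;/p&gt;

```python
import sqlite3

# Illustrative in-memory database (not from the post).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS users(user_id INTEGER PRIMARY KEY, email TEXT)"
)
conn.execute("INSERT INTO users(email) VALUES ('ben@example.com')")

# Because user_id is an INTEGER PRIMARY KEY, it aliases rowid:
# both names read the same underlying column.
print(conn.execute("SELECT rowid, user_id FROM users").fetchone())  # (1, 1)
```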

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/users.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/users_hu84d8a599dd648de23df7171264b11d35_18198_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/users_hu84d8a599dd648de23df7171264b11d35_18198_0x600_resize_lanczos_3.png&#34; alt=&#34;Figure 1: Integer primary keys are aliases for the rowid column&#34; width=&#34;319&#34; height=&#34;300&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Figure 1: Integer primary keys are aliases for the &lt;code&gt;rowid&lt;/code&gt; column&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Non-Integer Primary Keys&lt;/strong&gt;: When you use a non-integer primary key, such as
&lt;code&gt;TEXT&lt;/code&gt;, SQLite actually implements the “primary key” as a &lt;code&gt;UNIQUE INDEX&lt;/code&gt; between
&lt;code&gt;rowid&lt;/code&gt; and your user-defined primary key.&lt;/p&gt;
&lt;p&gt;SQLite stores data on disk in
&lt;a href=&#34;https://benjamincongdon.me/blog/2021/08/17/B-Trees-More-Than-I-Thought-Id-Want-to-Know/&#34;&gt;B-trees&lt;/a&gt;. Table
rows and indices are both stored in B-trees. For non-integer primary keys, this
means that on disk, your table has at least two B-trees &amp;ndash; one for the rows, and
another for the index between the &lt;code&gt;rowid&lt;/code&gt; and primary key.&lt;/p&gt;
&lt;p&gt;Taking an example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;CREATE&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;TABLE&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;IF&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;NOT&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;EXISTS&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;page_views(&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;page_url&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;TEXT&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;PRIMARY&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;KEY&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;views&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;INTEGER&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;);&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This table would have two on-disk B-trees. The first for the rows of
&lt;code&gt;(rowid, page_url, views)&lt;/code&gt; and the second for the index entries of
&lt;code&gt;(page_url, rowid)&lt;/code&gt;.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/page_views_1.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/page_views_1_huf56f0735eb5d21b7224c3ceb41580e0c_49428_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/page_views_1_huf56f0735eb5d21b7224c3ceb41580e0c_49428_0x600_resize_lanczos_3.png&#34; alt=&#34;Figure 2: Non-integer primary keys&#34; width=&#34;736&#34; height=&#34;300&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Figure 2: Non-integer primary keys&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;This two-tree structure has performance implications. If we were to do a query
like &lt;code&gt;SELECT views FROM page_views WHERE page_url=&#39;/blog/sqlite&#39;&lt;/code&gt;, the SQLite
query engine first has to search the &lt;code&gt;(page_url, rowid)&lt;/code&gt; B-tree to get the
&lt;code&gt;rowid&lt;/code&gt; of the row with &lt;code&gt;page_url=&#39;/blog/sqlite&#39;&lt;/code&gt;, and then use that rowid to
locate the full row in the &lt;code&gt;(rowid, page_url, views)&lt;/code&gt; B-tree to get the &lt;code&gt;views&lt;/code&gt;
from that row.&lt;/p&gt;
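&lt;p&gt;You can see this hidden index in the query plan. A quick sketch using Python&amp;rsquo;s built-in &lt;code&gt;sqlite3&lt;/code&gt; module (the table is empty; only the plan matters here, and the exact wording of the plan text can vary by SQLite version):&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views(page_url TEXT PRIMARY KEY, views INTEGER)")

# SQLite implements the TEXT primary key as an automatic UNIQUE index
# (named sqlite_autoindex_*); lookups search it first, then fetch the
# full row from the table B-tree by rowid.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT views FROM page_views WHERE page_url = ?",
    ("/blog/sqlite",),
).fetchall()
for row in plan:
    # e.g. "SEARCH page_views USING INDEX sqlite_autoindex_page_views_1 (page_url=?)"
    print(row[-1])
```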
&lt;h2 id=&#34;enter-without-rowid&#34;&gt;Enter WITHOUT ROWID&lt;/h2&gt;
&lt;p&gt;Adding the &lt;code&gt;WITHOUT ROWID&lt;/code&gt; clause to a &lt;code&gt;CREATE TABLE&lt;/code&gt; statement disables this
&lt;code&gt;rowid&lt;/code&gt; behavior. For example:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-sql&#34; data-lang=&#34;sql&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#00a&#34;&gt;CREATE&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;TABLE&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;IF&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;NOT&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;EXISTS&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;page_views(&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;page_url&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;TEXT&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;PRIMARY&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;KEY&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;  &lt;/span&gt;views&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#0aa&#34;&gt;INTEGER&lt;/span&gt;,&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#bbb&#34;&gt;&lt;/span&gt;)&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;&lt;span style=&#34;color:#00a&#34;&gt;WITHOUT&lt;/span&gt;&lt;span style=&#34;color:#bbb&#34;&gt; &lt;/span&gt;ROWID;&lt;span style=&#34;color:#bbb&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Instead of creating the phantom &lt;code&gt;rowid&lt;/code&gt; column, the primary key of a
&lt;code&gt;WITHOUT ROWID&lt;/code&gt; table uses a “clustered index”. In this context, a clustered
index means that the row data and the primary key index are stored in the same
B-tree. The rows in this index are physically stored in order of the primary
key. Thus, there’s no need for a second look-up B-tree.&lt;/p&gt;
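&lt;p&gt;Correspondingly, a &lt;code&gt;WITHOUT ROWID&lt;/code&gt; table has no &lt;code&gt;rowid&lt;/code&gt; column at all, which is easy to confirm (again a small sketch with Python&amp;rsquo;s built-in &lt;code&gt;sqlite3&lt;/code&gt; module):&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views(page_url TEXT PRIMARY KEY, views INTEGER) WITHOUT ROWID"
)

# With no hidden rowid column, referencing it raises an error; primary-key
# lookups search the single clustered B-tree directly.
try:
    conn.execute("SELECT rowid FROM page_views")
except sqlite3.OperationalError as exc:
    print(exc)  # no such column: rowid
```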

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/page_views_2.png&#34;&gt;
            
            
            &lt;picture&gt;
                &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/page_views_2_hue8ace7c6abd4aa51657694dddf59a11e_32536_0x600_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
                &lt;img
                    src=&#34;https://benjamincongdon.me/blog/2025/12/05/TIL-SQLites-WITHOUT-ROWID/page_views_2_hue8ace7c6abd4aa51657694dddf59a11e_32536_0x600_resize_lanczos_3.png&#34; alt=&#34;Figure 3: WITHOUT ROWID tables use a clustered index&#34; width=&#34;344&#34; height=&#34;300&#34;
                    loading=&#34;lazy&#34;
                    decoding=&#34;async&#34;
                &gt;
            &lt;/picture&gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Figure 3: WITHOUT ROWID tables use a clustered index&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;em&gt;Diagram note:&lt;/em&gt; Visualizing a B-tree as a table is a bit of an
oversimplification. Note that in Figure 3, as opposed to Figure 2, the rows are
stored in order of the primary key to gesture at the on-disk layout of the
B-tree.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When to use WITHOUT ROWID:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If you frequently look up by primary key, or range scan by primary key, there
is a performance benefit to using &lt;code&gt;WITHOUT ROWID&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;If you use a composite primary key, and frequently look up by this key.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;When NOT to use WITHOUT ROWID:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;If your primary keys are large.&lt;/strong&gt; If your primary keys are large, then any
additional indices on the table will have to duplicate storage of the entire
primary key, instead of using the &lt;code&gt;rowid&lt;/code&gt; to refer to the entry. For
instance, if you’re using a 200-byte string primary key, then every
secondary index will include those 200 bytes, instead of the 8-byte &lt;code&gt;rowid&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;If your table rows are large (e.g. they store a large blob).&lt;/strong&gt; If your
table rows are large, then the disk layout of the table will be less
efficient as a &lt;code&gt;WITHOUT ROWID&lt;/code&gt; table. This is because SQLite uses a
different type of B-tree &amp;ndash; a B*-tree &amp;ndash; layout for &lt;code&gt;rowid&lt;/code&gt; tables that
only stores data in leaf nodes, whereas clustered indices use a normal
B-tree layout, which stores data in intermediate nodes. If the rows
themselves are large, the storage of data in intermediate nodes can worsen
the fan-out when searching the tree.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;learnings--discussion&#34;&gt;Learnings / Discussion&lt;/h3&gt;
&lt;p&gt;I dug into this concept more deeply when trying to read-optimize a SQLite
database that is used as a simple key-value on-disk cache. The notion of
disabling &lt;code&gt;rowid&lt;/code&gt; for this table seemed appealing, until I realized that the
optimization was likely not worth it given that the table stored quite large
blobs in its rows, which would degrade the overall efficiency of the on-disk
layout. In any case, I got to learn more about SQLite internals, which was well
worth the time.&lt;/p&gt;
&lt;p&gt;The SQLite &lt;a href=&#34;https://sqlite.org/docs.html&#34;&gt;documentation&lt;/a&gt; makes for a fascinating
read and is exemplary technical writing. The SQLite project is a true treasure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Further Reading&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://sqlite.org/withoutrowid.html&#34;&gt;Clustered Indexes and the WITHOUT ROWID Optimization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://fly.io/blog/sqlite-internals-btree/&#34;&gt;SQLite Internals: Pages &amp;amp; B-trees&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
        <title>Race Report: Seattle Marathon 2025</title>
        <link>https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/</link>
        <pubDate>Thu, 04 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/</guid>
        <description>&lt;p&gt;Last Sunday, I ran the 2025 Seattle Marathon. This was my third marathon, and I
got a PR! I’m splitting this race report into two broad sections: about the
course, and about my experience/training/etc.&lt;/p&gt;
&lt;h2 id=&#34;the-course--event&#34;&gt;The Course &amp;amp; Event&lt;/h2&gt;
&lt;p&gt;Candidly, I’ve avoided running the Seattle Marathon in the past because I’d
heard negative things about the course layout. Previous courses spent much more
time around the arboretum and University District, and routed over the 520
bridge &amp;ndash; which does sound fun in the abstract&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, but would be monotonous for
a race.&lt;/p&gt;
&lt;p&gt;One thing I still grumble about with the Seattle Marathon is that they &lt;em&gt;do not
post their courses until a few months before the race&lt;/em&gt;. This is in contrast with
other races, which either don’t meaningfully change their courses
year-over-year, or have an established course prior to registration opening.
Signing up for a course sight-unseen isn’t great.&lt;/p&gt;
&lt;p&gt;However, the 2025 course was great! It was a true highlight reel of the best
locations for Seattle running.&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/course.png&#34;&gt;
            
            
            
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/course.png&#34; alt=&#34;Course Map: 2025 Seattle Marathon&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Course Map: 2025 Seattle Marathon&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Broadly, the course started near the Gates Foundation building, went up through
Eastlake, turned east to the Arboretum via Interlaken Park, provided a nice loop
through the Arboretum and the University of Washington campus, then headed west
through Gas Works Park, Fremont, the Fremont Canal, and Interbay.&lt;/p&gt;
&lt;h3 id=&#34;interlaken-park-arboretum-union-bay-natural-area&#34;&gt;Interlaken Park, Arboretum, Union Bay Natural Area&lt;/h3&gt;
&lt;p&gt;The section of the course that went through Interlaken Park, the UW Arboretum,
and the Union Bay Natural Area (near Husky Stadium) was the highlight of the
race for me. Running through those parks on a crisp day as the sun is coming
out, alongside a ton of other runners&amp;hellip; amazing.&lt;/p&gt;
&lt;h3 id=&#34;magnolia--interbay&#34;&gt;Magnolia &amp;amp; Interbay&lt;/h3&gt;
&lt;p&gt;In contrast, the hilly industrial streets north of Interbay and the section of
Elliott Ave in the last 3 miles of the course are not that nice to run on.&lt;/p&gt;
&lt;p&gt;The last several miles were a point of contention, since they
&lt;a href=&#34;https://www.seattletimes.com/seattle-news/seattle-marathon-reroute-a-hit-for-runners-headache-for-new-neighbors/&#34;&gt;caused significant traffic in Magnolia&lt;/a&gt;
and largely prevented residents from leaving during the race. When I normally
run this section of Interbay, I take the Elliott Bay Trail through the
industrial area there, which connects to Centennial Park. (This is the blue
route in the map.)&lt;/p&gt;

&lt;figure&gt;
        &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/alternate_route.png&#34;&gt;
            
            
            
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/alternate_route.png&#34; alt=&#34;Course Map: Magnolia Bridge &amp;amp;amp; Elliott Way Segment&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
            
        &lt;/a&gt;
    

    &lt;figcaption&gt;&lt;p&gt;Course Map: Magnolia Bridge &amp;amp; Elliott Way Segment&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;I think this would have been the marathon planners’ preferred route, but two
things prevented it from actually being used:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Despite
&lt;a href=&#34;https://www.seattlebikeblog.com/2025/10/01/with-the-bridge-over-nothing-removed-the-terminal-91-trail-in-interbay-reopens-thursday/&#34;&gt;recent trail renovations&lt;/a&gt;,
there’s still a significant choke point in the trail in the connection
between the Elliott Bay Trail and Centennial Park that would likely have been
a safety issue for the volume of marathon runners.&lt;/li&gt;
&lt;li&gt;There is ongoing construction in Elliott Bay Park that would make it tricky
to route through.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I have pretty high confidence in this second factor, as this message was posted
around the Race Expo:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The new courses touch several parks, waterways and other viewpoints throughout
the city, showcasing the beauty of all of Seattle.&lt;/p&gt;
&lt;p&gt;And next year the courses get even prettier! Starting in 2026, once
construction of the Elliott Bay Trail has been completed, we have agreements
to switch the last two miles of the races from Elliott Ave to the brand new
waterfront trail.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Instead of taking the nicer Elliott Bay Trail, the course routed over the
Magnolia Bridge, which caused significant traffic. My sense, though, is that
this wasn’t the “first choice” for the course, as (if memory serves) the course
was updated a few times after it was announced.&lt;/p&gt;
&lt;h3 id=&#34;turnarounds--loops&#34;&gt;Turnarounds &amp;amp; Loops&lt;/h3&gt;
&lt;p&gt;The other gripe I had with the course was that it relied more on turnarounds
than I would have preferred. As a Seattle-native runner, this didn’t bother me
too much; I was quite familiar with the course and the turnarounds felt logical
enough. If I were coming from out of town, this likely would have been more
annoying.&lt;/p&gt;
&lt;p&gt;The one &lt;em&gt;really unfortunate&lt;/em&gt; course element, though, was, again, north of
Interbay. At the intersection after the Emerson Pl Bridge, the course wanted you
to first turn right to do a loop up Gilman Ave, then drop back down to Fishermen’s
Terminal via Commodore Way. Then you go over the Emerson Pl Bridge &lt;em&gt;again&lt;/em&gt;, this
time turning left and heading down to Interbay.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/loop.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/loop.png&#34; alt=&#34;Course Map: Interbay Loop&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Course Map: Interbay Loop&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Having seen the course online, and knowing the area well, I found this reasonably
intuitive. However, I saw at least a dozen runners who&amp;rsquo;d taken the left on their
first pass and had to retrace their steps (sometimes for &amp;gt;1 mi) to do the loop
they’d accidentally skipped. There were course monitors at the intersection, but
the signage was still evidently insufficient.&lt;/p&gt;
&lt;h3 id=&#34;summary&#34;&gt;Summary&lt;/h3&gt;
&lt;p&gt;Overall I enjoyed the course quite a bit. The event management was really good:
the race started on time, the aid stations were plentiful/well-staffed, and the
finish line was on-par for post-race chaos. I think if they smooth out the
Interbay/Magnolia piece next year, this same course would be an excellent one to
repeat.&lt;/p&gt;
&lt;h2 id=&#34;my-experience&#34;&gt;My Experience&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: I finished in 3:24:34, a PR, down from my previous best, which was
~3:43.&lt;/p&gt;
&lt;h3 id=&#34;training&#34;&gt;Training&lt;/h3&gt;
&lt;p&gt;My training window for this race was rather long. I’d been “training” for months
longer than I needed to, partly so I could train alongside a friend who ran the San
Francisco Marathon in late July. My training started in earnest shortly after
that, and I used &lt;a href=&#34;https://www.runna.com/&#34;&gt;Runna&lt;/a&gt; for the first time to manage my
plan.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;My training was mostly uneventful until near the end. I’ll write more about
Runna at some point, but in short: it was really helpful in scheduling
progressive long runs, and I loved that it generated pace workouts that
helpfully sync to your running watch. I built up my weekly volume and distance
capacity without issue.&lt;/p&gt;
&lt;p&gt;However. In early November, I injured myself on a 20mi long-run. I started the
run feeling like something was “off” in my knee, but kept running on it. The
whole run was slow and painful, and it was the first time in my entire running
experience that I was unable to walk after a run. Not great. I took a week off
after that and wore a knee brace consistently until race day. Eventually, my
knee improved, but the ankle on the same leg got injured too &amp;ndash; possibly from
compensating. So I started wearing a compression brace on that ankle as well.&lt;/p&gt;
&lt;p&gt;I was becoming pessimistic that I’d actually run the race, since I didn’t want
to further injure myself and had to dramatically step down my training in the
final month &amp;ndash; skipping all my tempo runs, and limiting the distance of my last
long runs to well under the plan.&lt;/p&gt;
&lt;p&gt;In a moment of injury reprieve mixed with inspired consumerism, I bought a pair
of carbon “super shoes”:
&lt;a href=&#34;https://www.rtings.com/running-shoes/reviews/adidas/adizero-adios-pro-4&#34;&gt;Adidas Adios Pro 4&lt;/a&gt;&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;s.
The first time I put them on to go out and test them, I ended up running my
first ever sub-20min 5k. Was this a good idea? Probably not, but surprisingly it
didn’t aggravate my knee and was a strong mental win.&lt;/p&gt;
&lt;h3 id=&#34;race-day&#34;&gt;Race Day&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Pace groups&lt;/strong&gt;: I ran the first 18-19 miles with the 3:20 pace group, which
went really well. That helped a lot with not going too fast out of the gate. In
my first two races, I had the tendency to run really fast at the beginning and
burn out around mile 13, then struggle from 13-20, and be in a bad situation
from 20-26. Running with a pace group got me solidly to mile 18 without
overexertion, but still at my goal pace.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Fueling&lt;/strong&gt;: My fueling was definitely suboptimal. I brought 3 protein bars and a
caffeinated Verb bar with me. Unfortunately, I somehow “lost” two of the bars in
the bottom of my running vest, and didn’t want to stop to fish them out. The
result was that I ended up having a bunch of course gels. This was fine, but I got
“sweeted out” and had a sickly sweet feeling in my mouth from mile 18 on.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Shoes&lt;/strong&gt;: Running with carbon shoes definitely helped, but more in
the early sections where my legs were fresher. Qualitatively, I feel like they
gave me roughly a 30-second boost on my pace, which is nontrivial. However, the
last 6-8 miles were still pretty hard. My thighs started freezing up, which made
maintaining a sub-8:30 pace tricky. The Adios shoes have much less stability
than my usual Brooks Adrenalines, which led to a few close calls with rolling an
ankle as my legs fatigued.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The Last 6mi&lt;/strong&gt;: The phrase “a marathon is a 20mi warmup to a 6mi race” is
popular for a reason:&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/pace_chart.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/pace_chart.png&#34; alt=&#34;My pace chart for the race&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;My pace chart for the race&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Finishing &amp;amp; Post Race:&lt;/strong&gt; Marathons are such a mental game. In the last mile or
two, I had the feeling that I &lt;em&gt;could&lt;/em&gt; push myself to finish a few seconds
faster, but ultimately decided not to. Once I was confident that I’d finish
comfortably under 3:30, I “just” wanted to finish without further degrading my
form. This was probably a smart decision. As I stopped in the finish corral, I
had a few seconds of post-run adrenaline, which immediately crashed into “yup,
can’t really walk anymore”. It was also still quite a &lt;em&gt;cold&lt;/em&gt; day, and so I got
to use one of those fun mylar blankets for the first time.&lt;/p&gt;
&lt;h3 id=&#34;pictures&#34;&gt;Pictures&lt;/h3&gt;
&lt;p&gt;
&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/start_line.jpg&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/start_line.jpg&#34; alt=&#34;Start Line&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Start Line&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;


&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/union_bay.jpg&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/union_bay_hu16ca6deedf30a0b231fb97e04a8c9374_711634_0x1200_resize_q100_h2_lanczos.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/union_bay_hu16ca6deedf30a0b231fb97e04a8c9374_711634_0x1200_resize_q100_lanczos.jpg&#34; alt=&#34;Union Bay Natural Area&#34; width=&#34;450&#34; height=&#34;600&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Union Bay Natural Area&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;


&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/520_bridge.jpg&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/520_bridge.jpg&#34; alt=&#34;520 Bridge&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;520 Bridge&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;


&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/almost_at_the_finish.jpg&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/04/Race-Report-Seattle-Marathon-2025/almost_at_the_finish.jpg&#34; alt=&#34;Almost at the Finish&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Almost at the Finish&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;All in all, fun race. I&amp;rsquo;d do it again. :)&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I have yet to run or bike or walk over either the 520 or I90 bridges, a
glaring omission in my Seattle-native credibility.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Previously, I’d just search for a reasonable internet training plan
(invariably one of
&lt;a href=&#34;https://www.halhigdon.com/training/marathon-training/&#34;&gt;Hal Higdon&amp;rsquo;s plans&lt;/a&gt;),
copy that into a spreadsheet, futz with it until it felt “right”, and then
hang a printed version of that on the wall during “training season”.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;As a side note, RTings, better known for
TV/headphone/vacuum/consumer-electronics reviews, does surprisingly good
reviews for running shoes.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;This generalizes reasonably well to “a marathon is a $(26-N)$ mile warmup to
an $N$ mile race, for values of $N &amp;gt;= 13$”.&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Technical Escape Velocity</title>
        <link>https://benjamincongdon.me/blog/2025/12/03/Technical-Escape-Velocity/</link>
        <pubDate>Wed, 03 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/03/Technical-Escape-Velocity/</guid>
        <description>&lt;h2 id=&#34;1&#34;&gt;1.&lt;/h2&gt;
&lt;p&gt;When I was learning to program, I remember a specific phase transition in that
process of skill acquisition &amp;ndash; a distinct “before” and “after”, similar to how
when learning to read there’s the “before” of ignorance, the “after” of
effortlessly reading, and surprisingly little memory of the intermediate
struggle of learning.&lt;/p&gt;
&lt;p&gt;For programming, this phase transition was the point where I &lt;em&gt;felt&lt;/em&gt; confident
that I &lt;em&gt;could&lt;/em&gt;, in principle and given enough time, program anything that could
be programmed.&lt;/p&gt;
&lt;p&gt;I learned to code, initially, through a mixture of Java Minecraft plugins,
Objective-C OSX and iOS games, and physical programming books, which I&amp;rsquo;d
studiously read in my middle school’s designated reading time, trying to absorb
operating systems programming while&amp;hellip; staring at a book without access to a
computer.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
&lt;p&gt;This time spent grasping at concepts that I didn’t have a good handle on was
legitimately useful, but frustrating. I recall feeling the unwieldy uncertainty
of the unknown with how to take a simple idea I had (&amp;ldquo;program a &amp;lsquo;simon says&amp;rsquo;
game!&amp;rdquo;) and then turn that into a reality (arrays, display elements, event
loops). The number of unknown unknowns was too high to have a chance at finding
what I needed on StackOverflow.&lt;/p&gt;
&lt;p&gt;Then, there was some critical point where my mindset shifted. I’d built up
enough of a mental model of the key concepts behind “how to program a computer”
&amp;ndash; control flow, filesystem operations, network requests, parsing/serialization,
calling REST APIs, etc. &amp;ndash; that it felt like I could figure anything out given
sufficient time. I think there are two components to this mindset shift: (1)
having enough of the common patterns in your repertoire to make the “easy things
easy”, and (2) developing the meta-skill of knowledge acquisition (i.e. knowing
what search terms to use, having a reasonable grasp of enough of the CS/systems
theory to search for what you need) to make the “hard things possible”.&lt;/p&gt;
&lt;h2 id=&#34;2&#34;&gt;2.&lt;/h2&gt;
&lt;p&gt;As a handle for this mindset shift, I’ll call this &lt;strong&gt;“Technical Escape
Velocity”&lt;/strong&gt;. As in, you build up enough technical knowledge to bootstrap
yourself to escape the “gravity well” of ignorance.&lt;/p&gt;
&lt;p&gt;In orbital mechanics, escape velocity is the minimum speed needed to break away
from a celestial body’s gravity well without further propulsion. Technical
Escape Velocity (TEV) is admittedly a stretched metaphor, but I like it for the
following reason: when you reach escape velocity, you don’t &lt;em&gt;immediately&lt;/em&gt; leave
the gravity well, you just have sufficient conditions for &lt;em&gt;eventually&lt;/em&gt; exiting
the gravity well.&lt;/p&gt;
&lt;p&gt;Different problem domains have different gravity wells. A small webapp has a
smaller gravity well than a sprawling full-stack SaaS app, which in turn has a
smaller gravity well than “leading a 20 person engineering team through a
complex, layered distributed system migration”.&lt;/p&gt;
&lt;p&gt;For a given problem domain, you can eventually become proficient enough to solve
all the solvable problems within that domain. If you’re lucky, some of this
knowledge will even transfer between domains. &lt;strong&gt;Critically, you still only know
you’ve solved the problem once you have evidence of it actually being solved.
Just because you &lt;em&gt;think&lt;/em&gt; or &lt;em&gt;feel&lt;/em&gt; like you have TEV, doesn’t mean you actually
do.&lt;/strong&gt; Trickier problems with larger gravity wells require you to wait longer to
know if you’ve actually left orbit &amp;ndash; or if you’ll ultimately crash back to the
surface in failure.&lt;/p&gt;
&lt;h2 id=&#34;3&#34;&gt;3.&lt;/h2&gt;
&lt;p&gt;Fast forward to 2025: AI tooling has dramatically changed how this learning
process &lt;em&gt;can&lt;/em&gt; work today. As Zvi Mowshowitz often
&lt;a href=&#34;https://thezvi.substack.com/p/ai-135-openai-shows-us-the-money&#34;&gt;says&lt;/a&gt;, “AI is
the best tool ever made for learning, and also the best tool for not learning.”
Tools like Claude Code or Codex make the TEV for simple tasks effectively zero
&amp;ndash; “write a simon says game for me” is something that &lt;em&gt;any&lt;/em&gt; frontier AI model
should be able to do, essentially zero-shot. But it also enables you to use that
as a crutch: if you stay comfortably in the zero-TEV problem domains, then when
you hit a wicked problem, you will not have built up enough tacit knowledge and
experience to make the jump.&lt;/p&gt;
&lt;p&gt;In 2025, a coding agent can do a lot of your “tedious homework” as a developer.
It can also give you a pretty reasonable head-start on design and architecture
questions. This makes the TEV for low- to medium-difficulty tasks dramatically
lower than it had been 5 years ago.&lt;/p&gt;
&lt;p&gt;The class of problems with nontrivial TEV has contracted, to where now only the
most difficult, wicked problems have a high TEV. I’m not sure AI has
dramatically lowered the TEV of those problems. Large architectural decisions
that will impact the work of dozens of engineers and take years to fully
realize? Still hard. Still requires a lot of tacit knowledge, experience,
intuition, and long-horizon planning. AI can &lt;em&gt;help&lt;/em&gt;, but discerning slop from
reasonable strategy still requires a very high TEV. To be clear, I’d rather have
Claude Opus 4.5 on-hand to bang out prototypes and rubber-duck ideas with, but I
don’t think its long-horizon judgement is there yet.&lt;/p&gt;
&lt;p&gt;The highest-of-high TEV problems are the ones where you have a very weak,
delayed feedback signal. Projects where you have to wait months to see whether
structural bets you made paid off or not. Projects where you need to
course-correct midway through because of unforeseen circumstances. Strategic
actions taking &lt;em&gt;realpolitik&lt;/em&gt; into account. These problems are high TEV because
they aren’t purely technical &amp;ndash; they’re still systemic and involve a lot of
technical decision-making, but they also are deeply intertwined with
organizational, business, and market forces. It’s notoriously difficult to even
tell if humans are skilled in domains such as these, because teasing out luck
versus skill becomes nontrivial.&lt;/p&gt;
&lt;p&gt;Will AI be there in 3-5 years? My intuition is “no, unless we have another
architectural advance in long-term planning”. And yet, I&amp;rsquo;ve been quite surprised
over the past year at how quickly coding agents have been climbing the TEV
curve. It&amp;rsquo;s hard not to hedge in either direction on this; there is significant
uncertainty.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I suppose this allowed me to become more confident with pencil-and-paper
programming, a skill which was useful for precisely one advanced placement
exam, and nothing since.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Why Magnetos</title>
        <link>https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/</link>
        <pubDate>Tue, 02 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/</guid>
        <description>&lt;p&gt;I&amp;rsquo;ve recently started on the journey to get a private pilot’s license. One thing
I&amp;rsquo;ve enjoyed about the process so far is the extent to which you&amp;rsquo;re encouraged
to understand how most of the systems work at a fairly deep level. Contrast this
to driving a car, where you can mostly get away with &amp;ldquo;turn key, use gas &amp;amp; brake
pedals, don&amp;rsquo;t do anything stupid that would cause you to lose traction.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Most trainer aircraft still in use were designed (and sometimes built) in the
1960s or ‘70s. So the systems are relatively primitive. For example, there is no
onboard computer calculating the fuel mixture; you do that yourself. You have to
know what a carburetor is and does. You need to know about the fuel system &amp;ndash;
where exactly it&amp;rsquo;s stored, the rate of fuel burn, how air temperature and fuel
mixture and oil temperature impact engine performance. For more complicated
aircraft, you need to become familiar with the manifold pressure system that
uses a complicated oil-and-spring setup to control propeller pitch. This is all
part of the fun and appeal of aviation, at least for me.&lt;/p&gt;
&lt;p&gt;To oversimplify, turning on a plane like a Cessna 172 involves 3ish steps:
turning on the master power switch for the airplane’s electrical system, turning
on the power switch for the avionics systems, and turning on the plane’s
magnetos. The first two are usually toggle switches, whereas the magnetos are a
keyed switch, similar to a car&amp;rsquo;s ignition. You can roughly round your
understanding of magnetos to &amp;ldquo;it&amp;rsquo;s basically the &amp;lsquo;turn the engine on&amp;rsquo;
switch&amp;rdquo;&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;, but I was curious and wanted to learn more.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What are magnetos?&lt;/strong&gt; Magnetos are a self-contained electrical system for
powering the plane’s spark plugs. They use the mechanical power of the running
engine to produce enough voltage to keep the spark plugs sparking, independent
of the aircraft’s battery. They&amp;rsquo;re called magnetos because they use permanent
magnets rotating past wire coils to generate electrical current.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/magneto.png&#34;&gt;
        &lt;picture&gt;
            &lt;source srcset=&#34;https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/magneto_hu39d1d8dd1ca6fd847333af9aca00cd1c_493733_1200x0_resize_q100_h2_lanczos_3.webp&#34; type=&#34;image/webp&#34;&gt;
            &lt;img
                src=&#34;https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/magneto_hu39d1d8dd1ca6fd847333af9aca00cd1c_493733_1200x0_resize_lanczos_3.png&#34; alt=&#34;Crudely drawn, likely not fully accurate magneto diagram&#34; width=&#34;600&#34; height=&#34;331&#34;
                loading=&#34;lazy&#34;
                decoding=&#34;async&#34;
            &gt;
        &lt;/picture&gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Crudely drawn, likely not fully accurate magneto diagram&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Spark plugs need a reasonably high voltage (usually &amp;gt;10kV) to ionize the air to
create a spark. To achieve this high voltage, magnetos have two coils &amp;ndash; a primary
and a secondary &amp;ndash; which work similarly to a step-up
&lt;a href=&#34;https://en.wikipedia.org/wiki/Transformer&#34;&gt;transformer&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To generate the high voltage needed to produce a spark, a mechanism connected to
the engine timing system periodically disconnects the primary coil, collapsing
its magnetic field. This field collapse induces a higher voltage in the
secondary coil, using Faraday’s law of induction.&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
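&lt;p&gt;As a rough back-of-envelope illustration of the step-up (with entirely
hypothetical numbers, not taken from any aircraft manual), the ideal-transformer
relation &lt;em&gt;V_s = V_p &amp;times; (N_s / N_p)&lt;/em&gt; shows how a large turns ratio
gets a modest induced primary voltage past the ~10kV a spark plug needs:&lt;/p&gt;

```python
# Illustrative sketch only: ideal-transformer estimate of a magneto's
# secondary (spark plug) voltage. All numbers below are hypothetical.

def secondary_voltage(primary_volts, primary_turns, secondary_turns):
    """Ideal-transformer approximation: V_s = V_p * (N_s / N_p)."""
    return primary_volts * (secondary_turns / primary_turns)

# Suppose the collapsing primary field induces a few hundred volts in the
# primary coil; a turns ratio around 100:1 steps that up well past 10 kV.
v = secondary_voltage(primary_volts=250, primary_turns=180, secondary_turns=18000)
print(round(v))  # prints 25000
```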
&lt;p&gt;&lt;strong&gt;Why magnetos?&lt;/strong&gt; On essentially all modern internal combustion engines for
cars, the spark system is powered by the internal battery, which in turn is
charged by the alternator &amp;ndash; an electric generator which runs off the mechanical
energy generated by the running engine.&lt;/p&gt;
&lt;p&gt;Aircraft engines also have an alternator and battery system &amp;ndash; for powering
aircraft electronics like avionics and lights. Why not use those for the engine
spark plugs too? Primarily for isolation. Modern electronic ignitions are
coupled to the electrical system. If you have an electrical fire in the cockpit,
the first step is to turn off the master switch, cutting off all power to the
plane. If your engine relied on that power for its spark plugs, it would also
stop.&lt;/p&gt;
&lt;p&gt;Magnetos are mechanically coupled to the engine and propeller. As long as the
propeller continues to spin, the magnetos will continue to provide the energy
needed to keep firing, completely decoupled from the rest of the plane&amp;rsquo;s
electronics.&lt;/p&gt;
&lt;p&gt;Losing your lights and avionics if your battery or alternator fails is a
manageable emergency, but losing your engine is potentially catastrophic. The
total isolation of the electronics and engine systems makes a coupled failure
less likely.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Left and right magnetos:&lt;/strong&gt; Many engines have two sets of magnetos &amp;ndash; labeled
&amp;ldquo;left&amp;rdquo; and &amp;ldquo;right&amp;rdquo; &amp;ndash; with an option to have both enabled. The &amp;ldquo;both&amp;rdquo; setting is
what the plane usually operates with. Switching off one set of magnetos
noticeably decreases the RPM of the running engine. Why? Naively, one might think
that half the engine cylinders (e.g. the &amp;ldquo;left half&amp;rdquo;) use each set of magnetos.
This is not the case; rather, each set of magnetos is wired to every cylinder.
Each cylinder has two spark plugs, one connected to each of the two independent
magneto systems.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/Ignition_System_schematic.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/12/02/Why-Magnetos/Ignition_System_schematic.png&#34; alt=&#34;Aircraft magneto system schematic&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Aircraft magneto system schematic. &lt;a href=&#34;https://commons.wikimedia.org/wiki/File:Ignition_System_schematic.png&#34;&gt;Source: Pilot&amp;rsquo;s Handbook of Aeronautical Knowledge&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;strong&gt;Why two sets of magnetos?&lt;/strong&gt; Well, mostly for redundancy again. The engine can
still function quite well&lt;sup id=&#34;fnref:4&#34;&gt;&lt;a href=&#34;#fn:4&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;4&lt;/a&gt;&lt;/sup&gt; even if one of the two magneto systems is
inoperative. But there&amp;rsquo;s another reason: Recall how I mentioned earlier that
only enabling one set of magnetos reduces engine RPM, all else equal. Why?
Having two spark plugs in each cylinder improves combustion by igniting the
mixture at two places simultaneously, reducing the
&lt;a href=&#34;https://en.wikipedia.org/wiki/Flame_speed&#34;&gt;time the flame takes&lt;/a&gt; to expand
through the cylinder. It also makes the burn more even, reducing uneven strain
across the cylinder surface.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Further reading:&lt;/strong&gt;
&lt;a href=&#34;https://www.aopa.org/news-and-media/all-news/2019/december/flight-training-magazine/how-it-works-magneto&#34;&gt;How It Works: Magneto (AOPA)&lt;/a&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Technically, magnetos aren’t even really “turned on” since they’re passive
systems. Turning the switch from OFF to ON un-grounds the magnetos. When
OFF, the magneto coils are grounded to the airframe body, which prevents
them from generating electricity. &lt;em&gt;Actually&lt;/em&gt; turning on the engine still
requires the battery to power a starter motor.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Magnetos are a form of alternator that uses permanent magnets as the source
of the magnetic field used to generate current. Non-magneto alternators, on
the other hand, use an electromagnet as their magnetic field source. This
has the side effect that if your battery is dead, the alternator can’t
function, since there isn’t an initial “excitation” energy to energize the
coil, creating the field that then drives current.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;The ratio of the number of turns in each of the primary and secondary coils
determines how much the voltage is stepped up. To act as a step-up
transformer, the secondary coil has many more turns of wire than the primary
coil.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:4&#34;&gt;
&lt;p&gt;In flight, that is. You can safely return to the ground with just one
magneto. You definitely wouldn&amp;rsquo;t want to take off with only one.&amp;#160;&lt;a href=&#34;#fnref:4&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Schedule Recurring Calls With Your Far-Away Friends</title>
        <link>https://benjamincongdon.me/blog/2025/12/01/Schedule-Recurring-Calls-With-Your-Far-Away-Friends/</link>
        <pubDate>Mon, 01 Dec 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/12/01/Schedule-Recurring-Calls-With-Your-Far-Away-Friends/</guid>
        <description>&lt;p&gt;I enjoy conversations, particularly with people I care about. I also have a
social circle which is rather geographically dispersed. This, of course,
presents the problem of “how do I stay in touch with people?” Facebook &lt;em&gt;et al.&lt;/em&gt;
haven’t solved this problem in a satisfactory way for me. Discord / private
group chats are fine, but don’t feel socially fulfilling the way that 30
minutes of talking, even infrequently, often does.&lt;/p&gt;
&lt;p&gt;One solution I stumbled into a few years ago was: &lt;strong&gt;Set up recurring 1:1
meetings with distant friends.&lt;/strong&gt; Yes, like a work 1:1. Yes, with an actual
calendar invite with an actual Google Meet link.&lt;/p&gt;
&lt;p&gt;This is something I firmly believe is worth spending
&lt;a href=&#34;https://www.lesswrong.com/posts/wkuDgmpxwbu2M2k3w/you-have-a-set-amount-of-weirdness-points-spend-them-wisely&#34;&gt;weirdness points&lt;/a&gt;
on with the right set of people.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt; Twofold. First: Staying in touch with people is easier than meeting
people, and friendships/relationships become more nuanced with rich shared
context over time. Maintaining older friendships is just obviously valuable.
Second: Talking to interesting people is just fun.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On frequency:&lt;/strong&gt; I find this model works best somewhere in the
biweekly-to-bimonthly range. It’s probably better to meet too &lt;em&gt;infrequently&lt;/em&gt;
than too frequently. This is perhaps unintuitive if the goal is maintaining
relationships, but I think it’s better to have a glut of topics to discuss than
a scarcity &amp;ndash; and as long as you’re meeting every 3 months or so, you won’t hit
“reconnection awkwardness” territory.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On duration:&lt;/strong&gt; 30 minutes is a good baseline to start with. It’s short enough
that it’s not hard to schedule into an evening, or a weekend morning. It’s
enough time to catch up on life events from the past week(s) or month and to
talk through a few substantive topics, yet not so long that you feel pressure
to fill the time. Many of my chats are exactly 30 minutes, like a
work meeting; some of them are more free-form, lasting for an hour or two
depending on the context.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On scheduling, rescheduling, and following through:&lt;/strong&gt; Some people find it
easier (or possible/enjoyable/fun) to stick to strict schedules than others. As
a rule, I never get upset if someone needs to reschedule a chat or cancel last
minute. The schedule is just a forcing function for prompting some sort of
interaction &amp;ndash; even if that interaction ends up just being a quick exchange of
logistics and pleasantries.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;On what to talk about:&lt;/strong&gt; Normal rules for
“&lt;a href=&#34;https://www.goodreads.com/book/show/157981748-supercommunicators&#34;&gt;having good conversations&lt;/a&gt;”
apply. Most of the time, the conversation can just be about “what has your life
been like recently?”, but also: be a good conversationalist. Just because it’s a
remote, structured talk doesn’t mean that it has to be a rote or stilted
conversation. Talk about philosophy, ethics, existential dread, your
cats/dogs/hamsters, books, annoyances, accomplishments, admonitions, advice,
burgeoning hobbies, travel plans, creative hobbies, etc.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;I’m writing this post as a nudge in the direction of this being normalized. But
also so that the next time I ask someone to set one of these up, I have a handy
pointer to send. (Hey, future person! 👋)&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover: Girona, Spain&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Fifty Bits of Career Advice</title>
        <link>https://benjamincongdon.me/blog/2025/08/11/Fifty-Bits-of-Career-Advice/</link>
        <pubDate>Mon, 11 Aug 2025 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/08/11/Fifty-Bits-of-Career-Advice/</guid>
        <description>&lt;p&gt;As my team&amp;rsquo;s summer interns finished up their rotations this week, I had my
usual end-of-internship &amp;ldquo;AMA&amp;rdquo; 1:1s. It&amp;rsquo;s something I enjoy doing, but I realized
I was covering a lot of the same topics I&amp;rsquo;ve discussed with other previous
interns and early-career engineers over the years.&lt;/p&gt;
&lt;p&gt;So I thought it&amp;rsquo;d be useful to collect these recurring bits of advice into a
single place, both for my own reference and for anyone else who might find them
helpful.&lt;/p&gt;
&lt;h3 id=&#34;decisions-and-opportunities&#34;&gt;Decisions and opportunities&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;There is a natural &amp;ldquo;explore vs. exploit&amp;rdquo; dynamic in career decisions. Early
in your career, you&amp;rsquo;re well served by leaning more into the &amp;ldquo;explore&amp;rdquo;
direction: do as many internships as you have time for, try living in
different cities, try working on different teams, try working on
product-focused projects, try working on infrastructure projects.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Surprisingly few career decisions are true one-way-doors. This generalizes
outside your career as well.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;That said, all decisions are path dependent. The two-way door you step back
through won&amp;rsquo;t be the same one you left, in the same way that
&lt;a href=&#34;https://en.wikipedia.org/wiki/Heraclitus&#34;&gt;you can never step in the same river twice&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s helpful to gravitate towards where value accrues in an organization.
This is often not intuitive.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Similarly, it&amp;rsquo;s helpful to gravitate towards where the smartest people are
working. Cultivate a set of formal and informal mentors.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Similarly, it&amp;rsquo;s helpful to gravitate towards where there are obvious problems
that are low-hanging fruit to solve. For engineers: Environments where you
are engineer-hour-constrained instead of product-alignment-constrained are
much easier to progress quickly in. For managers and PMs, the reverse is
likely true.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s worth figuring out your tolerance for risk fairly early, since this has
implications for many things: timing of job hops, personal finance posture,
whether you prefer the (appearance of) safety of a large company or the risk
associated with doing a startup.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I find I rarely regret my choice after choosing between two paths with
reasonably similar &amp;ldquo;goodness&amp;rdquo;, even though these decisions are often the most
agonizing. After all, if the options weren&amp;rsquo;t similar in &amp;ldquo;goodness&amp;rdquo;, it&amp;rsquo;d be
an easy decision.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Making hard decisions doesn&amp;rsquo;t get easier; you just make more of them over
time and become more comfortable with the discomfort. You also eventually
build up your own
&lt;a href=&#34;https://benjamincongdon.me/blog/2022/05/18/Tools-for-Making-Difficult-Decisions/&#34;&gt;personal heuristics for tough decisions&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Careers are marked in seasons. When you feel a season change coming, it can
be tempting to hold out to avoid the discomfort of change. Sometimes you
initiate the season change (e.g. you seek out an exciting opportunity) or
sometimes the season change finds you (e.g. &amp;ldquo;oops, a few key people left and
now your org is in chaos&amp;rdquo;). Either way, you&amp;rsquo;ll spend less energy preparing
for and adapting to the new season than you will trying to preserve the old
one.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When receiving advice from others, note that sometimes you need to
&lt;a href=&#34;https://slatestarcodex.com/2014/03/24/should-you-reverse-any-advice-you-hear/&#34;&gt;reverse the direction of the advice&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When finding your first job or two, it&amp;rsquo;s easy to look at interviews as a way
of &amp;ldquo;convincing the company to hire you&amp;rdquo;. However: the phrase &amp;ldquo;interviews are
a two-way street&amp;rdquo; is a cliché for a reason. Interviews give you an important
(though necessarily incomplete) window into what the company is working on and
looking for, and you should try to maximize the amount of signal you get for
yourself out of interviews. Prepare questions in advance, ask them, and
write the answers down. This can help change your eventual decision to be
more informed and less purely vibes-based.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;organizations-leadership-and-culture&#34;&gt;Organizations, leadership, and culture&lt;/h3&gt;
&lt;ol start=&#34;13&#34;&gt;
&lt;li&gt;
&lt;p&gt;Truly great managers are hard to find. Value the good ones.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Truly healthy organizations are hard to find. Value the healthy ones.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s tempting to become &amp;ldquo;irreplaceable&amp;rdquo;. It&amp;rsquo;s far better to foster your
replacements. (Both for your own sake and the people that grow to replace
you)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The average tenure of an engineer at most big companies is no more than ~3
years. An implication of this is that you &lt;em&gt;will&lt;/em&gt; see turnover on whatever
team you join. Usually the sky is not falling even if a few folks leave
around the same time. If you have the stomach for it, stepping into a
departing senior engineer&amp;rsquo;s shoes is a great growth opportunity.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;At a steady state, 80% of your time when being a good tech lead is being
&lt;a href=&#34;https://www.astralcodexten.com/p/heuristics-that-almost-always-work&#34;&gt;the rock that says &amp;ldquo;That seems reasonable&amp;rdquo;&lt;/a&gt;.
About half of your value comes from (skillfully) being this rock, and the
other half comes from recognizing when it&amp;rsquo;s necessary to push back, and in
(skillfully) following through with doing so.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Many things are overrated. &amp;ldquo;Being in the room where the decisions are made&amp;rdquo;
is, if anything, underrated.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Organizations are like people: They have personalities, they have
insecurities, they have strengths and weaknesses and hopes and fears. Being
in tune with organizational psychology is a great marker of a senior
engineer. Influencing organizational psychology is a marker of staff+.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;How teams and individuals say &amp;ldquo;goodbye&amp;rdquo; to people departing the company
tells you a lot.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Promotions aren&amp;rsquo;t as much an indication of &amp;ldquo;job well done&amp;rdquo; as they are
recognition that you&amp;rsquo;re doing a qualitatively different job than you were
the last time you were re-leveled.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &amp;ldquo;good feeling&amp;rdquo; of getting a promotion fades much faster than you&amp;rsquo;d
think (for me, after about a day or two). Promotions are still worth pursuing
and achieving as career milestones. But they&amp;rsquo;re progress markers, not ends in
themselves.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Kindness goes a long way. In 5 years, the people you remember will be the
ones that helped you onboard, or bailed you out in a stressful oncall
rotation. You&amp;rsquo;ll spend comparatively less time thinking of the people who
were &amp;ldquo;merely&amp;rdquo; brilliant. (Thankfully, brilliance and kindness co-occur
suspiciously often)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You are &lt;em&gt;not&lt;/em&gt; the ultimate owner of your performance review. You can, and
very much should, advocate for yourself and put a substantial amount of
effort into writing the pieces that you control. However, it&amp;rsquo;s ultimately
your manager&amp;rsquo;s job to advocate on your behalf. If you have a good
manager, this is easy. If you don&amp;rsquo;t, you&amp;rsquo;ll need to do a lot of the
narrative setting yourself. In either case, managers are invariably
incredibly busy during perf season, so proactively making it easier for them
to advocate for you helps you both out.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Most SWE career ladders value &amp;ldquo;cross team collaboration&amp;rdquo; for senior and
above levels. As such, it&amp;rsquo;s helpful to develop relationships with people
outside of your immediate team. This is true for multiple reasons: first,
it&amp;rsquo;s a great way to get a larger sense of what the company is working on,
and second, it gives you a natural set of people to act as champions for
your work when you are in performance review season.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;execution-and-craftsmanship&#34;&gt;Execution and craftsmanship&lt;/h3&gt;
&lt;ol start=&#34;26&#34;&gt;
&lt;li&gt;
&lt;p&gt;Unlike all the code you wrote before you started your career, you can&amp;rsquo;t take
most of the code you write professionally along with you (unless you&amp;rsquo;re
doing professional open source work, which, if you do, cool!). This is a
good thing, since you get to
&lt;a href=&#34;https://benjamincongdon.me/blog/2025/04/08/Why-Developer-Tools/#:~:text=this%20developer%20had%20a%20quite%20positive%20view%20of%20it&#34;&gt;build things that outlast you&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;When you&amp;rsquo;re new on a team, you automatically have a valuable perspective: a
set of fresh eyes. Keep track of the papercuts and unintuitive things you
run into &amp;ndash; when onboarding, when learning about how your team&amp;rsquo;s systems
work, and so on &amp;ndash; and share them with your team. Many of these things will
already be known by them, but people develop an indifference towards
papercuts over time, and having a fresh set of eyes is a great way to
surface (and potentially fix) them.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Early in your career, there&amp;rsquo;s a lot of value in being the person who lets
nothing fall through the cracks. A few years into your career, it&amp;rsquo;s about
choosing which cracks can be safely ignored.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Relatedly, there will always be work that &lt;em&gt;never&lt;/em&gt; gets done. Even on the
most relaxed teams, there will always be an ever-growing backlog of P1 and P2
tasks. This is fine. It&amp;rsquo;s generally a waste of time to hunt down
every little TODO on the long tail of your team&amp;rsquo;s backlog.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Relatedly, it&amp;rsquo;s sometimes worth declaring bankruptcy on your team&amp;rsquo;s backlog.
Unless you&amp;rsquo;re a TL or manager you generally don&amp;rsquo;t get to make this decision,
but even as an IC it&amp;rsquo;s worth keeping a pulse-check on how much backlog debt
your team has at any given moment, and doing your part in managing this.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Having passionate users is a great way to get high quality feedback. Fixing
a papercut or two of a power user can earn you a years-long source of
feedback.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;That being said, it&amp;rsquo;s worth remembering that the most passionate users are
not representative of the majority of users, and that you will still need to
actively solicit feedback.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Actually use&lt;/em&gt; the tools/products/systems/libraries/etc. you build. This is
a necessary but insufficient condition for building something that is
useful.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You should probably invest a bit in your physical workstation. A good
keyboard, mouse, and monitor help both with ergonomics and with feeling a
sense of ownership over your setup.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You should probably invest a bit in your digital workstation. Yes, you
should pick a nice color scheme for your IDE. But you should also invest in
hyper-personalized configurations that make you faster at your job. Things
like terminal aliases for jumping between PRs/branches, small scripts for
common tasks, etc. These investments compound over time.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s worth learning a pure command-line text editor like vim or emacs. You
don&amp;rsquo;t need to be an expert in it, but (for the foreseeable future) there
will be times when you don&amp;rsquo;t have access to an IDE, like when SSHing into
remote machines, and you&amp;rsquo;ll be glad to have a comfortable fallback.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;self-management--mindset&#34;&gt;Self-management &amp;amp; mindset&lt;/h3&gt;
&lt;ol start=&#34;37&#34;&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s rarely worth making a decision you see as ethically or morally
compromising. Being able to live (easily! joyfully!) with your decisions is
a gift worth giving to yourself. To make this slightly less
&lt;a href=&#34;https://marginalrevolution.com/marginalrevolution/2023/01/ask-the-beast.html&#34;&gt;Straussian&lt;/a&gt;:
There is often money out there that is not worth taking. But you have to
figure out where to draw your own line.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you&amp;rsquo;re a natural &amp;ldquo;optimizer&amp;rdquo;, you should probably find a way to mindfully
and purposefully switch to a &amp;ldquo;satisficing&amp;rdquo; mindset when it&amp;rsquo;s helpful.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It will sometimes feel like
&lt;a href=&#34;https://archive.is/3CZUA&#34;&gt;&amp;ldquo;Everyone Is Getting Hilariously Rich and You&amp;rsquo;re Not&amp;rdquo;&lt;/a&gt;.
You&amp;rsquo;ll see former colleagues, classmates, and acquaintances win big in AI
startups, crypto, or $CURRENT_THING. Bubbles and the &amp;ldquo;leading edge of the
hype cycle&amp;rdquo; always create real winners alongside the noise, but the winners
are fewer and more random than they appear. Reminding yourself that this is
FOMO will save you a lot of grief.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;That &amp;ldquo;I don&amp;rsquo;t know what I&amp;rsquo;m doing&amp;rdquo; feeling early in your career is usually
followed shortly after by &amp;ldquo;I&amp;rsquo;m starting to get my footing!&amp;rdquo; Believe it or
not, you eventually start to crave this &amp;ldquo;beginner mindset&amp;rdquo;. The senior
engineers who you see switching teams or leaving for another company are
often going to &lt;em&gt;feel that way again&lt;/em&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s worth cultivating an online presence, even as an extremely slow burn
side project. Having a body of work online, especially writing you sink some
good thought into, is portable across companies, contexts, and even career
paths.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You are the ultimate owner of your career. The best managers and TLs enable
you to move in the direction you want to go, but you are the one that has to
set that direction.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You are the ultimate owner of your mental and physical health. You will need
to learn your early warning signs for burnout and take steps to mitigate
them. Take time off proactively rather than as a last resort.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;finances&#34;&gt;Finances&lt;/h3&gt;
&lt;ol start=&#34;44&#34;&gt;
&lt;li&gt;
&lt;p&gt;For short term financial planning, you should &lt;em&gt;mostly&lt;/em&gt; treat RSUs as
worthless paper until they vest. For medium term financial planning, I find
it helpful to do a
&lt;a href=&#34;https://en.wikipedia.org/wiki/Monte_Carlo_method&#34;&gt;Monte Carlo simulation&lt;/a&gt;
of various scenarios I think my current company is headed toward. There&amp;rsquo;s
some skill involved here, but it&amp;rsquo;s worth the effort to get a sense for how
to value your RSUs. This used to be an afternoon of work, but thankfully
with LLMs you can script something like this in a few minutes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;By default you should &lt;em&gt;probably&lt;/em&gt; sell your RSUs as soon as you can and
reinvest in a more diversified portfolio.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;That being said, if you&amp;rsquo;re conservative with finances, you&amp;rsquo;re more likely to
undervalue the future appreciation of RSUs (and thus, when comparing offers
you may undervalue the net present value of RSUs as well).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;You should be able to comfortably live off your salary (without any bonuses
or RSUs).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If your company does 401k matching, you should absolutely max out the match.
You should &lt;em&gt;probably&lt;/em&gt; max out your 401k contributions (when possible) as
well, as a set-and-forget way to do retirement planning until you have time
to develop a more nuanced plan for handling your finances.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;It&amp;rsquo;s worth doing a deep dive on &lt;em&gt;exactly&lt;/em&gt; how the tax system works.
Internalizing how capital gains and each type of retirement account are
taxed is a great way to avoid future unpleasant surprises.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If you work in tech, there&amp;rsquo;s ~no excuse for not keeping and fiercely
guarding an emergency fund of 6-12 months of expenses.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
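The Monte Carlo RSU valuation mentioned in point 44 can be sketched in a few lines. This is a minimal illustration, not financial advice: the function name, the scenario probabilities, drifts, and volatilities are all hypothetical numbers you would assign yourself, and a log-normal yearly return model is one simple assumption among many possible ones.

```python
import math
import random
import statistics

def simulate_rsu_value(grant_value, horizon_years, scenarios, n_trials=10_000):
    """Monte Carlo estimate of what an RSU grant might be worth at full vest.

    `scenarios` is a list of (probability, annual_drift, annual_volatility)
    tuples you assign yourself (e.g. bull / base / bear cases).
    """
    outcomes = []
    for _ in range(n_trials):
        # Pick a scenario according to its assigned probability.
        r = random.random()
        cumulative = 0.0
        for probability, drift, volatility in scenarios:
            cumulative += probability
            if r <= cumulative:
                break
        # Log-normal price path: one random yearly return per vesting year.
        value = grant_value
        for _ in range(int(horizon_years)):
            value *= math.exp(random.gauss(drift, volatility))
        outcomes.append(value)
    return statistics.median(outcomes), statistics.mean(outcomes)

# Hypothetical $100k grant vesting over 4 years, three hand-picked scenarios.
median_value, mean_value = simulate_rsu_value(
    100_000, 4,
    [(0.2, 0.20, 0.30),   # bull: +20%/yr average drift, high volatility
     (0.6, 0.05, 0.25),   # base: roughly market-like growth
     (0.2, -0.25, 0.40)], # bear: sustained decline
)
print(f"median: ${median_value:,.0f}  mean: ${mean_value:,.0f}")
```

The median is usually the more useful planning number here, since the mean is pulled up by a small number of extreme bull-case paths.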
&lt;p&gt;Your mileage will vary on all of this. What matters most is developing your own
heuristics over time. These are just mine, offered as a starting point. In a few
years, you&amp;rsquo;ll have your own version of this list, shaped by different choices
and different seasons.&lt;/p&gt;
&lt;p&gt;Cheers!&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>The Agency Gap</title>
        <link>https://benjamincongdon.me/blog/2025/07/31/The-Agency-Gap/</link>
        <pubDate>Thu, 31 Jul 2025 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/07/31/The-Agency-Gap/</guid>
        <description>&lt;p&gt;There&amp;rsquo;s a line in Ben Kuhn&amp;rsquo;s essay,
&lt;a href=&#34;https://www.benkuhn.net/impact/&#34;&gt;&amp;ldquo;Impact, agency, and taste&amp;rdquo;&lt;/a&gt;, that&amp;rsquo;s been
rattling around in my head lately. He describes impact as the practice of
&amp;ldquo;making success inevitable&amp;rdquo;. That phrase captures something important about how
to approach work.&lt;/p&gt;
&lt;p&gt;Kuhn draws a distinction between two ways of approaching a project or goal:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There&amp;rsquo;s a huge difference between the following two operating modes:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;My goal is to ship this project by the end of the month, so I&amp;rsquo;m going to
get people started working on it ASAP.&lt;/li&gt;
&lt;li&gt;My goal is to ship this project by the end of the month, so I&amp;rsquo;m going to
list out everything that needs to get done by then, draw up a schedule
working backwards from the ship date, make sure the critical path is short
enough, make sure we have enough staffing to do anything, figure out what
we&amp;rsquo;ll cut if the schedule slips, be honest about how much slop we need,
track progress against the schedule and surface any slippage as soon as I
see it, pull in people from elsewhere if I need them…&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;Mode 1, as Kuhn later points out, makes you a &amp;ldquo;leaky abstraction.&amp;rdquo; If the
project is important, someone &lt;em&gt;else&lt;/em&gt; has to worry about whether it will actually
succeed, constantly monitoring and figuring out how to resolve blockers that you
might not even see coming.&lt;/p&gt;
&lt;p&gt;Mode 2, on the other hand, is about &amp;ldquo;making success inevitable&amp;rdquo;. It&amp;rsquo;s about
taking true accountability for actually &lt;em&gt;achieving the goal&lt;/em&gt;, not just for &lt;em&gt;doing
the work&lt;/em&gt;.&lt;/p&gt;
&lt;h3 id=&#34;owning-the-outcome&#34;&gt;Owning the Outcome&lt;/h3&gt;
&lt;p&gt;This idea of &amp;ldquo;making success inevitable&amp;rdquo; struck me as a valuable mindset for
&lt;em&gt;anyone&lt;/em&gt; aiming to have a significant impact, regardless of their role. While
the responsibility for a project&amp;rsquo;s overall success might formally fall more
heavily on someone with a title that indicates this responsibility (e.g. a tech
lead (TL) or manager), the &lt;em&gt;practice&lt;/em&gt; of thinking and acting in Mode 2 is
something any engineer can develop.&lt;/p&gt;
&lt;p&gt;When I reflect on my own experience, transitioning within a team from an
individual contributor to TL role certainly made the Mode 2 thinking much more
explicit. Suddenly, the scope of concern broadened from &amp;ldquo;is my part done well?&amp;rdquo;
to &amp;ldquo;will this initiative/project land successfully?&amp;rdquo; With multiple people and
dependencies involved, you need to avoid becoming a leaky abstraction. The
feedback loops get less explicit, and you need to more frequently move from
tactical to strategic thinking.&lt;/p&gt;
&lt;p&gt;However, high-impact ICs usually exhibit these same traits. They don&amp;rsquo;t &amp;ldquo;just&amp;rdquo;
execute tasks; they anticipate problems, clarify ambiguity, proactively
communicate risks, plan future work, and drive things forward with a focus on
the ultimate objective. They internalize whatever the true end goal is and take
ownership beyond their immediate assigned slice.&lt;/p&gt;
&lt;p&gt;Kuhn highlights just how valuable this capability is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;People who can be trusted to make something inevitable are really rare, and
are typically the bottleneck for how many different things a team or company
can do at once. So if someone else is responsible for making your project
inevitable, you&amp;rsquo;re consuming some of that scarce resource; if you&amp;rsquo;re the one
making your own projects inevitable, you&amp;rsquo;re a producer of that resource, and
you&amp;rsquo;re helping unblock a key constraint for your team.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;While TLs and managers are often the &lt;em&gt;institutionally designated&lt;/em&gt; providers of
this resource, that doesn&amp;rsquo;t mean they always fulfill that role effectively, nor
does it mean ICs &lt;em&gt;can&amp;rsquo;t&lt;/em&gt;. We&amp;rsquo;ve all likely seen managers who operate in Mode 1,
pushing the burden of ensuring success onto their reports. Conversely, highly
effective ICs often step up, implicitly or explicitly, to make their own
projects (and sometimes those around them) inevitable, thus becoming producers
of that scarce resource.&lt;/p&gt;
&lt;h3 id=&#34;the-scarce-resource-agency&#34;&gt;The Scarce Resource: Agency&lt;/h3&gt;
&lt;p&gt;The scarce resource Kuhn is identifying is &lt;strong&gt;agency&lt;/strong&gt;. It&amp;rsquo;s the quality that
separates merely &amp;ldquo;doing work&amp;rdquo; (Mode 1) from &amp;ldquo;owning the outcome&amp;rdquo; (Mode 2).
Agency encompasses the initiative, proactiveness, resourcefulness, and sheer
relentlessness required to navigate ambiguity and push something across the
finish line, ensuring it actually achieves its intended goal. It doesn&amp;rsquo;t always
require inhabiting the &amp;ldquo;operator&amp;rdquo; mindset, as agency also includes the ability
to think strategically and to make tough decisions, but highly agentic people do
know when to deploy both the &amp;ldquo;operator&amp;rdquo; and &amp;ldquo;strategist&amp;rdquo; mindsets.&lt;/p&gt;
&lt;p&gt;While some people seem naturally more inclined towards high-agency behavior, I
strongly believe it&amp;rsquo;s also a skill that can be developed and practiced.
Interestingly, I&amp;rsquo;ve often observed that particularly promising engineers already
possess a significant degree of latent agency. They have the technical skills,
the problem-solving ability, and often the right intuitions. What they sometimes
lack isn&amp;rsquo;t the capability, but the permission or encouragement to fully exercise
it.&lt;/p&gt;
&lt;p&gt;Exercising agency, especially when it involves pushing back on initial plans,
identifying non-obvious risks, or proactively coordinating across teams, can
feel like overstepping boundaries or one&amp;rsquo;s formal job title. This is
particularly true for those earlier in their careers or newer to a team.
Creating an environment of psychological safety, where team members feel
explicitly empowered and encouraged by senior folks or leadership to take
initiative and question assumptions, is important for unlocking this potential
agency. Without that safety net, agency often remains latent.&lt;/p&gt;
&lt;h3 id=&#34;developing-agency&#34;&gt;Developing Agency&lt;/h3&gt;
&lt;p&gt;Striving for &amp;ldquo;inevitability&amp;rdquo;, as Kuhn frames it, isn&amp;rsquo;t about achieving
perfection or eliminating all risk. That&amp;rsquo;s clearly impossible in most nontrivial
areas of human endeavor. Instead, I think the real value lies in cultivating the
mindset itself.&lt;/p&gt;
&lt;p&gt;Adopting this agentic mindset takes conscious effort, especially initially. It
means spending more time up-front planning, anticipating, and communicating,
which can sometimes feel less immediately productive than jumping into writing
code, or whatever the immediate &amp;ldquo;work&amp;rdquo; may be. However, investing time in
up-front strategic thinking consistently pays off later. Having a strategy
results in less frantic firefighting, fewer deadline slips, and a generally
calmer, more predictable process for delivering impact.&lt;/p&gt;
&lt;p&gt;The benefits of cultivating personal agency extend beyond &amp;ldquo;merely&amp;rdquo; delivering
reliable outcomes to achieve some abstract team/company/personal OKR. There&amp;rsquo;s a
certain confidence and personal satisfaction that comes from knowing you&amp;rsquo;ve done
the work to truly understand the problem, anticipate hurdles, and steer yourself
toward success, rather than hoping things work out or leaning on someone else to
keep the project unblocked. Agency also generalizes well across different
domains of life. Developing agency in a professional context usually results in
a higher ability&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; to exercise agency in other contexts (e.g. personal,
social, relational). Professional contexts are a good environment for developing
agency too: in healthy workplaces, there is a clear feedback loop and ample
opportunity to exercise agency.&lt;/p&gt;
&lt;p&gt;In my experience, developing the agency to make an impact on your (sometimes
chaotic) environment feels like a worthwhile pursuit in itself. It also reflects
quite well on promotion packages, if that&amp;rsquo;s your jam.&lt;/p&gt;
&lt;h2 id=&#34;further-reading&#34;&gt;Further Reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Ben Kuhn&amp;rsquo;s &lt;a href=&#34;https://www.benkuhn.net/impact/&#34;&gt;Impact, Agency, and Taste&lt;/a&gt;,
which inspired this post.&lt;/li&gt;
&lt;li&gt;Cate Hall&amp;rsquo;s &lt;a href=&#34;https://usefulfictions.substack.com/archive&#34;&gt;Substack&lt;/a&gt; has many
good posts on identifying and building agency. Notably:
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://usefulfictions.substack.com/p/how-to-be-more-agentic&#34;&gt;How to be more agentic&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://usefulfictions.substack.com/p/maybe-youre-not-actually-trying&#34;&gt;Maybe you’re not Actually Trying&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I intentionally use the word &amp;ldquo;ability&amp;rdquo; here. I&amp;rsquo;m more confident that the
&lt;em&gt;skill&lt;/em&gt; of agency generalizes well than I am that the &lt;em&gt;propensity to use it&lt;/em&gt;
generalizes.
&lt;a href=&#34;https://usefulfictions.substack.com/p/maybe-youre-not-actually-trying&#34;&gt;This Cate Hall post&lt;/a&gt;
on selective agency describes this in more detail.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>When Red Buttons Aren&#39;t Enough</title>
        <link>https://benjamincongdon.me/blog/2025/06/16/When-Red-Buttons-Arent-Enough/</link>
        <pubDate>Mon, 16 Jun 2025 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/06/16/When-Red-Buttons-Arent-Enough/</guid>
        <description>&lt;p&gt;On June 12, 2025, most of GCP went offline. This led to downstream outages in a
multitude of websites and services, such as
&lt;a href=&#34;https://blog.cloudflare.com/cloudflare-service-outage-june-12-2025/&#34;&gt;Cloudflare&lt;/a&gt;,
&lt;a href=&#34;https://www.cbsnews.com/news/google-openai-spotify-experience-outage-affecting-tens-of-thousands-of-users/&#34;&gt;Spotify&lt;/a&gt;,
OpenAI, Anthropic, Replit, and many others.&lt;/p&gt;
&lt;p&gt;With a few days of hindsight, GCP published a quite detailed
&lt;a href=&#34;https://status.cloud.google.com/incidents/ow5i3PPK96RduMcb1SsW&#34;&gt;postmortem&lt;/a&gt;.
Frankly, I&amp;rsquo;m impressed by the depth of this PM and the quantity of technical
details that they released publicly. Given this level of detail, it&amp;rsquo;s feasible
to piece together a reasonably full picture of what happened and what ultimately
went wrong.&lt;/p&gt;
&lt;p&gt;In what follows, I&amp;rsquo;m only going on what was publicly released in the postmortem.&lt;/p&gt;
&lt;h3 id=&#34;the-underlying-cause&#34;&gt;The Underlying Cause&lt;/h3&gt;
&lt;p&gt;GCP has an internal service called &amp;ldquo;Service Control&amp;rdquo;, which handles policy
checks like authorization and quotas for the GCP APIs. A data-dependent bug was
introduced into this service that caused it to crash and fail closed when a
particular code-path was reached; this code path could be exercised by a
Google-driven global quota policy change. Since the bad code path was
data-dependent, it was never exercised as this change gradually rolled out to
production &amp;ndash; it was triggered all-at-once, globally, when a new global quota
policy version was applied.&lt;/p&gt;
&lt;p&gt;This matches some of the provisional rumors (later ~confirmed) that the outage
was &amp;ldquo;related to IAM&amp;rdquo;.&lt;/p&gt;
&lt;h3 id=&#34;lack-of-gradual-rollout&#34;&gt;Lack of Gradual Rollout&lt;/h3&gt;
&lt;p&gt;The biggest egg-on-face from this incident is the following admission, worth
quoting in full:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On May 29, 2025, a new feature was added to Service Control for additional
quota policy checks. This code change and binary release went through our
region by region rollout, but the code path that failed was never exercised
during this rollout due to needing a policy change that would trigger the
code. As a safety precaution, this code change came with a red-button to turn
off that particular policy serving path. The issue with this change was that
it did not have appropriate error handling nor was it feature flag protected.
Without the appropriate error handling, the null pointer caused the binary to
crash. Feature flags are used to gradually enable the feature region by region
per project, starting with internal projects, to enable us to catch issues. If
this had been flag protected, the issue would have been caught in staging.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So, effectively, the change that caused this global outage seemed to have never
been tested in production before its enablement. On its face, this seems&amp;hellip; bad.
Reading between the lines of this paragraph, it&amp;rsquo;s unclear to me to what extent
they&amp;rsquo;re throwing themselves under the bus. Saying &amp;ldquo;If this had been flag
protected, the issue would have been caught in staging&amp;rdquo; could be read as either
&amp;ldquo;we messed up and should have had this flag protected&amp;rdquo; or just literally &amp;ldquo;if
there was a flag, this wouldn&amp;rsquo;t have happened&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The change was hard to feature flag.&lt;/strong&gt; There &lt;em&gt;are&lt;/em&gt; very much times when it&amp;rsquo;s
quite difficult to feature flag changes. If a change requires a binary restart
to take effect, or if the change is part of a large refactor that doesn&amp;rsquo;t have
isolated pieces over which to place a flag, effectively feature flagging a
change can be quite difficult. Giving Google the benefit of the doubt, it&amp;rsquo;s
possible that this change was hard to flag, and so instead of using
standard flags, they relied on their &amp;ldquo;red button&amp;rdquo; mechanism for safety. I give
relatively low probability to this possibility, since the change was
able to be disabled with the &amp;ldquo;red button&amp;rdquo; and was data-dependent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The change relied on their &amp;ldquo;red-button&amp;rdquo;&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; instead of standard feature
flags.&lt;/strong&gt; The PM notes that &amp;ldquo;this code change came with a red-button to turn off
that particular policy serving path&amp;rdquo; and &amp;ldquo;Within 10 minutes, the root cause was
identified and the red-button (to disable the serving path) was being put in
place&amp;rdquo;. So it seems like the rollout plan for this feature did heavily rely on
the existence of this &amp;ldquo;red-button&amp;rdquo; for safety. The fact that it was identified
as the correct mitigation approach within 10 minutes updates me in the direction
that this &lt;em&gt;was&lt;/em&gt; their rollout safety plan: if it breaks, we&amp;rsquo;ll kill-switch it to
disable. Perhaps they thought the red-button would be quicker to take effect
than it actually was, and/or the change was low-risk enough that the red-button
approach was sufficiently safe.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;There was a process failure.&lt;/strong&gt; At least twice in the PM, it&amp;rsquo;s listed as a
follow-up that &amp;ldquo;We will enforce all changes to critical binaries to be feature
flag protected and disabled by default.&amp;rdquo; My knowledge of change management
practices at GCP is now stale, but I would strongly believe that it was already
a best practice to feature flag critical binaries &amp;ndash; for the policy to be
otherwise would be courting disaster, near malpractice for a set of services
running at their scale. I put higher weight on this being an individual or
team-specific process failure. There &lt;em&gt;should&lt;/em&gt; have been a plan to roll out this
code gradually, but either that plan had a faulty assumption (i.e. thinking that
the gradual binary rollout protected them, not knowing that the code wasn&amp;rsquo;t
actually being executed) or it was insufficiently paranoid (i.e. that the
red-button approach provided sufficient safety).&lt;/p&gt;
&lt;p&gt;Ultimately, these things happen. I&amp;rsquo;ve been cc&amp;rsquo;d on many PMs that have a similar
shape to this. Process compliance is a hard problem, because it grounds out in
the cultural norms and practices of the engineering environment &amp;ndash; down to
individual contributors and tech leads &amp;ndash; rather than a technical architectural
decision that can be made and enforced with a fully top-down approach.&lt;/p&gt;
&lt;h3 id=&#34;other-followups&#34;&gt;Other Followups&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Fail Open&lt;/strong&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We will modularize Service Control’s architecture, so the functionality is
isolated and fails open. Thus, if a corresponding check fails, Service Control
can still serve API requests.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This seems wise. For a single-point-of-failure system like Service Control,
adding isolation and (well-considered) means to fail-open makes sense.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Globally replicated data:&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We will audit all systems that consume globally replicated data. Regardless of
the business need for near instantaneous consistency of the data globally
(i.e. quota management settings are global), data replication needs to be
propagated incrementally with sufficient time to validate and detect issues.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This also seems quite wise, and honestly is what I&amp;rsquo;d consider to be the most
important followup from this incident, from what has been publicly
released. Consumption of globally replicated data is always going to be tricky.
Even if you have a feature flag, doing a globally impactful operation is always
going to be significantly riskier than doing that same action gradually.&lt;/p&gt;
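&lt;p&gt;As a toy sketch of what &amp;ldquo;propagated incrementally&amp;rdquo; means in practice (my own illustration, with hypothetical region names and callbacks, not anything taken from the PM), a staged propagation loop might look like:&lt;/p&gt;

```python
import time

REGIONS = ["region-a", "region-b", "region-c"]  # smallest blast radius first

def propagate_incrementally(apply_to_region, healthy, bake_seconds=0):
    """Toy sketch of incremental propagation for globally replicated data.

    apply_to_region(region): pushes the new data to one region.
    healthy(region): returns False if the region regressed after the push.
    Returns the list of regions successfully updated before any failure.
    """
    done = []
    for region in REGIONS:
        apply_to_region(region)
        time.sleep(bake_seconds)  # let metrics accumulate before judging health
        if not healthy(region):
            # stop the rollout instead of propagating a bad change globally
            break
        done.append(region)
    return done
```

&lt;p&gt;The key property is that a bad change stops at the first unhealthy region instead of reaching every region at once.&lt;/p&gt;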
&lt;p&gt;Google undoubtedly has the internal tools to do this sort of gradual propagation,
so I&amp;rsquo;m guessing that in this case the global nature of the data was an
intentional business requirement (i.e. quotas need to be updated globally within
a short window of time for &lt;em&gt;something something&lt;/em&gt; compliance reasons). If true,
reconsidering or challenging these requirements would also be helpful.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Incident Response:&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Within 2 minutes, our Site Reliability Engineering team was triaging the
incident. Within 10 minutes, the root cause was identified and the red-button
(to disable the serving path) was being put in place. The red-button was ready
to roll out ~25 minutes from the start of the incident. Within 40 minutes of
the incident, the red-button rollout was completed, and we started seeing
recovery across regions, starting with the smaller ones first.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The incident response here seems quite fast. Two minutes to first triage and ten
minutes to root cause is fast &amp;ndash; although, the global nature of this impact
assuredly got eyes on this quickly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Incident Communication:&lt;/strong&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We posted our first incident report to Cloud Service Health about ~1h after
the start of the crashes, due to the Cloud Service Health infrastructure being
down due to this outage.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Oof. First, the circular dependency here is rough, and I&amp;rsquo;m surprised there
wasn&amp;rsquo;t an easier way to provide an incident report through an air-gapped
channel. Second, we&amp;rsquo;ll never know, but I also wouldn&amp;rsquo;t be surprised if
institutional caution accounted for at least some of this delay. Especially with
such a substantial impact like this, figuring out who has the authority to
green-light the costly &amp;ldquo;everything is down&amp;rdquo; message would likely take some time.&lt;/p&gt;
&lt;p&gt;Google exposing this level of detail in a retrospective is a positive sign. It
rhymes with postmortems that I&amp;rsquo;ve seen in the past which, while painful,
resulted in cultural change that ultimately improved stability long term. Again,
I have no idea about the actual internals of what happened here &amp;ndash; it&amp;rsquo;s quite possible
that this rollout was a well-designed calculated risk which came up unlucky, or
it was a legitimate own-goal process blunder. In any case, the takeaways I have
are twofold: first, even the hyperscalers make mistakes; and second, that
kill-switches are necessary but not sufficient &amp;ndash; there is no substitute for
exercising new code gradually, to avoid global outages like this.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Obligatory: These are my own opinions and very much do not represent my
employer, etc.&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I think they effectively mean &amp;ldquo;kill switch&amp;rdquo; by &amp;ldquo;red button&amp;rdquo;?&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Why Developer Tools?</title>
        <link>https://benjamincongdon.me/blog/2025/04/08/Why-Developer-Tools/</link>
        <pubDate>Tue, 08 Apr 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/04/08/Why-Developer-Tools/</guid>
        <description>&lt;p&gt;I had the realization a year or so ago that much of the high-impact work I&amp;rsquo;ve
done in my career has been related &amp;ndash; directly or indirectly &amp;ndash; to building
developer tooling. I did not plan this, at all, but I’m quite happy to have
found impact in this niche.&lt;/p&gt;
&lt;p&gt;My first job outside of school was writing a developer tool for engineers
building the &lt;a href=&#34;https://console.cloud.google.com/&#34;&gt;Google Cloud Console&lt;/a&gt;, which is
the frontend for GCP and (at least at the time) was likely &lt;em&gt;the largest&lt;/em&gt;
&lt;a href=&#34;https://angularjs.org/&#34;&gt;Angular&lt;/a&gt; app in existence&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;. The team I joined had
been recently created and was building a set of tools to replace some
hilariously unergonomic tools that had organically grown along with the problem
space.&lt;/p&gt;
&lt;p&gt;The result was an amazing (if modestly overengineered) tool which was
legitimately delightful to use and resulted in a demonstrable productivity boost
for the &amp;gt;1k internal developers who used it. I recently ran into a current GCP
developer and was pleased to hear that this tool is still in active use &amp;ndash; and
not only that, but that this developer had a quite positive view of it.&lt;/p&gt;
&lt;p&gt;Fast forward a couple of years, and my current team at Databricks&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt; works on
internal infrastructure. More of our time is spent building
platform-level components for dynamic service configuration, but we also build
and maintain a set of tools for engineers to interface with these systems. And I
think developing the UX of these tools has probably been some of my
highest-leverage work thus far.&lt;/p&gt;
&lt;h2 id=&#34;why-devtools&#34;&gt;Why Devtools&lt;/h2&gt;
&lt;p&gt;So, why am I so enthusiastic about developer tools?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Developers will tell you their papercuts.&lt;/strong&gt; One of the best parts of building
tools for other engineers is that they readily tell you what’s not working. If a
developer has to interact with a painful tool or process regularly, you&amp;rsquo;ll find
out pretty quickly. This feedback is super useful, both at a product level (what
features to add/remove/improve) and at a meta level (what are developers
actually doing). I found this to be even more true when the tool was, well,
actually useful. When the tool becomes part of a developer&amp;rsquo;s daily workflow, you
start hearing very specific, granular feature requests and complaints, which are
great! We can&amp;rsquo;t always fulfill these requests, but in aggregate, they give a
directional sense of where to build next.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Great tooling is a force multiplier for the rest of the engineering
organization.&lt;/strong&gt; When internal developer tools are good, and they’re used
consistently, they improve the productivity and efficiency of all downstream
developers. This is notoriously difficult to measure quantitatively &amp;ndash; in
theory, there should be a measurable improvement in developer velocity, but this
is usually hard to get good numbers on. The real benefit is a &amp;ldquo;rising tide lifts
all boats&amp;rdquo; effect &amp;ndash; all developers on your team become tangibly better from
using a well-designed tool, and those that weren&amp;rsquo;t efficient at a particular set
of tasks become demonstrably more efficient using the tool.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;There is cultural leverage to building tooling.&lt;/strong&gt; Closely related to the force
multiplier effect is the cultural effect of improving shared tooling. Developer
&amp;ldquo;best practices&amp;rdquo; are just codified shared behaviors and tool usage. Tooling can
be designed to encourage certain practices by reducing the friction to &amp;ldquo;do
things the right way&amp;rdquo;. For instance, if your goal is to enforce that all code
changes be linted, you can add the linter as part of your standard git commit
hooks (reducing friction) at the same time as adding the lint check to be a PR
merge precondition (ensuring compliance).&lt;/p&gt;
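&lt;p&gt;As a minimal sketch of the friction-reducing half (my own illustration; the &amp;ldquo;lint&amp;rdquo; command is a hypothetical stand-in for whatever linter your team actually uses):&lt;/p&gt;

```python
# Sketch of a pre-commit hook: when installed as .git/hooks/pre-commit
# (and made executable), a nonzero return from run_hook() blocks the commit.
# The "lint" command below is a hypothetical stand-in for your real linter.
import subprocess

def staged_files():
    """Return the paths staged for the current commit."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True,
    )
    return [path for path in out.stdout.splitlines() if path]

def lint_argv(files, linter="lint"):
    """Build the linter invocation for the staged files."""
    return [linter, *files]

def run_hook():
    files = staged_files()
    if not files:
        return 0  # nothing staged to lint; allow the commit
    # the linter's exit status becomes the hook's exit status
    return subprocess.run(lint_argv(files)).returncode
```

&lt;p&gt;Pairing a hook like this with the same check as a PR merge precondition is what makes the practice both easy and enforced.&lt;/p&gt;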
&lt;p&gt;&lt;strong&gt;Infrastructure builders get to interact with a lot of other teams.&lt;/strong&gt; Building
internal tooling necessitates interfacing with many other teams to understand
their use cases. This was an unexpectedly pleasant part of my experience
building internal tools: I got to work with teams from all around my company,
which can make you more aware of what other parts of your company are working
on. There&amp;rsquo;s also just a nice social component to this; I&amp;rsquo;ve talked with likely
hundreds of engineers in my product&amp;rsquo;s support channel, so whenever I travel to a
new office I often run into familiar faces. This can also lead to opportunities
to improve coordination between teams, increase your exposure to novel projects,
discover opportunities for improvement of your tooling, and notice places where
&amp;ldquo;competing&amp;rdquo; tools can be consolidated.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Working primarily on developer tools and internal infrastructure wasn&amp;rsquo;t
something I planned, but it&amp;rsquo;s turned out to be a really satisfying niche.
There&amp;rsquo;s something uniquely rewarding about creating things that directly help
your peers get their work done more effectively. It has its own set of unique
challenges, but the leverage you get and the direct feedback loop make it quite
enjoyable.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;For good or bad.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Obligatory &amp;ldquo;opinions my own, not that of my employer&amp;rdquo;&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>The Models Want to Reason</title>
        <link>https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/</link>
        <pubDate>Wed, 12 Feb 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/</guid>
        <description>&lt;p&gt;Since I wrote about &lt;a href=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/&#34;&gt;COCONUT&lt;/a&gt;,
Meta&amp;rsquo;s paper on reasoning in latent space, there&amp;rsquo;s been a wave of publicly
accessible research into reasoning models. The most notable example, which
overshadows everything else to the point of feeling like I almost don&amp;rsquo;t need to
mention it as I write this in mid Feb 2025, was the
&lt;a href=&#34;https://arxiv.org/abs/2501.12948&#34;&gt;Deepseek R1 paper&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;While I think the R1 paper is great and deserved the attention it garnered,
there has been a steady stream of additional research into the reasoning space
that I think begins to paint an interesting picture of what comes next.&lt;/p&gt;
&lt;h2 id=&#34;reasoning-scales-down&#34;&gt;Reasoning Scales Down&lt;/h2&gt;
&lt;p&gt;R1 proved that a reasonably well-resourced lab can produce a reasoner for on the
order of $1M of compute, by current estimates. This leads to the phrase
&lt;a href=&#34;https://thezvi.substack.com/p/deepseek-dont-panic&#34;&gt;&amp;ldquo;V3 implies R1&amp;rdquo;&lt;/a&gt;: we now
have a methodology, in the public domain, by which a sufficiently good base model&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;
can be turned into an even better reasoning model.&lt;/p&gt;
&lt;p&gt;But it seems reasoning can scale down dramatically, as is shown by
&lt;a href=&#34;https://arxiv.org/pdf/2501.19393&#34;&gt;s1: Simple test-time scaling&lt;/a&gt;. They achieved
impressive results on reasoning tasks with a quite small dataset and minimal
training. There&amp;rsquo;s been a lot written about this work already, but it&amp;rsquo;s
interesting and worth a read.&lt;/p&gt;
&lt;p&gt;My summarization of the paper&amp;rsquo;s methodology:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Assemble a small (1k samples), but highly curated and filtered reasoning
dataset of CoT reasoning traces. This involved generating reasoning traces from
a more powerful model (Gemini Thinking), and filtering to include just the
highest-value samples.&lt;/li&gt;
&lt;li&gt;Fine-tune Qwen2.5-32B on the reasoning dataset using SFT.&lt;/li&gt;
&lt;li&gt;Use a novel test-time scaling method (&amp;ldquo;Budget Forcing&amp;rdquo;) to
dynamically control how much inference to do at test time. This budget policy
can terminate reasoning early, or extend the reasoning by adding &amp;ldquo;Wait&amp;rdquo;
tokens. Essentially, you give it a reasoning budget &amp;ndash; N tokens &amp;ndash; and
it will coerce the model into expending approximately N tokens before giving a
final answer.&lt;/li&gt;
&lt;/ol&gt;
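&lt;p&gt;As a rough illustration (my own toy sketch, not the paper&amp;rsquo;s implementation), the budget-forcing loop can be mocked up with a stand-in next-token function and a placeholder end-of-thinking token:&lt;/p&gt;

```python
END_OF_THINKING = "end_think"  # stand-in for the model's end-of-reasoning token

def budget_forced_decode(next_token, budget):
    """Toy sketch of s1-style budget forcing.

    next_token: maps the reasoning trace so far (a list of tokens) to the
    next token, returning END_OF_THINKING when the model wants to stop.
    budget: approximate number of reasoning tokens to spend.
    """
    trace = []
    while True:
        tok = next_token(trace)
        if tok == END_OF_THINKING:
            if len(trace) >= budget:
                break  # budget spent: let reasoning end
            trace.append("Wait")  # suppress the stop and force more reasoning
            continue
        trace.append(tok)
        if len(trace) >= budget:
            break  # budget exhausted: terminate reasoning early
    return trace
```

&lt;p&gt;With a toy model that tries to stop after three tokens, a budget of six pads the trace with &amp;ldquo;Wait&amp;rdquo; tokens until the budget is met; a budget of two cuts reasoning short.&lt;/p&gt;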
&lt;p&gt;The &amp;ldquo;budget&amp;rdquo; used at inference time is decided as a static parameter of the
decoding, and isn&amp;rsquo;t determined dynamically. In contrast, R1 emergently learned
during training time to utilize longer CoTs &amp;ndash; the ability to insert &amp;ldquo;wait&amp;rdquo; into
the CoT was something the model could trigger for itself, without needing an
external system to intervene. So it seems s1, to some extent, succeeds by adding
the inductive bias that particular problems benefit from throwing more tokens at
them. The authors then use a novel method to force the model to emit those
additional tokens.&lt;/p&gt;
&lt;p&gt;My first response to this was, &amp;ldquo;It would be interesting to design a system that
can determine how much reasoning it needs to perform for a given problem.&amp;rdquo; But I
think that&amp;rsquo;s just o1/r1/o3 and friends. This ability is learned naturally in
longer RL post-training.&lt;/p&gt;
&lt;p&gt;My second response was: &amp;ldquo;Wait&amp;rdquo; is the new
&lt;a href=&#34;https://arxiv.org/pdf/2205.11916&#34;&gt;&amp;ldquo;Let&amp;rsquo;s think step by step&amp;rdquo;&lt;/a&gt; 😀&lt;/p&gt;
&lt;h2 id=&#34;reasoning-with-latent-tokens&#34;&gt;Reasoning with Latent Tokens&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://arxiv.org/pdf/2502.03275v1&#34;&gt;Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning&lt;/a&gt;:
This paper is an interesting take at expanding the reasoning expressiveness of
models. Instead of having the models reason in latent space with non-discrete
intermediate outputs during inference (a la COCONUT), they train the model to
learn a codebook of discrete latent tokens during training and then use these
tokens during inference.&lt;/p&gt;
&lt;p&gt;During training, the model uses a vector-quantized variational autoencoder
(VQ-VAE) to learn a set of non-text reasoning tokens. To do this, they use three
components in their loss:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Reconstruction loss: This loss measures the ability of the VQ-VAE to
reconstruct the original input sequence from the set of quantized tokens it
produces.&lt;/li&gt;
&lt;li&gt;VQ Loss: This loss encourages the encoder to produce outputs that are close
to the learned codebook embeddings.&lt;/li&gt;
&lt;li&gt;Commitment Loss: This loss complements the VQ loss, and constrains the
VQ-VAE&amp;rsquo;s usage of the latent space. This tries to ensure that the outputs
remain quantized and don&amp;rsquo;t differ too much from the learned codebook.&lt;/li&gt;
&lt;/ol&gt;
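&lt;p&gt;Numerically, the combined objective is just a weighted sum of these three terms. Here is a toy sketch of my own, with plain lists standing in for tensors; note that in a real implementation the VQ and commitment terms have the same value and differ only in where the stop-gradient is applied:&lt;/p&gt;

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vqvae_loss(x, x_recon, z_e, codebook_vec, beta=0.25):
    """Toy combined VQ-VAE objective (illustration, not the paper's code).

    x / x_recon: original and reconstructed input sequence (as vectors).
    z_e: encoder output; codebook_vec: its nearest codebook embedding.
    """
    # 1. reconstruction loss: can the quantized tokens reproduce the input?
    recon = sq_dist(x, x_recon)
    # 2. VQ loss: pulls the codebook embedding toward the encoder output
    #    (a real implementation applies a stop-gradient to z_e here)
    vq = sq_dist(z_e, codebook_vec)
    # 3. commitment loss: keeps the encoder output close to the codebook
    #    (stop-gradient on the codebook side), weighted by beta
    commitment = beta * sq_dist(z_e, codebook_vec)
    return recon + vq + commitment
```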

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure3_1.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure3_1.png&#34; alt=&#34;Figure 3.1&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Figure 3.1 (&lt;a href=&#34;https://arxiv.org/pdf/2502.03275v1&#34;&gt;Arxiv&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;During training, the VQ-VAE is shown the entire input sequence, but is only
applied to the CoT tokens. The CoT sequences are gradually replaced with
corresponding latent tokens. This, along with randomized replacement of text
tokens with latent tokens, allows the model to adapt to using the latent tokens.&lt;/p&gt;
&lt;p&gt;At test time, the model uses these &amp;ldquo;Z&amp;rdquo; latent tokens in its CoT. The VQ-VAE is
only used for learning the tokens, and isn&amp;rsquo;t needed during inference.&lt;/p&gt;
&lt;p&gt;As for results, the model showed an increase in reasoning ability, both in- and
out-of-domain of the datasets they trained on. The increase in ability seems
relatively small compared to other reasoning methods (~4% improvement vs.
traditional CoT). The more interesting result is that the model was able to
improve slightly in performance while using ~17% fewer tokens in its reasoning
trace. So it seems like this method allows the model to develop a more efficient
vocabulary for reasoning.&lt;/p&gt;
&lt;p&gt;Overall, I&amp;rsquo;m not sure that the performance increase shown in &lt;em&gt;Token Assorted&lt;/em&gt; is
worth losing the visibility into the CoT.&lt;/p&gt;
&lt;h2 id=&#34;reasoning-with-recurrence-in-latent-space&#34;&gt;Reasoning with Recurrence in Latent Space&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;https://arxiv.org/abs/2502.05171&#34;&gt;Scaling up Test-Time Compute with Latent Reasoning&lt;/a&gt;:
This paper introduces a novel way of reasoning in latent space, using a
depth-recurrent transformer. Their model, &amp;ldquo;Huginn&amp;rdquo;, combines a traditional
decoder-only transformer with a recurrent block, which can be iterated multiple
times during decoding before emitting a token &amp;ndash; allowing for the &amp;ldquo;reasoning&amp;rdquo; in
latent space. Like s1&amp;rsquo;s budget policy, the number of thinking iterations can be
selected at test time.&lt;/p&gt;
&lt;p&gt;The high-level generation approach is, roughly, embed the input sequence into
the latent space using a transformer layer (&amp;ldquo;Prelude&amp;rdquo;), then iterate this
transformed input through a recurrent block for some number of iterations, then
un-embed the final state into an output token (&amp;ldquo;Coda&amp;rdquo;).&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure2.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure2.png&#34; alt=&#34;Figure 2&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Figure 2 (&lt;a href=&#34;https://arxiv.org/abs/2502.05171&#34;&gt;Arxiv&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;There are a couple really cool results about this approach:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The recurrent loops can be seen as a &lt;strong&gt;convergence&lt;/strong&gt; process, where the model
is able to spend more &amp;ldquo;effort&amp;rdquo; generating a token. More iterations result in
greater certainty in the output.&lt;/li&gt;
&lt;li&gt;The convergence framing allows for &lt;strong&gt;adaptive compute at inference time&lt;/strong&gt;. By
specifying a threshold difference between successive states in the recurrent
portion, the model can keep iterating until it has converged on a token &amp;ndash;
allowing for more iterations on &amp;ldquo;difficult&amp;rdquo; tokens and fewer on &amp;ldquo;easier&amp;rdquo;
tokens that converge more quickly.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As a hand-wavy intuition for this, assume you have some &amp;ldquo;distance&amp;rdquo; measure &lt;code&gt;D&lt;/code&gt;
that lets you quantify the difference between two of the hidden states used in
the recurrence block. You can measure how much the model updates its
&amp;ldquo;belief&amp;rdquo; on each recurrent iteration by computing &lt;code&gt;D(r_i, r_{i+1})&lt;/code&gt;. If
the model is converging, the value of &lt;code&gt;D(r_i, r_{i+1})&lt;/code&gt; will shrink as
&lt;code&gt;i&lt;/code&gt;, the number of iterations, increases. You can then set some
threshold &lt;code&gt;T&lt;/code&gt; below which you stop iterating. Thus, if the model is
uncertain and needs many iterations before &lt;code&gt;D&lt;/code&gt; falls under &lt;code&gt;T&lt;/code&gt;,
it has time to take them; but once the model has converged, you don&amp;rsquo;t waste
compute on further iterations.&lt;/p&gt;
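&lt;p&gt;As a toy sketch of this early-exit idea (my own illustration, not the paper&amp;rsquo;s implementation &amp;ndash; the &lt;code&gt;distance&lt;/code&gt; function and the contraction-map &amp;ldquo;recurrent block&amp;rdquo; are made-up stand-ins):&lt;/p&gt;

```python
import numpy as np

def distance(a, b):
    # One possible choice of D: relative change between successive hidden states.
    return np.linalg.norm(b - a) / (np.linalg.norm(b) + 1e-8)

def iterate_until_converged(state, step, threshold=1e-3, max_iters=64):
    # Keep applying the recurrent step until successive states stop moving.
    for i in range(max_iters):
        new_state = step(state)
        if distance(state, new_state) < threshold:
            return new_state, i + 1  # converged early: stop spending compute
        state = new_state
    return state, max_iters  # hit the compute budget without converging

# Made-up "recurrent block": a contraction toward a fixed point, so it converges.
fixed_point = np.ones(4)
step = lambda s: fixed_point + 0.5 * (s - fixed_point)

final, iters = iterate_until_converged(np.zeros(4), step)
```

&lt;p&gt;A &amp;ldquo;hard&amp;rdquo; token would start far from its fixed point and burn more iterations before the loop exits; an &amp;ldquo;easy&amp;rdquo; one exits almost immediately.&lt;/p&gt;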
&lt;p&gt;There are two figures from the paper that illustrate this well:&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure19.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure19.png&#34; alt=&#34;Figure 19&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Figure 19 (&lt;a href=&#34;https://arxiv.org/pdf/2502.05171&#34;&gt;Arxiv&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;In Figure 19, we see the model processing a query, with colors representing how
much the model updated its hidden state with each recurrent iteration. The
darker colors roughly indicate high convergence, and the lighter colors indicate
low convergence. For most tokens in the sequence, the model converges quickly.
The tokens between sentences, and those in “Faust”, seem to indicate lower
confidence, as even after several iterations the model doesn’t settle into a
firm prediction.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure12.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2025/02/12/The-Models-Want-to-Reason/figure12.png&#34; alt=&#34;Figure 12&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Figure 12 (&lt;a href=&#34;https://arxiv.org/pdf/2502.05171&#34;&gt;Arxiv&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;In Figure 12, we see a lower-dimensional projection (via PCA) of the trajectory
of the latent space state for particular tokens. This is a way to visualize how
the hidden state evolves as it goes through more iterations. There are a bunch
of interesting behaviors here. In the first row, we see a straightforward
convergence to the center. For the second row, the token is the number &lt;code&gt;3&lt;/code&gt;, and
the model exhibits an &amp;ldquo;orbiting&amp;rdquo; pattern. And for the third row, the token
oscillates/slides in the second component. The paper authors infer that this
learned orbiting and sliding behavior allows handling of more advanced concepts.
(All of this was learned emergently and wasn&amp;rsquo;t part of the training objective!)&lt;/p&gt;
&lt;p&gt;Another interesting aspect of this paper is that they do not require
specialized datasets containing CoT for their training. The training procedure is
relatively involved, so my summary will not do it justice. However, at a high
level, they run the recurrent block for a random number of iterations during
training. So the model isn&amp;rsquo;t &amp;ldquo;locked&amp;rdquo; into a particular number of
iterations &amp;ndash; it is pushed to learn to use an arbitrary number of iterations
on any particular generation.&lt;/p&gt;
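&lt;p&gt;A loose sketch of the random-depth training idea (the sampling distribution and the toy recurrent step below are my own assumptions, not the paper&amp;rsquo;s exact recipe):&lt;/p&gt;

```python
import numpy as np

def unroll(state, step, n):
    # Run the recurrent block n times; the depth n varies per training example.
    for _ in range(n):
        state = step(state)
    return state

def sample_depths(batch_size, log_mean=3.0, log_sigma=0.5, cap=64):
    # Draw per-example recurrence depths from a heavy-tailed distribution,
    # so training covers both shallow and deep unrollings (clipped to a cap).
    raw = np.random.lognormal(mean=log_mean, sigma=log_sigma, size=batch_size)
    return np.clip(np.round(raw).astype(int), 1, cap)

depths = sample_depths(8)
# Toy stand-in for the recurrent block, unrolled to each sampled depth.
states = [unroll(np.zeros(2), lambda s: 0.5 * s + 1.0, int(n)) for n in depths]
```

&lt;p&gt;Because the loss is applied after depths the model can&amp;rsquo;t predict, it has to produce useful states at every iteration count rather than specializing to one.&lt;/p&gt;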
&lt;p&gt;&lt;strong&gt;Zero-Shot Continuous CoT:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;By reusing the hidden state from the recurrence block between token decoding
steps, the model can use this hidden state as a reasoning &amp;ldquo;scratch pad&amp;rdquo;. For
example, when working on a task that requires reasoning, reusing this hidden
state seems to let the model carry over some of the partial work it has done on
token &lt;code&gt;i&lt;/code&gt;, so that when decoding token &lt;code&gt;i+1&lt;/code&gt; it can recycle
that work and needs fewer iterations to converge.&lt;/p&gt;
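&lt;p&gt;A toy illustration of why warm-starting helps (again with made-up contraction maps standing in for the recurrent block): if adjacent tokens have nearby fixed points, starting token &lt;code&gt;i+1&lt;/code&gt; from token &lt;code&gt;i&lt;/code&gt;&amp;rsquo;s final state converges in fewer iterations than a cold start.&lt;/p&gt;

```python
import numpy as np

def run_recurrence(state, step, threshold=1e-3, max_iters=64):
    # Iterate until successive states stop changing much; return the
    # final state and the number of iterations that took.
    for i in range(max_iters):
        new_state = step(state)
        if np.linalg.norm(new_state - state) < threshold:
            return new_state, i + 1
        state = new_state
    return state, max_iters

# Made-up recurrent blocks for two adjacent decoding steps: their fixed
# points are close, so partial work on token i transfers to token i+1.
step_i      = lambda s: np.array([1.0, 1.0]) + 0.5 * (s - np.array([1.0, 1.0]))
step_i_next = lambda s: np.array([1.1, 0.9]) + 0.5 * (s - np.array([1.1, 0.9]))

cold_init = np.zeros(2)
state_i, _ = run_recurrence(cold_init, step_i)

_, iters_cold = run_recurrence(cold_init, step_i_next)  # fresh scratch pad
_, iters_warm = run_recurrence(state_i, step_i_next)    # reuse token i's state
```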
&lt;p&gt;&lt;strong&gt;Zero-Shot Self-Speculative Decoding:&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Speculative decoding is a performance optimization which uses a cheaper &amp;ldquo;draft&amp;rdquo;
model to reduce the cost of decoding a more expensive model. This recurrent
model gets you something similar to speculative decoding &amp;ldquo;for free&amp;rdquo; without a
draft model. To do this, you can decode with a small number of iterations &lt;code&gt;N&lt;/code&gt;
and treat that as the draft model, then decode with a larger number of iterations
&lt;code&gt;M&lt;/code&gt; as the ground-truth model. (So, &lt;code&gt;M &amp;gt; N&lt;/code&gt;.) This has two efficiencies: First,
since the draft and ground-truth model use the same weights, you don&amp;rsquo;t have to
have two models loaded during inference. Second, the hidden states calculated in
the draft model (&lt;code&gt;N&lt;/code&gt; iterations) can be reused by the ground-truth model when
decoding from &lt;code&gt;[N+1 -&amp;gt; M]&lt;/code&gt; iterations.&lt;/p&gt;
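&lt;p&gt;A toy sketch of the draft/verify loop (the &lt;code&gt;recur&lt;/code&gt; and &lt;code&gt;emit&lt;/code&gt; functions are invented stand-ins for the recurrent block and the Coda, and a real implementation would verify several draft tokens per batched pass):&lt;/p&gt;

```python
def recur(state, token, n_iters):
    # Toy "recurrent block": contracts toward a token-dependent fixed point.
    target = (token * 7 % 10) / 10.0
    for _ in range(n_iters):
        state = target + 0.5 * (state - target)
    return state

def emit(state):
    # Toy "Coda": quantize the latent state into a token id.
    return int(round(state * 10))

def self_speculative_decode(first_token, length, n_draft=4, n_full=16):
    tokens, accepted = [first_token], 0
    state = 0.0
    for _ in range(length):
        draft_state = recur(state, tokens[-1], n_draft)  # cheap draft pass
        draft = emit(draft_state)
        # The verify pass reuses the draft's hidden state instead of
        # restarting, so only the remaining iterations are paid for.
        full_state = recur(draft_state, tokens[-1], n_full - n_draft)
        full = emit(full_state)
        if draft == full:
            accepted += 1      # draft was right: it could have been emitted early
        tokens.append(full)    # the full-depth token is always the one kept
        state = full_state
    return tokens, accepted

tokens, accepted = self_speculative_decode(first_token=3, length=5)
```

&lt;p&gt;Since the draft and ground-truth passes share weights and hidden states, the only extra cost of drafting is the risk of a rejected token.&lt;/p&gt;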
&lt;p&gt;&lt;strong&gt;Zero-Shot Adaptive Compute at Test-Time:&lt;/strong&gt; As discussed earlier, you can use a
convergence threshold &lt;code&gt;T&lt;/code&gt; when calculating the difference between recurrent
states, and stop after the threshold is reached. This allows per-token adaptive
compute, allowing the model to spend more compute on harder tokens.&lt;/p&gt;
&lt;h2 id=&#34;the-bitter-lesson&#34;&gt;The Bitter Lesson&lt;/h2&gt;
&lt;p&gt;With the release of o1/r1/o3, I’ve seen the sentiment of “Hey, this is further
proof of the
&lt;a href=&#34;http://www.incompleteideas.net/IncIdeas/BitterLesson.html&#34;&gt;bitter lesson&lt;/a&gt;.
Scaling English CoT is all you need to reach AGI”. Along with variants of: Mamba
was a dead end, MCTS was a dead end, COCONUT is a dead end.&lt;/p&gt;
&lt;p&gt;Mamba for sure hasn’t started “working” yet at the frontier, and I don’t see
that changing soon. I don’t think MCTS should be ruled out yet, though it also
hasn’t had a breakout moment. It’s still quite early for latent reasoning. The
model development pipeline is long, and I’m confident that frontier labs are
trying to make latent reasoning work — for the efficiency gains, if nothing
else. In a world where inference costs will increasingly be a bottleneck,
shaving off that much compute while maintaining or even
slightly improving performance is quite appealing.&lt;/p&gt;
&lt;p&gt;The initial takeaway from this wave of reasoning research is that we&amp;rsquo;re still in
the early stages of understanding how to make models think effectively. While
English CoT has set a high bar, early results in latent reasoning and adaptive
compute suggest there&amp;rsquo;s still plenty of low-hanging fruit in making models
reason more efficiently. The bitter lesson can point us where we&amp;rsquo;ll end up
(&amp;ldquo;simple&amp;rdquo; architectures in hindsight, scaled massively), but there&amp;rsquo;s still much
to learn about the best way to get there.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover image by Recraft v3.&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;It seems like the cutoff for models being able to learn reasoning ability is
roughly in the 1.5B - 3B parameter range. But also the Qwen models seem much
more capable of being transformed into reasoners than Llama, so there&amp;rsquo;s
probably something other than raw scale going on here.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>How I Use AI: Early 2025</title>
        <link>https://benjamincongdon.me/blog/2025/02/02/How-I-Use-AI-Early-2025/</link>
        <pubDate>Sun, 02 Feb 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/02/02/How-I-Use-AI-Early-2025/</guid>
        <description>&lt;p&gt;&lt;em&gt;Previously:
&lt;a href=&#34;https://benjamincongdon.me/blog/2024/07/21/AI-Tools-in-Mid-2024/&#34;&gt;Mid 2024&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The landscape of AI tooling continues to shift, even in the past half year. This
is not unexpected. This post is an updated snapshot of the “state of things I
use”.&lt;/p&gt;
&lt;h2 id=&#34;tools&#34;&gt;Tools&lt;/h2&gt;
&lt;h3 id=&#34;work&#34;&gt;Work&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://code.visualstudio.com/docs/copilot/copilot-edits&#34;&gt;Copilot Edits&lt;/a&gt;:&lt;/strong&gt;
This feels roughly 85% as effective as Cursor, but the ability to
incorporate enterprise code context makes it roughly on par. &lt;strong&gt;I am shocked
that I never hear anyone talking about this.&lt;/strong&gt; My general workflow is: load
in up to 10 files into the working context, and ask for changes. It can
change multiple files at a time. It feels a bit like magic when it works
right.
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve used it on languages that are not well covered by LLMs &amp;ndash; Scala,
Rust &amp;ndash; and the results are surprisingly usable. The quality is good
enough that I&amp;rsquo;ve started to reach for this first for most tasks.&lt;/li&gt;
&lt;li&gt;With Rust, I sometimes need to step in and help the model when it gets
stuck. I&amp;rsquo;ve had to point out that it&amp;rsquo;s not making progress, or defer to
a reasoning LLM to get past a logical impasse.&lt;/li&gt;
&lt;li&gt;Copilot now allows you to set custom instructions, similar to Cursor. I
have built up custom language-specific instructions so that I get
outputs that more consistently match the idioms and style of my
company&amp;rsquo;s / team&amp;rsquo;s codebase.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://www.anthropic.com/claude/sonnet&#34;&gt;Claude 3.5 Sonnet&lt;/a&gt; New (via
&lt;a href=&#34;https://www.anthropic.com/news/claude-pro&#34;&gt;Claude Pro&lt;/a&gt;):&lt;/strong&gt; (a.k.a Sonnet
3.6, newsonnet) Sonnet 3.5 remains my daily driver and all around favorite
model. In Claude Pro, the &amp;ldquo;Projects&amp;rdquo; feature is amazing. In any given week,
I write several design documents, PRDs, announcements, one-pagers, etc. With
Projects, I can dump in relevant context documents from related projects,
iterate rapidly on writing, and have Claude output suggestions in a style
that matches my &amp;ldquo;organic&amp;rdquo; writing. The process looks something like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Paste in a collection of relevant documents: design docs, meeting notes,
prior art, public documentation, etc.&lt;/li&gt;
&lt;li&gt;Use a custom writing style to &amp;ldquo;write as me&amp;rdquo; (more on that in the
Techniques section).&lt;/li&gt;
&lt;li&gt;Iterate on the prompt, refining the output until it&amp;rsquo;s nearly publishable.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The quality of the output is often good enough that I can copy/paste entire
sections into design documents with only minimal editing. This works better
in some contexts than others, but for non-thinking-heavy sections like
&amp;ldquo;Background&amp;rdquo; or &amp;ldquo;Overview&amp;rdquo; sections, I can usually get great outputs.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://notebooklm.google.com/&#34;&gt;NotebookLM&lt;/a&gt;:&lt;/strong&gt; Before I started using
Claude Pro, NotebookLM was my go-to for working with a large corpus of
documents. It&amp;rsquo;s tightly integrated into Google Workspace, which is
convenient. I can dump in 20+ documents and ask questions about them as a
corpus. Gemini just isn&amp;rsquo;t as strong as a writer, so I don&amp;rsquo;t use the output
of NotebookLM much.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;ChatGPT 4o:&lt;/strong&gt; 4o feels like an outdated model at this point, but you still
get unlimited use with the ChatGPT Pro plan, and the UX for
ChatGPT-for-macOS is pretty great. &lt;code&gt;Option+Space&lt;/code&gt; to get a ChatGPT window is
a killer feature. By pure invocation/conversation count, 4o is probably my
most used model &amp;ndash; though most of the queries look more like Google searches
than conversations.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://github.com/simonw/llm&#34;&gt;llm&lt;/a&gt; (CLI tool):&lt;/strong&gt; This has become
indispensable for quick, one-off tasks. It&amp;rsquo;s great for drafting git commit
messages, reformatting text, etc. It&amp;rsquo;s hard to really write about &lt;em&gt;what&lt;/em&gt; I
use &lt;code&gt;llm&lt;/code&gt; for since it&amp;rsquo;s a bunch of one-offs. At minimum, the &amp;ldquo;chat in CLI&amp;rdquo;
UX is surprisingly useful.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://www.perplexity.ai/&#34;&gt;Perplexity&lt;/a&gt; Pro:&lt;/strong&gt; We have access to
Perplexity Pro at work. I was initially impressed, but as time goes on, I find it
increasingly disappointing! I sometimes use it as a &amp;ldquo;free action&amp;rdquo; for
search, but even then, ChatGPT Search is usually better. I once tried to
replace Google with Perplexity as my default search engine, and didn&amp;rsquo;t last
more than a day. Perhaps I&amp;rsquo;m just not using it correctly. Also, the company
has
&lt;a href=&#34;https://apnews.com/article/tiktok-bytedance-trump-perplexity-87988733973760927bb5681f7de9b9af&#34;&gt;strange vibes recently&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://labs.google/fx/tools/image-fx&#34;&gt;ImageFX&lt;/a&gt;&lt;/strong&gt;: Google&amp;rsquo;s image
generation studio, which uses
&lt;a href=&#34;https://deepmind.google/technologies/imagen-3/&#34;&gt;Imagen 3&lt;/a&gt;. I&amp;rsquo;ve found this
useful to make relatively compelling
non-&lt;a href=&#34;https://benjamincongdon.me/blog/2025/01/25/AI-Slop-Suspicion-and-Writing-Back/&#34;&gt;slop&lt;/a&gt;-y
illustrations for presentations.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;personal&#34;&gt;Personal&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;llm:&lt;/strong&gt; Just as for work, &lt;code&gt;llm&lt;/code&gt; on the command line is incredibly handy for
personal projects. I don&amp;rsquo;t pay for a personal Claude Pro license, so I use
Claude on the command line pretty frequently.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://aistudio.google.com/gallery&#34;&gt;Google AI Studio&lt;/a&gt;:&lt;/strong&gt; Google&amp;rsquo;s AI
Studio is completely free to use, so I frequently use Gemini via the AI
Studio. Gemini has hands down the best multi-modality of any model family.
It&amp;rsquo;s great for audio, video (!), and PDF inputs. I have more thoughts on
Gemini in my Models section.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Personal Customized
&lt;a href=&#34;https://github.com/vercel/ai-chatbot&#34;&gt;Vercel AI Chatbot&lt;/a&gt;:&lt;/strong&gt; I&amp;rsquo;ve set up a
personalized chatbot using Vercel&amp;rsquo;s AI Chatbot
&lt;a href=&#34;https://github.com/vercel/ai-chatbot&#34;&gt;template&lt;/a&gt;.
&lt;ul&gt;
&lt;li&gt;My main changes were adding support for Anthropic models, changing the
database to be a local SQLite file, and ripping out all the tool use
features that I had no use for.&lt;/li&gt;
&lt;li&gt;My main reason for wanting this was: Wanting all my chats to only be
saved locally (in theory Anthropic
&lt;a href=&#34;https://privacy.anthropic.com/en/articles/7996875-can-you-delete-data-that-i-sent-via-api&#34;&gt;doesn&amp;rsquo;t permanently retain API logs&lt;/a&gt;),
and wanting to have a handful of custom starter templates that I could
easily reach for.&lt;/li&gt;
&lt;li&gt;I could have probably saved some time by using
&lt;a href=&#34;https://openwebui.com/&#34;&gt;OpenWebUI&lt;/a&gt;, but it&amp;rsquo;s satisfying to have
something custom. :)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://www.cursor.com/&#34;&gt;Cursor&lt;/a&gt;:&lt;/strong&gt; I use Cursor for personal coding
projects, especially when working with smaller or greenfield codebases. I
almost always use Sonnet 3.5. I initially thought I&amp;rsquo;d burn through the
monthly credits that Cursor gives you, but that hasn&amp;rsquo;t been an issue so far.
(I don&amp;rsquo;t do a ton of side project coding these days though) As of today, I&amp;rsquo;m
using Copilot Edits 5-10x more than Cursor, but that&amp;rsquo;s mostly because I
cannot currently use Cursor at &lt;code&gt;$WORK&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://notebooklm.google.com/&#34;&gt;NotebookLM&lt;/a&gt; Podcasts&lt;/strong&gt;: When running, if
I don&amp;rsquo;t have an appealing podcast, I&amp;rsquo;ll generate one on NotebookLM from a
recent Arxiv paper or blog post. The quality varies significantly, and I
tend to listen on 1.5x speed.
&lt;ul&gt;
&lt;li&gt;The hosts sometimes devolve into trite discussions about the &amp;ldquo;ethical
implications of AI&amp;rdquo; when describing a technical research paper, so it&amp;rsquo;s
very much not perfect. Still surprisingly good for what it is, and it
does often capture my attention more than would a pure TTS reading of
the underlying content.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://fal.ai/&#34;&gt;FAL&lt;/a&gt;&lt;/strong&gt;: FAL is a host of a bunch of image generation
models (among other &amp;ldquo;generative media&amp;rdquo; algorithms). It&amp;rsquo;s quite easy/cheap to
use, and produces better results than e.g. DALLE.
&lt;ul&gt;
&lt;li&gt;I currently use &lt;a href=&#34;https://fal.ai/models/fal-ai/recraft-v3&#34;&gt;Recraft v3&lt;/a&gt;
for my blog images. Late last year, I also trained custom LORAs of
&lt;a href=&#34;https://fal.ai/models/fal-ai/flux-pro/v1.1&#34;&gt;Flux1.1&lt;/a&gt; on FAL to create
some fun images of my cats. 🐈‍⬛&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;techniques&#34;&gt;Techniques&lt;/h2&gt;
&lt;p&gt;Here are a few techniques I&amp;rsquo;ve found to be particularly effective when working
with these tools:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prompting (Code):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Context Management&lt;/strong&gt;: I find that the single biggest factor in getting
good results from an LLM &amp;ndash; especially for coding &amp;ndash; is the context you
provide. When using tools like Cursor and Copilot Edits, getting a good set
of files that are relevant to the task at hand into your context is key. I
haven&amp;rsquo;t found anything yet that is able to maintain good context itself,
outside of trivially small code bases.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test Generation&lt;/strong&gt;: I&amp;rsquo;ve found that asking for test cases to be generated
is a great way to get a model to understand the behavior of the change I&amp;rsquo;m
asking for.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; Unit tests are also usually super easy to pattern match and
generate given in-context examples, so the quality is usually quite high.
It&amp;rsquo;s often useful to have idiomatic examples of your testing patterns in
your context, so that the model can generate tests that match your existing
style. As a final tip, asking an LLM &amp;ldquo;are there any missing tests?&amp;rdquo; is a
good &amp;ldquo;free&amp;rdquo; way to increase test coverage.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Loop: Copy/Paste Compiler &amp;amp; Errors&lt;/strong&gt;: This feels like extremely
low-hanging fruit for improved workflows, but for now my loop is essentially
to start ibazel (or whatever other test runner you have, in &amp;ldquo;watch mode&amp;rdquo;),
have the LLM propose changes, then copy/paste the compiler or test errors
back into the LLM to get it to fix the issues.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of these techniques require the caveat that you need to be actively engaged
in the process while prompting &amp;amp; evaluating LLM output. Treat the LLM as an
intern or junior developer that you&amp;rsquo;re coaching along. Blindly accepting output
is still a recipe for disaster.&lt;/p&gt;
&lt;p&gt;As a counterpoint to this note of caution,
&lt;a href=&#34;https://x.com/karpathy/status/1886192184808149383&#34;&gt;Karpathy recently coined the term &amp;ldquo;vibe coding&amp;rdquo;&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There&amp;rsquo;s a new kind of coding I call &amp;ldquo;vibe coding&amp;rdquo;, where you fully give in to
the vibes, embrace exponentials, and forget that the code even exists. It&amp;rsquo;s
possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too
good.&lt;/p&gt;
&lt;p&gt;I &amp;ldquo;Accept All&amp;rdquo; always, I don&amp;rsquo;t read the diffs anymore. When I get error
messages I just copy paste them in with no comment, usually that fixes it. The
code grows beyond my usual comprehension, I&amp;rsquo;d have to really read through it
for a while. Sometimes the LLMs can&amp;rsquo;t fix a bug so I just work around it or
ask for random changes until it goes away. It&amp;rsquo;s not too bad for throwaway
weekend projects, but still quite amusing.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Admittedly, this approach does work for small, throwaway projects, as he
notes, but not for anything that needs to be maintained or scaled. However, this
does seem to be the direction we&amp;rsquo;re headed. When quality code generation becomes
too cheap to meter, if you can strap an optimization loop around
generation&amp;lt;&amp;gt;evaluation, you can get a lot of work done with minimal effort.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prompting (Writing):&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&amp;ldquo;Give me 3 options&amp;rdquo;:&lt;/strong&gt; Whenever I&amp;rsquo;m generating text that will be used in a
document or email, I always ask for multiple options. This allows me to
either pick the best one or, more often, combine the best parts of each to
create something that feels more natural and human. I don&amp;rsquo;t trust any model
to one-shot human-sounding text.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&amp;ldquo;Write as me&amp;rdquo; prompts:&lt;/strong&gt; Models are still not amazing at copying writing
styles, but the models that are good at creative writing tend to be at least
OK at writing in my personal style.
&lt;ul&gt;
&lt;li&gt;The workflow looks like:
&lt;ol&gt;
&lt;li&gt;Take a large chunk of your writing and put it into R1 or Claude. Ask
&amp;ldquo;Write a style guide for writing exactly as the author of this
text.&amp;rdquo;.&lt;/li&gt;
&lt;li&gt;Then use that as a preamble to creative writing tasks, or as a Custom
Style in Claude.&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;I&amp;rsquo;ve found the models to be best at this approach are Sonnet 3.5 and
(surprisingly) Deepseek R1. None of the OpenAI models fare well here, in
my testing.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use as a &amp;ldquo;calculator for words&amp;rdquo;:&lt;/strong&gt; LLMs remain great for simple, mindless
reformatting. Tasks like:
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;Reformat this text as a comma separated list&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Find the latest date from this huge list of unstructured dates&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Convert to a bullet pointed list&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Remove duplicates from this list&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Related Usage Techniques:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&amp;ldquo;Copy as Markdown&amp;rdquo; from Google Docs:&lt;/strong&gt; LLMs handle Markdown particularly
well. Google Docs
&lt;a href=&#34;https://support.google.com/docs/answer/12014036?hl=en&#34;&gt;now allows you to copy content as Markdown&lt;/a&gt;,
which makes it easy to transfer text between the two environments. &amp;ldquo;Paste as
Markdown&amp;rdquo; is also useful.
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Markdown tables!&lt;/strong&gt; I really dislike the Markdown table syntax, so
I almost never use them. Asking an LLM to output a Markdown table
and then copying that into a Google Doc is &lt;em&gt;awesome.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;macOS Speech to Text:&lt;/strong&gt; I never thought I&amp;rsquo;d say this, but sometimes
talking &lt;em&gt;is&lt;/em&gt; faster than typing. I&amp;rsquo;ve been using macOS&amp;rsquo;s built-in
&lt;a href=&#34;https://support.apple.com/guide/mac-help/use-dictation-mh40584/mac&#34;&gt;speech-to-text&lt;/a&gt;
more and more when &amp;ldquo;writing&amp;rdquo; out conversational prompts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;pbpaste / pbcopy&lt;/strong&gt; Since LLM usage today often relies heavily on
copy/paste, using the &lt;code&gt;pbcopy&lt;/code&gt; and &lt;code&gt;pbpaste&lt;/code&gt; commands (at least, on macOS)
has been useful. These let you copy and paste, respectively, from the
clipboard from the CLI.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;unexpectedly-useful-use-cases&#34;&gt;Unexpectedly Useful Use Cases&lt;/h2&gt;
&lt;p&gt;Beyond the obvious applications, I&amp;rsquo;ve found AI to be surprisingly useful in a
few unexpected areas:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Finding a last-minute hike:&lt;/strong&gt; Any good model has grokked all of AllTrails,
and they give good recommendations even with complex criteria. It&amp;rsquo;s great
for finding hikes that meet specific criteria (e.g., &amp;ldquo;not crowded, loop
trail, between 5 and 10 miles, moderate difficulty&amp;rdquo;).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;As a &amp;ldquo;free action&amp;rdquo; for code review:&lt;/strong&gt; Before reviewing a pull request, I
often pipe the diff into a model like o1 to see if it finds anything
objectionable. Worst case, you get slop out that you can ignore. I&amp;rsquo;ve had o1
catch some quite subtle bugs that I didn&amp;rsquo;t catch on first review.
&lt;ul&gt;
&lt;li&gt;Aside: Compared to a year ago, AI code review actually seems feasible
now. The original GPT-4-class models just weren&amp;rsquo;t great at code review,
due to context length limitations and the lack of reasoning.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Planning a Catio:&lt;/strong&gt; We recently built a catio for our cats. I needed to
calculate how much PVC pipe we needed. My partner drafted the plans in CAD,
and I fed this into ChatGPT, which used Code Execution to plan out all the
pieces. I also got it to generate labels for each of the piece lengths,
which we annotated back on the plan. It saved us a ton of time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Financial Advice:&lt;/strong&gt; &lt;em&gt;(⚠️ Caveat emptor ⚠️)&lt;/em&gt; This one requires a huge grain
of salt, but I recently had to make a large financial decision and found
LLMs helpful as a secondary gut check on my math. I asked Claude, R1,
Gemini, GPT-4o, and GPT-o1 for their thoughts on my approach. All of them
agreed directionally with the reasoning I came up with. This gave me a bit
more confidence in my decision. Obviously check your work here.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Medical Advice&lt;/strong&gt;: &lt;em&gt;(⚠️ Caveat emptor ⚠️)&lt;/em&gt; Same huge grain of salt, but
using o1 / Claude as a second opinion for diagnosing symptoms and evaluating
medical test results is definitely worth doing. I had some blood work done a
few months ago, and got the raw results back prior to having my doctor
review the results. Claude&amp;rsquo;s evaluation of the tests matched 1:1 with my
doctor&amp;rsquo;s later report. (Granted: This was a fairly simple case.)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;model-tier-rank&#34;&gt;Model Tier Rank&lt;/h2&gt;
&lt;p&gt;Here&amp;rsquo;s my current ranking of the models I&amp;rsquo;ve been using, based on their overall
utility:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;S Tier:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Claude 3.5 Sonnet:&lt;/strong&gt; An absolute workhorse. Smart across many
domains—technical, creative writing, etc. It&amp;rsquo;s my go-to for most tasks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deepseek R1:&lt;/strong&gt; Cheap and smart enough to not feel bad about using it.
Deepseek R1 + Web Search is incredibly powerful. It&amp;rsquo;s a great option for
tasks that require up-to-date information or external knowledge.
&lt;ul&gt;
&lt;li&gt;A note on serving: As of writing, the
&lt;a href=&#34;https://www.deepseek.com/&#34;&gt;Deepseek platform&lt;/a&gt; serves R1
(undistilled) the fastest of any provider I&amp;rsquo;ve seen. If you have
data residency concerns, or concerns about Deepseek&amp;rsquo;s
&lt;a href=&#34;https://www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak&#34;&gt;security practices&lt;/a&gt;,
I&amp;rsquo;ve found that
&lt;a href=&#34;https://openrouter.ai/deepseek/deepseek-r1&#34;&gt;OpenRouter&lt;/a&gt; provides a
good alternative. Sadly, OpenRouter&amp;rsquo;s web search is qualitatively
worse than DeepSeek&amp;rsquo;s.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A Tier:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Claude 3 Opus:&lt;/strong&gt; It&amp;rsquo;s amazing, just so expensive I can&amp;rsquo;t really
justify using it for most tasks. Opus has been eclipsed by Sonnet 3.5
(and others) on coding, but is still great for writing.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;o1:&lt;/strong&gt; Impressive sometimes, but rather hit or miss in my experience.
When it works, it&amp;rsquo;s impressively good. I notice that I don&amp;rsquo;t reach for
this model much relative to the hype/praise it receives. Usage limits
really deter me from leaning on a model. &amp;ldquo;You&amp;rsquo;re out of messages until
Monday&amp;rdquo; is a bad feeling. I don&amp;rsquo;t want my tools to feel like they&amp;rsquo;re
scarce.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;B Tier:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;o1-Mini:&lt;/strong&gt; I used this way more than o1 this year. It&amp;rsquo;s pretty good
for coding. This model appears to no longer be available in ChatGPT
following the release of &lt;code&gt;o3-mini&lt;/code&gt;, so I doubt I will use it much again.
That being said, I will likely use this class of model more now that
&lt;code&gt;o3-mini&lt;/code&gt; exists.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;C Tier:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Gemini 2.0 Flash, Gemini 2.0 Flash Thinking, Gemini Experimental
1206:&lt;/strong&gt; I want to like Gemini; it&amp;rsquo;s just not the best on any
frontier that I care most about. The most obvious way it&amp;rsquo;s
better is that the context length is enormous. It&amp;rsquo;s also free on AI
Studio, which is confusingly generous. My favorite party trick is that I
put 300k tokens of my public writing into it and used that to generate
new writing in my style. However, the &amp;ldquo;write as me&amp;rdquo; prompt technique
works nearly as well &amp;ndash; and often better. Gemini models are also
weirdly sensitive to changes in temperature settings.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href=&#34;https://openai.com/index/openai-o3-mini/&#34;&gt;&lt;code&gt;o3-mini&lt;/code&gt;&lt;/a&gt; just came out yesterday.
I&amp;rsquo;ve used it a bit, but not enough to give a confident rating.&lt;/p&gt;
&lt;h2 id=&#34;what-im-not-using&#34;&gt;What I&amp;rsquo;m Not Using&lt;/h2&gt;
&lt;p&gt;There are a few tools that I&amp;rsquo;m not currently using, either because I haven&amp;rsquo;t
found them to be particularly useful or because they&amp;rsquo;re still too early in their
development:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://openai.com/index/introducing-chatgpt-pro/&#34;&gt;ChatGPT Pro&lt;/a&gt;:&lt;/strong&gt; I just
don&amp;rsquo;t see $200 in utility there. Unlimited o1 would be nice. $200/month is
too much to stomach, even though in raw economic terms it&amp;rsquo;s probably worth
it.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Operator:&lt;/strong&gt; I don&amp;rsquo;t see the utility for me yet. It&amp;rsquo;s a cool research
demo today.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;o1-Pro:&lt;/strong&gt; I&amp;rsquo;d love to try this. Again, the $200 price tag just doesn&amp;rsquo;t
seem worth it.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://github.com/browser-use/browser-use&#34;&gt;Browser-use&lt;/a&gt;:&lt;/strong&gt; Open-source
version of Operator. It&amp;rsquo;s cool; I tried it; it&amp;rsquo;s slow. The version of this
that acts at ~1 action per second instead of ~1 action per minute will be
a force to be reckoned with.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://www.anthropic.com/news/3-5-models-and-computer-use&#34;&gt;Anthropic Computer Use&lt;/a&gt;:&lt;/strong&gt;
See Operator and Browser-use above.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://help.openai.com/en/articles/10119604-work-with-apps-on-macos&#34;&gt;ChatGPT &amp;ldquo;Work with Apps&amp;rdquo;&lt;/a&gt;:&lt;/strong&gt;
This would be great with Chrome, or some other app I&amp;rsquo;m not familiar with
like Godot, but given it just supports Terminal and IDEs, I&amp;rsquo;d rather use
Cursor or Copilot. I think this could get good, I just don&amp;rsquo;t see any use
cases for me yet.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://openai.com/index/introducing-gpts/&#34;&gt;ChatGPT &amp;ldquo;GPTs&amp;rdquo;&lt;/a&gt;:&lt;/strong&gt; I used
them modestly in 2024, but as new features were added (voice mode, search,
reasoning, etc.), GPTs often couldn&amp;rsquo;t use these features so I stopped using
them as much. I basically used them as custom placeholder prompts, so there
wasn&amp;rsquo;t much value added.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Local Models&lt;/strong&gt;: Aside from trying out
&lt;a href=&#34;https://ollama.com/&#34;&gt;Ollama&lt;/a&gt;/&lt;a href=&#34;https://lmstudio.ai/&#34;&gt;LMStudio&lt;/a&gt; just to see
if they work, I haven&amp;rsquo;t found any durable use cases for local models. For
human-in-the-loop LLM usage, I just think there isn&amp;rsquo;t much reason to not use
the most powerful model available. If I trusted Anthropic less, I&amp;rsquo;d probably
look into local models more intently.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href=&#34;https://help.openai.com/en/articles/9617425-advanced-voice-mode-faq&#34;&gt;&amp;ldquo;Advanced&amp;rdquo; Voice Mode&lt;/a&gt;:&lt;/strong&gt;
Last year I used voice mode pretty consistently when I was commuting. This
was the original voice mode that was just a wrapper around STT-&amp;gt;LLM-&amp;gt;TTS. It
was great! I commute by bus now, so don&amp;rsquo;t have as much dead time to use
voice mode. Now I only use voice mode occasionally while running.
Additionally, OpenAI&amp;rsquo;s &amp;ldquo;advanced&amp;rdquo; voice mode was somewhat disappointing to
me: the &amp;ldquo;interrupt the AI&amp;rdquo; feature didn&amp;rsquo;t work reliably enough for me to
feel like it&amp;rsquo;s a step change improvement.
&lt;ul&gt;
&lt;li&gt;The lack of impact of advanced voice mode is curious.
&lt;a href=&#34;https://en.wikipedia.org/wiki/Her_(2013_film)&#34;&gt;Her&lt;/a&gt;-level proto-AGIs
that we can talk to now exist in the world, and mostly folks don&amp;rsquo;t care.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;things-id-like-to-see&#34;&gt;Things I&amp;rsquo;d Like To See&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Better Tools for Copiloting Writing:&lt;/strong&gt; I think the UX for writing using
LLMs can be significantly better than it is today. I don&amp;rsquo;t think anyone has
made a great GitHub Copilot-esque product for writing, likely because there
isn&amp;rsquo;t &amp;ldquo;one correct&amp;rdquo; path you go down when doing non-technical writing. I&amp;rsquo;m
excited by &lt;a href=&#34;https://github.com/socketteer/loom&#34;&gt;loom&lt;/a&gt;-like interfaces, which
allow you to traverse trees of text. Other existing tools today, like &amp;ldquo;take
this paragraph and make it more concise/formal/casual&amp;rdquo; just don&amp;rsquo;t have much
appeal to me. I really don&amp;rsquo;t tend to like the output of these systems.
Ideally, I want to be steering an LLM in my writing style and in the
direction of my flow of thoughts.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;em&gt;Fast&lt;/em&gt; or &lt;em&gt;Reliable&lt;/em&gt; Browser / Computer-Use Agents:&lt;/strong&gt; The demos I&amp;rsquo;ve seen
for browser/computer use seem too slow now to be worth investing much in.
&lt;em&gt;However&lt;/em&gt;, I think there&amp;rsquo;s a ton of promise in them. I see two paths to
increasing utility: Either these agents get faster, or they get more
reliable. If faster, then they can be used more in human-in-the-loop
settings, where you can course correct them if they go off track. If more
reliable, then they can operate in the background on your behalf, when you
don&amp;rsquo;t care as much about end-to-end latency. I do think that someone will
crack a specialized model for very fast computer use within the next year.
All the building blocks are there for agents of noticeable economic utility;
it seems more like an engineering problem than an open research problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Better Long-term Management:&lt;/strong&gt; I was excited about
&lt;a href=&#34;https://openai.com/index/memory-and-new-controls-for-chatgpt/&#34;&gt;ChatGPT memory&lt;/a&gt;,
but this was also mostly disappointing. I have yet to have an &amp;ldquo;aha&amp;rdquo; moment
where I got nontrivial value out of ChatGPT having remembered something
about me. More often than not, it remembers weird, irrelevant, or
time-contingent facts that have no practical future utility. I&amp;rsquo;d really like
some system that does contextual compression on my conversations, finds out
the types of responses I tend to value, the types of topics I care about,
and uses that to improve model output on an ongoing basis. I&amp;rsquo;ve seen
some interesting experiments in this direction, but as far as I can tell no
one has quite solved this yet.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;resources&#34;&gt;Resources&lt;/h2&gt;
&lt;p&gt;No change from mid-2024:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://thezvi.wordpress.com/&#34;&gt;Zvi Mowshowitz&lt;/a&gt;’s weekly AI posts are
excellent, and give an extremely verbose AI “state of the world”.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://simonwillison.net/&#34;&gt;Simon Willison&lt;/a&gt;’s blog is also an excellent
source for AI news.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.cognitiverevolution.ai/&#34;&gt;The Cognitive Revolution&lt;/a&gt; podcast
hosts some pretty good interviews that I find to be high-signal-to-noise,
and is much less hype-driven than many other AI-centric podcasts I’ve
attempted to listen to.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;New additions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Particularly good Twitter follows: &lt;a href=&#34;https://twitter.com/repligate&#34;&gt;Janus&lt;/a&gt;,
&lt;a href=&#34;https://x.com/natolambert&#34;&gt;Nathan Lambert&lt;/a&gt;,
&lt;a href=&#34;https://x.com/HamelHusain&#34;&gt;Hamel Husain&lt;/a&gt;,
&lt;a href=&#34;https://x.com/jeremyphoward&#34;&gt;Jeremy Howard&lt;/a&gt;,
&lt;a href=&#34;https://x.com/davidad&#34;&gt;davidad&lt;/a&gt;, &lt;a href=&#34;https://x.com/swyx&#34;&gt;swyx&lt;/a&gt;,
&lt;a href=&#34;https://x.com/emollick&#34;&gt;Ethan Mollick&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Periodic check-ins on &lt;a href=&#34;https://www.lesswrong.com/&#34;&gt;Lesswrong&lt;/a&gt; for more
technical discussion (esp. related to AI alignment and AGI implications), if
you&amp;rsquo;re so inclined&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Cover image by Recraft v3. As always, this post contains my own views and does
not represent the views of my employer.&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I had a discussion with a sharp engineer I look up to a few years ago, who
was convinced that the future would be humans writing tests and
specifications, and LLMs would handle all implementation. Now, I think we
won&amp;rsquo;t even need to necessarily write in-code tests, or low-level unit tests.
I&amp;rsquo;m now convinced that features can largely be described in English, with
some end-to-end acceptance tests specified by humans.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;&lt;a href=&#34;https://openai.com/index/introducing-deep-research/&#34;&gt;Deep Research&lt;/a&gt; came
out while I was writing this post, and this might actually tip the scales for
me. More generally, I think the paradigm of ambient agentic background
compute will be a Big Deal soonish.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>AI Slop, Suspicion, and Writing Back</title>
        <link>https://benjamincongdon.me/blog/2025/01/25/AI-Slop-Suspicion-and-Writing-Back/</link>
        <pubDate>Sat, 25 Jan 2025 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2025/01/25/AI-Slop-Suspicion-and-Writing-Back/</guid>
        <description>&lt;p&gt;The impetus for this post was my recent realization that I&amp;rsquo;ve developed an
involuntary reflex for spotting AI-generated content. The tells are subtle now,
but (sadly? tellingly?) this sort of content is seemingly everywhere once you
start looking.&lt;/p&gt;
&lt;h2 id=&#34;the-rise-of-ai-slop&#34;&gt;The Rise of AI Slop&lt;/h2&gt;
&lt;p&gt;One bit of hipster cred I get to claim is that I followed Simon Willison before
he became the cool AI blogger.&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; Perhaps one of the most &amp;ldquo;in the room&amp;rdquo; feels
I&amp;rsquo;ve had reading his work is to see his neologism of &amp;ldquo;AI slop&amp;rdquo; catch on.&lt;sup id=&#34;fnref:2&#34;&gt;&lt;a href=&#34;#fn:2&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;2&lt;/a&gt;&lt;/sup&gt;
Slop being defined as the equivalent of &amp;ldquo;spam&amp;rdquo;, but for AI-generated content.&lt;/p&gt;
&lt;p&gt;To put a finer point on it, I define &amp;ldquo;slop&amp;rdquo; as:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Content that is mostly-or-completely AI-generated that is passed off as being
written by a human, regardless of quality.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For example, prompting Claude to write a blog post and publishing that verbatim
under your name would be &amp;ldquo;slop&amp;rdquo;, even if the writing is not bad at face value.&lt;/p&gt;
&lt;p&gt;GPT-3 era slop was pretty easy to detect. Early GPT-4 era slop was marginally
more convincing, but usually gave itself away after a paragraph or two. Sometime
in 2024, I think some critical line was crossed, at least for me, and certain
classes of slop now take at least &lt;em&gt;some&lt;/em&gt; critical thinking to recognize
as AI-generated. Which isn&amp;rsquo;t great!&lt;/p&gt;
&lt;h2 id=&#34;slop-in-the-wild&#34;&gt;Slop in the Wild&lt;/h2&gt;
&lt;p&gt;I noticed this first on LinkedIn, where I saw some suspiciously robotic posts
being made by a previous coworker. Too many emojis, bulleted lists, markdown
formatting literals that weren&amp;rsquo;t picked up in LinkedIn, etc. In retrospect, it&amp;rsquo;s
obvious slop. This is probably par-for-the-course on LinkedIn now, but at the
time it felt like a weird violation of the social contract. My respect for this
person was reduced by a nontrivial amount.&lt;/p&gt;
&lt;p&gt;To be clear, I fault no one for augmenting their writing with LLMs. I do it. A
lot now. It&amp;rsquo;s a great breaker of writers block. But I really do judge those who
copy/paste directly from an LLM into a human-space text arena. Sure, take
sentences &amp;ndash; even proto-paragraphs &amp;ndash; if the AI came up with something great.
But &lt;em&gt;surely&lt;/em&gt; there is something that needs to be changed from what came out of
the black box before you feel comfortable attaching your name to it. If you
don&amp;rsquo;t, I think that&amp;rsquo;s &lt;em&gt;slop&lt;/em&gt;-y.&lt;/p&gt;
&lt;p&gt;If you look around now, this sort of stuff is everywhere. There are so many
accounts on X that post 20-tweet long threads that are obvious slop. Take an
article and break it down into a thread, then dump that into the timeline? Slop.
Rando reply bots talking about how some tweet does a great job presenting a
multifaceted debate? Slop. Reddit has a ton of this too.&lt;/p&gt;
&lt;p&gt;The B2B SaaS companies are coming for this space too. From Zvi&amp;rsquo;s Jan 16 newsletter:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/mattparlmer/status/1878117171240517857&#34;&gt;Astral, an AI marketing AI agent&lt;/a&gt;.
It will navigate through the standard GUI websites like Reddit and soon TikTok
and Instagram, and generate &amp;lsquo;genuine interactions&amp;rsquo; across social websites to
promote your startup business, &lt;a href=&#34;https://t.co/cGHVHeVHP9&#34;&gt;in closed beta&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;Matt Parlmer: At long last, we have created the dead internet from the
classic trope &amp;ldquo;dead internet theory.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://x.com/tracewoodgrains/status/1878016461551083809&#34;&gt;Tracing Woods&lt;/a&gt;:
There is such a barrier between business internet and the human internet.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;On business internet, you can post &amp;ldquo;I&amp;rsquo;ve built a slot machine to degrade the
internet for personal gain&amp;rdquo; and get a bunch of replies saying, &amp;ldquo;Wow, cool! I
can&amp;rsquo;t wait to degrade the internet for personal gain.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/blockquote&gt;
&lt;p&gt;Amazing, as they say. Slop slop slop.&lt;/p&gt;
&lt;h2 id=&#34;slop-paranoia&#34;&gt;Slop Paranoia&lt;/h2&gt;
&lt;p&gt;And so in recent months, my brain has developed a subroutine that
continuously scans for sentence structure, word frequency, and formatting
indicative of LLM-generated content in otherwise natural-sounding prose,
leaving me to question the originality of what I&amp;rsquo;m reading. It&amp;rsquo;s rather
annoying, honestly. But this is a logical immune response to slop proliferation.&lt;/p&gt;
&lt;p&gt;I’ve definitely had false positives on this subconscious slop detector as well.
I was recently reading a post that I only later learned was published in 2017,
which read as formulaic and flat in a way that &lt;em&gt;felt&lt;/em&gt; LLM-generated. False
positives aren’t surprising: given that LLM generations hew towards the
preference of the “median human data annotator”, the revealed preference is
writing that looks similar to bland pre-AI content.&lt;/p&gt;
&lt;p&gt;Perhaps we&amp;rsquo;ll soon get better watermarking or detection for AI-generated
content. As an uninformed intuition, I think this may be possible with
completely unedited text (as the worst slop often is), but wouldn&amp;rsquo;t do well with
slop interspliced with &amp;ldquo;real&amp;rdquo; text (which, fine, I&amp;rsquo;d take that trade).&lt;/p&gt;
&lt;p&gt;I think to some extent, we&amp;rsquo;ll have to live with slop being out there. My hope is
that the returns to non-slop content stay high enough to keep human writing
valuable and worth pursuing. And to keep the &amp;ldquo;knowledge cutoffs&amp;rdquo; of base
models progressing into the future, we will have to keep feeding more recent
writing back into training sets, even if it is slop-contaminated. This
dynamic actually makes me optimistic about our ability to detect AI-generated
content at scale, since there will be an incentive to filter out low quality
content from training data.&lt;/p&gt;
&lt;p&gt;That does still leave a window open for influencing / adding to the training
sets of tomorrow.&lt;/p&gt;
&lt;h2 id=&#34;write-for-future-ais-aka-claude-knows-my-name&#34;&gt;Write for Future AIs, a.k.a. &amp;ldquo;Claude Knows My Name&amp;rdquo;&lt;/h2&gt;
&lt;p&gt;A trend I&amp;rsquo;ve been seeing in the past month or two is more prolific writers (at
least, in the admittedly eclectic circle I follow) coming out and advocating for
&amp;ldquo;writing for the AIs&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href=&#34;https://gwern.net/&#34;&gt;Gwern&lt;/a&gt;&lt;/strong&gt;, worth quoting in full:
(&lt;a href=&#34;https://www.dwarkeshpatel.com/p/gwern-branwen&#34;&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;By writing, you are voting on the future of the Shoggoth using one of the few
currencies it acknowledges: tokens it has to predict. If you aren&amp;rsquo;t writing,
you are abdicating the future or your role in it. If you think it&amp;rsquo;s enough to
just be a good citizen, to vote for your favorite politician, to pick up
litter and recycle, the future doesn&amp;rsquo;t care about you.&lt;/p&gt;
&lt;p&gt;There are ways to influence the Shoggoth more, but not many. If you don&amp;rsquo;t
already occupy a handful of key roles or work at a frontier lab, your
influence rounds off to 0, far more than ever before. If there are values you
have which are not expressed yet in text, if there are things you like or
want, if they aren&amp;rsquo;t reflected online, then to the AI they don&amp;rsquo;t exist. That
is dangerously close to won&amp;rsquo;t exist.&lt;/p&gt;
&lt;p&gt;But yes, you are also creating a sort of immortality for yourself personally.
You aren&amp;rsquo;t just creating a persona, you are creating your future self too.
What self are you showing the LLMs, and how will they treat you in the future?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Tyler Cowen:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You’re an idiot if you’re not writing for the AIs. They’re a big part of your
audience, and their purchasing power, we’ll see, but over time it will
accumulate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;One of my friends was impressed recently that Claude knew my name and basic
facts about me, because I&amp;rsquo;ve written a decent amount online which has
undoubtedly been slurped up into a pretraining dataset. While the &amp;ldquo;AI knows me&amp;rdquo;
party trick feels like a mid-2020s version of Googling oneself, I think there
is value in mildly influencing the weights of the
&lt;a href=&#34;https://knowyourmeme.com/memes/shoggoth-with-smiley-face-artificial-intelligence&#34;&gt;shoggoth&lt;/a&gt;
by putting more of your (non-AI-assisted) thoughts out there.&lt;/p&gt;
&lt;p&gt;I haven&amp;rsquo;t done much strategizing for how to write for future AIs, but the simple
strategy of write a lot, consistently, with a unique voice seems like a good
start. This advice is relatively similar to good &amp;ldquo;for human&amp;rdquo; writing, with a
marked shift towards distinctiveness: coin terms, deliberately reuse rhetorical
devices, and describe arguments in a way that&amp;rsquo;s sticky enough to stand out in
the rest of the latent space of internet text. These patterns compound over
years, turning individual quirks into patterns which can be elicited from future
LLMs.&lt;/p&gt;
&lt;p&gt;As an illustrative example, Patrick McKenzie popularized the “Dangerous
Professional” tone: the type of tone that a lawyer or otherwise &amp;ldquo;well
informed&amp;rdquo; person would use in a complaint to a business, with the type of
verbiage that would attract the attention of another lawyer or Serious Business
Person. By writing extensively about the notion of a &amp;ldquo;Dangerous Professional&amp;rdquo;,
Patrick evidently created a meme that has been picked up by LLMs and is a
shorthand incantation for a certain type of stern, no-BS professional style.
This has real utility:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I remain extremely pleased that people keep reporting to my inbox that &amp;ldquo;Write
a letter in the style of patio11&amp;rsquo;s Dangerous Professional&amp;rdquo; keeps actually
working against real problems with banks, credit card companies, and so on.&lt;/p&gt;
&lt;p&gt;It feels like magic.&lt;/p&gt;
&lt;p&gt;(&lt;a href=&#34;https://thezvi.substack.com/p/ai-94-not-now-google&#34;&gt;&lt;em&gt;Source&lt;/em&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Writing intentionally memetic content does seem to have leverage, if you have
sufficient distribution to spread the meme widely enough to be robustly picked
up by future LLMs.&lt;/p&gt;
&lt;h2 id=&#34;finding-a-path-forward&#34;&gt;Finding a Path Forward&lt;/h2&gt;
&lt;p&gt;Undoubtedly, the sloppification of the internet will get worse over the
next few years. And as such, the returns to curating quality sources of content
will only increase. My advice? Use an RSS feed reader, read Twitter lists
instead of feeds, and find spaces where real discussion still happens (e.g.
LessWrong and Lobsters still both seem slop-free).&lt;/p&gt;
&lt;p&gt;Personally, the approach I&amp;rsquo;ll continue to take with my writing is
straightforward: write with a recognizably &amp;ldquo;me&amp;rdquo; voice, edit thoroughly &amp;ndash;
especially if using AI assistance &amp;ndash; to not compromise on quality, and be honest
about AI usage&lt;sup id=&#34;fnref:3&#34;&gt;&lt;a href=&#34;#fn:3&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;3&lt;/a&gt;&lt;/sup&gt;. Write a lot, too. It might be useful in the future.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover image by Recraft v3&lt;/em&gt;&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;I know he originally (?) became popular for his work on Django, but I
followed him for quite a while due to his Datasette project, which I still
think is awesome software.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:2&#34;&gt;
&lt;p&gt;Per his
&lt;a href=&#34;https://simonwillison.net/2024/May/8/slop/?utm_source=chatgpt.com&#34;&gt;original blog post&lt;/a&gt;,
it actually appears Simon didn&amp;rsquo;t originate this neologism but rather was an
early proponent. I still credit him significantly for it catching on. The
tweet he links to was from a long-time TPOT-er. Small world.&amp;#160;&lt;a href=&#34;#fnref:2&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id=&#34;fn:3&#34;&gt;
&lt;p&gt;I did get some input from Claude &amp;amp; Gemini on various sections, but none of
this post is directly written by either.&amp;#160;&lt;a href=&#34;#fnref:3&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
        <title>Chain of Continuous Thoughts</title>
        <link>https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/</link>
        <pubDate>Sat, 14 Dec 2024 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/</guid>
        <description>&lt;p&gt;Recent advances in LLMs have demonstrated increasingly powerful reasoning
capabilities, primarily through eliciting chain-of-thought outputs from models.
While these methods have proven effective, they rely on discrete, tokenized
representations of reasoning steps. A recent research paper from Meta introduces
a novel approach that steps away from this paradigm: reasoning in continuous
latent space rather than through explicit language tokens.&lt;/p&gt;
&lt;p&gt;Recently, I&amp;rsquo;ve been quite interested in the increasingly popular trend of
reasoning LLMs. So when I saw
&lt;a href=&#34;https://arxiv.org/abs/2412.06769&#34;&gt;&amp;ldquo;Training Large Language Models to Reason in a Continuous Latent Space&amp;rdquo;&lt;/a&gt;
in Meta FAIR&amp;rsquo;s recent paper flurry, I was intrigued. This was an interesting
read which made me reflect on how LLMs use chain-of-thought (CoT). For our
purposes, we&amp;rsquo;ll define CoT as the pattern of having an LLM generate step-by-step
reasoning text before producing an answer. CoT has become a basic strategy to
improve the model’s ability to perform well on harder reasoning tasks. This
largely started with the 2022
&lt;a href=&#34;https://arxiv.org/abs/2205.11916&#34;&gt;&amp;ldquo;Large Language Models are Zero-Shot Reasoners&amp;rdquo;&lt;/a&gt;
paper, which found improvements in model outputs simply by adding &amp;ldquo;Let&amp;rsquo;s think
step by step&amp;rdquo; to prompts.&lt;/p&gt;
&lt;h2 id=&#34;coconut-approach&#34;&gt;COCONUT Approach&lt;/h2&gt;
&lt;p&gt;The &amp;ldquo;Continuous Latent Space&amp;rdquo; paper introduces a new
&lt;strong&gt;c&lt;/strong&gt;hain-&lt;strong&gt;o&lt;/strong&gt;f-&lt;strong&gt;con&lt;/strong&gt;tin&lt;strong&gt;u&lt;/strong&gt;ous-&lt;strong&gt;t&lt;/strong&gt;hought (&amp;ldquo;COCONUT&amp;rdquo;) method for training
reasoning behavior into models. Instead of having the model represent the
chain-of-thought steps in text &amp;ndash; or symbols that are ultimately mapped to text
&amp;ndash; they encoded the steps in a continuous latent vector, where a state from each
step is fed into the model at subsequent time steps as the new input. By doing
so, the system is able to use the LLM in a non-linguistic, continuous space for
reasoning. The paper provides evidence of something which, upon reflection, is
quite intuitive: reasoning can happen outside of language, and can be more
powerfully expressed in non-linguistic symbols.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/coconut.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/coconut.png&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;&lt;a href=&#34;https://arxiv.org/pdf/2412.06769&#34;&gt;Arxiv&lt;/a&gt;&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;
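The core mechanic can be sketched in a few lines. This is a toy illustration of the idea rather than the paper's actual implementation; the names and the tiny stand-in "transformer" below are my own invention. The contrast is the control flow: a normal CoT step collapses the hidden state to a single discrete token before continuing, while a COCONUT-style step feeds the hidden state straight back in as the next input embedding.

```python
# Toy sketch of the COCONUT idea (illustrative only; not the paper's code).
import math
import random

random.seed(0)
D_MODEL, VOCAB = 8, 16
W_EMBED = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(VOCAB)]
W_HIDDEN = [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(D_MODEL)]
W_UNEMBED = [[random.gauss(0, 1) for _ in range(VOCAB)] for _ in range(D_MODEL)]

def matvec(vec, mat):
    """Row-vector times matrix: len(vec) must equal len(mat)."""
    return [sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))]

def forward(x):
    """Stand-in for a transformer forward pass: input embedding -> hidden state."""
    return [math.tanh(h / math.sqrt(D_MODEL)) for h in matvec(x, W_HIDDEN)]

def cot_step(token_id):
    """Ordinary CoT: collapse the hidden state to one discrete token."""
    hidden = forward(W_EMBED[token_id])
    logits = matvec(hidden, W_UNEMBED)
    return max(range(VOCAB), key=logits.__getitem__)

def coconut_steps(token_id, n_latent=3):
    """COCONUT-style: run n_latent steps purely in latent space, feeding
    each hidden state back in as the next input, decoding only at the end."""
    x = W_EMBED[token_id]
    for _ in range(n_latent):
        x = forward(x)  # hidden state reused directly as the next input
    logits = matvec(x, W_UNEMBED)
    return max(range(VOCAB), key=logits.__getitem__)
```

The point of the sketch is that the latent loop never round-trips through the vocabulary: intermediate "thoughts" keep the full continuous state rather than a single argmax token.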

&lt;p&gt;In some ways this isn&amp;rsquo;t surprising. If we recall the idea that
&lt;a href=&#34;https://arxiv.org/abs/2309.10668&#34;&gt;&amp;ldquo;Language Modeling is Compression&amp;rdquo;&lt;/a&gt;, it&amp;rsquo;s not
shocking that there are more information theoretically concise ways to express
reasoning steps than with language tokens. If we think of a typical CoT, much of
it looks something like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Question: If a train leaves City A at 9:00 AM traveling at 60 mph and another
train leaves City B at 10:00 AM traveling at 80 mph toward City A, how far
from City A will the trains meet if the distance between the cities is 300
miles?&lt;/p&gt;
&lt;p&gt;Reasoning:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Okay, let’s see. Train A starts at 9:00 AM and travels at 60 mph. By 10:00
AM, when Train B leaves, Train A has traveled… oh wait, hold on, that’s not
right. I should first calculate the time difference.&lt;/li&gt;
&lt;li&gt;Ah, right! In that first hour, Train A travels $60 \text{ miles}$, since
$60 \text{ mph} \times 1 \text{ hour} = 60 \text{ miles}$.&lt;/li&gt;
&lt;li&gt;So now the distance between them is $300 - 60 = 240 \text{ miles}$. Okay,
got it so far. Now both trains are moving toward each other at a combined
rate of… wait, did I add that correctly?&lt;/li&gt;
&lt;li&gt;The speeds are $60 + 80 = 140 \text{ mph}$. Yeah, that’s right—combined
rate of $140 \text{ mph}$. Sorry about the confusion earlier!&lt;/li&gt;
&lt;li&gt;At this combined rate, the time for them to meet would be
$\frac{240}{140}$. Let me simplify that… uh, no, wait! Did I subtract the
initial distance correctly earlier? Yeah, I did, never mind.&lt;/li&gt;
&lt;li&gt;Simplifying $\frac{240}{140}$, we get… oh no, wait! That’s $24 / 14$… uh,
no, $12 / 7$! Yeah, so $\frac{12}{7} \text{ hours}$.&lt;/li&gt;
&lt;li&gt;Okay, converting that to hours and minutes… hold on, that feels weird.
$\frac{12}{7} \times 60$ … actually, let’s just approximate this. It’s
about $1.71 \text{ hours}$.&lt;/li&gt;
&lt;li&gt;Wait, no, not approximate! Stick with $\frac{12}{7}$ hours. So, the trains
meet $\frac{12}{7} \text{ hours}$ after 10:00 AM. That’s, uh, around… no,
hold on! $10 + 1.71 = 11:43 \text{ AM}$. Phew.&lt;/li&gt;
&lt;li&gt;Distance-wise, Train A has traveled
$60 \text{ mph} \times 2.43 \text{ hours}$ … wait, did I just add the hours
wrong again? Oh, right. No, it’s fine this time!&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;Just at a surface level, much of the CoT consists of phrases like &amp;ldquo;Oh wait,
hold on, that&amp;rsquo;s not right&amp;rdquo; and &amp;ldquo;hold on, that feels weird&amp;rdquo;, etc. These
reflections are useful when reasoning within the linguistic space because they
emulate human step-by-step reasoning, and encode the ability to self-correct and
backtrack when a particular reasoning path fails. However, they also add
significant noise and redundancy. Training on CoT traces allows the model to
develop these patterns of self-correction, but in a fairly verbose way.&lt;/p&gt;
&lt;h2 id=&#34;coconut-training&#34;&gt;COCONUT Training&lt;/h2&gt;
&lt;p&gt;The authors train this ability to use a latent space representation for
reasoning by first training the model on textual CoT traces, then iteratively
replacing parts of the CoT with continuous thought states:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As shown in Figure 2, in the initial stage, the model is trained on regular
CoT instances. In the subsequent stages, at the $k$-th stage, the first $k$
reasoning steps in the CoT are replaced with $k × c$ continuous thoughts,
where $c$ is a hyperparameter controlling the number of latent thoughts
replacing a single language reasoning step.&lt;/p&gt;
&lt;/blockquote&gt;
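&lt;p&gt;The staging in this quote can be sketched in a few lines. The snippet below is
my own illustrative reconstruction, not the paper&amp;rsquo;s code: &lt;code&gt;[bot]&lt;/code&gt; and
&lt;code&gt;[eot]&lt;/code&gt; stand in for the special tokens that delimit the latent phase, and
&lt;code&gt;[thought]&lt;/code&gt; stands in for a slot that would hold a continuous hidden state
rather than a discrete token:&lt;/p&gt;

```python
# Illustrative reconstruction of COCONUT's staged curriculum (my own sketch,
# not the paper's code). At stage k, the first k language reasoning steps are
# replaced by k * c latent "continuous thought" slots.

LATENT = "[thought]"  # placeholder for a continuous hidden-state slot

def stage_sequence(question, steps, answer, k, c=2):
    """Build the training sequence for curriculum stage k."""
    latent = [LATENT] * (k * c)               # k steps -> k * c latent slots
    remaining = list(steps[k:])               # steps not yet absorbed
    return [question, "[bot]"] + latent + ["[eot]"] + remaining + [answer]
```

&lt;p&gt;At stage 0 this degenerates to the ordinary textual CoT sequence; by the final
stage, every language reasoning step has been absorbed into latent slots.&lt;/p&gt;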

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/coconut_training.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/coconut_training.png&#34; alt=&#34;Figure 2: Training procedure of Chain of Continuous Thought&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;Figure 2: Training procedure of Chain of Continuous Thought (&lt;a href=&#34;https://arxiv.org/pdf/2412.06769&#34;&gt;Arxiv&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;This means the model has two modes: a linguistic mode and a reasoning mode,
which are discrete and continuous, respectively. The authors note that in
open-ended prompting, it becomes challenging to know when to switch between
these modes. They propose two approaches:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;a) train a binary classifier on latent thoughts to enable the model to
autonomously decide when to terminate the latent reasoning, or b) always pad
the latent thoughts to a constant length. We found that both approaches work
comparably well. Therefore, we use the second option in our experiment for
simplicity, unless specified otherwise.&lt;/p&gt;
&lt;/blockquote&gt;
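&lt;p&gt;The constant-length option (b) makes the inference loop easy to sketch. The
toy below is entirely my own and self-contained (a real implementation would
operate on transformer hidden states): during the latent phase, the model&amp;rsquo;s
last hidden state is appended back into the context as the next input instead
of being decoded into a token, and then ordinary decoding resumes:&lt;/p&gt;

```python
# Self-contained toy of COCONUT-style decoding (the "model" and all names are
# made up for illustration). Latent phase: feed the final hidden state back as
# the next input. Language phase: ordinary greedy token-by-token decoding.

def toy_model(context):
    """Stand-in for a transformer: returns (argmax token id, last hidden state)."""
    h = sum(context) % 7
    return int(h) % 3, float(h)

def decode(prompt_ids, n_latent, n_tokens):
    context = list(prompt_ids)
    for _ in range(n_latent):          # latent phase: recycle hidden states
        _, hidden = toy_model(context)
        context.append(hidden)         # hidden state re-enters the context
    out = []
    for _ in range(n_tokens):          # language phase: decode tokens
        tok, _ = toy_model(context)
        out.append(tok)
        context.append(tok)
    return out
```

&lt;p&gt;The key structural point is that the latent loop never collapses the hidden
state to a token, which is exactly where the richer continuous representation
comes from.&lt;/p&gt;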
&lt;p&gt;While the second approach does seem simpler from an implementation perspective,
the binary classifier approach seems more robust to me. I find the idea of a
model that could autonomously decide when to apply more or less reasoning to a
problem fascinating. It&amp;rsquo;s a bit of a reach from the results in this paper,
but if such an autonomous model were trained successfully, that would indicate
to me a new level of metacognition that we&amp;rsquo;ve only recently seen hints of from
frontier models.&lt;/p&gt;
&lt;h2 id=&#34;advantages-of-continuous-representation&#34;&gt;Advantages of Continuous Representation&lt;/h2&gt;
&lt;p&gt;My understanding is that transitioning the reasoning process to a continuous
space allows the model to learn more concise representations of the reasoning
process. Reasoning in a continuous space allows for richer representations of
the reasoning &amp;ndash; since the intermediate steps don&amp;rsquo;t need to be collapsed back
into discrete tokens &amp;ndash; and these continuous representations can themselves be
learned.&lt;/p&gt;
&lt;p&gt;The idea of encoding latent thoughts into a continuous vector isn&amp;rsquo;t exactly new.
I’ve seen papers exploring the use of &amp;ldquo;pause&amp;rdquo; or &amp;ldquo;&amp;hellip;&amp;rdquo; tokens in reasoning
steps, as a way of having the LLM perform computation for &amp;ldquo;real&amp;rdquo; before
producing tokens for its output chain (e.g.
&lt;a href=&#34;https://arxiv.org/abs/2310.02226&#34;&gt;this paper&lt;/a&gt;). COCONUT builds off this in
making CoT reasoning explicit, then further generalizes these token &amp;ldquo;pauses&amp;rdquo;
into a system wherein the representation of &amp;ldquo;thought&amp;rdquo; is not done in discrete
tokens.&lt;/p&gt;
&lt;p&gt;The primary result of COCONUT is that using the latent space for reasoning
significantly improves the model&amp;rsquo;s ability to reason. The authors found that
not only could latent space reasoning outperform discrete CoT, but also that
the model learned improved planning abilities.&lt;/p&gt;
&lt;h2 id=&#34;implications-for-interpretability&#34;&gt;Implications for Interpretability&lt;/h2&gt;
&lt;p&gt;It&amp;rsquo;s not a costless improvement though: one valuable aspect of
chain-of-thought traces is the ability to interpret how a model arrived at its
answer by just inspecting the raw model output. As an interpretability measure,
having the CoT in pure text tokens is quite useful. There are concerns that the
CoT may be unfaithful to the actual reasoning used by the model: for example,
the model might learn to steganographically hide reasoning that is distinct
from the face-value reasoning of the outputted CoT. However, as a baseline
prior, it definitely &lt;em&gt;seems&lt;/em&gt; like the CoT offers some insight into the reasoning
steps of the model.&lt;/p&gt;
&lt;p&gt;With continuous CoT, the internal reasoning chain is inherently
uninterpretable to a human reader. This appears to be a tradeoff that you have
to make: if the system operates in a continuous state space during reasoning,
it will be very challenging to map that latent space back to interpretable
words.&lt;/p&gt;
&lt;h2 id=&#34;reading-between-o1s-lines&#34;&gt;Reading Between o1&amp;rsquo;s Lines&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Epistemic status: Uninformed speculation.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;In the recent o1 technical report from OpenAI, red-teamers (such as
&lt;a href=&#34;https://www.apolloresearch.ai/&#34;&gt;Apollo Research&lt;/a&gt;) were not given access to the
raw CoT traces. These are also not available to ChatGPT/API users. Previously
there seemed to be policy reasons for this: OpenAI did not want to release CoT
traces because they could be used to train reasoning models, which would erode
OpenAI&amp;rsquo;s competitive edge. However, from an
&lt;a href=&#34;https://www.youtube.com/watch?v=pB3gvX-GOqU&#34;&gt;interview&lt;/a&gt; with Apollo Research&amp;rsquo;s
Alexander Meinke, it sounds like with the full o1 model there were &lt;em&gt;technical&lt;/em&gt;
reasons why the full CoT could not be released to red-teamers.&lt;/p&gt;
&lt;p&gt;If we&amp;rsquo;re somewhat conspiratorial, this could be because o1 uses non-discrete
CoT, at least partially. Going over the o1 system report, there&amp;rsquo;s not much
evidence for this, however. This quote, in particular, seems pretty solid
evidence against non-discrete CoT:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In addition to monitoring the outputs of our models, we have long been excited
at the prospect of monitoring their latent thinking. Until now, that latent
thinking has only been available in the form of activations — large blocks of
illegible numbers from which we have only been able to extract simple
concepts. Chains-of-thought are far more legible by default and could allow us
to monitor our models for far more complex behavior (if they accurately
reflect the model’s thinking, an open research question).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So, 🤷‍♂️ :shrug: &amp;hellip;
&lt;a href=&#34;https://www.lesswrong.com/posts/byNYzsfFmb2TpYFPW/o1-a-technical-primer&#34;&gt;This Lesswrong post&lt;/a&gt;
has some additional speculation on what o1 is doing under-the-hood. As a lay
person, none of these seem like hard blockers for the unavailability of the CoT.
If OpenAI is using the CoT traces for safety filtering, as they say in the
system report, it seems strange that these would be unavailable to release in an
external form for any reason other than they weren&amp;rsquo;t actually human legible.&lt;/p&gt;
&lt;p&gt;We already know that OpenAI has a summarization model which summarizes and
(partially) obfuscates the raw CoT:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We surface CoT summaries to users in ChatGPT. &amp;hellip; We leverage the same
summarizer model being used for o1-preview and o1-mini for the initial o1
launch. &amp;hellip; We trained the summarizer model away from producing disallowed
content in these summaries. We find the model has strong performance here.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;However, the statement that they used the same summarization model for o1 as
o1-preview and o1-mini makes me update slightly against the possibility of
non-discrete reasoning tokens. As does the use of English descriptions for the
o1 CoT in the system report.&lt;/p&gt;
&lt;p&gt;The recent release of
&lt;a href=&#34;https://qwenlm.github.io/blog/qwq-32b-preview/&#34;&gt;Qwen&amp;rsquo;s QwQ reasoning model&lt;/a&gt; &amp;ndash;
which uses entirely discrete reasoning &amp;ndash; also shows that it&amp;rsquo;s definitely
possible for a discrete model to have strong performance.&lt;/p&gt;
&lt;p&gt;Overall, I think it&amp;rsquo;s unlikely that o1 uses something like COCONUT &amp;ndash; I&amp;rsquo;d
predict somewhere in the neighborhood of a 20% likelihood.&lt;/p&gt;
&lt;h2 id=&#34;conclusion&#34;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;So, while traditional, text-based CoT reasoning has become a quite common
technique for eliciting improved reasoning performance out of LLMs, the rise of
explicit &amp;ldquo;reasoning models&amp;rdquo; changes this landscape significantly &amp;ndash; both in
terms of model structure, and the type of prompting that is needed for optimal
model performance. COCONUT and other new ideas for sampling models suggest that
there is a plethora of low-hanging fruit for extending LLMs to be better
reasoners. If latent space reasoning does offer a nontrivial improvement in
performance, as the COCONUT paper suggests, I suspect that this technique will
be baked into frontier models &amp;ndash; even at a hit to interpretability.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/footer.webp&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2024/12/14/Chain-of-Continuous-Thoughts/footer.webp&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;&lt;em&gt;Cover &amp;amp; Footer images by Recraft v3&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Lake Union&#39;s Lonely Trolley: SLU Streetcar Ridership</title>
        <link>https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/</link>
        <pubDate>Sat, 12 Oct 2024 00:00:00 -0800</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/</guid>
        <description>&lt;p&gt;I lived in the Eastlake neighborhood of Seattle for several years. Eastlake, by
its name, sits on the east side of Lake Union. As a runner, I spent many
mornings running along the lake, passing by the
&lt;a href=&#34;https://en.wikipedia.org/wiki/South_Lake_Union_Streetcar&#34;&gt;South Lake Union Streetcar&lt;/a&gt;.
Each time I ran past the streetcar, what consistently struck me as odd was that
the streetcars were &lt;em&gt;almost always empty&lt;/em&gt;. I&amp;rsquo;d see maybe one or two people
riding it. I lived within a couple blocks of the streetcar line for years, and
&lt;em&gt;never&lt;/em&gt; rode it a single time.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/slu_streetcar.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/slu_streetcar.png&#34; alt=&#34;SLU Streetcar&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;SLU Streetcar (&lt;a href=&#34;https://commons.wikimedia.org/wiki/File:Seattle_Streetcar_301_leaving_Pacific_Place_Station.jpg&#34;&gt;Wikipedia&lt;/a&gt;)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Out of curiosity, I filed a Freedom of Information Act request for streetcar
ridership data last year. I had all but forgotten about the data I received
back, but was recently reminded when I read reporting that the SLU streetcar
had to close for several weeks due to an electrical issue.&lt;/p&gt;
&lt;h2 id=&#34;ridership-data&#34;&gt;Ridership Data&lt;/h2&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/ridership_over_time.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/ridership_over_time.png&#34; alt=&#34;SLU Streetcar Average Weekly Ridership (2020-2023)&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;SLU Streetcar Average Weekly Ridership (2020-2023)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;First, I plotted the average weekly ridership for the streetcar for the roughly
three years of data the city gave me. The most obvious feature is the dip in
ridership in mid-2020. Ridership crept back up over time. However, even at its
peak in the summer of 2023, ridership was still significantly below its
pre-pandemic high.&lt;/p&gt;
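&lt;p&gt;For the curious, the aggregation behind charts like these is simple. The
sketch below is hypothetical: the shape of the FOIA export (ISO date strings
paired with daily boarding counts) is my assumption, and this isn&amp;rsquo;t the code
behind the actual plots:&lt;/p&gt;

```python
# Hypothetical sketch of the day-of-week aggregation (the export's layout is
# an assumption; this is not the code behind the actual charts).
from collections import defaultdict
from datetime import date
from statistics import mean

def ridership_by_weekday(rows):
    """rows: iterable of (ISO date string, daily boardings) pairs."""
    buckets = defaultdict(list)
    for day, boardings in rows:
        # Group each day's boardings under its weekday name.
        buckets[date.fromisoformat(day).strftime("%A")].append(boardings)
    return {weekday: mean(counts) for weekday, counts in buckets.items()}
```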
&lt;p&gt;This raises the question: Who is this built for? It&amp;rsquo;s not clear if the streetcar
is supposed to be a tourist transport system (à la Seattle Monorail), or for
residents to commute. The day-of-week ridership numbers seem to suggest it &lt;em&gt;is&lt;/em&gt;
more of a commuter line:&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/day_of_week_ridership.png&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2024/10/12/Lake-Unions-Lonely-Trolley-SLU-Streetcar-Ridership/day_of_week_ridership.png&#34; alt=&#34;SLU Average Ridership by Week Day (2020-2023)&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
    &lt;figcaption&gt;&lt;p&gt;SLU Average Ridership by Week Day (2020-2023)&lt;/p&gt;
    &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Even though there&amp;rsquo;s been a &amp;ldquo;return to office&amp;rdquo; push, it doesn&amp;rsquo;t seem like many
people are using the streetcar. There are some big employers nearby, notably
Fred Hutch and Amazon. Anecdotally, I&amp;rsquo;ve heard that Fred Hutch is continuing to
let people work from home unless they absolutely have to be in the office. Maybe
things will change when
&lt;a href=&#34;https://www.seattletimes.com/business/amazon-workers-will-return-to-the-office-five-days-a-week/&#34;&gt;Amazon requires people to come in 5 days a week in 2025&lt;/a&gt;,
but until then, I doubt ridership will match what it was before the pandemic.&lt;/p&gt;
&lt;p&gt;The streetcar isn&amp;rsquo;t cheap to maintain: The Seattle Times reports that it costs
&lt;a href=&#34;https://www.seattletimes.com/seattle-news/transportation/south-lake-union-streetcars-shut-down-for-many-weeks/&#34;&gt;$4.6 million per year&lt;/a&gt;
to operate and maintain. When the streetcar had to close for a couple of weeks
in September, I genuinely thought that the city might just close the line
indefinitely. But weeks later, service restarted. And, as I watched one of its
cars trundle by yesterday at the southern tip of Lake Union, ridership still
appeared low.&lt;/p&gt;
&lt;p&gt;I largely agree with
&lt;a href=&#34;https://www.seattlebikeblog.com/2024/08/20/seattle-decided-9-years-ago-to-kill-the-slu-streetcar/&#34;&gt;this sentiment from the Seattle Bike Blog&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The First Hill line seems to be filling an actual transportation need while
the SLU line does not.&lt;/p&gt;
&lt;p&gt;Keeping the SLU line alive is a classic case of Seattle indecision. It’s
connected to the city’s years of indecision about the downtown streetcar
project, which remains stalled due to a $93 million budget gap. Worse,
indecision like this can be very damaging to a community because streetcar
supporters have reason to keep fighting for it so long as it seems that
there’s still a chance. I don’t blame them because the vision of a
European-style network of streetcars is genuinely appealing and seems like a
vision worth fighting for. But even if the city built the downtown streetcar,
there are no plans whatsoever to expand the network any further. We’d still
just have one oddly-shaped line for the foreseeable future.&lt;/p&gt;
&lt;p&gt;&amp;hellip;&lt;/p&gt;
&lt;p&gt;The streetcar needs to go big or go home, and Seattle has firmly decided not
to go big.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;em&gt;Cover image: Lake Union &amp;amp; Seattle Skyline as viewed from Gas Works Park&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>TaskWarrior</title>
        <link>https://benjamincongdon.me/blog/2024/08/31/TaskWarrior/</link>
        <pubDate>Sat, 31 Aug 2024 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2024/08/31/TaskWarrior/</guid>
        <description>&lt;p&gt;I haven&amp;rsquo;t been writing much recently (&lt;em&gt;sound of crickets coming from this year&amp;rsquo;s
blog archive&lt;/em&gt;), but this is such an OnBrand™ post that I couldn&amp;rsquo;t not write it.
At work, I&amp;rsquo;ve been shifting into more of a TL role, and as such I&amp;rsquo;ve been
tracking an increasingly large number of streams of information. We use JIRA for
bug/feature level work, but a lot of the stuff that I need to track is more
micro-level: Slack threads to respond to, docs to review, reminders to ping
people, etc.&lt;/p&gt;
&lt;p&gt;While I was at Google, and for the first ~year at Databricks, I used my Gmail
inbox primarily as a todo list for micro tasks. I would keep emails unread as
a reminder to respond to them, and would use the snooze feature as a reminder
system. This worked well when my primary interrupts were code reviews and doc
comments. But this system didn&amp;rsquo;t work with Slack, and the toil of maintaining a
sane inbox front-page was taking too much effort.&lt;/p&gt;
&lt;p&gt;Fast forward to a year ago, and I transitioned mostly to using Slack reminders
as my todo list. Slack has a &amp;ldquo;remind me about this&amp;rdquo; feature that was (and still
is) super useful. In retrospect, I think of this now less as a &amp;ldquo;wow, this is a
great feature for productivity&amp;rdquo; and more as a &amp;ldquo;wow, Slack is so poor at
resurfacing old threads that I need to use a reminder system to keep track of
things&amp;rdquo;. But it worked well enough for a while.&lt;/p&gt;
&lt;p&gt;As I started needing to keep track of more, Slack and Gmail both fell over for
me. I fell back to more manual approaches for tracking: first, Apple Notes, then
a Google Doc. Both Apple Notes and Google Docs have native checkboxes, which
made them reasonably nice to use as a todo list. I know some people swear by a
long-running doc/note for tracking work, but I ultimately found it too manual to
keep up with. I&amp;rsquo;d try to have a new section per day, and move uncompleted tasks
to the new day as time went on. But it was too easy to forget to clean up
old/irrelevant tasks, and I&amp;rsquo;d end up with a bunch of stale tasks that I
realistically would never get around to.&lt;/p&gt;
&lt;p&gt;I also made a basic Eisenhower Matrix emoji prefix system for tasks, which was
helpful for prioritizing tasks, but ultimately didn&amp;rsquo;t help as I had to manually
rearrange/filter things to keep what was most important at the top of the list.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Urgent and Important: 🔥&lt;/p&gt;
&lt;p&gt;Not Urgent but Important: 🌟&lt;/p&gt;
&lt;p&gt;Urgent but Not Important: ⚡️&lt;/p&gt;
&lt;p&gt;Not Urgent and Not Important: 💤&lt;/p&gt;
&lt;p&gt;Won&amp;rsquo;t do: ❌&lt;/p&gt;
&lt;p&gt;Top Priority Task: 🥇&lt;/p&gt;
&lt;p&gt;Randomization: 🫨&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And so recently, I switched to a system that I intuitively feel will be
stickier: &lt;a href=&#34;https://taskwarrior.org/&#34;&gt;Taskwarrior&lt;/a&gt; (though, TBD since each of
these systems seems to have a 2-6 month lifecycle before falling over).
Taskwarrior is a CLI-based task tracker that has a lot of features that I&amp;rsquo;ve
been missing in my previous systems:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Prioritization&lt;/strong&gt;: It has a built-in prioritization system, and automatically
sorts tasks by priority. It&amp;rsquo;s more sophisticated than a &amp;ldquo;Low/Medium/High&amp;rdquo;
priority system as well, as the priority is calculated based on a number of
factors (due date, urgency, importance, dependencies, etc). I really appreciate
that I can enter a bunch of tasks, and I get a sanely sorted list with the most
important tasks at the top. I also appreciate that if I work-crastinate on a
non-urgent task, Taskwarrior tells me &amp;ldquo;You have more urgent tasks&amp;rdquo;.&lt;/p&gt;
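&lt;p&gt;As a rough illustration of what &amp;ldquo;calculated priority&amp;rdquo; means here:
Taskwarrior computes an urgency score as a weighted sum over task attributes.
The toy below is my own, with made-up weights; Taskwarrior&amp;rsquo;s real coefficients
are documented and user-configurable:&lt;/p&gt;

```python
# Toy Taskwarrior-style urgency score: a weighted sum of task attributes.
# The weights are invented for illustration, not Taskwarrior's actual values.
from datetime import date

def urgency(task, today):
    score = 0.0
    if task.get("priority") == "H":
        score += 6.0                                # high priority bumps urgency
    if "due" in task:
        days_left = (task["due"] - today).days
        score += max(0.0, 12.0 - days_left)         # nearer due dates score higher
    score += 4.0 * len(task.get("blocks", ()))      # tasks blocking others rise
    return score
```

&lt;p&gt;Sorting tasks by this score, descending, gives the &amp;ldquo;most important at the
top&amp;rdquo; report described above.&lt;/p&gt;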
&lt;p&gt;&lt;strong&gt;Ease of task creation&lt;/strong&gt;: It&amp;rsquo;s super easy to add tasks. I&amp;rsquo;ve aliased &lt;code&gt;task&lt;/code&gt; to
&lt;code&gt;t&lt;/code&gt;, so I can add a task with &lt;code&gt;t add &amp;lt;task&amp;gt;&lt;/code&gt;. I can also add tags, due dates,
etc. inline when creating a task. I always have a terminal window open on my
work laptop, so this ends up (surprisingly) being a lot faster than inputting
something into Google Docs or Apple Notes.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dependency Tracking&lt;/strong&gt;: It has a built-in dependency and &amp;ldquo;waiting&amp;rdquo; system. I
can mark a task as &amp;ldquo;waiting&amp;rdquo; until a particular time, and it won&amp;rsquo;t show up in my
list until that time. Similarly I can mark a task as dependent on another task,
and it won&amp;rsquo;t show up in my list until the dependent task is completed. &amp;ndash; Tasks
that have dependencies automatically also get a bump in priority, which is nice.
Tracking all of this manually would be a huge pain.&lt;/p&gt;
&lt;p&gt;There are a bunch of other features that I&amp;rsquo;ve only dabbled in so far that are
also appealing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Recurring tasks&lt;/li&gt;
&lt;li&gt;Projects / Tags / User-defined attributes&lt;/li&gt;
&lt;li&gt;Ecosystem of related tools.
&lt;a href=&#34;https://bugwarrior.readthedocs.io/en/latest/index.html&#34;&gt;Bugwarrior&lt;/a&gt; looks
particularly interesting, as it can pull in tasks from JIRA, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All this makes me think that Taskwarrior is a good fit for me, at least for
work-related tasks. ~All my non-trivial work is currently done on my work
laptop, so I don&amp;rsquo;t worry about cross-device syncing. (The lack of a friendly
mobile app would be a dealbreaker for using this for personal tasks, though.)&lt;/p&gt;
&lt;p&gt;Ok, enough bikeshedding for now. :)&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>How I Use AI: Mid-2024</title>
        <link>https://benjamincongdon.me/blog/2024/07/21/How-I-Use-AI-Mid-2024/</link>
        <pubDate>Sun, 21 Jul 2024 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2024/07/21/How-I-Use-AI-Mid-2024/</guid>
        <description>&lt;p&gt;I&amp;rsquo;ve been in a mode of trying lots of new AI tools for the past year or two, and
feel like it&amp;rsquo;s useful to take an occasional snapshot of the &amp;ldquo;state of things I
use&amp;rdquo;, as I expect this to continue to change pretty rapidly.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Claude 3.5 Sonnet (via API Console or
&lt;a href=&#34;https://github.com/simonw/llm&#34;&gt;LLM&lt;/a&gt;)&lt;/strong&gt;: I currently find Claude 3.5 Sonnet
to be the most delightful / insightful / poignant model to &amp;ldquo;talk&amp;rdquo; with. It
excels at complex reasoning tasks, especially those that &lt;code&gt;GPT-4&lt;/code&gt; fails at.
For example, I tasked &lt;code&gt;Sonnet&lt;/code&gt; with writing an AST parser for
&lt;a href=&#34;https://github.com/google/jsonnet&#34;&gt;Jsonnet&lt;/a&gt;, and it was able to do so with
minimal additional help. I don&amp;rsquo;t subscribe to Claude&amp;rsquo;s pro tier, so I mostly
use it within the API console or via Simon Willison&amp;rsquo;s excellent
&lt;a href=&#34;https://github.com/simonw/llm&#34;&gt;llm&lt;/a&gt; CLI tool. The
&lt;a href=&#34;https://www.anthropic.com/news/claude-3-5-sonnet&#34;&gt;Artifacts&lt;/a&gt; feature of
Claude web is great as well, and is useful for generating throw-away little
React interfaces.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GPT-4o&lt;/strong&gt;: This is my current most-used general purpose model. The most
powerful use case I have for it is to code moderately complex scripts with
one-shot prompts and some nudges. &lt;code&gt;GPT-4o&lt;/code&gt; seems better than &lt;code&gt;GPT-4&lt;/code&gt; in
receiving feedback and iterating on code. I also use it for general purpose
tasks, such as text extraction, basic knowledge questions, etc. The main
reason I use it so heavily is that the usage limits for &lt;code&gt;GPT-4o&lt;/code&gt; still seem
significantly higher than &lt;code&gt;sonnet-3.5&lt;/code&gt;. And the pro tier of ChatGPT still
feels like essentially &amp;ldquo;unlimited&amp;rdquo; usage.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;GPT macOS App&lt;/strong&gt;: A surprisingly nice quality-of-life improvement over
using the web interface. Having the ability to &lt;code&gt;⌥-Space&lt;/code&gt; into a ChatGPT
session is super handy. I don&amp;rsquo;t use any of the screenshotting features of
the macOS app yet. They&amp;rsquo;re not automated enough for me to find them useful.
If there was a background context-refreshing feature to capture your screen
every time you &lt;code&gt;⌥-Space&lt;/code&gt; into a session, this would be super nice.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Github Copilot&lt;/strong&gt;: I use Copilot at work, and it&amp;rsquo;s become nearly
indispensable. I recently did some offline programming work, and felt myself
at least a 20% disadvantage compared to using Copilot. Copilot has two
components today: code completion and &amp;ldquo;chat&amp;rdquo;. I find the chat to be nearly
useless. It has &amp;ldquo;commands&amp;rdquo; like &lt;code&gt;/fix&lt;/code&gt; and &lt;code&gt;/test&lt;/code&gt; that are cool in theory,
but I&amp;rsquo;ve &lt;em&gt;never&lt;/em&gt; had them work satisfactorily. The chat model Github uses is also
very slow, so I often switch to ChatGPT instead of waiting for the chat
model to respond.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;use-cases&#34;&gt;Use cases&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Docs/Reference replacement&lt;/strong&gt;: I never look at CLI tool docs anymore. LLMs
have memorized them all. Whenever I need to do something nontrivial with git
or unix utils, I just ask the LLM how to do it. I very much &lt;em&gt;could&lt;/em&gt; figure
it out myself if needed, but it&amp;rsquo;s a clear time saver to immediately get a
correctly formatted CLI invocation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Limited Scope Refactorings&lt;/strong&gt;: Copy/pasting a small chunk (&amp;lt;100 lines) of
code or SQL, and asking it to perform some transformation (i.e. &amp;ldquo;Make the
query return weekly data instead of daily data&amp;rdquo;, &amp;ldquo;Change this function to
work with Fizz protos instead of Buzz protos&amp;rdquo;) tends to have a high enough
success rate that it is a time saver.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;General Knowledge Conversations&lt;/strong&gt;: I&amp;rsquo;ve enjoyed using the original ChatGPT
voice chat feature during my commute. It feels like talking with someone who
has read every Wikipedia article ever. As of 2024, the
&lt;a href=&#34;https://openai.com/index/hello-gpt-4o/&#34;&gt;&amp;ldquo;new&amp;rdquo; voice chat&lt;/a&gt; feature powered
by GPT-4o hasn&amp;rsquo;t landed yet, so I don&amp;rsquo;t have any experience with that.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
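A concrete instance of the &amp;ldquo;docs replacement&amp;rdquo; use case above (my own illustrative example, not one from the post): asking for a git invocation I&amp;rsquo;d otherwise dig out of man pages. The repo setup is just scaffolding so the command has something to act on, and &amp;ldquo;load_config&amp;rdquo; is a made-up search term.

```shell
# Illustrative: "which commits mention load_config in their message?"
# Scaffolding: a throwaway repo with one matching commit.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=me@example.com -c user.name=me \
    commit -q --allow-empty -m "add load_config"
# The kind of invocation an LLM hands back, correctly formatted:
git log --oneline --grep="load_config"
```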
&lt;h2 id=&#34;things-i-havent-had-time-to-try&#34;&gt;Things I Haven&amp;rsquo;t Had Time to Try&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://deepmind.google/technologies/gemini/pro/&#34;&gt;Gemini Pro/Advanced&lt;/a&gt;, or
its related tooling like &lt;a href=&#34;https://notebooklm.google/&#34;&gt;NotebookLM&lt;/a&gt;. The
coolest part of the recent Gemini models is their extremely large context
window (2M input tokens). In my limited testing, Gemini seems &amp;ldquo;good&amp;rdquo;, I just
haven&amp;rsquo;t had enough time tinkering with it to see where it exceeds the
capacities of the OpenAI/Anthropic models.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/deepseek-ai/DeepSeek-Coder-V2&#34;&gt;Deepseek Coder V2&lt;/a&gt;: An
extremely powerful open-source model for coding. This one looks pretty great
by the benchmark results Deepseek have posted. However, I tried playing with
the quantized model locally and was disappointed. The full model is rather
expensive to host locally, which has been a barrier. Deepseek also offer the
model via an API (at quite low cost too), which I hope to try eventually.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;resources&#34;&gt;Resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://thezvi.wordpress.com/&#34;&gt;Zvi Mowshowitz&lt;/a&gt;&amp;rsquo;s weekly AI posts are
excellent, and give an extremely verbose AI &amp;ldquo;state of the world&amp;rdquo;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://simonwillison.net/&#34;&gt;Simon Willison&lt;/a&gt;&amp;rsquo;s blog is also an excellent
source for AI news.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.cognitiverevolution.ai/&#34;&gt;The Cognitive Revolution&lt;/a&gt; podcast
hosts some pretty good interviews that I find to be high-signal-to-noise,
and is much less hype-driven than many other AI-centric podcasts I&amp;rsquo;ve
attempted to listen to.&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
        <title>Avoid Load-bearing Shell Scripts</title>
        <link>https://benjamincongdon.me/blog/2023/10/29/Avoid-Load-bearing-Shell-Scripts/</link>
        <pubDate>Sun, 29 Oct 2023 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2023/10/29/Avoid-Load-bearing-Shell-Scripts/</guid>
        <description>&lt;p&gt;I&amp;rsquo;ve recently been contemplating a recurring pattern that I&amp;rsquo;ve observed in
several teams I&amp;rsquo;ve worked on – the &amp;lsquo;Load-Bearing Script.&amp;rsquo; The outline of this
pattern goes like this: A team member writes a portion of a system as a shell
script for a quick prototype. That shell script, initially quite simple, grows
in complexity over time. Eventually, the script grows to an unmanageable level
of complexity. At that point, it needs to be rewritten in a more
maintainable/testable language.&lt;/p&gt;
&lt;p&gt;In my experience, this usually manifests itself as a bash script, though any
untested/untestable &amp;ldquo;script&amp;rdquo; can exhibit this pattern.&lt;/p&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;In one case, we were building a system that needed to execute in a CI builder
environment. We wanted to do some basic CI/CD work, and so the script was
initially a simple wrapper around git and Kubernetes commands. Eventually, much
of our system&amp;rsquo;s core business logic found its way into the script (metrics
collection, a basic killswitch system, retry logic, etc.). This system was
particularly challenging to manage because the script wasn&amp;rsquo;t even static. Our
backend used Go templates to assemble the script dynamically and send it to the
CI environment. Our only testing was sanity checks that our templater produced
sensible output, and limited end-to-end testing.&lt;/p&gt;
&lt;p&gt;In another case, my company had a requirement to run certain workloads (again,
CI/CD type actions) in a specific compute environment. This compute environment
made it super easy to execute bash scripts, but added friction to running
team-built binaries. My team did ship our own binaries to this environment, but
for reasons that retroactively aren&amp;rsquo;t defensible we still allowed business logic
to creep into the script portion.&lt;/p&gt;
&lt;p&gt;In both cases, we did a (fairly risky) rewrite. In both cases, the rewrite
resulted in moderate severity incidents, despite best efforts to do so safely.&lt;/p&gt;
&lt;h2 id=&#34;why-does-this-happen&#34;&gt;Why does this happen?&lt;/h2&gt;
&lt;p&gt;I can think of a number of reasons: Shell scripts are easy to prototype with.
They&amp;rsquo;re an attractive option when you require &amp;lsquo;just a small amount of logic&amp;rsquo; and
wish to avoid the complexities of a build system, types, or tests. Software
developers enjoy the avoidance of over-engineering almost as much as they enjoy
over-engineering.&lt;/p&gt;
&lt;h2 id=&#34;why-is-this-bad&#34;&gt;Why is this bad?&lt;/h2&gt;
&lt;p&gt;The primary reason I distrust load-bearing scripts is that they make systems
unstable. The instability most often comes from the inability (or difficulty in)
adding sufficient test coverage. Yes, there are frameworks for bash script
testing! I&amp;rsquo;ve rarely seen them effectively used. Usually, a load-bearing script
comes into existence &lt;em&gt;because&lt;/em&gt; the work it is doing is difficult to test (for
example, wrapping multiple dependent CLI tools in a CI environment). The
load-bearing script becomes problematic because it becomes difficult to change.
The script&amp;rsquo;s complexity surpasses a point where manual testing or limited
end-to-end tests can prevent issues &amp;ndash; and so, breakages will happen.&lt;/p&gt;
&lt;p&gt;The secondary reason load-bearing scripts are nefarious is that you &lt;em&gt;will&lt;/em&gt;
eventually have to do a rewrite. It becomes inevitable. Either you accept
permanent instability or do the rewrite. The longer you delay the rewrite, the
more painful it is. There will be pushback against the rewrite: the rewritten
script needs to be feature compatible with the old system; the rewritten script
needs to be released safely; rewriting the script will consume valuable
developer time that could be spent working on Shiny New Features. But
eventually, the scales tip towards the rewrite.&lt;/p&gt;
&lt;h2 id=&#34;advice-to-myself&#34;&gt;Advice to myself&lt;/h2&gt;
&lt;p&gt;If your script becomes larger than what&amp;rsquo;d be appropriate to store in a single
reasonably sized function, it should no longer be a script. Prefer to bail early
on the shell script and eat the cost of a simple rewrite, rather than let
technical debt continue to accrue.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Soft Boredom</title>
        <link>https://benjamincongdon.me/blog/2023/10/26/Soft-Boredom/</link>
        <pubDate>Thu, 26 Oct 2023 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2023/10/26/Soft-Boredom/</guid>
        <description>&lt;p&gt;I recently read Pema Chödrön&amp;rsquo;s
&lt;a href=&#34;https://www.goodreads.com/book/show/13414918-living-beautifully&#34;&gt;&lt;em&gt;Living Beautifully&lt;/em&gt;&lt;/a&gt;,
and I was struck by the following passage:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Chögyam Trungpa demonstrated the co-emergent nature of feelings in a teaching
on boredom-on how we feel when nothing&amp;rsquo;s happening. Hot boredom, he said, is a
restless, impatient, I-want-to-get-out of here feeling. But we can also
experience nothing happening as cool boredom, as a care-free, spacious feeling
of being fully present without entertainment &amp;ndash; and being right at home with
that.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I quite like the phrase &amp;ldquo;soft boredom.&amp;rdquo; When younger, I experienced &amp;ldquo;hot
boredom&amp;rdquo; often: when impatiently waiting in the back seat of a car, when waiting
for a class to end, when on a plane without anything to do, when there was
nothing interesting to look forward to. Boredom was so unpleasant that it needed
to be planned around. The anticipation of boredom was itself unpleasant.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve thought to myself over the past several years, &amp;ldquo;I don&amp;rsquo;t really get bored
anymore.&amp;rdquo; This isn&amp;rsquo;t quite true; there are still times that my mind is empty and
searching for something to busy itself with, but the experience is quite
different. With &amp;ldquo;hot boredom,&amp;rdquo; the quality of feeling is distinctly negative.
With &amp;ldquo;soft boredom&amp;rdquo;, it&amp;rsquo;s &amp;ndash; as Trungpa says &amp;ndash; a more &amp;ldquo;care-free&amp;rdquo; experience. A
thought may float in my head that I wish to engage with, or not! And either is
fine.&lt;/p&gt;
&lt;p&gt;Along the spectrum of &amp;ldquo;hot boredom&amp;rdquo; and &amp;ldquo;soft boredom&amp;rdquo;, I believe there&amp;rsquo;s also
an identifiable &amp;ldquo;lukewarm boredom&amp;rdquo; which expresses itself as &amp;ldquo;always having
something to think about&amp;rdquo;. For me, there was a transitionary time when &amp;ldquo;hot
boredom&amp;rdquo; was no longer present in the absence of an engaging activity, but only
because I could always think myself into being engaged. With &amp;ldquo;lukewarm boredom,&amp;rdquo;
you don&amp;rsquo;t get the restless &lt;em&gt;I-need-to-be-doing-something feeling&lt;/em&gt;, but your mind
still needs to be continuously active.&lt;/p&gt;
&lt;p&gt;Now, I feel &amp;ldquo;soft boredom&amp;rdquo; with a relaxed mind. An hour can float by without
serious thought or restless &lt;em&gt;what-happens-next&lt;/em&gt;-ing. It&amp;rsquo;s quite pleasant.&lt;/p&gt;
&lt;p&gt;In any case, I&amp;rsquo;m writing this on a plane after having taken a week off work. I
spent the week hiking in Arizona. As I drove through the desert, hiked
(partially) into the Grand Canyon, and watched sunsets amongst Arizona&amp;rsquo;s sparse
flora, I quite appreciated being able to fall back into the spaciousness of
&amp;ldquo;soft boredom&amp;rdquo;.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>Mental Models: Slack</title>
        <link>https://benjamincongdon.me/blog/2023/06/20/Mental-Models-Slack/</link>
        <pubDate>Tue, 20 Jun 2023 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2023/06/20/Mental-Models-Slack/</guid>
        <description>&lt;p&gt;Two of my all-time favorite articles about managing one&amp;rsquo;s energy and time relate
to the notion of maintaining &amp;ldquo;Slack&amp;rdquo; in one&amp;rsquo;s life. The first,
&lt;a href=&#34;https://thezvi.wordpress.com/2017/09/30/slack/&#34;&gt;Slack&lt;/a&gt;, by Zvi Mowshowitz,
directly describes the Slack concept that I refer to in this post. The second,
&lt;a href=&#34;http://benjaminrosshoffman.com/sabbath-hard-and-go-home/&#34;&gt;Sabbath hard and go home&lt;/a&gt;,
expands on this notion in the context of the author&amp;rsquo;s Jewish upbringing.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve been wanting to write about this concept for a while, but (ironically)
haven&amp;rsquo;t ever found the time to do so.&lt;/p&gt;
&lt;p&gt;Slack (proper noun) is your buffer. It&amp;rsquo;s your buffer of mental energy, physical
energy, and time. It&amp;rsquo;s the ability to get sick for a day or two without
significant interruption to one&amp;rsquo;s commitments. Slack means you can have an off
day without missing an important deadline. Slack allows you to explore
something you&amp;rsquo;re curious about, without worrying about wasting time. It&amp;rsquo;s writing a
blog post about Slack, when there are assuredly more &amp;ldquo;valuable&amp;rdquo; things one could
do with one&amp;rsquo;s time.&lt;/p&gt;
&lt;p&gt;In the &lt;a href=&#34;https://en.wikipedia.org/wiki/Stock_and_flow&#34;&gt;Stock and Flow&lt;/a&gt; model of
systems, Slack is a Stock, a quantity that can be built up and depleted. It&amp;rsquo;s
significantly easier to deplete one&amp;rsquo;s Stock of Slack than increase it. Depleting
Slack is easy: Unforeseen circumstances, the inevitable chaos in life, and
cultural expectations around business and work ethic all make burning through
one&amp;rsquo;s buffer the default outcome. Retaining and rebuilding Slack take purposeful
effort.&lt;/p&gt;
&lt;h2 id=&#34;maintaining-slack&#34;&gt;Maintaining Slack&lt;/h2&gt;
&lt;p&gt;Maintaining a buffer of Slack requires active effort, especially for people who
like to stay busy. I try to use my time well &amp;ndash; both at work, and in my personal
life &amp;ndash; and so I have the tendency to commit to things such that my schedule is
&amp;ldquo;full&amp;rdquo;. This is manageable when you&amp;rsquo;re in complete control over your schedule,
but as soon as exterior forces exert their influence on your life, you quickly
burn through your Slack. So, in one way, Slack is purposefully &lt;em&gt;undercommitting&lt;/em&gt;
yourself. Zvi defines Slack as &amp;ldquo;The absence of binding constraints on behavior&amp;rdquo;,
and so in this way, choosing which constraints you allow to be placed on your
time is critically important to maintaining a buffer.&lt;/p&gt;
&lt;p&gt;Slack also requires handling commitments wisely. Tasks with hard deadlines
should be started sooner than necessary, to have buffer time built-in.
Unimportant tasks should be deferred or delegated to minimize time spent
unnecessarily.&lt;/p&gt;
&lt;p&gt;Having ample Slack needs to be the default case for it to be useful. If you
sometimes have Slack, but often don&amp;rsquo;t, you don&amp;rsquo;t get the benefits. The &amp;ldquo;badness&amp;rdquo;
of stress quickly outpaces the &amp;ldquo;goodness&amp;rdquo; of flexibility. One stressful day or
week looms larger than days and weeks without undue stress. Maintaining a buffer
should be one&amp;rsquo;s standard stance.&lt;/p&gt;
&lt;h2 id=&#34;failure-modes&#34;&gt;Failure Modes&lt;/h2&gt;
&lt;p&gt;Functioning without Slack is like that feeling of always being &amp;ldquo;one bad event&amp;rdquo;
away from letting something slip. It&amp;rsquo;s a precarious feeling! Living this way for
too long leads to burnout or, at best, fatigue. Lacking Slack results in
stressfully working to meet deadlines, dropping commitments, and always being
anxious about &amp;ldquo;what&amp;rsquo;ll go wrong next&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;One feature I&amp;rsquo;ve noticed about Slack is that it tends to be global, or
&lt;a href=&#34;https://drmaciver.substack.com/p/life-complete-problems&#34;&gt;&amp;ldquo;life complete&amp;rdquo;&lt;/a&gt;. One
doesn&amp;rsquo;t have work Slack and personal Slack, as separate quantities. Everything
ultimately comes from the same energy and time budget.&lt;/p&gt;
&lt;p&gt;That being said, Slack is &lt;em&gt;meant to be used&lt;/em&gt;. The optimal amount of burnout is
greater than zero. Having Slack, but not using it to pursue worthy goals is a
waste. Optimally, one should never get to a place of having &lt;em&gt;no&lt;/em&gt; Slack. But this
is challenging, as it&amp;rsquo;s hard to gauge how much Slack one actually has. Occasionally
overshooting into having too little Slack is OK, as long as you notice this
quickly, and work to reestablish that buffer.&lt;/p&gt;
&lt;h2 id=&#34;reestablishing-slack&#34;&gt;Reestablishing Slack&lt;/h2&gt;
&lt;p&gt;The longer you are without Slack, the harder it is to bring it back. If I find
I&amp;rsquo;m merely running slightly low on Slack, slowing down for a week or two tends to
be enough to get back to baseline.&lt;/p&gt;
&lt;p&gt;The more dangerous situation is when you get into a longer-term Slackless &lt;em&gt;rut&lt;/em&gt;.
Getting out of a &lt;em&gt;rut&lt;/em&gt; is particularly challenging because escaping it requires
the very thing you&amp;rsquo;ve run out of: Slack is what gives you the breathing room to
think dynamically.
Burnout decreases executive function, and so making the necessary changes to
one&amp;rsquo;s routine to get out of the rut is exactly what&amp;rsquo;s most challenging to do.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t have a great answer for how to get out of ruts. I&amp;rsquo;ve found the most
reliable way to get out of a rut is to be pushed out by external circumstances.
It&amp;rsquo;s especially helpful to have people in your life who realize you&amp;rsquo;re in one,
and/or can help you climb out of one.&lt;/p&gt;
&lt;p&gt;In either case, reestablishing Slack requires redirecting your time and energy.
Intentionally &lt;em&gt;do less&lt;/em&gt; to build back up a buffer.&lt;/p&gt;
</description>
    </item>
    
    <item>
        <title>The Soul of an Old Machine</title>
        <link>https://benjamincongdon.me/blog/2023/04/15/The-Soul-of-an-Old-Machine/</link>
        <pubDate>Sat, 15 Apr 2023 00:00:00 -0700</pubDate>
        <author>Ben Congdon</author>
        <guid>https://benjamincongdon.me/blog/2023/04/15/The-Soul-of-an-Old-Machine/</guid>
        <description>&lt;p&gt;I recently got an M2 MacBook Air to replace my 2014 MacBook Pro. Apple offered
to recycle my old machine (and give me a token $90 off my new laptop as a
trade-in), which I gladly opted-in to.&lt;/p&gt;
&lt;p&gt;However, when it came time to actually wipe my old laptop and trade it in, I
couldn&amp;rsquo;t help but get a little sentimental about it. I&amp;rsquo;ve used this laptop for
nearly a decade &amp;ndash; and it was a (perhaps &lt;em&gt;the&lt;/em&gt;) formative decade of my life. I
did all of my college work on this laptop, studied computer science, wrote
essentially all of the posts on this blog (from its inception until ~2021),
traveled internationally with it, took it across several moves, used it to
secure my first job post-college, et cetera, et cetera.&lt;/p&gt;

&lt;figure&gt;
    &lt;a href=&#34;https://benjamincongdon.me/blog/2023/04/15/The-Soul-of-an-Old-Machine/macbook_pro.jpg&#34;&gt;
        &lt;img
            src=&#34;https://benjamincongdon.me/blog/2023/04/15/The-Soul-of-an-Old-Machine/macbook_pro.jpg&#34;
            loading=&#34;lazy&#34;
            decoding=&#34;async&#34;
        &gt;
    &lt;/a&gt;
&lt;/figure&gt;

&lt;p&gt;It was, and is, a great machine. If not for its woefully aged processor and
now-insufficient memory, I&amp;rsquo;d happily keep using it. And I did keep using it,
well past its point of obsolescence. I&amp;rsquo;ve used it for years in both laptop and
&amp;ldquo;clamshell&amp;rdquo; mode, and the only nontrivial issue I had with it was a swollen
battery (which I was able to fix for a reasonable price). But nearly everything
else &amp;ndash; the keyboard, the port selection, the screen &amp;ndash; was basically my
favorite hardware that Apple has yet released.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s not to say there weren&amp;rsquo;t some frustrating aspects of it &amp;ndash; the worst
being that it only had 256GB of internal storage, so I had to juggle external
storage for its entire lifetime. I usually kept an additional 256GB SD card in
it, using a micro SD card and an adapter that kept it flush with the port. Of
course, this wasn&amp;rsquo;t the best solution, as SD cards aren&amp;rsquo;t really meant for this
type of access pattern. But it worked!&lt;/p&gt;
&lt;p&gt;Unfortunately, a couple months ago I realized my MBP was struggling to manage
even a single Chrome tab, and noticed that it just &lt;em&gt;was not pleasant&lt;/em&gt; to use
this machine anymore. I never checked to see if I &lt;em&gt;could&lt;/em&gt; update its macOS
version past the Mojave that I parked it on, but I wouldn&amp;rsquo;t trust any of the
more recent releases to run well on it. Also, as more tools are optimized for
M1+, Intel macs just aren&amp;rsquo;t long for this world.&lt;/p&gt;
&lt;p&gt;I strongly considered keeping the old MBP as a memento of this now-closed
chapter of my life, but ultimately decided on recycling it. One less piece of
old hardware sitting around, and hopefully Apple actually is able to salvage
some materials from it.&lt;/p&gt;
&lt;p&gt;I have a fairly strong tendency to become attached to the tools I use over time.
One practice that tends to work for &amp;ldquo;releasing&amp;rdquo; a sentimental object is taking a
picture of it, being intentional about what value it brought me, and
allowing it to go (in a fairly
&lt;a href=&#34;https://en.wikipedia.org/wiki/Marie_Kondo&#34;&gt;Kondo-esque&lt;/a&gt; fashion).&lt;/p&gt;
&lt;p&gt;Well. So long and farewell, to my 2014 MacBook Pro. Thanks for your many years
of stable service. 👋🙏&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Cover: Snowshoeing @ Mt. Rainier, March 2023&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
