Salvatore Sanfilippo, aka antirez, the creator of Redis, posted a piece on his reflections on AI for the end of 2025. He ends with the statement:
The fundamental challenge in AI for the next 20 years is avoiding extinction.
I’ve been hesitant to say this publicly, but I broadly agree with this statement. Here are a few related statements I endorse:
- Building intelligences smarter than humans is dangerous.
- Aligning a smarter-than-human intelligence to human values is an open and unsolved problem.[1]
- By default, market forces will cause us to underinvest in AI safety and underenforce pre- and post-deployment safety measures.
- Lacking some form of external governance, market forces will encourage “arms race dynamics” between frontier AI labs which sideline whatever safety commitments have been made.
- Even if we are able to create and align a smarter-than-human intelligence, there are unresolved long-term concerns around gradual disempowerment[2] and mass unemployment that should give us serious pause about continuing down our current path.
An Inconvenient Truth
In 2006, Al Gore’s “An Inconvenient Truth” became a focal point for concerns about climate change. We have yet to have an Inconvenient Truth moment for AI existential risk. AI 2027 and If Anyone Builds It, Everyone Dies both came close this year, but neither seemed to spark a broad reaction like Inconvenient Truth or other related climate change movements did.
I think the core framing of An Inconvenient Truth roughly applies to risks from AI. For both climate change and AI x-risk, the immediate impacts feel banal and easy to ignore. The projections based on existing long-term trends, on the other hand, are scary and deserve attention. We are rather poor at allocating attention to long-term risks that would be more easily preventable with short-term governance action. The same tension applies here: it’s costly to put mental energy into considering “doom” from an abstract risk. The risk won’t materialize within the next few years. There remains legitimate uncertainty over just how bad it actually is. And so it gets deprioritized.
AI Risks
The risks of AI exist on a spectrum from “benign and mundane” to “catastrophic and existential”. As you move more towards the catastrophic end of this spectrum, the scenarios one needs to consider get progressively “weirder” and require more extrapolation to recognize as probable.
Mundane risks are things that are already showing up today. Some examples:
- LLMs engaging in overt sycophancy, and/or encouraging delusions in people with existing mental illness.[3]
- Using AI to increase the speed and sophistication of existing modes of cybercrime.[4]
Foreseeable risks include things that are not actively problems today, but are on trend to become problems soon:
- Significant job losses in knowledge work sectors, leading to broad societal disruption.
- Concentration of power in a small number of AI companies.
- Autonomous AI systems being deployed in safety-critical or high-stakes domains (e.g. military, financial markets, critical infrastructure) before we have robustly solved out-of-distribution alignment.
Catastrophic risks include things that are, hopefully, fairly far off but would be very bad for humanity:
- Recursive self-improvement of AI, and/or full automation of AI research and development leading to a widening asymmetry between AI capabilities and humanity’s collective understanding of how AI systems work. This could result in an AI that far outmatches us strategically, with goals that are unaligned with human flourishing.
- Deliberate misuse of AI by a state-level actor to, for example: design novel bioweapons, coordinate attacks on critical infrastructure, develop novel catastrophic technologies that humanity would otherwise have taken much longer to develop (e.g. mirror life).
- Loss of meaningful human control over governance[5] and economic[6] systems.
Before It’s Obvious
I’d suggest that at the end of 2025, we’re still in a bit of an awkward place in the dialogue about AI risk. I think people are right to remain skeptical about how big of a deal this is, as it does require a pretty large leap to go from “ChatGPT” to “Catastrophic Risks”. It required an even larger leap in the pre-ChatGPT era, which is why the types of folks who have been raising the alarm about AI risk (e.g. MIRI) are, respectfully, necessarily a bit “weird”.
For the past couple years, I’ve often felt this gut-sinking feeling akin to late December 2019 through late February 2020. I remember the cognitive dissonance of visiting Twitter/X and seeing people whose thinking I respect saying “you all should be freaking out, this novel pathogen is serious business” while more of the mainstream information ecosystem either unknowingly ignored these arguments or knowingly suppressed them as “misinformation”.
I’m writing this post to do my small part to push my information ecosystem the other way. A small stone on the cairn of “yes, we should be concerned about this”.
Psychological Defenses
I’ve noticed an interesting set of reactions, both thinking to myself about this problem and in talking to others about it.
Goalpost Moving:
- What it looks like: “AI can’t do X, therefore we have nothing to worry about. One year later, AI can do X. But it still can’t do Y, so we’re good.”
- Discussion: Look, tools like Claude Code are, by some reasonable definitions, essentially proto-AGIs. If I somehow got access to Claude Code in 2015, I expect I’d concede that we’d reached AGI. And yet, all frontier models still currently have significant weaknesses. I’d really appreciate convincing evidence that progress is materially slowing, but I’ve yet to see it. The progress in 2025 alone has been staggering. “X is not possible today” is not an argument that it won’t be possible in 10 years.
Argument from Inconvenience:
- What it looks like: “It would be deeply inconvenient and weird if powerful AI was dangerous, so I will proceed as if it cannot exist.”
- Discussion: Yeah, it is inconvenient and weird. I sympathize with this argument a lot. I don’t really want my industry to be disrupted, but it demonstrably already is undergoing this. No one wants the “bad” that comes along with AI progress, and many people don’t want the “good”. There is a legitimate question around “is this type of progress inevitable or chosen?” There isn’t a binary answer to this, and there is still optionality for us collectively to decide that dangerous capability progress isn’t inevitable. However, “inevitability” does appear to be the default path.
There Are Adults in the Room:
- What it looks like: “There will always be someone using the AI or in control of the AI, so cooler heads will prevail. No one wants doom.”
- Discussion: Unfortunately, this expects a lot of discretion from people. We are training AI systems to be more agentic on long-horizon tasks for the explicit purpose of handing over control of long-horizon tasks to them. First, people empirically do cede control to automation for convenience, efficiency, and competitiveness reasons. Second, the risk of loss of control gets much scarier as AI systems become more intelligent and autonomous. I do not think loss of control is a particularly scary risk today, but projected forward a decade, it becomes more concerning.
The last major source of resistance I’ve encountered is the lack of a realistic, non-sci-fi-sounding scenario for how a catastrophe would occur. Both AI 2027 and If Anyone Builds It, Everyone Dies offer expanded scenarios about how a catastrophic risk could occur, while giving the disclaimer that this specific scenario is not particularly likely. In contrast, risks like Mutually Assured Destruction from nuclear warheads are much easier to articulate: (1) “something happens” and the {USA, USSR} launches a first strike on the {USSR, USA}, (2) the recipient of a first strike launches a guaranteed counterstrike, (3) many millions die in the direct and indirect aftermath. AI risks rely more on intuition pumps and extrapolation, for now, which are much harder to share as an elevator pitch.
The best I’ve heard, so far, is just that “building a smarter-than-human intelligence is dangerous”. Fortunately, there are now also many good explainers and resources for various levels of technical depth.
Seeing Like a Cat
The core intuition, that smarter-than-human intelligence is dangerous to humans, is surprisingly hard to convey. Let me try an analogy.
I recently adopted two very cute 6-month-old kittens from my local humane society. They were likely stray/feral cats earlier in life, and so they’re skittish around humans. As they’ve been adapting to living in my house, I’ve given them space and let them hide wherever they want.
In some sense, my new kittens aren’t “entirely” less intelligent than me. They can find hidey-holes in my house that I had no knowledge of, they can outrun me, and their sense of hearing makes them well equipped to avoid me. And yet, I am in some important sense more intelligent than them.
If I needed to, say, scoop them up for a vet visit I could outsmart them. I could block their hiding spots, strategically shunt them into a room with no other exits, lure them with food or toys, and eventually catch them. This is not obvious to them. From their perspective, I’m just a friendly person who gives them food, plays with them, and otherwise leaves them alone.[7]
The obvious analogy: consider that a superintelligent AI is to humanity what I am to my kittens. Something that can out-strategize you without you even realizing you’ve been backed into the corner of a room with no doors. This analogy applies in two ways:
First: We might end up in a fairly benign world where humans are treated the same way as cats – cats are cute, they are generally treated well, and to the extent that they are out-strategized by humans, it is largely for their own good. This still doesn’t imply that you would want to be disempowered in this way if you, say, have preferences about the future.
Second: you may say, “intelligence isn’t some raw scalar quantity that you can just increase linearly”. And I would agree. I’m not sure what it would mean to, say, have an agent with a “300 IQ”. Like, concretely, an agent capable of getting a 300 on an IQ test. That seems meaningless, I agree.
I can, however, imagine an agent with jagged intelligence which has the ability to autonomously query and integrate information from hundreds (or thousands) of realtime sources[8], effortlessly hack into key infrastructure systems while evading detection[9], subtly influence humans into various opinions or plans of action that they wouldn’t otherwise adopt[10], and so on. Obviously these capabilities do not exist in a single unified system today. AI agents will also have structural advantages – for example, the ability to copy themselves, work in parallel, and “think” much faster than humans.
What To Do
I have high certainty that “AI risk is worth taking seriously” and moderate-to-high certainty that “it is worth taking actions now to reduce the likelihood of these risks”.
I am less certain on which actions exactly would be net beneficial, and how much of a current-day cost we should be willing to pay to mitigate future risks. Fortunately, there are many groups[11] taking these questions seriously and proposing frameworks. This is laudable and should be financially supported.
At a policy level, I’m becoming increasingly convinced that we need some sort of governance structure to enable transparency and accountability for safety measures, as well as to constrain the arms race dynamics between frontier labs. This is not an easy problem, and will likely require international coordination. I have moderate credence that a “pause AI” plan, enacted today, would not be net helpful as we are still in the domain of being able to get mostly net positive mundane utility from further progress. However, I think coordination around a future mechanism to enforce a pause would be wise.
The inconvenient truth for AI is that we are racing forward faster than our ability to ensure safe development. Even with today’s AI capabilities, we have a lot of priced-in “weirdness” for societal impacts. It is worth considering this future directly, even in the face of significant uncertainty.
Obligatory: These are my own opinions and do not reflect those of my employer, etc.
Cover image by Recraft v3.
1. See e.g. Evan Hubinger’s Alignment remains a hard, unsolved problem.
2. See e.g. Gradual Disempowerment.
3. See e.g. “Sycophancy in GPT-4o: what happened and what we’re doing about it” (OpenAI)
4. See e.g. “Disrupting the first reported AI-orchestrated cyber espionage campaign” (Anthropic)
5. For example, much of the bureaucratic complexity of governing could be turned over to AI systems for global competitiveness and efficiency reasons. This could be a short-term boon for states which adopt this, but eventually result in such a complex governance structure that wresting control back from the AI becomes impossible.
6. For example, individual firms increasingly turn over control of investment to AI for competitive reasons. This would likely be a short-term boon for the individual firms that choose to do so, and would select against firms that maintain a “no AI” stance. However, in the long term, once the markets are saturated by AI-controlled firms, the economy could become increasingly misaligned with human needs and preferences. From Gradual Disempowerment: “[H]umans might lose the ability to meaningfully participate in economic decision-making at any level. Financial markets might move too quickly for human participants to engage with them, and the complexity of AI-driven economic systems might exceed human comprehension, rendering it impossible for humans to make informed economic decisions or effectively regulate economic activity.”
7. To be clear, that’s what I want! Rehabilitating skittish cats is a long game. Both of them are already warming up to me, so my strategy appears to be working.
8. The dumb “today” version of this is Deep Research agents.
9. The dumb “today” version of this is GPT 5.2 and Opus 4.5, both of which are nearing human-level performance on cybersecurity capture-the-flag benchmarks.
10. The dumb “today” version of this is the so-called “LLM psychosis” effect with sycophantic models like GPT 4o.
11. A short list: IAPS, Center for AI Safety, Palisade Research, AIPI.