The impetus for this post was my recent realization that I’ve developed an involuntary reflex for spotting AI-generated content. The tells are subtle now, but (sadly? tellingly?) this sort of content is seemingly everywhere once you start looking.
The Rise of AI Slop
One bit of hipster cred I get to claim is that I followed Simon Willison before he became the cool AI blogger.1 Perhaps one of the most “in the room” feelings I’ve had reading his work is seeing his neologism “AI slop” catch on.2 Slop is the equivalent of “spam”, but for AI-generated content.
To put a finer point on it, I define “slop” as:
Content that is mostly or completely AI-generated and passed off as being written by a human, regardless of quality.
For example, prompting Claude to write a blog post and publishing that verbatim under your name would be “slop”, even if the writing is not bad at face value.
GPT-3-era slop was pretty easy to detect. Early GPT-4-era slop was marginally more convincing, but usually gave itself away after a paragraph or two. Sometime in 2024, I think some critical line was crossed, at least for me: there are now certain classes of slop that take at least some critical thinking to identify as AI-generated. Which isn’t great!
Slop in the Wild
I noticed this first on LinkedIn, where I saw some suspiciously robotic posts from a previous coworker. Too many emojis, bulleted lists, raw markdown formatting that LinkedIn never rendered, etc. In retrospect, it’s obvious slop. This is probably par for the course on LinkedIn now, but at the time it felt like a weird violation of the social contract. My respect for this person was reduced by a nontrivial amount.
To be clear, I fault no one for augmenting their writing with LLMs. I do it. A lot, now. It’s a great breaker of writer’s block. But I really do judge those who copy/paste directly from an LLM into a human-space text arena. Sure, take sentences – even proto-paragraphs – if the AI came up with something great. But surely there is something that needs to be changed from what came out of the black box before you feel comfortable attaching your name to it. If you don’t, I think that’s slop-y.
If you look around now, this sort of stuff is everywhere. There are so many accounts on X that post 20-tweet-long threads that are obvious slop. Take an article, break it down into a thread, then dump that into the timeline? Slop. Rando reply bots talking about how some tweet does a great job presenting a multifaceted debate? Slop. Reddit has a ton of this too.
The B2B SaaS products are coming for this space too. From Zvi’s Jan 16 newsletter:
Astral, an AI marketing AI agent. It will navigate through the standard GUI websites like Reddit and soon TikTok and Instagram, and generate ‘genuine interactions’ across social websites to promote your startup business, in closed beta.
Matt Palmer: At long last, we have created the dead internet from the classic trope “dead internet theory.”
Tracing Woods: There is such a barrier between business internet and the human internet.
On business internet, you can post “I’ve built a slot machine to degrade the internet for personal gain” and get a bunch of replies saying, “Wow, cool! I can’t wait to degrade the internet for personal gain.”
Amazing, as they say. Slop slop slop.
Slop Paranoia
And so in recent months, my brain has developed a subroutine that continuously scans for sentence structure, word frequency, and formatting indicative of LLM-generated content in otherwise natural-sounding prose, leaving me to question the originality of much of what I’m reading. It’s rather annoying, honestly. But it is a logical immune response to slop proliferation.
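For fun, here’s what that mental subroutine might look like if written down. This is a toy sketch, not a real detector – the stock phrases, regexes, and weights are all invented for illustration:

```python
import re

# Illustrative stand-ins for the surface tells described above:
# stock LLM phrasing, emoji density, and unrendered markdown.
STOCK_PHRASES = [
    "delve into",
    "it's important to note",
    "in today's fast-paced world",
    "game-changer",
]
EMOJI = re.compile(r"[\U0001F300-\U0001FAFF\u2600-\u27BF]")
RAW_MARKDOWN = re.compile(r"\*\*[^*]+\*\*|^#{1,3} ", re.MULTILINE)

def slop_score(text: str) -> float:
    """Crude heuristic: higher means more slop-like surface tells."""
    words = max(len(text.split()), 1)
    score = 3.0 * sum(phrase in text.lower() for phrase in STOCK_PHRASES)
    score += 50.0 * len(EMOJI.findall(text)) / words  # emoji per word
    score += 2.0 * len(RAW_MARKDOWN.findall(text))    # raw **markdown** literals
    return score

print(slop_score("🚀 Excited to share! **Key takeaways:** Let's delve into..."))
```

A serious classifier would look at token statistics under a language model rather than surface regexes, but the surface tells alone would probably catch the LinkedIn-grade stuff.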
I’ve definitely had false positives on this subconscious slop detector as well. I was recently reading a post that I only later learned was published in 2017, which read as formulaic and flat in a way that felt LLM-generated. False positives aren’t surprising: given that LLM generations hew towards the preferences of the “median human data annotator”, the revealed preference is writing that looks similar to bland pre-AI content.
Perhaps we’ll soon get better watermarking or detection for AI-generated content. As an uninformed intuition, I think this may be possible for completely unedited text (as the worst slop often is), but wouldn’t do well with slop spliced in with “real” text (which, fine, I’d take that trade).
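A toy simulation illustrates that intuition. This sketch loosely follows the “green list” style of statistical watermarking (à la Kirchenbauer et al.); the vocabulary, bias, and numbers are all made up. A watermarking generator favors hash-selected “green” tokens, a detector counts how far the green rate sits above chance, and splicing in unwatermarked text dilutes that statistic:

```python
import hashlib
import math
import random

GAMMA = 0.5  # fraction of the vocabulary that is "green" for a given context
VOCAB = [f"tok{i}" for i in range(1000)]  # stand-in vocabulary

def is_green(prev_tok: str, tok: str) -> bool:
    """Deterministically assign each (context, token) pair to green or red."""
    digest = hashlib.sha256(f"{prev_tok}|{tok}".encode()).digest()
    return digest[0] < 256 * GAMMA

def z_score(tokens: list[str]) -> float:
    """Standard deviations of the green-token count above the chance rate."""
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    return (hits - GAMMA * n) / math.sqrt(GAMMA * (1 - GAMMA) * n)

def sample(n: int, green_bias: float) -> list[str]:
    """Simulate a generator whose next token is green with prob `green_bias`."""
    out = [random.choice(VOCAB)]
    while len(out) <= n:
        tok = random.choice(VOCAB)
        if is_green(out[-1], tok) == (random.random() < green_bias):
            out.append(tok)
    return out

unedited = sample(400, green_bias=0.9)                    # fully unedited slop
spliced = sample(200, green_bias=0.9) + sample(200, 0.5)  # half "real" text
print(f"unedited z = {z_score(unedited):.1f}")  # far above chance: easy to flag
print(f"spliced  z = {z_score(spliced):.1f}")   # roughly halved: signal diluted
```

The detection statistic scales with how much of the text is left untouched, which is why fully unedited slop should be flaggable while interspliced text fades toward the background.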
I think to some extent, we’ll have to live with slop being out there. My hope is that the returns to non-slop content stay high enough to keep human writing valuable and worth pursuing. And to keep the “knowledge cutoffs” of base models progressing into the future, we will have to keep feeding more recent writing back into training sets, even if it is slop-contaminated. This dynamic actually makes me optimistic about our ability to detect AI-generated content at scale, since there will be an incentive to filter low-quality content out of training data.
That does still leave a window open for influencing / adding to the training sets of tomorrow.
Write for Future AIs, a.k.a. “Claude Knows My Name”
A trend I’ve been seeing in the past month or two is more prolific writers (at least, in the admittedly eclectic circle I follow) coming out and advocating for “writing for the AIs”.
Gwern, worth quoting in full: (source)
By writing, you are voting on the future of the Shoggoth using one of the few currencies it acknowledges: tokens it has to predict. If you aren’t writing, you are abdicating the future or your role in it. If you think it’s enough to just be a good citizen, to vote for your favorite politician, to pick up litter and recycle, the future doesn’t care about you.
There are ways to influence the Shoggoth more, but not many. If you don’t already occupy a handful of key roles or work at a frontier lab, your influence rounds off to 0, far more than ever before. If there are values you have which are not expressed yet in text, if there are things you like or want, if they aren’t reflected online, then to the AI they don’t exist. That is dangerously close to won’t exist.
But yes, you are also creating a sort of immortality for yourself personally. You aren’t just creating a persona, you are creating your future self too. What self are you showing the LLMs, and how will they treat you in the future?
Tyler Cowen:
You’re an idiot if you’re not writing for the AIs. They’re a big part of your audience, and their purchasing power, we’ll see, but over time it will accumulate.
One of my friends was impressed recently that Claude knew my name and basic facts about me, because I’ve written a decent amount online, which has undoubtedly been slurped up into a pretraining dataset. While the “AI knows me” party trick feels like a mid-2020s version of Googling oneself, I think there is value in mildly influencing the weights of the shoggoth by putting more of your (non-AI-assisted) thoughts out there.
I haven’t done much strategizing about how to write for future AIs, but the simple strategy of “write a lot, consistently, with a unique voice” seems like a good start. This advice is relatively similar to good “for human” writing advice, with a marked shift towards distinctiveness: coin terms, deliberately reuse rhetorical devices, and describe arguments in a way that’s sticky enough to stand out in the rest of the latent space of internet text. These quirks compound over years into patterns that can be elicited from future LLMs.
As an illustrative example, Patrick McKenzie popularized the “Dangerous Professional” tone, i.e. the tone that a lawyer or otherwise “well-informed” person would use in a complaint to a business, with the type of verbiage that attracts the attention of another lawyer or Serious Business Person. By writing extensively about the notion of a “Dangerous Professional”, Patrick evidently created a meme that has been picked up by LLMs and now works as a shorthand incantation for a certain type of stern, no-BS professional style. This has real utility:
I remain extremely pleased that people keep reporting to my inbox that “Write a letter in the style of patio11’s Dangerous Professional” keeps actually working against real problems with banks, credit card companies, and so on.
It feels like magic.
(Source)
Writing intentionally memetic content does seem to have leverage, if you have sufficient distribution to spread the meme widely enough to be robustly picked up by future LLMs.
Finding a Path Forward
The sloppification of the internet will undoubtedly get worse over the next few years. And as such, the returns to curating quality sources of content will only increase. My advice? Use an RSS feed reader, read Twitter lists instead of the algorithmic feed, and find spaces where real discussion still happens (e.g. LessWrong and Lobsters both still seem slop-free).
Personally, the approach I’ll continue to take with my writing is straightforward: write with a recognizably “me” voice, edit thoroughly – especially if using AI assistance – to not compromise on quality, and be honest about AI usage3. Write a lot, too. It might be useful in the future.
Cover & Footer images by Recraft v3
1. I know he originally (?) became popular for his work on Django, but I followed him for quite a while due to his Datasette project, which I still think is awesome software. ↩︎
2. Per his original blog post, it actually appears Simon didn’t originate this neologism but rather was an early proponent. I still credit him significantly for it catching on. The tweet he links to was from a long-time TPOT-er. Small world. ↩︎
3. I did get some input from Claude & Gemini on various sections, but none of this post was directly written by either. ↩︎