I Wrote This
My intelligence is natural and my language model large
There’s an interesting discussion going on in the IndieWeb chat forums about signalling that your website does not use AI. JamesG, for example, displays a badge on each post (though not on his home page) that reads “written by human not by AI” and links through to a site that explains why such a badge is a good thing: “make sure humanity continues to advance”.
Tantek wants to be more explicit and refer specifically to large language models (LLMs), rather than AI, because “some of us expect we are on the verge of a tidal wave of LLM generated web spam, which is going to become less and less distinguishable from human authored content (if it isn’t already)”.
There are separate issues too, about content vs plumbing, the code behind the site, and whether the pledge applies to one or both. JamesG happily admits using AI to help him with the code that runs his site, “but not for content”.
This is not new
These are good points, although I will not be adding any such badge to my own site, in part because I strongly agree with Ruben Wardy’s opinion: “I feel like this is like putting ‘Does not contain cow poop’ on food packaging.”1 The discussion does, however, remind me of similar thoughts in many other spheres, where a new version of a thing requires the addition of a distinguishing mark to the old thing.
Before the advent of paperbacks, there were just books. Paperbacks required us to use “hardback”. Audio books and e-books have not yet had the same impact, though it still tickles me when people proudly announce that they read 53 audiobooks in 2023. Records were records (except for LPs, which were not 78s) until CDs came along and required us to talk about vinyl (although, to be fair, vinyl did not require 78s to become shellacs). On the bicycle forums where I hang out, people increasingly talk about acoustic bikes, which are not electric.
And now “content” needs its own distinguisher.
Actually, it has needed one for a while now, as content farms have heaped SEO cow poop onto strings of words, as feedback has honed clickbait headlines, and as plagiarists and bad actors have conspired to make most reviews worthless. There’s precious little to be done about it except to bring your own personal cow-poop detector to bear on everything you read. Will a badge — or even a robots.txt file — stop bad actors and plagiarists, human or otherwise, from ripping you off? Of course not.
If the results are going to be indistinguishable from “human authored content,” is that even a problem? Presumably out-and-out hallucinations can be identified, if you have the knowledge to do so, but most people, for most content, simply do not have those skills. We’ve seen how that plays out since 2016 and before. It doesn’t need AI to make that a problem.
I don’t guarantee for one moment that I would be able to detect cow poop 100% of the time. But I do think that by selecting the people whose words I read, I reduce the need for signals that they aren’t using AI or LLMs. And if I come across someone new and interesting, I’ll increase the sensitivity of my cow-poop detector.
-
Maybe not “no cow poop”, but “no bacteria from cow poop” would be nice to have on a label — labelling being long a subject dear to my heart: see Less pus than other blogs and Cats in the milk ↩