This page is within the scope of WikiProject AI Cleanup, a collaborative effort to clean up artificial intelligence-generated content on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
To help centralize discussions and keep related topics together, all non-archive subpages of this talk page redirect here.
Has the "AI images in non-AI contexts" list served its purpose?
Wikipedia:WikiProject AI Cleanup/AI images in non-AI contexts has been documenting reasons given for removing AI-generated images from Wikipedia articles since 2023. Is there any reason to continue keeping track of this, now that WP:AIIMAGES has become policy? I assume the list page was created to help guide that eventual policy with organic examples from across Wikipedia, which would mean it is no longer really needed. Belbury (talk) 11:37, 12 May 2025 (UTC)[reply]
Yep, most of them have been deleted, and "what to do" is much clearer with the policy. Borderline cases (which will be less frequent, but will certainly happen) can be discussed on this very noticeboard. Chaotic Enby (talk · contribs) 14:06, 12 May 2025 (UTC)[reply]
I've (hopefully) deleted all articles I can find created by M1rrorCr0ss, but (a) I'm not absolutely sure I've got them all, and (b) there is still a huge number of redirects and an unknown amount of garbage content inserted into other, legitimate, articles. Are there any tools for digging this sort of thing out, to allow root-and-branch removal of contributions by an editor? — The Anome (talk) 11:09, 22 May 2025 (UTC)[reply]
Hello, a new user has begun editing; their user page says that they "extensively utilize BIDAI (Boundaryless Information & Data Analysis Intelligence), an advanced analytical system engineered by EIF." I've found their edits to be extremely unproductive and have warned them of such, but I was wondering if there is a standard approach for dealing with such accounts? Reporting without warning or discussion seems extreme, but the potential for this user to cause significant damage to Wikipedia is also very real. I didn't see a clear-cut policy, but I also admittedly didn't look too deeply. Thanks. Vegantics (talk) 14:29, 22 May 2025 (UTC)[reply]
We don't specifically have policies for this yet (we still don't have a general AI-use policy), but the course of action for unproductive AI-using editors has usually been to report them to ANI. Chaotic Enby (talk · contribs) 14:37, 22 May 2025 (UTC)[reply]
I believe the obvious lack of any meaningful human oversight means Spledia (talk · contribs) is merely acting as a facade for a computer program, and that their account is thus in effect a disguised bot account. I've suggested they request approval via the normal bot approval process. Given their past editing record, I think they have a mountain to climb with this, but the bot approval process seems like a good way to deal with this kind of blatant automated editing. In the meantime, I've blocked them from editing or creating article content. — The Anome (talk) 05:50, 23 May 2025 (UTC)[reply]
Collapsible templates
I've created the {{Collapse AI top}} and {{Collapse AI bottom}} templates that can be used for collapsing (hatting) disruptive talk page discussions that contain LLM-generated text. The {{cait}} and {{caib}} shortcuts are easier to use than the full template names. For an example of the template in action, see Talk:Ark of the Covenant.
The benefits of these AI-focused templates over generic collapsible templates like {{hat}} and {{hab}} are the convenient standardized message and the fact that transclusions of these templates can be tracked to monitor the extent of disruptive LLM use on talk pages.
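For editors unfamiliar with the syntax, usage mirrors that of {{hat}}/{{hab}}: wrap the offending text between the two templates (the collapsed text below is just a placeholder):
<pre>
{{cait}}
Long LLM-generated comment to be collapsed...
{{caib}}
</pre>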
Would it be possible to create a bot that would check new articles, follow all embedded links, such as citation links, and attempt to fetch them? 404-ing and similar broken reference links are an obvious sign of lazy AI slop, and it would be easy to catch these early using this approach, and to tag articles for examination by editors. It could also try to check the linked references for at least some resemblance to the subject of the article: either through simple text comparison, or an ML method such as comparing embeddings (of which text comparison is a trivial example). It would obviously not detect sophisticated AI slop, but that's another issue entirely.
The obvious problem is the anti-crawler features of websites themselves, which would tend to block the bot's accesses. Are there any services that can provide this kind of crawler access to third-party sites in an ethical way, for example via a WMF-brokered, use-whitelisted API obtained via an organization like Google, Cloudflare, Microsoft, Kagi ([1]) or the Internet Archive, which have generally unrestricted access to crawling (something like, say, Google's "Fetch as Google" service)? — The Anome (talk) 10:42, 23 May 2025 (UTC)[reply]
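To make the idea concrete, here is a minimal sketch of the naive dead-link check (the function names and the "mostly dead" threshold are placeholders of mine, and it assumes plain HTTP access, so the anti-crawler problem above still applies):
<syntaxhighlight lang="python">
import requests

SUSPICIOUS = {404, 410}  # status codes suggesting a dead or fabricated reference

def check_reference(url: str, timeout: float = 10.0) -> str:
    """Return 'dead', 'alive', or 'unknown' for one citation URL."""
    try:
        # Try HEAD first to be polite; fall back to GET for servers that reject HEAD.
        resp = requests.head(url, timeout=timeout, allow_redirects=True)
        if resp.status_code == 405:
            resp = requests.get(url, timeout=timeout, allow_redirects=True, stream=True)
    except requests.RequestException:
        return "unknown"  # blocked, timed out, DNS failure, etc.
    return "dead" if resp.status_code in SUSPICIOUS else "alive"

# Tag the article for human review if most of its references are dead.
urls = ["https://example.com/citation-from-article"]  # placeholder list
results = [check_reference(u) for u in urls]
if results.count("dead") > len(results) / 2:
    print("Flag for review: most references are dead")
</syntaxhighlight>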
See also this: https://news.ycombinator.com/item?id=23149841 While slow, the IA's fetch would be ideal for this purpose. Combined with a cache, it would be highly effective. It doesn't really matter if a fetch takes several minutes for the purposes of bots, which can take as long as they like. Because it would generate a lot of hits, it would probably require a service agreement with the IA to prevent the bot being rate-limited or blocked by them. The IA also seems to offer an API: https://archive.org/help/wayback_api.php — The Anome (talk) 11:24, 23 May 2025 (UTC)[reply]
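For the record, the availability endpoint documented at that link can be queried as below (a sketch only; a production bot would still want the service agreement mentioned above to avoid rate limits):
<syntaxhighlight lang="python">
import requests

def wayback_snapshot(url: str) -> dict | None:
    """Return the closest archived snapshot for url, or None if never archived."""
    resp = requests.get(
        "https://archive.org/wayback/available",
        params={"url": url},
        timeout=60,  # the IA can be slow; a bot can afford to wait
    )
    resp.raise_for_status()
    return resp.json().get("archived_snapshots", {}).get("closest")

snap = wayback_snapshot("https://example.com/citation-from-article")
if snap is None:
    print("Never archived: possibly fabricated, or just obscure")
else:
    print("Archived copy:", snap["url"], "HTTP status", snap.get("status"))
</syntaxhighlight>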
Some AI-generated content possibly goes under the radar, so this bot proposal is a good idea. But it will only help for new articles, which need to undergo patrolling, so there is already some human supervision there. For AI editors expanding existing articles with fake references, the bot would need to check every article that has seen a recent edit. —CX Zoom[he/him](let's talk • {C•X})12:49, 23 May 2025 (UTC)[reply]
Absolutely. It will only catch the very dumbest AI slop content, but that appears to be low-hanging fruit at the moment, and still worth doing. I really like the idea of a content cache for already-fetched reference content. Automated checking of references is a really promising research area, and one, I think, where using LLMs is entirely valid, provided the thresholds are set so that the checker is more sceptical than the average human reviewer. Bad references could then be flagged as either wholly bad (naive slop detection) or merely questionable (detecting superior-quality slop, vandalism, or mediocre human contributions), with human review taking over from there. — The Anome (talk) 13:30, 23 May 2025 (UTC)[reply]
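As a sketch of what the embedding comparison mentioned above could look like (the library, model, and threshold here are arbitrary placeholders, not anything agreed on):
<syntaxhighlight lang="python">
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose model

def reference_resembles_subject(article_lead: str, reference_text: str,
                                threshold: float = 0.3) -> bool:
    """True if the fetched reference has at least minimal semantic overlap
    with the article's lead; tune the threshold to be stricter than a
    human reviewer, per the discussion above."""
    embeddings = model.encode([article_lead, reference_text])
    similarity = util.cos_sim(embeddings[0], embeddings[1]).item()
    return similarity >= threshold
</syntaxhighlight>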
link-dispenser.toolforge.org (a tool I wrote) also exists to check if a link is dead; it makes requests directly instead of routing through the IA, since the IA heavily rate-limits Toolforge. Sohom (talk) 19:23, 30 May 2025 (UTC)[reply]
Should this talk page be considered the LLM noticeboard (perhaps adding a couple of redirects like WP:LLMN and Wikipedia:Large language models/Noticeboard)? If not, should one be made? I wonder because I came across Zaida, Khyber Pakhtunkhwa and wanted someone more familiar with LLMs to take a look, though I did find a maintenance template and added it to the article. Gråbergs Gråa Sång (talk) 05:31, 24 May 2025 (UTC)[reply]
To facilitate searching for specific discussions in the archives, I suggest the active participants on this talk page consider whether they want to keep project discussion separate from discussions of specific situations. isaacl (talk) 15:41, 24 May 2025 (UTC)[reply]
That could also be a good alternative if there end up being too many discussions and searching them becomes overwhelming. However, some discussions of specific situations can easily end up broadening in scope, so a separation between them might not always be practical. Chaotic Enby (talk · contribs) 15:46, 24 May 2025 (UTC)[reply]
All of the edits made by new user User:1january2000 over the past few days appear to be almost entirely A.I.-generated, given their volume and the fast rate at which they have been made, and many of the sources they've cited seem not to actually exist, although they are referenced as if real. I am not sure what to do about this, but this seems like the right place to report it. Hellginner (talk) 17:38, 6 June 2025 (UTC)[reply]
My general approach with articles that are mostly or all LLM hallucinations, particularly if a chunk of references are clearly made up sources, is to tag them for speedy deletion as hoaxes with {{db-hoax}}. As that template doesn't seem to have a comments or rationale field, I usually add in my analysis and rationale as an HTML comment too. Cheers, SunloungerFrog (talk) 18:29, 6 June 2025 (UTC)[reply]
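For instance (the comment text is case-specific, of course):
<pre>
{{db-hoax}} <!-- Most of the cited sources appear to be fabricated: no
matching titles, authors, or DOIs can be found anywhere. -->
</pre>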
I have already taken care of a few edits by the same user on "Millennium celebrations" (the section about Rio and South Georgia), which cited nonexistent sources attributed to Folha de S.Paulo and the British Antarctic Survey, among others. Ramkarlo82 (V • T • C) 01:25, 7 June 2025 (UTC)[reply]
A user on Talk:Bidirectional search alerted me to a problem with mass additions of content with hallucinated fake references by User:Noshin Nawal on bidirectional search. I have reverted the article to a version before the additions, and Noshin Nawal has not contributed to any other article, but I thought I'd leave this here in case it sounds familiar to anyone or might be helpful as a record of this action. —David Eppstein (talk) 22:06, 8 June 2025 (UTC)[reply]
ToneCheck community call/discussion
Hi hi, the team behind Tone Check, a feature that will use AI to prompt people adding promotional, derogatory, or otherwise subjective language to consider "neutralizing" the tone of what they are writing while they are in the editor, will be hosting a community consultation tomorrow on the Wikimedia Discord voice channels from 16:00 UTC to 17:00 UTC. Folks interested in listening in, joining in, or asking questions should join the Wikimedia Discord server and subscribe to this event. Sohom (talk) 19:13, 9 June 2025 (UTC)[reply]
I am a newcomer and I don't know how these are handled. What should be done about this? I genuinely don't think the article is a good fit for an encyclopedia, and checking/reworking everything that was included in the linked revision is a huge chore. I couldn't verify most of the sources used. I don't know if they're real, though I managed to find at least one of them. MeowsyCat99 (talk) 13:14, 13 June 2025 (UTC)[reply]