This is an archive of past discussions with User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Getting "Run blocked by your existing big run" when I try processing in category. Weird because I'm not running anything. --BorgQueen (talk) 02:31, 1 November 2022 (UTC)
Ok?... But what are they? Is there a "Technical reasons for Dummies" explanation for the non-techie that I am? It just somehow doesn't seem correct to truncate the actual URL... Shearonink (talk) 02:11, 1 November 2022 (UTC)
One of the simplest reasons is that when you print things, shorter URLs are much more convenient and take less space. Another is that pointless clutter is pointless: the edit window is easier to read without a pointless "#v=onepage&q&f=false" appended to every URL. Headbomb {t · c · p · b} 04:44, 1 November 2022 (UTC)
So. Even though the "#onepage(etc.)" is what the shorter URL actually resolves to, that doesn't matter because it doesn't matter. (I'm wondering...if it's pointless why does that bit even exist?) I suppose in terms of memory/databases/storage a few bytes of code can add up in an endeavor the size of Wikipedia so we technically prefer the shorter URLs. But the longer URLs are not exactly incorrect - we just prefer the shorter ones. Do I have that right? Shearonink (talk) 06:04, 1 November 2022 (UTC)
It is not just a preference. It makes editing easier. It makes printouts better. The # part is prone to changing over time and is unstable, and thus should not be used. There is one specific case where stuff after the # does matter, and the bot makes sure that it is there (and even adds it back when people have removed it). AManWithNoPlan (talk) 20:33, 1 November 2022 (UTC)
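For readers who want to see what "dropping the # part" means mechanically, here is a minimal Python sketch using the standard library. It is not the bot's actual code: the function name `strip_fragment` and the sample URL (with a placeholder book id) are made up for illustration, and it does not model the one exception mentioned above where the fragment is kept.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_fragment(url: str) -> str:
    """Return the URL with everything from '#' onward removed."""
    parts = urlsplit(url)
    # Rebuild the URL with an empty fragment; scheme, host, path and query are untouched.
    return urlunsplit(parts._replace(fragment=""))


if __name__ == "__main__":
    long_url = "https://books.google.com/books?id=EXAMPLE&pg=PA1#v=onepage&q&f=false"
    print(strip_fragment(long_url))
```

The query string (`id=`, `pg=`) survives because it identifies the book and page; only the viewer state after `#` is discarded.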
The bot continually changes the work= parameter in cite news to newspaper= only for The Washington Post, while dozens of other newspapers are left in work=. This is causing tons of pointless cosmetic edits and should be fixed by allowing work= to be used for The Washington Post. Similar issues seem to also happen with magazines.
Given it only happens to The Washington Post (and a few other papers, but I haven't seen it done to other major U.S. dailies), perhaps just removing it from the bot's assignments would be best. SounderBruce 03:34, 7 November 2022 (UTC)
Citation bot decided to move the comma outside the quotation marks when changing from curly to straight, even though in the previous version it was inside and likewise the original title has it inside (and according to the quotation rules of Wikipedia it should be inside).
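A character-for-character replacement would avoid this kind of side effect, because nothing but the quote marks themselves is touched. The sketch below is only an illustration of that idea in Python (the name `straighten_quotes` is an assumption, not something from the bot):

```python
# Map each curly quote to its straight equivalent, one character at a time.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with straight ones without moving any other character."""
    return text.translate(CURLY_TO_STRAIGHT)


if __name__ == "__main__":
    print(straighten_quotes("\u201cWait,\u201d she said"))
```

Since `str.translate` never reorders characters, a comma that was inside the closing quote stays inside it.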
Not that. The bibcode lists this author (which is presumably where the bot got it) but he is the author of the book being reviewed, not the author of the review itself. If the review has an author I don't know who it is. It seems difficult for a bot to figure this sort of thing out, so I have left an author=comment in hope that that's enough to block it from happening again for this specific instance. But despite the difficulty of doing anything else in this case, I think that making edits that a human can easily detect as wrong means there is a bug. The bot is supposed to be making life easier for the humans, not making us run around after it cleaning up its messes. —David Eppstein (talk) 05:35, 12 November 2022 (UTC)
We can't proceed until
Feedback from maintainers
bot created an empty citation
Status
{{wontfix}}, since rare and often should simply be deleted, since it is next to other refs. There is not a good automated fix.
Added dergipark URL for book doi:10.1163/9789047401216 that is a Turkish page with some handwritten table of contents? Don't see how this is a good URL.
{{fixed}} by fixing code that logs these errors so that I can fix them. These are pretty much always in need of human love, so I log them and then manually fix them.
Ok then. I've been mostly running Citation bot on articles from students, who almost certainly don't care and just generated the citations using some other tool, so the fact that this would be controversial didn't occur to me. * Pppery * it has begun... 21:28, 23 November 2022 (UTC)
If you're doing a targeted run, you can always use AWB for that.
One link we should consider purging is pubmed, however. There will never be an article at the end of that link. Might need an RFC for it though. Headbomb {t · c · p · b} 02:24, 24 November 2022 (UTC)
Cleanup [ITAL] [/ITAL] markers
Status
{{wontfix}}, since it needs some human tender love and care.
In Special:Diff/1121199837 it added |page=291 to a doctoral dissertation linked to ProQuest and with a given bibcode. Presumably it got this number, which represents the length of the dissertation, from one of those two sources.
What should happen
Not that. The |page= parameter is only for citing specific page numbers from a longer work; it is incorrect to use it for total page counts. In this case, the specific pages cited were actually xi and xiii, already given in the text after the template in order to describe the content cited from each of those two pages.
We can't proceed until
Feedback from maintainers
expand cite web with pmid/pmc
Status
{{notabug}} - seems to have been a one time failure
* {{cite web|title=Genus ''Aha'' Menke, 1977|url=https://biodiversity.org.au/afd/taxa/Aha|website=Australian Faunal Directory|access-date=27 November 2017}}
to
* {{cite journal|title=Genus ''Aha'' Menke, 1977|url=https://biodiversity.org.au/afd/taxa/Aha|journal=Polskie Pismo Entomologiczne|year=1977 |volume=47 |pages=671–681 |access-date=27 November 2017|last1=Menke |first1=A. S. }}
The citation template was for the Australian Faunal Directory, not the journal entry. Honestly every experience I've seen with this bot changing pages has been negative...it doesn't help that seemingly it doesn't let people review changes before it publishes the edit.
Attempts to input a category on the web interface bring up "Run blocked by your existing big run." despite the previous category either having completed or been rejected for size.
What should happen
The bot should be processing right sized categories.
Bot applies title case to journal titles identified as Finnish-language by language=fi. diff
What should happen
Finnish titles in general do not use title case. This formatting should be retained, per MOS:FOREIGNTITLE. Even if title-casing is preferred, ja should be lowercase.
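The requested behaviour amounts to checking the language before touching the capitalization. A rough Python sketch follows, with several assumptions stated up front: `format_journal_title` and `KEEP_CASE_LANGUAGES` are hypothetical names, the sample title is only illustrative, and `str.title()` stands in for whatever title-casing logic the bot really uses.

```python
# Languages whose titles should be left exactly as the editor wrote them.
# Only "fi" comes from the report above; any other entry would be a further assumption.
KEEP_CASE_LANGUAGES = {"fi"}


def format_journal_title(title: str, language: str = "") -> str:
    """Title-case a journal name unless its language code asks for the case to be kept."""
    if language.strip().lower() in KEEP_CASE_LANGUAGES:
        return title
    # Crude stand-in for real title-casing rules.
    return title.title()


if __name__ == "__main__":
    print(format_journal_title("Tiede ja edistys", language="fi"))
    print(format_journal_title("journal of examples"))
```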
@Headbomb: I'm curious: the official title appears to be Revue scientifique et technique (International Office of Epizootics). I don't know why it's bilingual but shouldn't the French part use MOS:FRENCHCAPS conventions? So "Rev sci tech Off Int Epiz", schizophrenic as that looks? —David Eppstein (talk) 05:22, 7 November 2022 (UTC)
There's many weird/non-standard ways to abbreviate / refer to this journal, but "Rev Sci Tech off Int Epiz" is never valid, and "Rev Sci Tech Off Int Epiz" is the closest thing that is valid. Headbomb {t · c · p · b} 05:24, 7 November 2022 (UTC)
This change is not cosmetic for multiple reasons, but there is the obvious one: please reinspect the display with each version. Izno (talk) 18:30, 15 December 2022 (UTC)
Apply capitalization logic to Turkish / Russian languages too.
Specifically this appears to be happening when the title parameter in a citation contains the word "arXiv"; the bot inserts a space and italicizes the last three letters. Very strange. —David Eppstein (talk) 20:29, 28 December 2022 (UTC)
Digital Spy is not an agency. The correct change would have been moving what's in |agency= to |website= or |work=. Izno (talk) 22:05, 23 December 2022 (UTC)
See [12]. {{cite web}} is correct for this outlet. In the same edit, the bot also converted a review to {{cite news}}, although it is not a news piece. IceWelder [✉] 08:23, 27 December 2022 (UTC)
Since the day before yesterday, I can't use the gadget because my browsers (Chrome and Edge) display "Error: citations request failed". Usually, after waiting for about an hour, it works fine if I try again, but in this case it doesn't. Thanks! SilverMatsu (talk) 02:14, 4 January 2023 (UTC)
For what it's worth, I checked because of the weirdness Izno mentioned and the possible abuse of automated processes. They haven't logged into their account in at least 90 days. TonyBallioni (talk) 04:44, 4 January 2023 (UTC)
I've blocked the account due to the suspect nature since OAuth is used to verify logins, they can explain what's going on in an unblock request. @AManWithNoPlan, can you dump the jobs? Izno (talk) 05:03, 4 January 2023 (UTC)
with some citation templates, typically cite journal, with identifiers such as doi or arxiv, with valid bibcodes, do not always get the bibcode added
What should happen
bibcodes should always be added to citation templates when they exist for papers with matching doi or arxiv (and pmid/s2cid, etc?) (or exact matching title?)
Relevant diffs/links
This diff shows a manual edit where I added several missing bibcodes (and some other things such as journals). When I ran the bot against the prev version, it did nothing.
Replication instructions
For a simple testcase, run the bot against "Orosz, Jerome A.; Jain, Raj K.; Bailyn, Charles D.; McClintock, Jeffrey E.; Remillard, Ronald A. (2002). "Orbital Parameters for the Soft X-Ray Transient 4U 1543-47: Evidence for a Black Hole". The Astrophysical Journal. 499: 375–384. arXiv:astro-ph/9712018. doi:10.1086/305620. S2CID16991861.". Same problem with the even simpler case of ". arXiv:1309.3652. {{cite journal}}: Cite journal requires |journal= (help); Missing or empty |title= (help)". However, sometimes it works; are requests getting throttled? Reliably, cases such as ". Bibcode:1998ApJ...499..375O. {{cite journal}}: Cite journal requires |journal= (help); Missing or empty |title= (help)" will be successfully expanded.
When fishing out authors, the bot decided a publication by the World Bank should have |last=Bank |first=World
What should happen
The bot should realise the found author is an institution, although I am not sure if this can be done through a system or would need to be whack-a-mole.
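To make the "whack-a-mole" option concrete, here is a small Python sketch of a keyword heuristic. It is an assumption-laden illustration, not how the bot works: `INSTITUTION_WORDS` and `author_fields` are invented names, and the word list is deliberately short.

```python
# Words that suggest a corporate or institutional author (illustrative, far from complete).
INSTITUTION_WORDS = {
    "bank", "university", "organization", "organisation", "ministry",
    "department", "agency", "institute", "committee", "association",
}


def author_fields(name: str) -> dict:
    """Return cs1|2-style fields: |author= for institutions, |last=/|first= for people."""
    words = name.split()
    looks_institutional = any(w.lower().strip(".,") in INSTITUTION_WORDS for w in words)
    if looks_institutional or len(words) < 2:
        return {"author": name}
    return {"last": words[-1], "first": " ".join(words[:-1])}


if __name__ == "__main__":
    print(author_fields("World Bank"))
    print(author_fields("Jerome A. Orosz"))
```

The obvious weakness is the false positive: a person whose surname happens to be "Bank" would be treated as an institution, which is why a list like this needs ongoing manual curation.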
For implementations of {{cite web}} where FindLaw is cited, the bot changes its parameter from |publisher= to |work=. I can't speak to whether FindLaw is a work or a publisher, but this action italicizes the output, which is in contravention of our own article on the topic, which suggests it's the latter.
for {{cite web}}, work is the same as website, and Findlaw is definitely a website. Seems like if you're actually trying to cite the court case, a better template would be {{cite court}}, maybe something like:
{{cite court|litigants=Wood v. State |reporter=2-98-441-CR |court=TX Ct. App. |date=14 October 1999 |url=http://caselaw.findlaw.com/tx-court-of-appeals/1079245.html }}
Wood v. State, 2-98-441-CR (TX Ct. App. 14 October 1999).
For reference from Grove Music Online the bot adds the co-authors as "editors"
What should happen
The most correct addition would be to follow what the article says, i.e. add "Others=Revised by [Name]". If this is not possible then adding these people as co-authors works as well. They are definitely not akin to "editors" as they are updating content, not editing preexisting material. The encyclopedia has an overall editor who is a different person
This is going to end up under "not a bug" because Citation bot is only re-using the Crossref data which incorrectly lists them as editors. Oxford needs to publish correct data so others can use it. — Chris Capoccia 💬 16:22, 4 January 2023 (UTC)
Moving identifiers in templates in the 'id' field to a relevant parameter: {{cite journal |title=ex1 |journal=studies of science |doi= |id={{doi|10.11xxxx}} }} would turn out to {{cite journal |title=ex1 |journal=studies of science |doi=10.11xxxx }}.
I hope this isn't a very hard implementation, but it would be very useful! BhamBoi (talk) 00:12, 14 January 2023 (UTC)
I see it around particularly when there is a cite journal parameter for it but it is not on by default.
Cite journal supports |hdl= and |S2CID= if you go into source mode and add them as an undocumented parameter, but some people just add them in a template (e.g {{s2cid|x}}) in the generic id field because they don't want to go into source mode and add it that way. BhamBoi (talk) 02:27, 14 January 2023 (UTC)
In this edit, there were some templates that had id={{s2cid|12345}} that the bot didn't move to id= |s2cid=12345. I think because s2cid doesn't show up as a parameter in the default visual editor interface for cite journal (Though if you add the param in source mode it works as expected), people add it as a template in the default id= field. BhamBoi (talk) 20:25, 14 January 2023 (UTC)
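The request in this thread can be sketched with a regular expression over the wikitext. This is a simplified Python illustration under stated assumptions: `move_id_template` and `ID_TEMPLATES` are hypothetical names, only a single identifier template inside |id= is handled, and an already populated target parameter is left alone for a human to sort out.

```python
import re

# Identifier templates that have a dedicated cs1|2 parameter (illustrative subset).
ID_TEMPLATES = {"doi": "doi", "hdl": "hdl", "s2cid": "s2cid"}

# Matches "|id={{name|value}}" plus any trailing whitespace.
ID_FIELD = re.compile(
    r"\|\s*id\s*=\s*\{\{\s*(\w+)\s*\|\s*([^{}|]+?)\s*\}\}\s*",
    re.IGNORECASE,
)


def move_id_template(citation: str) -> str:
    """Move an identifier template out of |id= into its own parameter."""
    match = ID_FIELD.search(citation)
    if match is None:
        return citation
    param = ID_TEMPLATES.get(match.group(1).lower())
    if param is None:
        return citation
    value = match.group(2)
    without_id = citation[:match.start()] + citation[match.end():]

    existing = re.search(r"\|\s*" + param + r"\s*=\s*([^|{}]*)", without_id, re.IGNORECASE)
    if existing:
        if existing.group(1).strip():
            # Target parameter already has a value: do nothing.
            return citation
        # Fill the empty parameter in place.
        return without_id[:existing.start()] + f"|{param}={value} " + without_id[existing.end():]

    body = without_id.rstrip()
    if not body.endswith("}}"):
        return citation
    # No such parameter yet: append it before the closing braces.
    return body[:-2].rstrip() + f" |{param}={value} }}}}"


if __name__ == "__main__":
    print(move_id_template(
        "{{cite journal |title=ex1 |journal=studies of science |doi= |id={{doi|10.11xxxx}} }}"
    ))
    print(move_id_template("{{cite journal |title=ex1 |id={{s2cid|12345}} }}"))
```

A real implementation would more likely work on a parsed template than on raw text, since regular expressions get fragile once templates are nested more deeply.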
Is there a reason the nature.com URL was deleted, such that only the DOI remains? Is the URL not independently valuable, lest the DOI become broken? --Usernameunique (talk) 01:35, 16 January 2023 (UTC)
Not a bug. That Nature url goes to the same place as doi:10.1038/190586a0 and because |doi-access=free, Module:Citation/CS1 auto-links |title= with the doi url. The doi is persistent; the url may change at the whim of some web programmer at Nature.
Why are DOIs necessarily more persistent than URLs? DOIs break too; I've had to email journals before to restore DOIs that I had included in citations, and then became broken. --Usernameunique (talk) 14:29, 16 January 2023 (UTC)
True, but doesn't the publisher generally fix doi redirects (as I understand it, only they can) when they change source urls? So long as they do that, the doi is more reliable than the value in |url= which will be broken until some editor discovers that it is in fact broken. Sure, the publishers don't always get it right so emails to publishers will always be needed.
Citation bot adds a date parameter to Cite web templates citing Github. The date is usually recent (more recent than the corresponding reference on Wikipedia); it looks like it is the date when the repository was last modified.
What should happen
The bot should not add a date parameter to templates citing Github.
I've posted 4 specifically indicating the citations that are problematic, showing that Citation bot altered USGS reports, resulting in empty=Title errors. --Kevmin § 15:37, 20 January 2023 (UTC)
(Facepalm) Correct, I posted the Revision that Citationbot generated on each of the pages in question, that displays the errors that Citationbot created, as a result of citationbot making incorrect changes. But okay, here they are [18][19][20][21][22]
They are now fixed. Since no one complained - other than the bot's logs - I wonder if it was a real person or was it some automated tool like the google web scraper? AManWithNoPlan (talk) 12:46, 22 January 2023 (UTC)
So the page has been updated since our editor looked at it. In that case, the tool should report an error. Unless the page in its original form got archived, how can anyone verify that it ever said anything different from what it says now? To fail to report a date is not a solution, it just papers over the crack. --𝕁𝕄𝔽 (talk) 09:11, 28 January 2023 (UTC)
2017 is not more recent than 2018.
Some websites keep changing the date for misguided SEO reasons (because Google gives higher rank to recent pages, until it gets convinced that you're abusing it). Such websites aren't typically reliable sources, so it's good to spot the trend in the references. Nemo 10:20, 28 January 2023 (UTC)
Oh, I've just noticed that I misread the year in the date parameter. (I thought it was 2018.) It's my mistake, sorry. Janhrach (talk) 20:06, 28 January 2023 (UTC)
{{fixed}} by removing error message. That is not an error, but rather the lack of a free copy. I have also removed some other cases of "the bot that cried wolf".
Quite often the citation bot reports requesting data from Unpaywall but failing to get it ("Could not retrieve open access details from Unpaywall API for doi").
What should happen
The bot might be hitting the limit for gratis usage of the Unpaywall API, which used to be 100k requests a day. The options include 1) asking a grant from WMF to pay for API usage; 2) keep track of the number of requests and stop after a certain number per day, 3) try to reduce requests by keeping a local cache or by avoiding requests if/when we already know that the result will not be used, 4) just let it be as no real harm is done.
Relevant diffs/links
special:diff/1136273600 is one example run where I got this effect, you can't see it from the diff.
Replication instructions
Usually, just run the bot from the web interface on any page with a DOI.
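Options 2 and 3 from the list above (count requests per day, and keep a local cache) can be combined in a few lines. The Python sketch below is only a thought experiment: `BudgetedLookup` is a hypothetical wrapper, the `fetch` callable stands in for whatever actually queries Unpaywall, and the 100,000 default simply echoes the figure quoted in the report, which may be out of date.

```python
import datetime
from typing import Callable, Dict, Optional


class BudgetedLookup:
    """Cache results per DOI and stop calling the remote API once a daily budget is spent."""

    def __init__(self, fetch: Callable[[str], dict], daily_limit: int = 100_000):
        self._fetch = fetch              # function that performs the real remote request
        self._daily_limit = daily_limit
        self._cache: Dict[str, dict] = {}
        self._day = datetime.date.today()
        self._used = 0

    def lookup(self, doi: str) -> Optional[dict]:
        key = doi.lower()                # DOIs are case-insensitive
        if key in self._cache:
            return self._cache[key]      # cached answers cost nothing
        today = datetime.date.today()
        if today != self._day:
            # New day: reset the counter.
            self._day, self._used = today, 0
        if self._used >= self._daily_limit:
            return None                  # budget exhausted; caller skips the open-access step
        self._used += 1
        result = self._fetch(key)
        self._cache[key] = result
        return result
```

Returning `None` when the budget is spent lets the caller distinguish "not asked" from "asked and nothing found", which also avoids printing a misleading failure message.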
Not a question about Citation bot but I am trying to understand the archiving of this particular page...
ClueBot III always confuses me, I usually use Lowercase sigmabot III. Anyway, in the archiving set-up for this page it states:
{{User:ClueBot III/ArchiveThis
|archiveprefix=User talk:Citation bot/Archive
|format= %%i
|maxarchsize=150000
|minarchthreads=1
|minkeepthreads=4
|age=2160000
}}
So...what exactly does the age=2160000 mean? At Template:Setup cluebot archiving it states that
|age=
How many days old a thread should be before archiving. Default: 90
But that obviously isn't the case because of the 2160000...I've tried looking everywhere around here so I can understand this but am having no luck. If someone would post what the "age=" parameters are and where I can find an easy-to-understand explanation that would be awesome. Thanks, Shearonink (talk) 22:53, 29 January 2023 (UTC)
Ok, yes, I kind of know what is supposed to happen but if that is true then 2160000 hours = 90000 days. And that isn't the archiving at this page, is it? The last 2 times ClueBot III archived this page was today when the bot archived a post that was posted earlier today and then the bot archived a post from January 26th...I just don't understand when and why the bot is archiving and the code that is posted way up there at the top...Teach me your ways O Wiki Mavens & Coders... Shearonink (talk) 02:58, 30 January 2023 (UTC)
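For anyone checking the arithmetic, here is the conversion spelled out in Python. It assumes, as the comment above does, that the age value is a number of hours; `age_in_days` is just an illustrative helper name.

```python
HOURS_PER_DAY = 24


def age_in_days(age_hours: int) -> float:
    """Convert an archiving age given in hours to days."""
    return age_hours / HOURS_PER_DAY


if __name__ == "__main__":
    print(age_in_days(2160))      # 90.0, i.e. the documented 90-day default
    print(age_in_days(2160000))   # 90000.0
```

On that reading, an age of 2160000 means the age test effectively never triggers, so whatever archiving is observed on the page must be driven by something other than thread age.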
As AMWNP implies, you can set at least ClueBot up to archive based on wikitext patterns. (I daren't put the specifics in this section lest the bot archive it. :) Izno (talk) 03:27, 30 January 2023 (UTC)
At 2020–present global chip shortage, |website=Bloomberg was replaced with |newspaper=Bloomberg.com. The name of the publication, which is what should go in this parameter, is Bloomberg. Bloomberg.com is the domain name, which is a technical detail that should not be cited unless it coincides with the publication's name, which is not the case here.
My watchlist today is clogged by huge numbers of cosmetic edits with the edit summary "Misc citation tidying", "Suggested by AManWithNoPlan", that appear to consist solely of replacing the template alias {{jstor}} with {{JSTOR}}. Make it stop. This is not what bots are for. —David Eppstein (talk) 01:19, 7 February 2023 (UTC)
improperly changes |chapter-url= to |url= when the template does not use |chapter= but does use |script-chapter=. |chapter-url= requires |chapter= or |script-chapter=. So long as one (or both) of those are present in a cs1|2 template, |chapter-url= should not be changed to |url=. This applies to all aliases of |chapter=: |contribution=, |entry=, |article=, |section=.
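The rule described here is easy to state as a predicate. The following Python sketch is an illustration only (the names `CHAPTER_PARAMS` and `may_rename_chapter_url` are invented, and the template is represented as a plain dict of parameter names to values):

```python
# |chapter=, |script-chapter= and the aliases of |chapter= listed in the report.
CHAPTER_PARAMS = {"chapter", "script-chapter", "contribution", "entry", "article", "section"}


def may_rename_chapter_url(params: dict) -> bool:
    """True only if |chapter-url= is present and no chapter-like parameter has a value."""
    has_chapter = any(params.get(name, "").strip() for name in CHAPTER_PARAMS)
    return "chapter-url" in params and not has_chapter


if __name__ == "__main__":
    print(may_rename_chapter_url({"title": "T", "chapter-url": "https://example.org/ch1"}))
    print(may_rename_chapter_url({"script-chapter": "ja:X", "chapter-url": "https://example.org/ch1"}))
```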
Beowulf and its Analogues is a 1968 work that has an SBN, not an ISBN. Notwithstanding the fact that the citation was already correct, Citation bot adds an ISBN, from who knows where.
In citations of The Royal Family website, the bot consistently adds authors' names to what are anonymous articles. I assume these names are encoded somewhere, but they're certainly not visible to the average visitor. The authors are presumably the equivalent of staff reporters, and if the article has been published anonymously on behalf of the organisation, that's the convention we should follow. In addition, the bot has been labelling the site a "newspaper", which it clearly isn't.
The more interesting issue to me is why did the bot change {{cite web}} to {{cite news}} and |website= and |publisher= to |newspaper=? But even that is inconsistent; in this diff the bot changed {{cite web}} to {{cite news}} but kept |website=. The bot did correctly change the assigned value www.royal.uk to The Royal Family.
It seems to me that for these citations, {{cite web}} and |website=The Royal Family are correct. The bot can't really know if the author names are displayed or hidden so that doesn't seem much of a bug to me. I seem to recall that there was some recent discussion about that on this page... You might want to look in the archives.
The link on the page goes to a dead archive page. Anyway, that is not really a 100% bot edit, since that type of edit requires human approval while running. I am doing a multi-month-long run where I approve the titles from the archive, or flag them to be deleted. AManWithNoPlan (talk) 13:07, 12 February 2023 (UTC)
Yes but no, it's far more likely that the edit goes unnoticed. I suggest to remove the wayback URL only if it's total garbage, otherwise just comment it. Nemo 20:06, 12 February 2023 (UTC)
The bot's been adding and populating a date= parameter for sources with no listed publication date. There is no indication as to where the bot gets these dates from and no reason to think that they're accurate.
What should happen
The bot shouldn't add a publication date for sources that don't have one listed.
These dates are stated in the web pages' HTML. You can check with Ctrl-U or other method to view source in your browser. Nemo 21:29, 27 January 2023 (UTC)
The first one is coming from one of the items in the HTML source (Ctrl + U in Firefox):
(The bot should probably prefer the modified_time/updated_time if it is the source responsible, and if it's getting it from Citoid or other ext service maybe an upstream notification would be valuable.)
Hmmm. Well, this is interesting to me. Chiming in here as the person who originally added the cites to these articles. The dates that the Bot is adding to the cites would appear to be incorrect in that they are not published on the page with the source material. Also, the date that the Bot is finding would appear to be the date that the material was published onto the web but it might not be the actual date the material was written or the date that the material was published in print. In the case of the Archipedia material on the Ramsdell, that information seems to have originally been published in print in 2012. In any case, is a researcher/WP-editor expected or supposed to always look up the HTML dates if material is undated on the page? Shearonink (talk) 16:57, 28 January 2023 (UTC)
I usually check the date in the HTML if it's not stated, but one can be forgiven for not doing so. The date in {{cite web}} is usually the date of the web page itself. If the date of original publication of the work carried by the web page has some significance, you can instead use {{cite publication}} or other cite template with the date of the work, indicating that the URL is just one representation.
For the sake of WP:RS, I'd expect editors to know whether they're citing a website or some publication of which the website provides a copy, and ideally they'd use citation templates accordingly, but such details can be addressed if/when confusion arises. Nemo 17:21, 28 January 2023 (UTC)
I try to be SO scrupulous and careful when citing whatever reference... Does that "Control-U" thingy work with all laptops? (Yay yet another parameter to remember when info or a webpage "appears" to be undated...) I'd never heard about being able to see the date in the html before. Is it something that only works with PCs or Macs/whatever?... Shearonink (talk) 17:42, 28 January 2023 (UTC)
On Windows in Firefox: Ctrl + U is how Firefox does it. It should work in other browsers but the specific key combo may be different. A second way: right-clicking on a page also provides "View page source". The third way is to open the console, which is F12, or right-click and select Inspect.
There is no requirement to hunt down information in the page source, it is simply another way to get the date usually since indeed many pages don't have a displayed date (but of course they all have a publication date). I would suggest leaving the dates if Citation bot adds one, so long as you can verify at least in the page source that the date didn't spontaneously poof into thin air. Izno (talk) 18:47, 28 January 2023 (UTC)
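As a concrete picture of "the date is in the page source", the Python sketch below pulls date-like meta tags out of an HTML string with the standard library parser. Everything specific in it is an assumption for illustration: the helper names, the sample HTML, and the choice of the Open Graph style keys `article:published_time`, `article:modified_time` and `og:updated_time`. It says nothing about which tags the bot or Citoid actually read.

```python
from html.parser import HTMLParser

# Meta keys that commonly carry publication or modification timestamps.
DATE_KEYS = ("article:published_time", "article:modified_time", "og:updated_time")


class MetaDateParser(HTMLParser):
    """Collect the content of date-related <meta> tags."""

    def __init__(self):
        super().__init__()
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        if key in DATE_KEYS and attributes.get("content"):
            self.dates[key] = attributes["content"]


def extract_meta_dates(html: str) -> dict:
    parser = MetaDateParser()
    parser.feed(html)
    return parser.dates


if __name__ == "__main__":
    # Hypothetical page source, not taken from any of the cited websites.
    sample = (
        '<html><head>'
        '<meta property="article:published_time" content="2012-05-01T00:00:00Z">'
        '<meta property="article:modified_time" content="2019-07-16T12:00:00Z">'
        '</head><body>No visible date here.</body></html>'
    )
    print(extract_meta_dates(sample))
```

Seeing both values side by side makes the concern in this thread visible: the published and modified timestamps can differ by years, and neither is guaranteed to match a print publication date.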
Still not fixed. When you have a journal=arxiv..., convert the cite journal to cite arxiv, throw away the journal, get rid of url and all non-arxiv identifiers, then expand from there. I.e.
This doesn't seem harmful to me per se, but it does add a revision to the page history that doesn't actually change anything. BhamBoi (talk) 06:37, 27 January 2023 (UTC)
Not sure what the best plan is for these URLs, I'd guess they were added by automatic processes potentially including this one, so I think perhaps having the bot nuke the title without an attempted replacement would be preferable, but if Citation bot could actually resolve these to the correct page, that would be best. Izno (talk) 23:26, 14 February 2023 (UTC)
Sorry for that. Personally I don't expect Citation bot to clean up my mistakes with User:OAbot. :/ That bug had been fixed quite quickly but some broken edits remain. I'm thinking of proposing a new version of OAbot which would identify such metadata inconsistencies. Nemo 18:55, 16 February 2023 (UTC)
Beowulf and its Analogues is a 1968 work that has an SBN, not an ISBN. Notwithstanding the fact that the citation was already correct, Citation bot adds an ISBN, from who knows where.
(Unarchived and removed {{not a bug}}). The fact that an SBN can be forwarded to an ISBN does not mean that it is correct to add an ISBN to a citation for a book that does not have one. This book has an SBN, which was already in the citation; the duplicative (and borderline erroneous) ISBN should not have been added. --Usernameunique (talk) 22:26, 13 February 2023 (UTC)
The book clearly has an ISBN because ISBNs are designed to be forward compatible with SBNs (add a leading 0 to the SBN and you have the 10 digit ISBN). You don't need to have it printed on the book for the book to have an ISBN. All books with SBNs have ISBNs. Headbomb {t · c · p · b} 00:43, 14 February 2023 (UTC)
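The "add a leading 0" rule can be checked mechanically, because the ISBN-10 check digit is a weighted sum modulo 11 and a leading zero contributes nothing to that sum. Below is a short Python sketch; the function names are illustrative, and the number used is a commonly quoted check-digit-valid example rather than the identifier of the book under discussion.

```python
def sbn_to_isbn10(sbn: str) -> str:
    """Turn a 9-character SBN into a 10-character ISBN by prefixing a zero."""
    digits = sbn.replace("-", "").replace(" ", "").upper()
    if len(digits) != 9:
        raise ValueError("an SBN has nine characters")
    return "0" + digits


def isbn10_is_valid(isbn: str) -> bool:
    """Check the ISBN-10 checksum: weights 10 down to 1, total divisible by 11."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            digit = 10                      # 'X' stands for 10, allowed only as check digit
        elif char in "0123456789":
            digit = int(char)
        else:
            return False
        total += (10 - position) * digit
    return total % 11 == 0


if __name__ == "__main__":
    isbn = sbn_to_isbn10("306-40615-2")
    print(isbn, isbn10_is_valid(isbn))
```

Whether the derived ISBN should then be added to a citation that already carries the SBN is the editorial question being argued in this thread; the code only shows that the conversion itself is well defined.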
The wrong title is replaced with no title, which is less than ideal, but still an improvement. The PressReader title should also be an error. AManWithNoPlan (talk) 17:49, 15 February 2023 (UTC)
I agree that the generic title should be an error, but I cannot see how changing it to no title at all is an improvement. The bot feels the need to edit the article but only changes one error to another. IceWelder [✉] 07:58, 16 February 2023 (UTC)
None-the-less, giving no information (with an error message about the lack of information) is still better than confidently giving wrong information. Both are bad though. AManWithNoPlan (talk) 13:52, 16 February 2023 (UTC)
A visible, trackable error is better than a silent error that is untracked. Yes, it's ugly. That's a good thing because it makes people want to fix it. Headbomb {t · c · p · b} 16:07, 16 February 2023 (UTC)
I requested this a day or two ago under the same rationale as Headbomb. I'd rather have an error than the previous title, and "no title error" is good enough for me. Izno (talk) 17:30, 16 February 2023 (UTC)
"Rabbi" is a job title, not part of a person's name
When processing a citation where the author is credited as "Rabbi Example Suchandsuch", the bot thinks that "Rabbi" is part of the person's first name. It isn't; it's merely their job title, and shouldn't be dragged into the cite templates' name fields.
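One way to picture the requested behaviour is a small prefix filter applied before the name is split. The Python sketch below is hypothetical throughout: `JOB_TITLES` is a short illustrative list, `split_credited_name` is an invented helper, and real name parsing has many more edge cases than this.

```python
# Job titles and honorifics to drop from the front of a credited name (illustrative list).
JOB_TITLES = {"rabbi", "dr", "prof", "rev", "father", "sir"}


def split_credited_name(credit: str) -> dict:
    """Strip leading job titles, then split the rest into |first= and |last=."""
    words = credit.split()
    while len(words) > 1 and words[0].lower().rstrip(".") in JOB_TITLES:
        words = words[1:]
    if len(words) < 2:
        return {"author": " ".join(words)}
    return {"first": " ".join(words[:-1]), "last": words[-1]}


if __name__ == "__main__":
    print(split_credited_name("Rabbi Example Suchandsuch"))
```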
Reminds me of this classic: A priest, a minister, and a rabbit walk into a bar... The rabbit says, "I think I might be a typo." -- GreenC 17:54, 18 February 2023 (UTC)
Citation bot inserting pharmaceutical spam into article space
Status
Not a bug - makes garbage archives obvious, which I have now fixed instead of covering up and reverting. Also, the bot did not add any scams, it simply pointed out that it was one.
When a URL has been squatted on by a spammy website, and the Wayback Machine archives the spam, Citation Bot will sometimes copy over the spam, resulting in stuff like Viagra ads in public-facing article space. Examples below but there are almost certainly more since I have only looked for about 10 minutes. While this is not a bug exactly, it is a really embarrassing look for wikipedia, more so than just a wonky parameter, and bad enough that there really should be a better way.
I was very clear in my original report that this is not exactly a bug. However, Wikipedia is not a repository for Viagra and essay-writing scams, and turning it into one, however unintentional, is at best unneeded behavior and at worst disruptive editing. Gnomingstuff (talk) 14:56, 21 February 2023 (UTC)
The URL is this. There may be variants, but sify.com has been the most constant throughout history. Because it is a website, not a publisher, Sify or Sify.com should only be in the website field per {{cite web}} and {{cite news}}. Kailash29792 (talk) 11:13, 24 February 2023 (UTC)
Sometimes references contain proxy URLs which are meaningless, in the sense that they don't contain any useful identifier that could be used for link recovery, so the bot doesn't yet know how to handle them. The reference may use templates with meaningless data such as a title "Shibboleth Authentication Request", or be unstructured.
What should happen
Any available information should be used to retrieve the correct identifier, and a structured citation generated from said identifier, throwing away all the garbage input. It might be possible to achieve this by screen-scraping the meaningless URL's target, or by searching the unstructured citation on Internet Archive Scholar (any result could be verified by searching its title, author, year etc. in the original reference to make sure they all match).
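The verification step in parentheses (check that title, author and year of a search hit all occur in the original reference) could look roughly like this in Python. This is a sketch under assumptions: `candidate_matches` is an invented name, the candidate is a plain dict with hypothetical keys, and plain substring matching is far cruder than what a production matcher would need.

```python
def candidate_matches(reference_text: str, candidate: dict) -> bool:
    """Accept a search hit only if its title, author surname and year all appear in the reference."""
    haystack = reference_text.casefold()
    fields = (candidate.get("title"), candidate.get("author_last"), candidate.get("year"))
    # Every field must be present and must occur in the unstructured reference text.
    return all(field and str(field).casefold() in haystack for field in fields)


if __name__ == "__main__":
    # Hypothetical reference and search result, for illustration only.
    raw = 'Doe, Jane (2020). "An Example Title". Journal of Examples. [proxy link]'
    hit = {"title": "An example title", "author_last": "Doe", "year": 2020}
    print(candidate_matches(raw, hit))
```

Requiring all three fields to match trades recall for safety: some recoverable references will be skipped, but a wrong identifier is less likely to be attached.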
Citation bot often interacts poorly with articles that use shortened footnote citations. A common problem is citation bot edits introducing references with the same author list and year as existing long-form references with shortened footnotes – see this edit and the subsequent discussion at Whoop whoop pull up's talk page for an example. Another issue I have seen is when citation bot changes the publication dates of references, but not the associated shortened footnotes – for example in this edit (where in addition the change is incorrect). This results in user-visible error messages and potentially ambiguous references (in the first case) or broken references (in the second case), and the resulting errors have to be manually gnomed.
The second edit is not a bot edit; it is a tool-assisted human edit. The first edit problem seems to be the result of a serious case of GIGO with the vcite and cite templates doubling up. I do not know of any way to detect the footnote problem without actually saving the page. AManWithNoPlan (talk) 16:32, 29 January 2023 (UTC)
Scientific name should be in italics. Also should list ADW as website and UMich Museum of Zoology as publisher (but yay! that the bot can even do this much!)
I did not change my skin or settings. My profile continues to have: Citation expander: automatically expand and format citations using Citation bot. Comfr (talk) 15:38, 3 March 2023 (UTC)
The pmc value was in the pmid field. The pages and the date fields were wrong for the supplied doi. The bot didn't fix the pmid and pmc to match the doi. It didn't fix the pages or the date.
The bot can't magically fix everything. When you've got the wrong identifiers in place, it doesn't know which is correct and thus which information to use. Headbomb {t · c · p · b} 01:18, 27 February 2023 (UTC)
Even so, the bot added an s2cid that matched the doi but did not match the pmid. Should the bot make any changes or additions when there are inconsistencies? The bot could flag fields that don't match the information it retrieves for the pmid and/or doi so others could try to fix the citation. --Whywhenwhohow (talk) 01:24, 27 February 2023 (UTC)
Given that it is not out of the question for an article to have two (I have seen up to four) different DOIs, this is hard to police. Also, there is always the problem of (who is right?). Lastly, pubmed is not queried in these cases anyway, since the citation is completed already. AManWithNoPlan (talk) 16:10, 9 March 2023 (UTC)
There is only one doi in the citation. The bot could verify that the existing fields are congruent before making changes or additions. What does "the citation is completed already" mean? --Whywhenwhohow (talk) 06:59, 11 March 2023 (UTC)
"There is only one doi in the citation": that does not matter. There are lots of citations with the wrong DOI in pubmed or five dois for the same article. How could one reliably determine that a reference is suspect without pissing off an army of editors by adding bullcrap "this citation is probably wrong" flags to pages? AManWithNoPlan (talk) 14:23, 11 March 2023 (UTC)
The citation was apparently not "complete" since the bot added the s2cid. It looks like it added the s2cid based on the doi. Since it was making changes, why didn't it update the incorrect pages and date to match the doi? --Whywhenwhohow (talk) 05:27, 12 March 2023 (UTC)
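The congruence check proposed in this thread could look something like the sketch below. It is not how the bot works; the function name, the field list and all sample values are hypothetical, and it deliberately sidesteps the objection raised above about which source is right by only reporting differences.

```python
def mismatched_fields(citation: dict, retrieved: dict,
                      fields=("pages", "date", "pmid", "pmc")) -> list:
    """Return the fields present on both sides whose values differ."""

    def norm(value) -> str:
        # Treat en-dashes and hyphens alike so "10–12" equals "10-12".
        return str(value).strip().replace("–", "-").lower()

    return [
        field
        for field in fields
        if field in citation
        and field in retrieved
        and norm(citation[field]) != norm(retrieved[field])
    ]


# Placeholder values, not taken from any real citation.
existing = {"pages": "10-12", "date": "2001", "pmid": "111"}
from_doi = {"pages": "15–20", "date": "2001", "pmid": "222", "pmc": "333"}
print(mismatched_fields(existing, from_doi))
```

A report like this says nothing about which value is wrong, so it could at most prompt a human to look, not justify an automatic overwrite.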
Three errors: 1) Mawgan Porth is being cited as a book, and does not even have the supposed chapter, "Archaeology", which the bot added; 2) the bot incorrectly capitalized "het" in "Verslag van het Friesch"; 3) Flint Implements already has an SBN, so there is no reason to add an ISBN which does not even appear in the book.
Businesswire contains press releases. Same for prnewswire, globenewswire, newswire, et al. cite press release should be used for press releases. --Whywhenwhohow (talk) 04:24, 13 March 2023 (UTC)
I don't think that is necessary but it should use press release going forward. Consider using the "via=" parameter for the distribution site instead of work or publisher. --Whywhenwhohow (talk) 01:38, 14 March 2023 (UTC)
timetravel.mementoweb.org
Status
Fixed. The timetravel.mementoweb.org page does not seem like a good URL to add here. Has the bot been adding it anywhere else?
A journal paper beginning "η Carinae's Dusty Homunculus Nebula ..." is rendered as "Η Carinae's Dusty Homunculus Nebula ..."
What should happen
Previous discussions notwithstanding, this forcing of sentence case is just wrong. The Greek letter is part of a Bayer designation and should be rendered with the lowercase Greek letter in all situations (although when Romanised it is always upper-cased, for example as "Eta Carinae's ...").
Replication instructions
Here is an example of a citation generated by the bot. What appears to be an "H" is actually an upper-case Greek letter eta (η, upper case Η). Clicking through any of the identifiers will show the lower-cased form. The bot expands to the uppercase form whichever of the identifiers is given as a seed, so I think it is the bot forcing the upper-case.
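The effect described here is easy to reproduce in most languages: upper-casing "η" (U+03B7) gives "Η" (U+0397), which looks like a Latin "H" but is a different character. The sketch below shows one possible guard, assuming the fix is simply to leave a leading lowercase Greek letter alone. The function name is hypothetical and this is not the bot's actual code.

```python
import unicodedata


def capitalize_title_start(title: str) -> str:
    """Upper-case the first character unless it is a lowercase Greek letter."""
    if not title:
        return title
    first = title[0]
    if unicodedata.name(first, "").startswith("GREEK SMALL LETTER"):
        # Likely a Bayer designation such as "η Carinae"; keep it as written.
        return title
    return first.upper() + title[1:]


title = "η Carinae's Dusty Homunculus Nebula"
print("η".upper() == "\u0397")   # upper-case eta is U+0397 ...
print("η".upper() == "H")        # ... which is not the Latin letter H
print(capitalize_title_start(title))
```

A blanket rule like this would also leave alone titles that start with an ordinary Greek word, so a real fix might need to look at what follows the letter.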
The bot's unnecessary change of "vid" to "id" in the URL to MacCary's and Willcock's book in Plautus produced a 404 "page not found" error. I have changed it back twice. Kanjuzi (talk) 19:38, 17 March 2023 (UTC)
Can Citation bot be instructed to avoid making a specific change?
The bot repeatedly adds a bogus "series" parameter almost every time it edits an article, besides adding a modern ISBN to an 1899 work (with different pagination at times). I don't know how many times I've posted this here, but "Also known as: Official records of the Union and Confederate armies" is not something that's useful for the bot to be adding, and adding the ISBN of a knock-off reprint is potentially causing pagination and verification issues in a number of articles, including good and featured articles. Is there any way to keep the bot from editing citations to this source, since it consistently makes the same two dubious changes? I don't want to have to throw a bot-stopper template into all of these articles, because the bot does sometimes make productive changes to other citations. I do wish bot-ops would be a bit more understanding of how frustrating it is to painstakingly make sure everything is properly referenced, and then to see a bot add crap to citation templates and break verifiability. Hog Farm Talk 14:51, 16 March 2023 (UTC)
The |series= is now black-listed. You can specify | isbn = <!-- a comment --> to block a parameter. I have added |isbn=9780918678072 to a black-list, since it seems to point to multiple books. AManWithNoPlan (talk) 15:11, 16 March 2023 (UTC)
Hog Farm is not the only writer affected by this bot. I use the Official Records often (it has numerous volumes), and have over a dozen articles with changes to the Reference section. TwoScars (talk) 20:41, 16 March 2023 (UTC)
We can find literally hundreds of articles where Kannada numerals are used in infoboxes and cs1/2 templates under "date format in template" in checkwiki. Use of Kannada numerals could be discouraged in templates only, or we could look for alternative solutions. From my point of view, I have seen an issue and reported it. If this can be resolved using this gadget, good; if not, please consider continuing this issue at another appropriate place. Meanwhile I will check whether this is limited to the Kannada wiki or also affects other Dravidian or Indo-Aryan languages. రుద్రుడు (talk) 02:07, 16 March 2023 (UTC)
Is there a reason we're making a minor edit to citation templates that have a single author? This diff [54] changes a single author parameter to multiple when there is only a single author. Unless there was a discussion I was unaware of (which is certainly possible), the template docs still indicate |last=/|first= for a single author. Is there consensus for the bot to be making such a non-substantive change? Because I've been seeing a lot of these in my watchlist. ButlerBlog (talk) 12:33, 19 March 2023 (UTC)
I think the better answer (though that is one) is that there is a substantive change in this edit. Izno (talk) 19:06, 19 March 2023 (UTC)
Whoops!! I guess I wasn't looking far enough down the list of params and missed the |first2=/|last2= in those - disregard my stupid question - it should be |first1=/|last1= in this case. ButlerBlog (talk) 20:41, 19 March 2023 (UTC)
Sometimes when a {{cite thesis}} template contains an external link to an institutional repository, the citation bot will add a spurious |journal= parameter.
Relevant diffs/links
The bot adds "|journal=" to existing "cite thesis" templates: [55][56][57][58].
The bot adds a journal parameter while also changing from a different template type to "cite thesis": [59][60].
The bot changes a "|via=" parameter to "|journal=": [61].
CB keeps marking a game review as a news piece, which should not be marked as such. When in doubt, the bot should not make unnecessary template conversions, especially when it makes no visual difference.
slightly more accurate would be |volume= 22 (but who t.f. cares?)
now we have |volume= 22 |number= 22, so worse than at the start. QED, baaad bot! Unnecessarily pedantic, AND faulty. Arminden (talk) 17:16, 8 April 2023 (UTC)
Yes. But please don't (a) change the domain (Google makes its own call on that anyway) or (b) remove the hl=en tag, which is often there for good reason. Ping AManWithNoPlan, who seems to be the active maintainer. Thanks, Justlettersandnumbers (talk) 12:17, 23 March 2023 (UTC)
Anyone finding a URL that should be trimmed and is not, let me know here. Please note that much of the capability is not yet in the main bot source code branch. So, I can run it, but you cannot. AManWithNoPlan (talk) 14:43, 25 March 2023 (UTC)
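As a rough illustration of the trimming discussed in this thread, the sketch below drops a fragment only when it consists entirely of viewer state such as "#v=onepage&q&f=false", and leaves the domain and the query string (including hl=en) untouched. The set of clutter keys and values is an assumption drawn from that one example, the book id in the sample URL is a placeholder, and none of this is the bot's own code.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# Assumption: keys and values treated as clutter, based only on the
# "#v=onepage&q&f=false" example; any other fragment is left alone.
CLUTTER_KEYS = {"v", "q", "f"}
CLUTTER_VALUES = {"", "onepage", "false"}


def trim_fragment(url: str) -> str:
    """Drop a clutter-only fragment; keep the domain and query unchanged."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.fragment, keep_blank_values=True)
    if pairs and all(k in CLUTTER_KEYS and v in CLUTTER_VALUES for k, v in pairs):
        return urlunsplit(parts._replace(fragment=""))
    return url


# Placeholder URL; "abc123" is not a real book id.
long_url = "https://books.google.co.uk/books?id=abc123&pg=PA5&hl=en#v=onepage&q&f=false"
print(trim_fragment(long_url))
```

Fragments that carry a search term or anything unrecognised pass through unchanged, which keeps the sketch on the cautious side given that some fragments do matter.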
Bot didn't know what to do, but also didn't time out.
We can't proceed until
Feedback from maintainers
I tried to get the bot to add citations from two different dois but it never did anything (granted, the dois and their websites are squirrelly). This happened both with the citation expander gadget in the edit window and when trying from the toolbar. I seem to recall that the bot used to time out after 5 minutes. The dois in question are 10.14255/2308-9628/06.21/1 and 10.32999/ksu1990-553X/2021-17-3-1. Abductive (reasoning) 04:48, 4 April 2023 (UTC)
You can look them up from other metadata. For instance (although I think this is not exactly how the bot does it) you can see the metadata for a doi by using a Unix/MacOS command line like
Not a bug. The edit was correct – as far as it goes; not all metadata are available to the bot. |title=PressReader.com - Connecting People Through News is a completely bogus title. |website=Pressreader.com is hardly any better. The correct citation template should look something like this:
{{cite news |last=Kryk |first=John |date=1 February 2017 |url=https://www.pressreader.com/canada/ottawa-citizen/20170201/281951722545654 |title=Owner honed his Kraft in Canada |newspaper=Ottawa Citizen |via=Pressreader}}
Nice tool, thank you for making it. I use it regularly to check my additions.
For values in "journal" etc, non-English titles wouldn't need to be capitalized like English ones: [63][64]. In these cases, there should be a value in language= other than "en".
There is a source listed on the article Fort Miami (Indiana) called: “Poinsatte”. To my knowledge “Poinsatte” is a surname. I could not find any documents or sources elsewhere named “Poinsatte”. What does “Poinsatte” mean or refer to? It is a source listed several times as a citation for parts of the article, yet there is no further information on what it is. Thank you, any help is appreciated.
You are correct, this is very hard to do - which just seems so wrong. I agree completely. I will not flag this as wontfix and will hold out hope of doing it. AManWithNoPlan (talk) 16:50, 2 April 2023 (UTC)
Doesn't seem to be specific to s2cid. The bot currently does its best to replicate the field spacing style already in use when it updates any template. So if you have, for example, {{cite journal |bibcode=2005ApJ...624..973V }}, then you'll get an expanded citation with the s2cid spaced as you prefer. Having the bot force a different style from the one already being used in a template is probably undesirable. Equally, any mass changes to the styles of existing citations probably aren't warranted, even if some of them form dense blocks of text with line-breaks in odd places or not at all. Lithopsian (talk) 17:12, 2 April 2023 (UTC)
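The idea of replicating the spacing already in use can be sketched as below: read the whitespace around the last existing parameter and reuse it for the new one. This is a simplified illustration rather than the bot's implementation. The function name and the s2cid value are hypothetical, and the regular expression assumes no nested templates or piped wikilinks inside parameter values.

```python
import re

# Captures the whitespace around the last "|name=value" before the closing braces.
LAST_PARAM = re.compile(r"(\s*)\|(\s*)[^=|]+?(\s*)=(\s*)[^|]*?(\s*)$")


def append_param(template: str, name: str, value: str) -> str:
    """Append |name=value using the spacing of the template's last parameter."""
    if not template.endswith("}}"):
        raise ValueError("expected a template ending in '}}'")
    body = template[:-2]
    match = LAST_PARAM.search(body)
    if match is None:
        # No existing parameter to imitate; fall back to a plain style.
        return body + " |" + name + "=" + value + "}}"
    lead, after_pipe, before_eq, after_eq, trail = match.groups()
    base = body[: match.start(5)]  # body without its trailing whitespace
    return f"{base}{lead}|{after_pipe}{name}{before_eq}={after_eq}{value}{trail}" + "}}"


# "12345" is a placeholder, not a real s2cid.
print(append_param("{{cite journal |bibcode=2005ApJ...624..973V }}", "s2cid", "12345"))
print(append_param("{{cite journal | bibcode = 2005ApJ...624..973V }}", "s2cid", "12345"))
```

Copying the style of the nearest parameter keeps a diff limited to the added field, which fits the point above that forcing a different style is undesirable.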
Three errors. 1: for ref "ref name=CU", the website is being cited, not the book. The book is already cited (as Bruce-Mitford 1989b), but the website is being cited for the information it contains about the specific copy held by Columbia University. 2: "pages = 51–51" was changed to "pages = 51". The original was a typo (it should have been "pages = 50–51"), but even then, it shouldn't have been changed to "pages = 51" (perhaps "page = 51"). 3: "last1 = Mitford" / "first1 = Bruce" is added for a January 1939 work. This is based on errant data from Cambridge Core (which I'm trying to get them to correct), but I've already removed this error at least once from the article, and Citation bot keeps adding it back.