Okay, it looks like an update from earlier this week broke table/log uploading (though the backend was still updated). I fixed it about halfway through its current run ("Kentucky"), so tonight's run will be incomplete, but it should work fine tomorrow night. Thanks for everybody's patience, and please do @mention me in the future with any issues. I've tried to set up watchlisting/emailing for this page but have never gotten it to work! audiodude (talk) 02:39, 9 April 2025 (UTC)[reply]
The API is only contacted in the case that an article that was previously rated can't be found, to check for article move metadata. So yes, maybe the API was broken, but why did the tool think the article was deleted/moved in the first place?
This is the current page in the tool, which would be uploaded to en wiki, after a manual run of Austria:
The only problem now is that the next time the bot runs, it will see all of these 32k articles as "new" and write that into the Wikipedia:Version_1.0_Editorial_Team/Austria_articles_by_quality_log, which will probably be too large to upload (like the current version, which is likely full of "everything got deleted" messages). So I think what we want to do is purge the logs after the next successful run, and they will start being created again the following day.
Audiodude, if you think the process should be ok now, do you think the bot can be unblocked now, rather than just waiting for the block to expire? Nurg (talk) 01:00, 15 May 2025 (UTC)[reply]
No, I don't think it's a good idea to remove the block early. I honestly don't think it's a good idea to remove it at all until we have a better idea of what caused this and how to prevent it in the future. audiodude (talk) 03:17, 15 May 2025 (UTC)[reply]
I've merged and deployed https://github.com/openzim/wp1/pull/869, which addresses/fixes the issue I raised earlier, to hopefully prevent this from happening in the future. However, I still have no idea what caused it, and I don't have any idea where or what to look for in that regard. The bot logs aren't helpful here. I guess I'm hoping to get an email along the lines of "FYI tool maintainers: the database was returning garbage briefly a day ago".
To be clear, as mentioned in that PR, it is plausible that if the replica database returned an empty list for "FooBar articles by quality", this could happen (which is what my PR aims to address). However, it's not clear why it would return an empty list and not an out-of-band database error. It's not an issue with any of the tables or our query, because those remain unchanged and as I pointed out, manual updates are working.
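To illustrate the kind of guard the PR describes, here is a minimal sketch (function and variable names are hypothetical, not the actual wp1 code): if the replica query for a project's rated articles comes back empty while our local copy is non-empty, treat the result as suspect and abort the update rather than marking every article as deleted.

```python
# Hypothetical sketch of the "empty result" guard; names are
# illustrative and do not correspond to the real wp1 codebase.

def safe_update(project, replica_articles, local_articles):
    """Return the article list to use for `project`, or None to abort.

    If the replica suddenly returns an empty list while we previously
    had thousands of rated articles, assume the query silently failed
    rather than concluding that every article was deleted.
    """
    if not replica_articles and local_articles:
        # Suspicious: an entire project does not empty out overnight.
        # Skip this update cycle instead of cascading mass deletions.
        return None
    return replica_articles


# An unexpectedly empty replica result is rejected...
print(safe_update("Austria", [], ["A", "B", "C"]))  # None -> abort
# ...while a genuinely empty project (both sides empty) passes through.
print(safe_update("Vital", [], []))  # []
```

Note that a project that is legitimately empty on both sides (like the Vital case mentioned later in this thread) still passes through, so the guard only fires on the suspicious asymmetric case.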
So really, I think I've narrowed down the problem to "enwiki_p returned an empty list" but I have no idea still what could have caused that. audiodude (talk) 04:57, 15 May 2025 (UTC)[reply]
I strongly recommend opening an upstream bug at Phabricator, as the current fix in the bot looks more like a workaround than a fix. Kelson (talk) 05:57, 15 May 2025 (UTC)[reply]
Thanks for the ping. @Firsfron can you unblock the bot now? I don't expect this to happen again, and we haven't gotten any real response on the phabricator ticket. audiodude (talk) 12:59, 19 May 2025 (UTC)[reply]
Thank you @Firsfron and @Nurg for the quick actions. Apologies that I wasn't able to have my finger on the absolute pulse of the bot as soon as it started this evening. We're all volunteers here. I will continue to pursue a permanent resolution to this problem. Thanks again for your patience. audiodude (talk) 00:07, 20 May 2025 (UTC)[reply]
Is it possible to block the bot in the User namespace while leaving it unblocked in the Wikipedia namespace, since the problematic edits only occur in the User namespace? Or would a partial block cause the bot to get stuck or prevent it from functioning properly? I'm asking because its edits to the 'Articles by Quality log' pages in the Wikipedia namespace, such as Special:Diff/1291205742, appear fine. Those pages are helpful to many, so it would be great if the bot could continue updating them normally, as it seems capable of doing so. 87.95.243.221 (talk) 16:23, 21 May 2025 (UTC)[reply]
Unfortunately this wouldn't work. The only reason those pages aren't broken is because the bot was stopped/banned before it could get to them. With the current bug, I anticipate that the bot would effectively destroy all pages. audiodude (talk) 16:51, 21 May 2025 (UTC)[reply]
Note that the table is still visible, and manually updatable at [1]. Doesn't help with a changelog, but it does at least let you click on each number to get the list of articles in each cell in the table matrix. Could the "last updated" timestamp be added as a footnote to the table on openzim? The-Pope (talk) 00:13, 22 May 2025 (UTC)[reply]
Right, the tables are being generated correctly, which makes it all the more confusing that the on-wiki updates aren't working.
This morning, I moved the old logs directory out of the way and created a fresh new one. Then I queued and ran the entire update job in the same way that the cron job does.
It seems to have run successfully.
The update jobs all completed successfully, and no project had more than around 100 articles deleted. I spot-checked a few of the tables on the website and they looked good. I was also logging the generated data for the on-wiki tables, and they all had totals except for the project Vital, which had 0. But that seems right, because this category is empty: https://en.wikipedia.org/wiki/Category:Vital_articles_by_quality.
I have no idea what caused the bot to blank out those tables, and it seems to me like it wouldn't happen again since I'm running it exactly as it gets run nightly. But I can't be sure because I haven't identified a root cause.
Currently, my idea is to turn off the automatic run of the bot (cron job), unblock it, then run the bot manually and closely monitor it. audiodude (talk) 17:53, 24 May 2025 (UTC)[reply]
Agreed, but it's very worrying that we are not able to understand the root cause. I would recommend opening an issue to improve the logging/tracing capabilities (if not already done). Kelson (talk) 15:51, 25 May 2025 (UTC)[reply]
Note that there were a couple of stray projects in the queue when the block was lifted, which accounts for the other changes in contribs.
These were run with the queueing system, not from the command line, so as close as possible to the actual mechanism used by the bot. The next step is to do a full, manually triggered run (and monitor it!). If that works, I will restore the database back to the May 11th backup (because all of the "original rating" dates for all ratings have now been lost), and then re-enable scheduled updating. audiodude (talk) 01:40, 26 May 2025 (UTC)[reply]
To be clear, I still have no idea of the "root cause" of this problem. I also believe that when it appeared that the problem was still happening (after the bot was unblocked the first time), the actual issue was a backlog of "upload" jobs in the queue that were uploading empty tables before the "update" jobs of the bot could properly run and refresh the data. The bot runs in two phases: update, then upload. audiodude (talk) 01:45, 26 May 2025 (UTC)[reply]
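The two-phase ordering problem described above can be sketched roughly as follows (hypothetical names, not the actual wp1 queue code): if backlogged upload jobs run before their corresponding update jobs, they push stale or empty data on-wiki. One defensive pattern is to refuse uploads for any project that has not been refreshed during the current run.

```python
# Illustrative sketch of the update-then-upload ordering problem;
# names are hypothetical and not taken from the wp1 codebase.

def drain(jobs, refreshed):
    """Process (phase, project) jobs in order.

    `refreshed` tracks which projects an update job has touched during
    this run. Uploads of un-refreshed projects are skipped, so a
    backlog of old upload jobs cannot push empty/stale tables on-wiki.
    """
    uploaded = []
    for phase, project in jobs:
        if phase == "update":
            refreshed.add(project)  # data for this project is now fresh
        elif phase == "upload":
            if project in refreshed:
                uploaded.append(project)
            # else: stale backlogged upload; silently skip it
    return uploaded


# A backlogged upload that arrives before its update is skipped;
# the one queued after the update goes through.
backlog = [("upload", "Austria"), ("update", "Austria"), ("upload", "Austria")]
print(drain(backlog, set()))  # ['Austria']
```

This is only a sketch of the failure mode under the stated assumption; the real queue presumably has richer job metadata to work with.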
The manual full run seemed to be successful. I'm going to take down the tool and website for a couple of hours right now while I try to restore the data from a backup. Then I'll do another manual run. audiodude (talk) 14:59, 27 May 2025 (UTC)[reply]
Okay, I've restored the database (as mentioned above) and done a full manually kicked-off run. Everything looks good to me. I'm going to re-enable automatic updates and consider this issue closed. Please continue to keep an eye on things, everyone, and feel free to speak up if anything seems off again. Thanks! audiodude (talk) 00:31, 28 May 2025 (UTC)[reply]
Short summary for people who aren't familiar with GitHub: yesterday there was an update to something in the "Event" namespace for that project, and the code didn't understand that namespace and so gave up. Now it understands! --PresN 23:39, 31 May 2025 (UTC)[reply]
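For the curious, the failure mode PresN describes can be sketched like this (hypothetical names; the real fix in the linked repository simply taught the code about the new namespace). The fragile version aborts the whole run on one unrecognized namespace; a tolerant version sets the odd page aside and carries on.

```python
# Hypothetical illustration of tolerating an unknown namespace; this
# is not the actual wp1 code, just the general defensive pattern.
KNOWN_NAMESPACES = {"Article", "Talk", "Category", "Template"}

def filter_pages(pages):
    """Split (namespace, title) pairs into known and skipped lists.

    Previously, one page in an unrecognized namespace like "Event"
    made the code give up entirely; here it is merely set aside.
    """
    kept, skipped = [], []
    for namespace, title in pages:
        if namespace in KNOWN_NAMESPACES:
            kept.append(title)
        else:
            skipped.append((namespace, title))
    return kept, skipped


print(filter_pages([("Article", "Austria"), ("Event", "Some event page")]))
```

Skipping (with a log line) rather than crashing keeps one odd page from blanking thousands of tables, at the cost of possibly hiding a real configuration gap, which is why the upstream fix added the namespace properly instead.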