@Glorfindel83
Last active August 19, 2023 14:08
Broken Image Repairer

What is the problem?

A long time ago, it was possible to inline images from all kinds of external sources. Since the switch from HTTP to HTTPS, images served over plain HTTP are no longer shown inline; only HTTPS sources are allowed. This leads to ugly blurbs like

alt text http://example.com/image.png

instead of a nicely formatted page with images. Sometimes the links don't work anymore at all, even for HTTPS images, which leaves a broken image in the post. Luckily, the Wayback Machine is able to rescue some of the lost images. Since a picture often says more than a thousand words, it's important to restore the post to its original state; important enough to justify the occasional bump of an old post (see below).
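As an illustration of the Wayback Machine step, here is a minimal sketch of how an archived copy could be looked up through the Internet Archive's public availability endpoint. It is not the bot's actual code (which has not been published); the function names are made up for this example, and the returned URL points at the archived capture, which may still need further handling before the raw image can be downloaded.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/availability"


def availability_url(image_url: str) -> str:
    """Build the lookup URL for the closest archived copy of image_url."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": image_url})


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the snapshot URL from an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(image_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the Wayback Machine whether it holds a copy of image_url."""
    with urlopen(availability_url(image_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Calling `find_archived_copy("http://example.com/image.png")` performs a network request; the two helper functions can be exercised offline.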

What does it do?

'It' is a program that automatically attempts to repair these broken images. You can think of it as a paintings conservator. It will upload the images to Stack Exchange's Imgur channel, which makes sure they'll live as long as the posts they appear in and won't break again in the future. If the image isn't hosted on one of the popular image hosting sites like ImageShack or TinyPic, it will attempt to add attribution if it isn't present already. It will edit posts (under my own account) if I have edit privileges on a site; if not, it will suggest an edit.
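The attribution rule and the edit-versus-suggest decision described above could be sketched as follows. This is an assumption-laden illustration, not the bot's implementation: the host list contains only the two sites named in the text, the "already attributed" check is a deliberately crude substring test, and all names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical list: the text only names ImageShack and TinyPic as examples.
IMAGE_HOSTS = {"imageshack.us", "tinypic.com"}


def hostname(url: str) -> str:
    """Lower-cased host part of a URL, or an empty string if there is none."""
    return (urlparse(url).hostname or "").lower()


def is_image_host(url: str) -> bool:
    """True if the URL lives on (a subdomain of) a known image hosting site."""
    host = hostname(url)
    return any(host == h or host.endswith("." + h) for h in IMAGE_HOSTS)


def needs_attribution(original_url: str, post_body: str) -> bool:
    """True if the image came from a regular website and the post does not
    mention that website anywhere apart from the image URL itself."""
    host = hostname(original_url)
    if not host or is_image_host(original_url):
        return False
    remaining = post_body.replace(original_url, "")
    return host not in remaining.lower()


def edit_mode(has_edit_privileges: bool) -> str:
    """Edit directly when privileged on the site, otherwise suggest an edit."""
    return "edit" if has_edit_privileges else "suggest"
```

For example, `needs_attribution("http://example.com/image.png", "![alt text](http://example.com/image.png)")` is true, because the post never mentions example.com outside the image link.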

How much does it do?

The program runs approximately once every 36 hours. Edits are limited to one suggested edit at a time, and up to three autonomous edits to e.g. Community Wiki posts; this limits the effect on the front page and does not increase the burden on reviewers too much. (I'll run the program more often on Stack Overflow because of the vast number of posts to fix there; flooding the front page is rather hard there.) If the previous suggested edit is still pending review, the program will skip that site. I'm an avid reviewer myself and wouldn't like to review hundreds of edits of the same type. Also, whenever I'm able to, I manually review the edits to correct typos and improve formatting. Reviewing by other people helps find bugs like this one, where the program attempted to replace images in code. Hats off to the reviewers there!
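One way to read the limits above is as a small per-site planning step. The sketch below is only an interpretation of the text (one suggested edit and up to three direct edits per site per run, and nothing at all while a suggestion is pending); the `Candidate` type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MAX_SUGGESTED_PER_SITE = 1  # "one suggested edit at a time"
MAX_DIRECT_PER_SITE = 3     # "up to three autonomous edits"


@dataclass
class Candidate:
    post_id: int
    can_edit_directly: bool  # e.g. a Community Wiki post


def plan_site_run(candidates: List[Candidate], suggestion_pending: bool) -> List[Candidate]:
    """Pick the posts to handle on one site during one run."""
    if suggestion_pending:
        # The previous suggested edit has not been reviewed yet: skip the site.
        return []
    suggested_left = MAX_SUGGESTED_PER_SITE
    direct_left = MAX_DIRECT_PER_SITE
    planned = []
    for candidate in candidates:
        if candidate.can_edit_directly and direct_left > 0:
            direct_left -= 1
            planned.append(candidate)
        elif not candidate.can_edit_directly and suggested_left > 0:
            suggested_left -= 1
            planned.append(candidate)
    return planned
```

With five directly editable posts and two that need a suggestion, this plan would pick three of the former and one of the latter, keeping both the front page and the review queue quiet.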

Questions?

I'm always happy to discuss automation of moderating tasks. Just ping me in chat, in Charcoal HQ, Ask Different Chat, SOCVR, Tavern on the Meta or invite me into a separate room. A post on your local Meta is fine as well, as long as I'm somehow notified about this (either via a ping in chat or a comment reply - those also work on the posts I've edited, even though my username isn't autocompleted). FWIW, I also examine all rejected edits made by the program. Leaving a comment below works, but since I don't get notified of any new comments, it could take a while before I react.

The bot will detect error/placeholder images like this one, but only if it knows what they look like. Some of these have been hardcoded into the source code, but please let me know if you come across an error image that the bot failed to recognize.
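How the known error images are stored is not described here. One simple possibility, shown purely as an assumption, is to compare a digest of the downloaded bytes against a set of digests of known placeholders. The names below are hypothetical, and an exact-byte comparison like this would miss placeholders that are re-encoded or resized.

```python
import hashlib

# Hex SHA-256 digests of images known to be error/placeholder images.
# Hypothetical storage: filled via register_error_image().
KNOWN_ERROR_IMAGE_DIGESTS = set()


def image_digest(image_bytes: bytes) -> str:
    """SHA-256 digest of the raw image bytes, as a hex string."""
    return hashlib.sha256(image_bytes).hexdigest()


def register_error_image(image_bytes: bytes) -> None:
    """Remember an image that a site serves in place of a missing one."""
    KNOWN_ERROR_IMAGE_DIGESTS.add(image_digest(image_bytes))


def is_error_image(image_bytes: bytes) -> bool:
    """True if these exact bytes match a registered error image."""
    return image_digest(image_bytes) in KNOWN_ERROR_IMAGE_DIGESTS
```

A download that matches a registered digest would then be treated as "image not found" instead of being rehosted.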

@gparyani

Is there a way to run the image fixer script manually on a specific post, in case we come across a post that should be fixed?

@Unihedro

This is a smart idea, and some of my posts on MSO got edited; I do like that they're now properly rehosted. Though I'm manually deleting the links to the old images (added as a "source:" link), which I feel are not necessary.

@roddypratt

Placeholder image from clip2net.com incorrectly "repaired" on this question. https://stackoverflow.com/posts/532777/revisions

@Glorfindel83
Author

Thank you @roddypratt; it's been added to a list of known 'error images'.

@dtgriscom

This is a great tool, but a question: when I see a "Suggested Edit" made by it, has it been reviewed by you? Alternately, how likely is it that it might actually break something? (I'm wondering whether I can just reflexively approve anything the tool suggests, or if I need to actually check the rendered output. Trying to save time...)

@Glorfindel83
Author

@dtgriscom I check most but not all edits, and I usually do it not immediately after the edit has been made, but after the bot has finished a run for all sites in the network (this takes an hour or so). I'm pretty sure it doesn't make any wrong edits anymore (except for the occasional error image); I have 60 test cases (posts the bot previously failed to edit correctly) in an automated test case suite (example). But if you can improve the post further (e.g. spelling / grammar corrections), it's always a plus. (Though I'm working on something to automate those corrections too, but it will be a lot harder.)

@Melebius

Melebius commented Aug 7, 2019

I’ve got one enhancement request. The edit comment posted by the script currently:

broken images fixed (click 'rendered output' or 'side-by-side' to see the difference; images retrieved via Wayback Machine); for more info, see https://gist.github.com/Glorfindel83/9d954d34385d2ac2597bbe864466259f

is pretty long and I find it useless to push it repeatedly into the database (see also SSOT, DRY). I’d shorten it to something like:

broken images fixed, see https://gist.github.com/Glorfindel83/9d954d34385d2ac2597bbe864466259f

The details in parentheses are something a good reviewer should already know (viewing the difference) or can find by viewing the markdown code (source of images).

By the way, is the code of the script available somewhere on GitHub, for example? I couldn’t find it quickly. If it was clearly stated on this page, I could create an issue or post a pull request directly.

@Glorfindel83
Author

Thank you for reviewing my suggested edits!
I completely agree that reviewers should know the two different 'diff' views, but I noticed that adding this information increased the approval rate from (roughly) 95% to 99%. Some users just don't know what's going on and need a little nudge. It might also be useful for other new contributors, unfamiliar with how Stack Exchange edits work, who are wondering why that Glorfindel guy is making so many edits.
I don't think the extra long edit summaries are causing disk space problems for Stack Exchange. If they did, they'd probably have told me already.
I haven't publicized the source code yet since 1) it contains my access tokens 2) it's not properly documented 3) I'd rather not have multiple instances of these running, to avoid blowing up the suggested edits review queue and/or the front page with bumped posts. I'll probably publish a modified version in the future to accommodate requests like this one (in German, but Google Translate does a pretty decent job).

@x-yuri

x-yuri commented May 7, 2020

@Glorfindel83 I should've probably just approved the edit, but it says, "broken image fixed." And there's no broken image. So what it does is replace the image with an identical one from imgur. Is that correct?

This one is probably primarily to myself. I'm thinking of adding a link to the original article.

@Glorfindel83
Author

@x-yuri that's a surprise. I guess that while the script was running, the Docker website (or at least the part hosting that image) was temporarily down, so the script thought it was broken. One more reason to host it on Stack Exchange's own Imgur channel...

@timtjtim

It probably shouldn’t be replacing Valuable Flair with imgur links. It would be better to update these to https. See https://lifehacks.meta.stackexchange.com/revisions/1272/5

There’s an argument that nominations like that should be “frozen” in time when they’re posted, but it’s a bit late for that now.

@Glorfindel83
Author

Thanks @timtjtim, that makes sense indeed. Strange that this situation never occurred before...
