Broken Image Repairer

What is the problem?

A long time ago, it was possible to inline images in posts from all kinds of external sources. Since the switch from HTTP to HTTPS, only HTTPS sources are allowed, so images served over plain HTTP no longer render. This leads to ugly blurbs like

alt text http://example.com/image.png

instead of a nicely formatted page with images. Sometimes the links no longer work at all, even for HTTPS images, which will show like this: ... Luckily, the Wayback Machine can rescue some of the lost images. Since a picture often says more than a thousand words, it's important to restore the post to its original state; important enough to justify the occasional bump of an old post (see below).
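As a rough illustration of the first step, here is a minimal Python sketch that finds inline Markdown images served over plain HTTP and builds a lookup URL for the Wayback Machine availability API. This is not the bot's actual code (which is unpublished); the function names and the regular expression are assumptions made for this example, and it only handles the inline `![alt](url)` syntax.

```python
import re
from typing import List
from urllib.parse import urlencode

# Inline Markdown image: ![alt text](url "optional title")
IMAGE_PATTERN = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?P<url>\S+?)(?:\s+"[^"]*")?\)')


def find_insecure_images(markdown: str) -> List[str]:
    """Return the URLs of inline images that use plain HTTP."""
    return [
        match.group("url")
        for match in IMAGE_PATTERN.finditer(markdown)
        if match.group("url").lower().startswith("http://")
    ]


def wayback_lookup_url(image_url: str) -> str:
    """Build a query URL for the Wayback Machine availability API."""
    return "https://archive.org/wayback/available?" + urlencode({"url": image_url})


if __name__ == "__main__":
    post = (
        "Before ![alt text](http://example.com/image.png) and "
        "![ok](https://i.stack.imgur.com/abc.png)"
    )
    for url in find_insecure_images(post):
        print(url, "->", wayback_lookup_url(url))
```

A real implementation would also need to handle reference-style images and `<img>` tags, and skip anything inside code blocks.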

What does it do?

'It' is a program which automatically attempts to repair these broken images. You can think of it as a paintings conservator. It will upload the images to Stack Exchange's imgur channel, which makes sure they'll live as long as the posts they appear in, and not get broken again in the future. If the image isn't hosted on one of the popular image hosting sites like ImageShack or TinyPic, it will attempt to add attribution if it isn't present already. It will edit posts (under my own account) if I have edit privileges on a site; if not, it will suggest an edit.
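The attribution rule could look roughly like the sketch below. It is an assumption-laden illustration, not the bot's real logic: the host list only contains the two examples named above (with guessed domain names), and "attribution already present" is approximated by checking whether the source host is mentioned anywhere else in the post.

```python
from urllib.parse import urlparse

# Hypothetical list: the text only names ImageShack and TinyPic as examples.
IMAGE_HOSTS = {"imageshack.us", "tinypic.com"}


def needs_attribution(image_url: str, post_body: str) -> bool:
    """Decide whether a 'source' link should be added for a rehosted image."""
    host = (urlparse(image_url).hostname or "").lower()
    # Match the image host itself or any of its subdomains.
    if any(host == h or host.endswith("." + h) for h in IMAGE_HOSTS):
        return False
    # Assume the post is already attributed if the source host is
    # mentioned somewhere other than in the image URL itself.
    remaining = post_body.replace(image_url, "").lower()
    return host not in remaining
```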

How much does it do?

The program runs approximately once every 36 hours. Edits are limited to one suggested edit at a time, and up to three autonomous edits to e.g. Community Wiki posts; this limits the effect on the front page and avoids increasing the burden on reviewers too much. (I'll run the program more often on Stack Overflow because of the vast number of posts to fix there; flooding the front page is rather hard there.) If the previous suggested edit is still pending review, the program will skip that site. I'm an avid reviewer myself and wouldn't like to review hundreds of the same type of edits. Also, whenever I'm able to, I manually review the edits to correct typos and improve formatting. Review by other people helps find bugs like this one, where the program attempted to replace images in code. Hats off to the reviewers there!
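The limits above can be sketched as a small selection function. This is one possible reading of the policy, not the bot's actual code: the `Candidate` type, the constant names, and the idea of a single per-site planning function are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List

MAX_SUGGESTED_EDITS = 1  # "one suggested edit at a time"
MAX_DIRECT_EDITS = 3     # "up to three autonomous edits"


@dataclass
class Candidate:
    post_id: int
    can_edit_directly: bool  # e.g. a Community Wiki post


def plan_run(candidates: List[Candidate], suggestion_pending: bool) -> List[Candidate]:
    """Pick the posts to edit on one site during one run."""
    if suggestion_pending:
        # The previous suggested edit is still in review: skip the site.
        return []
    direct = [c for c in candidates if c.can_edit_directly][:MAX_DIRECT_EDITS]
    suggested = [c for c in candidates if not c.can_edit_directly][:MAX_SUGGESTED_EDITS]
    return direct + suggested
```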

Questions?

I'm always happy to discuss automation of moderation tasks. Just ping me in chat (in Charcoal HQ, Ask Different Chat, SOCVR, or Tavern on the Meta) or invite me into a separate room. A post on your local Meta is fine as well, as long as I'm somehow notified about it (either via a ping in chat or a comment reply; those also work on the posts I've edited, even though my username isn't autocompleted). FWIW, I also examine all rejected edits made by the program. Leaving a comment below works, but since I don't get notified of new comments, it could take a while before I react.

The bot will detect error/placeholder images like this one, but only if it knows what they look like. Some of these have been hardcoded into the source code, but please let me know if you encounter a case where the bot 'repairs' a post with such a placeholder image anyway.
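One simple way to recognise known placeholder images is to compare a hash of the downloaded bytes against a list of known hashes. The sketch below assumes exact byte-for-byte matches and uses hypothetical names; how the bot actually compares images is not documented here, and it may well use something more tolerant (for example, a perceptual comparison).

```python
import hashlib

# Hypothetical registry of SHA-256 digests of known error/placeholder images.
KNOWN_ERROR_IMAGE_HASHES = set()


def register_error_image(data: bytes) -> None:
    """Remember an image that a host serves instead of the real picture."""
    KNOWN_ERROR_IMAGE_HASHES.add(hashlib.sha256(data).hexdigest())


def is_known_error_image(data: bytes) -> bool:
    """Return True if these bytes match a registered placeholder exactly."""
    return hashlib.sha256(data).hexdigest() in KNOWN_ERROR_IMAGE_HASHES
```

Exact hashing misses placeholders that are re-encoded or resized, which is one reason newly encountered error images would still need to be reported.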

gparyani commented Jan 29, 2019

Is there a way to run the image fixer script manually on a specific post, in case we come across a post that should be fixed?

Unihedro commented Feb 23, 2019

This is a smart idea, and some of my posts on MSO got edited; I do like that they're now properly rehosted. I am manually deleting the links to the old images (added with a "source:" link), though, since I feel they are not necessary.

roddypratt commented Mar 1, 2019

Placeholder image from clip2net.com incorrectly "repaired" on this question. https://stackoverflow.com/posts/532777/revisions

Glorfindel83 (owner, author) commented Apr 12, 2019

Thank you @roddypratt; it's been added to a list of known 'error images'.

dtgriscom commented Apr 17, 2019

This is a great tool, but a question: when I see a "Suggested Edit" made by it, has it been reviewed by you? Alternately, how likely is it that it might actually break something? (I'm wondering whether I can just reflexively approve anything the tool suggests, or if I need to actually check the rendered output. Trying to save time...)

Glorfindel83 (owner, author) commented May 3, 2019

@dtgriscom I check most but not all edits, and I usually don't do it immediately after the edit has been made, but after the bot has finished a run for all sites in the network (this takes an hour or so). I'm pretty sure it doesn't make any wrong edits anymore (except for the occasional error image); I have 60 test cases (posts the bot previously failed to edit correctly) in an automated test suite (example). But if you can improve the post further (e.g. spelling / grammar corrections), it's always a plus. (I'm working on something to automate those corrections too, but it will be a lot harder.)

Melebius commented Aug 7, 2019

I’ve got one enhancement request. The edit comment posted by the script currently:

broken images fixed (click 'rendered output' or 'side-by-side' to see the difference; images retrieved via Wayback Machine); for more info, see https://gist.github.com/Glorfindel83/9d954d34385d2ac2597bbe864466259f

is pretty long and I find it useless to push it repeatedly into the database (see also SSOT, DRY). I’d shorten it to something like:

broken images fixed, see https://gist.github.com/Glorfindel83/9d954d34385d2ac2597bbe864466259f

The details in parentheses are something that a good reviewer should already know (viewing the difference) or that can be found by viewing the Markdown code (source of images).

By the way, is the code of the script available somewhere on GitHub, for example? I couldn’t find it quickly. If it was clearly stated on this page, I could create an issue or post a pull request directly.

Glorfindel83 (owner, author) commented Aug 7, 2019

Thank you for reviewing my suggested edits!
I completely agree that reviewers should know the two different 'diff' views, but I noticed that adding this information increased the approval rate from (roughly) 95% to 99%. Some users just don't know what's going on and need a little nudge. It might also be useful for other new contributors, unfamiliar with how Stack Exchange edits work, who are wondering why that Glorfindel guy is making so many edits.
I don't think the extra-long edit summaries are causing disk space problems for Stack Exchange. If they were, they'd probably have told me already.
I haven't published the source code yet since 1) it contains my access tokens, 2) it's not properly documented, and 3) I'd rather not have multiple instances of it running, to avoid blowing up the suggested edits review queue and/or the front page with bumped posts. I'll probably publish a modified version in the future to accommodate requests like this one (in German, but Google Translate does a pretty decent job).
