
@ngbeslhang
Last active December 1, 2022 06:47
How to stop redirects when attempting to access UOW Malaysia KDU Archive.org snapshots

TL;DR: replace http with https, such that e.g. http://web.archive.org/web/20210617234942/https://www.uowmkdu.edu.my/study-at-uow-kdu/transportation/ becomes https://web.archive.org/web/20210617234942/https://www.uowmkdu.edu.my/study-at-uow-kdu/transportation/.
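The manual fix above can be sketched as a small helper. `toHttpsSnapshot` is a hypothetical name used only for illustration. It assumes the snapshot lives on `web.archive.org` and changes only the leading scheme, leaving the archived URL embedded in the path untouched.

```javascript
// Hypothetical helper (not part of the original note): upgrade a Wayback Machine
// snapshot URL from http to https.
function toHttpsSnapshot(url) {
  // Only a leading "http:" directly in front of "//web.archive.org/" is replaced.
  // The lookahead keeps the rest of the string, including the archived URL, as-is.
  return url.replace(/^http:(?=\/\/web\.archive\.org\/)/i, "https:");
}

const snapshot =
  "http://web.archive.org/web/20210617234942/https://www.uowmkdu.edu.my/study-at-uow-kdu/transportation/";
console.log(toHttpsSnapshot(snapshot));
```

URLs that are already https, or that point at another host, are returned unchanged.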

Journey

I've just recently enrolled in a diploma programme at UOW Malaysia KDU (hereafter referred to as KDU), and since I live off campus, I rely on the university shuttle to get to and from campus.

Out of curiosity, I decided to look at the historical snapshots of the transportation webpage on the Wayback Machine, but every snapshot redirected me back to the original URL, so I couldn't view any of them. I opened the web developer tools' debugger and stepped through the page to see which script caused the redirect.

This is what I found:

<script>if (document.location.protocol != "https:") {document.location = document.URL.replace(/^http:/i, "https:");}</script><script type="text/javascript" src="http://web.archive.org/web/20210617234942js_/https://www.uowmkdu.edu.my/wp-includes/js/jquery/jquery.js?ver=1.12.4-wp"></script>

This single line of JavaScript, in layman's terms, redirects the user from non-secure HTTP to HTTPS by literally replacing the URL in the web browser. This kind of redirect should have been done at the web server configuration level with the relevant HTTP status codes (e.g. 301).

Since the script is baked into the snapshot already archived by Archive.org, it can't be edited at the source, so I decided to try writing a Greasemonkey script instead, but the first draft didn't work.

Rereading the original script, I then remembered that it only runs when the protocol isn't https, and that Archive.org by default doesn't redirect you to HTTPS if you just type in the URL. So I changed the protocol to https instead, and I was suddenly able to access the snapshots.
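The condition described above can be modelled as a pure function. This is only a sketch of the inline snippet's logic in isolation: `redirectTarget` is a hypothetical name, and it assumes `document.location.protocol` and `document.URL` reflect the address-bar URL. It does not model any rewriting the Wayback Machine applies to archived pages, so the returned value is not necessarily where the browser ends up.

```javascript
// Hypothetical model of the page's inline snippet: returns the URL the snippet
// would assign to document.location, or null when the snippet does nothing.
function redirectTarget(pageUrl) {
  // Equivalent of document.location.protocol, e.g. "http:" or "https:".
  const protocol = pageUrl.slice(0, pageUrl.indexOf(":") + 1).toLowerCase();
  if (protocol != "https:") {
    // Same replacement the snippet performs on document.URL.
    return pageUrl.replace(/^http:/i, "https:");
  }
  return null; // already https, so the condition is false and no redirect happens
}

const path = "//web.archive.org/web/20210617234942/https://www.uowmkdu.edu.my/study-at-uow-kdu/transportation/";
console.log(redirectTarget("http:" + path) !== null);  // the snippet fires on http
console.log(redirectTarget("https:" + path) === null); // the snippet is a no-op on https
```

Loading the snapshot over https makes the check false from the start, which is why no userscript is needed.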

I didn't necessarily have to type all of this out, but I hope this serves as a reminder, both to myself and anyone else who somehow comes across this, that at least with IT you should always, ALWAYS, keep Occam's razor in mind.

Essentially, try the simplest (potential) solutions first before moving onto more complicated solutions.

In case anyone's interested, here is the first draft of the Greasemonkey script I wrote.

// ==UserScript==
// @name (NON-WORKING) UOWMKDU Archive.org Snapshot No Redirect
// @version 1
// @run-at document-start
// ==/UserScript==
// Relies on https://stackoverflow.com/questions/3972038/stop-execution-of-javascript-function-client-side-or-tweak-it/10468821#10468821
// DOESN'T WORK: document.URL is a read-only property, so the assignment at the bottom has no effect,
// and the page's snippet checks document.location.protocol anyway.
// You can try out the regex at regex101.com; remember to set the flavor to ECMAScript (JavaScript) first, and set the regex flags to gmi
// Intentionally avoids matching pages like http://web.archive.org/web/20210915000000*/https://www.uowmkdu.edu.my/study-at-uow-kdu/transportation/
// See also: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions
const re = /^https?:\/\/web\.archive\.org\/web\/[0-9]+\/(https?:\/\/(www\.)?uowmkdu\.edu\.my.+)$/gi;
// See also:
// - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match
// - https://developer.mozilla.org/en-US/docs/Glossary/Truthy
// Essentially, it relies on the fact that String.prototype.match returns `null` (falsy) when nothing matches, and an array (truthy) otherwise.
if (window.location.href.match(re)) {
  document.URL = window.location.href; // no effect: document.URL cannot be assigned
}
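
One likely reason the draft above has no effect is that `document.URL` is a read-only property: it has a getter but no setter. The sketch below imitates that with a stand-in object, since it is meant to run outside a browser. `fakeDocument`, `ORIGINAL_URL`, and `tryAssign` are hypothetical names used only for illustration.

```javascript
// Stand-in for `document`: `URL` is defined with a getter and no setter,
// mirroring how a read-only DOM attribute behaves.
const ORIGINAL_URL = "http://web.archive.org/web/20210617234942/https://www.uowmkdu.edu.my/";
const fakeDocument = {};
Object.defineProperty(fakeDocument, "URL", {
  get() {
    return ORIGINAL_URL;
  },
});

// Attempts the same kind of assignment as the draft userscript and reports
// what the property holds afterwards.
function tryAssign(doc, value) {
  try {
    doc.URL = value; // silently ignored in sloppy mode, TypeError in strict mode
  } catch (err) {
    // Strict mode (e.g. ES modules) lands here; the property is still unchanged.
  }
  return doc.URL;
}

console.log(tryAssign(fakeDocument, "https://example.org/") === ORIGINAL_URL);
```

Either way the value never changes, so a script that wants to influence navigation has to go through `location` rather than `document.URL`.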