@hacker1024
Last active March 28, 2022 00:49
A method to download restricted eBooks from ProQuest Ebook Central.

Step 1: Load all the page elements (run this in the browser's developer console with the eBook open).

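let eBookPageCount = 0 // Set this to the number of pages in the eBook.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
// Scroll each page into view so that the viewer loads its image.
// Assumes the page containers are numbered 1 through eBookPageCount.
async function load() {
  for (let i = 1; i <= eBookPageCount; ++i) {
    let pageContainer = document.getElementById('mainPageContainer_' + i)
    if (!pageContainer) continue // Skip pages whose container does not exist.
    pageContainer.scrollIntoView()
    await sleep(320) // This sleep time may need to be changed to fit your environment.
  }
}
load()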

Step 2: Convert the page elements into a list of aria2 URLs.

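// Collect the page containers loaded in Step 1, skipping any that are missing.
let pageContainers = []
for (let page = 1; page <= eBookPageCount; ++page) {
  let pageContainer = document.getElementById('mainPageContainer_' + page)
  if (pageContainer) pageContainers.push(pageContainer)
}
let imgUrls = pageContainers.map(pageContainer => pageContainer.querySelector('.mainViewerImg').src)
// Build an aria2 input file: each URL is followed by an indented "out=" option line.
let imgUrlData = ""
for (let i = 0; i < imgUrls.length; ++i) {
  imgUrlData += imgUrls[i] + "\r\n"
  imgUrlData += "\tout=" + i + ".png\r\n"
}
copy(imgUrlData) // copy() is a developer console utility that writes to the clipboard.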
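
The list-building part of this step can be checked outside the browser. The following is a minimal sketch, not part of the original method: `buildAria2Input` is a hypothetical helper name, and the example.com URLs are placeholders. It only shows the input-file layout that aria2 reads, where each URL line is followed by an indented option line.

```javascript
// Hypothetical helper (an assumption, not from the original snippet):
// turns a list of URLs into aria2 input-file text, naming outputs 0.png, 1.png, ...
function buildAria2Input(urls, extension = "png") {
  return urls
    // Each entry is the URL on its own line, then a tab-indented "out=" option line.
    .map((url, index) => `${url}\r\n\tout=${index}.${extension}\r\n`)
    .join("");
}

// Placeholder URLs for illustration only.
const sample = buildAria2Input([
  "https://example.com/page-a",
  "https://example.com/page-b",
]);

// Print with escapes visible so the line endings and tabs can be inspected.
console.log(JSON.stringify(sample));
```

Keeping the formatting in a pure function like this makes it easy to confirm the layout before pasting the result into a file.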

Step 3: Download the images.

  1. Paste the URL list into a text file named imgUrlData.txt.
  2. Copy the Cookie header that your browser sends when requesting page images (it can be found in the network panel of the developer tools).
  3. Run aria2c -j 16 -x 16 -i imgUrlData.txt --header="Cookie: <YOUR_COOKIES>", replacing <YOUR_COOKIES> with the copied value.