Download PDFs from Quran Explorer/Quran Archive

First, open the desired copy. I chose M. H. Shakir, The Holy Quran; Arabic Text and English Translation; Foot-notes by M. H. Shakir (1974). Right-click the first page image and select “Copy link address”. You should get something like this: https://quran.sfo2.digitaloceanspaces.com/m-h-shakir-1974/m-h-shakir-0001.jpeg
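The copied link can be split into the pieces the script needs. This is only a sketch: split_page_url is a hypothetical helper, not part of the script, and it assumes the link ends in a zero-padded page number followed by a file extension, as in the example link.

```python
import re


def split_page_url(url):
    """Split a copied page-image link into (base_url, digit_count, extension).

    Hypothetical helper; assumes the link ends in "<digits>.<extension>".
    """
    match = re.search(r"(\d+)(\.\w+)$", url)
    if match is None:
        raise ValueError(f"no trailing page number found in {url!r}")
    base_url = url[:match.start()]       # everything before the page number
    digit_count = len(match.group(1))    # zero-padding width, e.g. 4 for "0001"
    extension = match.group(2)           # e.g. ".jpeg"
    return base_url, digit_count, extension


link = "https://quran.sfo2.digitaloceanspaces.com/m-h-shakir-1974/m-h-shakir-0001.jpeg"
base_url, digit_count, extension = split_page_url(link)
print(base_url)
print(digit_count, extension)
```

The first returned value is what goes into base_url in the script.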

The idea is to enumerate the page images by number. The maximum number should be listed at https://quran-archive.org/explorer/m-h-shakir/1974: where the page says “Page 1 of 976”, 976 is the maximum number.
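As a small illustration of the enumeration, the page URLs can be generated like this. The function name page_urls and its parameters are made up for this sketch; the padding width of 4 and the .jpeg extension are taken from the example link and may differ for other copies.

```python
def page_urls(base_url, last_page, width=4, extension=".jpeg"):
    """Yield the image URL of every page from 1 to last_page (inclusive)."""
    for page in range(1, last_page + 1):
        # zero-pad the page number to the given width, e.g. 1 -> "0001"
        yield f"{base_url}{page:0{width}d}{extension}"


base = "https://quran.sfo2.digitaloceanspaces.com/m-h-shakir-1974/m-h-shakir-"
urls = list(page_urls(base, 976))
print(urls[0])
print(urls[-1])
```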

#!/usr/bin/env python3

import os
import requests
from PIL import Image

# specify the base URL of the images
base_url = "https://quran.sfo2.digitaloceanspaces.com/m-h-shakir-1974/m-h-shakir-"

# specify the range of image numbers
start_num = 1
end_num = 976

# specify the directory where the images will be saved
directory = "images"

# specify the name of the output PDF file
output_file = "output.pdf"


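# make sure the download directory exists
os.makedirs(directory, exist_ok=True)

# use a for loop to iterate through the range of image numbers
for i in range(start_num, end_num + 1):
    # create the URL of the image (page number zero-padded to 4 digits)
    image_url = base_url + str(i).zfill(4) + ".jpeg"

    # download the image; stop on HTTP errors so an error page is never saved as a .jpeg
    response = requests.get(image_url, timeout=30)
    response.raise_for_status()

    # save it to the specified directory under its original file name
    with open(os.path.join(directory, os.path.basename(image_url)), "wb") as f:
        f.write(response.content)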

# sort the list of image files by filename
image_files = sorted(os.listdir(directory))
image_paths = [os.path.join(directory, f) for f in image_files]

# load every page; append_images expects Image objects, not file paths
images = []
for path in image_paths:
    with Image.open(path) as im:
        # convert() returns an in-memory copy, so the file can be closed right away
        images.append(im.convert("RGB"))

# write all pages into a single PDF, one image per page
try:
    images[0].save(fp=output_file, format="PDF", save_all=True, append_images=images[1:])
    print("PDF created successfully")
except Exception as e:
    print(f"An error occurred while creating {output_file}: {e}")

# delete the images from the directory
for i in image_paths:
    os.remove(i)

Make sure a folder named ‘images’ exists in the same directory as the script. Change base_url and end_num to match your copy, then run the script and wait a while. You will find the result at output.pdf in the same directory.
