@yshalsager
Created May 15, 2020 22:41
Web scraper that fetches Quran ayah translations from http://corpus.quran.com/translation.jsp
#!/usr/bin/env python3
from requests import get
from bs4 import BeautifulSoup

chapter = input("Enter Sura number\n").strip()
url = f"http://corpus.quran.com/translation.jsp?chapter={chapter}"

# Load the first verse's page; the last <option> of #verseList is taken as the verse count.
page = BeautifulSoup(get(f'{url}&verse=1').content, "html.parser")
verses = int(page.select_one("#verseList > option:last-of-type")['value'])

# Write UTF-8 explicitly so the output does not depend on the platform's default encoding.
with open(f'{chapter}.txt', 'w', encoding='utf-8') as out:
    for verse in range(1, verses + 1):
        print(verse)  # progress indicator
        translations = BeautifulSoup(get(f'{url}&verse={verse}').content, "html.parser").select(".content > p")
        out.write(f'{verse}\n')
        # Skip the first two <p> elements of .content; only the rest are written out.
        for translation in translations[2:]:
            out.write(f"{translation.get_text()}\n")
        out.write('\n')
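
The script above mixes fetching and parsing, which makes it hard to check the selectors without hitting the site. A minimal sketch of the parsing step pulled into two functions is shown below. The function names (`verse_count`, `extract_translations`), the `skip` parameter, and the `SAMPLE` markup are assumptions made for illustration; the real page structure may differ from the sample, and only the two CSS selectors come from the script itself.

```python
from bs4 import BeautifulSoup


def verse_count(html: str) -> int:
    """Return the value of the last <option> inside the #verseList dropdown."""
    soup = BeautifulSoup(html, "html.parser")
    last_option = soup.select_one("#verseList > option:last-of-type")
    if last_option is None:
        # select_one returns None when nothing matches, e.g. for an invalid chapter.
        raise ValueError("no #verseList dropdown found; is the chapter number valid?")
    return int(last_option["value"])


def extract_translations(html: str, skip: int = 2) -> list[str]:
    """Return the text of each `.content > p`, skipping the first `skip` paragraphs."""
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = soup.select(".content > p")
    return [p.get_text().strip() for p in paragraphs[skip:]]


# Hypothetical, heavily simplified stand-in for a fetched page (not the real markup).
SAMPLE = """
<select id="verseList"><option value="1">1</option><option value="7">7</option></select>
<div class="content"><p>intro</p><p>arabic</p><p>Translation A</p><p>Translation B</p></div>
"""

print(verse_count(SAMPLE))
print(extract_translations(SAMPLE))
```

With this split, the network call can stay a one-liner that passes `get(...).text` to these functions, and a missing dropdown raises a readable `ValueError` instead of a `TypeError` on `None`.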