Created
April 24, 2022 17:42
Ruby script to download all NCERT book PDFs
# NCERT books are excellent, but they are being altered for political or other reasons.
# See: https://twitter.com/SouthAsiaIndex/status/1518062204058103809
# To download the entire current set, run this script with Ruby.
require 'httparty'
source = HTTParty.get('https://ncert.nic.in/textbook.php').force_encoding("ISO-8859-1").encode("utf-8", replace: nil)
# Book names look like aeen1dd.zip.
# The first letter gives the class: a to l are classes 1 to 12; m stands for classes 11 and 12 combined.
# The second letter is the language the book is written in: e for English, h for Hindi, u for Urdu.
# Collect the unique book ids (four letters plus a digit) linked from the index page.
bookids = source.scan(/textbook\.php\?[a-z]{4}\d/).uniq
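# Stream one book archive to a file of the same name in the current directory.
def download_book(book_name)
  puts "Downloading #{book_name}"
  # Open in binary mode so the zip bytes are written unchanged.
  File.open(book_name, "wb") do |file|
    HTTParty.get('https://ncert.nic.in/textbook/pdf/' + book_name, follow_redirects: true, stream_body: true) do |fragment|
      file.write(fragment)
    end
  end
rescue StandardError => e
  # Report the failure and move on to the next book instead of failing silently.
  warn "Failed to download #{book_name}: #{e.message}"
end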
bookids.each do |bid|
  book_name = bid.delete_prefix("textbook.php?") + "dd.zip"
  download_book(book_name)
  # Pause briefly between requests.
  sleep(0.2)
end
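
The naming scheme described in the script's comments can be turned into a small lookup, which may help when sorting the downloaded archives. The sketch below is not part of the original script: `describe_book` and `LANGUAGES` are hypothetical names, and the code assumes the first two letters of a book id always follow the scheme in the comments (class letter, then language letter). Anything outside that scheme is reported as "unknown".

```ruby
# Hypothetical helper, not part of the original script.
# Assumes the id layout described in the script's comments:
# first letter = class, second letter = language.
LANGUAGES = { "e" => "English", "h" => "Hindi", "u" => "Urdu" }.freeze

def describe_book(book_id)
  class_letter = book_id[0]
  language_letter = book_id[1]

  grade =
    if class_letter == "m"
      "11 and 12 combined"
    elsif ("a".."l").cover?(class_letter)
      # a is class 1, b is class 2, and so on up to l for class 12.
      (class_letter.ord - "a".ord + 1).to_s
    else
      "unknown"
    end

  { grade: grade, language: LANGUAGES.fetch(language_letter, "unknown") }
end

# Example: inspect the id taken from the sample name aeen1dd.zip.
p describe_book("aeen1")
```

Because the helper only reads the first two letters, it works equally on a bare id such as `aeen1` and on a full archive name such as `aeen1dd.zip`.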