Wikipedia scraping with Python
# Scrape the Wikipedia page named by the command-line arguments
import sys
import requests
import bs4
RED = '\033[31m'
END = '\033[0m'
ascii_art = RED \
+ """
iiii kkkkkkkk iiii
i::::i k::::::k i::::i
iiii k::::::k iiii
wwwwwww wwwww wwwwwwwiiiiiii k:::::k kkkkkkkiiiiiiippppp pppppppppyyyyyyy yyyyyyy
w:::::w w:::::w w:::::w i:::::i k:::::k k:::::k i:::::ip::::ppp:::::::::py:::::y y:::::y
w:::::w w:::::::w w:::::w i::::i k:::::k k:::::k i::::ip:::::::::::::::::py:::::y y:::::y
w:::::w w:::::::::w w:::::w i::::i k:::::k k:::::k i::::ipp::::::ppppp::::::py:::::y y:::::y
w:::::w w:::::w:::::w w:::::w i::::i k::::::k:::::k i::::i p:::::p p:::::p y:::::y y:::::y
w:::::w w:::::w w:::::w w:::::w i::::i k:::::::::::k i::::i p:::::p p:::::p y:::::y y:::::y
w:::::w:::::w w:::::w:::::w i::::i k:::::::::::k i::::i p:::::p p:::::p y:::::y:::::y
w:::::::::w w:::::::::w i::::i k::::::k:::::k i::::i p:::::p p::::::p y:::::::::y
w:::::::w w:::::::w i::::::ik::::::k k:::::k i::::::ip:::::ppppp:::::::p y:::::::y
w:::::w w:::::w i::::::ik::::::k k:::::k i::::::ip::::::::::::::::p y:::::y
w:::w w:::w i::::::ik::::::k k:::::k i::::::ip::::::::::::::pp y:::::y
www www iiiiiiiikkkkkkkk kkkkkkkiiiiiiiip::::::pppppppp y:::::y
p:::::p y:::::y
p:::::p y:::::y
p:::::::p y:::::y
p:::::::p y:::::y
p:::::::p yyyyyyy
[++] wikipy is simple wikipedia scraper [++]
Coded By: Ankit Dobhal
Let's Begin To Scrape..!
wikipy version 1.0
""" \
+ END
print(ascii_art)
res = requests.get('' + ' '.join(sys.argv[1:]))
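# Raise an exception if the request returned an error status
res.raise_for_status()
wiki = bs4.BeautifulSoup(res.text, "lxml")
# Collect every paragraph element on the page and print its text
elems = wiki.select('p')
for elem in elems:
    print(elem.getText())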
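
The script builds its request URL by joining the command-line words with spaces. A minimal sketch of a more careful URL builder follows. The base URL is missing from the script above, so `ASSUMED_BASE_URL` is an assumption (the usual English Wikipedia article prefix), and `build_article_url` is a hypothetical helper name, not part of the original code.

```python
from urllib.parse import quote

# Assumption: the base URL is elided in the script above; this is the usual
# English Wikipedia article prefix, used here only for illustration.
ASSUMED_BASE_URL = "https://en.wikipedia.org/wiki/"


def build_article_url(words, base_url=ASSUMED_BASE_URL):
    """Join command-line words into a percent-encoded article URL."""
    # Wikipedia article titles use underscores where the query has spaces.
    title = "_".join(words)
    # quote() percent-encodes characters that are not safe in a URL path.
    return base_url + quote(title)
```

With this helper, `requests.get(build_article_url(sys.argv[1:]))` could stand in for the string concatenation, and multi-word titles would no longer put raw spaces in the URL.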

heelrayner commented May 3, 2020

Could this be used on other wikis?


Owner Author

ankitdobhal commented May 3, 2020

> could this be used on other wiki?

It was designed only for Wikipedia, so I don't think it will work on other wikis as-is.
But you can check the CSS of that wiki and adjust this code accordingly.
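
As a rough illustration of that kind of change, here is a minimal sketch that takes the CSS selector as a parameter instead of hard-coding `'p'`. The helper name `extract_text`, the sample HTML, and the `div.article-body` container are all hypothetical; a real wiki's markup has to be inspected to find the right selector.

```python
import bs4


def extract_text(html, selector="p"):
    """Return the text of every element matching a CSS selector."""
    # "html.parser" ships with Python, so no extra parser package is needed.
    soup = bs4.BeautifulSoup(html, "html.parser")
    return [elem.get_text(strip=True) for elem in soup.select(selector)]


# Hypothetical page from another wiki whose article text sits in a
# differently named container than the sidebar.
sample = """
<div id="sidebar"><p>Navigation</p></div>
<div class="article-body"><p>First paragraph.</p><p>Second one.</p></div>
"""

print(extract_text(sample, "div.article-body p"))
```

Narrowing the selector to the article container keeps sidebar and navigation paragraphs out of the output, which a bare `'p'` selector would include.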



danhowe0 commented Jul 15, 2020

I get the error:
Traceback (most recent call last):
  File "/data/user/0/ru.iiec.pydroid3/files/accomp_files/iiec_run/", line 31, in <module>
  File "/data/user/0/ru.iiec.pydroid3/files/accomp_files/iiec_run/", line 30, in start
    exec(open(mainpyfile).read(), __main__.__dict__)
  File "<string>", line 48, in <module>
  File "/data/user/0/ru.iiec.pydroid3/files/aarch64-linux-android/lib/python3.8/site-packages/bs4/", line 242, in __init__
    raise FeatureNotFound(
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?
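
This error means the `lxml` package is not installed in that Python environment. Installing it (for example with `pip install lxml`) is one way around it. Another is to fall back to the parser bundled with Python, as in the minimal sketch below. The helper name `make_soup` is hypothetical and not part of the gist.

```python
import bs4


def make_soup(html):
    """Parse HTML with lxml when available, else the built-in parser."""
    try:
        return bs4.BeautifulSoup(html, "lxml")
    except bs4.FeatureNotFound:
        # lxml is not installed; "html.parser" is part of the standard library.
        return bs4.BeautifulSoup(html, "html.parser")


soup = make_soup("<p>Hello, wiki</p>")
print(soup.p.get_text())
```

In the script, `wiki = make_soup(res.text)` could replace the direct `bs4.BeautifulSoup(res.text, "lxml")` call. The two parsers can differ slightly on malformed HTML, so the output may not be identical in every case.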
