@wtype
Last active February 9, 2023 22:47
Generate a word cloud image from a URL
import matplotlib.pyplot as plt
from wordcloud import WordCloud
import requests
from bs4 import BeautifulSoup


def is_wanted_element(element):
    # Skip text whose direct parent is a tag that is not rendered on the page.
    if element.parent.name in ['style', 'script', 'head', 'title', 'meta', '[document]']:
        return False
    return True


def words_in_html(body):
    # Collect every text node, then keep only the visible ones.
    soup = BeautifulSoup(body, 'html.parser')
    texts = soup.find_all(string=True)
    visible_texts = filter(is_wanted_element, texts)
    return " ".join(t.strip() for t in visible_texts)


# enter a url to make a word cloud
# (input() is the Python 3 replacement for Python 2's raw_input())
url = input('Please enter the URL you want a word cloud of...\n For example: https://en.wikipedia.org/wiki/Lauterbrunnen\n\n URL: ')
page = requests.get(url)
html = page.content
words = words_in_html(html)
# print(words)

# Render the cloud and display it in a matplotlib window.
wordcloud = WordCloud(width=1000, height=700, max_font_size=150).generate(words)
plt.figure()
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
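
To get a rough idea of which words will dominate the cloud before rendering it, the visible-text filter can be tried on a small HTML string. The sketch below is not part of the gist: it swaps BeautifulSoup for the standard library's `html.parser`, and the names `VisibleTextParser`, `visible_words`, and `top_words` are made up for illustration. Unlike `is_wanted_element`, it skips everything nested inside a skipped tag rather than checking only the direct parent, and the `wordcloud` library applies its own tokenisation and stop-word list, so real counts will differ.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tags whose contents are not shown to the reader (assumed list, modelled on the script).
SKIPPED_TAGS = {"style", "script", "head", "title"}


class VisibleTextParser(HTMLParser):
    """Collect text that is not nested inside any of SKIPPED_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # number of skipped tags currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # HTML comments never reach this method, so they are dropped as well.
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_words(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def top_words(text, n=5):
    # Naive tokenisation: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


sample = """<html><head><title>Ignored</title><style>p {color: red}</style></head>
<body><p>Alps alps valley</p><script>var x = 1;</script><!-- note --><p>valley Alps</p></body></html>"""

text = visible_words(sample)
print(text)                # Alps alps valley valley Alps
print(top_words(text, 2))  # [('alps', 3), ('valley', 2)]
```
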

wtype commented Jan 12, 2023

Web Page to Word Cloud

Generate a word cloud image from a URL using Python.

There are lots of free word cloud generators online. I tried to use one recently and was met with a barrage of popups asking me to "OK" cookies and warnings that "We may sell data to advertisers". So here's a quick script to make your own word cloud images for free.

📃 → ☁️

Create a Word Cloud

git clone https://github.com/wtype/word-cloud-from-url.git

cd word-cloud-from-url

python cloud.py

Enter your URL and press Enter.
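
If you would rather write the image to a file than answer a prompt and view it in a window, something along these lines may work. It is a sketch, not part of `cloud.py`: `output_name_for` and `save_word_cloud` are hypothetical names, the timeout value is arbitrary, and the text extraction is a simplified stand-in (it drops only `script` and `style` tags) rather than the script's own filter.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_name_for(url):
    """Derive an image file name from the last path segment, else the host name."""
    parsed = urlparse(url)
    stem = PurePosixPath(parsed.path).name or parsed.netloc or "cloud"
    return stem + ".png"


def save_word_cloud(url, output=None):
    # Third-party imports live inside the function so the helper above
    # can be used without requests, bs4, or wordcloud installed.
    import requests
    from bs4 import BeautifulSoup
    from wordcloud import WordCloud

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop early on 4xx/5xx responses

    soup = BeautifulSoup(response.content, "html.parser")
    for tag in soup(["script", "style"]):
        tag.decompose()  # remove non-visible code and CSS from the tree
    text = soup.get_text(separator=" ", strip=True)

    output = output or output_name_for(url)
    cloud = WordCloud(width=1000, height=700, max_font_size=150).generate(text)
    cloud.to_file(output)  # writes the image without opening a window
    return output
```

Calling `save_word_cloud("https://en.wikipedia.org/wiki/Lauterbrunnen")` would then fetch the page and write `Lauterbrunnen.png` to the current directory.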
