@premrajnarkhede
Created January 22, 2020 07:00
from bs4 import BeautifulSoup


def extract_meta(data):
    """
    Take raw HTML as input and return a dict with the page's
    title, description, and keywords (whichever are present).
    """
    soup = BeautifulSoup(data, 'html.parser')
    result = {}
    # soup.title is None when the page has no <title> tag
    if soup.title and soup.title.string:
        result["title"] = soup.title.string.strip()
    for tag in soup.find_all('meta'):
        name = tag.get('name')
        content = tag.get('content')
        # Skip unrelated meta tags and tags without a content attribute
        if name in ["title", 'description', 'keywords'] and content:
            # Join repeated tags with a space and trim the result
            result[name] = (result.get(name, "") + " " + content).strip()
    return result
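
For comparison, here is a rough sketch of the same extraction using only the standard library's html.parser instead of BeautifulSoup. It is not part of the original gist: the names MetaCollector, extract_meta_stdlib, and WANTED are hypothetical, and the sample HTML is made up for illustration. It assumes the same output shape as extract_meta (a dict whose repeated values are joined with a space).

```python
from html.parser import HTMLParser

# Meta names to collect (same three as extract_meta); hypothetical constant
WANTED = ("title", "description", "keywords")


class MetaCollector(HTMLParser):
    """Hypothetical helper: gathers <title> text and selected <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title_parts = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            name = attrs.get("name")
            content = attrs.get("content")
            # Skip unrelated meta tags and tags without content
            if name in WANTED and content:
                previous = self.meta.get(name, "")
                self.meta[name] = (previous + " " + content).strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text is only collected while inside <title>...</title>
        if self._in_title:
            self.title_parts.append(data)


def extract_meta_stdlib(html):
    """Return title, description, and keywords found in raw HTML."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    result = dict(parser.meta)
    page_title = "".join(parser.title_parts).strip()
    if page_title:
        # Put the <title> text first, then any <meta name="title"> content
        result["title"] = (page_title + " " + result.get("title", "")).strip()
    return result


# Made-up sample page for illustration
sample_html = (
    "<html><head><title>Example Page</title>"
    '<meta name="description" content="A short demo.">'
    '<meta name="keywords" content="a, b">'
    '<meta name="keywords" content="c">'
    "</head><body></body></html>"
)
info = extract_meta_stdlib(sample_html)
print(info)
```

The BeautifulSoup version above is usually more forgiving of malformed markup. This variant mainly shows that the core logic is a filter over meta tags plus the title text.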