@CodersArts
Created August 23, 2019 07:10
Web Scraping Host way to Scrape content
soup.title
# <title>Returns title tags and the content between the tags</title>
soup.title.string
# u'Returns the content inside a title tag as a string'
soup.p
# <p class="title"><b>This returns everything inside the paragraph tag</b></p>
soup.p['class']
# ['title'] (class is a multi-valued attribute, so Beautiful Soup 4 returns a list of class names)
soup.a
# <a class="link" href="http://example.com/example" id="link1">This would return the first matching anchor tag</a>
# Or, we could use find_all to return all the matching anchor tags
soup.find_all('a')
# [<a class="link" href="http://example.com/example1" id="link1">link1</a>,
# <a class="link" href="http://example.com/example2" id="link2">link2</a>,
# <a class="link" href="http://example.com/example3" id="link3">link3</a>]
soup.find(id="link3")
# <a class="link" href="http://example.com/example3" id="link3">This returns just the matching element by ID</a>
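
The calls above assume a `soup` object already exists. A minimal, self-contained sketch of how one might build it and run the same lookups follows. The sample HTML, the `SAMPLE_HTML` name, and the choice of the built-in `html.parser` backend are assumptions made for illustration, not part of the original gist.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, invented for this sketch.
SAMPLE_HTML = """
<html><head><title>Example page</title></head>
<body>
<p class="title"><b>Example heading</b></p>
<a class="link" href="http://example.com/example1" id="link1">Link1</a>
<a class="link" href="http://example.com/example2" id="link2">Link2</a>
<a class="link" href="http://example.com/example3" id="link3">Link3</a>
</body></html>
"""

# Parse with the standard-library backend so no extra parser is required.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

title_text = soup.title.string        # text inside the <title> tag
p_classes = soup.p["class"]           # list of class names on the first <p>
first_link = soup.a                   # first <a> tag in document order
all_links = soup.find_all("a")        # every <a> tag, as a list-like ResultSet
link_by_id = soup.find(id="link3")    # the single element whose id is "link3"

# Pull one attribute and the visible text out of each matched tag.
hrefs = [a["href"] for a in all_links]
link_texts = [a.get_text() for a in all_links]

print(title_text)
print(p_classes)
print(first_link["id"])
print(hrefs)
print(link_texts)
print(link_by_id.get_text())
```

Attribute access such as `soup.a` only ever returns the first match, so `find_all` is the usual choice when every matching tag is needed.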