@ramiroluz
Created November 12, 2013 17:47
# Save the HTML tree of the pythonbrasil videos page to the file /tmp/dom.html
# http://www.youtube.com/user/pythonbrazil/videos?view=0&sort=dd&live_view=500&flow=list
# BeautifulSoup version 3.2.1; the method for searching by class changed in version 4.
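from BeautifulSoup import BeautifulSoup as BS
# Read the saved page; open() replaces the Python 2-only file() builtin and the file is closed afterwards
with open('/tmp/dom.html') as f:
    s = f.read()
soup = BS(s)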
# Select the <li> entries for the videos by their class attribute
items = soup.findAll('li', { "class" : "yt-lockup clearfix channels-browse-content-list-item yt-lockup-video yt-lockup-tile vve-check context-data-item" })
# Collect each video's title from its data-context-item-title attribute
l = [x['data-context-item-title'] for x in items]
# This video is not from the event; look up its position in the list
l.index(u'Introdu\xe7\xe3o ao CMS Plone')
# Keep only the first 134 titles
l = l[:134]
print('\n'.join(l))
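
The script above depends on BeautifulSoup 3.2.1 and Python 2. As a rough alternative, the same extraction can be sketched with only the Python 3 standard library. This is not the author's method: `TitleCollector`, `extract_titles`, and the `sample` markup are hypothetical names and data made up for illustration, and the sketch matches a single class token (`yt-lockup-video`) instead of the full class string used above.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect data-context-item-title from <li> tags that carry a given class."""

    def __init__(self, wanted_class="yt-lockup-video"):
        super().__init__()
        self.wanted_class = wanted_class
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "li":
            return
        attr_map = dict(attrs)
        # The class attribute is a space-separated list of tokens
        classes = (attr_map.get("class") or "").split()
        title = attr_map.get("data-context-item-title")
        if self.wanted_class in classes and title is not None:
            self.titles.append(title)


def extract_titles(html, wanted_class="yt-lockup-video"):
    """Return the titles of all matching <li> elements, in document order."""
    parser = TitleCollector(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.titles


# Made-up markup that mimics the structure of the saved page (assumption)
sample = (
    '<ul>'
    '<li class="yt-lockup clearfix yt-lockup-video" data-context-item-title="Talk A"></li>'
    '<li class="other" data-context-item-title="Ignored"></li>'
    '<li class="yt-lockup-video yt-lockup-tile" data-context-item-title="Introdução ao CMS Plone"></li>'
    '</ul>'
)
print('\n'.join(extract_titles(sample)))
```

To run it against the saved page, read `/tmp/dom.html` with `open(..., encoding='utf-8')` and pass the text to `extract_titles`; the encoding is an assumption about how the file was saved.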