@danneu
Created March 11, 2013 02:51
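# Python (requests + lxml; cssselect() also needs the `cssselect` package)
import requests
from lxml import html

res = requests.get('http://www.example.com')
doc = html.fromstring(res.text)
# cssselect() returns a list of all matching elements (empty if none match)
el = doc.cssselect('#id.class')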
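# Ruby (open-uri + Nokogiri)
require "open-uri"
require "nokogiri" # 3rd party html parsing lib

# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
html = URI.open("http://www.example.com")
doc = Nokogiri::HTML(html)
# at() returns the first matching node, or nil if nothing matches
el = doc.at("#id.class")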
;; Clojure (Enlive)
(ns crawler.core
  (:require [net.cgrand.enlive-html :as enlive])
  (:import [java.net URL]))

(def nodes
  (enlive/html-resource
    (URL. "http://www.example.com")))

;; select returns a sequence of all matching nodes
(def el
  (enlive/select nodes
                 [:#id.class]))