@chimerast
Created April 16, 2011 18:01
Accessing HTML via XPath using Jericho + Jaxen
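package st.chimera.scraper

import net.htmlparser.jericho._
import st.chimera.scraper.HtmlScraper._

object Main {
  def main(args: Array[String]): Unit = {
    // Fetch and parse the page; HtmlScraper itself is not shown in this file.
    val doc = HtmlScraper("http://www.scala-lang.org/")
    // Select the <h2> children of <div class="node"> elements.
    doc.eval("div[@class='node']/h2").foreach {
      case node: Segment =>
        // Print the heading text, then the href of each link inside it.
        node.eval("text()").foreach(println)
        node.eval("a/@href").foreach {
          case attr: Attribute => println(" " + attr.getValue)
          case _ => // skip results that are not attributes
        }
      case _ => // skip results that are not Jericho segments
    }
  }
}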
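
The three XPath expressions above (`div[@class='node']/h2`, `text()`, `a/@href`) can be tried without Jericho or Jaxen. The following is only a rough sketch under stated assumptions: it uses the JDK's built-in `javax.xml.xpath` instead of this gist's `HtmlScraper`, it runs against a small hypothetical well-formed snippet rather than the real page, and the names `XPathSketch`, `sample`, `eval` and `headings` are made up for illustration. Because the context here is the document root, the first expression is written with a leading `//`.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.{XPathConstants, XPathFactory}
import org.w3c.dom.{Node, NodeList}
import org.xml.sax.InputSource

object XPathSketch {
  // Hypothetical, well-formed stand-in for a fetched page (not real site content).
  val sample: String =
    """<body><div class="node"><h2>News: <a href="/node/1">Scala released</a></h2></div></body>"""

  private val xpath = XPathFactory.newInstance().newXPath()

  // Evaluate an XPath expression relative to a context node and return the matches.
  def eval(expr: String, context: Node): Seq[Node] = {
    val nodes = xpath.evaluate(expr, context, XPathConstants.NODESET).asInstanceOf[NodeList]
    (0 until nodes.getLength).map(i => nodes.item(i))
  }

  // For each matching <h2>, collect its direct text and the href values of its links.
  def headings(xml: String): Seq[(String, Seq[String])] = {
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new InputSource(new StringReader(xml)))
    eval("//div[@class='node']/h2", doc).map { h2 =>
      val text = eval("text()", h2).map(_.getNodeValue).mkString.trim
      val hrefs = eval("a/@href", h2).map(_.getNodeValue)
      (text, hrefs)
    }
  }

  def main(args: Array[String]): Unit =
    headings(sample).foreach { case (text, hrefs) =>
      println(text)
      hrefs.foreach(href => println(" " + href))
    }
}
```

The JDK parser only accepts well-formed XML, so this sketch would fail on most real-world HTML; tolerating such markup is presumably the reason the gist goes through Jericho in the first place.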