Mechanize with Hpricot on AppEngine

Here is a Rack app that does a Google search using Mechanize.

We are using Mechanize 0.8.5 and Hpricot 0.8.2. The most recent version of Mechanize uses Nokogiri, which requires native libraries and therefore does not work on AppEngine. There is an effort to finish the pure-Java Nokogiri; maybe YOU can help.

Special thanks to _Why, Ola Bini and Nick Sieger for creating, porting and maintaining Hpricot. Mechanize is my favorite gem of all time, so thanks to Aaron Patterson and Mike Dalessio for creating such an awesome tool. Here is a nice screencast.

When using gems with Java extensions, appengine-tools drops the appropriate jars into WEB-INF/lib for you. To list the jars that ship with the bundled gems:

$ find .gems -name "*.jar"
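
The same check can be done from Ruby, for example in an IRB session. This is only a minimal sketch: `jar_files` is a hypothetical helper, not part of appengine-tools, and it assumes nothing beyond the `.gems` directory used by the `find` command above.

```ruby
# Hypothetical helper (not part of appengine-tools): list every .jar file
# found anywhere under a directory, similar to `find .gems -name "*.jar"`.
def jar_files(root = '.gems')
  Dir.glob(File.join(root, '**', '*.jar')).sort
end

# Prints nothing if the directory does not exist or holds no jars.
jar_files.each { |path| puts path }
```

Passing a different directory, such as `jar_files('WEB-INF/lib')`, should show what ended up in the packaged app.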
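
The Rack app (config.ru):

require 'appengine-rack'
require 'appengine-apis/urlfetch'
require 'mechanize'

AppEngine::Rack.configure_app(
  :application => "mechanize-hpricot",
  :precompilation_enabled => true,
  :version => "1")

# Submit the search form and collect the text of every link that mentions the query.
def my_search(query)
  out = []
  a = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Mac Safari' }
  a.get('') do |page|
    search_result = page.form_with(:name => 'f') do |search|
      search.q = query
    end.submit
    search_result.links.each do |link|
      out << link.text if link.text =~ /#{query}/i
    end
  end
  out
end

# Build the HTML page for a query.
def my_html(query)
  title = "Mechanize with Hpricot on AppEngine"
  # Assumption: the original results markup did not survive; a plain list is used here.
  results = my_search(query).map { |text| "<li>#{text}</li>" }.join
  html = <<HTML
<h2>#{title}</h2><p>This is a Google Search for: #{query}</p>
<ul>#{results}</ul>
HTML
end

run lambda { |env| [200, {}, my_html('appengine-jruby') ] }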
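
One detail in `my_search` is worth noting: the query is interpolated into the regular expression as-is, so a query containing regex metacharacters (a `.` or a `+`, say) is read as a pattern rather than as literal text. Below is a minimal sketch of the same filter with the query escaped, kept separate from Mechanize so it can be tried on plain strings. `matching_texts` is a hypothetical helper name, not something from the original app.

```ruby
# Hypothetical helper: keep only the texts that contain the query,
# case-insensitively, treating the query as literal text.
def matching_texts(texts, query)
  pattern = /#{Regexp.escape(query)}/i
  texts.select { |text| text =~ pattern }
end

puts matching_texts(['AppEngine-JRuby home', 'Something else'], 'appengine-jruby')
```

In the app, the same idea would amount to building the pattern once before the `links.each` loop instead of interpolating the raw query on every iteration.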

The Gemfile:

# Critical default settings:
bundle_path ".gems/bundler_gems"

# List gems to bundle here:
gem "appengine-rack"
gem "appengine-apis"
gem "hpricot", "0.8.2"
gem "mechanize", "0.8.5"

Starting the development server bundles the gems and packages the jars:

$ dev_appserver.rb .
=> Booting DevAppServer
=> Press Ctrl-C to shutdown server
=> Bundling gems
Calculating dependencies...
Updating source:
Caching: appengine-apis-0.0.12.gem
Caching: appengine-rack-0.0.6.gem
Downloading hpricot-0.8.2-java.gem
Downloading mechanize-0.8.5.gem
Downloading rack-1.1.0.gem
Installing hpricot (0.8.2)
Installing rack (1.1.0)
Installing appengine-rack (0.0.6)
Installing appengine-apis (0.0.12)
Installing mechanize (0.8.5)
=> Packaging gems
Installing fast_xs.jar
Installing hpricot_scan.jar
The server is running at http://localhost:8080/
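
The shape of a Rack response can also be checked without starting the server, since a Rack app is just an object that responds to `call`. The sketch below is not the app above: `HELLO_APP` is a stand-in lambda with an assumed placeholder body. It also sets a `Content-Type` header and wraps the body in an array, which newer Ruby versions need because `String` no longer responds to `each` there.

```ruby
# Stand-in Rack app (assumption: a placeholder, not the original config.ru).
HELLO_APP = lambda do |env|
  body = "<h2>Mechanize with Hpricot on AppEngine</h2>"
  # A Rack response is [status, headers, body]; the body must respond to #each.
  [200, { 'Content-Type' => 'text/html' }, [body]]
end

# Call the app directly with an empty environment and inspect the triplet.
status, headers, body = HELLO_APP.call({})
puts status
puts headers['Content-Type']
body.each { |chunk| puts chunk }
```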

yjx723 commented May 25, 2010

"no such file to load -- appengine-apis/urlfetch from "

