@mickaelandrieu
Last active August 29, 2015 13:56
A very minimal benchmark: casperjs vs beautifulsoup4
python version:
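from bs4 import BeautifulSoup
import requests

# Fetch the Google results page (up to 100 results per page)
r = requests.get("http://www.google.fr/search?num=100&q=scrapping")
data = r.text
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(data, "html.parser")
# Print the href of every anchor on the page
for link in soup.find_all('a'):
    print(link.get('href'))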
casperjs version:
var casper = require('casper').create();
// Declared here so it is not an implicit global
var links = [];

// Runs in the page context: collect the href of each result title link
function getLinks() {
    var links = document.querySelectorAll('h3.r a');
    return Array.prototype.map.call(links, function(e) {
        return e.getAttribute('href');
    });
}

casper.start('http://google.fr/search?num=100&query=scrapping', function() {
});

casper.then(function() {
    links = this.evaluate(getLinks);
});

casper.run(function() {
    this.echo(' - ' + links.join('\n - ')).exit();
});
For a round of 100 calls:
time -o {command}
python version: 1.082
casperjs version: 1.107
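The gist does not say how the 100 calls were aggregated, so the following is only a sketch of one way to repeat such a round: it runs a command several times, records the wall-clock duration of each run, and reports the mean, minimum and maximum. The script names `links.py` and `links.js` are hypothetical placeholders for the two versions above, and `time_command` and `summarize` are illustrative helpers, not part of the original benchmark.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=100):
    """Run cmd `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the command's output so printing does not dominate the timing
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to run count, mean, min and max."""
    return {
        "runs": len(durations),
        "mean": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Hypothetical file names: save the two versions above under these names
    commands = [
        ("python version", [sys.executable, "links.py"]),
        ("casperjs version", ["casperjs", "links.js"]),
    ]
    for label, cmd in commands:
        stats = summarize(time_command(cmd, runs=100))
        print("%s: mean %.3fs (min %.3fs, max %.3fs)"
              % (label, stats["mean"], stats["min"], stats["max"]))
```

Reporting the minimum and maximum next to the mean helps show how much of the difference between the two versions is network noise, since every run fetches a live page.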