Last active
August 15, 2018 12:43
Using and configuring the Google Ajax Crawler to help search engines index Ajax-rich pages (client-side MVC and the Google Ajax Crawling Scheme)
#
# to run:
# $ rackup config.ru -p 3000
# open browser to http://localhost:3000/#!test
#
require 'bundler/setup'
require './lib/google_ajax_crawler'

use GoogleAjaxCrawler::Crawler do |config|
  config.driver = GoogleAjaxCrawler::Drivers::CapybaraWebkit
  config.poll_interval = 0.25 # how often to check if the page has loaded

  #
  # for the demo - the page is considered loaded when the loading mask has been removed from the DOM
  # this could evaluate something like $.active == 0 to ensure no jquery ajax calls are pending
  #
  config.page_loaded_test = lambda { |driver| driver.page.evaluate_script('document.getElementById("loading") == null') }
end

# a sample page using #! url fragments to seed page state
page_content = File.read('./page.html')
run lambda { |env| [200, { 'Content-Type' => 'text/html' }, [page_content]] }
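The `config.poll_interval` and `config.page_loaded_test` settings above suggest a simple polling loop: run the test, and if the page is not ready yet, sleep for the interval and try again. The following is a minimal sketch of that idea only, not the gem's actual implementation. The helper name `wait_until_loaded` and its `timeout:` parameter are assumptions made for illustration.

```ruby
# Hypothetical helper, NOT part of the google_ajax_crawler gem.
# Calls the given block every `poll_interval` seconds until it returns
# a truthy value, or gives up once `timeout` seconds have elapsed.
def wait_until_loaded(poll_interval:, timeout:, &loaded)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if loaded.call
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep poll_interval
  end
end

# Stand-in for a real page test: reports "loaded" on the third check.
attempts = 0
ready = wait_until_loaded(poll_interval: 0.01, timeout: 1) { (attempts += 1) >= 3 }
puts "loaded=#{ready} after #{attempts} checks"
```

In a real setup the block would wrap something like the `page_loaded_test` lambda, which asks the driver whether the loading mask is gone. A timeout is worth having in any such loop so that a page which never finishes loading cannot stall the request indefinitely.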
<!--
  To view this page's markup as a search engine would see it without the google_ajax_crawler gem, open in a browser and
  view source...
  To see how the google_ajax_crawler gem delivers a rendered snapshot of the page, open /?_escaped_fragment_=test
-->
<html>
  <head></head>
  <body>
    <h1>A Simple State Test</h1>
    <!-- the url fragment (i.e. /#!something) will be rendered via JS in the span-->
    <p>State: <span id='page_state'></span></p>
    <!-- will be removed by js on page load -->
    <div class='loading' id='loading'>Loading....</div>
    <script type='text/javascript'>
      var init = function() {
        var writeHash = function() {
          document.getElementById('page_state').innerHTML = "Javascript rendering complete for client-side route " + document.location.hash;
          var loadingMask = document.getElementById('loading');
          if(loadingMask) loadingMask.parentNode.removeChild(loadingMask);
          console.log('done...');
        };
        window.addEventListener("hashchange", writeHash, false);
        setTimeout(writeHash, 500);
      };
      //
      // Only execute js if loading the page using an unescaped url
      //
      if(/#.*$/.test(document.location.href)) init();
    </script>
  </body>
</html>
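The comment at the top of page.html relies on the Ajax Crawling Scheme's URL convention: a crawler that sees `/#!test` requests `/?_escaped_fragment_=test` instead, and the server answers with a rendered snapshot of the `#!` state. The sketch below shows one way to map the crawler's URL back to the `#!` URL a browser would load. The helper name `hashbang_url` is hypothetical and is not taken from the gem, and the sketch only handles simple fragment values that need no extra escaping.

```ruby
require 'uri'

# Hypothetical helper, NOT the gem's API: turn a crawler request URL such as
#   http://localhost:3000/?_escaped_fragment_=test
# back into the #! URL that a browser would load.
def hashbang_url(crawler_url)
  uri = URI.parse(crawler_url)
  params = URI.decode_www_form(uri.query.to_s)
  fragment = params.assoc('_escaped_fragment_')
  return crawler_url unless fragment # not a crawler request, leave untouched

  # Keep any other query parameters and drop only the crawler's one.
  rest = params.reject { |key, _| key == '_escaped_fragment_' }
  uri.query = rest.empty? ? nil : URI.encode_www_form(rest)
  uri.to_s + '#!' + fragment.last
end

puts hashbang_url('http://localhost:3000/?_escaped_fragment_=test')
```

For the demo URL this should print `http://localhost:3000/#!test`, which is the same address the run instructions in config.ru tell you to open in a browser.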