@benkitzelman
Last active August 15, 2018 12:43
Using and configuring the Google Ajax Crawler to help search engines index Ajax-rich pages (client-side MVC and the Google Ajax Crawling Scheme)
#
# to run:
# $ rackup config.ru -p 3000
# open browser to http://localhost:3000/#!test
#
require 'bundler/setup'
require './lib/google_ajax_crawler'
use GoogleAjaxCrawler::Crawler do |config|
  config.driver = GoogleAjaxCrawler::Drivers::CapybaraWebkit
  config.poll_interval = 0.25 # how often to check if the page has loaded

  #
  # for the demo - the page is considered loaded when the loading mask has been removed from the DOM
  # this could evaluate something like $.active == 0 to ensure no jquery ajax calls are pending
  #
  config.page_loaded_test = lambda { |driver| driver.page.evaluate_script('document.getElementById("loading") == null') }
end
# a sample page using #! url fragments to seed page state
page_content = File.read('./page.html')
run lambda {|env| [200, { 'Content-Type' => 'text/html' }, [page_content]] }
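The `poll_interval` and `page_loaded_test` settings above describe a poll-until-ready loop. The sketch below only illustrates that idea and is not the gem's actual implementation: the helper name `wait_until_loaded` and its `timeout` parameter are hypothetical, and the block stands in for the `page_loaded_test` lambda.

```ruby
# Hypothetical helper, NOT part of the google_ajax_crawler gem.
# Calls the given block every `poll_interval` seconds until it returns a
# truthy value (page loaded) or `timeout` seconds have elapsed.
def wait_until_loaded(poll_interval: 0.25, timeout: 5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep poll_interval
  end
end

# Simulated page that reports "loaded" on the third check.
checks = 0
loaded = wait_until_loaded(poll_interval: 0.01, timeout: 1) { (checks += 1) >= 3 }

# Simulated page that never finishes loading.
timed_out = wait_until_loaded(poll_interval: 0.01, timeout: 0.05) { false }
```

A shorter interval returns the snapshot sooner but evaluates the test script more often; in the demo, 0.25 seconds is comfortably below the 500 ms delay before the loading mask is removed.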
<!--
To view this page's markup as a search engine would see it without the google_ajax_crawler gem, open in a browser and
view source...
To see how the google_ajax_crawler gem delivers a rendered snapshot of the page, open /?_escaped_fragment_=test
-->
<html>
  <head></head>
  <body>
    <h1>A Simple State Test</h1>
    <!-- the url fragment (i.e. /#!something) will be rendered via JS in the span-->
    <p>State: <span id='page_state'></span></p>
    <!-- will be removed by js on page load -->
    <div class='loading' id='loading'>Loading....</div>
    <script type='text/javascript'>
      var init = function() {
        var writeHash = function() {
          document.getElementById('page_state').innerHTML = "Javascript rendering complete for client-side route " + document.location.hash;
          var loadingMask = document.getElementById('loading');
          if(loadingMask) loadingMask.parentNode.removeChild(loadingMask);
          console.log('done...');
        };
        window.addEventListener("hashchange", writeHash, false);
        setTimeout(writeHash, 500);
      };
      //
      // Only execute js if loading the page using an unescaped url
      //
      if(/#.*$/.test(document.location.href)) init();
    </script>
  </body>
</html>
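The page comment above points at /?_escaped_fragment_=test as the crawler-facing form of /#!test. Under the Google Ajax Crawling Scheme, a crawler requests the `_escaped_fragment_` form and the server maps it back to the `#!` URL before rendering a snapshot. A minimal sketch of that mapping using only Ruby's standard library follows; the method name `to_pretty_url` is hypothetical and this is not how the gem itself is written.

```ruby
require 'uri'

ESCAPED_FRAGMENT_KEY = '_escaped_fragment_'

# Hypothetical helper, NOT part of the google_ajax_crawler gem.
# Converts a crawler URL (?_escaped_fragment_=state) back to the #! URL a
# browser would load. URLs without the parameter are returned unchanged.
def to_pretty_url(crawler_url)
  uri = URI.parse(crawler_url)
  params = URI.decode_www_form(uri.query.to_s)
  fragment = params.assoc(ESCAPED_FRAGMENT_KEY)
  return crawler_url if fragment.nil?

  # Keep any other query parameters, dropping only the escaped fragment.
  remaining = params.reject { |key, _| key == ESCAPED_FRAGMENT_KEY }
  uri.query = remaining.empty? ? nil : URI.encode_www_form(remaining)

  # Append the decoded state as a hash-bang fragment.
  "#{uri}#!#{fragment[1]}"
end

demo_url = to_pretty_url('http://localhost:3000/?_escaped_fragment_=test')
with_query = to_pretty_url('http://localhost:3000/?page=2&_escaped_fragment_=test')
untouched = to_pretty_url('http://localhost:3000/?page=2')
```

For the demo, the first call yields the same URL given in the run instructions for config.ru.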