@barryvdh
Created June 9, 2015 19:20
Laravel docs scraper
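<?php
// Requires Goutte: composer require fabpot/goutte
Route::get('docs', function () {
    $host = 'http://laravel.com';
    $base = '/docs/5.1/';

    $client = new Goutte\Client();
    $crawler = $client->request('GET', $host.$base);

    // Reuse the site's <head>, pointing asset URLs back at the original host.
    $head = $crawler->filter('head')->first()->html();
    $head = str_replace('/build/assets/', $host.'/build/assets/', $head);

    // The sidebar navigation doubles as the table of contents.
    $nav = $crawler->filter('.slide-docs-nav')->first();
    $body = '<article>'.$nav->html().'</article>';

    // Fetch every page linked from the navigation and append its <article>.
    foreach ($nav->filter('a')->links() as $link) {
        $article = $client->click($link)->filter('article')->first();
        $id = basename($link->getUri());
        // Quote the id so the attribute stays valid whatever the page name is.
        $body .= '<article id="'.$id.'">'.$article->html().'</article>';
    }

    // Turn links to other docs pages into in-page anchors.
    $body = str_replace('href="'.$base, 'href="#', $body);
    // Collapse doubled anchors, e.g. "#collections#foo" becomes "#foo".
    $body = str_replace('#collections#', '#', $body);

    // Heredoc body and closing marker stay flush left (required before PHP 7.3).
    $output = <<<EOT
<html>
<head>
$head
<style>
@media print {
    article {
        page-break-after: always;
    }
    p, pre {
        page-break-inside: avoid;
    }
}
</style>
</head>
<body>
<div class="container">
$body
</div>
</body>
</html>
EOT;

    return $output;
});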
harikt commented Jun 10, 2015

Nice :) .