@speedyGonzales
Forked from kui/scrape.dart
Created July 3, 2018 07:53
a web scraping script with Dart and html5lib
import 'dart:io';
import 'dart:async';

import 'package:html5lib/parser.dart';
import 'package:html5lib/dom.dart';

main() {
  final url = 'http://comic-walker.com/';
  getHtml(url).then((document) {
    // page title
    print(document.querySelector('title').text);

    // Newer comics
    document.querySelectorAll('#bookList > li').forEach((e) {
      print(e.querySelector('.list_bookName').text);
    });
  });
}

/// fetch and parse the HTML from [url]
Future<Document> getHtml(String url) =>
    new HttpClient()
        .getUrl(Uri.parse(url))
        .then((req) => req.close())
        .then((res) => res
            .asyncExpand((bytes) => new Stream.fromIterable(bytes))
            .toList())
        .then((bytes) => parse(bytes, sourceUrl: url));