@jarsen
Created March 12, 2010 19:47
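void HTML::parse_links() {
    // Matches href values in double quotes, single quotes, or unquoted,
    // allowing whitespace (including newlines) around '='. Each quoting style
    // has its own alternative so the capture stops at the closing quote
    // instead of running on to the next '>' (e.g. past a following attribute).
    static const boost::regex hrefs(
        "href\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        boost::regex_constants::icase);

    std::string::const_iterator start = src.begin();
    const std::string::const_iterator end = src.end();
    boost::match_results<std::string::const_iterator> what;

    while (boost::regex_search(start, end, what, hrefs)) {
        // Exactly one of the three capture groups participates in a match.
        const std::string link = what[1].matched ? what[1].str()
                               : what[2].matched ? what[2].str()
                                                 : what[3].str();
        std::cout << "link: " << link << std::endl;
        links.add(URL(link));
        start = what[0].second;  // resume the search just past this match
    }
}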
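
A minimal standalone sketch for trying href extraction outside the `HTML` class. It assumes `std::regex` in place of Boost.Regex, and the helper name `extract_hrefs` is hypothetical (it is not part of the gist). The pattern uses one alternative per quoting style so that the captured value stops at the closing quote, which keeps any later attributes out of the match.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not in the original gist): returns every href value
// found in `html`, in order of appearance.
std::vector<std::string> extract_hrefs(const std::string& html) {
    // Double-quoted, single-quoted, or unquoted value; whitespace is allowed
    // around '='. Assumption: std::regex ECMAScript grammar, case-insensitive.
    static const std::regex hrefs(
        R"re(href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+)))re",
        std::regex::ECMAScript | std::regex::icase);

    std::vector<std::string> out;
    for (std::sregex_iterator it(html.begin(), html.end(), hrefs), last;
         it != last; ++it) {
        const std::smatch& m = *it;
        // Only one of groups 1..3 participates in any given match.
        out.push_back(m[1].matched ? m[1].str()
                    : m[2].matched ? m[2].str()
                                   : m[3].str());
    }
    return out;
}
```

For example, `extract_hrefs("<a href='x.html' id=y>")` yields a single entry, `x.html`.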