@takageymt
Created June 12, 2018 16:49
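#include <map>
#include <string>
#include <vector>

// Page, get_words_in and filter_with_regex are not defined in this snippet.

// Builds one Page per URL and records the links between pages of the given set.
std::vector<Page> get_pages(const std::vector<std::string>& urls) {
    std::vector<Page> pages;
    std::map<std::string, int> url_conv;  // URL -> index into pages
    for (int i = 0; i < static_cast<int>(urls.size()); ++i) {
        pages.emplace_back(i, urls[i]);
        url_conv[urls[i]] = i;
    }
    for (int i = 0; i < static_cast<int>(pages.size()); ++i) {
        // Directory part of the page URL, including the trailing '/'.
        const std::string dirpath = [](const std::string& url) {
            auto pos = url.find_last_of('/');
            if (pos == std::string::npos) return std::string("./");
            return url.substr(0, pos + 1);
        }(pages[i].url());
        // Candidate links: words returned for this page that end in ".html".
        std::vector<std::string> links = filter_with_regex(get_words_in(pages[i].url()), ".*\\.html$");
        for (const std::string& to_url : links) {
            // Keep only links that resolve to a page in the input set.
            auto it = url_conv.find(dirpath + to_url);
            if (it != url_conv.end()) pages[i].add_link(it->second);
        }
    }
    return pages;
}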
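
The link-resolution step in the loop above can be tried in isolation. The following is a minimal sketch, not part of the original gist: `dir_prefix`, `resolve_links`, and `example` are hypothetical names introduced here, and the list of relative links is passed in directly instead of being produced by `get_words_in` and `filter_with_regex`, which the gist does not define.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: everything up to and including the last '/',
// or "./" when the URL has no directory part (same rule as the lambda above).
std::string dir_prefix(const std::string& url) {
    const auto pos = url.find_last_of('/');
    if (pos == std::string::npos) return "./";
    return url.substr(0, pos + 1);
}

// Hypothetical helper: maps relative link names found on page `from_url`
// to indices into `urls`, skipping links to pages outside the known set.
std::vector<int> resolve_links(const std::vector<std::string>& urls,
                               const std::string& from_url,
                               const std::vector<std::string>& relative_links) {
    std::map<std::string, int> index_of;  // URL -> position in `urls`
    for (std::size_t i = 0; i < urls.size(); ++i) {
        index_of[urls[i]] = static_cast<int>(i);
    }

    std::vector<int> out;
    const std::string dir = dir_prefix(from_url);
    for (const std::string& name : relative_links) {
        const auto it = index_of.find(dir + name);
        if (it != index_of.end()) out.push_back(it->second);
    }
    return out;
}

// Usage example with the expected results spelled out as assertions.
void example() {
    const std::vector<std::string> urls = {"site/a.html", "site/b.html", "c.html"};
    assert(dir_prefix("site/a.html") == "site/");
    assert(dir_prefix("c.html") == "./");
    // "b.html" resolves to "site/b.html" (index 1); "missing.html" is dropped.
    const std::vector<int> links =
        resolve_links(urls, "site/a.html", {"b.html", "missing.html"});
    assert(links.size() == 1 && links[0] == 1);
}
```

One consequence of the prefix rule is worth noting: for a URL with no `/`, links are looked up as `./name`, so they match only if the input list spells its URLs that way.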