@Tokubara
Last active July 24, 2021 08:51
rcore-tutorial
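library(rvest)
library(stringr)
chap0_index_url="https://rcore-os.github.io/rCore-Tutorial-Book-v3/chapter0/index.html"
html=read_html(chap0_index_url)
# hrefs of the sidebar links on the chapter 0 index page
link_sets0=html%>% html_nodes(".current .internal") %>% html_attr("href")
# Assumption: index_urls is meant to start from link_sets0 (the original never defined it)
index_urls=link_sets0
# turn relative "../" links into absolute URLs
index_urls=index_urls%>%str_replace("\\.\\./", "https://rcore-os.github.io/rCore-Tutorial-Book-v3/")
# append the chapter 0 index page itself
index_urls[length(index_urls)+1]=chap0_index_url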
for(index_url in index_urls) { # index_url is a complete (absolute) URL
  index_page=read_html(index_url)
  all_links=index_page%>% html_nodes(".current .internal") %>% html_attr("href")
  # keep only links that start with a digit
  in_links=all_links%>%str_subset("^\\d")
  # second-to-last path component, e.g. "chapter0"
  chap_name=index_url%>%word(sep=fixed('/'),start=-2)
  base_url=index_url%>%str_replace(fixed('index.html'),'')
  for(in_url in in_links) {
    section_url=str_c(base_url, in_url)
    # try() keeps the loop going if a single download fails
    try({
      download.file(section_url,str_c("~/Downloads/",chap_name,'_',in_url))
    })
  }
}
library(rvest)
library(stringr)
html=read_html("https://decaf-lang.github.io/minidecaf-tutorial/docs/ref/typescript-jyk.html")
setwd('~/Downloads/tmp')
links=html%>%html_nodes(".chapter a") %>% html_attr("href")
full_links=str_c("https://decaf-lang.github.io/minidecaf-tutorial/docs/ref/",links)%>%str_subset("html$")
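# Local file name for each link: pages whose parent directory matches 'lab.'
# are named "<dir>-<file>", all others keep the bare file name.
full_links_name=vapply(full_links, function(link) {
  parts=str_split(link,'/')[[1]]
  n=length(parts)
  if(str_detect(parts[n-1],'lab.')) {
    paste(parts[n-1], parts[n], sep='-')
  } else {
    parts[n]
  }
}, character(1), USE.NAMES=FALSE)
# seq_along() avoids the 1:0 iteration that 1:length() produces on an empty vector
for(i in seq_along(full_links)) {
  # try() keeps the loop going if a single download fails
  try({
    download.file(full_links[i],full_links_name[i])
  })
}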