@dmarcelinobr
Created April 2, 2019 00:01
Tutorial on using R and Selenium, presented at IPEA, 2015
# Install and load the required packages
install.packages(c('RSelenium', 'XML'))
library('RSelenium')
library('XML')
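# Start a local Selenium server and open a browser session.
# checkForServer() and startServer() are defunct in current RSelenium releases;
# rsDriver() fetches the needed binaries, starts the server, and opens the browser.
# browser = "firefox" matches the default that remoteDriver() used in the original.
rD <- RSelenium::rsDriver(browser = "firefox")
remDr <- rD$client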
# Navigate to the site
remDr$navigate("http://www.google.com/")
# Locate the search text box
# (the id reflects Google's page at the time of the tutorial and may have changed)
webElem <- remDr$findElement(using = "xpath", "//*[@id='lst-ib']")
# Inspect the element's attributes
webElem$getElementAttribute("class")
webElem$getElementAttribute("type")
webElem$getElementAttribute("name")
webElem$getElementAttribute("id")
# Type 'IPEA' and press Enter
webElem$sendKeysToElement(list("IPEA", key = "enter"))
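# A minimal sketch, not part of the original tutorial: the results page may not
# have finished loading right after Enter is sent. The hypothetical helper
# wait_for() polls a check function until it returns something non-empty,
# and stops with an error once the timeout (in seconds) has passed.
wait_for <- function(check, timeout = 10, interval = 0.5) {
  deadline <- Sys.time() + timeout
  repeat {
    # Treat an error inside check() the same as "nothing found yet"
    result <- tryCatch(check(), error = function(e) NULL)
    if (length(result) > 0L) return(result)
    if (Sys.time() >= deadline) stop("timed out after ", timeout, " seconds")
    Sys.sleep(interval)
  }
}
# Usage (assumes the remDr session opened earlier in this script):
# webElems <- wait_for(function() remDr$findElements(using = 'css selector', "li.g h3.r a"))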
# Collect the titles of the search results
webElems <- remDr$findElements(using = 'css selector', "li.g h3.r a")
resHeaders <- unlist(lapply(webElems, function(x){x$getElementText()}))
resHeaders
webElem <- webElems[[which(resHeaders == "Chamadas Públicas")]]
webElem$clickElement()
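# A minimal sketch, not part of the original tutorial: the exact-match lookup
# above fails with a "subscript out of bounds" error when no title matches, for
# example when the accented character is encoded differently. The hypothetical
# helper pick_by_title() matches by regular expression and reports a clear error.
pick_by_title <- function(titles, pattern) {
  hits <- grep(pattern, titles, ignore.case = TRUE)
  if (length(hits) == 0L) stop("no result title matches: ", pattern)
  hits[[1L]]  # index of the first matching title
}
# Usage (assumes webElems and resHeaders as built above; "." stands in for the
# accented letter):
# webElem <- webElems[[pick_by_title(resHeaders, "Chamadas P.blicas")]]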
# Collect the link names on the IPEA site
webElems <- remDr$findElements(using = 'css selector', "div#conteudo p a")
resHeaders <- unlist(lapply(webElems, function(x){x$getElementText()}))
resHeaders
# Click the most recent calls
webElem <- webElems[[1]]
webElem$clickElement()
# Collect the names of the calls
webElems <- remDr$findElements(using = 'css selector', "div#conteudo li a")
resHeaders <- unlist(lapply(webElems, function(x){x$getElementText()}))
resHeaders
# Click the most recent call
webElem <- webElems[[1]]
webElem$clickElement()
# Collect the link targets (href) that point to the PDF
webElems <- remDr$findElements(using = 'css selector', "div#conteudo p a")
resHeaders <- unlist(lapply(webElems, function(x){x$getElementAttribute('href')}))
# Download the PDF
download.file(resHeaders[1], 'Chamada.pdf', mode = 'wb')
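# A minimal sketch, not part of the original tutorial: the selector above returns
# every link in the content area, so the first href is not guaranteed to be a PDF.
# The hypothetical helper pdf_links() keeps only URLs whose path ends in ".pdf".
pdf_links <- function(hrefs) {
  hrefs[grepl("\\.pdf($|[?#])", hrefs, ignore.case = TRUE)]
}
# Usage (assumes resHeaders holds the hrefs collected above):
# pdfs <- pdf_links(resHeaders)
# if (length(pdfs) > 0) download.file(pdfs[1], 'Chamada.pdf', mode = 'wb')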