@awhstin
Created March 23, 2016 18:32
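Loop for web scraping from a two-column data frame (column 1: page URL, column 2: CSS selector), with a sentiment score for each extracted text node.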
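library(syuzhet)
library(rvest)

# news.csv has two columns: column 1 is the page URL, column 2 is the CSS selector for its text.
news <- read.csv("H:/news.csv", stringsAsFactors = FALSE)
newslist <- NULL
for (i in seq_len(nrow(news))) {
  # Fetch the page and extract the text of every node matching the selector.
  article <- read_html(news[i, 1]) %>%
    html_nodes(news[i, 2]) %>%
    html_text()
  # Replace punctuation and other non-alphanumeric characters with spaces.
  article <- gsub("[^[:alnum:]]", " ", article)
  # Skip pages where the selector matched nothing; data.frame() would fail on them.
  if (length(article) == 0) next
  # Score each extracted text node with the Bing lexicon.
  sentiment <- get_sentiment(article, method = "bing")
  # Name the URL column directly instead of renaming it afterwards.
  newssource <- data.frame(article, Source = news[i, 1], sentiment, stringsAsFactors = FALSE)
  newslist <- rbind(newslist, newssource)
}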
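Once the loop has finished, `newslist` holds one row per extracted text node, so a per-source summary is a natural next step. The following is only a minimal sketch in base R: the data frame below is a hypothetical stand-in for `newslist`, with made-up URLs and scores rather than real `get_sentiment()` output, and `by_source` is a name introduced here for illustration.

```r
# Hypothetical stand-in for the `newslist` built by the loop: one row per text node.
newslist <- data.frame(
  article   = c("Markets rally on good news",
                "Storm causes terrible damage",
                "Team wins final"),
  Source    = c("https://example.com/a",
                "https://example.com/a",
                "https://example.org/b"),
  sentiment = c(2, -2, 1),  # made-up scores, not real get_sentiment() output
  stringsAsFactors = FALSE
)

# Average the sentiment of all text nodes that came from the same URL.
by_source <- aggregate(sentiment ~ Source, data = newslist, FUN = mean)
print(by_source)
```

Averaging is just one possible choice; `FUN = sum` or `FUN = median` would work the same way if a different summary suits the data better.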