Skip to content

Instantly share code, notes, and snippets.

@andrie
Last active November 20, 2016 12:43
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save andrie/a9ff69a274963c70b72f to your computer and use it in GitHub Desktop.
Save andrie/a9ff69a274963c70b72f to your computer and use it in GitHub Desktop.
Functions to read Google Docs spreadsheet into R
library(XML)
cleanGoogleTable <- function(dat, table=1, skip=0, ncols=NA, nrows=-1, header=TRUE, dropFirstCol=NA){
if(!is.data.frame(dat)){
dat <- dat[[table]]
}
if(is.na(dropFirstCol)) {
firstCol <- na.omit(dat[[1]])
if(all(firstCol == ".") || all(firstCol== as.character(seq_along(firstCol)))) {
dat <- dat[, -1]
}
} else if(dropFirstCol) {
dat <- dat[, -1]
}
if(skip > 0){
dat <- dat[-seq_len(skip), ]
}
if(nrow(dat) == 1) return(dat)
if(nrow(dat) >= 2){
if(all(is.na(dat[2, ]))) dat <- dat[-2, ]
}
if(header && nrow(dat) > 1){
header <- as.character(dat[1, ])
names(dat) <- header
dat <- dat[-1, ]
}
# Keep only desired columns
if(!is.na(ncols)){
ncols <- min(ncols, ncol(dat))
dat <- dat[, seq_len(ncols)]
}
# Keep only desired rows
if(nrows > 0){
nrows <- min(nrows, nrow(dat))
dat <- dat[seq_len(nrows), ]
}
# Rename rows
rownames(dat) <- seq_len(nrow(dat))
dat
}
readGoogleSheet <- function(url, na.string="", header=TRUE){
stopifnot(require(XML))
# Suppress warnings because Google docs seems to have incomplete final line
suppressWarnings({
doc <- paste(readLines(url), collapse=" ")
})
if(nchar(doc) == 0) stop("No content found")
htmlTable <- gsub("^.*?(<table.*</table).*$", "\\1>", doc)
ret <- readHTMLTable(htmlTable, header=header, stringsAsFactors=FALSE, as.data.frame=TRUE)
lapply(ret, function(x){ x[ x == na.string] <- NA; x})
}
@colinglaes
Copy link

Hey Andrie,

I would love to use this code for a project but was wondering if there was any way to use a private google doc so that I don't have to publish it online, the information i want to utilize is sensitive.

Thanks,
Colin

@feelosophy13
Copy link

I have the same issue as colinglaes: trying to read a private spreadsheet.

@brohaut
Copy link

brohaut commented Mar 22, 2016

Hi, It works if you use the sharable link from google doc, without web publishing
best

@brohaut
Copy link

brohaut commented Mar 22, 2016

Humm... does anyone know how to import more than 99 lines?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment