Jenny Bryan
23 May, 2015
library("readxl")
I'm getting puzzling behavior from the sheet =
argument of readxl::read_excel()
with a small spreadsheet I use for examples and tests in googlesheets
. The file is mini-gap.xlsx and it was created by downloading a Google Sheet as an xlsx
file. Basically, I don't get the sheet I'm requesting.
mini_gap <- read_excel("mini-gap.xlsx")
#read_excel(system.file("mini-gap.xlsx", package = "googlesheets"))
#https://github.com/jennybc/googlesheets/blob/master/inst/mini-gap.xlsx
str(mini_gap)
## Classes 'tbl_df', 'tbl' and 'data.frame': 5 obs. of 6 variables:
## $ country : chr "Albania" "Austria" "Belgium" "Bosnia and Herzegovina" ...
## $ continent: chr "Europe" "Europe" "Europe" "Europe" ...
## $ year : num 1952 1952 1952 1952 1952
## $ lifeExp : num 55.2 66.8 68 53.8 59.6
## $ pop : num 1282697 6927772 8730405 2791000 7274900
## $ gdpPercap: num 1601 6137 8343 974 2444
mini_gap$continent
## [1] "Europe" "Europe" "Europe" "Europe" "Europe"
The "Europe" sheet actually appears as the 4th tab. Why is it the default sheet, i.e. the first sheet?
Let's systematically request the sheets 1 through 5.
continents <- sapply(1:5, function(i) {
mini_gap <- read_excel("mini-gap.xlsx", sheet = i)
mini_gap$continent[1]
})
continents
## [1] "Europe" "Oceania" "Americas" "Africa" "Asia"
The tabs are arranged alphabetically in the spreadsheet but when read_excel()
access them via the integers 1 through 5, we get Europe (4), Oceania (5), Americas (2), Africa (1), Asia (3).
Let's systematically request the sheet by name.
(continents <- sort(continents))
## [1] "Africa" "Americas" "Asia" "Europe" "Oceania"
sapply(continents, function(i) {
mini_gap <- read_excel("mini-gap.xlsx", sheet = i)
mini_gap$continent[1]
})
## Africa Americas Asia Europe Oceania
## "Europe" "Oceania" "Americas" "Africa" "Asia"
When I access the sheets by name, I get the same strange behavior.
If I open mini-gap.xlsx
in Excel and save it the file, this weird behavior goes away. I've confirmed all of this with a second sheet downloaded as xlsx
from Google Sheets. I don't know if Google writes non-standard xlsx
or ???