How to successively combine years and months using purrr
Hi,
-----------------------------------
PROBLEM STATEMENT
-----------------------------------
I want to generate a URL for every month of every year, as follows:
[...]
https://s3.amazonaws.com/data/201611.csv.zip
https://s3.amazonaws.com/data/201612.csv.zip
https://s3.amazonaws.com/data/201701.csv.zip
https://s3.amazonaws.com/data/201702.csv.zip
[...]
-----------------------------------
CODE
-----------------------------------
Here is my attempt:
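library(purrr)
library(stringr)

head.f.name = "https://s3.amazonaws.com/data/%sA"
tail.f.name = ".csv.zip"
v.i = str_pad(1:12, 2, pad = "0")
# map2_chr() fails here: .x has 3 elements (years), .y has 12 (months)
map_chr(2015:2017, ~ sprintf(head.f.name, .)) %>%
  map2_chr(v.i, ~ str_replace(.x, "[:upper:]$", .y))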
-----------------------------------
ERROR
-----------------------------------
But I get an error:
Error: `.x` (3) and `.y` (12) are different lengths
Shouldn't the shorter vector be recycled in this case?
@jennybc

jennybc commented Apr 27, 2017

I put this in a tweet reply, but Twitter sort of mangles it because of the URL.

I would just do this:

df <- expand.grid(m = 1:12, y = 2015:2016)
sprintf("https://s3.amazonaws.com/data/%d%02d.csv.zip", df$y, df$m)
#>  [1] "https://s3.amazonaws.com/data/201501.csv.zip"
#>  [2] "https://s3.amazonaws.com/data/201502.csv.zip"
#>  [3] "https://s3.amazonaws.com/data/201503.csv.zip"
#>  [4] "https://s3.amazonaws.com/data/201504.csv.zip"
#>  [5] "https://s3.amazonaws.com/data/201505.csv.zip"
#>  [6] "https://s3.amazonaws.com/data/201506.csv.zip"
#>  [7] "https://s3.amazonaws.com/data/201507.csv.zip"
#>  [8] "https://s3.amazonaws.com/data/201508.csv.zip"
#>  [9] "https://s3.amazonaws.com/data/201509.csv.zip"
#> [10] "https://s3.amazonaws.com/data/201510.csv.zip"
#> [11] "https://s3.amazonaws.com/data/201511.csv.zip"
#> [12] "https://s3.amazonaws.com/data/201512.csv.zip"
#> [13] "https://s3.amazonaws.com/data/201601.csv.zip"
#> [14] "https://s3.amazonaws.com/data/201602.csv.zip"
#> [15] "https://s3.amazonaws.com/data/201603.csv.zip"
#> [16] "https://s3.amazonaws.com/data/201604.csv.zip"
#> [17] "https://s3.amazonaws.com/data/201605.csv.zip"
#> [18] "https://s3.amazonaws.com/data/201606.csv.zip"
#> [19] "https://s3.amazonaws.com/data/201607.csv.zip"
#> [20] "https://s3.amazonaws.com/data/201608.csv.zip"
#> [21] "https://s3.amazonaws.com/data/201609.csv.zip"
#> [22] "https://s3.amazonaws.com/data/201610.csv.zip"
#> [23] "https://s3.amazonaws.com/data/201611.csv.zip"
#> [24] "https://s3.amazonaws.com/data/201612.csv.zip"
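
On the recycling question: as far as I know, purrr's map2 functions only recycle an input of length 1, so lengths 3 and 12 are rejected rather than combined. The sketch below is only an illustration of that behaviour and of why building the year/month grid first helps; the names one_year, mismatch, grid and all_urls are made up for this example.

```r
library(purrr)

url <- "https://s3.amazonaws.com/data/%d%02d.csv.zip"

# A length-1 input is recycled against the longer one: one year, twelve months.
one_year <- map2_chr(2015, 1:12, ~ sprintf(url, .x, .y))

# Lengths 3 and 12 are not recycled, so this call errors.
# try() captures the error so the rest of the script keeps running.
mismatch <- try(
  map2_chr(2015:2017, 1:12, ~ sprintf(url, .x, .y)),
  silent = TRUE
)

# Building every (year, month) pair first yields two columns of equal length,
# which map2_chr() (or plain vectorised sprintf()) accepts.
grid <- expand.grid(m = 1:12, y = 2015:2017)
all_urls <- map2_chr(grid$y, grid$m, ~ sprintf(url, .x, .y))
```

Because sprintf() is already vectorised, the map2_chr() call in the last step is optional; it is shown only to stay close to the purrr style of the question.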

@drsimonj

I think the base R solution from @jennybc is the best way to go. However, just for variety's sake, here's a possible approach using purrr:

library(purrr)
url <- "https://s3.amazonaws.com/data/%d%02d.csv.zip"

d <- cross2(2015:2016, 1:12)
map_chr(d, ~ sprintf(url, .[[1]], .[[2]]))
#>  [1] "https://s3.amazonaws.com/data/201501.csv.zip"
#>  [2] "https://s3.amazonaws.com/data/201601.csv.zip"
#>  [3] "https://s3.amazonaws.com/data/201502.csv.zip"
#>  [4] "https://s3.amazonaws.com/data/201602.csv.zip"
#>  ...
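
Two things worth noting about the approach above: the output interleaves the years (201501, 201601, 201502, ...), and cross2() has been deprecated since purrr 1.0.0 in favour of tidyr::expand_grid(). A possible alternative, assuming the tidyr package is installed, is sketched here; the names grid and urls are just for illustration.

```r
library(tidyr)

url <- "https://s3.amazonaws.com/data/%d%02d.csv.zip"

# expand_grid() varies the first column slowest, so putting the year first
# gives rows in chronological order: 2015-01, 2015-02, ..., 2016-12.
grid <- expand_grid(y = 2015:2016, m = 1:12)

# sprintf() is vectorised over the two columns, so no explicit map is needed.
urls <- sprintf(url, grid$y, grid$m)
```

If the cross2() version is kept, wrapping the result in sort() should give the same chronological order, since the year and zero-padded month sort correctly as text.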
