Skip to content

Instantly share code, notes, and snippets.

@BenjaminWolfe
Last active August 31, 2020 17:30
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save BenjaminWolfe/c20b9654e76df0af919c0be3e678394e to your computer and use it in GitHub Desktop.
Save BenjaminWolfe/c20b9654e76df0af919c0be3e678394e to your computer and use it in GitHub Desktop.
Nested Game of Thrones Data - Snowflake + R
library(tidyverse)
library(repurrrsive) # game of thromes dataset
library(listviewer) # jsonedit for pretty viewing
jsonedit(got_chars, mode = "view")
really_nested <- tibble(id = 1:30, nested_stuff = got_chars)
really_nested %>%
hoist(
nested_stuff,
name = list("name"),
primary_title = list("titles", 1)
)
-- in Snowflake:
select id
, nested_stuff:name::string as name -- end with double-colon and data type
, nested_stuff:title[0]::string as primary_title -- zero-based
from really_nested;
# Snowflake seems to prefer ALL CAPS;
# if you run into errors try using them, and/or putting field names in quotes.
library(tidyverse)
library(rJava) # connect to Java
library(dbplyr) # database back end for the tidyverse's dplyr
library(dplyr.snowflakedb) # the snowflake connector
# I set these connection parameters in my .Renviron file;
# there are better ways to do it
# see https://github.com/snowflakedb/dplyr-snowflakedb for more on connecting to your DB
options(dplyr.jdbc.classpath = Sys.getenv("SNOWFLAKE_DRIVER"))
snowflake_connection <-
src_snowflakedb(
user = Sys.getenv("SNOWFLAKE_USER"),
password = Sys.getenv("SNOWFLAKE_PWORD"),
account = Sys.getenv("SNOWFLAKE_ACCOUNT"),
host = Sys.getenv("SNOWFLAKE_URL"),
opts = list(
warehouse = "GOT_WAREHOUSE",
db = "GOT_DB"
)
)
# here's where the magic happens;
# notice how sql() injects Snowflake SQL to query the JSON,
# analogous to tidyr::hoist()
snowflake_connection %>%
tbl(in_schema("GOT_SCHEMA", "REALLY_NESTED")) %>%
mutate(
name = sql('NESTED_STUFF:name::string'),
primary_title = sql('NESTED_STUFF:title[0]::string')
) %>%
rename_all(str_to_lower) # I prefer lowercase!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment