Skip to content

Instantly share code, notes, and snippets.

@RCura
Last active October 12, 2022 08:58
Show Gist options
  • Save RCura/c11433a697bc7660ed4cde37278fb580 to your computer and use it in GitHub Desktop.
Save RCura/c11433a697bc7660ed4cde37278fb580 to your computer and use it in GitHub Desktop.
library(tidyverse)
library(duckdb)
library(DBI)
# création de la BDD en mémoire
connexion <- dbConnect(drv = duckdb(), dbdir=":memory:")
# Création d'une vue correspondant au fichier (csv ici)
dbExecute(connexion, statement = "CREATE VIEW matable AS SELECT * FROM 'stations.csv'")
# Enregistrement de la table dans R
stations <- tbl(connexion, "matable")
# Interrogation classique
stations
# Source: table<matable> [?? x 69]
# Database: DuckDB 0.5.1 [robin@Windows 10 x64:R 4.2.1/:memory:]
id name slug uic uic8_sncf latitude longitude
<int> <chr> <chr> <int> <int> <dbl> <dbl>
1 1 Château-Arnoux—St-… chat… NA NA 44.1 6.00
2 2 Château-Arnoux—St-… chat… 8775123 87751230 44.1 6.00
3 3 Château-Arnoux Mai… chat… 8775122 87751222 44.1 6.01
4 4 Digne-les-Bains dign… 9921024 NA 44.4 6.35
5 6 Digne-les-Bains dign… 8775149 87751495 44.1 6.22
6 7 La Crau la-c… 8775561 87755611 43.1 6.07
7 8 Aire-sur-l’Adour aire… 8767104 87671040 43.7 -0.258
8 9 Cagnes-sur-Mer cagn… 8775632 87756320 43.7 7.15
9 10 Menton ment… NA NA 43.8 7.50
10 11 Menton ment… 8775648 87756486 43.8 7.49
# … with more rows, and 62 more variables: parent_station_id <int>,
# country <chr>, time_zone <chr>, is_city <lgl>,
# is_main_station <lgl>, is_airport <lgl>, is_suggestable <lgl>,
# country_hint <lgl>, main_station_hint <lgl>, sncf_id <chr>,
# sncf_tvs_id <chr>, sncf_is_enabled <lgl>, idtgv_id <chr>,
# idtgv_is_enabled <lgl>, db_id <int>, db_is_enabled <lgl>,
# busbud_id <chr>, busbud_is_enabled <lgl>, distribusion_id <chr>,
stations %>% group_by(country) %>% count(sort = TRUE) # Les résultats s'affichent mais ne sont pas enregistrés dans une variable
# Source: SQL [?? x 2]
# Database: DuckDB 0.5.1 [robin@Windows 10 x64:R 4.2.1/:memory:]
# Ordered by: desc(n)
country n
<chr> <dbl>
1 DE 7610
2 FR 7286
3 ES 4500
4 IT 3740
5 GB 2791
6 CH 740
7 BE 580
8 AT 461
9 NL 434
10 CZ 398
# … with more rows
# Enregistrement des résultats "en local"
monresultat <- stations %>%
group_by(country) %>%
count(sort = TRUE) %>%
collect()
monresultat
# A tibble: 37 × 2
country n
<chr> <dbl>
1 DE 7610
2 FR 7286
3 ES 4500
4 IT 3740
5 GB 2791
6 CH 740
7 BE 580
8 AT 461
9 NL 434
10 CZ 398
# … with 27 more rows
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment