@flarco
Last active December 13, 2022 11:03
Sling CLI Examples - Loading from a File System into a Database (https://docs.slingdata.io/sling-cli/)

Sling CLI Examples

  • These examples demonstrate the ways Sling CLI can load files from a file system into a database.
  • Compatible file formats: CSV, TSV, JSON and XML. Parquet support is coming soon.
  • Compatible file systems include: AWS S3, Google Cloud Storage, Azure Blob Storage, DigitalOcean Spaces, Backblaze B2, Cloudflare R2, Wasabi
  • Compatible database systems: MySQL, ClickHouse, BigTable, Postgres, Snowflake, BigQuery, Redshift, SQL Server, Oracle
  • See here for the full list of connectors

Objective:

  • Goal is to ingest a single CSV file from our local drive into a Database

Assumptions:

  • Sling CLI is installed (see here)
  • CSV file exists at path /tmp/accounts.csv (or C:/Temp/accounts.csv for Windows)
  • Connections named SNOWFLAKE and POSTGRES are set up for sling to read (see here)
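
To try the commands in this section without real data, a small file can be created first. This is only a sketch: the column names and rows are made up for illustration, and write_sample_csv is a hypothetical helper, not part of Sling.

```shell
# Write a tiny sample CSV to the given path (default: /tmp/accounts.csv).
# The columns and rows are placeholders, not a schema that Sling requires.
write_sample_csv() {
  target="${1:-/tmp/accounts.csv}"
  cat > "$target" <<'EOF'
id,name,balance
1,Alice,100.50
2,Bob,20.00
3,Carol,0.75
EOF
}

# Usage:
#   write_sample_csv /tmp/accounts.csv
```

On Windows, the same file can be written by hand to C:/Temp/accounts.csv.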

Using File Path

With full-refresh mode:

# For Mac / Linux
sling run --src-stream file:///tmp/accounts.csv --tgt-conn SNOWFLAKE --tgt-object sling.accounts --mode full-refresh

# For Windows (Powershell)
sling run --src-stream file://C:/Temp/accounts.csv --tgt-conn SNOWFLAKE --tgt-object sling.accounts --mode full-refresh
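
The two commands above differ in the number of slashes because the URL is the file:// scheme followed by the path: a Mac / Linux absolute path already starts with /, while a Windows path starts with a drive letter. A small sketch of that rule, where to_file_url is a hypothetical helper and the handling of relative paths (resolving them against the current directory) is an assumption rather than documented Sling behavior:

```shell
# Build a file:// URL from a local path.
# Absolute POSIX paths (/tmp/x) and Windows drive paths (C:/Temp/x) are used
# as-is; anything else is treated as relative to the current directory.
to_file_url() {
  case "$1" in
    /*|[A-Za-z]:/*) printf 'file://%s\n' "$1" ;;
    *)              printf 'file://%s/%s\n' "${PWD%/}" "$1" ;;
  esac
}

to_file_url /tmp/accounts.csv      # file:///tmp/accounts.csv
to_file_url C:/Temp/accounts.csv   # file://C:/Temp/accounts.csv
```

The result can then be passed along, for example --src-stream "$(to_file_url /tmp/accounts.csv)".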

Using stdin Pipe

# For Mac / Linux
cat /tmp/accounts.csv | sling run --tgt-conn POSTGRES --tgt-object sling.accounts --mode full-refresh

# For Windows (Powershell)
cat C:/Temp/accounts.csv | sling run --tgt-conn POSTGRES --tgt-object sling.accounts --mode full-refresh
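
Because the pipe form loads whatever arrives on stdin, the file can be trimmed or filtered before Sling sees it. A minimal sketch for loading only a sample of rows, assuming the first line is a header and each record sits on one line; csv_head is a hypothetical helper, not part of Sling:

```shell
# Print the CSV header plus the first N data rows read from stdin.
# Assumes one record per line (no quoted fields containing newlines).
csv_head() {
  awk -v n="$1" 'NR <= n + 1'
}

# Usage (needs sling and the POSTGRES connection from the assumptions):
#   csv_head 100 < /tmp/accounts.csv | sling run --tgt-conn POSTGRES --tgt-object sling.accounts --mode full-refresh
```

Any other command that writes CSV to stdout can take the place of csv_head in the pipeline.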

Objective:

  • Goal is to ingest a single JSON file from our local drive into a Database, with full-refresh mode.

Assumptions:

  • Sling CLI is installed (see here)
  • JSON file exists at path /tmp/records.json (or C:/Temp/records.json for Windows)
  • Connections named BIGQUERY and MYSQL are set up for sling to read (see here)
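
Before running a load, it can help to confirm that the file really is JSON and not, say, a CSV saved under the wrong name. The sketch below is a cheap sanity check only, not a validator; looks_like_json is a hypothetical helper, not part of Sling:

```shell
# Succeed if the first non-whitespace character of the file is '{' or '['.
# This only catches obvious mix-ups; it does not parse the JSON.
looks_like_json() {
  first="$(tr -d ' \t\r\n' < "$1" | head -c 1)"
  case "$first" in
    '{'|'[') return 0 ;;
    *)       return 1 ;;
  esac
}

# Usage:
#   looks_like_json /tmp/records.json && echo "ok to load"
```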

Using File Path

With full-refresh mode:

# For Mac / Linux
sling run --src-stream file:///tmp/records.json --tgt-conn BIGQUERY --tgt-object sling.records --mode full-refresh

# For Windows (Powershell)
sling run --src-stream file://C:/Temp/records.json --tgt-conn BIGQUERY --tgt-object sling.records --mode full-refresh

Using stdin Pipe

# For Mac / Linux
cat /tmp/records.json | sling run --tgt-conn MYSQL --tgt-object sling.records --mode full-refresh

# For Windows (Powershell)
cat C:/Temp/records.json | sling run --tgt-conn MYSQL --tgt-object sling.records --mode full-refresh

Objective:

  • Goal is to ingest a single CSV file from our SFTP connection into a Database

Assumptions:

  • Sling CLI is installed (see here)
  • CSV file exists at path /tmp/accounts.csv in our SFTP connection
  • Connections named MY_SFTP and SNOWFLAKE are set up for sling to read (see here)
  • The host value for MY_SFTP is 11.12.23.45 (an IP). Could also be a hostname (site.com).

Using File Path

With full-refresh mode:

sling run --src-conn MY_SFTP --src-stream sftp://11.12.23.45:22/tmp/accounts.csv --tgt-conn SNOWFLAKE --tgt-object sling.accounts --mode full-refresh
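
The --src-stream value in the command above repeats the host configured for MY_SFTP, followed by the port and the absolute remote path. A small sketch of how that URL is put together; sftp_url is a hypothetical helper, and defaulting the port to 22 is an assumption based on the example:

```shell
# Assemble an sftp:// URL from a host, an absolute remote path and an
# optional port (defaults to 22, the port used in the example).
sftp_url() {
  host="$1"
  path="$2"
  port="${3:-22}"
  printf 'sftp://%s:%s%s\n' "$host" "$port" "$path"
}

sftp_url 11.12.23.45 /tmp/accounts.csv   # sftp://11.12.23.45:22/tmp/accounts.csv
```

A hostname works the same way, for example sftp_url site.com /tmp/accounts.csv.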