@louisdorard
Created September 27, 2022 15:32
# %% [markdown]
# # Dataiku API Quickstart
#
# ## Load secrets
#
# * Option 1: from environment variables
# %%
import os

DKU_DSS_URL = os.getenv("DKU_DSS_URL")
DKU_API_KEY = os.getenv("DKU_API_KEY")
# %% [markdown]
# * Option 2: from config file
# %%
import os
from json5 import load  # third-party package: pip install json5
# Read the local Dataiku config and pick the default instance's credentials.
with open(os.path.expanduser("~/.dataiku/config.json")) as f:
    config = load(f)
instance = config["dss_instances"][config["default_instance"]]
DKU_DSS_URL = instance["url"]
DKU_API_KEY = instance["api_key"]
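# %% [markdown]
# The two options can also be combined. The helper below is a minimal sketch, not part of the Dataiku API: `resolve_credentials` is a hypothetical name, and it assumes the config file is plain JSON (so the standard `json` module can stand in for `json5`) with the same `dss_instances` / `default_instance` layout used in Option 2. It prefers the environment variables and falls back to the config file.
# %%
import json
import os


def resolve_credentials(env=None, config_path=None):
    """Return (url, api_key), preferring environment variables over the config file.

    Hypothetical helper for illustration; `env` and `config_path` are
    parameters introduced here so the function can be tried without a real setup.
    """
    env = os.environ if env is None else env
    url, api_key = env.get("DKU_DSS_URL"), env.get("DKU_API_KEY")
    if url and api_key:
        return url, api_key
    if config_path is None:
        config_path = os.path.join(os.path.expanduser("~"), ".dataiku", "config.json")
    # Assumption: the file is plain JSON; json5-only syntax would need json5.load instead.
    with open(config_path) as f:
        config = json.load(f)
    instance = config["dss_instances"][config["default_instance"]]
    return instance["url"], instance["api_key"]


# Possible usage: DKU_DSS_URL, DKU_API_KEY = resolve_credentials()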
# %% [markdown]
# ## Install Dataiku package
# %%
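import os
# rstrip("/") keeps the URL valid whether or not DKU_DSS_URL ends with a slash.
base_url = DKU_DSS_URL.rstrip("/")
os.system("pip install " + base_url + "/public/packages/dataiku-internal-client.tar.gz")
os.system("pip install dataiku-api-client")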
# %% [markdown]
# ## Call Dataiku API
# %%
import dataiku
dataiku.set_remote_dss(DKU_DSS_URL, DKU_API_KEY)
# %% [markdown]
# Get a dataframe from a Dataiku dataset.
#
# (This requires the "DKU_TSHIRTS" project. To create it, click "+ New Project" in your Dataiku instance, choose "Sample projects", then "Dataiku TShirts".)
# %%
project_name = "DKU_TSHIRTS"
dataset_name = "crm_and_web_history_enriched"
dataset = dataiku.Dataset(dataset_name, project_name)
df = dataset.get_dataframe()
df
# %% [markdown]
# Explore the Dataiku API: here we read the dataset's configuration and list the column names from its schema.
# %%
from pandas import DataFrame
columns_list = DataFrame(dataset.get_config()["schema"]["columns"])["name"].to_list()
columns_list
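# %% [markdown]
# The same schema can be turned into a name-to-type mapping. This is a minimal sketch that runs on a hand-written sample rather than a live instance: `column_types`, `sample_config`, and the column names in it are hypothetical, and the `"type"` key is an assumption about what each schema entry contains (only `"name"` is used in the cell above). With a real connection, `dataset.get_config()` could be passed in place of `sample_config`.
# %%
# Hypothetical sample mimicking the ["schema"]["columns"] layout read above.
sample_config = {
    "schema": {
        "columns": [
            {"name": "customer_id", "type": "string"},
            {"name": "pages_visited", "type": "bigint"},
        ]
    }
}


def column_types(config):
    """Map each column name to its declared type (None if no "type" key is present)."""
    return {col["name"]: col.get("type") for col in config["schema"]["columns"]}


sample_types = column_types(sample_config)
sample_types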
# %% [markdown]
# Read more at https://doc.dataiku.com/dss/latest/python-api/index.html