@richiefrost
Created September 10, 2019 17:01
Pandas read_csv from Azure Data Lake with interactive login

How to read a Pandas Dataframe from Azure Data Lake

This demo is a proof of concept intended for interactive use, such as exploratory data analysis in a Jupyter notebook. It loads a CSV file that resides on Azure Data Lake into memory as a Pandas DataFrame. The demo authenticates with an interactive login, but lib.auth() also accepts Service Principal credentials or a username/password/tenant ID combination.
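For non-interactive use, the Service Principal option mentioned above might look roughly like the sketch below. The function name `service_principal_client` and the environment variable names are assumptions made for illustration and are not part of the original demo; the `tenant_id`, `client_id`, and `client_secret` keyword arguments are the ones `lib.auth()` takes for this kind of login.

```python
import os


def service_principal_client(store_name):
    """Build a Data Lake client from Service Principal credentials.

    The environment variable names below are hypothetical; use whatever
    your deployment provides.
    """
    try:
        tenant_id = os.environ["AZURE_TENANT_ID"]
        client_id = os.environ["AZURE_CLIENT_ID"]
        client_secret = os.environ["AZURE_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of prompting for a login.
        raise RuntimeError(
            f"Missing credential environment variable: {missing}"
        ) from None

    # Imported here so the credential check above runs before the SDK loads.
    from azure.datalake.store import core, lib

    token = lib.auth(
        tenant_id=tenant_id,
        client_id=client_id,
        client_secret=client_secret,
    )
    return core.AzureDLFileSystem(token, store_name=store_name)
```

The returned client can be used in place of the one created by the interactive login, for example by opening a path with `client.open(...)` and passing the handle to `pd.read_csv`.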

from azure.datalake.store import core, lib, multithread
import pandas as pd


class ADLSHelper:
    def __init__(self, store_name='mystorename'):
        """
        When initializing this helper, it will prompt you to do an interactive login to connect to your data lake store.
        It uses Azure Active Directory for authentication, and you use the token returned from
        your login process to connect to your Azure Data Lake instance.
        You can also authenticate with username/password or ServicePrincipal for production.
        """
        token = lib.auth()
        self.client = core.AzureDLFileSystem(token, store_name=store_name)

    def get_df(self, dataframe_path):
        """
        Reads the Pandas Dataframe from your Azure Data Lake instance at the given path.
        Dataframe is loaded into memory, not saved to disk.
        """
        with self.client.open(dataframe_path) as dataframe_file_ptr:
            df = pd.read_csv(dataframe_file_ptr)
        return df


# Example
helper = ADLSHelper(store_name='dog_facts_fake_datalake')
# At this point you'll be asked to log in. You click a link to go to a separate screen and input a unique code generated here.
# Once you've connected, you can get the dataframe like so:
df = helper.get_df('/archive/interesting_facts/dog_facts.csv')
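
The get_df method works because pd.read_csv accepts any open binary file-like object, not only a path on disk. The sketch below shows the same parsing step with an in-memory buffer standing in for the handle returned by `self.client.open(...)`, so it can be tried without a data lake connection. The helper name `read_csv_from_handle` and the CSV contents are made up for illustration.

```python
import io

import pandas as pd


def read_csv_from_handle(file_obj):
    """Parse CSV data from an already-open binary file-like object."""
    return pd.read_csv(file_obj)


# Made-up data standing in for a file opened on the data lake.
fake_csv = io.BytesIO(b"breed,fact\ncorgi,herding dog\nbeagle,scent hound\n")

df = read_csv_from_handle(fake_csv)
print(df)
```

Nothing is written to disk at any point; the bytes are parsed straight from the buffer into the DataFrame, which is the behavior the get_df docstring describes.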