@lindacmsheard
Last active September 14, 2021 08:24
Connect Azure ML to Azure DataLake Gen 2 with a Service Principal

Note that service principal role assignments may take a short while to become available, so give it a few minutes before testing the access.

Create a service principal

az ad sp create-for-rbac --name myAMLWorkspaceRep

Note the appId (client ID), password (client secret) and tenant (tenant ID) returned, and provide these to the user who will configure datastores in Azure ML:

{
  "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "displayName": "myAMLWorkspaceRep",
  "name": "http://myAMLWorkspaceRep",
  "password": "abcdefghijklmnopqrstuvwxyz1234567890",
  "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
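As a minimal sketch of handing these values over, the snippet below parses output of the shape shown above and maps the CLI field names onto the names used later when configuring a datastore. The function name `extract_credentials` and the keys `client_id`, `client_secret` and `tenant_id` are illustrative choices, not part of the Azure CLI; the sample string simply repeats the placeholder values above.

```python
import json

# Placeholder output copied from the example above; real values will differ.
SAMPLE_OUTPUT = """
{
  "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "displayName": "myAMLWorkspaceRep",
  "name": "http://myAMLWorkspaceRep",
  "password": "abcdefghijklmnopqrstuvwxyz1234567890",
  "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
"""


def extract_credentials(cli_output: str) -> dict:
    """Map the CLI's field names onto client ID / client secret / tenant ID.

    The returned key names are hypothetical, chosen for this sketch only.
    """
    data = json.loads(cli_output)
    return {
        "client_id": data["appId"],
        "client_secret": data["password"],
        "tenant_id": data["tenant"],
    }


credentials = extract_credentials(SAMPLE_OUTPUT)
# Print only the non-secret fields; avoid echoing the client secret.
print(credentials["client_id"], credentials["tenant_id"])
```

The client secret is probably best passed on through a secret store rather than pasted into chat or email, since it cannot be retrieved again later and would have to be reset.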

Prepare the service principal specifically for data access:

Remove the default Contributor role assignment at the subscription level. As of 04/2021 the CLI still creates this assignment by default; removing it emulates the planned future behaviour.

az role assignment delete --assignee <appId> --role "Contributor"

Assign write access to a specific datalake account

az role assignment create --assignee <appId> --role "Storage Blob Data Contributor" --scope /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/myresourcegroup/providers/Microsoft.Storage/storageAccounts/mydatalake

(Optional) Assign read access to storage accounts in a wider scope, e.g. the subscription

az role assignment create --assignee <appId> --role "Storage Blob Data Reader" --scope /subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
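The two `--scope` values used above follow the same resource ID pattern, narrowed from subscription to storage account. A small sketch of building them, assuming a hypothetical helper named `build_scope` (not part of any Azure tooling) and the placeholder names from the commands above:

```python
from typing import Optional


def build_scope(subscription_id: str,
                resource_group: Optional[str] = None,
                storage_account: Optional[str] = None) -> str:
    """Build a --scope string at subscription, resource group or storage account level."""
    scope = f"/subscriptions/{subscription_id}"
    if resource_group:
        scope += f"/resourceGroups/{resource_group}"
        if storage_account:
            scope += f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    elif storage_account:
        # A storage account ID always includes its resource group.
        raise ValueError("a storage account scope also needs its resource group")
    return scope


SUBSCRIPTION = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"  # placeholder, as above

# Narrow scope: the single data lake account that gets write access.
account_scope = build_scope(SUBSCRIPTION, "myresourcegroup", "mydatalake")
# Wide scope: the whole subscription, for the optional read access.
subscription_scope = build_scope(SUBSCRIPTION)
print(account_scope)
print(subscription_scope)
```

Keeping the write role on the narrow scope and only the read role on the wide one matches the least-privilege intent of the steps above.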

Alternatively, create a multi-purpose Service Principal

Assign Contributor rights at the subscription level (currently the default)

az role assignment create --assignee <appId> --role "Contributor"

Note that service principal role assignments may take a short while to become available.
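Because of this delay, a first access test may fail even though the assignment is correct. One way to handle that is to retry the test for a few minutes before concluding that something is wrong. The sketch below is a generic polling helper; `wait_until`, its parameters and the default timings are assumptions made for illustration, and the actual access check (for example, listing a container with the service principal's credentials) has to be supplied by the caller.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout_s: float = 300.0,
               interval_s: float = 15.0,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Call check() repeatedly until it returns True or timeout_s has elapsed.

    Returns True as soon as check() succeeds, False once the deadline passes.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Stand-in for a real access test: succeeds on the third attempt.
attempts = []


def fake_access_check() -> bool:
    attempts.append(1)
    return len(attempts) >= 3


ready = wait_until(fake_access_check, timeout_s=5, interval_s=0)
print("access available:", ready, "after", len(attempts), "attempts")
```

A real check should return False on an authorization failure and let other errors propagate, so that a wrong account name is not mistaken for a slow role assignment.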

Use the Service Principal when configuring Datastores in AML

In the Create Datastore interface of AML:

  • choose a datastore name that represents a specific container within a data lake storage account.
  • select the datalake and container from the subscription
  • select authentication by Service Principal
  • provide the tenant returned above as the tenant ID
  • provide the appId obtained above as the client ID
  • provide the password returned above as the client secret
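The same configuration can also be done in code instead of the interface. The sketch below assumes the Azure ML Python SDK v1 (the `azureml-core` package) and its `Datastore.register_azure_data_lake_gen2` method; the helper names `datastore_settings` and `register_datastore`, the datastore name `mydatalake_mycontainer` and the container name `mycontainer` are hypothetical and used only for illustration.

```python
def datastore_settings(datastore_name: str, container: str, account_name: str,
                       tenant_id: str, client_id: str, client_secret: str) -> dict:
    """Collect the values from the steps above under the SDK's keyword names."""
    return {
        "datastore_name": datastore_name,  # represents one container in the data lake
        "filesystem": container,           # the container within the storage account
        "account_name": account_name,      # the data lake storage account
        "tenant_id": tenant_id,            # "tenant" from the CLI output
        "client_id": client_id,            # "appId" from the CLI output
        "client_secret": client_secret,    # "password" from the CLI output
    }


def register_datastore(workspace, settings: dict):
    """Register the datastore in the given workspace (assumes azureml-core is installed)."""
    # Imported lazily so the rest of this sketch loads without the SDK.
    from azureml.core import Datastore
    return Datastore.register_azure_data_lake_gen2(workspace=workspace, **settings)


settings = datastore_settings(
    datastore_name="mydatalake_mycontainer",  # hypothetical name
    container="mycontainer",                  # hypothetical container
    account_name="mydatalake",
    tenant_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    client_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    client_secret="abcdefghijklmnopqrstuvwxyz1234567890",
)
```

With a workspace object in hand, for example from `Workspace.from_config()` in the same SDK, calling `register_datastore(ws, settings)` would perform the registration. In practice the client secret would be read from an environment variable or key vault rather than written into the script.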