Immutable Storage Accounts and Private Link Service

Immutable Storage Accounts, Private Link and Private Link Service

Introduction

This is a POC for immutability in a legal hold situation. The assumption is that an Azure Storage Account will be used as a target for images, documents etc. that need to be provably unchanged for a required legal period.

Immutability policy scope

Immutability policies can be scoped to a blob version or to a container. How an object behaves under an immutability policy depends on the scope of the policy.

  1. Time-based retention policy scope
  2. Legal hold scope

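You can configure both at the same time. Once locked, a time-based retention policy can only be extended, up to five times.

Under either policy type, blobs can be created and read, but not modified or deleted. A legal hold stays in effect until it is explicitly cleared, whilst a time-based policy expires at the end of its retention interval.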

As we need legal hold then the storage account must be either

  • General-purpose v2
  • Premium block blob

Hierarchical namespace is supported, but only with container-level scope.

Version-level scope

To configure an immutability policy that is scoped to a blob version, you must enable support for version-level immutability on either the storage account or a container.

Once version-level immutability is enabled you can configure a default policy at the account or container level, but only for time-based immutability. Legal hold must be applied on individual blobs.

Version-level immutability looks like the most flexible option, as it works on a specific blob version. However, policies can also be enabled at the container level so that new blobs (or blob versions) are automatically placed on legal hold.

Overview

On this page we'll create two storage accounts:

  1. container time-based immutability lock
  2. version level legal hold

And see what that means for REST API actions, reporting, etc.

Working directory

  1. Create a directory

    mkdir ~/immutable
  2. Move to it

    cd ~/immutable

Variables

  1. Input variables

    Customise if required.

    resource_group="immutable"
    location="uksouth"
  2. Set local defaults

    rgId="/subscriptions/$(az account show --query id -otsv)/resourceGroups/$resource_group"
    hash=$(md5sum <<< $rgId | cut -c1-12)
    
    az config set --local \
      defaults.group=$resource_group \
      defaults.location=$location \
      storage.legalhold=legalhold$hash \
      storage.timebased=timebased$hash

    The last command creates a .azure/config file in the current working directory.

    Note that storage.account and storage.auth_mode etc. are valid defaults, whereas storage.legalhold and storage.timebased are not. However the file will store those values and we can recall them easily. (Storage accounts need to be globally unique as they form part of the public endpoint's FQDN.)

  3. Suppress warnings (optional)

    This guide uses some newer features so there are warnings. If desired, these can be suppressed, but not in the local config.

    az config set core.only_show_errors=true
  4. Set auth mode

    You can also set defaults for the auth-mode, so that it purely uses RBAC role assignments. This is a good recommendation to avoid any issues with leaked access keys.

    az config set storage.auth_mode=login
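
Recalling names from the local config recurs throughout this page, so a small wrapper can cut the repetition. This is a minimal sketch: the helper names `local_setting` and `is_valid_storage_name` are hypothetical and not part of the Azure CLI. The second helper checks a derived name against the storage account naming rule of 3 to 24 lowercase letters and digits.

```shell
# Hypothetical helpers; the names are illustrative and not part of the Azure CLI.

# Print a value stored in the local .azure/config, e.g. storage.timebased.
local_setting() {
  az config get "$1" --local --query value -otsv
}

# Succeed only if the name is 3-24 characters of lowercase letters and digits.
# The characters are listed explicitly so the check does not depend on locale.
is_valid_storage_name() {
  case "$1" in
    *[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
  esac
  [ "${#1}" -ge 3 ] && [ "${#1}" -le 24 ]
}

# Example usage (commented out because it calls az):
# storage_account=$(local_setting storage.legalhold)
# is_valid_storage_name "$storage_account" || echo "Invalid name: $storage_account" >&2
```

The commands later on this page could then use `$(local_setting storage.timebased)` in place of the longer command substitution.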

Resource group

  1. Resource group and defaults

    az group create --name $(az config get defaults.group --local --query value -otsv)

Time-based immutable storage account

  1. Base storage account

    az storage account create \
        --name $(az config get storage.timebased --local --query value -otsv) \
        --allow-blob-public-access=false \
        --allow-cross-tenant-replication=true \
        --allow-shared-key-access=false \
        --https-only=true \
        --kind=StorageV2 \
        --min-tls-version=TLS1_2 \
        --public-network-access=Enabled \
        --default-action=Allow \
        --sku=Standard_LRS

    This enforces TLS 1.2 and HTTPS, and disables shared key access so that only RBAC permissions apply. Versioning is not supported on Premium_LRS.

  2. Configure network rules

    az storage account network-rule add \
        --account-name $(az config get storage.timebased --local --query value -otsv) \
        --action=Allow \
        --ip-address=$(curl -sSL https://myexternalip.com/raw)
  3. Tighten access

    az storage account update \
        --name $(az config get storage.timebased --local --query value -otsv) \
        --public-network-access=Enabled \
        --default-action=Deny \
        --bypass AzureServices

    ⚠️ This could be tightened up further.

  4. Configure data protection

    Blob versioning.

    az storage account blob-service-properties update \
        --account-name $(az config get storage.timebased --local --query value -otsv) \
        --enable-versioning true

    Soft delete for blobs.

    az storage account blob-service-properties update \
        --account-name $(az config get storage.timebased --local --query value -otsv) \
        --enable-delete-retention true \
        --delete-retention-days 7 # 1-365

    Soft delete for containers

    az storage account blob-service-properties update \
        --account-name $(az config get storage.timebased --local --query value -otsv) \
        --enable-container-delete-retention true \
        --container-delete-retention-days 7 # 1-365

    Enable the change feed.

    az storage account blob-service-properties update \
        --account-name $(az config get storage.timebased --local --query value -otsv) \
        --enable-change-feed true \
        --change-feed-days 7 # 1-146000

    This is auto-enabled when you add a backup policy.
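
To confirm that the four data protection settings took effect, the blob service properties can be read back in one call. This is a sketch only: the helper name `show_data_protection` is hypothetical, and the property names in the query are assumed to match the JSON that the CLI returns, so check them against the unfiltered output first.

```shell
# Hypothetical helper: summarise the data protection settings on one account.
# The property names in the query are assumptions about the CLI's JSON output.
show_data_protection() {
  az storage account blob-service-properties show \
    --account-name "$1" \
    --query "{versioning:isVersioningEnabled, blobSoftDelete:deleteRetentionPolicy.enabled, containerSoftDelete:containerDeleteRetentionPolicy.enabled, changeFeed:changeFeed.enabled}" \
    --output table
}

# Example usage (commented out because it calls az):
# show_data_protection "$(az config get storage.timebased --local --query value -otsv)"
```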

Consolidated commands

For convenience, here are the commands above in a single code block. Note that this version leaves soft delete for blobs and containers disabled.

storage_account=$(az config get storage.timebased --local --query value -otsv)
az storage account create \
    --name $storage_account \
    --allow-blob-public-access=false \
    --allow-cross-tenant-replication=true \
    --allow-shared-key-access=false \
    --https-only=true \
    --kind=StorageV2 \
    --min-tls-version=TLS1_2 \
    --public-network-access=Enabled \
    --default-action=Allow \
    --sku=Standard_LRS

az storage account network-rule add \
    --account-name $storage_account \
    --action=Allow \
    --ip-address=$(curl -sSL https://myexternalip.com/raw)

az storage account update \
    --name $storage_account \
    --public-network-access=Enabled \
    --default-action=Deny \
    --bypass AzureServices

az storage account blob-service-properties update \
    --account-name $storage_account \
    --enable-versioning true \
    --enable-delete-retention false \
    --enable-container-delete-retention false \
    --enable-change-feed true \
    --change-feed-days 7 \
    --enable-restore-policy false

unset storage_account

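Note that the restore policy (point-in-time restore, --enable-restore-policy) cannot be used with time-based immutability. The --enable-delete-retention switch controls blob soft delete, which point-in-time restore depends on.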

Legal hold immutable storage account

Defined at the storage account level with --enable-alw. Note the --immutability-period-in-days which sets the default immutability period for uploaded blobs. You can set these at the container level instead, but I'm assuming that these storage accounts are being created specifically for legal hold use.

Blob uploads can specify a different retention period. It is also possible to add a legal hold onto specific blob versions. I assume that the default account level time period will be used to establish base immutability across all uploaded blobs, and then the legal hold will be added to items of interest. Any uploads of a blob with the same name will create a new version.

Also note --immutability-state. This is set to unlocked so that the default policy can still be modified. Set it to locked to harden the policy; from that point the retention period can only be extended, up to five times.

storage_account=$(az config get storage.legalhold --local --query value -otsv)

az storage account create \
    --name $storage_account \
    --allow-blob-public-access=false \
    --allow-cross-tenant-replication=true \
    --allow-shared-key-access=false \
    --https-only=true \
    --kind=StorageV2 \
    --min-tls-version=TLS1_2 \
    --public-network-access=Enabled \
    --default-action=Allow \
    --sku=Standard_LRS \
    --enable-alw \
    --immutability-period-in-days 2 \
    --immutability-state unlocked \
    --allow-protected-append-writes true

az storage account network-rule add \
    --account-name $storage_account \
    --action=Allow \
    --ip-address=$(curl -sSL https://myexternalip.com/raw)

az storage account update \
    --name $storage_account \
    --public-network-access=Enabled \
    --default-action=Deny \
    --bypass AzureServices

az storage account blob-service-properties update \
    --account-name $storage_account \
    --enable-versioning true \
    --enable-delete-retention false \
    --enable-container-delete-retention false \
    --enable-change-feed true \
    --change-feed-days 7 \
    --enable-restore-policy false

Note that point in time restores cannot be configured at the same time as Azure Blobs backup. We'll use that service as the backups can also be immutable.

Azure Blobs backup

Note that Azure Blob backup uses a Backup vault rather than a Recovery Services vault. Backup vaults cover some newer backup scenarios such as Azure Database for PostgreSQL and Azure Blobs.

Using az backup vault create creates the older Recovery Services vaults. Use az dataprotection backup-vault create instead.

  1. Install the dataprotection extension

    az extension add --name dataprotection
  2. Create an Azure Backup Vault

    az dataprotection backup-vault create \
        --vault-name "immutable" \
        --storage-setting "[{type:'LocallyRedundant',datastore-type:'VaultStore'}]" \
        --azure-monitor-alerts-for-job-failures="Enabled" \
        --immutability-state="Unlocked" \
        --soft-delete-state="On" \
        --retention-duration-in-days=14 \
        --type="SystemAssigned"

    The immutability ensures that recovery points cannot be deleted from the Backup Vault before their expiry date. Valid values are Disabled, Unlocked and Locked.

    ⚠️ Setting to Locked is irreversible so consider the length of recovery points.

  3. Grant the Backup vault permissions on the storage account

    Grab the backup vault's managed identity object id.

    managed_identity=$(az dataprotection backup-vault show --vault-name "immutable" --query identity.principalId -otsv)

    Grab the time based storage account's resource id.

    storage_account=$(az config get storage.timebased --local --query value -otsv)
    storage_account_id=$(az storage account show --name $storage_account --query id -otsv)

    Add the Storage Account Backup Contributor RBAC role assignment.

    az role assignment create \
      --role "Storage Account Backup Contributor" \
      --assignee $managed_identity \
      --scope $storage_account_id
  4. Repeat for the legal hold storage account

    managed_identity=$(az dataprotection backup-vault show --vault-name "immutable" --query identity.principalId -otsv)
    storage_account_id=$(az storage account show --name $(az config get storage.legalhold --local --query value -otsv) --query id -otsv)
    az role assignment create --role "Storage Account Backup Contributor" --assignee $managed_identity --scope $storage_account_id
  5. Create a backup policy

    Create a backup_policy.json file from the template for Azure Blob.

    az dataprotection backup-policy get-default-policy-template --datasource-type AzureBlob > backup_policy.json

    Customise name and retention etc. if you wish. Example:

    {
      "datasourceTypes": [
        "Microsoft.Storage/storageAccounts/blobServices"
      ],
      "name": "MyDefaultPolicy",
      "objectType": "BackupPolicy",
      "policyRules": [
        {
          "isDefault": true,
          "lifecycles": [
            {
              "deleteAfter": {
                "duration": "P14D",
                "objectType": "AbsoluteDeleteOption"
              },
              "sourceDataStore": {
                "dataStoreType": "OperationalStore",
                "objectType": "DataStoreInfoBase"
              }
            }
          ],
          "name": "Default",
          "objectType": "AzureRetentionRule"
        }
      ]
    }

    Create the backup policy.

    az dataprotection backup-policy create --backup-policy-name DefaultPolicy \
      --policy backup_policy.json --vault-name immutable

    ⚠️ Don't use --backup-policy-name Default. (Or default.) This triggers an odd error.

  6. Configure backup for the time based storage account

    Grab the backup policy ID.

    backup_policy_id=$(az dataprotection backup-policy show --backup-policy-name DefaultPolicy \
        --vault-name immutable --query id -otsv)

    Grab the timebased storage account's resource id.

    storage_account=$(az config get storage.timebased --local --query value -otsv)
    storage_account_id=$(az storage account show --name $storage_account --query id -otsv)

    Create the backup_instance.json.

    az dataprotection backup-instance initialize --datasource-type AzureBlob \
        --policy-id $backup_policy_id --datasource-id $storage_account_id | tee timebased_backup_instance.json

    Create the backup instance.

    az dataprotection backup-instance create --vault-name immutable --backup-instance timebased_backup_instance.json
  7. Repeat for the legal hold storage account

    backup_policy_id=$(az dataprotection backup-policy show --backup-policy-name DefaultPolicy --vault-name immutable --query id -otsv)
    storage_account_id=$(az storage account show --name $(az config get storage.legalhold --local --query value -otsv) --query id -otsv)
    az dataprotection backup-instance initialize --datasource-type AzureBlob --policy-id $backup_policy_id --datasource-id $storage_account_id > legalhold_backup_instance.json
    az dataprotection backup-instance create --vault-name immutable --backup-instance legalhold_backup_instance.json

    Uses a JSON string rather than a templated file.
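
Step 5 suggests customising the retention in backup_policy.json by hand. As a sketch of a scripted alternative, the helper below rewrites the duration in place. It assumes jq is installed and that the file has the single-rule layout of the example policy in step 5; the name `set_policy_retention` is hypothetical.

```shell
# Hypothetical helper: set the retention duration in a policy file created from
# the template. Assumes jq is installed and the single-rule layout from step 5.
set_policy_retention() {
  policy_file=$1
  duration=$2   # ISO 8601 duration in whole days, e.g. P30D

  # Accept only the form P<digits>D.
  days=${duration#P}
  days=${days%D}
  case "$days" in
    '' | *[!0-9]*) echo "Expected a duration such as P30D" >&2; return 1 ;;
  esac
  if [ "P${days}D" != "$duration" ]; then
    echo "Expected a duration such as P30D" >&2
    return 1
  fi

  # Write to a temporary file first so a jq failure leaves the original intact.
  jq --arg d "$duration" \
    '.policyRules[0].lifecycles[0].deleteAfter.duration = $d' \
    "$policy_file" > "$policy_file.tmp" && mv "$policy_file.tmp" "$policy_file"
}

# Example usage:
# set_policy_retention backup_policy.json P30D
```

Run it before `az dataprotection backup-policy create` so that the policy is created with the edited value.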

Storage blob inventory

https://learn.microsoft.com/azure/storage/blobs/blob-inventory

Enable blob inventory. Supports CSV and Apache Parquet format, which is more efficient for ingestion into Azure Databricks. The daily or weekly inventory is accompanied by an Event Grid trigger.

Use rules and filters at the blob level as Content-MD5 is not supported at the container level.

Storage containers

You can create multiple containers.

It is also possible to use hierarchical namespace, but this then adds some limitations.

Time-based

You can set time based at the storage account level. Here we will do it at the container level.

The retention period must be between 1 and 146000 days. Note that you can permit new blocks to be appended to append blobs with --allow-protected-append-writes, or to both append blobs and block blobs with --allow-protected-append-writes-all.

  1. Get the storage account name

    The storage CLI commands do not seem to fully respect local config settings or environment variables.

    storage_account=$(az config get --local storage.timebased --query value -otsv)
  2. Create a container

    az storage container create --name "time-based"  \
      --account-name $storage_account --auth-mode login --public-access off
  3. Add a time based retention policy

    az storage container immutability-policy create \
        --account-name $storage_account --container-name "time-based" \
        --period 1

Once locked you can only extend up to five times.

  1. Example modification (optional)

    You can test and modify until locked.

    Get the ETag.

    etag=$(az storage container immutability-policy show \
      --account-name $storage_account --container-name time-based \
      --query etag -otsv)

    Then modify.

    az storage container immutability-policy extend \
        --account-name $storage_account --container-name time-based \
        --period 2 --if-match $etag

    ⚠️ Retest tomorrow. Getting "operation not allowed on immutability policy with current state" error, yet showing unlocked. Same day?

  2. Lock the time based policy

    Once locked you can only extend up to five times.

    Get the ETag.

    etag=$(az storage container immutability-policy show \
      --account-name $storage_account --container-name time-based \
      --query etag -otsv)

    Then lock.

    az storage container immutability-policy lock \
        --account-name $storage_account --container-name time-based \
        --if-match $etag
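
Since the extend and lock operations depend on the policy's current state and ETag, it can help to read both immediately before each change. A minimal sketch, assuming that `state` is a top-level property in the output alongside `etag`; the helper name `policy_field` is hypothetical.

```shell
# Hypothetical helper: read one property of a container's immutability policy.
# "etag" is used elsewhere on this page; "state" is an assumed property name.
policy_field() {
  az storage container immutability-policy show \
    --account-name "$1" --container-name "$2" \
    --query "$3" -otsv
}

# Example usage (commented out because it calls az):
# state=$(policy_field "$storage_account" time-based state)
# etag=$(policy_field "$storage_account" time-based etag)
# echo "Policy is $state with ETag $etag"
```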

Legal hold

  1. Get the storage account name

    storage_account=$(az config get --local storage.legalhold --query value -otsv)
  2. Create a container

    Note that this uses the container-rm subcommand.

    az storage container-rm create \
    --name legal-hold \
    --storage-account $storage_account  \
    --public-access off --enable-vlw
  3. Check (optional)

    az storage container-rm show \
        --storage-account $storage_account \
        --name legal-hold \
        --query '[immutableStorageWithVersioning.enabled]' \
        --output tsv
  4. What about one with a different default period?

    az storage container immutability-policy create \
        --account-name <storage-account> \
        --container-name <container> \
        --period <retention-interval-in-days> \
        --allow-protected-append-writes true
  5. Add a legal hold to the container - DON'T DO THIS!!!

    Legal hold can be applied at a container level, or on individual blob versions. As this is a target for storing legal hold info then suggest container level.

    az storage container legal-hold set \
      --account-name $storage_account \
      --container-name legal-hold \
      --tags tag1 tag2 \
      --allow-protected-append-writes-all true

    Note that the tags are required.
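
The earlier sections say that, with version-level immutability, legal hold is applied to individual blobs. A sketch of what that could look like with `az storage blob set-legal-hold` follows. The wrapper name `set_blob_legal_hold` is hypothetical, and it assumes the container was created with --enable-vlw and that your identity has sufficient data-plane permissions.

```shell
# Hypothetical wrapper: set or clear a legal hold on a single blob.
# Arguments: storage account, container, blob name, true|false (default true).
set_blob_legal_hold() {
  az storage blob set-legal-hold \
    --account-name "$1" --container-name "$2" --name "$3" \
    --legal-hold "${4:-true}" \
    --auth-mode login
}

# Example usage (commented out because it calls az):
# set_blob_legal_hold "$storage_account" legal-hold "my_folder/my_blob_file.ext"
# set_blob_legal_hold "$storage_account" legal-hold "my_folder/my_blob_file.ext" false
```

Passing `false` as the fourth argument clears the hold on that blob.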

Adding blobs

Working with blobs needs Storage Blob Data Owner, Storage Blob Data Contributor, or Storage Blob Data Reader. (Or a custom role.)

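  1. Get the storage account name

    The local config holds storage.legalhold and storage.timebased rather than storage.account. Use storage.timebased instead when uploading to the time-based container.

    storage_account=$(az config get --local storage.legalhold --query value -otsv)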
  2. Add yourself as a Blob Contributor.

    az role assignment create \
        --role "Storage Blob Data Contributor" \
        --scope $(az storage account show --name $storage_account --query id -otsv) \
        --assignee $(az ad signed-in-user show --query id -otsv)
  3. Calculate the md5sum

  4. Single file example

    az storage blob upload \
      --file "./Partner Admin Link - Partner Ready FAQ_April28_2022.docx" \
      --account-name $storage_account --container-name time-based --auth-mode login
  5. Batch file example

    az storage blob upload-batch \
    --source ./my_folder --pattern "*.txt" \
    --destination legal-hold --destination-path my_folder \
    --account-name $storage_account --auth-mode login

    All *.vhd files are uploaded as page blobs, everything else as block blobs. (You can control this with --type.) You can also use --if-modified-since or --if-unmodified-since with a UTC datetime. (E.g. YYYY-MM-DDThh:mmZ, or date -u -d "7 days ago" '+%Y-%m-%dT%H:%MZ')

Checksums

All files get uploaded with an automatic md5sum.

⚠️ Note that this needs checking for larger files.

  1. List blobs with checksum

    az storage blob list \
        --container-name legal-hold --account-name $storage_account \
        --query "[].{name:name, md5:properties.contentSettings.contentMd5}" \
        --auth-mode login

    Note that the contentMd5 value is a base64 encoded representation of the binary MD5 hash value.

  2. Example download

    az storage blob download --name "my_folder/my_blob_file.ext" \
        --container-name legal-hold --account-name $storage_account \
        --file "my_local_file.ext" --auth-mode login
  3. Bash checksum example

    contentMd5=$(md5sum --binary "my_local_file.ext" | awk '{print $1}' | xxd -p -r | base64)
  4. PowerShell checksum example

    $FilePath = ".\my_local_file.ext"
    $rawMD5 = (Get-FileHash -Path $FilePath -Algorithm MD5).Hash
    $hashBytes = [system.convert]::FromHexString($rawMD5)
    $contentMd5 = [system.convert]::ToBase64String($hashBytes)
  5. Filter container based on md5sum base64 value and count array length

    az storage blob list --container-name legal-hold --account-name $storage_account \
    --query "[?properties.contentSettings.contentMd5 == '$contentMd5'] | length(@)" \
    --auth-mode login

    Should return one if checksum matches, zero if not. If more than one then multiple matches.
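
The comparison can also be made locally, without another call to the service, by decoding the base64 contentMd5 value into the hex form that md5sum prints. A minimal sketch, assuming GNU coreutils (base64, od, md5sum); the helper names are hypothetical.

```shell
# Hypothetical helpers; assume GNU coreutils (base64, od, md5sum).

# Convert a base64 Content-MD5 value to the hex form printed by md5sum.
md5_base64_to_hex() {
  printf '%s' "$1" | base64 -d | od -An -v -tx1 | tr -d ' \n'
}

# Succeed if the local file's MD5 matches the base64 value from the blob listing.
# Arguments: local file path, base64 contentMd5 value.
verify_md5() {
  expected_hex=$(md5_base64_to_hex "$2")
  actual_hex=$(md5sum < "$1" | cut -d' ' -f1)
  [ -n "$expected_hex" ] && [ "$expected_hex" = "$actual_hex" ]
}

# Example usage:
# verify_md5 "my_local_file.ext" "$contentMd5" && echo "Checksum matches"
```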

Audit logging

https://learn.microsoft.com/azure/storage/blobs/immutable-legal-hold-overview#audit-logging

Each container with a legal hold in effect provides a policy audit log. The log contains the user ID, command type, time stamps, and legal hold tags. The audit log is retained for the lifetime of the policy, in accordance with the SEC 17a-4(f) regulatory guidelines.

The Azure Activity log provides a more comprehensive log of all management service activities. Azure resource logs retain information about data operations. It's the user's responsibility to store those logs persistently, as might be required for regulatory or other purposes.

TODO: Add this to a specific container or storage account.

Clear

az storage container legal-hold clear \
    --tags tag1 tag2 \
    --container-name <container> \
    --account-name <storage-account> \
    --resource-group <resource-group> \
    --auth-mode login

List blob versions

storageAccount="<storage-account>"
containerName="<container-name>"

az storage blob list \
    --container-name $containerName \
    --prefix "ab" \
    --query "[[].name, [].versionId]" \
    --account-name $storageAccount \
    --include v \
    --auth-mode login \
    --output tsv

Discarded

Legal hold

  1. Get the storage account name

    storage_account=$(az config get --local storage.legalhold --query value -otsv)
  2. Create a container

    az storage container create --name legal-hold \
      --account-name $storage_account --auth-mode login --public-access off
  3. Add a legal hold to the container - DON'T DO THIS!!!

    Legal hold can be applied at a container level, or on individual blob versions. As this is a target for storing legal hold info then suggest container level.

    az storage container legal-hold set \
      --account-name $storage_account \
      --container-name legal-hold \
      --tags tag1 tag2 \
      --allow-protected-append-writes-all true

    Note that the tags are required.
