@andaryjo
Last active March 26, 2024 20:57

Azure Tales: Resource locks don't do what you may think they do

Before we start, names can be confusing, so let's make sure that we are on the same page. By resource locks I mean the locks that lock Azure resources to prevent their modification or deletion. Depending on what Azure client you use, you might also know them as

  • "management_lock" (in the Azure Terraform provider)
  • "lock" (in the Azure Portal and the docs)
  • "ResourceLock" (in the Azure PowerShell client)
  • "Microsoft.Authorization/locks" (in the API data model)

There are already plenty of articles out there that make clear why Azure resource locks are a major footgun.

One big issue with resource locks is that they - similar to Azure policies - get evaluated purely based on the HTTP API request to the Azure Resource Manager even before that request hits the corresponding resource provider. For example, locking a Microsoft.Network/virtualNetworks resource with a ReadOnly lock would directly prevent all write API requests to that resource.

This allows the lock to work without any resource-specific implementation by the team that is responsible for VNets, but at the same time it breaks a lot of resource-specific behavior, which probably dates from the time before resource locks existed. Every feature that has been implemented using a POST request suddenly stops working, regardless of whether it is logically a write operation on that resource.
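The method-based evaluation described above can be pictured with a small shell function. This is a deliberately simplified model and not Azure's actual implementation: the function name `lock_decision` is made up for illustration, and it assumes that the decision depends on nothing but the lock level and the HTTP method.

```shell
# Simplified model of lock evaluation (an illustration, not Azure's code).
# Assumption: only the lock level and the HTTP method matter.
lock_decision() {
  level="$1"
  method="$2"
  case "$level:$method" in
    ReadOnly:GET|ReadOnly:HEAD) echo "allowed" ;;
    ReadOnly:*)                 echo "blocked" ;;  # PUT, PATCH, POST, DELETE
    CanNotDelete:DELETE)        echo "blocked" ;;
    *)                          echo "allowed" ;;
  esac
}

lock_decision ReadOnly POST      # prints "blocked", even if the POST only reads data
lock_decision CanNotDelete PUT   # prints "allowed"
```

In this model nothing about the request body or the resource type is ever inspected, which is why a POST that only reads data gets rejected under a read-only lock.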

Azure lists a lot of these cases in their resource locks documentation. For example:

A read-only lock on an Application Gateway prevents you from getting the backend health of the application gateway. That operation uses a POST method, which a read-only lock blocks.

The funniest one to me is:

A can-not-delete lock on a resource group prevents Azure Resource Manager from automatically deleting deployments in the history. If you reach 800 deployments in the history, your deployments will fail.

But there is another big issue with resource locks: You may misunderstand what they actually do.

Resource locks only prevent accidental deletions or modifications

The docs actually tell you this:

protect them from accidental user deletions and modifications

"accidental" - this is something you might easily skim past. And the docs unfortunately never go into detail about what it means. Why only "accidental"? Don't they protect resources from all deletions and modifications?

First of all, while can-not-delete locks prevent deletion of the resource group when you have locked a resource within that resource group, they don't prevent deletion of the subscription. The docs make that pretty clear:

A resource lock doesn't block the subscription cancellation. Azure preserves your resources by deactivating them instead of immediately deleting them.

But that's it, right? Easy, you might think, simply don't allow your users to cancel subscriptions then.

Protecting routes in a route table from deletion

I'd like to take a look at a specific use case: Back in the day, the team and I were building a platform based on the hub & spoke networking topology. We wanted to protect specific routes in the spoke's route table from unauthorized deletion, since without these routes spoke workloads would be able to send traffic directly to the Internet without going through our firewall. But since the spoke teams need to be able to add (and also delete) some routes on their own (for example for deploying an AKS cluster in their spokes), can-not-delete locks should only apply to some routes and not the whole route table.

Let's try this then. The following API request creates an empty route table.

$ cat ./new-rt.json
{
  "location": "westeurope"
}

$ az rest --method PUT --url "https://management.azure.com/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello?api-version=2023-09-01" --body @new-rt.json
{
  "etag": "W/\"xxx\"",
  "id": "/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello",
  "location": "westeurope",
  "name": "rt-hello",
  "properties": {
    "disableBgpRoutePropagation": false,
    "provisioningState": "Succeeded",
    "resourceGuid": "xxx",
    "routes": []
  },
  "type": "Microsoft.Network/routeTables"
}

Now let's create a route in that route table and put a lock on it.

$ cat ./new-route.json
{
  "properties": {
    "addressPrefix": "0.0.0.0/0",
    "nextHopType": "VirtualAppliance",
    "nextHopIpAddress": "10.0.0.1"
  }
}

$ az rest --method PUT --url "https://management.azure.com/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all?api-version=2023-09-01" --body @new-route.json
{
  "etag": "W/\"xxx\"",
  "id": "/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all",
  "name": "catch-all",
  "properties": {
    "addressPrefix": "0.0.0.0/0",
    "hasBgpOverride": false,
    "nextHopIpAddress": "10.0.0.1",
    "nextHopType": "VirtualAppliance",
    "provisioningState": "Succeeded"
  },
  "type": "Microsoft.Network/routeTables/routes"
}

$ cat ./new-lock.json
{
  "properties": {
    "level": "CanNotDelete"
  }
}

$ az rest --method "PUT" --url "https://management.azure.com/subscriptions/mysub/resourcegroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all/providers/Microsoft.Authorization/locks/handsoff?api-version=2016-09-01" --body @new-lock.json
{
  "id": "/subscriptions/mysub/resourcegroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all/providers/Microsoft.Authorization/locks/handsoff",
  "name": "handsoff",
  "properties": {
    "level": "CanNotDelete"
  },
  "type": "Microsoft.Authorization/locks"
}

And now we try to delete that route.

$ az rest --method DELETE --url "https://management.azure.com/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all?api-version=2023-09-01"
Conflict({"error":{"code":"ScopeLocked","message":"The scope '/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all' cannot perform delete operation because following scope(s) are locked: '/subscriptions/mysub/resourcegroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all'. Please remove the lock and try again."}})

So far, so good. But this is where it gets complicated. In Azure, there is something we call "in-line resources", which are resources that are part of the definition of other resources. For example, instead of reading that catch-all route I just created, I can read the whole route table and will retrieve a list of all routes:

$ az rest --method GET --url "https://management.azure.com/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello?api-version=2023-09-01"
{
  "etag": "W/\"xxx\"",
  "id": "/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello",
  "location": "westeurope",
  "name": "rt-hello",
  "properties": {
    "disableBgpRoutePropagation": false,
    "provisioningState": "Succeeded",
    "resourceGuid": "xxx",
    "routes": [
      {
        "etag": "W/\"xxx\"",
        "id": "/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all",
        "name": "catch-all",
        "properties": {
          "addressPrefix": "0.0.0.0/0",
          "hasBgpOverride": false,
          "nextHopIpAddress": "10.0.0.1",
          "nextHopType": "VirtualAppliance",
          "provisioningState": "Succeeded"
        },
        "type": "Microsoft.Network/routeTables/routes"
      }
    ]
  },
  "type": "Microsoft.Network/routeTables"
}

But what happens when I perform a PUT request to the whole route table and simply remove the routes from the resource?

$ cat ./update-rt.json
{
  "location": "westeurope",
  "name": "rt-hello",
  "properties": {
    "disableBgpRoutePropagation": false,
    "provisioningState": "Succeeded",
    "routes": []
  },
  "type": "Microsoft.Network/routeTables"
}

$ az rest --method PUT --url "https://management.azure.com/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello?api-version=2023-09-01" --body @update-rt.json
{
  "etag": "W/\"xxx\"",
  "id": "/subscriptions/mysub/resourceGroups/myrg/providers/Microsoft.Network/routeTables/rt-hello",
  "location": "westeurope",
  "name": "rt-hello",
  "properties": {
    "disableBgpRoutePropagation": false,
    "provisioningState": "Succeeded",
    "resourceGuid": "xxx",
    "routes": []
  },
  "type": "Microsoft.Network/routeTables"
}

Wait, did I just delete the route?

[Screenshot: azure-tales-resource-locks-portal-route]

The funny thing is, the lock is still there:

[Screenshot: azure-tales-resource-locks-portal-lock]

To explain what just happened here: By removing the route from the routes array in the route table, you implicitly deleted the route without actually needing to make an API request to the scope that has been locked by the can-not-delete lock. The resource provider deleted that route for you.
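One possible safeguard is to compare route names before sending a PUT to the route table. The following is a minimal sketch, not an official tool: `dropped_routes` is a hypothetical helper, the route name `aks-egress` is invented for the demo, and it assumes you have already written both name lists to files, one name per line.

```shell
# Print every route name that exists in the current route table but is
# missing from the PUT body, i.e. every route the PUT would implicitly delete.
#   $1: file with the current route names, one per line
#   $2: file with the route names contained in the PUT body, one per line
dropped_routes() {
  # FILENAME == ARGV[1] (instead of NR == FNR) keeps this correct
  # when the PUT body contains no routes at all.
  awk 'FILENAME == ARGV[1] { keep[$0] = 1; next } !($0 in keep)' "$2" "$1"
}

# Demo with local data only. In practice the current names could come from
#   az network route-table route list --resource-group myrg \
#     --route-table-name rt-hello --query "[].name" --output tsv
current="$(mktemp)"
new="$(mktemp)"
printf 'catch-all\naks-egress\n' > "$current"   # "aks-egress" is a made-up name
printf 'aks-egress\n' > "$new"
dropped_routes "$current" "$new"                # prints "catch-all"
rm -f "$current" "$new"
```

A check like this only helps with requests you send yourself. It does nothing to stop anyone else from sending the same PUT.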

When you look into the activity logs for the route table, you won't find a log entry for that route deletion. It just shows up as "Create or Update Route Table".

The same thing would also be possible with modifications: changing a route through a PUT request to the route table would bypass a read-only lock on that route.
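Since the lock outlives the route, one way to spot such leftovers is to derive the locked scope from the lock ID and check whether that resource still exists. This is a rough sketch: `locked_scope` is a hypothetical helper, and it assumes the lock ID contains the provider segment with exactly the casing returned in the response above.

```shell
# Derive the ID of the locked resource from a lock ID by stripping the
# trailing "/providers/Microsoft.Authorization/locks/<name>" segment.
# Assumption: the casing matches the lock ID returned by the API above.
locked_scope() {
  printf '%s\n' "${1%/providers/Microsoft.Authorization/locks/*}"
}

lock_id="/subscriptions/mysub/resourcegroups/myrg/providers/Microsoft.Network/routeTables/rt-hello/routes/catch-all/providers/Microsoft.Authorization/locks/handsoff"
locked_scope "$lock_id"

# Possible follow-up (not executed here): a GET on the printed scope that
# fails with a not-found error suggests the lock outlived its resource.
#   az rest --method GET --url "https://management.azure.com$(locked_scope "$lock_id")?api-version=2023-09-01"
```

For the lock created earlier, this prints the ID of the catch-all route, which no longer exists after the PUT to the route table.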

Resource providers do what they want

Now this bypass only applies to these "in-line resources". But actually, these resources are only part of a bigger issue with Azure resource providers.

Over the years we have seen many cases where a resource provider takes the original API request that passed the lock evaluation and then proceeds to do some implicit stuff which the Azure Resource Manager has no knowledge of, be it modifying or deleting child resources (azsh.it/22), updating networking configurations (azsh.it/107) or populating implicit defaults (azsh.it/108).

Azure policy implementations actually suffer from the same behavior.

Conclusion

You can clearly see that Azure resource locks can in fact not protect resources from unauthorized or malicious deletions or modifications, which - to be fair - the documentation never claims.

But googling for tutorials on resource locks or even asking Microsoft's Bing Copilot might paint a totally different picture. And if you're not an experienced Azure user or just don't know about resource provider quirks, you probably don't see why you shouldn't be able to use locks for exactly that.

You might think, alright, simply pay attention to those resources that are vulnerable to these bypasses. The thing is, there is no complete list for them. So far we have encountered only a handful of these resources, but I would not be surprised if I were to find new ones tomorrow.

In the end it comes down to this: Are you absolutely certain that there is no other way to modify or delete your resource than through its corresponding API endpoint?

¯\_(ツ)_/¯
