Azure Subnet Delegation: The Three Words That Break Deployments

I’ve been working with a customer who wants to migrate from Azure SQL Server to Azure SQL Managed Instance. It was the right choice for them – they want to manage multiple databases, so moving away from the DTU model, combined with the cost of running each database independently, made this a simple choice.

So, let’s go set up the deployment. We’ll just deploy it into the same subnet, as it will make life easier during the migration phase...

And it failed. So like most teams would, everyone went looking in the usual places:

  • Was it the NSG?
  • Was it the route table?
  • Was there an address space overlap somewhere?
  • Had DNS been configured incorrectly?
  • Was there some hidden policy assignment blocking the deployment?

The problem was three words in the Azure documentation that nobody on the team had flagged: “requires subnet delegation”.

And that was it. A deployment failure caused by something that takes about ninety seconds to fix when you know what you’re looking for.

The frustrating part is not that subnet delegation exists. In fairness, Azure has good reasons for it. The frustrating part is that it often surfaces as a deployment failure that sends you in entirely the wrong direction first.

Terminal Provisioning State. The Azure error for everything that tells you nothing...

This post is about what subnet delegation actually is, why it breaks deployments in ways that are surprisingly difficult to diagnose, and — more importantly — how to make sure it never catches you out again.

What is Subnet Delegation?

At the simplest level, subnet delegation is Azure’s way of saying:

this subnet belongs to this service now.

Not in the sense that you lose visibility of it. Not in the sense that you cannot still apply controls around it. But in the sense that a particular Azure service needs permission to configure aspects of that subnet in order to function properly.

The reason for this is straightforward. Some services need to apply their own network policies, routing rules, and management plane configurations to the subnet. They can’t do that reliably if other resources are competing for the same address space. So Azure introduces the concept of delegation: the subnet is formally assigned to a specific service, and that service becomes the owner.

A delegated subnet belongs to one service. That’s it. No virtual machines, no load balancers, no other PaaS services sharing the space alongside it. The subnet is reserved for the delegated service, not just partially occupied by it.

What Happens When You Get It Wrong

The moment you try to place another resource into a delegated subnet — or deploy a service that requires delegation into a subnet that hasn’t been configured for it — the deployment fails.

And Azure’s error messaging in these situations is not always helpful.

What you typically get is a generic deployment failure (I mean, what the hell does “terminal provisioning state” mean anyway???). The portal or CLI may or may not surface an error that points you at the resource configuration, and the natural instinct is to start checking the things you know: NSG rules, route tables, address space availability. These are the usual suspects in VNet troubleshooting. You work through them methodically and find nothing wrong — because nothing is wrong with them.

What you don’t immediately think to check is the delegation tab on the subnet properties. Why would you? In most VNet troubleshooting scenarios, subnet delegation never comes up. For architects who spend most of their time working with IaaS workloads, it’s simply not part of the mental checklist.

Which Services Require It?

Subnet delegation isn’t a niche requirement for obscure services. It applies to some of the most commonly deployed PaaS workloads in enterprise Azure environments.

At this point, it’s important to distinguish between “Dedicated” and “Delegated” subnets. Some Azure services, such as Bastion, Firewall and VNet Gateway, have specific naming and sizing requirements for the subnets that they live in, which means they are dedicated. A delegated subnet, by contrast, is one that has been formally assigned to a service through the delegation setting described above.

I’ve tried to summarize in the table below the services that need dedicated subnets, delegated subnets, or both. The list may not be exhaustive – I can’t find a single source of reference on Microsoft Learn or GitHub that shows which services require delegation, so my buddy Copilot may have helped with compiling this list...

| Azure service | Requirement | Notes |
|---|---|---|
| Compute & containers | | |
| Azure Kubernetes Service (AKS) with kubenet | Dedicated | Node and pod CIDRs consume the entire subnet; mixing breaks routing |
| AKS with Azure CNI | Dedicated | Each pod gets a VNet IP; subnet exhaustion risk with shared use |
| Azure Container Instances (ACI) | Delegation | Delegate to Microsoft.ContainerInstance/containerGroups |
| Azure App Service / Function App (VNet Integration) | Delegation | Delegate to Microsoft.Web/serverFarms; /26 or larger recommended |
| Azure Batch (simplified node communication) | Delegation | Delegate to Microsoft.Batch/batchAccounts |
| Networking & gateways | | |
| Azure VPN Gateway | Dedicated | Subnet must be named GatewaySubnet |
| Azure ExpressRoute Gateway | Dedicated | Also uses GatewaySubnet; can co-exist with VPN Gateway in same subnet |
| Azure Application Gateway v1/v2 | Dedicated | Subnet must contain only Application Gateway instances |
| Azure Firewall | Dedicated | Subnet must be named AzureFirewallSubnet; /26 minimum |
| Azure Firewall Management | Dedicated | Requires separate AzureFirewallManagementSubnet; /26 minimum |
| Azure Bastion | Dedicated | Subnet must be named AzureBastionSubnet; /26 minimum |
| Azure Route Server | Dedicated | Subnet must be named RouteServerSubnet; /27 minimum |
| Azure NAT Gateway | Neither | Associated via subnet property, not a formal delegation; can share subnet |
| Azure API Management (internal/external VNet mode) | Dedicated | Recommended dedicated; NSG and UDR requirements make sharing impractical |
| Databases & analytics | | |
| Azure SQL Managed Instance | Both | Dedicated subnet + delegate to Microsoft.Sql/managedInstances; /27 minimum |
| Azure Database for MySQL Flexible Server | Both | Dedicated subnet + delegate to Microsoft.DBforMySQL/flexibleServers |
| Azure Database for PostgreSQL Flexible Server | Both | Dedicated subnet + delegate to Microsoft.DBforPostgreSQL/flexibleServers |
| Azure Cosmos DB (managed private endpoint) | Delegation | Delegate to Microsoft.AzureCosmosDB/clusters for dedicated gateway |
| Azure HDInsight | Dedicated | Complex NSG rules make sharing unsafe; dedicated strongly recommended |
| Azure Databricks (VNet injection) | Both | Two dedicated subnets (public + private); delegate both to Microsoft.Databricks/workspaces |
| Azure Synapse Analytics (managed VNet) | Delegation | Delegate to Microsoft.Synapse/workspaces |
| Integration & security | | |
| Azure Logic Apps (Standard, VNet Integration) | Delegation | Delegate to Microsoft.Web/serverFarms; same as App Service |
| Azure API Management (Premium, VNet injected) | Dedicated | One subnet per deployment region; /29 or larger |
| Azure NetApp Files | Both | Dedicated subnet + delegate to Microsoft.Netapp/volumes; /28 minimum |
| Azure Machine Learning compute clusters | Dedicated | Dedicated subnet recommended to isolate training workloads |
| Azure Spring Apps | Both | Two dedicated subnets (service runtime + apps); delegate to Microsoft.AppPlatform/Spring |

If you’re building Landing Zones for enterprise workloads, you will encounter a significant number of these. Quite possibly in the same deployment cycle.

It’s also worth noting that Microsoft surfaces the delegation identifier strings (like Microsoft.Sql/managedInstances) in the portal when you configure a subnet — but only once you know to look there. These identifiers are also what you’ll specify in your IaC templates, so knowing the right string for each service before you deploy is part of the preparation work.
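As a rough sketch of that preparation work, the identifiers can be kept in a single Terraform locals map so that every subnet definition draws from the same list. The map keys and the output name are hypothetical, and the strings are simply the ones quoted in this post, so check each one against the portal's delegation dropdown before relying on it.

```hcl
locals {
  # Delegation identifiers as quoted in this post. Keys are hypothetical
  # short names chosen for illustration only.
  delegation_ids = {
    sql_managed_instance = "Microsoft.Sql/managedInstances"
    container_instances  = "Microsoft.ContainerInstance/containerGroups"
    app_service          = "Microsoft.Web/serverFarms"
    mysql_flexible       = "Microsoft.DBforMySQL/flexibleServers"
    postgresql_flexible  = "Microsoft.DBforPostgreSQL/flexibleServers"
    databricks           = "Microsoft.Databricks/workspaces"
    netapp_files         = "Microsoft.Netapp/volumes"
  }
}

# Example lookup: the identifier to use for a SQL Managed Instance subnet.
output "sql_mi_delegation" {
  value = local.delegation_ids["sql_managed_instance"]
}
```

Keeping the strings in one place means a typo gets fixed once, rather than in every module that declares a delegated subnet.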

Why This Catches Architects Out

There’s a pattern worth naming here, because it’s the reason this catches people who really should know better — including architects who’ve been working in Azure for years.

When you build a solution on Azure or any other cloud platform, you make a lot of network design decisions up front: address space, subnets, NSGs, route tables, peerings, DNS. These decisions form a mental model of the network, and that model tends to stay fairly stable once the design is locked.

Subnet delegation is easy to miss in that process because it isn’t a networking concept in the traditional sense. You’re not configuring routing, access control, or address space. You’re assigning ownership of a subnet to a service. That’s a different kind of decision, and it lives in a different part of the portal to everything else you’re configuring.

During a deployment, when the pressure is on and the clock is running, nobody goes back to check the delegation tab unless they already know delegation is the issue. And you only know delegation is the issue once you’ve already ruled out everything else.

What the Fix Actually Looks Like

Once you know what you’re looking for, the resolution is straightforward.

In the Azure Portal, navigate to the subnet, open the delegation settings, and assign the appropriate service. The delegation options available correspond to the services that support or require it — you select the right one, save, and retry the deployment.

That’s it. That’s the ninety-second fix.

In Terraform, it’s a delegation block inside the azurerm_subnet resource:

    # Goes inside the azurerm_subnet resource for the Managed Instance subnet
    delegation {
      name = "sql-managed-instance-delegation"

      service_delegation {
        name = "Microsoft.Sql/managedInstances"
        actions = [
          "Microsoft.Network/virtualNetworks/subnets/join/action",
          "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
          "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action"
        ]
      }
    }

If you’re deploying via infrastructure-as-code — which you should be for any Landing Zone work — delegation needs to be defined in the subnet configuration from the start, not added reactively when a deployment fails.
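One way to make delegation part of the design from the start is to carry it in the subnet inventory itself. The sketch below is a minimal illustration, not a reference implementation: the variable names, subnet names and address ranges are all hypothetical, and it assumes the azurerm provider, an existing resource group and virtual network, and credentials supplied through the environment.

```hcl
terraform {
  required_version = ">= 1.3" # optional() object attributes need 1.3 or later

  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

# Credentials and subscription are assumed to come from the environment
# (for example ARM_SUBSCRIPTION_ID).
provider "azurerm" {
  features {}
}

# Hypothetical inputs: an existing resource group and virtual network.
variable "resource_group_name" {
  type = string
}

variable "virtual_network_name" {
  type = string
}

# Hypothetical subnet inventory. Delegation is part of each subnet's definition,
# so it is reviewed with the address plan rather than added after a failure.
variable "subnets" {
  type = map(object({
    address_prefix = string
    delegation = optional(object({
      service = string
      actions = optional(list(string))
    }))
  }))

  default = {
    "snet-sqlmi" = {
      address_prefix = "10.0.1.0/27"
      delegation = {
        service = "Microsoft.Sql/managedInstances"
        actions = [
          "Microsoft.Network/virtualNetworks/subnets/join/action",
          "Microsoft.Network/virtualNetworks/subnets/prepareNetworkPolicies/action",
          "Microsoft.Network/virtualNetworks/subnets/unprepareNetworkPolicies/action",
        ]
      }
    }
    "snet-vms" = {
      address_prefix = "10.0.2.0/24" # no delegation: ordinary IaaS subnet
    }
  }
}

resource "azurerm_subnet" "this" {
  for_each = var.subnets

  name                 = each.key
  resource_group_name  = var.resource_group_name
  virtual_network_name = var.virtual_network_name
  address_prefixes     = [each.value.address_prefix]

  # Emit a delegation block only for subnets that declare one.
  dynamic "delegation" {
    for_each = each.value.delegation == null ? [] : [each.value.delegation]

    content {
      name = "${each.key}-delegation"

      service_delegation {
        name    = delegation.value.service
        actions = delegation.value.actions
      }
    }
  }
}
```

With this shape, a subnet's delegation shows up in the same code review as its address prefix, which is the point at which someone is most likely to ask whether a service can share it.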

Conclusion

Subnet delegation is a small concept with an outsized potential to cause problems during deployments and migrations. The key points:

  • Some PaaS services require exclusive control over a dedicated, delegated subnet
  • Deployment failures caused by missing delegation are poorly surfaced by Azure’s error messaging, which means diagnosis takes much longer than the fix
  • The services that require delegation include SQL Managed Instance, Container Instances, Databricks, Container Apps, and NetApp Files — these are common enterprise workloads, not edge cases
  • The fix, once identified, takes about ninety seconds
  • The right response is to make delegation a first-class design consideration in your subnet inventory and Landing Zone documentation

And if you haven’t hit this yet: now you’ll know what you’re looking at before the next deployment.

100 Days of Cloud – Day 63: Azure SQL Server

It’s Day 63 of my 100 Days of Cloud journey, and today I’m looking at SQL services in Azure and the different options we have for hosting SQL in Azure.

As we discussed in the previous post, SQL is an example of a Relational Database Management System (RDBMS), which follows a traditional model of storing data using 2-dimensional tables where data is stored in columns and rows in a pre-defined schema.

On-premises installations of Microsoft SQL Server would follow the traditional IaaS model, where we would install a Windows Server operating system, which provides the platform for the SQL Server database to run on.

In Azure, we have 3 options for migrating and hosting our SQL Databases.

SQL Server on Azure VM

SQL Server on Azure VM is an IaaS offering and allows you to run SQL Server inside a fully managed virtual machine (VM) in Azure.

SQL virtual machines are a good option for migrating on-premises SQL Server databases and applications without any database change.

This option is best suited where OS-level access is required. SQL virtual machines in Azure are lift-and-shift ready for existing applications that require fast migration to the cloud with minimal changes or no changes. SQL virtual machines offer full administrative control over the SQL Server instance and underlying OS for migration to Azure.

SQL Server on Azure Virtual Machines allows full control over the database engine. You can choose when to start maintenance/patching, change the recovery model to simple or bulk-logged, pause or start the service when needed, and you can fully customize the SQL Server database engine. With this additional control comes the added responsibility to manage the virtual machine.

Azure SQL Managed Instance

Azure SQL Managed Instance is a Platform-as-a-Service (PaaS) offering, and is best for most migrations to the cloud. SQL Managed Instance is a collection of system and user databases with a shared set of resources that is lift-and-shift ready.

This option is best suited to new applications or existing on-premises applications that want to use the latest stable SQL Server features and that are migrated to the cloud with minimal changes. An instance of SQL Managed Instance is similar to an instance of the Microsoft SQL Server database engine offering shared resources for databases and additional instance-scoped features.

SQL Managed Instance supports database migration from on-premises with minimal to no database change. This option provides all of the PaaS benefits of Azure SQL Database but adds capabilities that were previously only available in SQL Server VMs. This includes a native virtual network and near 100% compatibility with on-premises SQL Server. Instances of SQL Managed Instance provide full SQL Server access and feature compatibility for migrating SQL Servers to Azure.

Azure SQL Database

Azure SQL Database is a relational database-as-a-service (DBaaS) hosted in Azure that falls into the category of a PaaS offering.

This is best for modern cloud applications that want to use the latest stable SQL Server features and have time constraints in development and marketing. It is a fully managed SQL Server database engine, based on the latest stable Enterprise Edition of SQL Server. SQL Database has two deployment options:

  • As a single database with its own set of resources managed via a logical SQL server. A single database is similar to a contained database in SQL Server. This option is optimized for modern application development of new cloud-born applications. Hyperscale and serverless options are available.
  • An elastic pool, which is a collection of databases with a shared set of resources managed via a logical SQL server. Single databases can be moved into and out of an elastic pool. This option is optimized for modern application development of new cloud-born applications using the multi-tenant SaaS application pattern. Elastic pools provide a cost-effective solution for managing the performance of multiple databases that have variable usage patterns.

Overall Comparisons

Both Azure SQL Database and Azure SQL Managed Instance are optimized to reduce overall management costs since you do not have to manage any virtual machines, operating system, or database software. You do not have to manage upgrades, high availability, or backups.

Both options can dramatically increase the number of databases managed by a single IT or development resource. Elastic pools also support SaaS multi-tenant application architectures with features including tenant isolation and the ability to scale to reduce costs by sharing resources across databases. SQL Managed Instance provides support for instance-scoped features enabling easy migration of existing applications, as well as sharing resources among databases.

Finally, the database software is automatically configured, patched, and upgraded by Azure, which reduces your administration overhead.

The alternative is SQL Server on Azure VMs which provides DBAs with an experience most similar to the on-premises environment they’re familiar with. You can use any of the platform-provided SQL Server images (which includes a license) or bring your SQL Server license. All the supported SQL Server versions (2008R2, 2012, 2014, 2016, 2017, 2019) and editions (Developer, Express, Web, Standard, Enterprise) are available. However, as this is a VM, it’s up to you to update/upgrade the operating system and database software and when to install any additional software such as anti-virus.

Management

All of the above options can be managed from the Azure SQL page in the Azure Portal.


Migration

To migrate existing SQL workloads, in all cases you would use an Azure Migrate project with the Data Migration Assistant. You can find all of the scenarios relating to migration options here.

Conclusion

And that’s a look at the different options for hosting SQL on Azure. Hope you enjoyed this post, until next time – I feel like going bowling now!