“Why a Landing Zone?”: How to avoid Azure sprawl from day 1 (and still move fast)

A Landing Zone is never the first thought when a project starts. When the pressure is on to deliver something fast in Azure (or any other cloud environment), the simplest path looks like this:

  • Create a subscription
  • Throw resources into a few Resource Groups
  • Build a VNet (or two)
  • Add some NSGs
  • Ship it

It’s a good approach … for a Proof of Concept …

Here’s the problem though: POCs keep going and turn into Production environments. Because “we need to go fast…”.

What begins as speed often turns into sprawl. You don’t notice the problem until 30, 60 or 180 days later, when you’ve got multiple teams, multiple environments, and everyone has been “needing to go fast”. And it all originated from that first POC …

This post is about the pain points that appear when you skip foundations, and more importantly, how you can avoid them from day 1, using the Azure Landing Zone reference architectures as your guardrails and your blueprint.


This is always how it starts….

The business says:

“We need this workload live in Azure quickly.”

The delivery team says:

“No problem. We’ll deploy the services into a Resource Group, lock down the VNet with NSGs, and we’ll worry about the platform stuff later.”

Ops and Security quietly panic (or as per the above example, get thrown out the window….), but everyone’s under pressure, so you crack on.

At this point nobody is trying to build a mess. Everyone is “trying” to do the right thing. But the POC you build in those early days has a habit of becoming “the environment” — the one you’re still using a year later, except now it’s full of exceptions, one-off decisions, and “temporary” fixes that never got undone.


The myth: “Resource Groups + VNets + NSGs = foundation”

Resource Groups are useful. VNets are essential. NSGs absolutely have their place.

But if your “platform strategy” starts and ends there, you haven’t built a foundation — you’ve built a starting configuration.

Azure Landing Zones exist to give you that repeatable foundation: a scalable, modular architecture with consistent controls that can be applied across subscriptions as you grow.


The pain points that show up after the first few workloads

1) Governance drift (a.k.a. “every team invents their own standards”)

You start with one naming convention. Then a second team arrives and uses something else. Tags are optional, so they’re inconsistent. Ownership becomes unclear. Cost reporting turns into detective work.

Then you try to introduce standards later and discover:

  • Hundreds of resources without tags
  • Naming patterns that can’t be fixed without redeploying and breaking things
  • “Environment” means different things depending on who you ask

The best time to enforce consistency is before you have 500 things deployed. Landing Zones bring governance forward. Not as a blocker, but as a baseline: policies, conventions, and scopes that make growth predictable.
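
To make the tagging problem concrete, here is a minimal Python sketch that reports which resources in an inventory are missing required tags. The tag names (owner, environment, costCenter) and the inventory shape are assumptions made up for illustration, not an Azure API. In a real estate you would enforce this with Azure Policy rather than a script, but the logic is the same.

```python
# Hypothetical required tags; pick whatever your organisation standardises on.
REQUIRED_TAGS = {"owner", "environment", "costCenter"}


def missing_tags(resource: dict) -> set:
    """Return the required tag names that are absent or empty on a resource."""
    tags = resource.get("tags") or {}
    present = {name for name, value in tags.items() if value}
    return REQUIRED_TAGS - present


def untagged_report(resources: list) -> dict:
    """Map resource name -> sorted list of missing tags, for non-compliant resources only."""
    report = {}
    for resource in resources:
        gaps = missing_tags(resource)
        if gaps:
            report[resource["name"]] = sorted(gaps)
    return report


# Made-up inventory, shaped loosely like an exported resource list.
inventory = [
    {"name": "vm-app-01", "tags": {"owner": "team-a", "environment": "prod", "costCenter": "1001"}},
    {"name": "stlogs01", "tags": {"owner": "team-b"}},
    {"name": "vnet-poc", "tags": None},
]

print(untagged_report(inventory))
```

Running a check like this against three resources is trivial. Running it against 500 and then chasing owners for the answers is the detective work you want to avoid.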


2) RBAC sprawl (“temporary Owner” becomes permanent risk)

If you’ve ever inherited an Azure estate, you’ll recognise patterns like:

  • “Give them Owner, we’ll tighten it later.”
  • “Add this service principal as Contributor everywhere just to get the pipeline working.”
  • “We need to unblock the vendor… give them access for now.”

Fast-forward a few months and you have:

  • Too many people with too much privilege
  • No clean separation between platform access and workload access
  • Audits and access reviews that are painful and slow

This is where Landing Zones help in a very simple way. The platform team owns the platform. Workload teams own their workloads. And the boundaries are designed into the management group and subscription model, not “managed” by tribal knowledge.


3) Network entropy (“just one more VNet”)

Networking is where improvisation becomes expensive. It starts with:

  • a VNet for the first app
  • a second VNet for the next one
  • a peering here
  • another peering there
  • and then one day someone asks: “What can talk to what?”

And nobody can answer confidently without opening a diagram that looks like spaghetti.

The Azure guidance here is very clear: adopt a deliberate topology (commonly hub-and-spoke) so you centralise shared services, inspection, and connectivity patterns.


4) Subscription blast radius (“one subscription becomes the junk drawer”)

This is one of the biggest “resource group isn’t enough” realities. Resource Groups are not strong boundaries for:

  • quotas and limits
  • policy scope management at scale
  • RBAC complexity
  • cost separation across teams/products
  • incident and breach containment

When everything lives in one subscription, one bad decision has a very wide blast radius. Landing Zones push you toward using subscriptions as a unit of scale, and setting up management groups so you can apply guardrails consistently across them.


So what is a Landing Zone, practically?

In a nutshell, a Landing Zone is the foundation for everything you will do in your cloud estate in the future.

The platform team builds a standard, secure, repeatable environment. Application teams ship fast on top of it, without having to re-invent governance, networking, and security every time.

The Azure Landing Zone reference architecture is opinionated for a reason — it gives you a proven starting point that you tailor to your needs.

And it’s typically structured into two layers:

Image Credit: Microsoft

Platform landing zone

Shared services and controls, such as:

  • identity and access foundations
  • connectivity patterns
  • management and monitoring
  • security baselines

Application landing zones

Workload subscriptions where teams deploy their apps and services — with autonomy inside guardrails.

This separation is the secret sauce. The platform stays boring and consistent. The workloads move fast.


Avoiding sprawl from day 1: a simple blueprint

If you want the practical “do this first” guidance, here it is.

1) Don’t freestyle: use the design areas as your checklist

Microsoft’s Cloud Adoption Framework breaks landing zone design into clear design areas. Treat these as your “day-1 decisions” checklist.

Even if you don’t implement everything on day 1, you should decide:

  • Identity and access: who owns what, where privilege lives
  • Resource organisation: management group hierarchy and subscription model
  • Network topology: hub-and-spoke / vWAN direction, IP plan, connectivity strategy
  • Governance: policies, standards, and scope
  • Management: logging, monitoring, operational ownership

The common failure mode is building workloads first, then trying to reverse-engineer these decisions later.


2) Make subscriptions your unit of scale (and stop treating “one sub” as a platform)

If you want to avoid a single subscription becoming a dumping ground, you need a repeatable way to create new workload subscriptions with the right baseline baked in.

This is where subscription vending comes in.

Subscription vending is basically: “new workload subscriptions are created in a consistent, governed way” — with baseline policies, RBAC, logging hooks, and network integration applied as part of the process.

If you can’t create a new compliant subscription easily, you will end up reusing the first one forever… and that’s how sprawl wins.
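
As a rough sketch of the idea (not a real vending implementation), the Python below merges a workload request with a fixed baseline so that every new subscription starts from the same plan. Every name in it, including the baseline entries, the management group name and the naming pattern, is a hypothetical placeholder.

```python
from dataclasses import dataclass, field

# Hypothetical baseline; these are placeholder names, not real policy or role IDs.
BASELINE = {
    "policy_assignments": ["require-tags", "allowed-locations"],
    "role_assignments": {"workload-team": "Contributor", "security-team": "Reader"},
    "diagnostics_target": "central-log-analytics",
}


@dataclass
class SubscriptionRequest:
    workload: str
    environment: str          # e.g. "dev" or "prod"
    management_group: str     # where the new subscription should be placed
    extra_policies: list = field(default_factory=list)


def build_vending_plan(request: SubscriptionRequest) -> dict:
    """Combine a request with the baseline so every subscription starts the same way."""
    return {
        "subscription_name": f"sub-{request.workload}-{request.environment}",
        "management_group": request.management_group,
        "policy_assignments": BASELINE["policy_assignments"] + request.extra_policies,
        "role_assignments": dict(BASELINE["role_assignments"]),
        "diagnostics_target": BASELINE["diagnostics_target"],
    }


plan = build_vending_plan(SubscriptionRequest("payments", "prod", "mg-landing-zones-corp"))
print(plan["subscription_name"], plan["policy_assignments"])
```

The design point is that the requester only supplies what is specific to the workload. The baseline is not optional and is not retyped by hand each time.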


3) Choose a network pattern early (then standardise it)

Most of the time, the early win is adopting hub-and-spoke:

  • spokes for workloads
  • a hub for shared services and central control
  • consistent ingress/egress and inspection patterns

The point isn’t that hub-and-spoke is “cool” – it gives you a consistent story for connectivity and control.
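
Part of standardising the network is keeping an IP plan that never overlaps, because overlapping address spaces cannot be peered. Here is a small sketch, using Python’s standard ipaddress module, that checks a plan for clashes. The network names and ranges are invented for illustration.

```python
import ipaddress
from itertools import combinations

# Made-up address plan; the names and ranges are illustrative only.
ADDRESS_PLAN = {
    "hub": "10.0.0.0/22",
    "spoke-app1": "10.0.4.0/24",
    "spoke-app2": "10.0.5.0/24",
    "spoke-poc": "10.0.4.128/25",   # deliberately overlaps spoke-app1
}


def find_overlaps(plan: dict) -> list:
    """Return sorted pairs of network names whose address spaces overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    clashes = []
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        if net_a.overlaps(net_b):
            clashes.append(tuple(sorted((name_a, name_b))))
    return sorted(clashes)


print(find_overlaps(ADDRESS_PLAN))
```

A check like this is cheap to run in a pipeline before a new spoke is created, which is far easier than re-addressing a VNet after workloads have landed in it.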


4) Guardrails that don’t kill speed

This is where people get nervous. They hear “Landing Zone” and think bureaucracy. But guardrails are only slow when they’re manual. Good guardrails are automated and predictable, like:

  • policy baselines for common requirements
  • naming/tagging standards that are enforced early
  • RBAC patterns that avoid “Owner everywhere”
  • logging and diagnostics expectations so ops isn’t blind

This is how you enable teams to move quickly without turning your subscription into a free-for-all.


How can you actually implement this?

Don’t build it from scratch. Use the Azure Landing Zone reference architecture as your baseline, then implement via an established approach (and put it in version control from the start). The landing zone architecture is designed to be modular for exactly this reason: you can start small and evolve without redesigning everything.

Treat it like a product:

  • define what a “new workload environment” looks like
  • automate the deployment of that baseline
  • iterate over time

The goal is not to build the perfect enterprise platform on day 1; it’s to build something that won’t collapse under its own weight when you scale.


A “tomorrow morning” checklist

If you’re reading this and thinking “right, what do I actually do next?”, here are four actions that deliver disproportionate value:

  1. Decide your management group + subscription strategy
  2. Pick your network topology (and standardise it)
  3. Define day-1 guardrails (policy baseline, RBAC patterns, naming/tags, logging hooks)
  4. Set up subscription vending so new workloads start compliant by default

Do those four things, and you’ll avoid the worst kind of Azure sprawl before it starts.


Conclusion

Skipping a Landing Zone might feel like a quick win today.

But if you know the workload is going to grow — more teams, more environments, more services, more scrutiny — then the question isn’t “do we need a landing zone?”

The question is: do we want to pay for foundations now… or pay a lot more later when we (inevitably) lose control?

Hope you enjoyed this post – this is my contribution to this year’s Azure Spring Clean event organised by Joe Carlyle and Thomas Thornton. Check out the full schedule on the website!

The A-Z of Azure Policy

I’m delighted to be contributing to Azure Spring Clean for the first time. The annual event is organised by Azure MVPs Joe Carlyle and Thomas Thornton and encourages you to look at your Azure subscriptions and see how you could manage them better from a Cost Management, Governance, Monitoring and Security perspective. You can check out all of the posts in this year’s Azure Spring Clean here. For this year, my contribution is the A-Z of Azure Policy!

Azure Policy is one of the key pillars of a Well Architected Framework for Cloud Adoption. It enables you to enforce standards across either single or multiple subscriptions at different scope levels and allows you to bring both existing and new resources into compliance using bulk and automated remediation.

These policies enforce different rules and effects over your resources so that those resources stay compliant with your corporate standards and service level agreements. Azure Policy meets this need by evaluating your resources for noncompliance with assigned policies.

Image Credit: Microsoft

Policies define what you can and cannot do with your environment. They can be used individually or in conjunction with Locks to ensure granular control. Let’s look at some simple examples where Policies can be applied:

  • If you want to ensure resources are deployed only in a specific region.
  • If you want to use only specific Virtual Machine or Storage SKUs.
  • If you want to block any SQL installations.
  • If you want to enforce Tags consistently across your resources.

So that’s it – you can just apply a policy and it will do what you need it to do? The answer is both Yes and No:

  • Yes, in the sense that you can apply a policy to define a particular set of business rules to audit and remediate the compliance of existing resources against those rules.
  • No, in the sense that there is so much more to it than that.

There is much to understand about how Azure Policy can be used as part of your Cloud Adoption Framework toolbox. And because there is so much to learn, I’ve decided to do an “A-Z” of Azure Policy and show the different options and scenarios that are available.

Before we start on the A-Z, a quick disclaimer …. There’s going to be an entry for every letter of the alphabet, but you may have to forgive me if I use artistic license to squeeze a few in (Letters like Q, X and Z spring to mind!).

So, grab a coffee (or whatever drink takes your fancy) and let’s start on the Azure Policy alphabet!

A

Append is the first of our Policy Effects. It adds extra fields to a resource during creation or update, and is only available with a Resource Manager mode. The example below sets IP rules on a Storage Account:

"then": {
    "effect": "append",
    "details": [{
        "field": "Microsoft.Storage/storageAccounts/networkAcls.ipRules",
        "value": [{
            "action": "Allow",
            "value": "134.5.0.0/21"
        }]
    }]
}

Assignment is the definition of what resources or scope your Policy is being applied to.

Audit is the Policy Effect that evaluates resources and reports non-compliance in the logs. It does not take any action; it is report-only.

"then": {
    "effect": "audit"
}

AuditIfNotExists is the Policy Effect that checks whether something related to a resource is missing. For example, if the resource is a Virtual Machine, we can check whether a particular extension is present on it. If it is, the resource is returned as Compliant; if not, it is returned as Non-Compliant. The example below evaluates Virtual Machines to determine whether the Antimalware extension exists, and audits when it is missing:

{
    "if": {
        "field": "type",
        "equals": "Microsoft.Compute/virtualMachines"
    },
    "then": {
        "effect": "auditIfNotExists",
        "details": {
            "type": "Microsoft.Compute/virtualMachines/extensions",
            "existenceCondition": {
                "allOf": [{
                        "field": "Microsoft.Compute/virtualMachines/extensions/publisher",
                        "equals": "Microsoft.Azure.Security"
                    },
                    {
                        "field": "Microsoft.Compute/virtualMachines/extensions/type",
                        "equals": "IaaSAntimalware"
                    }
                ]
            }
        }
    }
}
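
To show what the rule above is checking, here is a small Python sketch that mimics its logic against a made-up VM record. It is only a simulation for illustration: the dictionary shape is an assumption, and the real evaluation is done by the Azure Policy service.

```python
def audit_if_not_exists(vm: dict) -> str:
    """Mimic the rule above: a VM is compliant only if a matching extension exists."""
    for extension in vm.get("extensions", []):
        # Both conditions must hold, mirroring the allOf in existenceCondition.
        if (extension.get("publisher") == "Microsoft.Azure.Security"
                and extension.get("type") == "IaaSAntimalware"):
            return "Compliant"
    return "NonCompliant"


# Hypothetical VM records; only the fields the rule inspects are included.
protected_vm = {
    "name": "vm-app-01",
    "extensions": [{"publisher": "Microsoft.Azure.Security", "type": "IaaSAntimalware"}],
}
bare_vm = {"name": "vm-poc-01", "extensions": []}

print(audit_if_not_exists(protected_vm), audit_if_not_exists(bare_vm))
```

Note that nothing is changed on the non-compliant VM. The effect only records the gap, which is what makes it a safe first step before moving to DeployIfNotExists.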

B

Blueprints – Instead of having to configure features like Azure Policy for each new subscription, with Azure Blueprints you can define a repeatable set of governance tools and standard Azure resources that your organization requires. This allows you to scale the configuration and organizational compliance across new and existing subscriptions with a set of built-in components that speed the development and deployment phases.

Built-In – Azure provides hundreds of built-in Policy and Initiative definitions for multiple resources to get you started. You can find them both on the Microsoft Learn site and on GitHub.

C

Compliance State shows the state of the resource when compared to the policy that has been applied. The two states you will see most often are Compliant and Non-Compliant, although others such as Exempt and Conflicting also exist.

Costs – if you are running Azure Policy on Azure resources, then it’s free. However, you can use Azure Policy to cover Azure Arc resources and there are specific scenarios where you will be charged:

  • Azure Policy guest configuration (includes Azure Automation change tracking, inventory, state configuration): $6/Server/Month
  • Kubernetes Configuration: First 6 vCPUs are free, $2/vCPU/month

Custom Policy definitions are ones that you create yourself when a Built-In Policy doesn’t meet the requirements of what you are trying to achieve.

D

Dashboards in the Azure Portal give you a graphical overview of the compliance state of your Azure environments.

Definition Location is the scope where the Policy or Initiative definition is saved. This can be a Management Group or a Subscription, and it determines the scopes at which the definition can later be assigned (see Hierarchy below).

Deny is the Policy Effect used to prevent a resource request or action that doesn’t match the defined standards.

"then": {
    "effect": "deny"
}

DeployIfNotExists is the Policy Effect used to apply the action defined in the Policy Template when a resource is found to be non-compliant. This is used as part of a remediation of non-compliant resources. Important point to note – policy assignments that use a DeployIfNotExists effect require a managed identity to perform remediation.

Docker Security Baseline is a set of default configuration settings which ensure that Docker Containers in Azure are running based on a recommended set of regulatory and security baselines.

E

Enforcement Mode is a property that allows you to enable/disable enforcement of policy effects while still evaluating compliance.

Evaluation is the process of scanning your environment to determine the applicability and compliance of assigned policies.

F

Fields are used in policy definitions to specify a property or alias. In the example below, the field property contains “location” and “type” at different stages of the evaluation:

"if": {
    "allOf": [{
            "field": "location",
            "notIn": "[parameters('listOfAllowedLocations')]"
        },
        {
            "field": "location",
            "notEquals": "global"
        },
        {
            "field": "type",
            "notEquals": "Microsoft.AzureActiveDirectory/b2cDirectories"
        }
    ]
},
"then": {
    "effect": "Deny"
}

G

GitHub – you can use GitHub to build an “Azure Policy as Code” workflow to manage your policies as code, control the lifecycle of updating definitions, and automate the process of validating compliance results.

Governance Visualizer – I have to include this because I think it’s an awesome tool – Julian Hayward’s AzGovViz tool is a PowerShell script which captures Azure governance capabilities such as Azure Policy, RBAC and Blueprints and a lot more. If you’re not using it, now is the time to start.

Group – within an Initiative, you can group policy definitions for categorization. The Regulatory Compliance feature uses this to group definitions into controls and compliance domains.

H

Hierarchy – this sounds simple but is important. The location that you assign the policy should contain all resources that you want to target under that resource hierarchy. If the definition location is a:

  • Subscription – Only resources within that subscription can be assigned the policy definition.
  • Management group – Only resources within child management groups and child subscriptions can be assigned the policy definition. If you plan to apply the policy definition to several subscriptions, the location must be a management group that contains each subscription.
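
The rule is easier to see with a toy hierarchy. The Python sketch below checks whether a target scope sits at or beneath a definition location. The management group and subscription names are hypothetical, and the child-to-parent map is a simplification of what Azure holds for you.

```python
# Hypothetical hierarchy, expressed as child -> parent. All names are made up.
PARENTS = {
    "mg-platform": "mg-root",
    "mg-landing-zones": "mg-root",
    "sub-payments-prod": "mg-landing-zones",
    "sub-connectivity": "mg-platform",
}


def can_assign(definition_location: str, target_scope: str) -> bool:
    """True if target_scope is the definition location itself or sits beneath it."""
    current = target_scope
    while current is not None:
        if current == definition_location:
            return True
        current = PARENTS.get(current)  # None once we walk past the root
    return False


print(can_assign("mg-landing-zones", "sub-payments-prod"))
print(can_assign("mg-landing-zones", "sub-connectivity"))
```

This is why definitions that need to reach several subscriptions are usually saved at a management group high enough to contain all of them.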

I

Initiative (or Policy Set) is a set of Policies that have been grouped together with the aim of either targeting a specific set of resources, or to evaluate and remediate a specific set of definitions or parameters. For example, you could group several tagging policies into a single initiative that is targeted at a specific scope instead of applying multiple policies individually.

J

JSON – Policy definitions are written in JSON format. The policy definition contains elements for:

  • mode
  • parameters
  • display name
  • description
  • policy rule
    • logical evaluation
    • effect

An example of the “Allowed Locations” built-in policy is shown below

{
  "properties": {
    "displayName": "Allowed locations",
    "policyType": "BuiltIn",
    "description": "This policy enables you to restrict the locations...",
    "mode": "Indexed",
    "parameters": {
      "listOfAllowedLocations": {
        "type": "Array",
        "metadata": {
          "description": "Locations that can be specified....",
          "strongType": "location",
          "displayName": "Allowed locations"
        }
      }
    },
    "policyRule": {
      "if": {
        "allOf": [
          {
            "field": "location",
            "notIn": "[parameters('listOfAllowedLocations')]"
          },
          {
            "field": "location",
            "notEquals": "global"
          },
          {
            "field": "type",
            "notEquals": "Microsoft.AzureActiveDirectory/b2cDirectories"
          }
        ]
      },
      "then": {
        "effect": "Deny"
      }
    }
  },
  "id": "/providers/Microsoft.Authorization/policyDefinitions/e56962a6-4747-49cd-b67b-bf8b01975c4c",
  "type": "Microsoft.Authorization/policyDefinitions",
  "name": "e56962a6-4747-49cd-b67b-bf8b01975c4c"
}

K

Key Vault – you can integrate Key Vault with Azure Policy to audit the key vault and its objects before enforcing a deny operation to prevent outages. Current built-ins for Azure Key Vault are categorized in four major groups: key vault, certificates, keys, and secrets management.

Kubernetes – Azure Policy uses Gatekeeper to apply enforcements and safeguards on your clusters (both Azure Kubernetes Service (AKS) and Azure Arc enabled Kubernetes), with compliance reported back into your centralized Azure Policy Dashboard. The add-on does the following:

  • Checks with Azure Policy service for policy assignments to the cluster.
  • Deploys policy definitions into the cluster as constraint template and constraint custom resources.
  • Reports auditing and compliance details back to Azure Policy service.

After installing the Azure Policy Add-on for AKS, you can apply individual policy definitions or initiatives to your cluster.

L

Lighthouse – for Service Providers, you can use Azure Lighthouse to deploy and manage policies across multiple customer tenants.

Linux Security Baseline is a set of default configuration settings which ensure that Linux VMs in Azure are running based on a recommended set of regulatory and security baselines.

Logical Operators are optional condition statements that can be used to see if resources have certain configurations applied. There are 3 logical operators – not, allOf and anyOf.

  • Not means that the opposite of the condition should be true for the policy to be applied.
  • AllOf requires all the conditions defined to be true at the same time.
  • AnyOf requires any one of the conditions to be true for the policy to be applied.

"policyRule": {
  "if": {
    "allOf": [{
        "field": "type",
        "equals": "Microsoft.DocumentDB/databaseAccounts"
      },
      {
        "field": "Microsoft.DocumentDB/databaseAccounts/enableAutomaticFailover",
        "equals": "false"
      },
      {
        "field": "Microsoft.DocumentDB/databaseAccounts/enableMultipleWriteLocations",
        "equals": "false"
      }
    ]
  },
  "then": {

M

Mode tells you the type of resources for which the policy will be applied. Allowed values are “All” (where Resource Groups, Subscriptions and all resource types are evaluated) and “Indexed” (where the policy is evaluated only for resource types that support tags and location).

Modify is a Policy Effect that is used to add, update, or remove properties or tags on a subscription or resource during creation or update. Important point to note – policy assignments that use a Modify effect require a managed identity to perform remediation. If you don’t have a managed identity, use Append instead. The example below adds the environment tag with a value of Test, or replaces its value if the tag already exists:

"then": {
    "effect": "modify",
    "details": {
        "roleDefinitionIds": [
            "/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
        ],
        "operations": [
            {
                "operation": "addOrReplace",
                "field": "tags['environment']",
                "value": "Test"
            }
        ]
    }
}
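
To show what addOrReplace means in practice, here is a small Python simulation of the operations list against a resource’s tags. It is a sketch for illustration only: it understands just addOrReplace and remove on tag fields, and the sample tag values are made up.

```python
def apply_operations(tags: dict, operations: list) -> dict:
    """Apply simplified Modify operations to a copy of a resource's tags."""
    result = dict(tags)
    for op in operations:
        # Extract the tag name from a field such as "tags['environment']".
        name = op["field"][len("tags['"):-len("']")]
        if op["operation"] == "addOrReplace":
            result[name] = op["value"]
        elif op["operation"] == "remove":
            result.pop(name, None)
        else:
            raise ValueError(f"Operation not covered by this sketch: {op['operation']}")
    return result


operations = [{"operation": "addOrReplace", "field": "tags['environment']", "value": "Test"}]

print(apply_operations({"owner": "team-a"}, operations))
print(apply_operations({"owner": "team-a", "environment": "Dev"}, operations))
```

In both cases the resource ends up with environment set to Test. The first call adds the tag, the second overwrites the existing value.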

N

Non-Compliant is the state which indicates that a resource did not conform to the policy rule in the policy definition.

O

OK, so this is my first failure. Surprising, but let’s keep going!

P

Parameters are used for providing inputs to the policy. They can be reused at multiple locations within the policy.

{
    "properties": {
        "displayName": "Require tag and its value",
        "policyType": "BuiltIn",
        "mode": "Indexed",
        "description": "Enforces a required tag and its value. Does not apply to resource groups.",
        "parameters": {
            "tagName": {
                "type": "String",
                "metadata": {
                    "description": "Name of the tag, such as costCenter"
                }
            },
            "tagValue": {
                "type": "String",
                "metadata": {
                    "description": "Value of the tag, such as headquarter"
                }
            }
        },
        "policyRule": {
            "if": {
                "not": {
                    "field": "[concat('tags[', parameters('tagName'), ']')]",
                    "equals": "[parameters('tagValue')]"
                }
            },
            "then": {
                "effect": "deny"
            }
        }
    }
}

Policy Rule is the part of a policy definition that describes the compliance requirements.

Policy State describes the compliance state of a policy assignment.

Q

Query Compliance – While the Dashboards in the Azure Portal (see above) provide you with a visual method of checking your overall compliance, there are a number of command line and automation tools you can use to access the compliance information generated by your policy and initiative assignments. For example, you can trigger an on-demand evaluation scan with:

  • Azure CLI using the following command:

az policy state trigger-scan --resource-group "MyRG"

  • Azure PowerShell using the following command:

Start-AzPolicyComplianceScan -ResourceGroupName 'MyRG'

R

Regulatory Compliance describes a specific type of initiative that allows grouping of policies into controls and categorization of policies into compliance domains based on responsibility (Customer, Microsoft, Shared). These are available as built-in initiatives (there are built-in initiatives from CIS, ISO, PCI DSS, NIST, and multiple Government standards), and you have the ability to create your own based on specific requirements.

Remediation is a way to handle non-compliant resources. You can create remediation tasks to bring these resources to a desired state and into compliance. You use the DeployIfNotExists or Modify effects to correct non-compliant resources.

S

Security Baseline for Azure Security Benchmark – this is a set of policies based on guidance from the Microsoft cloud security benchmark version 1.0. The full Azure Policy security baseline mapping file can be found here.

Scope is the location to which the policy definition is assigned. This can be a Management Group, Subscription, Resource Group or Resource.

T

Tag Governance is a crucial part of organizing your Azure resources into a taxonomy. Tags can be the basis for applying your business policies with Azure Policy or tracking costs with Cost Management. The template shown below shows how to enforce Tag values across your resources:

{
   "properties": {
      "displayName": "Require tag and its value",
      "policyType": "BuiltIn",
      "mode": "Indexed",
      "description": "Enforces a required tag and its value. Does not apply to resource groups.",
      "parameters": {
         "tagName": {
            "type": "String",
            "metadata": {
               "description": "Name of the tag, such as costCenter"
            }
         },
         "tagValue": {
            "type": "String",
            "metadata": {
               "description": "Value of the tag, such as headquarter"
            }
         }
      },
      "policyRule": {
         "if": {
            "not": {
               "field": "[concat('tags[', parameters('tagName'), ']')]",
               "equals": "[parameters('tagValue')]"
            }
         },
         "then": {
            "effect": "deny"
         }
      }
   },
   "id": "/providers/Microsoft.Authorization/policyDefinitions/1e30110a-5ceb-460c-a204-c1c3969c6d62",
   "type": "Microsoft.Authorization/policyDefinitions",
   "name": "1e30110a-5ceb-460c-a204-c1c3969c6d62"
}

U

Understanding how Effects work is key to understanding Azure Policy. We’ve covered most of the effects above. The key thing to remember is that each policy definition has a single effect, which determines what happens when an evaluation finds a match. There is an order in how the effects are evaluated:

  • Disabled is checked first to determine whether the policy rule should be evaluated.
  • Append and Modify are then evaluated. Since either could alter the request, a change made may prevent an audit or deny effect from triggering. These effects are only available with a Resource Manager mode.
  • Deny is then evaluated. By evaluating deny before audit, double logging of an undesired resource is prevented.
  • Audit is evaluated.
  • Manual is evaluated.
  • AuditIfNotExists is evaluated.
  • denyAction is evaluated last.

After the Resource Provider returns a success code for the request, the following 2 effects are evaluated to determine if additional logging or actions are required:

  • AuditIfNotExists
  • DeployIfNotExists
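
If several policies with different effects match the same request, the ordering above decides which one is looked at first. The Python sketch below encodes that first list as a rank table and sorts a set of effects by it. It is purely illustrative, and DeployIfNotExists is deliberately left out because it only runs in the second phase.

```python
# Rank of each effect, taken from the evaluation order listed above.
# append and modify share a rank because they are evaluated at the same step.
EFFECT_RANK = {
    "disabled": 0,
    "append": 1,
    "modify": 1,
    "deny": 2,
    "audit": 3,
    "manual": 4,
    "auditifnotexists": 5,
    "denyaction": 6,
}


def evaluation_order(effects: list) -> list:
    """Sort effect names into the order they would be evaluated (case-insensitive)."""
    unknown = [e for e in effects if e.lower() not in EFFECT_RANK]
    if unknown:
        raise ValueError(f"Effects not covered by this sketch: {unknown}")
    return sorted(effects, key=lambda e: EFFECT_RANK[e.lower()])


print(evaluation_order(["audit", "deny", "modify", "disabled"]))
```

The practical takeaway is that a Modify or Append can change a request before Deny ever sees it, so a resource you expected to be blocked may instead be corrected and allowed through.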

V

Visual Studio Code contains an Azure Policy code extension which allows you to create and modify policy definitions, run resource compliance and evaluate your policies against a resource.

W

Web Application Firewall – Azure Web Application Firewall (WAF) combined with Azure Policy can help enforce organizational standards and assess compliance at-scale for WAF resources.

Windows Security Baseline is a set of default configuration settings which ensure that Windows VMs in Azure are running based on a recommended set of regulatory and security baselines.

X

X is for ….. ah come on, you’re having a laugh ….. fine, here you go (artistic license taken!):

Xclusion – this of course should read Exclusion ….. when assigned, the scope includes all child resource containers and child resources. If a child resource container or child resource shouldn’t have the definition applied, each can be excluded from evaluation by setting notScopes.

Xemption – this of course should read Exemption …. this is a feature used to exempt a resource hierarchy or individual resource from evaluation. These resources are therefore not evaluated and can have a temporary waiver (expiration) period where they are exempt from evaluation and remediation.
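
To illustrate how an exclusion carves a hole in an assignment, here is a minimal Python sketch that decides whether a resource ID falls inside a scope once notScopes are taken into account. The subscription ID is a placeholder and the prefix matching is a simplification of how Azure resolves scopes.

```python
def in_scope(resource_id: str, scope: str, not_scopes: list) -> bool:
    """True if the resource sits under scope and under none of the excluded scopes."""
    def under(parent: str) -> bool:
        rid, p = resource_id.lower(), parent.lower().rstrip("/")
        # Require a "/" boundary so rg-sandbox does not also match rg-sandbox2.
        return rid == p or rid.startswith(p + "/")

    return under(scope) and not any(under(excluded) for excluded in not_scopes)


# Placeholder subscription ID and made-up resource group names.
SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"
EXCLUDED = [f"{SUB}/resourceGroups/rg-sandbox"]

app_storage = f"{SUB}/resourceGroups/rg-app/providers/Microsoft.Storage/storageAccounts/stapp01"
sandbox_storage = f"{SUB}/resourceGroups/rg-sandbox/providers/Microsoft.Storage/storageAccounts/stsbx01"

print(in_scope(app_storage, SUB, EXCLUDED), in_scope(sandbox_storage, SUB, EXCLUDED))
```

An exemption differs from this in that the resource stays inside the assignment scope and is recorded as exempt, optionally with an expiry date, rather than being removed from evaluation by the assignment itself.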

Y

YAML – You can use Azure DevOps to check Azure Policy compliance using YAML Pipelines. To do this, you need to use the AzurePolicyCheckGate@0 task. The syntax is shown below:

# Check Azure Policy compliance v0
# Security and compliance assessment for Azure Policy.
- task: AzurePolicyCheckGate@0
  inputs:
    azureSubscription: # string. Alias: ConnectedServiceName. Required. Azure subscription. 
    #ResourceGroupName: # string. Resource group. 
    #Resources: # string. Resource name.

Z

Zero Non-Compliant – which is exactly the position you want to get to!

Z is also for Zzzzzzzz, which may be the state you’re in if you’ve managed to get this far!

Summary

So that’s a lot to take in, but it gives you an insight into the different options that are available in Azure Policy to ensure that your Azure environments can meet both governance and cost management objectives for your organization.

In this post, I’ve stayed with the features of Azure Policy and apart from a few examples didn’t touch on the many different methods you can use to assign and manage policies which are:

  • Azure Portal
  • Azure CLI
  • Azure PowerShell
  • .NET
  • JavaScript
  • Python
  • REST
  • ARM Template
  • Bicep
  • Terraform

As always, check out the official Microsoft Learn documentation for a more in-depth deep dive on Azure Policy.

Hope you enjoyed this post! Be sure to check out the rest of the articles in this year’s Azure Spring Clean.

100 Days of Cloud – Day 46: Azure Well Architected Framework

It’s Day 46 of my 100 Days of Cloud journey, and today I’m looking at the Azure Well Architected Framework.

Over the course of my 100 Days journey so far, we’ve talked about and deployed multiple different types of Azure resources such as Virtual Machines, Network Security Groups, VPNs, Firewalls etc.

We’ve seen how easy this is to do on a Dev-based PAYG subscription like the one I’m using. However, for companies who wish to migrate to Azure, Microsoft provides a ‘Well Architected Framework’. It offers guidance to ensure that any resource or solution deployed or architected in Azure conforms to best practices around planning, design, implementation, and ongoing maintenance and improvement of the solution.

The Well Architected Framework is based on 5 key pillars:

  • Reliability – this is the ability of a system to recover from failures and continue to function, which in itself is built around 2 key values:
    • Resiliency, which returns the application to a fully functional state after a failure.
    • Availability, which defines whether users can access the workload if they need to.
  • Security – protects applications and data from threats. The first thing people think of here is “firewalls”, which protect against threats and DDoS attacks, but it’s not that simple. We need to build security into the application from the ground up. To do this, we can use the following areas:
    • Identity Management, such as RBAC roles and System Managed Identities.
    • Application Security, such as storing application secrets in Azure Key Vault.
    • Data sovereignty and encryption, which ensures the resource or workload and its underlying data is stored in the correct region and is encrypted using industry standards.
    • Security Resources, using tools such as Microsoft Defender for Cloud or Azure Firewall.
  • Cost Optimization – managing costs to maximize the value delivered. This can be achieved in the form of using tools such as:
    • Azure Cost Management to create budgets and cost alerts
    • Azure Migrate to assess the system load generated by your on-premises workloads to ensure they are correctly sized in the cloud.
  • Operational Excellence – processes that keep a system running in production. In most cases, automated deployments leave little room for human error, and can not only be deployed quickly but can also be rolled back in the event of errors or failures.
  • Performance Efficiency – this is the ability of a system to adapt to changes in load. For this, we can think of tools and methodologies such as auto-scaling, caching, data partitioning, network and storage optimization, and CDN resources in order to make sure your workloads run efficiently.

On top of all this, the Well Architected Framework has six supporting elements wrapped around it:

Diagram of the Well-Architected Framework and supporting elements.
Image Credit: Microsoft
  • Azure Well-Architected Review
  • Azure Advisor
  • Documentation
  • Partners, Support, and Services Offers
  • Reference Architectures
  • Design Principles

Azure Advisor in particular helps you follow best practices by analyzing your deployments and configuration, and recommends solutions that can help you improve the reliability, security, cost effectiveness, performance, and operational excellence of your Azure resources. You can learn more about Azure Advisor here.

I recommend anyone who is either in the process of migration or planning to start on their Cloud Migration journey to review the Azure Well Architected Framework material to understand options and best practices when designing and developing an Azure solution. You can find the landing page for Well Architected Framework here, and the Assessments page to help on your journey is here!

Hope you all enjoyed this post, until next time!

100 Days of Cloud – Day 32: AWS Cloud Practitioner Essentials Day 5

It’s Day 32 of my 100 Days of Cloud journey, and it’s my final day of learning on the AWS Skillbuilder course on AWS Cloud Practitioner Essentials.

This is the official pre-requisite course on the AWS Skillbuilder platform (which for comparison is the AWS equivalent of Microsoft Learn) to prepare candidates for the AWS Certified Cloud Practitioner certification exam.

Let’s have a quick overview of what the final modules covered, the technologies discussed and key takeaways.

Module 9 – Migration and Innovation

Module 9 covers Migration strategies and advice you can use when moving to AWS.

We dived straight into the AWS Cloud Adoption Framework (AWS CAF) and looked at the 6 Perspectives, each of which has distinct responsibilities and helps prepare the right people across your organization for the challenges ahead.

The 6 Perspectives of AWS CAF are:

  • Business – ensure that your business strategies and goals align with your IT strategies and goals.
  • People – evaluate organizational structures and roles, new skill and process requirements, and identify gaps.
  • Governance – how to update the staff skills and processes necessary to ensure business governance in the cloud.
  • Platform – uses a variety of architectural models to understand and communicate the structure of IT systems and their relationships.
  • Security – ensures that the organization meets security objectives for visibility, auditability, control, and agility.
  • Operations – defines current operating procedures and identifies the process changes and training needed to implement successful cloud adoption.

We then moved on to the 6 R’s of Migration which are:

  • Rehosting – “lift and shift” move of applications with no changes.
  • Replatforming – “lift, tinker and shift”, move of applications while making changes to optimize performance in the cloud.
  • Refactoring – re-architecting the application to use cloud-native features, usually to add features, scale or performance that are not possible in the existing environment.
  • Repurchasing – replacing the application with a different, typically cloud-based (SaaS) product rather than migrating it as-is.
  • Retaining – keeping some applications that are not suitable for migration in your existing environment.
  • Retiring – removing applications that are no longer needed

We then looked at the AWS Snow solutions (which is similar to Azure Data Box), which is where you use AWS-provided physical devices to transfer large amounts of data directly to AWS Data Centers as opposed to over the internet. These devices range in size from 8TB of storage up to 100PB, and can come in both storage and compute optimized versions.

Finally, the module looked at some of the cool innovation features available in AWS, such as:

  • Amazon Lex – based on Alexa, enables you to build conversational interfaces using voice and text.
  • Amazon Textract – machine learning that extracts data from scanned documents.
  • Amazon SageMaker – enables you to build, train and deploy machine learning models.
  • AWS DeepRacer – my favourite one! This is an autonomous 1/18 scale race car that you can use to test reinforcement learning models.

Module 10 – The Cloud Journey

Module 10 is a short one but starts by looking at the AWS Well-Architected Framework which helps you understand how to design and operate reliable, secure, efficient, and cost-effective systems in the AWS Cloud.

The Well-Architected Framework is based on five pillars: 

  • Operational excellence – the ability to run and monitor systems to deliver business value.
  • Security – the ability to protect information, systems and assets while delivering business value.
  • Reliability – the ability to automatically recover from disruptions or outages using scaling.
  • Performance efficiency – the ability to use computing resources efficiently to meet demand.
  • Cost optimization – the ability to run systems to deliver business value at the lowest cost.

Finally, we looked at the six advantages of cloud computing:

  • Trade upfront expense for variable expense – pay for only the resources you use using an OpEx model.
  • Benefit from massive economies of scale – achieve a lower variable cost by availing of aggregated costs.
  • Stop guessing capacity – no more predicting how many resources you need.
  • Increase speed and agility – flexibility to deploy applications and infrastructure in minutes, while also providing more time to experiment and innovate.
  • Stop spending money running and maintaining data centers – focus more on your applications and customers instead of overheads.
  • Go global in minutes – deploy to customers around the world

Module 11 – Exam Overview

The final module gives an overview of the AWS Certified Cloud Practitioner exam, giving a breakdown of the domains as shown below.

Image Credit – AWS Skillbuilder

The exam consists of 65 questions to be completed in 90 minutes, and the passing score is 70%. Like most exams, there are 2 types of questions:

  • A multiple-choice question has one correct response and three incorrect responses, or distractors.
  • A multiple-response question has two or more correct responses out of five or more options.

As always in any exam, the advice is:

  • Read the question in full.
  • Predict the answer before looking at the answer options.
  • Eliminate incorrect answers first.

And that’s all for today! Hope you enjoyed this mini-series of posts on AWS Core Concepts! Now I need to schedule the exam and take that first step on the AWS ladder. You should too, but more importantly, go and enroll for the course using the links at the top of the post. This is my brief summary and understanding of the Modules, but the course is well worth taking and I found it a great starting point in my AWS journey. Until next time!