The A-Z of Azure Policy

I’m delighted to be contributing to Azure Spring Clean for the first time. The annual event is organised by Azure MVPs Joe Carlyle and Thomas Thornton and encourages you to look at your Azure subscriptions and see how you could manage them better from a Cost Management, Governance, Monitoring and Security perspective. You can check out all of the posts in this year’s Azure Spring Clean here. For this year, my contribution is the A-Z of Azure Policy!

Azure Policy is one of the key pillars of a Well Architected Framework for Cloud Adoption. It enables you to enforce standards across either single or multiple subscriptions at different scope levels and allows you to bring both existing and new resources into compliance using bulk and automated remediation.

These policies enforce different rules and effects over your resources so that those resources stay compliant with your corporate standards and service level agreements. Azure Policy meets this need by evaluating your resources for noncompliance with assigned policies.

Image Credit - Microsoft


Policies define what you can and cannot do with your environment. They can be used individually or in conjunction with Locks to ensure granular control. Let’s look at some simple examples where Policies can be applied:

  • If you want to ensure resources are deployed only in a specific region.
  • If you want to use only specific Virtual Machine or Storage SKUs.
  • If you want to block any SQL installations.
  • If you want to enforce Tags consistently across your resources.

So that’s it – you can just apply a policy and it will do what you need it to do? The answer is both Yes and No:

  • Yes, in the sense that you can apply a policy to define a particular set of business rules to audit and remediate the compliance of existing resources against those rules.
  • No, in the sense that there is so much more to it than that.

There is much to understand about how Azure Policy can be used as part of your Cloud Adoption Framework toolbox. And because there is so much to learn, I’ve decided to do an “A-Z” of Azure Policy and show the different options and scenarios that are available.

Before we start on the A-Z, a quick disclaimer …. There’s going to be an entry for every letter of the alphabet, but you may have to forgive me if I use artistic license to squeeze a few in (Letters like Q, X and Z spring to mind!).

So, grab a coffee (or whatever drink takes your fancy) and let’s start on the Azure Policy alphabet!


Append is the first of our Policy Effects and is used to add extra fields to resources during update or creation. However, this is only available with Azure Resource Manager (ARM). The example below sets IP rules on a Storage Account:

"then": {
    "effect": "append",
    "details": [{
        "field": "Microsoft.Storage/storageAccounts/networkAcls.ipRules",
        "value": [{
            "action": "Allow",
            "value": ""
        }]
    }]
}

Assignment is the definition of what resources or scope your Policy is being applied to.

Audit is the Policy Effect that evaluates the resources and reports any non-compliance in the logs. It does not take any actions; this is report-only.

"then": {
    "effect": "audit"
}

AuditIfNotExists is the Policy Effect that evaluates whether a related resource or property is missing. So for example, if the type of Resource is a Virtual Machine, we can check whether that Virtual Machine has a particular tag or extension present. If yes, the resource will be returned as Compliant; if not, it will be returned as Non-Compliant. The example below evaluates Virtual Machines to determine whether the Antimalware extension exists, then audits when it is missing:

{
    "if": {
        "field": "type",
        "equals": "Microsoft.Compute/virtualMachines"
    },
    "then": {
        "effect": "auditIfNotExists",
        "details": {
            "type": "Microsoft.Compute/virtualMachines/extensions",
            "existenceCondition": {
                "allOf": [{
                        "field": "Microsoft.Compute/virtualMachines/extensions/publisher",
                        "equals": "Microsoft.Azure.Security"
                    },
                    {
                        "field": "Microsoft.Compute/virtualMachines/extensions/type",
                        "equals": "IaaSAntimalware"
                    }
                ]
            }
        }
    }
}


Blueprints – Instead of having to configure features like Azure Policy for each new subscription, with Azure Blueprints you can define a repeatable set of governance tools and standard Azure resources that your organization requires. This allows you to scale the configuration and organizational compliance across new and existing subscriptions with a set of built-in components that speed the development and deployment phases.

Built-In – Azure provides hundreds of built-in Policy and Initiative definitions for multiple resources to get you started. You can find them on both the Microsoft Learn site and GitHub.


Compliance State shows the state of the resource when compared to the policy that has been applied. The two states you will see most often are Compliant and Non-Compliant.

Costs – if you are running Azure Policy on Azure resources, then it’s free. However, you can use Azure Policy to cover Azure Arc resources and there are specific scenarios where you will be charged:

  • Azure Policy guest configuration (includes Azure Automation change tracking, inventory, state configuration): $6/Server/Month
  • Kubernetes Configuration: First 6 vCPUs are free, then $2/vCPU/month

Custom Policy definitions are ones that you create yourself when a Built-In Policy doesn’t meet the requirements of what you are trying to achieve.


Dashboards in the Azure Portal give you a graphical overview of the compliance state of your Azure environments.

Definition Location is the scope where the Policy or Initiative definition is saved, which in turn determines where it can be assigned. This can be a Management Group or a Subscription.

Deny is the Policy Effect used to prevent a resource request or action that doesn’t match the defined standards.

"then": {
    "effect": "deny"
}

DeployIfNotExists is the Policy Effect used to apply the action defined in the Policy Template when a resource is found to be non-compliant. This is used as part of a remediation of non-compliant resources. Important point to note – policy assignments that use a DeployIfNotExists effect require a managed identity to perform remediation.

Docker Security Baseline is a set of default configuration settings which ensure that Docker Containers in Azure are running based on a recommended set of regulatory and security baselines.


Enforcement Mode is a property that allows you to enable/disable enforcement of policy effects while still evaluating compliance.

Evaluation is the process of scanning your environment to determine the applicability and compliance of assigned policies.


Fields are used in policy definitions to specify a property or alias. In the example below, the field property contains “location” and “type” at different stages of the evaluation:

"if": {
    "allOf": [{
            "field": "location",
            "notIn": "[parameters('listOfAllowedLocations')]"
        },
        {
            "field": "location",
            "notEquals": "global"
        },
        {
            "field": "type",
            "notEquals": "Microsoft.AzureActiveDirectory/b2cDirectories"
        }
    ]
},
"then": {
    "effect": "Deny"
}


GitHub – you can use GitHub to build an “Azure Policy as Code” workflow to manage your policies as code, control the lifecycle of updating definitions, and automate the process of validating compliance results.

Governance Visualizer – I have to include this because I think it’s an awesome tool – Julian Hayward’s AzGovViz tool is a PowerShell script which captures Azure governance capabilities such as Azure Policy, RBAC and Blueprints and a lot more. If you’re not using it, now is the time to start.

Group – within an Initiative, you can group policy definitions for categorization. The Regulatory Compliance feature uses this to group definitions into controls and compliance domains.


Hierarchy – this sounds simple but is important. The location that you assign the policy should contain all resources that you want to target under that resource hierarchy. If the definition location is a:

  • Subscription – Only resources within that subscription can be assigned the policy definition.
  • Management group – Only resources within child management groups and child subscriptions can be assigned the policy definition. If you plan to apply the policy definition to several subscriptions, the location must be a management group that contains each subscription.


Initiative (or Policy Set) is a set of Policies that have been grouped together with the aim of either targeting a specific set of resources, or to evaluate and remediate a specific set of definitions or parameters. For example, you could group several tagging policies into a single initiative that is targeted at a specific scope instead of applying multiple policies individually.


JSON – Policy definitions are written in JSON format. The policy definition contains elements for:

  • mode
  • parameters
  • display name
  • description
  • policy rule
    • logical evaluation
    • effect

An example of the “Allowed Locations” built-in policy is shown below:

{
  "properties": {
    "displayName": "Allowed locations",
    "policyType": "BuiltIn",
    "description": "This policy enables you to restrict the locations...",
    "mode": "Indexed",
    "parameters": {
      "listOfAllowedLocations": {
        "type": "Array",
        "metadata": {
          "description": "Locations that can be specified....",
          "strongType": "location",
          "displayName": "Allowed locations"
        }
      }
    },
    "policyRule": {
      "if": {
        "allOf": [
          {
            "field": "location",
            "notIn": "[parameters('listOfAllowedLocations')]"
          },
          {
            "field": "location",
            "notEquals": "global"
          },
          {
            "field": "type",
            "notEquals": "Microsoft.AzureActiveDirectory/b2cDirectories"
          }
        ]
      },
      "then": {
        "effect": "Deny"
      }
    }
  },
  "id": "/providers/Microsoft.Authorization/policyDefinitions/e56962a6-4747-49cd-b67b-bf8b01975c4c",
  "type": "Microsoft.Authorization/policyDefinitions",
  "name": "e56962a6-4747-49cd-b67b-bf8b01975c4c"
}


Key Vault – you can integrate Key Vault with Azure Policy to audit the key vault and its objects before enforcing a deny operation to prevent outages. Current built-ins for Azure Key Vault are categorized in four major groups: key vault, certificates, keys, and secrets management.

Kubernetes – Azure Policy uses Gatekeeper to apply enforcements and safeguards on your clusters (both Azure Kubernetes Service (AKS) and Azure Arc enabled Kubernetes), with the results reported back into your centralized Azure Policy Dashboard. The Azure Policy Add-on does the following:

  • Checks with Azure Policy service for policy assignments to the cluster.
  • Deploys policy definitions into the cluster as constraint template and constraint custom resources.
  • Reports auditing and compliance details back to Azure Policy service.

After installing the Azure Policy Add-on for AKS, you can apply individual policy definitions or initiatives to your cluster.


Lighthouse – for Service Providers, you can use Azure Lighthouse to deploy and manage policies across multiple customer tenants.

Linux Security Baseline is a set of default configuration settings which ensure that Linux VMs in Azure are running based on a recommended set of regulatory and security baselines.

Logical Operators are optional condition statements that can be used to see if resources have certain configurations applied. There are 3 logical operators – not, allOf and anyOf.

  • Not means that the opposite of the condition should be true for the policy to be applied.
  • AllOf requires all the conditions defined to be true at the same time.
  • AnyOf requires any one of the conditions to be true for the policy to be applied.

"policyRule": {
  "if": {
    "allOf": [{
        "field": "type",
        "equals": "Microsoft.DocumentDB/databaseAccounts"
      },
      {
        "field": "Microsoft.DocumentDB/databaseAccounts/enableAutomaticFailover",
        "equals": "false"
      },
      {
        "field": "Microsoft.DocumentDB/databaseAccounts/enableMultipleWriteLocations",
        "equals": "false"
      }
    ]
  },
  "then": {
    ...
  }
}

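To picture how the three operators combine, here is a minimal Python sketch of a condition evaluator. It is only an illustration and not how the Azure Policy engine is implemented: the `matches` function, the flattened resource dictionary and the case-insensitive string comparison are all my own assumptions, and only the `equals` condition is handled.

```python
from typing import Any, Dict


def matches(condition: Dict[str, Any], resource: Dict[str, Any]) -> bool:
    """Return True when the resource satisfies the condition."""
    if "not" in condition:
        # not: invert the result of the nested condition
        return not matches(condition["not"], resource)
    if "allOf" in condition:
        # allOf: every nested condition must be true
        return all(matches(c, resource) for c in condition["allOf"])
    if "anyOf" in condition:
        # anyOf: at least one nested condition must be true
        return any(matches(c, resource) for c in condition["anyOf"])
    # Leaf condition. Only "equals" is handled here, compared as
    # case-insensitive strings (a simplification for this sketch).
    actual = resource.get(condition["field"])
    return str(actual).lower() == str(condition["equals"]).lower()


# The "if" block from the Cosmos DB example above.
rule_if = {
    "allOf": [
        {"field": "type", "equals": "Microsoft.DocumentDB/databaseAccounts"},
        {"field": "Microsoft.DocumentDB/databaseAccounts/enableAutomaticFailover",
         "equals": "false"},
        {"field": "Microsoft.DocumentDB/databaseAccounts/enableMultipleWriteLocations",
         "equals": "false"},
    ]
}

# Hypothetical resource, flattened to field/alias names for illustration.
cosmos_account = {
    "type": "Microsoft.DocumentDB/databaseAccounts",
    "Microsoft.DocumentDB/databaseAccounts/enableAutomaticFailover": False,
    "Microsoft.DocumentDB/databaseAccounts/enableMultipleWriteLocations": False,
}

print(matches(rule_if, cosmos_account))
```

Because all three conditions inside allOf hold for this made-up account, the rule matches and whatever effect sits in the "then" block would apply.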

Mode tells you the type of resources for which the policy will be applied. Allowed values are “All” (where all Resource Groups and Resources are evaluated) and “Indexed” (where policy is evaluated only for resources which support tags and location).

Modify is a Policy Effect that is used to add, update, or remove properties or tags on a subscription or resource during creation or update. Important point to note – policy assignments that use a Modify effect require a managed identity to perform remediation. If you don’t have a managed identity, use Append instead. The example below adds the environment tag, or replaces its existing value, with a value of Test:

"then": {
    "effect": "modify",
    "details": {
        "roleDefinitionIds": [
            ...
        ],
        "operations": [{
                "operation": "addOrReplace",
                "field": "tags['environment']",
                "value": "Test"
            }
        ]
    }
}


Non-Compliant is the state which indicates that a resource did not conform to the policy rule in the policy definition.


OK, so this is my first failure. Surprising, but let’s keep going!


Parameters are used for providing inputs to the policy. They can be reused at multiple locations within the policy.

{
    "properties": {
        "displayName": "Require tag and its value",
        "policyType": "BuiltIn",
        "mode": "Indexed",
        "description": "Enforces a required tag and its value. Does not apply to resource groups.",
        "parameters": {
            "tagName": {
                "type": "String",
                "metadata": {
                    "description": "Name of the tag, such as costCenter"
                }
            },
            "tagValue": {
                "type": "String",
                "metadata": {
                    "description": "Value of the tag, such as headquarter"
                }
            }
        },
        "policyRule": {
            "if": {
                "not": {
                    "field": "[concat('tags[', parameters('tagName'), ']')]",
                    "equals": "[parameters('tagValue')]"
                }
            },
            "then": {
                "effect": "deny"
            }
        }
    }
}

Policy Rule is the part of a policy definition that describes the compliance requirements.
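As a rough illustration of how the “Require tag and its value” rule behaves once parameter values are supplied, here is a small Python sketch. The `require_tag_and_value` function and the resource dictionaries are hypothetical and exist only for this example, and the exact-match comparison is a simplification of my own rather than the engine’s real comparison behaviour.

```python
from typing import Dict, Optional


def require_tag_and_value(resource: Dict, parameters: Dict[str, str]) -> Optional[str]:
    """Return "deny" when the rule's "if" block matches, otherwise None."""
    tag_name = parameters["tagName"]    # stands in for parameters('tagName')
    tag_value = parameters["tagValue"]  # stands in for parameters('tagValue')
    # The field expression resolves to tags[<tagName>] on the resource.
    actual = (resource.get("tags") or {}).get(tag_name)
    # "not" + "equals": the rule matches when the tag value is NOT the required one.
    if actual != tag_value:
        return "deny"
    return None


# Parameter values as they might be supplied on an assignment (made-up values).
assignment_parameters = {"tagName": "costCenter", "tagValue": "headquarter"}

tagged = {"name": "vm1", "tags": {"costCenter": "headquarter"}}
untagged = {"name": "vm2", "tags": {}}

print(require_tag_and_value(tagged, assignment_parameters))
print(require_tag_and_value(untagged, assignment_parameters))
```

The same definition can be assigned several times with different tagName and tagValue inputs, which is the point of using parameters instead of hard-coded values.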

Policy State describes the compliance state of a policy assignment.


Query Compliance – While the Dashboards in the Azure Portal (see above) provide you with a visual method of checking your overall compliance, there are a number of command line and automation tools you can use to access the compliance information generated by your policy and initiative assignments:

  • Azure CLI using the following command:

az policy state trigger-scan --resource-group "MyRG"

  • Azure PowerShell using the following command:

Start-AzPolicyComplianceScan -ResourceGroupName 'MyRG'


Regulatory Compliance describes a specific type of initiative that allows grouping of policies into controls and categorization of policies into compliance domains based on responsibility (Customer, Microsoft, Shared). These are available as built-in initiatives (there are built-in initiatives from CIS, ISO, PCI DSS, NIST, and multiple Government standards), and you have the ability to create your own based on specific requirements.

Remediation is a way to handle non-compliant resources. You can create remediation tasks for resources to bring these to a desired state and into compliance. You use the DeployIfNotExists or Modify effects to correct the resources that violate your policies.


Security Baseline for Azure Security Benchmark – this is a set of policies that comes from guidance from the Microsoft cloud security benchmark version 1.0. The full Azure Policy security baseline mapping file can be found here.

Scope is the set of resources that a policy assignment applies to. This can be a Management Group, Subscription, Resource Group or Resource.


Tag Governance is a crucial part of organizing your Azure resources into a taxonomy. Tags can be the basis for applying your business policies with Azure Policy or tracking costs with Cost Management. The template shown below shows how to enforce Tag values across your resources:

{
   "properties": {
      "displayName": "Require tag and its value",
      "policyType": "BuiltIn",
      "mode": "Indexed",
      "description": "Enforces a required tag and its value. Does not apply to resource groups.",
      "parameters": {
         "tagName": {
            "type": "String",
            "metadata": {
               "description": "Name of the tag, such as costCenter"
            }
         },
         "tagValue": {
            "type": "String",
            "metadata": {
               "description": "Value of the tag, such as headquarter"
            }
         }
      },
      "policyRule": {
         "if": {
            "not": {
               "field": "[concat('tags[', parameters('tagName'), ']')]",
               "equals": "[parameters('tagValue')]"
            }
         },
         "then": {
            "effect": "deny"
         }
      }
   },
   "id": "/providers/Microsoft.Authorization/policyDefinitions/1e30110a-5ceb-460c-a204-c1c3969c6d62",
   "type": "Microsoft.Authorization/policyDefinitions",
   "name": "1e30110a-5ceb-460c-a204-c1c3969c6d62"
}


Understanding how Effects work is key to understanding Azure Policy. By now, we’ve covered most of the effects above. The key thing to remember is that each policy definition has a single effect, which determines what happens when an evaluation finds a match. There is an order in how the effects are evaluated:

  • Disabled is checked first to determine whether the policy rule should be evaluated.
  • Append and Modify are then evaluated. Since either could alter the request, a change made may prevent an audit or deny effect from triggering. These effects are only available with a Resource Manager mode.
  • Deny is then evaluated. By evaluating deny before audit, double logging of an undesired resource is prevented.
  • Audit is evaluated.
  • Manual is evaluated.
  • AuditIfNotExists is evaluated.
  • denyAction is evaluated last.

Once the resource request has gone through successfully, the following 2 effects are run to determine if additional logging or actions are required:

  • AuditIfNotExists
  • DeployIfNotExists
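If it helps to see that ordering as data, here is a short Python sketch that sorts a handful of effects into the request-time order listed above. The names `REQUEST_TIME_STAGES`, `evaluation_stage` and `in_evaluation_order` are made up for this illustration, and the sketch only models the ordering, not what each effect does.

```python
from typing import List

# Request-time stages, in the order listed above. Append and Modify share a stage.
REQUEST_TIME_STAGES = [
    {"disabled"},
    {"append", "modify"},
    {"deny"},
    {"audit"},
    {"manual"},
    {"auditifnotexists"},
    {"denyaction"},
]


def evaluation_stage(effect: str) -> int:
    """Return the position of an effect in the request-time ordering."""
    name = effect.lower()
    for index, stage in enumerate(REQUEST_TIME_STAGES):
        if name in stage:
            return index
    # DeployIfNotExists only appears in the follow-up pair, so it lands here.
    raise ValueError(f"effect not in the request-time ordering: {effect}")


def in_evaluation_order(effects: List[str]) -> List[str]:
    """Sort effects from several assignments into evaluation order."""
    return sorted(effects, key=evaluation_stage)


print(in_evaluation_order(["Audit", "Deny", "Modify", "Disabled"]))
# prints ['Disabled', 'Modify', 'Deny', 'Audit']
```

This makes it easier to see why a Modify on one assignment can change a request before a Deny on another assignment ever looks at it.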


Visual Studio Code contains an Azure Policy code extension which allows you to create and modify policy definitions, run resource compliance and evaluate your policies against a resource.


Web Application Firewall – Azure Web Application Firewall (WAF) combined with Azure Policy can help enforce organizational standards and assess compliance at-scale for WAF resources.

Windows Security Baseline is a set of default configuration settings which ensure that Windows VMs in Azure are running based on a recommended set of regulatory and security baselines.


X is for ….. ah come on, you’re having a laugh ….. fine, here you go (artistic license taken!):

Xclusion – this of course should read Exclusion ….. when assigned, the scope includes all child resource containers and child resources. If a child resource container or child resource shouldn’t have the definition applied, each can be excluded from evaluation by setting notScopes.
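Here is a minimal Python sketch of how an exclusion narrows an assignment. It assumes scopes at subscription level and below, where a child’s resource ID starts with its parent’s ID. Management groups don’t follow that pattern, so they are out of scope for this example. The function names and the resource IDs are hypothetical.

```python
from typing import Iterable


def is_under(resource_id: str, scope: str) -> bool:
    """True when resource_id is the scope itself or a child of it."""
    rid = resource_id.rstrip("/").lower()
    parent = scope.rstrip("/").lower()
    # Match on a path boundary so "rg-sandbox" does not match "rg-sandbox2".
    return rid == parent or rid.startswith(parent + "/")


def is_evaluated(resource_id: str, assignment_scope: str,
                 not_scopes: Iterable[str] = ()) -> bool:
    """True when the resource is inside the scope and not excluded."""
    if not is_under(resource_id, assignment_scope):
        return False
    return not any(is_under(resource_id, excluded) for excluded in not_scopes)


# Made-up IDs for illustration only.
subscription = "/subscriptions/00000000-0000-0000-0000-000000000000"
sandbox_rg = subscription + "/resourceGroups/rg-sandbox"
sandbox_vm = sandbox_rg + "/providers/Microsoft.Compute/virtualMachines/vm1"
prod_vm = subscription + "/resourceGroups/rg-prod/providers/Microsoft.Compute/virtualMachines/vm2"

print(is_evaluated(sandbox_vm, subscription, [sandbox_rg]))
print(is_evaluated(prod_vm, subscription, [sandbox_rg]))
```

With the sandbox Resource Group listed in notScopes, the sandbox VM drops out of evaluation while the production VM in the same subscription is still evaluated.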

Xemption – this of course should read Exemption …. this is a feature used to exempt a resource hierarchy or individual resource from evaluation. These resources are therefore not evaluated and can have a temporary waiver (expiration) period where they are exempt from evaluation and remediation.


YAML – You can use Azure DevOps to check Azure Policy compliance using YAML Pipelines. To do this, you need to use the AzurePolicyCheckGate@0 task. The syntax is shown below:

# Check Azure Policy compliance v0
# Security and compliance assessment for Azure Policy.
- task: AzurePolicyCheckGate@0
  inputs:
    azureSubscription: # string. Alias: ConnectedServiceName. Required. Azure subscription. 
    #ResourceGroupName: # string. Resource group. 
    #Resources: # string. Resource name.


Zero Non-Compliant – which is exactly the position you want to get to!

Z is also for Zzzzzzzz, which may be the state you’re in if you’ve managed to get this far!


So that’s a lot to take in, but it gives you an insight into the different options that are available in Azure Policy to ensure that your Azure environments can meet both governance and cost management objectives for your organization.

In this post, I’ve stayed with the features of Azure Policy and apart from a few examples didn’t touch on the many different methods you can use to assign and manage policies which are:

  • Azure Portal
  • Azure CLI
  • Azure PowerShell
  • .NET
  • JavaScript
  • Python
  • REST
  • ARM Template
  • Bicep
  • Terraform

As always, check out the official Microsoft Learn documentation for a more in-depth deep dive on Azure Policy.

Hope you enjoyed this post! Be sure to check out the rest of the articles in this year’s Azure Spring Clean.


Can we prevent Cloud Repatriation in Azure?

I’ve seen a lot of articles in the last few months talking about Cloud Repatriation, so I’ve decided to look into this more and find out more about:

  • What is Cloud Repatriation?
  • Why is it suddenly a topic?
  • Why it’s not as easy as it sounds?
  • How did this happen in the first place?
  • Why it should never have become an issue?

What is Cloud Repatriation?

Let’s start with the easy question and look for the definition of what it is. Repatriation is a term that has been around for a while and is defined in its simplest form as:

“the process of returning a thing or a person to its place of origin”

So if we take that definition and apply it to technology, Cloud Repatriation is the process of companies moving their services out of Microsoft Azure (or other Public Cloud providers such as AWS or GCP) and relocating those services back to the On-Premises or Private Cloud environments that they originated from.

Why is it suddenly a topic?

One word – cost. The cost of running a Cloud Computing environment isn’t the same as running an On-Premises environment.

In an On-Premises environment, we work with predictable cost models when it comes to Equipment, Licensing and Staffing costs. The only variable is Power which is in a constant state of flux and change. This leads us down the CapEx route which forces companies into predicting the costs involved over a 3-5 year period. Finance people love this as it means they can safely predict future costs and budgets, and not have to worry about unexpected charges affecting their balance sheets.

The first part of that previous paragraph is debatable, though. Unless your company is static with zero growth projections (and let’s be honest, no company is), it’s going to be difficult to predict costs over a period of years:

  • How many servers will you need to run your estate? If you order too few, you’ll need to buy more and your CFO won’t like that after you told them that these were the only costs needed for the next 3 years.
  • If you order too many, it’s overspend and equipment/license wastage and you may not be approved for additional equipment in your next Budget cycle (which leads you to use unsupported and out of warranty equipment that may lead to more costs to keep that operational).
  • You may have also hired either too few staff (leading to overwork and burnout) or too many staff (which leads to idleness and ultimately reducing the workforce).

Cloud Computing environments use OpEx, which works differently in that it uses a Pay-As-You-Use model. You use a Cloud Service and are billed monthly for the cost of using it. You have options to scale the service up or down as required, and you can also purchase Reserved Instances or Savings Plans over a 1 or 3 year period in order to reduce the costs and have that “CapEx-feel” to Cloud Computing.

The problem is that there is no clearly defined way of keeping those costs consistent, and Microsoft’s recent announcement on price increases for European Customers (and depending on your currency, this was as much as 15%) has meant that CFOs and CTOs are scrambling to look at alternative solutions to the Cloud.

And in some cases, the word “Repatriation” has been thrown about and the question being asked is “were we wrong to move to Azure/AWS/GCP, and should we look to move our servers and data back?”

Why it’s not as easy as it sounds?

So you want to move back? It sounds easy, and if your Cloud Migration involved only a “Lift And Shift” or Rehost (where you migrated your VMs as-is and made no modifications to them), then fire away! Buy your equipment, install your favourite hypervisor and off you go! There are 3rd party products (such as Carbon) on the market that will bring your VMs back to either VMware or Hyper-V.

You can also migrate Office365 mailboxes back to On-Premises Exchange Servers by setting up a migration batch in EAC, so that process is simple.

But what if you did more than just Rehost? Let’s remind ourselves of the 5 R’s of Cloud Rationalization:

  • Rehost – also known as Lift and Shift.
  • Refactor – customizing your apps and infrastructure to align with the Cloud.
  • Rearchitect – divides your app into different parts or MicroServices.
  • Rebuild – completely rebuild and redevelop your app.
  • Replace – completely replace the app with a cloud-native SaaS application.

If you’ve done anything more than Rehost during your migration to Azure, then you have a bit of work on your hands getting it back. It’s not impossible by any means but as with all Cloud Services, it’s a lot easier to get them into the Cloud than it is to get them out. If you’ve redesigned your app to make it Cloud-Native using any of the other 4 “R’s”, then you need to realise that you will have to recreate that environment On-Premises, and that may not be easy and may cost a lot more than running the service in Azure in the first place!

How did this happen in the first place?

To work out why this should never have become an issue, we need to go back through the mists of time and work out why the migrations happened in the first place. It was most likely down to either:

  • Running old and unsupported hardware.
  • Complex systems that were difficult to manage and maintain.
  • Enhanced Security.
  • Easier Scalability of services.

And if you moved to Azure, it’s likely that you used either:

  • Azure Site Recovery (and were using Azure as a DR platform to initially test how your VMs would work).
  • Azure Migrate (where you ran a discovery assessment on the load of your VMs over a period of time up to 30 days, and used that assessment as a means of sizing your target Azure VMs).

The original version of Azure Migrate only supported migration of VMware VM workloads to Azure. The new version (released in November 2019) included Database and Web Server migration features, and Application Discovery.

In all likelihood, some companies went down the same route as the initial Office365 migrations (where they only migrated Email and never used any of the other underlying services included in their licenses), and in doing their Cloud Migrations to Azure decided to effectively “Rehost-only” and not use the additional benefits that were available. So instead of running Web Servers or Applications as part of an Azure App Service, they may have been left running on VMs with underlying Web or App Services.

Another good example here is the Finance or Warehouse Management Application that ran on a VM and also required a dedicated SQL backend (that also ran on a VM). Instead of refactoring that into an App Service or a Serverless SQL Database, it was left running on VMs in Azure. We all know that these VMs have spikes at certain times every month, so in that case the scalability that could have offered cost savings wasn’t implemented.

Why it should never have become an issue?

There are a number of contributing factors why Cloud Computing costs can spiral out of control. I’ve made the case for these below, and in some cases what can be done to address them:

  • Azure Reserved Instances – this is what Finance people love as they offer immediate savings and some semblance of how they can “CapEx their OpEx” costs over a longer period of time.
  • Azure Cost Management – Setting a budget or at least budget alerts on monthly spend can at least give you an indication of where you are each month. If you’re getting budget alert emails on the 10th of each month, then you haven’t got either your budget or your Service SKUs and Sizing right.
  • Azure Policy – have you set policies to say that you can only have certain VM SKUs, running on certain disk types, in certain regions?
  • RBAC Roles – this is the most important one and the biggest factor in “spend-creep”. Who can do what in your Azure Subscription? For example, have you granted developers Owner access in their own Resource Group so they can spin up what they want? Changing a SKU on a VM is a single-click operation, as is changing Disk type from HDD to SSD, redundancy from LRS to GRS etc. And do the policies you have set above apply across the subscription or have you exclusions set somewhere? It comes down to having control of your environment and assigning the correct roles.
  • Assessments – OK, this is an “after the horse has bolted” scenario, but it’s never too late to do it. Ask questions like why did you move in the first place, and does it align with business goals, strategy and governance objectives?
  • Azure Advisor – it’s there, on every resource you are running in Azure and also as its own page in the portal, giving you recommendations based on over/under consumption and how you can address this.
  • Backup/DR – this has long been a bone of contention for some companies and I’ve experienced some who see Cloud-based backup solutions as either unnecessary or too expensive (because being in the cloud means we don’t need Backup or DR, right?).


I’ve based this article purely on costs and how you can utilize the various Tools, Policies and Governance tools available in Azure that can help make final decisions on whether Cloud Repatriation is the right choice for your business.

Hope you enjoyed this post, until next time!


Control your Azure Virtual Desktop costs with Scaling Plans

Cloud Computing has changed the way we approach our enterprise infrastructure.

The number of options available to us now means that we can finally ditch that dusty old server sitting at the bottom of the server rack (or in some cases at the back of a cupboard) for a modern secure solution that we don’t need to sit and pray in front of every time we need to restart it.

The Problem with the Cloud

But …. some people would prefer to keep old “Dusty Springfield” alive because the effort to migrate and in some cases re-architect the service is too much and too costly. And that’s the thing we hear the most when a suggestion to migrate to a cloud service is raised – “the cloud is very expensive…”.

And let’s be honest, it is …..

Money money money ……

There, I said it. Out Loud. In Print. Cloud Computing is expensive. There’s a helicopter hovering over my house at the minute but I’m sure it’s nothing to worry about ……

In all seriousness though, when scoping out a Cloud solution the first thing that is looked at is cost. You can argue as much as you want about the redundancy, the lower power and cooling costs, lack of hardware costs etc. The bean counters will look at the bottom line and say “we’re not paying that much now….”. And “Dusty Springfield” limps on defiantly in the corner.

Of course, your cloud computing costs are defined by the options you select and what level of redundancy you need. Scale Sets, Storage redundancy across zones and regions. Or just keep it as locally redundant storage? Then you get into the sizing of your solutions.

How the Costs add up

Azure Virtual Desktop is one of those cool technologies that can help you provide a secure environment for your users to access Cloud or Hybrid environments in a consistent and unified experience. But because it’s built on underlying VMs which you need to size based on your requirements, the costs can mount up.

Let’s take a look at an example of a standard Azure Virtual Desktop host pool that contains 10 Session Hosts which are delivering Remote Apps to 100 users. The Session Hosts are generally sized from the General Purpose VM type and the most common one used is the “Standard_D4s_v3”, which has 4 vCPUs and 16GB memory.

The base cost for this VM if you create a standard Azure Virtual machine comes in at approx $160 per month.

Standard Virtual Machine Type

However, if we use this VM type for our Azure Virtual Desktop Session Hosts with Windows 10 Enterprise Multi-Session version 21H2 with Microsoft 365 Apps installed, the cost then jumps to $290 per month.

Azure Virtual Desktop Virtual Machine Type

So, let’s go back to our 10 Session Hosts – at that price we’re talking $2,900 per month, or just under $35,000 per year. And that’s for just 10 VMs in the environment. And that’s why Cloud Computing is expensive! Of course, this doesn’t take into account reserved instances or spot instances, but you get the idea.

The $290 per month cost for a VM isn’t a flat monthly fee – it’s based on 730 hours of usage, or 24 hours multiplied by just over 30 days. This is where you can start cutting into that $35,000 per year cost, and where Scaling Plans applied to your Azure Virtual Desktop Host Pools can help.
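To make that arithmetic concrete, here is a small Python sketch of the baseline figures. It only uses the approximate numbers quoted above ($290 per month, 730 hours, 10 Session Hosts); the variable names are mine and the prices are illustrative, not a quote from the Azure pricing calculator.

```python
# Approximate figures from this post (illustrative, not official pricing).
MONTHLY_PRICE_USD = 290   # one AVD session host running all month
HOURS_PER_MONTH = 730     # the usage the monthly price is based on
SESSION_HOSTS = 10

# What one hour of a powered-on session host effectively costs.
hourly_rate = MONTHLY_PRICE_USD / HOURS_PER_MONTH

# Baseline: every host powered on for every hour of the year.
monthly_total = MONTHLY_PRICE_USD * SESSION_HOSTS
yearly_total = monthly_total * 12

print(f"Hourly rate per host: ${hourly_rate:.2f}")
print(f"Monthly total: ${monthly_total:,}")  # $2,900
print(f"Yearly total:  ${yearly_total:,}")   # $34,800
```

The point of working out the hourly rate is that every hour a host is deallocated is an hour you are not paying compute charges for.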

Scaling Plans

Scaling Plans let you scale your session host virtual machines (VMs) in a host pool up or down to optimize deployment costs. You can create a scaling plan based on:

  • Time of day
  • Specific days of the week
  • Session limits per session host

Follow the guidelines below when creating your scaling plan:

  • At the time of writing, you can only configure autoscale with existing Pooled host pools. This won’t work with Personal host pools.
  • You must create the scaling plan in the same Azure region as the host pool you assign it to.
  • All host pools you use with autoscale must have a configured MaxSessionLimit parameter. Don’t use the default value.
  • You must grant Azure Virtual Desktop access to manage the power state of your session host VMs.

Create a custom RBAC role

Now that we know the benefits and rules, the first thing we need to do is create a custom RBAC role. This custom role and assignment will allow Azure Virtual Desktop to manage the power state of any VMs in those subscriptions. It will also let the service apply actions on both host pools and VMs when there are no active user sessions.

The steps for creating the Custom RBAC Role are as follows (this is the same for creating any Custom RBAC Role):

  • First, create a json file using whatever your favourite editor is (I’m using Sublime in this example). Save the file as avdscale.json and add the following information into it:
  • Open the Azure portal and go to Subscriptions and select a subscription that contains a host pool and session host VMs you want to use with autoscale. Select Access control (IAM). Select the + Add button, then select Add custom role from the drop-down menu.

  • On the “Basics” screen, go to Baseline permissions and browse to the avdscale.json file that you just created.
  • This will import all of your settings, so on the next screen you will see the permissions that you had specified in your json file.
  • Next, we have “Assignable Scopes”. You want to assign this at subscription level as assigning this custom role at any level lower than your subscription, such as the resource group, host pool, or VM, will prevent autoscale from working properly.
  • We can now skip to the “Review and Create” screen, as this will validate and list out our permissions for the RBAC role. Review these and then click “Create”:
  • And once that’s created, we can see it’s listed as a Custom Role:

  • Now we need to add a Role Assignment for our RBAC Role. So we click on “Add role assignment”

  • We select our Custom RBAC role and in the members screen, we choose to assign access to a User, group or service principal. From the select members screen, search for “Windows Virtual Desktop”
  • Go to “Review and Assign” and click create:
  • And we can see that at subscription level the role has been assigned:
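The contents of avdscale.json aren’t reproduced in the text above, so as a rough sketch only, the Python below builds a custom role definition in the JSON shape that the portal’s custom role import accepts. The role name, description, placeholder subscription ID and the list of actions are my assumptions, modelled on Microsoft’s autoscale guidance as I understand it; check them against the current documentation before using them.

```python
import json

# Placeholder only: replace with the ID of the subscription holding your host pool.
SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"

# Assumed action list: power management on VMs plus host pool and session access.
ASSUMED_ACTIONS = [
    "Microsoft.Insights/eventtypes/values/read",
    "Microsoft.Compute/virtualMachines/deallocate/action",
    "Microsoft.Compute/virtualMachines/restart/action",
    "Microsoft.Compute/virtualMachines/powerOff/action",
    "Microsoft.Compute/virtualMachines/start/action",
    "Microsoft.Compute/virtualMachines/read",
    "Microsoft.DesktopVirtualization/hostpools/read",
    "Microsoft.DesktopVirtualization/hostpools/write",
    "Microsoft.DesktopVirtualization/hostpools/sessionhosts/read",
    "Microsoft.DesktopVirtualization/hostpools/sessionhosts/write",
    "Microsoft.DesktopVirtualization/hostpools/sessionhosts/usersessions/delete",
    "Microsoft.DesktopVirtualization/hostpools/sessionhosts/usersessions/read",
    "Microsoft.DesktopVirtualization/hostpools/sessionhosts/usersessions/sendMessage/action",
]


def build_role(subscription_id):
    """Return a custom role definition scoped at subscription level."""
    return {
        "properties": {
            "roleName": "AVD Autoscale",  # hypothetical name
            "description": "Lets Azure Virtual Desktop manage session host power state.",
            "assignableScopes": [f"/subscriptions/{subscription_id}"],
            "permissions": [
                {
                    "actions": ASSUMED_ACTIONS,
                    "notActions": [],
                    "dataActions": [],
                    "notDataActions": [],
                }
            ],
        }
    }


# Print the JSON; save this output as avdscale.json to import it in the portal.
role_json = json.dumps(build_role(SUBSCRIPTION_ID), indent=2)
print(role_json)
```

Note that the assignable scope is the subscription itself, which matches the point made in the steps above about not scoping this role any lower.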

Create our Scaling Plan

Now that our RBAC role is done, we can create our scaling plan.

  • Open the Azure portal. In the search bar, type Azure Virtual Desktop and select the matching service entry. Select Scaling Plans, then select Create.
  • On the Basics screen, provide the following:
    • Subscription and Resource Group where the Scaling Plan will be created
    • Name
    • Location (remember this needs to be in the same region as your Host Pool)
    • Time Zone

The other entries are optional, however an important one to note is Exclusion Tags – you can use this in conjunction with Tags to exclude certain VMs from autoscaling operations.

  • Click next and this will bring you to the Schedules screen. Click on Add Schedule
  • In the General screen, we enter a Schedule Name and also select the days we want the schedule to apply to.
  • In the Ramp-up screen, we specify a default starting point.
    • So in this instance, we want to have 20% (or 2 out of our 10 Session Hosts) powered on and ready to accept connections at 08:00.
    • We’ve selected “Breadth First” for Load balancing – this means users will be spread evenly across available hosts and is recommended for consistent performance.
    • Finally, we have set a Capacity threshold of 80%. If you recall, we set our hosts to accept a maximum of 10 connections. We have 2 hosts powered on, so once we reach 16 users across those 2 hosts, the next host will automatically power on.
  • Next up is Peak hours. For this we specify a starting time (which is normally when the majority of your users will be logging on) and we’ve also flipped the Load Balancing to “Depth-first”, which will load up all available hosts with user sessions (up to our 80% threshold) before bringing another one online. This is really up to you as to how you want to load balance, but as a reminder:
    • Breadth-first load balancing distributes new user sessions across all available session hosts in the host pool.
    • Depth-first load balancing distributes new sessions to any available session host with the highest number of connections that hasn’t reached its session limit yet.
  • Next up is Ramp-down. This is where we start deallocating hosts at the end of the working day and as you can see, the target is to get back down to 20% of the hosts. The important point to make here is the “Force logoff users” option. If this is enabled then the following applies:
    • This will choose the session host with the lowest number of user sessions to shut down. Autoscale will put the session host in drain mode, send all active user sessions a notification telling them they’ll be signed out, and then sign out all users after the specified wait time is over. After autoscale signs out all user sessions, it then deallocates the VM.
    • During ramp-down, autoscale will only shut down VMs if all existing user sessions in the host pool can be consolidated to fewer VMs without exceeding the capacity threshold.
  • Finally, we get to “Off-peak hours” which is the end of the “Ramp-down” period.
  • And that’s our weekday schedule created. You can also go back in and create a weekend schedule where you can bring the number of hosts down to 10% and have a higher capacity threshold at weekends:
  • Once the schedules are created, we assign the Scaling Plan to our Host pool and click on “Enable autoscale”:
  • And now we can validate our options and click on “Review and create”:
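If the load balancing and capacity threshold settings above feel a bit abstract, the Python sketch below models them with the numbers from this walkthrough (a session limit of 10 per host and an 80% threshold). This is a simplification for illustration and not how the service is actually implemented: the host names are made up, breadth-first is modelled as “pick the least loaded host”, and I’m assuming the next host is triggered once usage reaches the threshold.

```python
from dataclasses import dataclass


@dataclass
class SessionHost:
    name: str
    sessions: int           # current user sessions on this host
    max_sessions: int = 10  # the MaxSessionLimit used in this walkthrough


def available(hosts):
    """Hosts that can still accept a new session."""
    return [h for h in hosts if h.sessions < h.max_sessions]


def pick_breadth_first(hosts):
    """Spread users out: choose the available host with the fewest sessions."""
    candidates = available(hosts)
    return min(candidates, key=lambda h: h.sessions) if candidates else None


def pick_depth_first(hosts):
    """Fill hosts up: choose the available host with the most sessions."""
    candidates = available(hosts)
    return max(candidates, key=lambda h: h.sessions) if candidates else None


def threshold_reached(hosts, capacity_threshold_pct):
    """True once used sessions reach the given percentage of total capacity."""
    used = sum(h.sessions for h in hosts)
    capacity = sum(h.max_sessions for h in hosts)
    return used * 100 >= capacity * capacity_threshold_pct


# Two hosts powered on during ramp-up, 16 users signed in between them.
ramp_up = [SessionHost("avd-0", 9), SessionHost("avd-1", 7)]

print(pick_breadth_first(ramp_up).name)  # avd-1
print(pick_depth_first(ramp_up).name)    # avd-0
print(threshold_reached(ramp_up, 80))    # True: 16 of 20 sessions is 80%
```

With 15 users instead of 16 the threshold check returns False, which lines up with the “once we reach 16 users” example in the Ramp-up step.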

Give all of this about an hour to kick in and you will see your Azure Virtual Desktop session hosts automatically deallocated as per your schedules if not in use!

Money money money ….

Earlier in this post, I gave a yearly figure of approx $35,000 to run our 10 Session Host VMs. However, that figure is based on full consumption. So let’s do some very quick calculations to see how our scaling plan affects that figure:

  • As we said, a single VM running at full consumption (or the full 730 hours) will cost us $290 per month.
  • Based on our schedules created above, we’re going to have 1 VM running full time for both weekdays and weekends. So that’s $290 per month, or $3,480 per year.
  • We’re then guaranteed to have 1 VM running from Monday until Friday for 24 hours, and also on weekends for 12 hours each day (depending on how the schedule is created). That’s effectively 6 days a week instead of 7. So we need to calculate that over a year, which is a case of getting 6/7ths of our full price figure. That’s coming in at $2,983 per year for that VM.
  • Now, it’s back to the other 8 VMs and the 100 users who are using this. “If” those 100 users are logged on, the other 8 VMs will be up for 12 hours a day from Monday to Friday only as per our schedule. So for that, we need to get 5/7ths of our full price figure (which is $2,486) and then halve it because we’re only using them for 12 hours a day (and that’s coming in at $1,243 per VM).

In summary, what we’ve got is:

  • $3,480 – 1 VM at full consumption
  • $2,983 – 1 VM at slightly reduced consumption for weekdays and weekends
  • $9,944 – 8 VMs running for 12 hours a day from Monday to Friday

Add those figures up and you get a total of $16,407. And we need to remember, that figure doesn’t include available cost reductions like Reserved Instances or Hybrid Benefit.
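If you want to rerun those sums with your own prices or schedules, here’s the same calculation as a short Python sketch. It uses only the approximate figures from this post and rounds each per-VM figure to the nearest dollar, the same way the bullets above do; the variable names are mine.

```python
FULL_YEAR_USD = 290 * 12  # one VM at full consumption: $3,480 per year

# 1 VM running around the clock, every day.
always_on = FULL_YEAR_USD

# 1 VM running the equivalent of 6 days out of 7.
six_day_vm = round(FULL_YEAR_USD * 6 / 7)

# 8 VMs running 12 hours a day, Monday to Friday (5/7 of the week, halved).
peak_vm = round(FULL_YEAR_USD * 5 / 7 / 2)
peak_vms_total = 8 * peak_vm

scaled_total = always_on + six_day_vm + peak_vms_total
baseline_total = 10 * FULL_YEAR_USD
saving = baseline_total - scaled_total

print(f"Scaled total:   ${scaled_total:,}")    # $16,407
print(f"Baseline total: ${baseline_total:,}")  # $34,800
print(f"Saving:         ${saving:,}")          # $18,393
```

Changing the fractions (for example, a shorter peak window) is an easy way to see how sensitive the saving is to the schedule you pick.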


So by implementing a Scaling Plan for the Host pool above, we’ve saved ourselves a little over $18,000 a year against the original figure of roughly $35,000. Again I’m going to stress the figures I’m quoting here are approximate, may not represent what you see in your own personal or enterprise subscriptions, and should not be taken as exact savings. Make sure to speak to your Microsoft TAM or Cloud Service Provider for more details. You can find out more about scaling plans here.

Hope you enjoyed this post, until next time!

Is it the (long overdue) end of the road for on-premises Exchange Servers?

A few weeks ago, I posted a Wired.com article on my LinkedIn feed entitled “Your Microsoft Exchange Server Is a Security Liability” by Andy Greenberg.

Image Credit – Priasoft

It was a great article that was released on the back of the most recent Exchange security vulnerability: this time the ProxyNotShell Zero-Day, which oddly enough took almost 2 months to patch correctly. The fix was released as part of the November Patch Tuesday release, and there are a few prerequisites (basically, be at the latest CU version for your Exchange environments and then apply the patch).

Image Credit – Microsoft

It’s the latest in a long line of Exchange Server vulnerabilities. And it’s interesting to note this line in the Microsoft Tech Community Article that states:

These vulnerabilities affect Exchange Server. Exchange Online customers are already protected from the vulnerabilities addressed in these SUs and do not need to take any action other than updating any Exchange servers in their environment.

Well, of course Exchange Online isn’t affected. And in his Wired article, Andy Greenberg makes the point that Microsoft are happy to put all of their security efforts into protecting their Exchange Online services and customers as that makes up the majority of their customer base.

A brief history of Exchange Online

If we look back on the history of Exchange Online, it all started with BPOS way back in 2008. At the time of release, Microsoft had been privately offering customers a hosted email service since early 2007. That was around the time that Exchange Server 2007 was released, and it was also the time when Exchange started to get really complicated as regards the amount of different server roles involved and the overhead involved in maintaining them.

Now let’s just put one thing on record. I would never dream of believing that Microsoft would conspire to over-complicate an on-premises solution with the intention of pushing more customers towards a cloud offering. I mean, they wouldn’t, would they?

There was always an option for having a separate Front-End server, and the solution could sometimes be integrated with the long gone but not forgotten ISA Server.

A look at the diagram below shows us the evolution of how Exchange roles have changed since 2000/2003 versions, and have pretty much rolled back into less complicated instances with the release of 2016/2019 versions:

Image Credit – devco.re

Whether Microsoft intended to make Exchange Server more complicated or not, segregation of those roles was needed due to the evolution of security threats and the rate of attacks that were happening on Exchange Server installations. What it did though was make Exchange a monster to manage from an administration perspective. Almost to the point that it made the decision to migrate to Exchange Online easier, as it offset the cost for some organisations of hiring a full time Exchange Administrator to manage that environment.

So I should Migrate?

The easy answer to that is yes, you should migrate. There are a number of factors to take into consideration in answering that question:

  • As we saw in the recent ProxyNotShell Zero-Day and the length of time it took to remediate, Microsoft really doesn’t care about on-premises Exchange anymore. From Andy’s Wired article, the quote from Microsoft states that: "We strongly recommend customers migrate to the cloud to take advantage of real-time security and instant updates to help keep their systems protected from the latest threats".
  • Microsoft recently announced that the next CU release will only be for Exchange Server 2019 (CU13). Because 2013 (which goes EOL in April 2023) and 2016 are now in Extended support, only Security Updates will be released for them as required (such as the patch for the Zero-Day). But in order to install those and to get support from Microsoft, you must be on the most recent (and last) CU version.
  • There hasn’t been an Exchange Server 2022 release yet. This was touted as being released in late 2021, and early indications were that this would be a subscription based service. The latest update on this was released in this post in June 2022, where the updated roadmap is to release the next Exchange Server version in 2025. Are we really prepared to wait that long if the vulnerabilities continue at this rate? Again, the interesting quote to take out of this release is: “The next version will require Server and CAL licenses and will be accessible only to customers with Software Assurance, similar to the SharePoint Server and Project Server Subscription Editions.”
  • If you decide to migrate to Exchange Online, what does your business want to get out of the migration? It’s the question that’s rarely asked but it’s the most important one for any migration scenario. Because unlike 15 years ago when it was hosted Email and SharePoint with Live Meetings thrown in, Microsoft 365 is an extensive offering of Apps, Services and Licencing options and can open a gateway to a full cloud migration if planned correctly.
  • You can go for the Basic plans such as Business Basic or Office 365 E1 and “just” have Email, SharePoint and Teams if you want. But go a little further and you can take Office licensing into the equation, and maybe Defender, and then maybe Azure Virtual Desktop rights. The opportunities are there, it’s not just about lifting and shifting the tech anymore. You can check out my previous post on the different licensing options here.

Why can’t everyone just migrate to Exchange Online?

The majority of companies have already migrated to Exchange Online – nearly 350 million Office 365 users with over 7 billion (yes, billion) mailboxes running on 300,000 Exchange Online instances on servers in Microsoft Datacenters across the world.

There are those special cases who still need Exchange Servers On-Premises, and those servers need to be hardened or have specialist teams supporting them.

Then there are those companies that have specific Data Residency requirements. And that’s really all they say ….. "We're not moving our data into the Cloud". It shows a lack of understanding of how Data Residency in Exchange Online works. Depending on where you are in the world, you can find out on this site the different options for where your Microsoft 365 data would be stored post migration, depending on the options you select at tenant creation and also in what datacenters the services are available around the world (for example, Forms is not available in all datacenters, only some US ones).


Having your data secured by Microsoft is better than having your data potentially exposed because of a mistrust or misunderstanding of what the cloud can offer as regards data residency. You also have the admin overhead of managing and securing your Exchange environment.

I think it’s the end of the road for Exchange Server – while a migration may sound painful to some, a compromised server is much worse.

Hope you enjoyed this post, until next time!

Microsoft Ignite 2022 – Highlights of the Announcements (with a few personal opinions thrown in)!

For this year’s Microsoft Ignite, in-person conferences were held in cities around the world after two years of being online and I was fortunate enough to attend the Manchester Spotlight event last week.

At the conference Microsoft had their usual presentations, ‘Ask the Expert’ sessions, exhibition areas and a Cloud Skills Challenge. But of course it’s the announcements that everyone looks forward to the most, where improvements, changes and updates to the various technologies in the Microsoft product portfolio are revealed.

I’ve picked out my top highlights below!

  • Azure Stack HCI

I’m on both sides of the fence about the Azure Stack HCI announcements.

I love the Azure Stack HCI product and have been using it since the days when it was called Storage Spaces Direct and ran on Hyper-Converged Infrastructure in on-premises datacenters. As it has evolved, Microsoft has invested heavily in the Azure Stack HCI product, which allows you to run Azure Managed Infrastructure in your own datacentres and combine on-premises infrastructure with Azure Cloud Services.

One of the big announcements was around licensing, and gives Enterprise Agreement customers with Software Assurance the ability to exchange their existing licensed cores of Windows Server Datacentre to get Azure Stack HCI at no additional cost. This includes the right to run unlimited Azure Kubernetes Service and unlimited Windows Server guest workloads on the Azure Stack HCI cluster.

Speaking of Kubernetes, support for Azure Kubernetes Service on Azure Stack HCI is now available, meaning you can deploy and manage containerised apps side-by-side with your VMs on the same physical server or cluster. You can also now provision hybrid AKS clusters directly from Azure onto your Azure Stack HCI using Azure Arc.

On the hardware side, you could previously purchase validated hardware from multiple vendors but in early 2023, Microsoft will begin offering an Azure Stack HCI integrated system based on hardware that’s designed, shipped, and supported by Microsoft (in partnership with Dell).

This will be available in several configurations:

I mentioned both sides of the fence above, and the licensing announcement is one of the worrying ones, because like the recent announcements that Defender for Servers requires an Azure Subscription (Microsoft Defender for Endpoint (Server Version) is no longer available on the EA price list), we’re now potentially going down the route of Microsoft only allowing Windows Server Datacenter to run on Azure Stack HCI accredited hardware. Or potentially getting rid of the Windows Server Datacenter SKU entirely and having it as a “cloud-connected only” product. Only time will tell.

  • Azure Savings Plan for Compute

Azure Savings Plan for Compute is based on consumption, and allows you to buy a one- or three-year savings plan and commit to a spend of $5 per hour per virtual machine (VM). This is based on Azure Advisor Recommendations in the Cost Management and Billing section of the Azure Portal.

Once purchased, this is applied on an hourly basis based on consumption and even if you go above the $5 spend, the initial commitment is still billed at the lower rate and any additional consumption is billed at a Pay-As-You-Go rate.

The main difference between this and Reserved Instances is that Reserved Instances is an up-front commitment whether the VM is powered on or not. Azure Savings Plan for Compute unlocks those lower savings based on consumption.

You can find more details in this article on the Microsoft Community Hub.

  • Azure Virtual Machine Scale Sets – Mixing Standard and Spot instances

Staying on the Cost Savings topic, you can now specify a % of Spot Instance VMs that you wish to run in a VM Scale Set.

This feature (which is in Preview) allows you to reduce compute infrastructure costs by leveraging the deep discounts that Spot VMs can provide while maintaining the compute capacity your workload needs. 

More information can be found here.

  • Microsoft 365 updates

A huge number of announcements were made about Microsoft 365 at this year’s Ignite, most notably:

  • The release of the Microsoft 365 app, which will replace the Office Mobile and Office for Windows App for all Microsoft 365 customers who use this as part of their subscriptions.
  • Teams Premium, which will be available to E5 subscriptions and will bring enhanced meeting features such as insights and live translation in more than 40 languages so that participants can read captions in their own language.
  • Microsoft Places, which will assist with the hybrid working model and let everyone know who will be in the office at what times, where colleagues are sitting, what meetings to attend in person; and how to book space on the days your team is planning to go into the office.

The Teams announcements are great, in particular the live translation option. For us as a multi-national and multi-language organisation, this is a massive step in fostering the inclusion of all users. There is an assumption in the world that spoken English is the native language of Tech, but it’s not everyone’s first language.

  • Microsoft Intune

Microsoft Endpoint Manager is being renamed to Microsoft Intune, which is what it was called before it was renamed to Endpoint Manager. This effectively bundles all Endpoint Management tools under a single brand, including Microsoft Configuration Manager. Some of the main features announced were:

  • ServiceNow Integration
  • Cloud LAPS for Azure Virtual Machines
  • Update Policies for macOS and Linux Support
  • Endpoint Privileged Management – no more permanent admin permissions on devices!

For me, Endpoint Privileged Management is a huge addition which removes the need for any permanent administrative permissions on devices. Cloud LAPS is also a huge security step.

  • Security

Finally on to Security, which was a big focus this year. This year’s updates to the Microsoft Security portfolio coincided with the announcement that Microsoft is now recognised as a leader in the Gartner Magic Quadrant for Security Information and Event Management.

First and foremost is Microsoft’s announcement of a limited-time sale of 50% off Defender for Endpoint Plan 1 and Plan 2 licenses, allowing organisations to do more and spend less by modernising their security with a leading endpoint protection platform. The offer runs until June 2023.

Microsoft 365 Defender now automatically disrupts ransomware attacks. This is possible because Microsoft 365 Defender collects and correlates signals across endpoints, identities, emails, documents and cloud apps into unified incidents and uses the breadth of signal to identify attacks early with a high level of confidence. Microsoft 365 Defender can automatically contain affected assets, such as endpoints or user identities. This helps stop ransomware from spreading laterally.

A number of new capabilities have been announced for Defender for Cloud:

  • Microsoft Defender for DevOps: A new solution that will provide visibility across multiple DevOps environments to centrally manage DevOps security, strengthen cloud resource configurations in code and help prioritise remediation of critical issues in code across multi-pipeline and multicloud environments. With this preview, leading platforms like GitHub and Azure DevOps are supported and other major DevOps platforms will be supported shortly.
  • Microsoft Defender Cloud Security Posture Management (CSPM): This solution, available in preview, will build on existing capabilities to deliver integrated insights across cloud resources, including DevOps, runtime infrastructure and external attack surfaces, and will provide contextual risk-based information to security teams. Defender CSPM provides proactive attack path analysis, built on the new cloud security graph, to help identify the most exploitable resources across connected workloads to help reduce recommendation noise by 99%.
  • Microsoft cloud security benchmark: A comprehensive multicloud security framework is now generally available with Microsoft Defender for Cloud as part of the free Cloud Security Posture Management experience. This built-in benchmark maps best practices across clouds and industry frameworks, enabling security teams to drive multicloud security compliance.
  • Expanded workload protection capabilities: Microsoft Defender for Servers will support agentless scanning, in addition to an agent-based approach to VMs in Azure and AWS. Defender for Servers P2 will provide Microsoft Defender Vulnerability Management premium capabilities.

If you’d like to read more about Microsoft’s Ignite announcements from the conference, then go to Microsoft’s Book of News here.

Hope you enjoyed this post, until next time!

MFA and Conditional Access alone won’t save us from Threat Actors

At the end of a week where we have had 2 very different incidents at high profile organisations across the globe, it’s interesting to look at these and compare them from the perspective of incident response and the “What could we have done to prevent this from happening” question.

Image Credit – PinClipart

Let’s analyze that very question – in the aftermath of the majority of cases, the “What could we have done to prevent this from happening” question invariably leads into the next question of “What measures can we put in place to prevent this from happening in the future”.

The problem with the 2 questions is that they are reactive and come about only because the incident has happened. And it seems that in both incidents, the required security systems were in place.

Or were they?

A brief analysis of the attacks

  • Holiday Inn

If we take the Holiday Inn attack, the hackers (TeaPea) have said in a statement that:

"Our attack was originally planned to be a ransomware but the company's IT team kept isolating servers before we had a chance to deploy it, so we thought to have some funny [sic]. We did a wiper attack instead," one of the hackers said.

This is interesting because it suggests that the Holiday Inn IT team had a mechanism to isolate the servers in an attempt to contain the attack. The problem was that once the attackers were inside their systems and they realized that the initial scope that their attack was based on wasn’t going to work, their focus changed from Cybercriminals who were trying to make a profit to Terrorism, where they decided to just destroy as much data as they could.

Image Credit – Northern Ireland Cyber Security Centre

Essentially, the problem here is two-fold – firstly, you can have a Data Loss Prevention system in place but it’s not going to report on or block “Delete” actions until it’s too late, or in some cases not at all.

Second, they managed to access the systems using a weak password. So (and I’m making assumptions here), while the necessary defences and intrusion-detection technologies may have been in place, that single crack in the foundations was all it took.

So how did they get in? The second part of their statement, shown below, explains it all:

TeaPea say they gained access to IHG's internal IT network by tricking an employee into downloading a malicious piece of software through a booby-trapped email attachment.

The criminals then say they accessed the most sensitive parts of IHG's computer system after finding login details for the company's internal password vault. The password was Qwerty1234.

Ouch ….. so the attack originated as a Social Engineering attack.

  • Uber

We know a lot more about the Uber hack and again this is a case of an attack that originated with Social Engineering. Here’s what we know at this point:

  1. The attack started with a social engineering campaign on Uber employees, which yielded access to a VPN, in turn granting access to Uber’s internal network *.corp.uber.com.
  2. Once on the network, the attacker found some PowerShell scripts, one of which contained hardcoded credentials for a domain admin account for Uber’s Privileged Access Management (PAM) solution.
  3. Using admin access, the attacker was able to log in and take over multiple services and internal tools used at Uber: AWS, GCP, Google Drive, Slack workspace, SentinelOne, HackerOne admin console, Uber’s internal employee dashboards, and a few code repositories.

Again, we’re going to work off the assumption (and we need to make this assumption as Uber had been targeted in both 2014 and 2016) that the necessary defences and intrusion detection were in place.

Once the attackers gained access, the big problem here is the one that’s highlighted above – hardcoded domain admin credentials. Once they had those, they could then move across the network doing whatever they pleased. And undetected as well, as it’s not unusual for a domain admin account to have broad access across the network. And it looks like Uber haven’t learned from their previous mistakes, because as Mackenzie Jackson of GitGuardian reported:

“There have been three reported breaches involving Uber in 2014, 2016, and now 2022. It appears that all three incidents critically involve hardcoded credentials (secrets) inside code and scripts”
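To show how little it takes to spot the most obvious cases of this, here is a toy Python sketch that flags lines in a script where something that looks like a credential is assigned a quoted literal. The pattern, the function name and the sample script are all my own inventions for illustration; real secret scanners are far more thorough, so treat this as a conversation starter rather than a control.

```python
import re

# Naive pattern: a name containing a credential-like word, then = or :, then a quoted literal.
SECRET_PATTERN = re.compile(
    r"""
    (\w*(?:password|passwd|pwd|secret|api[_-]?key|token)\w*)  # suspicious name
    \s*[:=]\s*                                                # assignment
    ["'][^"']{4,}["']                                         # quoted literal value
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_hardcoded_secrets(text):
    """Return (line number, variable name) for each suspicious assignment."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            # Report the name only, never echo the secret itself.
            findings.append((line_no, match.group(1)))
    return findings


# Made-up PowerShell-style script content, not taken from any real incident.
sample_script = '''$server = "pam.example.internal"
$adminPassword = "NotARealPassword1"
$cred = Get-Credential
'''

print(find_hardcoded_secrets(sample_script))  # [(2, 'adminPassword')]
```

Something like this won’t catch encoded or split-up values, which is exactly why purpose-built scanning and a proper secrets vault are the better answer.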

So what can we learn?

What these attacks teach us is that we can put as much technology, intrusion and anomaly detection into our ecosystem as we like, but the human element is always going to be the one that fails us. Because as humans, we are fallible. It’s not a stick to beat us with (and like most, I do have a lot of sympathy for those users in Uber, Holiday Inn and all of the other companies who have been victim to attacks that began with Social Engineering).

Do we need constant training and CyberSecurity programmes in our organisations to ensure that our users are aware of these sorts of attacks? Well, they do now at Uber and Holiday Inn but as I said at the start of the article, this will be a reactive measure for these companies.

The thing is though, most of these programmes are put in as “one-offs” in response to an audit where a checkbox is required to say that such user training has been put in place. And once the box has been checked, they’re forgotten about until the next audit is needed.

We can also say that the privileged account management processes failed in both companies (weak passwords in one, hardcoded credentials in another).


Multi-Factor Authentication. Conditional Access. Microsoft Defender. Anomaly Detection. EDR and XDR. Information Protection. SOC. SIEM. Privileged Identity Management. Strong Password Policies.

We can tech the absolute sh*t out of our systems and processes, but don’t forget to train and protect the humans in the chain. Because ultimately when they break, the whole system breaks down.

And the Threat Actors out there know this all too well. They know the systems are there, but they need a human to get them past those walls. MFA and Conditional Access can only save us for so long.

100 Days of Cloud – Day 100: The End of the Beginning

It’s Day 100 of my 100 Days of Cloud Journey.

Day 100….. I’ve made it! So it’s time to reflect on the journey, look back at what prompted me to do it, the original goals, how that changed over time and remind myself that this is definitely not the end!

Back before the Start…..

Before we start into the why’s of how “100 Days” came about, we need to go back to a different time for us all – March 2020, when none of us knew what was around the corner despite the reports that something wasn’t well in a part of China that none of us had ever heard of.

The story starts with me on my way home from my brother’s wedding in Melbourne, and while waiting for the connection during the stopover in Dubai I came across an article informing me that Microsoft were retiring the MCSE certification for good in July 2020, and that there would be no 2019 version of the cert as it was all moving to Azure. I made note of the article and like most articles that interested me, I bookmarked it for future reading and probably emailed myself a copy to remind me to revisit it.

Damn you MCSE retirement!

And therein lies the problem – like most IT people, I have lots of great ideas and intentions and save them away for future reference. It’s getting back to them and actually doing them that’s the problem.

So anyway – 5 days after arriving home, the country went into lockdown and I was consigned to the makeshift desk in the spare room. A few weeks went by, and having become Netflix-man and gotten bored with it, I was doom-scrolling through my emails late one night and came across the MCSE email I’d sent to myself.

It bothered me because I was using the technology on a daily basis, and also because I hadn’t pushed myself into doing an exam since I took one of the MCSE 2012 exams 3 years previously. So I have 3 months – I’ll at least get the MCSA portion done, right?

Not quite – I managed to clear the first 2 exams, but then Microsoft threw me a lifeline by extending the deadline to January 2021. Suddenly it was achievable again but I didn’t want to rest on my laurels and become complacent, so I pushed on. MCSA was achieved in July, and the MCSE duly completed by August. So goal achieved!

2 things happened in between MCSA and MCSE.

Firstly, I signed up to Cloudskills.io after seeing a Google Ad offering their Azure Admin Associate Course content for $7. I then signed up for the full platform after subscribing to their Podcast and realising that I needed to know the Fundamentals of Azure before diving deeper.

Secondly, I came across the 100 Days of Cloud website and GitHub hosted by people like Andrew Brown and Gwyneth Pena Siguenza. I wasn’t really ready for that jump yet, but did my usual bookmark and email to myself for future reference…..

2020 quickly became 2021, and by mid-2021 a number of things had happened:

  • I’d signed up for my free Azure Account and was experimenting with the services on offer.
  • Based on this I’d passed the AZ-900 (Azure Fundamentals), AZ-104 (Azure Admin Associate), AZ-140 (Azure Virtual Desktop) and SC-300 (Identity and Access).
  • I’d gone deeper into CloudSkills.io, joined their Community, and started attending User Groups remotely.
  • I’d also changed job and started working for Ekco, a growing MSP based in Dublin.

The Driver behind 100 Days

So over the course of Summer 2021 I was attending user groups and getting involved in Cloudskills.io, and got the opportunity to meet Mike Pfeiffer on a call. I’ve always been a Mike Pfeiffer fan-boy, right back to the old days when he was blogging about Active Directory and Exchange right up to his Pluralsight content that I had used during my MCSE studies. During our conversation, Mike asked me 2 questions:

  • Are you producing any content?
  • Why not (response to answer provided)?

This introduced me to the concept of SODOTO, or:

  • See One – observe someone else teaching you about something.
  • Do One – can you do it yourself based on the teachings above.
  • Teach One – can you get a deep understanding of what you’ve learned and teach that back to an audience in either video or blog format so that they understand it.

This led me to start a blog and the original series on Monitoring Docker Containers with InfluxDB and Grafana. And that gave me the blogging bug for a few months – releasing a blog every week was the goal.

When that series finished, I wrote a few other smaller blogs, but eventually needed another goal – a longer term one that I could commit to. I’d always wanted to go deeper into Azure and learn more about all of the services that were offered. I played about in my own tenant and through Bootcamps had dived a bit deeper. But it was a big monster, how was I going to do it?

And during another one of my late night doom-scrolling sessions, I came across the 100 Days bookmark that I’d saved the previous year. And the lightbulb in my head turned on …

100 Days

And so I started. I knew I could transfer some of the skills I’d already learned across, so I started small and went for the basic IaaS stuff that I knew well.

At the start, the idea was that I would do 100 Days straight. It became very clear to me around Day 12 that it wasn’t going to be possible because I was doing this as a SODOTO model, and if I had tried to do it I was going to crash and burn quickly and regularly.

That’s the key, and a great piece of advice – learn at a pace that you are comfortable with and can sustain. Don’t rush it, it just won’t go in.

At the start, I was doing this for me – as a challenge that I wanted to finish, and doing it in the open held me accountable. It also meant that family, friends, work colleagues and social connections could enquire and joke about when the next blog was coming out. It wasn’t about likes or followers. It was about me learning and tying together components of the Azure and other cloud ecosystems and how they connect.

And at that point it evolves into being not just about me, but about the followers and giving something back to them and to the wider community.

Let’s take Azure Virtual Desktop as an example – I have experience of working with Citrix, so the concepts are pretty transferable. But think about all of the underlying concepts you need to know:

  • Virtual Machines
  • Storage
  • Authentication
  • MFA
  • Identity
  • Desktop and App Management

You very quickly realize that although all of those are standalone service offerings in Azure, they are not just intertwined in Azure Virtual Desktop but in hundreds of other Azure services. And knowing them as a baseline will give you a better understanding when you go to learn the rest of the services!

Time to give Thanks!

There were times when I never thought I’d reach this goal and doubted myself, but I had some unbelievable support and encouragement along the way.

Firstly and most importantly my wife and family, who put up with me disappearing back to the laptop most evenings and tolerated the late nights where I screamed curses at deployments that had gone wrong. Also for giving me “the eye” every time I flopped down on the couch in the evenings and encouraged me to embrace the challenge and keep going.

To my friends and work colleagues who kept me going with their encouragement, banter and interest in the blog. I’m not going to name you all because I’m sure to forget someone, but you’ve all been brilliant.

To my mentors. The opportunity to get to know people like Mike Pfeiffer, Robin Smorenburg, Derek Smith and Kevin Evans, and to be able to pick their brains and get tips and encouragement from them has been mind-blowing. There are many more who I haven’t mentioned, particularly the gang over at Cloudskills.io. You guys are all awesome and you know it – any success I have is down to you.

To the community who have chipped in with words of encouragement and support along the way. To people like Michal Marchlewski, Karl Cooke, Gregor Suttie, Daniel McLoughlin, John Lunn and many more – thanks for reaching out and for the support guys, it really meant a lot.

Finally to everyone who has read the blog and gotten in contact with messages of support and telling me that the blog has helped them and been useful to them. Thank you from the bottom of my heart, even helping one person would have made it all worthwhile, but the response has been genuinely amazing.

Conclusion and What happens next!

What happens next is I’m going to take a break from blogging for a few weeks! I’m going to Scottish Summit on June 10th, so if you see me there please do come over and say hello! Or please feel free to reach out to me on my social channels. Once I get back from Scotland, I’ll come up with the next challenge, whatever that may be!

I hope you’ve enjoyed the 100 Days as much as I have and have found it useful. As the title says, this is not the end, it’s just the end of the beginning of the journey.

Until next time!

100 Days of Cloud – Day 99: Microsoft Build 2022

It’s Day 99 of my 100 Days of Cloud journey and in today’s post we’ll take a quick look at some of the announcements coming out of Microsoft Build.

Microsoft Build is an annual event that is primarily focused on the development side of the Microsoft ecosystem, however like all Microsoft events there are normally some really cool announcements around new technologies and updates to existing technologies.

I’m going to focus particularly on updates to the technologies that I’ve blogged about over the last 99 days! In effect, I’m providing some updates to the blog posts so that if you’ve followed me on the journey this far, you’ll get to here and have the latest news and features!

Azure Container Apps

Azure Container Apps is now Generally Available. This enables you to run microservices and containerized apps on a serverless platform.

Common uses of Azure Container Apps include:

  • Deploying API endpoints
  • Hosting background processing applications
  • Handling event-driven processing
  • Running microservices

Applications built on Azure Container Apps can dynamically scale based on the following characteristics:

  • HTTP traffic
  • Event-driven processing
  • CPU or memory load
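As a rough mental model of HTTP-based scaling, here’s a small sketch that works out how many replicas you’d want for a given number of concurrent requests. Treat it as an illustration under my own assumptions: the function name, the target of 10 concurrent requests per replica and the replica limits are placeholder values, and the platform’s real scaling behaviour is more involved than this.

```python
import math


def desired_replicas(concurrent_requests, target_per_replica=10,
                     min_replicas=0, max_replicas=10):
    """Simplified model: one replica per `target_per_replica` concurrent requests,
    clamped between `min_replicas` and `max_replicas`."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be greater than zero")
    wanted = math.ceil(concurrent_requests / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


for load in (0, 25, 500):
    print(f"{load} concurrent requests -> {desired_replicas(load)} replica(s)")
```

The clamping is the important part: a minimum of zero lets the app scale to nothing when idle, while the maximum puts a ceiling on cost.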

We looked at Azure Container Instances on Day 82. The key differences between the 2 are:

  • If you need to spin up multiple containers (e.g. front end / backend / database), Azure Container Apps is a better choice as it comes with Dapr (Distributed Application Runtime) and it will auto retry the requests and add some telemetry data.
  • If you just need long running jobs or you don’t need multiple containers to communicate with each other, you can go with Azure Container Instances.

You can check out the blog post announcement here, and the official Microsoft Docs page here for more information.

Azure Cosmos DB

We looked at Azure Cosmos DB back on Day 64 and learned that it is a fully managed NoSQL database that provides high availability and globally-distributed access to data with very low latency. There are a number of APIs to choose from, so you can pick the one that best meets your database requirements.

Some of the new features announced for Cosmos DB are:

  • Increased serverless capacity to 1 TB.
  • Shared throughput across database partitions.
  • Support for hierarchical partition keys.
  • An improved 30-day free trial experience, now generally available, and support for MongoDB data in the Azure Cosmos DB Linux desktop emulator.
  • A new, free, continuous backup and point-in-time restore capability enables seven-day data recovery and restoration from accidental deletes.
  • Role-based access control support for Azure Cosmos DB API for MongoDB offers enhanced security.

You can find out more about the Cosmos DB enhancements here.

Azure Stack HCI

It’s timely that we only looked at Azure Stack HCI on Day 95 and commented that your Azure Stack HCI Cluster can contain between 2 and 16 physical servers.

The new single node Azure Stack HCI, now generally available, fulfills the growing needs of customers in remote locations while maintaining the innovation of native integration with Azure Arc. It offers customers the flexibility to deploy the stack in smaller spaces and with less processing needs, optimizing resources while still delivering quality and consistency.

Additional benefits include:

  • Smaller Azure Stack HCI solutions for environments with physical space constraints or that do not require built-in resiliency, like retail stores and branch offices.
  • A smaller footprint to reduce hardware and operational costs.
  • The same scale applies, so you can start at 1 and scale up to 16 nodes if required.

You can find out more about the Azure Stack HCI announcement here.

Azure Migrate

On Day 18 we looked at Azure Migrate, an Azure technology that automates planning and migration of your on-premises servers from Hyper-V, VMware or Physical Server environments.

Enhancements to the service now streamline and simplify cloud migration and modernization:

  • Agentless discovery and grouping of dependent Hyper-V virtual machines (VMs) and physical servers to ensure all required components are identified and included during a move to Azure. This feature is generally available.
  • Azure SQL assessment improvements for better customer experience. Assessments now include recommendations for SQL Server on Azure VMs and support for Hyper-V VMs and physical stacks, along with already existing assessments for Azure SQL Managed Instance and Azure SQL Database. This feature is in preview.
  • Pause and resume of migration function has been included to provide control over the migration window. This mechanism can be used to schedule migrations during off-peak periods. This feature is in preview.
  • Discovery, assessment and modernization of ASP.NET web apps to native Azure Application Service. Customers can discover and modernize an ASP.NET web app to Azure Kubernetes Service (AKS) Application Service Container and discover Java apps running on Apache Tomcat.


So that’s a quick rundown of the main updates from Microsoft Build. You can find information on all of the updates that were released here in the Microsoft Build Book of News, and it’s also not too late to register and watch some of the recorded and on-demand sessions from Microsoft Build by signing up here.

As with all Microsoft Conferences, there’s a CloudSkills Challenge and you have until June 21st to sign up and complete the modules from one of the 8 challenges available. As always, you can earn a free certification exam pass if you complete the challenge! You can sign up here and the list of rules and exams eligible is here!

Hope you enjoyed this post, until next time!

100 Days of Cloud – Day 98: Azure Bicep

It’s Day 98 of my 100 Days of Cloud journey and in today’s post we’ll take a quick look at Azure Bicep.

Azure Bicep is a domain-specific language (DSL) that uses a declarative syntax to deploy Azure resources. In a Bicep file, you define the infrastructure you want to deploy to Azure, and then use that file throughout the development lifecycle to repeatedly deploy your infrastructure. Your resources are deployed in a consistent manner.

Bicep v JSON

We’ve seen Azure Resource Manager Templates and how they can be used to define your infrastructure based on JSON Templates. Bicep is part of the Azure Resource Manager Template family – the difference is that Bicep is a language that uses .bicep files instead of .json files.

If we take a look at the differences between the 2 – below is a JSON template where we want to deploy a Storage Account:

And here we have a Bicep file deploying the same storage account:

You can see the difference in file size and the simpler syntax in use with Bicep over JSON. However, when you build Bicep templates and perform a deployment operation, it will transpile into an ARM template, and then Resource Manager will go and deploy your resources to Azure. So effectively the runtime is unchanged; Bicep only provides an abstraction layer and reduces the pain of working with JSON.
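To give a feel for the JSON that sits underneath, here’s a rough sketch that builds the skeleton of an ARM template for a storage account as a Python dictionary and prints it as JSON. The account name, location, SKU and API version are placeholder values I’ve picked for illustration; they’re not taken from the templates in this post, and a real template would normally add parameters and outputs too.

```python
import json


def storage_account_template(name, location, sku="Standard_LRS"):
    """Build a minimal ARM template (as a dict) containing one storage account."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2021-09-01",  # placeholder API version
                "name": name,
                "location": location,
                "sku": {"name": sku},
                "kind": "StorageV2",
            }
        ],
    }


# Placeholder name and region, purely for illustration.
template = storage_account_template("examplestorage001", "northeurope")
print(json.dumps(template, indent=2))
```

Even for a single resource, the schema, content version and nested braces add up quickly, which is the overhead Bicep’s syntax is designed to hide.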

You can also use the Bicep Playground to view Bicep and equivalent JSON side by side. This allows you to compare the implementations of the same infrastructure. You can also decompile an existing ARM template to Bicep, see Decompiling ARM template JSON to Bicep.

Benefits of Azure Bicep

  • Authoring experience: When you use the Bicep Extension for VS Code to create your Bicep files, you get a first-class authoring experience. The editor provides rich type-safety, intellisense, and syntax validation.
  • Repeatable results: Repeatedly deploy your infrastructure throughout the development lifecycle and have confidence your resources are deployed in a consistent manner. Bicep files are idempotent, which means you can deploy the same file many times and get the same resource types in the same state. You can develop one file that represents the desired state, rather than developing lots of separate files to represent updates.
  • Orchestration: You don’t have to worry about the complexities of ordering operations. Resource Manager orchestrates the deployment of interdependent resources so they’re created in the correct order. When possible, Resource Manager deploys resources in parallel so your deployments finish faster than serial deployments. You deploy the file through one command, rather than through multiple imperative commands.
  • Modularity: You can break your Bicep code into manageable parts by using modules. The module deploys a set of related resources. Modules enable you to reuse code and simplify development. Add the module to a Bicep file anytime you need to deploy those resources.
  • Integration with Azure services: Bicep is integrated with Azure services such as Azure Policy, template specs, and Blueprints.
  • Preview changes: You can use the what-if operation to get a preview of changes before deploying the Bicep file. With what-if, you see which resources will be created, updated, or deleted, and any resource properties that will be changed. The what-if operation checks the current state of your environment and eliminates the need to manage state.
  • No state or state files to manage: All state is stored in Azure. Users can collaborate and have confidence their updates are handled as expected.
  • No cost and open source: Bicep is completely free. You don’t have to pay for premium capabilities. It’s also supported by Microsoft support.
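The “repeatable results” and “preview changes” ideas above can be sketched in a few lines. This is a toy model under my own assumptions, not how Resource Manager is implemented: resources are plain dictionaries, the names `what_if` and `deploy` are hypothetical, and deletions aren’t modelled (resources missing from the desired state are simply left alone).

```python
def what_if(current, desired):
    """Preview what a deployment would do, without changing anything."""
    changes = {"create": [], "update": [], "no_change": []}
    for name, properties in desired.items():
        if name not in current:
            changes["create"].append(name)
        elif current[name] != properties:
            changes["update"].append(name)
        else:
            changes["no_change"].append(name)
    return changes


def deploy(current, desired):
    """Return the state after applying the desired resources on top of the current ones."""
    new_state = dict(current)
    new_state.update(desired)
    return new_state


# Hypothetical environment and desired state, for illustration only.
current = {"vnet": {"addressPrefix": "10.0.0.0/16"}}
desired = {
    "vnet": {"addressPrefix": "10.0.0.0/16"},
    "storage": {"sku": "Standard_LRS"},
}

print(what_if(current, desired))      # preview before deploying
first = deploy(current, desired)
second = deploy(first, desired)       # deploying the same file again
print(first == second)                # same end state both times
```

Because the file describes the end state rather than the steps to get there, the second deployment has nothing left to change, which is what idempotent means in practice.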


  • Limited to Azure: Bicep isn’t going to fly if someone is using multi-cloud and wants to use the same language across multiple cloud providers. This is where Terraform has the advantage.
  • Learning Curve: Bicep is a new language that takes some learning and understanding, despite being very simple. If you are already familiar with traditional JSON ARM Templates, you may decide to stick with those.


Azure Bicep is an exciting technology that promises to make deployments easier if you are using only Azure. There are some great resources out there to start your learning journey:

Hope you all enjoyed this post, until next time!

100 Days of Cloud – Day 97: Azure Terrafy

It’s Day 97 of my 100 Days of Cloud journey and in today’s post we’ll take a quick look at Azure Terrafy.

Azure Terrafy is a tool which you can use to bring your existing Azure resources into Terraform HCL and import them into Terraform state.

As we saw back on Day 36, Terraform state acts as a database to store information about what has been deployed in Azure. When running terraform plan, the tfstate file is refreshed to match what it can see in the environment. And if you use Terraform to manage resources, you should only use Terraform and not any other deployment tools as this can cause issues with your terraform configuration files and in turn issues with your Azure resources.

But if we’ve deployed infrastructure using means other than Terraform (such as ARM Templates, Bicep, PowerShell or manually using the Azure Portal), it’s difficult to keep those resources in a consistent state of configuration.

And that’s where Azure Terrafy comes to the rescue!

So let’s do a quick demo – on my local device I have the latest Terraform version installed and have also downloaded the Azure Terrafy binaries from GitHub. I also have Azure CLI installed and have authenticated to my subscription:

I’ve deployed a Resource Group in Azure and created a VM:

Let’s run aztfy.exe and see what options it gives us:

The main thing we see here is that we need to specify the resource group. So, we’ll run c:\aztfy\aztfy.exe md-aztfy-rg

Note – C:\aztfy is where I’ve downloaded the Azure Terrafy binary file to, the location C:\md-aztfy-rg that I’m running the command from is an empty directory where I want to store my terraform files once they get created.

That’s a good sign ….. so it’s initializing and is now interrogating the resource group to see what resources exist and if it can import them.

Once that’s done, we get presented with this screen:

For each entry, the first line is the resource that has been identified, and the second is the Terraform resource type that has been identified as being a match for it. The majority have been matched, except for one, which is the virtual machine. If we scroll down using the controls listed at the bottom of the screen, there is an option to “show recommendation”.

For this one, it’s telling me no resource recommendation is available. That’s OK though, because we can hit enter and type in the correct resource:

Once that’s done, we hit enter to save that line, and then choose the option to import:

And as we can see, that’s started to import our configuration. And eventually we’ll get this screen:

And once that’s finished we’ll see this:

So now let’s open that directory from Visual Studio Code, and we’ll open the terraform.tfstate file:
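If you’d rather check the state file from a script than by eye, here’s a minimal sketch that lists the managed resources recorded in a Terraform state document. The sample state below is a cut-down placeholder I’ve made up for illustration (the resource types and names aren’t copied from my actual file), and `list_state_resources` is a hypothetical helper name.

```python
import json


def list_state_resources(state):
    """Return 'type.name' addresses for the managed resources in a parsed state file."""
    addresses = []
    for resource in state.get("resources", []):
        if resource.get("mode") == "managed":
            addresses.append(f'{resource["type"]}.{resource["name"]}')
    return addresses


# Cut-down, made-up example of the state file layout (placeholder values only).
sample_state = json.loads("""
{
  "version": 4,
  "resources": [
    {"mode": "managed", "type": "azurerm_resource_group", "name": "res-0", "instances": []},
    {"mode": "managed", "type": "azurerm_linux_virtual_machine", "name": "res-1", "instances": []}
  ]
}
""")

for address in list_state_resources(sample_state):
    print(address)

# To use it against a real file, something like:
#   with open("terraform.tfstate") as handle:
#       print(list_state_resources(json.load(handle)))
```

It’s a quick sanity check that everything you expected to be imported from the resource group has actually landed in state.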

Ok, so that looks great and everything looks to be good. But we need to test, so we’ll run terraform plan to see if it’s worked:

And it’s telling me my infrastructure matches the configuration! So we can now manage the resources using Terraform!
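If you want to automate that check, terraform plan has a -detailed-exitcode flag that returns 0 when there are no changes, 1 on error and 2 when changes are pending. Here’s a minimal sketch that wraps it; the function names are my own, and it assumes terraform is on your PATH and the directory has already been initialised.

```python
import subprocess

# Meaning of the exit codes returned by `terraform plan -detailed-exitcode`.
PLAN_OUTCOMES = {0: "no changes", 1: "error", 2: "changes pending"}


def interpret_plan_exit_code(code):
    """Translate a plan exit code into a short description."""
    return PLAN_OUTCOMES.get(code, "unknown")


def check_for_drift(working_dir):
    """Run `terraform plan` in `working_dir` and describe the outcome.

    Assumes terraform is installed and `terraform init` has already been run there.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-no-color"],
        cwd=working_dir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)


# Example usage (the path is the demo directory used in this post):
#   print(check_for_drift(r"C:\md-aztfy-rg"))
```

A result of “no changes” is the scripted equivalent of the message above, and it makes a handy gate in a pipeline after an import.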


Azure Terrafy is in the early stages of its development, but we can see that it’s a massive step forward for those who want to manage their existing resources using Terraform.

There are some great resources out there on Azure Terrafy:

Hope you enjoyed this post, until next time!