Azure governance: The first steps to creating an organized cloud infrastructure

Microsoft's Joseph Chan and Liz Kim discuss Azure governance and Azure Policy

Moving your workspace to the cloud can be overwhelming, but Joseph Chan (@jochan_msft) and Liz Kim (@lizkim101) of Microsoft explain the best way to organize an agile and secure Azure environment from the start.  

Azure governance is a large, complex topic, and there are many areas to be covered within it. During Deploy, ShareGate's expert-led event focused on Azure governance, Microsoft’s Joseph Chan and Liz Kim asked us to take a step back and consider what we should be doing on day 0 in the cloud. 

To be as efficient as possible while maintaining agility, control, and security, you need to understand how to conceptualize your cloud environment from the start. This includes learning why and how Azure governance tools work and what types of organizational questions and services you should consider before deploying your first resource. 

Moving your workspace into the cloud can be overwhelming, but this blog offers some concrete first steps you can take to set yourself up for success.   

ShareGate has been helping IT professionals succeed in the Microsoft cloud for over a decade. We hope this blog helps you better understand Azure governance—but if you’re looking for a tool to make managing Azure even easier, check out how ShareGate Overcast can give you better visibility and control in Azure.

You can watch Joseph and Liz’s 1-hour presentation from Deploy, or keep reading for a recap of their key points. 

Cloud governance tools: How do they work in Azure?

If you’re new to Azure, seeing all of the services offered can be overwhelming, especially since they’re being updated and added to all the time. But there’s a very good reason why these tools exist and why you should be implementing them. 

Get control and agility

The vast majority of organizations that use the Microsoft cloud have a designated person or team that manages how things operate in their cloud environment. That team is responsible for creating the subscription and making sure its organization’s security and governance standards are being complied with.

In the past, particularly with on-prem environments, these IT custodian teams would ensure compliance by having all new deployments or modifications go through them first. But one of the benefits of the cloud is the speed and agility it offers developers, and verifying every new action with a custodian team causes significant delays.

You shouldn’t have to choose between agility and keeping your data organized and secure. That’s why Azure provides a set of governance capabilities: the cloud custodian team can enforce controls through tools such as Azure Policy, Azure Blueprints, and Azure Resource Manager (ARM) without hindering the development team.

Using governance tools allows the cloud custodian team to maintain governance guidelines while giving developers the speed they need.

These Azure governance capabilities act more like guardrails than roadblocks. If someone tries to push a resource that doesn’t comply with the definitions set by the custodian team, it will be stopped before it can be deployed. But the person trying to take that action can then audit and analyze why it was stopped. That way, they can make any necessary changes themselves without waiting on a ticket from the custodian team.

This also means that if you’re creating your cloud environment from scratch, you can put these policies and guidelines in place before anyone starts spinning up resources. So when other members of your team start working in a new subscription or on new resources, they aren’t given a blank canvas.

How these tools are built

Most Azure services require you to go to the marketplace and select the resource or service type you want to create. For example, to use Log Analytics, you would go to the marketplace and create a workspace where you gather your logs.

Governance capabilities are different: they are built into the platform rather than created from the marketplace. "With governance tools, it really needs to be there natively the day you start working in Azure," said Chan.

The cloud is an evolving platform, and Microsoft wants to make sure that any time new services and capabilities are released, they fall under the governance controls you’ve already assigned in your environment on the day they’re released.

When most environments operated on-prem, a new service often meant waiting 60 days for System Center to release a management pack that helped you use it efficiently. Cloud environments move far too quickly for that kind of delay, so Azure has its governance capabilities built into the control plane platform, ARM.

ARM acts like the front door, ensuring that all the control plane requests—such as creating new resources or querying the properties and shapes of the resources—are vetted first. There’s a component in ARM that is literally called “front door” where all of your resource requests come through, and all of the governance capabilities are behind the front door.

The policy engine, for example, is behind the front door. Any request first has to go through ARM and then the policy engine, and be determined to be compliant, before a resource is deployed or changed.
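
As a rough illustration of that ordering, here is a minimal Python sketch in which every request has to pass a set of checks before anything is created. It is a toy model and not how ARM is implemented: `front_door`, `require_tag`, `Denied`, and the request shape are all hypothetical names chosen for this example.

```python
class Denied(Exception):
    """Raised when a request fails one or more checks; carries the reasons."""

    def __init__(self, reasons):
        super().__init__("; ".join(reasons))
        self.reasons = reasons


def require_tag(tag):
    """Build a check that returns a reason string when a tag is missing."""
    def check(request):
        if tag not in request.get("tags", {}):
            return f"missing required tag '{tag}'"
        return None
    return check


def front_door(request, checks, store):
    """Vet a request against every check before writing it to the store."""
    reasons = [reason for check in checks
               if (reason := check(request)) is not None]
    if reasons:
        raise Denied(reasons)  # nothing has been created at this point
    store[request["name"]] = request
    return request


store = {}  # stands in for the set of deployed resources
checks = [require_tag("owner"), require_tag("environment")]

try:
    front_door({"name": "vm-01", "tags": {"owner": "liz"}}, checks, store)
except Denied as err:
    denial_reasons = err.reasons  # the requester can read why it was stopped

front_door({"name": "vm-02", "tags": {"owner": "liz", "environment": "dev"}},
           checks, store)
```

The point of the design is that the denial carries its reasons with it, so the person who made the request can fix it without opening a ticket.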

Azure Resource Graph is also tucked behind the front door. Any change signal will be updated in Azure Resource Graph, which is why it can offer a fast SLA regarding data freshness of under one minute. It’s these tools that ensure all new resources are under your control and following your guidelines before they’re ever even created.

How to organize your Azure infrastructure before you start working in it

If you’re starting at day 0 and want to get to the point where you have some guardrails in place in your infrastructure, you don't actually start with the guardrails. The first thing you have to consider is your organizational hierarchy. 

Defining your organizational hierarchy 

Organizational hierarchy refers to how your environment is arranged. In Azure you have different items or groupings to help you keep things organized. From largest to smallest you have:

  • Tenant
  • Management groups
  • Subscriptions
  • Resource groups
  • Resources

This diagram sums up the organizational structure of an Azure tenant.

Consider how your business is structured outside the cloud and if your Azure environment should reflect that. You might be very centrally managed by a few higher ups, or you could be a sprawling combination of different businesses and sub-organizations that will need a more decentralized style of management.

For example, you might have a traditional business structure with a CEO, a C-suite of executives, and then lower levels of management and employees all working together in one region on a single product. If that's the case, you can probably roll all of your services into one management group.

You can then have a set of management groups nested inside the first group that represent different departments or levels of staging, such as production, development, and a sandbox environment where developers can explore and experiment.
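
To make the nesting concrete, here is a small Python sketch of the hierarchy as a tree. It is only an illustration of the containment rules described above, not an Azure API: the `Scope` class, the `paths` helper, and every name in the example tree (such as `contoso` and `shop-prod`) are hypothetical.

```python
from dataclasses import dataclass, field

# Which kinds of scope each level may contain, from largest to smallest.
ALLOWED_CHILDREN = {
    "tenant": {"management_group"},
    "management_group": {"management_group", "subscription"},
    "subscription": {"resource_group"},
    "resource_group": {"resource"},
    "resource": set(),
}


@dataclass
class Scope:
    kind: str
    name: str
    children: "list[Scope]" = field(default_factory=list)

    def add(self, child: "Scope") -> "Scope":
        """Attach a child scope, rejecting combinations the hierarchy forbids."""
        if child.kind not in ALLOWED_CHILDREN[self.kind]:
            raise ValueError(f"a {self.kind} cannot contain a {child.kind}")
        self.children.append(child)
        return child


def paths(scope, prefix=""):
    """List the full path of this scope and everything beneath it."""
    here = f"{prefix}/{scope.name}"
    result = [here]
    for child in scope.children:
        result.extend(paths(child, here))
    return result


tenant = Scope("tenant", "contoso")
root = tenant.add(Scope("management_group", "root"))
for env in ("production", "development", "sandbox"):
    root.add(Scope("management_group", env))

production = root.children[0]
subscription = production.add(Scope("subscription", "shop-prod"))
resource_group = subscription.add(Scope("resource_group", "rg-web"))
resource_group.add(Scope("resource", "vm-web-01"))
```

Calling `paths(tenant)` lists every scope with its full ancestry, which is a quick way to check that the structure matches how the business is actually organized.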

Alternately, you might have an industrial conglomerate consisting of more than 50 businesses across several countries. You still might only have one CEO of the conglomerate, but each business also has its own CEO as well as its own products, processes, regions, and teams.

If you tried to roll all of these businesses into a single management group and a single system of guardrails, it would be extremely difficult to maintain. Instead, in this kind of organization, it makes more sense to have 50+ top-level management groups that each have their own departments and production, development, and sandbox environments underneath them.

Best practices for Azure management groups

The best service Azure offers to help you with organizational hierarchy is management groups. When you start arranging your management groups, you always start with what’s called a “root” management group. It’s the first one that exists in your environment and basically every subscription that you create will automatically be made a member of that management group. It’s equivalent to the AD tenant that your subscription belongs to.

Joseph explained that this is something Microsoft is currently working on changing. They’re hoping to be able to offer new capabilities that will allow you to assign the management group you want new subscriptions to fall under as you create them.

If you’re using the new Microsoft Customer Agreement, you can use a subscription creation API, which accepts a parameter that specifies the management group the new subscription should belong to.

If you don’t have that API, or until new capabilities are offered by Microsoft, you'll need to manually move your subscriptions into the right management group from the root management group. And that's definitely something you should take the time to do.

As much as possible, try to quickly place new subscriptions into the appropriate child management group. Typically, the longer you wait, the longer a subscription will have incorrect permissions assigned to it.

You can move your management groups by searching for them in the Azure portal.

All of your newly created subscriptions will have the root management group’s rights and permissions applied to them until you move them to a different management group. So, you have to be very careful about which rights and permissions you put in place in the root management group.

Commonly used cloud infrastructure 

Once you have your hierarchy defined and organized through management groups, you should start implementing the most common cloud infrastructure. These are the policies, templates, etc. that you want to set up as guardrails across all of your environments.

These policies usually relate to security and compliance requirements that most organizations want applied to all of their systems. Joseph gave the example of anti-malware extensions on machines: "There is zero reason why you wouldn’t add anti-malware extension to all of your machines, so we apply that as a baseline policy at the very top level."

Microsoft supplies many of these types of baseline policies built into Azure that you can use out of the box. And the higher you apply these policies in your hierarchy, the better.

That’s because policies and permissions assigned to a management group are inherited by every subscription inside it. If you assign them at the management group level, you won’t have to manually repeat yourself in each individual subscription in that management group.

Some of the Azure tools that can help you implement these baseline policies include:

  • Role-based access control (RBAC), which allows you to assign access permissions to users on a granular level according to their roles in your organization.
  • Just-in-Time (JIT) Virtual Machine (VM) access, which allows you to lock down inbound traffic to your Azure VMs. This reduces exposure to attacks while providing easy access to connect to VMs when needed.
  • Privileged Identity Management (PIM), which allows you to grant access conditionally to perform certain actions. This could mean that a particular person has to enter a ticket number or justification to be permitted entry, as opposed to RBAC, which stipulates that someone either does or doesn't have access 24/7.

Setting up PIM can be particularly helpful in production environments where you may want to make sure that certain developers have access during the launch of a new product but not indefinitely.
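
The difference between standing access and access that is granted on request can be sketched in a few lines of Python. This is a conceptual model only and does not use the PIM or RBAC APIs: `StandingAssignment`, `Activation`, `has_access`, and the user and role names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class StandingAssignment:
    """RBAC-style access: the user holds the role around the clock."""
    user: str
    role: str


@dataclass(frozen=True)
class Activation:
    """PIM-style access: justified, and valid only for a limited window."""
    user: str
    role: str
    justification: str
    starts: datetime
    duration: timedelta


def has_access(user, role, at, standing, activations):
    """Return True if the user holds the role at the given moment."""
    if any(s.user == user and s.role == role for s in standing):
        return True
    return any(
        a.user == user
        and a.role == role
        and bool(a.justification.strip())          # a reason must be given
        and a.starts <= at < a.starts + a.duration  # and the window must be open
        for a in activations
    )


launch = datetime(2024, 5, 1, 9, 0)
standing = [StandingAssignment("ops-lead", "prod-contributor")]
activations = [
    Activation("dev-1", "prod-contributor", "TICKET-123: product launch",
               starts=launch, duration=timedelta(hours=8)),
]

during_launch = has_access("dev-1", "prod-contributor",
                           launch + timedelta(hours=1), standing, activations)
next_day = has_access("dev-1", "prod-contributor",
                      launch + timedelta(days=1), standing, activations)
```

Here the developer has access during the launch window and loses it afterwards without anyone having to remember to revoke it.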

Other types of common cloud infrastructure involve setting up Azure Key Vault to maintain secrets, creating storage accounts for collecting logs, and setting up monitoring tools to keep track of your key metrics.

Important reminder! If you move a subscription from one management group to another, the subscription will lose the policies and permissions of the old management group and gain those of the new management group. So before you move subscriptions around within your tenant, double check how it will affect the resources and developers working within that subscription.
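
One way to reason about a move before making it is to compare what a subscription inherits before and after. The Python sketch below models inheritance as a walk up the hierarchy. It is a simplified illustration, not a call to any Azure service, and the scope names, policy names, and the `effective_policies` and `preview_move` helpers are all hypothetical.

```python
# Child scope -> parent scope. "root" has no parent.
parent = {
    "mg-production": "root",
    "mg-development": "root",
    "sub-shop-prod": "mg-production",
    "sub-shop-dev": "mg-development",
}

# Policies assigned directly at each scope.
assignments = {
    "root": {"require-anti-malware-extension"},
    "mg-production": {"require-https-storage"},
    "mg-development": {"deny-public-endpoints"},
}


def effective_policies(scope, parent, assignments):
    """Collect policies assigned at this scope and at every ancestor."""
    policies = set()
    while scope is not None:
        policies |= assignments.get(scope, set())
        scope = parent.get(scope)
    return policies


def preview_move(subscription, new_group, parent, assignments):
    """Show which inherited policies a move would remove and add."""
    before = effective_policies(subscription, parent, assignments)
    moved = {**parent, subscription: new_group}  # copy; the original is untouched
    after = effective_policies(subscription, moved, assignments)
    return {"lost": before - after, "gained": after - before}


current = effective_policies("sub-shop-dev", parent, assignments)
change = preview_move("sub-shop-dev", "mg-production", parent, assignments)
```

In this example the baseline policy at the root stays in place through the move, which is why baseline policies are best assigned as high in the hierarchy as possible.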

Engineering decisions

An important decision to make regarding your cloud environment structure is how your team will be able to operate at different levels of development. Most organizations split up their production and development environments, and some have a third environment that’s exclusively for exploring and experimenting with new tools and ideas.

The rights and permissions set for each of these environments should probably be quite different. For example, an engineering leader may say that their development environments shouldn’t have internet endpoint exposure. You wouldn’t want to have the same policy for your production environment if you need to do business with your customer through the internet.

In that case, although it may seem logical to group the production and development environments for the same product under the same subscription and management group, doing so may force you to do some redundant manual work.

Instead, you could put all of your development environments in one subscription under one management group, and all of your production environments in another, regardless of what product they're for.

The most important control element—Azure Policy 

You can’t talk about Azure governance without discussing Azure Policy. It should be at the foundation of your Azure governance strategy.  

Azure policies are extremely flexible in the kinds of guidelines they define and the resources and services they can be applied to. "Whether it's through the Azure portal, CLI, PowerShell, or Azure DevOps, it doesn’t matter how you’re creating Azure resources, we’ll consistently have blocks or remediations for them," explained Kim.

The three main components of Azure Policy are: 

  1. Enforcement and compliance 
  2. Applying policies at scale 
  3. Remediation and automation 

Enforcement and compliance refers to Azure Policy’s ability to create hard and fast rules for your environment. For example, you could create a policy that says storage accounts must accept only HTTPS traffic.

If you turn on real-time enforcement for that policy, it will evaluate all storage accounts as they’re created. If one of your developers tries to create a storage account that allows any other type of traffic, the policy will either deny the creation of that resource or auto-remediate it.
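
To show what a rule like the HTTPS-only example might look like, here is a Python sketch with a rule written as a dictionary and a tiny evaluator for it. The `if`/`then` layout and the `allOf`, `anyOf`, `not`, `field`, `equals`, and `notEquals` keywords are modeled on the structure of Azure Policy definitions, but the evaluator itself is a toy, and the flattened resource shape and the bare `supportsHttpsTrafficOnly` property name are simplifications assumed for this example (real definitions refer to properties through aliases).

```python
HTTPS_ONLY_RULE = {
    "if": {
        "allOf": [
            {"field": "type", "equals": "Microsoft.Storage/storageAccounts"},
            {"field": "supportsHttpsTrafficOnly", "notEquals": True},
        ]
    },
    "then": {"effect": "deny"},
}


def matches(condition, resource):
    """Recursively test a condition against a flat dictionary of properties."""
    if "allOf" in condition:
        return all(matches(c, resource) for c in condition["allOf"])
    if "anyOf" in condition:
        return any(matches(c, resource) for c in condition["anyOf"])
    if "not" in condition:
        return not matches(condition["not"], resource)
    value = resource.get(condition["field"])  # None when the property is absent
    if "equals" in condition:
        return value == condition["equals"]
    if "notEquals" in condition:
        return value != condition["notEquals"]
    raise ValueError(f"unsupported condition: {condition}")


def evaluate(rule, resource):
    """Return the rule's effect if the resource matches, otherwise None."""
    return rule["then"]["effect"] if matches(rule["if"], resource) else None


insecure = {"type": "Microsoft.Storage/storageAccounts",
            "supportsHttpsTrafficOnly": False}
secure = {"type": "Microsoft.Storage/storageAccounts",
          "supportsHttpsTrafficOnly": True}
unrelated = {"type": "Microsoft.Compute/virtualMachines"}

results = [evaluate(HTTPS_ONLY_RULE, r) for r in (insecure, secure, unrelated)]
```

Because the rule is data rather than code, the same evaluator can apply any number of rules, and changing the effect string is how this sketch would model auditing or remediating instead of blocking.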

Being able to apply policies at scale is incredibly important if you have a large, decentralized cloud environment. Azure facilitates applying policies at scale by integrating policies at the management group level. As discussed above, this allows you to set high-level security and compliance policies for your whole environment, or large sectors of it, at once rather than having to manually apply them to each subscription or resource group.

Azure Policy also has the capability to exempt particular resources or resource groups from policies. So if you have one or two exceptional resources, you can exclude them from the policy while keeping that particular environment intact.

Remediation and automation ensure that Azure Policy doesn’t become a burden on the cloud custodian team or developers. Sometimes, resources will be created that don’t follow your organization’s guidelines.

Maybe a developer is creating several resources at once and creates a SQL database but simply forgets to turn on transparent data encryption for it. For these types of oversights, remediation capabilities come in handy. You can set an Azure policy to automatically remediate this issue by turning on transparent data encryption as the database is deployed.

You can also use auto-remediation to ensure existing resources comply with new policies by creating a remediation task.
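
Conceptually, a remediation pass walks the existing resources, skips anything exempt or already compliant, and fixes the rest. The Python sketch below illustrates that loop for the transparent data encryption example. It is an assumption-laden model rather than the Azure remediation API: `run_remediation`, `tde_is_on`, `turn_on_tde`, and the resource dictionaries are hypothetical.

```python
def run_remediation(resources, is_compliant, fix, exempt=frozenset()):
    """Return (updated resources, names that were fixed) without mutating input."""
    updated, fixed = [], []
    for resource in resources:
        if resource["name"] in exempt or is_compliant(resource):
            updated.append(resource)       # leave exempt and compliant items alone
        else:
            updated.append(fix(resource))  # apply the fix to a copy
            fixed.append(resource["name"])
    return updated, fixed


def tde_is_on(resource):
    """Only SQL databases are in scope; everything else counts as compliant."""
    return resource["type"] != "sql-database" or resource["tde_enabled"]


def turn_on_tde(resource):
    return {**resource, "tde_enabled": True}


resources = [
    {"name": "orders-db", "type": "sql-database", "tde_enabled": False},
    {"name": "users-db", "type": "sql-database", "tde_enabled": True},
    {"name": "legacy-db", "type": "sql-database", "tde_enabled": False},
    {"name": "logs", "type": "storage-account"},
]

updated, fixed = run_remediation(
    resources, tde_is_on, turn_on_tde, exempt={"legacy-db"}
)
```

In this run only `orders-db` is changed: `users-db` already complies, `logs` is out of scope, and `legacy-db` is deliberately exempted.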

Azure Policy is a powerful tool and should play an essential role in your Azure governance strategy.

When you start working in the cloud, it’s tempting to dive right in and start creating workspaces and spinning up resources. In their Deploy presentation, Joseph and Liz made it clear why taking the time to properly organize your cloud environment will help you work more efficiently and securely.

So, make use of all the governance tools that Azure offers. And remember, once you’ve leveraged the built-in guardrails and infrastructure that Azure provides, there are always customizations that you can put in place.

The cloud offers seemingly endless possibilities. Take advantage of that to create the best fitting environment for your organization.
