Every cloud engineer knows that security is essential, but knowing how to secure your Azure environment isn’t always clear. Azure MVP and security consultant Joosua Santasalo (@SantasaloJoosua) walks us through a few cloud security mistakes most companies don’t realize they’re making.
During various assessment workshops I’ve led over the years, I’ve noticed a set of typical Azure governance and security oversights that are usually simple to fix and that provide high security benefits with low implementation costs.
Addressing these oversights is even more critical now that our ways of working together have changed in a COVID-19 world. Your emergency account procedure, for example, can’t necessarily be tied to a single physical location.
So, in this blog I’m going to discuss the importance of securing roles, locks, and cross-resource communication in Azure.
Are Microsoft 365 roles a risk for your Azure AD security?
Despite the fact that we all use Azure Active Directory (Azure AD), we don’t always understand how its access and permission models interact in Azure services.
One of the most common and persistent security oversights I’ve seen with regard to Azure roles stems from confusion about the Global Admin’s role in relation to Azure resources. Hint: the alternative name for that role, ‘Company Admin,’ is actually more descriptive of its function.
Global Admin is hands down the most powerful role available, so you want to avoid assigning it to people who don’t need it. These days, there are lots of other roles that can allow people to perform their daily tasks without assigning them this role.
Companies that consume both Azure and Microsoft 365 in the same tenant
In most Azure and Microsoft 365 assessments I’ve participated in, both services share the same Azure AD tenant, even when the scope of the assessment covers only one of them. Having the same Azure AD tenant for both services is the right way to do it 99% of the time, since the most powerful Azure AD features are useful in both Azure and Microsoft 365.
In case you were wondering what features I’m talking about, here are just a few:
- Azure AD Conditional Access
- Azure AD Privileged Identity Management
- Azure AD Identity Protection
While Microsoft has gone to great lengths, through communication and security controls, to ensure that there are few ‘Company Admins,’ most organizations I’ve dealt with had no prior knowledge of what a ‘Company Admin’ can actually do in the context of Azure.
Let me give you an example of how the conversation normally goes when I present my findings to an Azure team about the subscription IAM assessment.
Me: We have 10 accounts in total, which, through privilege elevation, can manage all Azure subscriptions in the tenant, covertly or non-covertly, besides the accounts listed in subscription identity and access management (IAM).
Azure Team: No, we only have two owners in the R&D subscription, and even those use Privileged Identity Management and one SPN for Azure DevOps to push deployments into the subscription. This is our most protected subscription, and there is no way we’d let accounts outside Azure’s RBAC manage those resources.
Me: Pulls a list of 10 Company Admins
Azure Team: But those are the accounts of our Microsoft 365 team and our Microsoft 365 managed service provider; you’ve got the wrong list. The operators of these accounts don’t even know we run Azure services.
Me: These accounts can wreak havoc by taking ownership of any or all Azure subscriptions with two by-design methods:
- They can elevate themselves to User Access Administrator with a single click
- They can add new client credentials to the Azure DevOps SPN
Me: MIC DROP
Azure Team: Whoa! We were totally unaware of this. Since this is our R&D subscription, we need to do some internal governance drafting. As a first measure, we need to ensure that our Microsoft 365 team also understands how Company Admins can get access to protected areas.
Me: Delivers mitigation proposals
A few key takeaways
- Company Admins can manage Azure subscriptions
- Company Admins can create new credentials for Azure AD SPNs connected to subscription management and impersonate the DevOps SPN
- While Company Admins could also reset the password of owners (for cloud accounts), this is usually detected pretty quickly, as the owner won’t be able to log in with their credentials. Abuse of the first two methods is much harder to detect.
- Being able to elevate your user access isn’t a secret strategy; it’s been in Microsoft’s documentation for at least two years now. Nothing is left to speculation, as the documentation explains that this type of elevation is also intended for scenarios where subscription access is lost accidentally.
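The gap between the two lists in the conversation above can be sketched in a few lines of Python. This is only an illustration on invented sample data: the account names are hypothetical, and in a real assessment both sets would come from exports of the subscription IAM list and the Global Admin role members.

```python
# Hypothetical sample data; every account name here is invented for illustration.
subscription_iam = {
    "owner1@contoso.example",
    "owner2@contoso.example",
    "spn-azure-devops",
}
global_admins = {f"m365-admin{i}@contoso.example" for i in range(1, 11)}


def potential_managers(iam_principals: set, admins: set) -> set:
    """Everyone who manages the subscription today, or could after elevating."""
    return iam_principals | admins


def unlisted_managers(iam_principals: set, admins: set) -> set:
    """Global Admins that do not appear in the subscription's IAM list."""
    return admins - iam_principals


print(len(subscription_iam), "principals are listed in subscription IAM")
print(len(unlisted_managers(subscription_iam, global_admins)), "more could elevate into it")
```

The point of the set difference is that the second number is invisible if you only look at the subscription's own IAM blade.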
Mitigation and ramifications
So, how do you avoid these security threats?
First, it’s possible to detect and set up automatic alerts for when Global Admins elevate their user access. This can help you stay on top of anyone giving themselves permissions that they shouldn’t have.
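As a rough sketch of what such detection looks for: to my knowledge, the elevation is recorded under the operation name `Microsoft.Authorization/elevateAccess/action`. The snippet below filters already-exported log entries for that operation. The entry shape (a dict with `caller` and `operationName` keys) and the sample records are assumptions made for illustration, not the exact export format.

```python
# Operation name that, to my knowledge, records a Global Admin elevating access.
ELEVATE_ACCESS = "Microsoft.Authorization/elevateAccess/action"


def find_elevations(entries: list) -> list:
    """Return the log entries whose operation is the elevate-access action."""
    return [
        entry
        for entry in entries
        if str(entry.get("operationName", "")).lower() == ELEVATE_ACCESS.lower()
    ]


# Hypothetical exported log entries; the dict shape is an assumption.
sample_log = [
    {"caller": "admin1@contoso.example",
     "operationName": "Microsoft.Authorization/elevateAccess/action"},
    {"caller": "dev@contoso.example",
     "operationName": "Microsoft.Resources/deployments/write"},
]

for entry in find_elevations(sample_log):
    print("ALERT: access elevated by", entry["caller"])
```

In practice the same filter would live in an alert rule rather than a script, but the condition being matched is the same.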
Microsoft has a lot of information in its documentation about avoiding the abuse and attack of global admin accounts.
Azure AD’s identity secure score recommends limiting the number of global admins for security reasons:
“Reducing the number of global admins limits the number of accounts with high privilege that need to be closely monitored. If any of those accounts are compromised, critical devices and data are open to attacks. Designating fewer than 5 global admins reduces the attack surface area.” (Azure Active Directory identity secure score)
Lastly, Microsoft recently released a paper on securing Azure environments with Azure AD. It details a way to separate Azure AD tenants and then grant less privileged access by resource delegation.
In this scenario, Global Admins of the source tenant “contoso” can’t directly elevate to Azure subscriptions in the resource tenant “contoso sandbox,” because the delegated access there is reduced.
How and when to use Azure locks
How high would you rate the possibility of unplanned mass deletion of resources in your Azure environment? Or even disruptive modifications?
The emphasis here is on the word “unplanned.”
My personal opinion, based on lessons learned in the field, is that at some point, some kind of extinction event will take place. Whether you’re a seasoned Azure architect or a developer deploying new infrastructure-as-code, this kind of human error unfortunately does happen.
I myself once deleted a public DNS zone by specifying the wrong zone name when deleting a zone from the CLI. Imagine the situation where your company’s DNS records stop existing on the internet. It was not fun!
Azure locks can prevent mistakes, as well as “gone-berserk” malicious activities
I was recently doing an assessment and noticed that the customer had enabled locks on multiple production resources. Since it’s rare to be this proactive, I applauded the customer. But I was curious to find out why the locks had been put in place.
As it turns out, the locks were in place because a mass extinction event had taken place, resulting in 48 hours of production outage. During this time, the company’s disaster recovery process was tested in a trial by fire.
Given that I was talking with someone who was leading that project, I was able to conclude that the disaster recovery had been successful. After that, a new policy was implemented to use locks on production resources.
Implementing Azure locks
Before recommending the use of locks in Azure, it’s good to understand what kind of lock types there are.
There are two kinds of locks in Azure: read-only and don’t-delete locks. Read-only locks prevent users from deleting a resource as well as from making any other changes to it through the management plane.
For example, read-only locks can prevent an operator from accidentally opening the resource firewall to public access. Don’t-delete locks only prevent the deletion of a resource or resource group.
Lock type and capabilities
|Lock||Prevents deletion||Prevents changes|
|Read-only||Yes||Yes|
|Don’t-delete||Yes||No|
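The two lock types can be modeled in a few lines of Python. This is a simplified sketch of the behavior described above, not Azure's implementation; the enum values use Azure's own lock level names (`CanNotDelete` and `ReadOnly`), while the operation labels `read`, `modify`, and `delete` are my own shorthand.

```python
from enum import Enum
from typing import Optional


class Lock(Enum):
    DONT_DELETE = "CanNotDelete"  # Azure's name for the don't-delete lock
    READ_ONLY = "ReadOnly"


def is_blocked(operation: str, lock: Optional[Lock]) -> bool:
    """Simplified model: is a management plane operation blocked by the lock?

    operation is one of 'read', 'modify', or 'delete' (labels are assumptions).
    """
    if lock is Lock.READ_ONLY:
        return operation in ("modify", "delete")
    if lock is Lock.DONT_DELETE:
        return operation == "delete"
    return False


for lock in (Lock.READ_ONLY, Lock.DONT_DELETE):
    blocked = [op for op in ("read", "modify", "delete") if is_blocked(op, lock)]
    print(lock.value, "blocks:", blocked)
```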
You can only implement read-only and don’t-delete locks in certain scopes of Azure.
|Resource data plane||(Some exceptions)|
Quick overview of Azure management plane and Azure data plane
In order to understand lock scopes, it’s critical to understand how permissions work in Azure because Azure Locks mostly apply to the management plane.
The management plane interface is accessed for configuration and settings of security services. This includes:
- IAM settings where “who/what” can access the service is defined in terms of authorization of managing the resource.
- Authentication and authorization settings concerning the data plane access.
- If the service that is presented in the data plane requires authentication for someone to access it, an option in the management plane is offered to change these settings.
- Services can have their own identities, which are used to authorize cross-service communication, and these settings are controlled on the management plane.
- General options, modes, and settings of services.
- Security options enabled on services.
The data plane interface is accessed for the runtime data and the data at rest of services. This includes two types of data: runtime data (often in transit) and persistent data (often at rest):
|Runtime data||Persistent data|
|The data the service generates when it’s running||The data that remains even if the service was stopped|
|Typically, this data is lost when the service restarts unless stored to persistent storage||Even if you delete the service, often the data that was generated from the service still exists in some form of storage media|
Taking Azure Key Vault as an example, the management plane manages the Key Vault itself, whereas the data plane manages the data stored in the Key Vault. The main point to remember about these planes with regard to locks is that locks work mainly in the management plane.
There are always exceptions to rules though, right? Read-only locks can in some cases prevent access to the data plane. For instance, consider a subscription owner who uses the Azure Portal to browse the containers of an Azure Storage account that has a read-only lock.
For the owner of the subscription to read the data in a storage container without blob permissions, a background operation is made to fetch the storage account keys, which give access to the container without user-level permissions. This operation fails with an error, because the read-only lock blocks the request for the storage account keys, and that in turn blocks access to the data plane.
This pattern is typically seen in services that could have read-only access and expose an access key, or some other form of elevation mechanism that can be used to evade the read-only permissions or read-only lock.
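The storage example can be reduced to a small decision model. As far as I know, listing storage account keys is a POST request against the management plane, which is why a read-only lock stops it. The function names below are hypothetical, and treating GET as the only request a read-only lock lets through is a deliberate simplification.

```python
def read_only_lock_allows(http_method: str) -> bool:
    """Simplified assumption: under a read-only lock, only GET requests pass."""
    return http_method.upper() == "GET"


def can_browse_container(has_blob_data_role: bool, read_only_lock: bool) -> bool:
    """Model of browsing a container in the portal (hypothetical helper)."""
    if has_blob_data_role:
        # The user is authorized on the data plane with their own identity.
        return True
    # Fallback for users without blob permissions: fetch the account keys
    # through the management plane, which is a POST request.
    list_keys_method = "POST"
    if read_only_lock and not read_only_lock_allows(list_keys_method):
        return False
    return True


print("Owner without blob role, locked account:", can_browse_container(False, True))
print("Owner without blob role, no lock:", can_browse_container(False, False))
```

The model makes the elevation mechanism visible: the owner only reaches the data because of the key fallback, and the lock removes exactly that fallback.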
Implementing Azure locks for Network Gateways
Now that we’ve covered how locks work, we can show a few examples of how to use them.
The typical example of when and how to use locks is protecting network gateways, which can cause a lot of trouble if an unplanned deletion happens, especially if disaster recovery or gateway redundancy isn’t planned and tested.
Typically, we want to avoid redeploying VPN gateways because, unless the redeployment is planned, many of the values change, such as the VPN keys and IPs.
If a network gateway is redeployed without planning, a situation may arise where the public IP and VPN keys are changed. This typically requires all VPN peers to update the settings on their side too. This is something you want to avoid.
In my use case, I’ve placed a don’t-delete lock on the whole resource group (called Infra), and then a read-only lock to prevent changes to the VPN gateway. With this configuration I can still modify the other resources in the resource group while ensuring that no changes are made to the VPN gateway.
My resource group Infra is a hub, where I control general connectivity to the spokes, which are in different VNets. This is a place where I want to keep things mostly “immutable.”
|Resource group||Dont-delete lock for all resources in the infra resource group|
|Resource||Read-only lock for all VPN gateway components|
|Resource data plane|
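The layered configuration above can be sketched as follows. Resources inherit locks from their parent scope, and the most restrictive lock wins. The scope paths and the resource names `vpn-gateway` and `route-table` are hypothetical, and the model ignores everything except the two lock levels.

```python
from typing import Optional

# Higher rank means more restrictive.
RANK = {None: 0, "CanNotDelete": 1, "ReadOnly": 2}

# Hypothetical hub layout: a don't-delete lock on the resource group,
# plus a read-only lock on the VPN gateway inside it.
locks = {
    "Infra": "CanNotDelete",
    "Infra/vpn-gateway": "ReadOnly",
}


def effective_lock(scope: str, assigned: dict) -> Optional[str]:
    """Most restrictive lock among the scope itself and its parent scopes."""
    parts = scope.split("/")
    candidates = [assigned.get("/".join(parts[:i])) for i in range(1, len(parts) + 1)]
    return max(candidates, key=lambda lock: RANK[lock])


def is_allowed(operation: str, scope: str, assigned: dict) -> bool:
    """operation is 'read', 'modify', or 'delete' (labels are assumptions)."""
    lock = effective_lock(scope, assigned)
    if lock == "ReadOnly":
        return operation == "read"
    if lock == "CanNotDelete":
        return operation != "delete"
    return True


print("modify route table:", is_allowed("modify", "Infra/route-table", locks))
print("delete route table:", is_allowed("delete", "Infra/route-table", locks))
print("modify VPN gateway:", is_allowed("modify", "Infra/vpn-gateway", locks))
```

Splitting the locks across two scopes is what keeps the hub workable: everything in Infra is protected from deletion, but only the gateway is frozen.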
Azure Managed Identities and VNet integration to secure internal traffic
Azure offers an excellent way to strengthen internal resource communication by authorization and network integration through Managed Identities and VNet integration. This helps to ensure that your internal communications and data are secure.
When I’m consulting on a new tenant and want to enhance security, I usually begin by doing a discovery of the services used and then I ask two questions.
N.B. While I’m going to cover PaaS services, this does not exclude IaaS services. I’m focusing on PaaS services because they usually support fine-grained isolation, which isn’t typically understood very well.
1. Can the resource firewall or access filtering be enabled to only accept internal traffic?
If your answer is ‘yes,’ then you’re greatly reducing the attack surface.
“Internal traffic” can be all traffic inside Azure, traffic coming from certain VNets via VNet service endpoints, or traffic coming from on-premises via private endpoints. Typically, a form of network integration is enabled or an exception is enabled for the resource.
An example of such an exception is allowing traffic from Azure Logic Apps to Azure SQL via the setting Allow Azure services and resources to access this server.
The Allow Azure services and resources to access this server setting can be found under Firewalls and virtual networks of the Azure SQL Database.
If your answer is ‘no,’ then the resource’s firewall settings remain as they are, and all traffic, including unrestricted internet traffic, is allowed, which leaves your organization with a greater attack surface.
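The difference between the two answers comes down to the default action of the resource firewall. Below is a minimal sketch using Python's standard `ipaddress` module. Representing authorized subnets as CIDR ranges is a simplification I've chosen for illustration (Azure identifies VNet rules differently), and the addresses are examples only.

```python
from ipaddress import ip_address, ip_network


def firewall_allows(source_ip: str, allowed_networks: list, default_action: str = "Deny") -> bool:
    """Simplified resource firewall: explicit allow rules first, then the default."""
    address = ip_address(source_ip)
    if any(address in ip_network(network) for network in allowed_networks):
        return True
    return default_action == "Allow"


internal_only = ["10.1.2.0/24"]  # hypothetical subnet of an internal consumer

print("internal caller:", firewall_allows("10.1.2.4", internal_only))
print("internet caller, default Deny:", firewall_allows("203.0.113.7", internal_only))
print("internet caller, default Allow:", firewall_allows("203.0.113.7", [], "Allow"))
```

With the default action left at `Allow`, the rule list is irrelevant, which is the ‘no’ case described above.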
2. Can the service (consumer) access other Azure services (resources) using Managed Identity?
In order to answer this question two things need to be researched:
- Does the resource being consumed support authorization by using Managed Identity?
- Does the consumer of the resource support enabling an identity that the resource can use to authorize the consumer?
For example, both Azure Key Vault and Azure Storage support authorization by managed identities.
In the following example, our App Service uses both network restriction and managed identity-based authorization to consume other Azure resources.
Here, Azure Function is the consumer, and it’s enabled for managed identity and needs to access Azure Key Vault.
On the network side, the Azure Function is attached to a subnet that’s authorized in Azure Key Vault’s firewall. On the identity side, its managed identity is authorized in Azure Key Vault’s access policy.
Azure App Service as the consumer is thus authorized by both the network and the identity settings of Azure Key Vault as the resource. The subnet that’s attached to the function is authorized in the Key Vault’s firewall.
The identity that’s attached to the function is granted permissions to list and get secrets in the Key Vault.
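Both checks together can be modeled in a short sketch. This isn't Azure SDK code; the class names, subnet label, and identity ID are hypothetical, and the access policy is reduced to a mapping from identity to permitted secret operations.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Consumer:
    name: str
    subnet_id: str
    identity_id: Optional[str] = None  # None means no managed identity is enabled


@dataclass
class KeyVault:
    allowed_subnets: set = field(default_factory=set)
    access_policies: dict = field(default_factory=dict)  # identity -> permissions


def can_access(consumer: Consumer, vault: KeyVault, permission: str) -> bool:
    """The request must pass the network check AND the identity check."""
    network_ok = consumer.subnet_id in vault.allowed_subnets
    granted = vault.access_policies.get(consumer.identity_id, set())
    identity_ok = consumer.identity_id is not None and permission in granted
    return network_ok and identity_ok


# Hypothetical configuration mirroring the example in the text.
function_app = Consumer("function", subnet_id="snet-functions", identity_id="mi-function")
vault = KeyVault(
    allowed_subnets={"snet-functions"},
    access_policies={"mi-function": {"get", "list"}},
)

print("get secret:", can_access(function_app, vault, "get"))
print("set secret:", can_access(function_app, vault, "set"))
```

Requiring both conditions means a stolen identity is useless from outside the authorized subnet, and an attacker inside the subnet still needs the identity.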
In this blog we covered three areas that in my opinion play a substantial role in your total level of Azure security. Ensuring that all of your roles, locks, and cross-resource communications are secure is vital to the health and stability of your cloud infrastructure.