Apr 12, 2023
Azure SLAs – Azure Service-Level Agreements

Azure SLAs

An SLA sets out a customer’s expected level of service from their service provider; it can include responsibilities, vocabulary and terminology, claims and credit processes, service quality, and availability metrics.

Microsoft defines an SLA as Microsoft’s commitments to uptime and connectivity, meaning the amount of time the services are online, available, and operational.

Microsoft provides each service with an individual SLA that details what the agreement covers and any exceptions; a percentage of the monthly fees is credited for any service that does not meet its guarantees. Previews and free services are not provided with an SLA. Information about each service’s SLA can be found at the following URLs:

Service availability is expressed as the uptime percentage over time; Microsoft SLAs are expressed monthly.

Availability is typically referred to in terms of 9s (nines); for example, four nines of availability means the service will be available and fully operational for 99.99% of the defined period. In contrast to availability and uptime, it is also important to consider downtime, which is the amount of time the service will not be available.

While an SLA is usually expressed in terms of availability and uptime, the customers and consumers of a service will want to know what that means in the real world and what impact a breach may have on them. The metric that often matters most is therefore downtime: for a given SLA, how long is the service permitted to be down (that is, not available from the service provider)? You should scrutinize any SLA to determine whether that level of downtime is acceptable.

The following table illustrates examples of SLA commitments and downtime permitted per month as part of an SLA:
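Downtime allowances of this kind can be derived from the SLA percentage with simple arithmetic. The following is a minimal sketch, assuming a 30-day month (43,200 minutes); the function name `permitted_downtime_minutes` and the list of percentages are illustrative choices, not values taken from a specific Microsoft SLA document:

```python
def permitted_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return the downtime, in minutes, that an SLA permits over one month."""
    total_minutes = days_in_month * 24 * 60  # 43,200 for a 30-day month (assumption)
    return total_minutes * (1 - sla_percent / 100)


# Illustrative SLA levels only; check the SLA of each service you actually use.
for sla in (99.9, 99.95, 99.99, 99.999):
    minutes = permitted_downtime_minutes(sla)
    print(f"{sla}% -> {minutes:.2f} minutes of downtime per 30-day month")
```

Under these assumptions, a 99.9% SLA permits roughly 43.2 minutes of downtime per month, while 99.99% permits roughly 4.32 minutes. Months with 28 or 31 days will give slightly different figures.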

For reference, 99.9% is the minimum SLA that Microsoft provides; 99.999% is the maximum. Note that Microsoft cannot provide a 100% SLA.

You should also be aware of the concept of a composite SLA; this means that when you combine services (such as virtual machines and the underlying services such as storage, networking components, and so on), the overall SLA is lower than the SLA of any individual service in the combination. This is because each service that you add increases complexity and the probability of failure. An example exercise will be provided later in this chapter to illustrate this important concept.
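To make the composite SLA idea concrete, here is a minimal sketch that multiplies the SLAs of services that all have to be available at the same time. It assumes the services fail independently; the function name `composite_sla` and the two percentages are hypothetical figures chosen for illustration, not published SLA values:

```python
from math import prod


def composite_sla(sla_percents):
    """Multiply the SLAs of services that must all be up at the same time."""
    return prod(p / 100 for p in sla_percents) * 100


# Hypothetical figures for illustration only.
web_app_sla = 99.95
database_sla = 99.99

combined = composite_sla([web_app_sla, database_sla])
print(f"Composite SLA: {combined:.2f}%")
```

With these figures, the composite SLA comes out at about 99.94%, which is lower than either individual SLA.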

The following actions will positively impact and increase your SLA:

  • Using services that provide an SLA (or improve the service SLA), such as Azure AD Basic and Premium editions and Premium SSD managed disks
  • Adding redundant resources, such as deploying resources to additional regions
  • Adding availability solutions, such as using Availability Sets and Availability Zones
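The effect of redundancy in the list above can be estimated with a short calculation. This is a sketch only: it assumes that the redundant copies fail independently and that failover between them is instantaneous, which real deployments only approximate, and it gives a theoretical availability rather than a contractual SLA. The function name `redundant_availability` is a hypothetical helper:

```python
def redundant_availability(sla_percent: float, copies: int) -> float:
    """Estimate availability when any one of several identical copies is enough."""
    unavailability = 1 - sla_percent / 100
    # The system is down only if every copy is down at once (independence assumed).
    return (1 - unavailability ** copies) * 100


# Illustrative: one 99.9% resource versus the same resource in two regions.
print(f"Single copy: {redundant_availability(99.9, 1):.4f}%")
print(f"Two copies:  {redundant_availability(99.9, 2):.4f}%")
```

Under these assumptions, two copies of a 99.9% resource give an estimated availability of about 99.9999%.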

The following actions will negatively impact and decrease your SLA:

  • Adding multiple services, due to the nature of composite SLAs
  • Choosing non-SLA-backed services or free services

The following actions will have no impact on your SLA:

  • Adding multiple tenancies
  • Adding multiple subscriptions
  • Adding multiple admin accounts

The Azure status page (https://status.azure.com) provides a global overview of the service health across all regions; this should be the first place you visit, should you suspect there is a wider issue affecting the availability of services globally. From the status page, you can click through to Azure Service Health in the Azure portal, which provides a personalized view of the availability of the services that are being used within your Azure subscriptions.

Service credits are paid by a service provider through a claims process when it does not meet the guarantees of the agreed service level. As we mentioned previously, previews and free services are not provided with a financially backed SLA and are not entitled to service credits for any service downtime. You should evaluate all your services to ensure that, where required, you always have an SLA-backed service; free services often come with an operational impact when they fail.

If you suspect that your services have been affected and that Microsoft has not been able to meet its SLA, then it is your responsibility to take action and pursue credit; you must submit a claim to receive service credit. For most services, you must submit the claim in the month following the month in which the service was impacted. If your services are provided through the Microsoft Cloud Solution Provider (CSP) channel, your CSP will pursue this claim on your behalf and provide the service refunds accordingly.
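Before submitting a claim, it can help to work out the uptime you actually received and what credit that might correspond to. The following is a sketch under stated assumptions: the credit tiers in `CREDIT_TIERS` are hypothetical and vary by service, so always check the SLA of the specific service, and both function names are illustrative:

```python
def monthly_uptime_percent(total_minutes: int, downtime_minutes: float) -> float:
    """Uptime as a percentage of the total minutes in the month."""
    return (total_minutes - downtime_minutes) / total_minutes * 100


# Hypothetical tiers: (uptime threshold %, credit as % of the monthly fee).
# Real tiers differ per service; these are for illustration only.
CREDIT_TIERS = [(95.0, 100), (99.0, 25), (99.9, 10)]


def service_credit_percent(uptime_percent: float) -> int:
    """Return the credit for the first (lowest) threshold the uptime falls below."""
    for threshold, credit in CREDIT_TIERS:
        if uptime_percent < threshold:
            return credit
    return 0  # The guarantee was met, so no credit applies.


uptime = monthly_uptime_percent(total_minutes=43_200, downtime_minutes=120)
print(f"Uptime: {uptime:.3f}%, credit: {service_credit_percent(uptime)}%")
```

In this example, 120 minutes of downtime in a 30-day month gives an uptime of roughly 99.72%, which falls into the hypothetical 10% credit tier.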

In this section, we looked at Azure SLAs. In the next section, we will look at the Azure service life cycle.
