SOS - Don’t Let a Microsoft Outage Drag You Down
You are at your desk when, all of a sudden, a hum seems to grow louder. Your heart starts to pound, and you quickly realize that you might be in the midst of another Microsoft outage.
You know that Microsoft is your organization’s core workforce engine and the backbone of its productivity: any outage or decrease in service quality can cause widespread productivity declines. You send out the ‘SOS’ call, and the clock starts ticking as you and your IT team watch the service tickets come in and scramble to find the source of the issue.
The first question you have to answer is: ‘Is it a Microsoft outage?’ or ‘Is it an IT issue?’ In any case, it is your problem.
It typically takes between 30 minutes and an hour for Microsoft to notify its users that an outage is occurring. So, what do you do during this time? Will your entire company wait for a confirmation tweet from Microsoft before taking any action?
When Every Minute Counts: the 4Ws (What, Where, Who, and Why?)
When tickets start to come in, you need to quickly qualify the issue to determine the best course of action.
What? Which workloads are impacted? Does it affect all the features of a workload or only partially?
Where? Which locations are experiencing issues? Is it worldwide or localized?
Who? Who are the users that are experiencing the issues?
Why? Is it a Microsoft outage or is the issue located within your own infrastructure?
Every minute counts, because each minute that passes can amplify the impact of the issue and the potential damage to your business and overall user productivity.
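As a rough illustration of the first three questions, incoming tickets can be grouped by workload, location, and user. The sketch below is a minimal Python example under stated assumptions: the `Ticket` fields and the `summarize` helper are hypothetical names invented here, not part of any Martello or Microsoft API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    """A hypothetical service-desk ticket (field names are assumptions)."""
    user: str
    location: str
    workload: str  # e.g. "Teams" or "Exchange Online"


def summarize(tickets):
    """Answer What / Where / Who for a batch of tickets."""
    return {
        "what": Counter(t.workload for t in tickets),    # tickets per workload
        "where": Counter(t.location for t in tickets),   # tickets per location
        "who": sorted({t.user for t in tickets}),        # distinct affected users
    }


if __name__ == "__main__":
    sample = [
        Ticket("alice", "Toronto", "Teams"),
        Ticket("bob", "Toronto", "Teams"),
        Ticket("carol", "Paris", "Exchange Online"),
    ]
    summary = summarize(sample)
    print(summary["what"].most_common(1))
    print(summary["where"].most_common(1))
    print(summary["who"])
```

The "Why?" question cannot be answered from tickets alone, which is why it needs monitoring data from both the service and your own infrastructure.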
Martello’s DEM Solutions for Microsoft 365 Can Help
Martello’s Digital Experience Monitoring (DEM) solution for Microsoft 365 helps organizations to qualify any service issues they have in minutes, allowing them to identify where the problem comes from, what is affected, where, and who is really experiencing its effects.
As you can see below, Martello tests the service 24/7 with robots running in each of your critical locations, alerting you as soon as they experience performance degradation. Martello detects outages before Microsoft issues its notification. At the same time, Martello tells you which workloads and which features of the service are unavailable or degraded, and in which location.
A minute after the issue is detected, before any of your users has noticed a problem, you already know that the impact is coming.
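To make the idea of location-based synthetic testing concrete, here is a minimal Python sketch of how one location's recent test results might be classified. It is not Martello's implementation: the `classify_probe` function, the baseline value, and the `degraded_factor` threshold are all assumptions chosen for illustration.

```python
from statistics import median


def classify_probe(latencies_ms, baseline_ms, degraded_factor=2.0):
    """Classify one location's recent synthetic test results.

    latencies_ms: recent response times in ms; None marks a failed test.
    baseline_ms: normal response time for this location (assumed known).
    degraded_factor: hypothetical multiplier above which the service
        counts as degraded.
    """
    if not latencies_ms:
        return "unknown"  # no tests ran, so nothing can be concluded
    successes = [x for x in latencies_ms if x is not None]
    if not successes:
        return "down"  # every test failed
    some_failed = len(successes) < len(latencies_ms)
    too_slow = median(successes) > degraded_factor * baseline_ms
    if some_failed or too_slow:
        return "degraded"
    return "ok"


if __name__ == "__main__":
    print(classify_probe([120, 130, 110], baseline_ms=125))
    print(classify_probe([400, 520, 610], baseline_ms=125))
    print(classify_probe([None, None], baseline_ms=125))
```

Running the same check from every critical location is what lets you say where the degradation is happening, not just that it is happening.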
Now you want to know why it is occurring, and specifically if it is an IT issue or a Microsoft issue.
This is where Martello goes far beyond any other Microsoft 365 monitoring tool.
Martello gathers data from your existing monitoring tool and correlates it into a Microsoft service delivery map to quickly pinpoint the root cause of the issue.
You can see clearly that the Microsoft 365 service delivery in Canada was affected and that end users were starting to have issues. The application itself was tested using our synthetic transactions, which confirmed that it was down, even though everything in your infrastructure seemed to be up and running perfectly for the service.
You are now 2 minutes past detection of the issue, and you are already almost 90% confident that it is a Microsoft issue.
You could go further and check your infrastructure:
It is now clear that your local network components (Cisco) are fine, your VPN is fine, and even your local ISPs are fine.
And you can drill down into any other country you operate in for the same kind of data.
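The reasoning in this step (the service test fails while every local component is healthy, so the cause is probably on the provider side) can be sketched in a few lines of Python. The `likely_cause` function, its labels, and the component names are hypothetical and only illustrate the logic. They do not describe how Martello computes its correlation.

```python
def likely_cause(synthetic_ok, component_health):
    """Suggest where an issue probably sits.

    synthetic_ok: True if synthetic transactions against the service succeed.
    component_health: dict mapping a local component name to True if healthy.
    Returns a (label, unhealthy_components) tuple.
    """
    unhealthy = sorted(name for name, ok in component_health.items() if not ok)
    if synthetic_ok:
        return "no service issue detected", unhealthy
    if unhealthy:
        # The service fails and something local is broken: check local first.
        return "possible local issue", unhealthy
    # The service fails while everything local looks healthy.
    return "likely provider-side issue", unhealthy


if __name__ == "__main__":
    health = {"network": True, "vpn": True, "isp": True}
    print(likely_cause(False, health))
```

The labels are deliberately hedged ("likely", "possible"): healthy local components raise confidence that the issue is on Microsoft's side but do not prove it.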
The last question you need an answer to is: who is starting to experience the issue?
In the image above, you can see that with Martello DEM, you can quickly identify which users are being affected by the outage in each of the locations you have selected.
A message is sent to the local IT personnel who will be affected by the outage, explaining the issue, its severity, which workloads and features are impacted, and which users are already experiencing it, even if they have not yet created a ticket. Similarly, a message can be sent to alert the business, providing an explanation and a potential workaround while waiting for Microsoft to announce and then resolve the issue.
You now know everything you need to take action and limit the impact of the Microsoft outage. And we are still in minute 5 after early detection, while the rest of the world waits for the notification from Microsoft!
Lay the Foundation
Martello DEM has just helped you to considerably reduce the impact of the Microsoft outage by allowing you to identify, qualify, understand and take decisive actions before any official announcement of the problem.
As the problem is resolved, Martello shows you in real time that performance is improving, so once again you know ahead of time that the service is about to be restored.
Don’t wait for the next outage to start laying the foundation for an exceptional Microsoft user experience. The time to act is now.