The IT Scramble is On with a Microsoft Outage: Incident MO821132 – July 18, 2024

July 19, 2024

On July 18, 2024, at 6:38 pm ET, Vantage DX, Martello’s Microsoft 365 and Teams performance management solution, started to see indicators of a likely Microsoft outage affecting users’ ability to access various Microsoft 365 apps and services. About an hour later, at 7:41 pm ET, Microsoft issued a statement on X:

[Image: Microsoft outage, Teams incident number MO821132]

That hour of uncertainty is a scramble for IT teams trying to figure out why they are seeing service degradation. Is it a problem with Microsoft’s services, with your own network, or with your ISP’s network? Vantage DX customers can get ahead of an outage because they have about an hour of advance notice that a problem is happening at the Microsoft data center. They can spend that time implementing back-up plans and notifying users instead of troubleshooting a problem that they can’t resolve.

Early Microsoft Outage Alerts – The Vantage DX Difference

Vantage DX customers across the globe, starting in Europe, began seeing the alerts below at 6:38 pm ET, about an hour before Microsoft confirmed the issue.

[Image: Microsoft Teams outage, incident MO821132, Vantage DX alerts]

Europe was quickly followed by the first alert in North America at 6:47 pm.

[Image: Microsoft outage, incident MO821132, Vantage DX alert in North America]
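To make the head start concrete, the short sketch below works out the gap between each first alert and Microsoft's statement. The three timestamps come from this post and are treated as ET; the variable and function names are only illustrative.

```python
from datetime import datetime

# Times quoted in this post, all treated as ET on July 18, 2024.
first_alert_europe = datetime(2024, 7, 18, 18, 38)
first_alert_north_america = datetime(2024, 7, 18, 18, 47)
microsoft_statement = datetime(2024, 7, 18, 19, 41)


def lead_minutes(alert: datetime, confirmation: datetime) -> int:
    """Whole minutes between an alert and the vendor's confirmation."""
    return int((confirmation - alert).total_seconds() // 60)


print("Europe lead time (min):", lead_minutes(first_alert_europe, microsoft_statement))
print("North America lead time (min):", lead_minutes(first_alert_north_america, microsoft_statement))
```

With these timestamps, the lead time is roughly an hour for both regions.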

Drilling into the Vantage DX dashboard at each impacted site lets IT admins see the scope of the outage, including its size, location, the applications impacted, and the symptoms users are experiencing. Vantage DX continuously runs synthetic tests using a robot that performs user actions such as connecting to Teams, creating new channels, conducting searches in SharePoint and many more. This lets you see what the user is actually experiencing by measuring response time on various tasks. This approach is preferable to common synthetic testing that simply simulates network traffic instead of measuring how users are impacted.

[Image: Microsoft outage, incident MO821132, Vantage DX site dashboard]
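The idea of timing user actions rather than raw network traffic can be illustrated with a minimal sketch. This is not how Vantage DX is implemented; `run_probe`, `ProbeResult`, the 2-second threshold and the two stand-in actions are all hypothetical, and a real probe would drive the actual Teams or SharePoint client instead of the fake functions used here.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    name: str
    seconds: float
    ok: bool
    degraded: bool


def run_probe(name: str, action: Callable[[], None], slow_after: float) -> ProbeResult:
    """Time one simulated user action and flag it if it fails or is slow."""
    start = time.perf_counter()
    try:
        action()
        ok = True
    except Exception:
        ok = False
    elapsed = time.perf_counter() - start
    return ProbeResult(name, elapsed, ok, degraded=(not ok) or elapsed > slow_after)


# Hypothetical stand-ins for real user actions.
def fake_sharepoint_search() -> None:
    time.sleep(0.01)  # pretend the search returned quickly


def fake_create_channel() -> None:
    raise TimeoutError("service did not respond")  # pretend Teams is unreachable


results = [
    run_probe("sharepoint_search", fake_sharepoint_search, slow_after=2.0),
    run_probe("teams_create_channel", fake_create_channel, slow_after=2.0),
]
for r in results:
    status = "DEGRADED" if r.degraded else "healthy"
    print(f"{r.name}: {status} ({r.seconds:.3f}s)")
```

Because each probe reports both success and elapsed time, a slow but working service and a completely failed one can be told apart.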

Finding the Root Cause with Network Path Tracing

Vantage DX provides hop-by-hop network path tracing and shows where problems are occurring so the right party can take action. This evening’s network path trace clearly shows that the problem originates at the Microsoft data center.

[Image: Microsoft incident MO821132, network path tracing]
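One common way to read a hop-by-hop trace is to look for the first hop where packet loss begins and then persists all the way to the destination, since loss at a single intermediate hop that clears up further along is often just a router deprioritizing probe traffic. The sketch below applies that rule to made-up data. The `Hop` type, the owner labels, the 5% threshold and the sample trace are assumptions for illustration, not Vantage DX output.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hop:
    index: int
    owner: str       # hypothetical label: "local", "isp" or "microsoft"
    loss_pct: float  # packet loss observed at this hop


def first_persistent_loss(hops: Sequence[Hop], max_loss_pct: float = 5.0) -> Optional[Hop]:
    """Return the first hop from which loss stays above the threshold
    through to the final hop, or None if loss never persists."""
    culprit: Optional[Hop] = None
    for hop in hops:
        if hop.loss_pct > max_loss_pct:
            if culprit is None:
                culprit = hop
        else:
            culprit = None  # loss cleared up, so earlier hops are not to blame
    return culprit


# Made-up trace: one isolated blip in the ISP, then sustained loss at the far end.
trace = [
    Hop(1, "local", 0.0),
    Hop(2, "isp", 20.0),
    Hop(3, "isp", 0.0),
    Hop(4, "microsoft", 40.0),
    Hop(5, "microsoft", 45.0),
]

culprit = first_persistent_loss(trace)
print(culprit.owner if culprit else "no persistent loss")
```

In this sample, the isolated loss at hop 2 is ignored and the problem is attributed to the segment starting at hop 4.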

Beyond network path tracing and proactive synthetic testing, Vantage DX pulls in relevant performance and user experience data from various native management tools, such as call quality, endpoint, Teams Meeting Room and Teams Phone data. It correlates this data into visual analytics with drill-down dashboards so you can get to the root cause of an issue quickly.

All of this happens automatically, without digging through multiple screens in multiple native Microsoft management tools. IT teams get a head start on notifying users that Microsoft is experiencing issues and on instituting back-up plans, instead of wasting time manually troubleshooting an issue that only Microsoft can resolve.

Stay Ahead of Microsoft Outages with Vantage DX

Microsoft 365 and Teams service degradation or outages can originate in many places: the endpoint, the local network, the service provider network or, as in today’s case, Microsoft’s global services infrastructure.

Vantage DX keeps you ahead of Microsoft outages with:

  • Hop-by-hop network path tracing from the Teams collaboration user, Teams Phone or Teams Meeting Room all the way to the Microsoft data center, giving you end-to-end visibility.
  • Correlation of network path data with call quality, endpoint and service availability data from native Microsoft tools, so you can pinpoint the root cause of a problem with speed and confidence, all within a single console.
  • Continuous proactive monitoring using synthetic testing that simulates user behaviors, so you can find problems before they impact users.

As of this writing, the last update from Microsoft, at 6:25 am ET on July 19, indicates that the underlying cause of the issue has been fixed and several Microsoft 365 apps and services have been restored to full functionality. Residual impact is still affecting some Microsoft 365 apps and services.

Microsoft has determined the preliminary root cause as: “A configuration change in a portion of our Azure backend workloads, caused interruption between storage and compute resources which resulted in connectivity failures that affected downstream Microsoft 365 services dependent on these connections.”

Are you ready for the next outage?

Find out with our Free Microsoft Outage Readiness assessment today! If you’d like to see Vantage DX in action, check out this demo.
