Top Office 365 Issues: Network Latency Issues
Whenever a user complains about slowness in accessing a mailbox, IT administrators often suspect it’s due to an issue with the network. Still, they don’t always have enough information to confirm this.
To confirm the issue, admins will check with other users to see if they’re facing the same issue. Then they’ll try to access Google to confirm the speed of the Internet, and ping outlook.office365.com to check if there is any packet loss. If this doesn’t identify the issue, they’ll try to get the hop details and raise a ticket with Microsoft to find out if there are any issues with Office 365. This takes up a lot of time since there’s no proactive approach to checking the latency or DNS issues before a user has complained.
Martello can help administrators fix Office 365 issues before users even notice, by pinging the Microsoft end-point every five minutes. If any packet loss is found, it will immediately notify the admins. Martello also automatically gets the average round-trip time, notifying the admins when the threshold limit is exceeded. Martello checks the DNS query resolution, hop details, and latency, and notifies administrators of all of these things when it detects a problem. By proactively monitoring the Microsoft Exchange environment, administrators have the power to identify and resolve issues before any end-user complaints arise.
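As a rough illustration of the kind of check described above, the sketch below summarizes one probe cycle: it computes the packet loss percentage and the average round-trip time, and lists the alerts that would fire. The function name `summarize_probe`, the 150 ms default threshold, and the convention that `None` marks a lost ping are assumptions made for this example; it is not Martello's implementation.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_probe(rtts_ms: Sequence[Optional[float]],
                    rtt_threshold_ms: float = 150.0) -> dict:
    """Summarize one probe cycle; None marks a ping that got no reply."""
    if not rtts_ms:
        raise ValueError("at least one ping result is required")

    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    avg_rtt = mean(replies) if replies else None

    alerts = []
    if loss_pct > 0:
        alerts.append(f"packet loss {loss_pct:.1f}%")
    if avg_rtt is not None and avg_rtt > rtt_threshold_ms:
        alerts.append(
            f"average RTT {avg_rtt:.1f} ms exceeds {rtt_threshold_ms:.0f} ms"
        )
    return {"loss_pct": loss_pct, "avg_rtt_ms": avg_rtt, "alerts": alerts}
```

A scheduler would call this once per cycle with the results of the latest pings and notify the administrators whenever `alerts` is not empty.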
Top Office 365 Issues: Mail Delay Complaints
A top complaint with Microsoft Exchange Online is delays in sending and receiving emails. This leads to a loss of productivity among employees, and also takes up a lot of time for IT administrators.
When a user reports this issue, IT administrators start by asking the user for the email header, then check whether the issue is isolated or widespread across the organization. If they find that the delay is on the Office 365 side, they raise a case with Microsoft to identify the problem. Unfortunately, the Office 365 administrator can’t see the mail queues in the portal, so they can’t proactively identify and fix email delays.
Martello helps save time and reduce complaints with an end-to-end mail routing check tied to an echo mailbox in the cloud. Administrators can send a test email every five minutes to validate that the message is received and how long it takes. This gives admins a continuous view of round-trip time and alerts them to any failure in sending or receiving email, so they are notified of delays proactively and can fix issues before end-users become aware of them.
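The header check can be partly automated. The sketch below, using only Python's standard `email` module, reads the `Received` headers of a message and returns the delay between consecutive hops, which shows where a message waited. The function name `hop_delays_seconds` is an assumption for this example, and it assumes every `Received` header ends with a parseable timestamp that includes a time zone.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def hop_delays_seconds(raw_headers: str) -> list:
    """Return the delay in seconds between consecutive Received hops, oldest first."""
    msg = message_from_string(raw_headers)
    stamps = []
    for received in msg.get_all("Received", []):
        # The timestamp follows the last semicolon of a Received header.
        _, _, date_part = received.rpartition(";")
        stamps.append(parsedate_to_datetime(date_part.strip()))

    # Each server prepends its header, so the first one is the newest hop.
    stamps.reverse()
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

A large value at one position points to the hop where the message was queued, which helps decide whether the delay is on the Office 365 side or elsewhere.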
Top Office 365 Issues: High Network Latency
Any IT administrator knows that when users complain about slowness accessing a mailbox, it’s usually tied to network latency. Still, there’s rarely enough information to confirm this is the case. And there’s no native proactive approach to checking and managing latency issues ahead of time.
That’s why IT administrators go through the process of checking with other users in the same area to see if they’re experiencing the same issue. They’ll also ask the user to try accessing google.com or outlook.office365.com to see if there are any issues. They’ll test a few other things before submitting a ticket to Microsoft to find out if there are larger issues impacting Office 365.
There are a few performance counters and aggregates that you have to constantly measure to understand how the network really impacts the end-user experience:
- Packet Loss: Measures how many packets are lost during transmission. The packet loss rate is expressed as a percentage.
- MOS (Mean Opinion Score): Measures the network’s impact on the listening quality of a VoIP (Voice over Internet Protocol) conversation. The network MOS rating ranges from 1 to 5, with 1 being the poorest quality and 5 being the highest quality.
- Jitter: Measures the variation in the arrival times of received packets. Interarrival jitter is measured in milliseconds (ms).
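To make these three counters concrete, here is a minimal sketch of how each can be computed. The function names are assumptions for this example. The jitter estimate follows the running formula from RFC 3550, and the MOS function uses the R-factor to MOS conversion from the ITU-T G.107 E-model, which tops out at 4.5 rather than 5; deriving the R-factor itself from network measurements is outside the scope of this sketch.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Share of packets that never arrived, as a percentage."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def interarrival_jitter_ms(send_ms, recv_ms) -> float:
    """Running interarrival jitter estimate in the style of RFC 3550."""
    transits = [r - s for s, r in zip(send_ms, recv_ms)]
    jitter = 0.0
    for prev, cur in zip(transits, transits[1:]):
        d = abs(cur - prev)            # change in transit time between packets
        jitter += (d - jitter) / 16.0  # smoothed with a gain of 1/16
    return jitter


def mos_from_r_factor(r: float) -> float:
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

For example, 197 replies out of 200 packets gives a loss of 1.5 percent, and packets that all take the same time to arrive give a jitter of 0 ms.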
When organizations deploy Martello for Office 365, the Robot User can start pinging the Microsoft end-point every two minutes to see if any packet loss is found. If there is, it can notify the administrator immediately. Martello can also get the average round-trip time and notify administrators if the threshold limit has been exceeded. It can send details on latency issues so that administrators can discuss them with the network team as needed. Martello also checks DNS query resolution and notifies administrators of any errors found.
It is important to constantly measure network latency from where the users are in order to detect, troubleshoot and fix issues before you get overwhelmed by user complaints.
Altogether, Martello helps administrators save time and deliver a better end-user experience by proactively monitoring and fixing network latency issues.
Top Ways to Check your Office 365 Health
The IBM Domino environment was mostly replaced by Microsoft on-premise technology and later replaced by the Microsoft Cloud Services. One thing that hasn’t been replaced is the need for administrators to have control and visibility of service delivery to end-users.
Whether you’re using IBM applications, Microsoft on-premise servers, or Microsoft Office 365, as the messaging administrator you need to provide availability and performance metrics of the service delivered to your management. You also need to understand the issues when they arise.
In a word, you’re still responsible for the service that you deliver, even if the servers are physically in the Microsoft datacenter. So, how do you measure the health of your Office 365 service delivery and how do you anticipate issues before they impact your end-users?
Microsoft works to maintain the 99.99 percent availability it promises from its datacenters, as well as to improve the features it provides and the performance it delivers. The issues that arise generally impact only some features, and usually affect only a subset of users with a specific configuration or in one location. It’s difficult to know exactly how these issues impact your global network, and impossible to predict them, but it helps to know that Microsoft offers a health dashboard. The dashboard provides an understandable view of the availability and performance of each Office 365 service as delivered from the datacenter. It also lets you configure alerts. The disadvantage is that the dashboard only shows the service at the datacenter; it gives no idea of the service your users experience or of the state of your own tenant. It doesn’t show how issues are affecting availability and performance in your company. Lastly, it doesn’t take into account user complaints, which can stem from many issues within the Office 365 environment.
Microsoft provides two tools that can be useful for measuring the health of the Office 365 services delivered to your users. First, the Remote Connectivity Analyzer allows you to run a series of checks from the cloud to the cloud that test different parts of the Office 365 services. This allows you to check if your tenant has problems with the basic functions of Office 365 by running Outlook connectivity tests, opening a mailbox, performing a mail routing test, or connecting to Skype for Business Online. The disadvantage is that it is performed from Microsoft Azure to Microsoft Office 365, so basically from one cloud to another. It doesn’t provide you with any information that could help you troubleshoot what is happening at the user location.
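To put an availability figure into perspective, it helps to convert it into a downtime budget. The short sketch below does that arithmetic; the function name `allowed_downtime_minutes` is an assumption for this example.

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float) -> float:
    """Minutes of downtime permitted over a period at a given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_pct / 100.0)
```

At 99.99 percent, a 30-day month leaves about 4.3 minutes of downtime, and a 365-day year about 52.6 minutes. That budget describes the service at the datacenter, not what a given user or location experiences.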
Microsoft’s other tool that can be useful when it comes to checking the connection is called the Support and Recovery Assistant. This tool lets you perform tests to help you understand what’s happening when a location gets poor performance from Office 365. The problem is that these tests aren’t automatic or continuous, and they don’t alert you when an issue begins to arise. You still have to rely on user complaints for that.
That leaves manual testing as the last option for administrators. Manual tests are available in PowerShell. When conducting tests, you should start by testing pings to Office 365, DNS resolution time, latency, and end-user egress points to understand the impact of the network on your Office 365 service delivery.
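The text mentions PowerShell for these manual tests; the sketch below shows the same idea in Python so that it can run from any user location with a Python interpreter. It times DNS resolution and a TCP connection to port 443. The TCP connect time is used here as a stand-in for a ping, because ICMP usually needs elevated privileges; that substitution, like the function names, is an assumption of this example.

```python
import socket
import time


def dns_resolution_ms(hostname: str, port: int = 443) -> float:
    """Time a name lookup through the system resolver, in milliseconds."""
    start = time.perf_counter()
    socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return (time.perf_counter() - start) * 1000.0


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Time a TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        return (time.perf_counter() - start) * 1000.0


def probe(hostname: str) -> dict:
    """Run both checks once against a single endpoint."""
    return {
        "dns_ms": dns_resolution_ms(hostname),
        "connect_ms": tcp_connect_ms(hostname),
    }
```

Calling `probe("outlook.office365.com")` from several user locations gives a first comparison of DNS and connection times per site. The first lookup may be slower than later ones because of resolver caching.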
Even if these tools are clearly not enough to measure, alert, report or troubleshoot the service delivered to end-users, they still provide a starting point for checking the health of the Office 365 service as the end-user receives it. They show that Microsoft understands that the service at the datacenter is not what matters most. That is why, when people think about the health of Office 365, they should always consider the entire route of the service, from the Microsoft datacenter to the user sitting at their desk.
By health, we mean the availability and performance of the service. The border between poor performance and unavailability is thin, especially for end-users.
In order to understand the health of your services, you need metrics, facts and alerts.
What matters is not so much poor performance as a degradation of performance, because that is what users perceive. To measure degradation, you need to know what’s normal. This is called baselining. It requires something that continuously measures the performance delivered to your end-users and that can alert you when performance is declining.
Finally, most of the Office 365 service is used in combination with internal infrastructure, applications and appliances that also have an impact on the service delivered. If you don’t monitor these at the same time and on the same dashboard, you will have difficulty troubleshooting issues. Some companies have hybrid Exchange, some still have Domino application servers, some are using gateways, proxy servers, and so on.
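A minimal sketch of baselining, under simple assumptions: the baseline is the median of recent measurements, and a new measurement counts as degraded when it exceeds that baseline by a tolerance factor. The function name `is_degraded`, the factor of 1.5, and the minimum of five samples are illustrative choices, not values from any product.

```python
from statistics import median


def is_degraded(history_ms, current_ms, tolerance=1.5, min_samples=5):
    """True when current_ms exceeds tolerance times the median of history_ms."""
    if len(history_ms) < min_samples:
        return False  # not enough data yet to say what "normal" is
    baseline = median(history_ms)
    return current_ms > tolerance * baseline
```

The median is used rather than the mean so that a single slow measurement in the history does not shift the baseline much.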
It all comes down to service delivered to your end-users. Whether issues arise from the datacenter or your network, it’s important to monitor and understand what users are experiencing so that you can maintain the highest level of service. Understanding the relationship between availability and performance from the user’s perspective helps you achieve better insights and oversight of your cloud or hybrid cloud health.
ITIL Lesson Learned
I completed my final ITIL exam today, and one of the questions was about verifying the cost of specific IT services to the business, and it reminded me of these statements:
- You cannot manage what you cannot control
- You cannot control what you cannot measure
- You cannot measure what you cannot define
It makes perfect sense. ITIL is about services: understanding the services that your IT infrastructure delivers to the business and managing them accordingly. These statements, for me, form the basis for many different aspects of ITIL, but most especially cost.
If you cannot measure, you cannot control cost.
In your collaboration environment, have you defined the specific services that you deliver to the business? Are you capable of measuring the quality of those services? (A running joke in Martello is that I manage to sneak the phrase “quantifiable metrics on the quality of service delivery” into every meeting.) Do you have metrics in place to justify and control the cost of delivering collaborative services to your organization?
We in Martello are constantly surprised by the number of customers who do not harness the power that Martello delivers to get even further value from Martello Gizmo. Martello Gizmo focuses on the availability and performance of your collaboration environment, and in doing so it collects hundreds of statistics critical to measuring the services that you deliver to the business. Martello makes it simple to mine this information and deliver reports (with hundreds of pre-defined templates): reports that help you identify and understand operational costs, reports that validate the quality of service that you deliver to the business, reports to manage, control and measure.
Quantifiable metrics of quality of Service delivery ;-))