Route to the Cloud 365 Best Practices
Microsoft Office 365 Performance
Originally posted on gsx.com
In this series of articles on Office 365 connectivity, we are explaining in detail each principle as recommended by Microsoft.
Introducing The Office 365 Connectivity Principles Series
Office 365 Connectivity: Differentiate Your Traffic
Route to Office 365 Best Practices: Local Egress
Route to Office 365 Best Practices: Enable Direct Connectivity
Route to Office 365 Best Practices: Avoid Security Duplication
Introducing The Office 365 Connectivity Principles Series
SaaS delivery has changed the world. Now it’s time to change your network strategy
Use of the cloud and access to SaaS applications from anywhere in the world are forcing enterprises to profoundly change the ways they organize their networks and security.
Remote workers used to connect to the enterprise data centre via VPN to access applications; this is not a viable solution with the cloud in play.
An MPLS network connecting branch offices to the headquarters to access SaaS applications is no longer practical because of performance issues.
Credit: https://docs.microsoft.com/en-us/office365/enterprise/office-365-network-connectivity-principles
Providing a quality end-user experience has become the key challenge for every company using SaaS applications such as Office 365.
That is why Microsoft and Gartner have completed numerous studies on this topic.
For example, Gartner has published:
- How to Manage SaaS Performance When SLAs Remain Immature
- Use Monitoring for SaaS Despite Its Limitations
- Implementing Microsoft Office 365: Gartner Survey Results and Analysis, 2019
- How to React to the Impact of the Cloud on IT Operations Monitoring
These describe the changes in IT caused by the cloud, where end-user experience is key. This is ensured through the fusion of applications, networks and devices.
The best practices described by Microsoft result from the analysis of thousands of deployments and the performance issues involved.
You can easily read about the details of end-user connectivity in the document.
You can also read an interesting series of articles about modern service management here.
We at Martello have assisted hundreds of customers in resolving their performance issues.
The key point is that even if your company’s IT has its own special features, even if your network or your business is special, in the end nobody is special enough to violate the connectivity principles established by Microsoft. Many of your peers have tried… and failed. We have seen it every single time; it is just a matter of time.
Office 365 Connectivity Principles: Because trust matters
In order to introduce the topic of connectivity, here is the latest summary chart from the Microsoft documentation:
Credit: MS Ignite, Optimal network connectivity for Office 365 performance: What is it and how to get there
In this series of blogs, we will go back to each of these principles in order to provide more details and use cases. You will see from the statistics that these principles are critical, no matter the size or the complexity of your infrastructure.
I have to say that we have been surprised during our discussions by the number of companies and Office 365 project managers that didn’t really pay close enough attention to these four principles.
Let’s have a quick overview.
The first principle is traffic optimization.
As we will see in the next blog post, Microsoft has made a considerable effort to reduce the number of FQDNs that must be prioritized (from thousands to fewer than 10!).
So now you have a very limited number of ports and URLs to deal with in order to dramatically enhance the Office 365 experience.
And this first point is key because it allows you to abide by the other three principles.
Read the post about Principle #1 Differentiate your traffic
The second principle is to enable local egress.
As discussed, this sometimes contradicts the old way of providing access to an enterprise’s network applications (VPN, MPLS, backhauling, etc.).
But again, you now just have to do this for a few URLs. Your differentiated traffic should be able to exit to the internet as soon as possible.
The third principle tells you that this traffic should connect directly to the nearest Office 365 front door. As we will see, there are now front doors everywhere.
It doesn’t matter if your user is in Singapore and your tenant in the USA. Your user should enter the Microsoft network through the Singapore front door and travel on the Microsoft network to the tenant because this network will always be faster and more secure than yours.
And finally, the fourth principle is to update your security for SaaS. You should review the way you secure the traffic to Office 365.
This is a major point that needs deep coordination among your Office 365 team, the network team and the security team in order to avoid duplicating security processes that already exist in Office 365. Built-in security tools and processes in the Microsoft network should allow you to trust the FQDNs you are permitting to connect directly.
As you know, trust, overhead and performance are related in all applications you provide to your business lines. The key is to find the best proportion among these elements. To match security and end-user experience at the best point while lowering your overhead, you need to differentiate connections as close to the user as possible.
Office 365 Connectivity: Differentiate Your Traffic
In this article, we will take a look at Office 365 traffic optimization.
The first principle is the mother of all best practices (MOAB-P ?) for two good reasons.
Firstly, just applying this principle should provide your users a significant improvement in performance and user experience. Secondly, because you can differentiate the traffic, you are able to apply the other three connectivity principles.
So why should you differentiate your traffic?
Reaching Office 365 is not the same as reaching YouTube or any other website on the Internet. When your packets enter Office 365, they are reaching one of the most secure cloud environments on the planet. Extra security layers are not relevant when you are dealing with a network more secure than your own…
Differentiating the traffic allows you to enable access to the shortest possible route to Office 365 because, once again, the security is handled by Microsoft. So you can shorten the route to the cloud and benefit from Office 365 at its maximum speed.
The real change that occurred last year helps to answer the question: Can you differentiate this traffic?
Well, a bit more than a year ago, most enterprise IT people would have said no, which was understandable. The number of FQDNs to be differentiated was very large, with several ports and about a thousand different URLs for each workload; this was just not manageable from a security standpoint.
But things have changed because Microsoft has learned the lesson and now provides a usable structure. There are now three types of FQDNs that work with Office 365.
1. The first type of FQDNs is called “Optimize”
The thousands of URLs last year have now been reduced to fewer than ten.
There are now two for Exchange (with about 12 IP subnets), two for SharePoint (with about five IP subnets) and two for Teams.
These URLs are critical because they handle large volumes of data and are very sensitive to network latency; the positive side is that they reach a highly trusted network.
For these, bypassing SSL break and inspect is required, and proxy bypass is strongly recommended. Working only on these URLs will bring a dramatic improvement in the end-user experience.
2. The second category of FQDNs is called “Allow”
These URLs are less sensitive. They handle a medium to low volume of data and they can handle proxies, but it is still better if you bypass SSL break and inspect.
There are about a hundred of them, so the situation is still manageable.
A lot of our customers only focus on the “Optimize” FQDNs, which is a very good start.
But of course, if you can also work on the “Allow” URLs, you will get better results.
3. The last category of FQDNs is called “Default”; to these URLs you can apply your existing security policies.
Microsoft’s recommendation is to focus first on the “Optimize” FQDNs. You can retrieve the current list through the Office 365 REST API and use it to automate your Office 365 network configuration.
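As a minimal sketch of how that list can be consumed, the Python below groups FQDNs by the category field that the Office 365 IP Address and URL web service returns. The helper names (fetch_endpoints, urls_by_category) and the SAMPLE entries are illustrative assumptions, not part of any Microsoft tooling; only the field names and the endpoint URL follow the web service, and the placeholder hostnames under example.com are made up.

```python
import json
import urllib.request
import uuid
from collections import defaultdict

# Worldwide instance of the Office 365 IP Address and URL web service.
ENDPOINTS_URL = "https://endpoints.office.com/endpoints/worldwide"


def fetch_endpoints():
    """Download the current endpoint list (requires internet access)."""
    # The service expects a client-generated GUID in clientrequestid.
    url = f"{ENDPOINTS_URL}?clientrequestid={uuid.uuid4()}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def urls_by_category(endpoints):
    """Group FQDNs by category: Optimize, Allow or Default."""
    grouped = defaultdict(set)
    for entry in endpoints:
        # Some entries carry only IP ranges, so "urls" may be absent.
        for fqdn in entry.get("urls", []):
            grouped[entry["category"]].add(fqdn)
    return {cat: sorted(fqdns) for cat, fqdns in grouped.items()}


# Illustrative sample in the shape of the response; not live data.
SAMPLE = [
    {"id": 1, "serviceArea": "Exchange", "category": "Optimize",
     "urls": ["outlook.office.com", "outlook.office365.com"],
     "tcpPorts": "80,443"},
    {"id": 2, "serviceArea": "Exchange", "category": "Allow",
     "urls": ["allow-placeholder.example.com"], "tcpPorts": "443"},
    {"id": 3, "serviceArea": "Common", "category": "Default",
     "urls": ["default-placeholder.example.com"]},
]

grouped = urls_by_category(SAMPLE)
for category in ("Optimize", "Allow", "Default"):
    print(category, grouped.get(category, []))
```

Against live data you would call urls_by_category(fetch_endpoints()) instead of using SAMPLE, then feed the “Optimize” group to whatever tool configures your proxies and firewalls.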
Route to Office 365 Best Practices: Local Egress
In this series of articles on Office 365 connectivity, we are explaining in detail each principle as recommended by Microsoft. In the previous article, we had a look at Office 365 traffic optimization. Now let’s have a closer look at the egress point.
In order to ensure maximum performance for Office 365 applications, the key is to reach the Microsoft Network as fast as you can.
The goal is to let Office 365 data connections egress as close to the user as you can, with DNS resolution performed at the same location, in order to leverage the high performance of the Microsoft Global Network.
This network has about 200 million users; they meet Microsoft at 160 global edge sites that are connected over 130 thousand miles of dark fiber to Azure’s 54 global regions. You cannot beat that.
Let’s speak a bit about DNS lookup.
If the DNS lookups are not performed at the same point as the network egress, the user may be directed to a distant Office 365 front door. And that will cause an end-user experience issue.
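A quick way to get a first impression from a given site is to resolve a hostname with the resolver that site actually uses and time the TCP handshake to the address it returns. The sketch below does only that; the function name resolve_and_time is an assumption made for illustration, and a single handshake time is a rough indicator, not a substitute for continuous measurement.

```python
import socket
import time


def resolve_and_time(host, port=443, timeout=3.0):
    """Resolve host with the system resolver, then time one TCP handshake.

    Returns (ip_address, handshake_time_in_ms) for the first address
    the resolver returned.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    family, socktype, proto, _canonname, sockaddr = infos[0]
    start = time.perf_counter()
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)  # completes the TCP three-way handshake
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return sockaddr[0], elapsed_ms


# Example call (needs internet access), left commented out on purpose:
# ip, ms = resolve_and_time("outlook.office365.com")
# print(f"resolved to {ip}, TCP handshake took {ms:.1f} ms")
```

Running the same call from two sites, or from one site with two different DNS configurations, makes a distant front door visible as a markedly higher handshake time.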
Let’s check how this works in real life.
For that I need to introduce briefly how we test the end-user experience. As you may know, Martello provides the Office 365 end-to-end service monitoring solution. We use our Robot users that can be installed anywhere and that use Office 365 exactly the way a user does, measuring the user experience and service quality, alerting and reporting on it.
Below we see a PowerBI report from multiple Robots. For this DNS experiment we will look at the purple robot (Robot user with DNS lookup in Russia) and the yellow witness Robot (perfect configuration of the DNS).
On the User Experience Quality chart we can see the extreme difference between the yellow (witness on the left) and the purple (bad DNS on the right).
The Robot with inadequate DNS connects the user to a front door that is much further from Nice than the best possible one (in Marseille), and we can see the effect on the overall quality of service. It is interesting to look deeper into Office 365, for example at Exchange, to see how this bad DNS affects the service.
Here we compare the perfect Robot (yellow) with the bad DNS one (purple) for three different actions performed on Exchange:
You can see that the free/busy check shows about a 25% gap in performance, while the search shows almost 33% degradation. The action to create a meeting is less affected. Non-local egress, for example because of a bad DNS, has various serious effects on your end-user satisfaction and productivity.
To troubleshoot this, in case you don’t know what is going on with the DNS, where it looks up and what to do, you can use the Microsoft Connectivity Tool.
When running it from the Robot with latency, it confirms that the traffic is rerouted to Russia and hence has a higher latency.
When you don’t know what is going on at a location, the connectivity tool is handy to test the route to the cloud. We will come back to this later in other blog posts.
To sum up, we’ve seen how important it is to be able to detect a poor end-user experience and service quality by using the Martello monitoring solution for Office 365. You can then troubleshoot what is going on with the Microsoft Connectivity Tool.
Route to Office 365 Best Practices: Enable Direct Connectivity
In the previous article we had a look at the egress point. We have seen how important it is to use a local egress for your connection and how the DNS can be a point of failure even if it redirects your user to a point not that far away (Saint Petersburg instead of Marseille for a user in Nice).
Now let’s have a closer look at the next principle: Enabling Direct Connectivity. When the traffic is outgoing, especially on the optimized FQDNs, it should connect directly to the nearest Office 365 front door.
We at Martello have seen three main situations that cause hairpins and lengthen the network path between a user and the Microsoft network.
The first situation is a bad DNS lookup.
The second situation is due to a cloud-based network security device.
If you are choosing a cloud security provider, make sure that the network device is physically near the user. You need to discuss this with your cloud security provider. We have seen many situations when the cloud security service was actually sitting in a data center on another continent (for example in the USA for a user in Europe), causing the length of the route to the cloud and latency to increase.
The third situation is due to a connection through headquarters.
A lot of enterprises have their networks configured to backhaul the network traffic to the headquarters data center in order to inspect it before releasing it to the Internet.
This goes against everything you should do to ensure a better end-user experience. VPN and MPLS networks are far slower than the Microsoft Global network.
So let’s see how backhauling your network traffic affects your users in real life.
First I need to explain briefly how we test the end-user experience. As you may know, Martello provides the Office 365 end-to-end service monitoring solution. We use our Robot users that can be installed anywhere and that use Office 365 exactly the way a user does, measuring the user experience and service quality, alerting and reporting on it.
Below we can see a PowerBI report from multiple Robots. For this network backhauling experiment, we will look at the blue robot (Robot user in Nice using VPN to connect to the US before going out to the internet) and the witness Robot yellow (Robot user in Nice connecting directly to the nearest Office 365 front door in Marseille).
As you can see, the difference in Office 365 performance is really big. This is just because the Robot user in Nice needs to send its traffic to the headquarters in Boston for inspection before sending it to the internet through the Boston front door for Office 365.
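To put rough numbers on that detour, here is a back-of-the-envelope sketch that compares the great-circle distance from Nice to Marseille with the distance from Nice to Boston, and derives a theoretical floor for the round-trip time. The city coordinates are approximate, the figure of 200 km per millisecond for light in fiber is a common rule of thumb, and real routes are longer than a great circle, so treat the result as a lower bound and not as a measurement.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: light in optical fiber covers roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0

# Approximate city-centre coordinates (latitude, longitude) in degrees.
NICE = (43.7102, 7.2620)
MARSEILLE = (43.2965, 5.3698)
BOSTON = (42.3601, -71.0589)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rtt_floor_ms(a, b):
    """Lower bound on round-trip time over fiber, ignoring all equipment."""
    return 2 * haversine_km(a, b) / FIBER_KM_PER_MS


for name, city in (("Marseille", MARSEILLE), ("Boston", BOSTON)):
    print(f"Nice -> {name}: {haversine_km(NICE, city):.0f} km, "
          f"RTT floor {rtt_floor_ms(NICE, city):.1f} ms")
```

Even before any inspection device adds its own delay, the backhauled path starts with a propagation penalty that the direct path simply does not have.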
If we check, for example, the service quality of OneDrive, once again the difference is striking:
The execution time for connecting, uploading and downloading a document in OneDrive (left chart) is about 25% higher with the Robot forced to connect to the headquarters. The difference in basic uptime is even worse. While the Robot connecting from Nice to the Marseille Office 365 front door achieves about 100% uptime, the one first connecting to the USA reaches only 70%. Users are complaining and you know why!
Once again, we can confirm that with the Microsoft connectivity tool.
You can clearly see that the route to the Office 365 front door travels across the Atlantic Ocean. The results of the test (on the right) show that this is really not the ideal situation. Under the map the Connectivity Tool chart shows that performance is only comparable to that of other users in Nice in the yellow part.
So once again, when you spot an end-user issue at a site (and for that Martello provides the best possible tool), the Microsoft Connectivity Tool provides a simple way to analyze what is going on from a pure connectivity standpoint.
As you can see, enabling direct connectivity is really important to ensure a good end-user experience. It of course goes hand in hand with the previous principle to egress locally.
To sum up, we’ve seen how important it is to be able to detect poor end-user experience and service quality by using the Martello monitoring solution for Office 365. Then you can troubleshoot what is going on with the Microsoft Connectivity Tool.
And all of that can be done because the traffic has been differentiated and is therefore limited.
Route to Office 365 Best Practices: Avoid Security Duplication
In the previous article we had a look at direct connectivity. Now let’s have a closer look at the security aspect.
Security is a big challenge for enterprise IT. But don’t worry, it is also a great challenge for Microsoft because their Office 365 business is highly dependent on it. Microsoft invests hundreds of millions of dollars in security features every year.
And most enterprises invest, too, using proxies, SSL inspection, packet inspection and data loss prevention systems.
These technologies should be used for generic internet requests, but they dramatically reduce the performance and quality of the services of Office 365 when applied to optimized FQDNs.
As explained in the blog: Differentiate your traffic, Microsoft has considerably reduced the number of FQDNs (now fewer than 10) that need to be prioritized in order to dramatically improve the end-user experience of Office 365.
Maximum security is applied to the FQDNs by Microsoft to enable enterprise IT to bypass their own security processes. This highly secure network includes security features such as Data Loss Prevention, Anti-Virus, Multi-Factor Authentication, Customer Lock Box, Advanced Threat Protection, Office 365 Threat Intelligence, Office 365 Secure Score, Exchange Online Protection, and Network DDOS Security.
In order to ease the bypassing of enterprise security processes that duplicate those that already exist in the Microsoft Global Network, Microsoft allows Office 365 administrators to use a REST API to access the list of endpoints and update the configuration of firewalls and other security devices.
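As an illustration of what such an update could look like, the sketch below flattens endpoint entries of one category into (network, protocol, port) tuples that a firewall or SD-WAN script could consume. The function name bypass_rules and the sample data are assumptions; the IP ranges are documentation ranges from RFC 5737, not real Office 365 addresses, and only the field names (category, ips, tcpPorts, udpPorts) follow the web service's response format.

```python
import ipaddress


def bypass_rules(endpoints, category="Optimize"):
    """Flatten endpoint entries of one category into (network, proto, port)."""
    rules = []
    for entry in endpoints:
        if entry.get("category") != category:
            continue
        ports = []
        # Port lists arrive as comma-separated strings such as "80,443".
        for proto, key in (("tcp", "tcpPorts"), ("udp", "udpPorts")):
            for port in entry.get(key, "").split(","):
                if port.strip():
                    ports.append((proto, int(port)))
        for ip in entry.get("ips", []):
            network = str(ipaddress.ip_network(ip))  # rejects malformed CIDRs
            for proto, port in ports:
                rules.append((network, proto, port))
    return rules


# Illustrative entries only; the ranges below are reserved for documentation.
SAMPLE_ENDPOINTS = [
    {"category": "Optimize", "ips": ["192.0.2.0/24"], "tcpPorts": "80,443"},
    {"category": "Optimize", "ips": ["198.51.100.0/24"],
     "udpPorts": "3478,3479,3480,3481"},
    {"category": "Allow", "ips": ["203.0.113.0/24"], "tcpPorts": "443"},
]

for rule in bypass_rules(SAMPLE_ENDPOINTS):
    print(rule)
```

How the tuples are turned into device configuration depends entirely on the vendor, which is why the sketch stops at a neutral data structure.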
Finally, Office 365 administrators can create Proxy Automatic Configuration scripts to bypass proxies for Office 365 requests from WAN or VPN users.
There is often a big discussion with our customers about proxies. So let’s show you in real life how they affect the end user experience.
Once again I would like to explain briefly how we test the end-user experience. As you may know, Martello provides the Office 365 end-to-end service monitoring solution. We use our Robot users that can be installed anywhere and that use Office 365 exactly the way a user does, measuring the user experience and service quality, alerting and reporting on it.
Below we can see a PowerBI report from multiple Robots. For this proxy experiment we will look at the turquoise robot (Robot user in Nice using proxy before going out to the internet) and the yellow witness Robot (Robot user in Nice connecting directly to the nearest Office 365 front door in Marseille).
As you can see, the user experience quality is almost doubled when the Robot can egress locally and connect directly to the nearest Office 365 front door.
And this is true for every workload of Office 365. It is important to also note that MS Teams usually doesn’t respond well when a proxy is involved.
In order to script the automatic bypass of proxy for selected sites and FQDNs, you can use the Get PAC (Proxy Automatic Configuration) file displayed below:
This tool will really help your administrator to automate the optimization of all your security devices to enable the best possible performance for your Office 365 users.
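To make the idea of a PAC file concrete, here is a minimal sketch that generates one from a list of host patterns: matching hosts go DIRECT, everything else goes to the proxy. This is not the output of Microsoft's tooling. The function name build_pac, the proxy address proxy.example.com:8080 and the tenant name contoso are all assumptions for illustration; FindProxyForURL and shExpMatch are the standard PAC entry point and helper.

```python
def build_pac(direct_hosts, proxy="PROXY proxy.example.com:8080"):
    """Return PAC file text that bypasses the proxy for the given hosts."""
    lines = ["function FindProxyForURL(url, host) {"]
    for pattern in direct_hosts:
        # shExpMatch does shell-style wildcard matching on the hostname.
        lines.append(f'    if (shExpMatch(host, "{pattern}")) return "DIRECT";')
    # Anything not listed above keeps using the existing proxy.
    lines.append(f'    return "{proxy}";')
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical list: two Exchange Online hosts plus a made-up tenant.
DIRECT_HOSTS = [
    "outlook.office.com",
    "outlook.office365.com",
    "contoso.sharepoint.com",
    "contoso-my.sharepoint.com",
]

print(build_pac(DIRECT_HOSTS))
```

In practice the host list should come from the published endpoint data and not be maintained by hand, so that the PAC file follows Microsoft's changes.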
To sum up, we’ve seen how important it is to be able to detect poor end-user experience and service quality by using the Martello monitoring solution for Office 365. You can then enhance your performance with the tools Microsoft provides (like the PAC file).
The connectivity principles have been developed to help you improve the end-user experience. Microsoft has worked a lot on improving its network to allow you to change your route to the cloud in a secure and high-performance way.
But as you implement those changes, it is essential for you to be able to measure the results, assessing whether the return on investment has been good or poor.
For that you need to continuously measure the end-user experience on every site you want to improve.
This allows the C-level to determine the Office 365 project costs and measure the ROI of the network improvement. It prevents critical situations and management complaints to the operations team. It improves the global quality of the services delivered to your business lines, which ensures optimal productivity for your company.
Thanks to our Robots that measure the end-user experience in real time, alert and report on it, Martello is the perfect solution to partner with you on your service quality enhancement journey.