Living on the Edge: Why (and How) You Need to Monitor Your Exchange Edge Transport Servers
Originally posted on gsx.com
The Edge Transport server represents the critical handoff point for all messages sent to and received from the Internet, so what needs to be monitored to be sure it’s running properly?
It’s a given that your organization communicates via email over the Internet. And for those companies using Exchange Server in either a completely on-premises or hybrid deployment model, you’ve got at least one server acting as your Edge Transport server. This server performs a few critical tasks around sending messages to, and accepting messages from, the Internet, including Internet and internal mail flow, protection against spam, and address rewriting.
Life Without the Edge
Should this server fail completely (and I assume you have no redundancy), a few things would happen:
- Inbound messages eventually bounce back, as they can’t be accepted.
- Outbound messages will sit in a Mailbox Server’s delivery, submission, or unreachable queue (depending on where in the delivery process a message was when the Edge server went down).
- If the server is used in a hybrid environment, messages between on-prem and Microsoft 365 won’t be delivered.
If there’s a performance issue, such as a full disk or insufficient available RAM, the likely repercussion is slower processing of the submission and delivery queues. Messages sit stuck there while senders assume everything’s OK and that their messages have actually been received.
Monitoring the Edge
Understanding whether your Edge Transport servers are functioning well is mostly a matter of watching them process messages through four queues:
- Submission – Messages that have been accepted by the server but not yet processed are stored here.
- Delivery – One of these queues exists per destination domain or smart host; each contains messages that are being delivered to internal or external destinations.
- Unreachable – When messages can’t be routed to their destination, they are placed in this queue.
- Poison – Messages that Exchange determines are harmful, based on their content or on errors they contain, are kept here.
Monitoring the state of each of these queues helps to determine whether there is a problem with the server itself, with its connectivity to the Internet or to Microsoft 365, or even with its antispam protection. In addition, monitoring the consumption of RAM and disk can help determine whether system resources are the source of a problem.
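To make the idea of watching queues and resources concrete, here is a minimal sketch in Python. It assumes queue data has already been collected somehow (for example, exported from the Exchange Get-Queue cmdlet); the QueueSnapshot type, the threshold values, and the function names are all hypothetical and exist only for illustration. Real thresholds depend on your own mail volume.

```python
from dataclasses import dataclass

# Hypothetical alert thresholds (message counts); tune these per environment.
THRESHOLDS = {
    "submission": 100,
    "delivery": 250,
    "unreachable": 1,
    "poison": 1,
}


@dataclass
class QueueSnapshot:
    name: str           # queue identity, e.g. "EDGE01\\Submission"
    kind: str           # one of the THRESHOLDS keys
    message_count: int  # messages currently sitting in the queue


def find_alerts(snapshots, thresholds=THRESHOLDS):
    """Return a warning string for each queue at or above its threshold."""
    alerts = []
    for snap in snapshots:
        limit = thresholds.get(snap.kind)
        if limit is not None and snap.message_count >= limit:
            alerts.append(
                f"{snap.name}: {snap.message_count} messages (threshold {limit})"
            )
    return alerts


def low_disk(total_bytes, free_bytes, min_free_ratio=0.10):
    """True when free space falls below the given fraction of the volume."""
    return total_bytes > 0 and free_bytes / total_bytes < min_free_ratio
```

In this sketch, any message at all in the unreachable or poison queue raises an alert, while the submission and delivery queues are allowed some backlog. The disk check takes plain byte counts so it can be fed from any source, such as Python’s shutil.disk_usage when run locally on the server.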
Getting the Full Picture with Synthetic Transactions
But just watching queue lengths isn’t enough. Understanding what the user experience is like in scenarios that depend on an Edge Transport server can both expose leading indicators of issues and provide insight into exactly what the problem is. This is one of the reasons synthetic transactions can play an important role in understanding both the state of your Edge Transport server and the impact of any reduction in performance. With synthetic transactions, various use cases can be put into play to test the functionality of the Edge Transport server; for example, sending a message from an on-premises mailbox to one on the Microsoft 365 side of your hybrid Exchange environment, or testing message transport that uses smart host forwarding to an external domain.
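A synthetic transaction of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular monitoring product: the send and was_delivered callables are hypothetical hooks you would supply yourself (send might wrap smtplib’s SMTP.send_message, and was_delivered might search the target mailbox for the probe ID).

```python
import time
import uuid
from email.message import EmailMessage


def build_probe(sender, recipient):
    """Create a test message tagged with a unique ID so it can be traced."""
    probe_id = uuid.uuid4().hex
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"synthetic-probe {probe_id}"
    msg.set_content("Synthetic transaction probe; safe to delete.")
    return probe_id, msg


def run_probe(send, was_delivered, sender, recipient,
              timeout_s=300.0, poll_s=5.0,
              clock=time.monotonic, sleep=time.sleep):
    """Send a probe and poll until it arrives or the timeout expires.

    send(msg) and was_delivered(probe_id) are caller-supplied (hypothetical)
    hooks. Returns the observed latency in seconds, or None on timeout.
    """
    probe_id, msg = build_probe(sender, recipient)
    start = clock()
    send(msg)
    while clock() - start < timeout_s:
        if was_delivered(probe_id):
            return clock() - start
        sleep(poll_s)
    return None
```

A result of None means the message never arrived within the timeout, which is exactly the failure a queue-length check alone might miss. A rising latency across successive probes can serve as an early warning before queues visibly back up.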
Your Edge Transport servers are the gatekeepers for inbound and outbound messaging, so it’s imperative that you know about issues as early as possible. In some instances, message queue length may be the perfect leading indicator, but in other cases, determining that messages aren’t reaching their intended destination may be equally valuable. Used for both proactive detection and investigation, the combination of queue monitoring, resource monitoring, and synthetic transactions can give you a complete picture of what is and isn’t working within your Edge Transport servers.