ITConnections: Come Out On Top When a Challenging Environment Tests Your Service Level Agreements
The workforce has seen a sudden and dramatic shift to remote work. Employees have moved outside the environments where Service Level Agreements (SLAs) typically apply, which makes it harder to ensure service quality. With the help of SLA reporting, IT administrators can set up custom alerts that fire before an SLA is breached, which could help prevent an outage.
We sat down with Luke O’Keefe, Customer Success Manager at Martello; Michael Vestergaard, an IT System Administrator; and Jacob Knudsen, a System Administrator from University College Copenhagen.
University College Copenhagen is one of Denmark’s main providers of teacher education, social education, nursing and social work. With more than 20,000 students currently enrolled, it is imperative that the college maintains a consistent, reliable digital experience.
What are some of the new challenges you are seeing customers face since this sudden shift to remote work?
Luke: At the beginning of the Covid crisis, many of our customers saw their offices empty as their workers moved to their own homes. For IT workers this was a big challenge, and it required a lot of work in the first couple of weeks of the crisis. Many teams were relying more on the cloud to augment their existing on-prem solutions with increased capacity and redundancy. Now that a lot of that work has been done, IT operations teams are turning their attention more to monitoring. It’s now very important that their infrastructure can handle the increased traffic and that there are no major service disruptions.
Now that the world is working from home, do you feel service availability and SLA reporting have become more of a focus?
Michael: Definitely, service availability. Because we’re working from home, we have less of an overview of what’s happening, so service availability is really critical right now.
What new factors, if any, are impacting SLAs compared with six months ago?
Jacob: We are working from home, so everything has more or less changed into cloud-based working. We now need to make sure that our whole network is accessible, rather than just one or two servers in our server farm.
If a customer claims that an SLA has been breached, what tools would you use to verify whether it was or was not breached? In other words, how would you use SLAs to verify it in the future?
Michael: In the past we had multiple systems and no good overview. Now that we have iQ, many of our monitoring systems report into one place, so we quickly get a view. If we can see in the log file that everything is running fine, the problem could be on the end user’s side rather than on the corporate network.
Could you identify which types of metrics should be monitored in an SLA?
Jacob: When we look at our SLAs, we mostly look at the end-user application. Service level monitoring and response time are what we really need to look at, because people are impatient: if your Google browser isn’t responding right away, people will skip it and use something else.
Could you share what an acceptable level of uptime looks like today?
Michael: That depends on your point of view: whether you’re looking from the IT department or from the user end. From the user end we’re talking 100%; from the IT department, “four nines” (99.99%), if possible. We would like to be at 100%, but you can never be at 100%. It’s nearly impossible, because sometimes external parties come in and turn off the power to do some maintenance, and then it hits your SLAs.
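To put the “four nines” figure in perspective, here is a minimal sketch of the arithmetic behind an availability target. The function name `allowed_downtime` and the 365-day period are assumptions chosen for illustration; they do not come from the interview or from any particular monitoring product.

```python
from datetime import timedelta


def allowed_downtime(availability_percent: float, period: timedelta) -> timedelta:
    """Return the downtime an availability target permits over a period.

    Hypothetical helper for illustration only.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # Fraction of the period during which the service may be unavailable.
    unavailable_fraction = (100 - availability_percent) / 100
    return timedelta(seconds=period.total_seconds() * unavailable_fraction)


if __name__ == "__main__":
    year = timedelta(days=365)  # assumed reporting period
    for target in (99.9, 99.99, 100.0):
        budget = allowed_downtime(target, year)
        print(f"{target}% uptime allows {budget} of downtime per year")
```

Under these assumptions, 99.99% over a 365-day year leaves roughly 52.6 minutes of downtime, so a single unannounced power cut lasting an hour would already use up more than the whole yearly allowance.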
In which ways will SLA reporting help you to scale and future-proof your company’s network?
Michael: Using SLA reporting, we can actually go to management and say, “You want 99.99%, but we can’t deliver it because people are turning off the power without informing us.” With this SLA, we suddenly have an opportunity to put pressure on other management teams and say, “You need to inform IT when you do this work, otherwise we can’t keep up the SLA.” Suddenly you as an IT department have a number you can show management: if you spend this amount of money, you get 4 or 5% more uptime.