Before we dive into incident management, let’s get on the same page about some common terminology.
ITSM (IT service management) is a common approach to creating, supporting, and managing IT services. The core concept of ITSM is the belief that IT should be delivered as a service. And one of the core practices of ITSM is incident management.
Incidents are unplanned events of any kind that disrupt or reduce the quality of service (or threaten to do so). A business application going down is an incident. A crawling-but-not-yet-dead web server can be an incident, too. It’s running slowly and interfering with productivity. Worse yet, it poses the even-greater risk of complete failure.
A problem is the not-yet-known root cause behind one or more incidents. In the examples above, where a web server is crawling and a business application is down, a reconfigured router could be the underlying problem behind both.
What is Incident Management?
Incident management is a key focus for IT departments: it is the process for restoring “normal” service operations as quickly as possible, which minimizes the adverse impact on business operations and end users. To be successful, IT teams must deal promptly and effectively with every incident that users report.
One of the most common mistakes of busy, growing IT organizations is to try to reinvent the wheel by creating processes from scratch or building their own tools for fielding tickets.
Incident Management Workflow
Incident management follows a simple workflow: an incident occurs, a ticket is raised, the problem is identified, service is restored, and finally the incident is reported on to look for ways to prevent the problem in the future.
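The workflow above can be sketched as a small linear state machine. This is only an illustration: the stage names and the `advance` helper are hypothetical, chosen to mirror the steps in the paragraph, and a real ITSM tool will have its own states and transitions.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stage names mirroring the workflow described above."""
    OCCURRED = "occurred"
    TICKET_RAISED = "ticket_raised"
    PROBLEM_IDENTIFIED = "problem_identified"
    SERVICE_RESTORED = "service_restored"
    REPORTED = "reported"


# Enum members iterate in definition order, which is the workflow order.
ORDER = list(Stage)


def advance(stage: Stage) -> Stage:
    """Return the stage that follows `stage`; raise if it is the last one."""
    index = ORDER.index(stage)
    if index == len(ORDER) - 1:
        raise ValueError("incident has already been reported on")
    return ORDER[index + 1]
```

Modelling the stages explicitly makes it harder for a ticket to skip a step, such as closing an incident without the final review.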
Best Practices to Improve Incident Management
Create Robust Workflows
Establish a workflow that gives teams a clear process and encourages rapid resolution. This includes identifying major incidents, communicating with impacted stakeholders, assigning the right individuals to fix the problem, and following and documenting the lifecycle of the incident.
Provide Multi-Channel Support
Allow users to raise tickets easily through various channels, including email, chat, and a portal. With many employees working remotely, some may find certain forms of communication easier than others. Offering several channels that they can access from both desktop and mobile devices will help improve overall satisfaction and the user experience.
Automate Where Possible
Automation can save a lot of time in a busy IT department. Start by automating ticket assignment to the right members of the team for quick resolution. Consolidate duplicate alerts into one incident to reduce noise and avoid redundancy.
Ensure business-critical issues are addressed first with proper classification and assignment. Knowing how each incident affects the overall business is important to ensure the best use of resources within an IT department.
Share the status of tickets with members of the team and the user who logged the incident. Ensuring the right stakeholders are aware of a potential outage will help increase efficiency in the workplace, reduce frustration, and provide a better overall user experience.
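As a rough sketch of both ideas, the snippet below groups alerts that share a source and category into a single incident and assigns each incident to a team. The alert fields, the `ROUTING` table, and the team names are assumptions made up for this example, not the behavior of any particular ITSM product.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: alert category -> team that should own it.
ROUTING = {"network": "network-team", "database": "dba-team"}
DEFAULT_TEAM = "service-desk"  # fallback when no rule matches


@dataclass
class Incident:
    key: tuple       # (source, category) shared by all grouped alerts
    assignee: str    # team chosen from the routing table
    alerts: list = field(default_factory=list)


def consolidate(alerts):
    """Group alerts with the same source and category into one incident."""
    incidents = {}
    for alert in alerts:
        key = (alert["source"], alert["category"])
        if key not in incidents:
            team = ROUTING.get(alert["category"], DEFAULT_TEAM)
            incidents[key] = Incident(key, team)
        incidents[key].alerts.append(alert["message"])
    return list(incidents.values())
```

Two alerts from the same router in the `network` category would produce one incident owned by `network-team`, rather than two separate tickets.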
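One common way to classify incidents is an impact and urgency matrix. The sketch below is a minimal version of that idea; the three levels, the scoring formula, and the ticket fields are assumptions for illustration, and each organization will define its own scale.

```python
# Hypothetical scale: a lower score means a more severe level.
LEVELS = {"high": 1, "medium": 2, "low": 3}


def priority(impact: str, urgency: str) -> int:
    """Return a priority from 1 (most critical) to 5 (least critical)."""
    return LEVELS[impact] + LEVELS[urgency] - 1


def triage(queue):
    """Order tickets so that the most critical ones come first."""
    return sorted(queue, key=lambda t: priority(t["impact"], t["urgency"]))
```

With this scale, an incident with high impact and high urgency gets priority 1, while one with low impact and low urgency gets priority 5.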
Review and Report on Significant Incidents
Analyze major incidents with the goal of finding areas of improvement. This will help you take a proactive approach to your IT system. Data you will want to collect includes:
-Number of major incidents raised and closed each month
-Average resolution time for major incidents
-How much downtime the major incidents caused
-The problems and changes linked to major incidents
Learning from these experiences will help you proactively address potential issues, resulting in less downtime for the business.
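The first three metrics listed above can be computed from a simple export of incident records. The sketch below assumes each record is a dictionary with hypothetical `opened`, `closed`, and `downtime` fields; linking problems and changes depends on how your ITSM tool stores those relationships, so it is left out.

```python
from collections import Counter
from datetime import timedelta


def monthly_report(incidents):
    """Summarize major incidents: counts per month, resolution time, downtime."""
    resolved = [i for i in incidents if i["closed"] is not None]

    # Count incidents by the month (YYYY-MM) in which they were raised or closed.
    raised = Counter(i["opened"].strftime("%Y-%m") for i in incidents)
    closed = Counter(i["closed"].strftime("%Y-%m") for i in resolved)

    # Average resolution time across incidents that have been closed.
    durations = [i["closed"] - i["opened"] for i in resolved]
    average = sum(durations, timedelta()) / len(durations) if durations else None

    # Total downtime attributed to all incidents in the export.
    downtime = sum((i["downtime"] for i in incidents), timedelta())

    return {
        "raised_per_month": dict(raised),
        "closed_per_month": dict(closed),
        "average_resolution": average,
        "total_downtime": downtime,
    }
```

Here `opened` and `closed` are `datetime` values (with `closed` set to `None` for open incidents) and `downtime` is a `timedelta`.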
If you have integrated an ITSM system with Martello iQ, you can automate the creation of incidents. With this feature enabled, iQ creates an incident when an alert is raised for a board, a business service, or a saved search. Any subsequent alerts are consolidated into that one incident in your ITSM system.
Martello iQ also lets you automatically send incident notifications and alerts to a specific team, for a faster resolution time.