Digital Bank Reduced IT Troubleshooting Time by 40%
☹️️️ Weak automation of processes
When incidents occurred, the duty officer manually followed step-by-step instructions to diagnose problems and notify the responsible team.
☹️️️ Difficult to keep track of knowledge base updates
Nobody kept track of knowledge base updates, so new employees often made mistakes when following instructions.
☹️️ High human dependency
Because incident handling depended so heavily on individual people, SLA compliance suffered greatly. This was not just a pain point but a critical issue.
☹️️ Disparate monitoring tools
The IT infrastructure was fragmented across many technical departments, each with at least one monitoring system of its own. It was difficult to stay focused on every notification, and with so many of them, differences in urgency and importance were lost.
💻 Deployment: Enterprise on-premise version with priority support
🕒 Period: 2 months
💪 First, we centralized data from the different monitoring systems on a single screen so that engineers could see the big picture of what was happening.
💪 Based on this data, we built a single resource-service model showing the real-time health of every component in the infrastructure. Links between configuration items reduced the time needed to find the root cause of a problem and the type of failure.
💪 Then, we configured event-based automation, which reduced incident response time and the number of errors made when executing instructions. Problem reports were sent automatically to the responsible specialists.
💪 Interactions with the knowledge base were also automated: the engineer automatically received the relevant knowledge base entry. Action scripts for recurring incidents of the same type were automated as well.
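To make the steps above more concrete, here is a minimal Python sketch of how a resource-service model, event-based routing, and a knowledge base lookup might fit together. It is an illustration only, not the product's actual implementation: the `ConfigItem` class, the `find_root_causes` and `handle_event` functions, and every service name, team name, and article ID are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigItem:
    """A configuration item in a hypothetical resource-service model."""
    name: str
    healthy: bool = True
    depends_on: list[str] = field(default_factory=list)


def find_root_causes(items: dict[str, ConfigItem], start: str) -> list[str]:
    """Follow dependency links from a failing item down to the deepest
    failing items, i.e. those with no failing dependencies of their own."""
    roots, seen, stack = [], set(), [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        item = items[name]
        failing_deps = [d for d in item.depends_on if not items[d].healthy]
        if failing_deps:
            stack.extend(failing_deps)   # the problem lies further down
        elif not item.healthy:
            roots.append(name)           # failing, but its dependencies are fine
    return sorted(roots)


# Hypothetical knowledge base entries and team ownership, keyed by item name.
KNOWLEDGE_BASE = {"db-primary": "KB-101: Restart replication and verify lag"}
OWNERS = {"db-primary": "dba-team"}


def handle_event(items: dict[str, ConfigItem], failing_service: str) -> list[dict]:
    """Build one problem report per root cause, with owner and KB article."""
    return [
        {
            "root_cause": root,
            "notify": OWNERS.get(root, "duty-officer"),
            "kb_article": KNOWLEDGE_BASE.get(root, "no article found"),
        }
        for root in find_root_causes(items, failing_service)
    ]


# Example: a failing service whose failure traces back to the database.
items = {
    "online-banking": ConfigItem("online-banking", False, ["api-gateway"]),
    "api-gateway": ConfigItem("api-gateway", False, ["db-primary", "cache"]),
    "db-primary": ConfigItem("db-primary", False),
    "cache": ConfigItem("cache", True),
}
reports = handle_event(items, "online-banking")
print(reports)
```

In this sketch, the failure of `online-banking` is traced through `api-gateway` to `db-primary`, so a single report goes to the team that owns the database, together with the matching knowledge base article.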
😊 Configured event automation
Reduced incident response time. Reduced the number of errors when executing instructions.
😊 Problem reports are sent to the responsible persons automatically
Freed up employees’ time for more valuable work on resolving problems.
😊 Automation of work with the knowledge base
The required knowledge base article is automatically delivered to the engineer.
😊 Reduced critical incident processing time
From 25 to 15 minutes.
😊 Reduced the number of alerts per IT Ops engineer
From 110 to 10 alerts.
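For readers who want to check the figures, the short Python sketch below computes the relative reductions implied by the numbers above. The `percent_reduction` helper is a hypothetical name used only for this illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) * 100 / before


# Critical incident processing time: 25 minutes down to 15 minutes.
time_cut = round(percent_reduction(25, 15), 1)
# Alerts per IT Ops engineer: 110 down to 10.
alert_cut = round(percent_reduction(110, 10), 1)

print(time_cut)   # 40.0
print(alert_cut)  # 90.9
```

The first result matches the 40% in the headline. The second shows that the number of alerts per engineer fell by roughly 91%.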