Quote on this page: Proper incident and problem management is mandatory for the stability, availability and reliability of your IT environment. Besides that, every incident is an interruption of your business and a huge waste of time. That time is taken away from creating more business value.
The way from incident to problem to solution
The process described below is mandatory for the stability and reliability of your IT environment. I created this process and have used this way of working twice, with great success.
But there is a second reason to do this: every interruption of your business process takes up time from several employees. The time they spend solving incidents cannot be used to create new business value.
One of my happy moments is when I read an incident with a remark from an employee: "OK, this incident is done, but I will make a problem record because this one can occur again and should be solved for good." That is the mindset I like to see.
Task and responsibility of the Service desk
The Service desk will verify the incident and problem queues on a daily basis.
Incident: An unplanned interruption to an IT Service or a reduction in the Quality of an IT Service.
First step: is this a known error, and/or can we solve it immediately? If yes, do it. Administration is the second priority, but it must be done. Once the incident is solved, it should be closed and documented (test evidence, approval to make the change in production).
Remark: you must consider whether this is just an incident or whether a problem record is needed.
Problem: A cause of one or more Incidents.
The cause is not usually known at the time a Problem Record is created, and the Problem Management Process is responsible for further investigation.
If this incident occurs more than once, or if it needs further investigation, we will create a problem record.
Remark: not every incident will lead to a problem record.
Other remark: every problem record covers exactly one issue. For example, all deadlocks and timeouts concerning one application share one problem record. All incidents concerning this issue on this specific application should be linked to that problem record.
Only the DevOps teams are allowed to link incidents to problem records.
The incident and problem records should contain extensive information and evidence, to enable fast analysis, development and implementation of a resolution.
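The linking rules above can be sketched in a few lines of Python. This is only an illustration, not how any particular ticketing tool works: the class names, the issue label and the linked_by_devops flag are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    incident_id: str
    application: str
    issue: str  # hypothetical category label, e.g. "deadlock/timeout"


@dataclass
class ProblemRecord:
    application: str
    issue: str
    incident_ids: list = field(default_factory=list)


class ProblemRegister:
    """Keeps exactly one problem record per (application, issue) pair."""

    def __init__(self):
        self._records = {}

    def link(self, incident, linked_by_devops):
        # Only the DevOps teams are allowed to link incidents to problem records.
        if not linked_by_devops:
            raise PermissionError("only DevOps teams may link incidents")
        key = (incident.application, incident.issue)
        # Reuse the existing record for this issue, or create it on first use.
        record = self._records.setdefault(
            key, ProblemRecord(incident.application, incident.issue)
        )
        if incident.incident_id not in record.incident_ids:
            record.incident_ids.append(incident.incident_id)
        return record

    def __len__(self):
        return len(self._records)
```

Linking two deadlock incidents on the same application returns the same problem record both times, so the register never holds duplicate records for one issue.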
Problem Management Process in combination with Scrum/DevOps
Problem Management Process (PMP)
This is the responsibility of one Ops engineer and one Dev engineer of your team.
These two employees monitor the problems and take the initiative to solve them.
The PMP is responsible for assigning the impact and priority of the existing problems.
A first investigation is performed to determine the kind of problem and its impact on the agreed service. Based on this information, the PMP decides on the problem priority.
The PMP can also decide to close the problem.
Priority 1 and 2 incidents and problems
For Priority 1 (high impact) incidents and problems, the following convention applies: every specialist who can help resolve the problem should work on it. We will work in shifts, 24 hours a day, 7 days a week, to resolve the issue.
For Priority 2 incidents and problems, the following convention applies: every specialist who can help resolve the problem should work on it. The team will work on working days, during normal working hours, to resolve the issue.
For Priority 1 and 2 incidents and problems, always get your manager involved as soon as possible.
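If you want to keep these conventions next to your tooling, they fit in a small lookup table. The sketch below is an assumption about how that could look: the field names and the convention_for helper are invented for this example, and priorities other than 1 and 2 are left undefined because the conventions above do not cover them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PriorityConvention:
    all_specialists_engaged: bool  # every specialist who can help works on it
    working_window: str            # when the work takes place
    involve_manager_asap: bool     # get your manager involved as soon as possible


# Hypothetical encoding of the two conventions described above.
CONVENTIONS = {
    1: PriorityConvention(True, "24x7 in shifts", True),
    2: PriorityConvention(True, "working days, normal working hours", True),
}


def convention_for(priority: int) -> Optional[PriorityConvention]:
    """Return the convention for Priority 1 or 2, or None for other priorities."""
    return CONVENTIONS.get(priority)
```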
Problem records on the Product Backlog (PB)
The PMP takes ownership of documenting all problem records on the Product Backlog; the JIRA story type will be Bug Report. The PMP informs the Product Owner (PO) about the background of these problems and their problem priority, so the PO can decide on the priority of each item on the PB (for refinement, build and implementation of the solution).
Every two weeks it should be clear which problems will be assigned to which DevOps team in which sprint.
Analyse reports and incidents
Based on the daily and weekly reports and status of the IT landscape, the PMP can decide to start an extra investigation. The PMP will then create a story.
Every two months the PMP will deliver a trend analysis report on incidents and problems.
Every month there will be an analysis of incidents without problem records. Scope: are there incidents with the same root cause, and should we build a structural solution?
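The monthly analysis can be supported by a simple script. The sketch below assumes that incidents are exported as dictionaries with the keys id, root_cause and problem_id; these field names and the min_occurrences threshold are assumptions for this example, not part of any specific tool.

```python
from collections import defaultdict


def recurring_root_causes(incidents, min_occurrences=2):
    """Group incidents without a problem record by root cause.

    Returns only the root causes that occur at least `min_occurrences`
    times, mapped to the ids of the incidents that share them.
    """
    by_cause = defaultdict(list)
    for incident in incidents:
        # Skip incidents already linked to a problem record
        # and incidents where no root cause was recorded.
        if incident.get("problem_id") is None and incident.get("root_cause"):
            by_cause[incident["root_cause"]].append(incident["id"])
    return {
        cause: ids
        for cause, ids in by_cause.items()
        if len(ids) >= min_occurrences
    }
```

Every root cause in the result is a candidate for a new problem record and a structural solution.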
Realization of the solution
It is the responsibility of every DevOps team to resolve the problems and implement the solutions.
Realization consists of analysis, building the solution and implementing the solution.
Analysis and refinement
Prioritization of (Problem) Backlog Items is always done in alignment with, and under the responsibility of, the Product Owner. After the team has agreed to take on a problem, it will start the analysis and refinement in the next sprint.
Preference: small issues should be analysed and solved within one sprint.
Realization and implementation
The realization and implementation will be done within the normal DevOps/Scrum process. Documentation and evidence of this realization will be recorded in the Product Backlog.
During the realization, a test case for the problem should be added to the regression test set.
After implementation and good results, the Bug Report and the Problem record should be closed. This will be done by the team that resolved the problem. Send an email to the PMP to keep them informed.