Problem Management
1. Process
1.1. Introduction
Problem management is a process for managing multiple incident tickets that are related to one another.
1.2. Scope
Collections of related incidents are identified and merged into a problem, and the problem is managed to resolution.
1.3. Input(s)
Collection of related incidents identified through the incident management process.
1.4. Output(s)
Resolution offered to client and problem solution documented.
1.5. Tasks
- Identify and validate that there is a problem
- Communicate with client
- Work the problem
1.6. Role(s)
- Technician
- Client
- Others as needed
2. Standard Operating Procedures
2.1. Problem Validation
2.1.1. Decision Tree: Is this a problem?
2.1.1.1. Yes: There are more than two related incidents from different clients
2.1.1.2. No: Return to the "incident management process"
2.1.2. If the issue is determined to be a problem, select the problem ticket type and link the related tickets (see attachment)
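The validation rule in 2.1.1 can be sketched as a small check. This is only an illustration: the Incident class, its field names, and the is_problem function are hypothetical and not part of any ticketing tool. It also assumes one reading of the rule, namely that the related incidents must come from more than two distinct clients.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """Hypothetical minimal incident record; real tickets carry more fields."""
    ticket_id: str
    client: str


# "More than 2" in 2.1.1.1, read here as more than two distinct clients (assumption).
CLIENT_THRESHOLD = 2


def is_problem(related_incidents: list[Incident]) -> bool:
    """Return True when the related incidents span more than two clients."""
    distinct_clients = {incident.client for incident in related_incidents}
    return len(distinct_clients) > CLIENT_THRESHOLD


if __name__ == "__main__":
    incidents = [
        Incident("INC-1", "Client A"),
        Incident("INC-2", "Client B"),
        Incident("INC-3", "Client C"),
    ]
    # Three related incidents from three clients: treat as a problem.
    print(is_problem(incidents))
```

If the intended rule is instead "more than two incidents, from at least two clients", only the comparison inside is_problem would need to change.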
2.2. Communicate to clients that the incident is now a problem and explain what the problem is
- No: Go to the "identify severity level" procedure
- Identify severity level assigned
- Set expectations on next communication
2.2.1. Acknowledge the request or question and indicate that it has now been identified as a problem
- Include the parent ticket ID
- Identify and inform the appropriate internal stakeholders
2.2.2. Decision Tree: Is additional information needed?
2.2.2.1. Yes: Request additional information
2.2.2.1.1. Set ticket status to “pending customer response”
2.2.2.1.1.1. Decision Tree: Response from client within three communication attempts?
2.2.2.1.1.1.1. Yes: Go to "Is additional information needed?" procedure (2.2.2)
2.2.2.1.1.1.2. No: Set ticket status to abandoned
- Critical: within 24 hours
- Urgent: within 3 business days
- High Impact/Time Sensitive: within 3 business days
- Important: within 5 business days
- Low: within 10 business days
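The follow-up decision in 2.2.2.1.1.1 and the severity time frames listed above can be sketched together as follows. This is a minimal sketch under stated assumptions: the function names and the "reassess" label are hypothetical, business days are assumed to be Monday through Friday with no holiday calendar, and the time frames are taken from the list above without assuming which clock (response, next communication, or resolution) they govern.

```python
from datetime import datetime, timedelta

# Time frames copied from the severity list above.
# Critical is measured in hours; the rest in business days.
SEVERITY_WINDOWS = {
    "Critical": ("hours", 24),
    "Urgent": ("business_days", 3),
    "High Impact/Time Sensitive": ("business_days", 3),
    "Important": ("business_days", 5),
    "Low": ("business_days", 10),
}

MAX_CONTACT_ATTEMPTS = 3  # "three communication attempts" in 2.2.2.1.1.1


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by `days` business days (assumed Monday-Friday, no holidays)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def target_time(severity: str, start: datetime) -> datetime:
    """Return the end of the time frame for a severity level."""
    unit, amount = SEVERITY_WINDOWS[severity]
    if unit == "hours":
        return start + timedelta(hours=amount)
    return add_business_days(start, amount)


def status_after_request(attempts_made: int, client_responded: bool) -> str:
    """Apply the decision tree that follows a request for more information."""
    if client_responded:
        # Hypothetical label: return to "Is additional information needed?" (2.2.2).
        return "reassess"
    if attempts_made >= MAX_CONTACT_ATTEMPTS:
        return "abandoned"
    return "pending customer response"


if __name__ == "__main__":
    opened = datetime(2024, 3, 1, 9, 0)  # a Friday
    print(target_time("Urgent", opened))
    print(status_after_request(attempts_made=3, client_responded=False))
```

Keeping the time frames in one lookup table means a change to a severity level only has to be made in one place.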
2.3. Working the Problem
2.3.1. Root Cause Analysis
- Replicate Error (if appropriate)
- Run Trace on Process (if needed)
- Search support repositories (Reference Center, Oracle, Google, Knowledge Base)
- Reach out to Application Services for a deeper dive into the problem (SQL analysis, database tables)
- Test alternative processes in appropriate environment
2.3.2. Mitigate Problem
- If appropriate, provide an alternate process for clients to use until the problem is resolved
- Communicate the expected resolution time for the problem, if applicable
2.3.3. Resolve Problem
Implement the resolution in the production stack, which includes running through the testing cycles (PDV, SIT, UAT, Sanity, Confidence)
2.3.4. Update Support Documentation
2.3.4.1. Review related support documentation and update as needed
- QRGs
- Functional Design Documents
- Technical Design Documents
- Configuration Guides
- Test Scripts
- Knowledge documents
2.3.4.2. Review and edit the titles and metadata of support documents to make them easier to search for and retrieve
2.3.5. Testing
Test the solution in the appropriate environment
2.4. Collaboration with Other Technicians (as needed)
2.4.1. If another technician needs to work on a portion of your ticket:
- Flag the ticket
- Reassign the ticket to that technician
- Make a note if the ticket needs to be reassigned back to the original technician