Join Omilia, a market leader in Conversational AI, where we are shaping the future of customer experience for global enterprises. You will be joining a high-impact, mission-critical team dedicated to maintaining the reliability of our cutting-edge cloud platform. If you thrive on discipline, clarity, and rapid resolution in a fast-paced environment, this is your opportunity to be a crucial operational leader.
Role Summary
You will be part of a dedicated team of Incident Managers, based around the world, responsible primarily for managing and coordinating high-severity incidents on the Omilia Cloud Platform.
Your team provides 24/7 "follow-the-sun" coverage:
- You will primarily cover your regional daytime hours.
- You will participate in a shared, compensated on-call rotation to provide essential backup coverage for weekends, holidays, sick leave, or periods of high incident load.
You will work directly with highly skilled SRE, DevOps, and Engineering teams who are dedicated to root cause resolution and continuous improvement.
This role offers unparalleled exposure to the inner workings of a rapidly scaling global SaaS platform, providing a unique opportunity to shape and deliver operational excellence.
A background in a client-facing role, such as Technical Account Manager, is particularly well suited to the Incident Manager position.
Key Responsibilities
1. Incident Leadership & Coordination
- Lead Major Incidents from initial detection to mitigation and closure.
- Maintain situational awareness, ensuring all participants are aligned on scope, impact, and next steps.
- Facilitate effective collaboration between the Technology Operations Centre (TOC), SRE, Engineering, Product, Customer Success, and leadership.
- Capture a clear, timestamped record of the incident timeline, key decisions, actions, and ownership.
- Drive a "mitigation-first" approach and execute timely, objective escalation when progress stalls or risks to service/customer relations increase.
2. Communication & Stakeholder Management
- Provide clear, timely, and jargon-free updates internally and externally throughout the incident lifecycle.
- Coordinate customer communications throughout major incidents.
- Manage effective regional handovers with other IMs and on-call SREs to ensure continuity of service during shift changes.
- Ensure executive leadership is consistently and accurately briefed when required by the major incident framework.
3. Process & Governance
- Ensure strict adherence to the Omilia Incident Framework, including proper activation, escalation, and closure protocols.
- Identify and escalate deviations, process gaps, or risks.
- Coordinate and facilitate post-incident reviews (PIRs) ensuring action items are recorded, assigned, and tracked to completion.
- Proactively contribute to the continuous improvement of incident processes, templates, playbooks, and tooling.
4. Operational Readiness & On-Call
- Participate in a follow-the-sun shift pattern, covering regional day hours.
- Provide on-call backup for colleagues during weekends, holidays, sick leave, or peak incident load.
- Maintain working familiarity with monitoring tools, escalation paths, severity definitions, and communication templates.
- Support the TOC and on-call SREs by ensuring smooth handovers between regions.
Requirements
Essential
- The ability to command a bridge, synthesize complex information quickly, and tailor communications for engineers, business leaders, and external customers.
- Proven ability to remain calm, structured, and assertive during high-pressure, severe outage events.
- Prior experience in Service Management, Operations Leadership, Support Management, Delivery/Project Management, or a similar high-visibility, coordination-heavy role.
- Familiarity with incident response frameworks (such as ITIL Major Incident Management).
- Exposure to Cloud/SaaS environments and, ideally, contact centre services.
Desirable
- Experience working in 24/7 follow-the-sun operational environments.
- Familiarity with monitoring/alerting, ticketing systems, or post-incident analysis workflows.
- Understanding of SRE or DevOps ways of working (without requiring technical depth).
Key Behaviours
- Calm under pressure - maintains clarity and composure during severe outages.
- Highly collaborative - brings teams together and reduces friction.
- Strong ownership - sees incidents through to mitigation and closure.
- Clear communicator - able to speak confidently with both engineers and business stakeholders.
- Process-driven - follows and enforces incident protocols consistently.
- Bias for action - unblocks teams and maintains momentum when incidents stall.
Benefits
- Fixed compensation;
- Long-term employment with vacation counted in working days;
- Professional development (courses, training, etc.);
- Being part of successful cutting-edge technology products that are making a global impact in the service industry;
- Proficient and fun-to-work-with colleagues;
- Apple gear.
Omilia is proud to be an equal opportunity employer and is dedicated to fostering a diverse and inclusive workplace. We believe that embracing diversity in all its forms enriches our workplace and drives our collective success. We are committed to creating an environment where everyone feels welcomed, valued, and empowered to contribute their unique perspectives. All eligible candidates will be given consideration for employment without regard to factors such as race, color, religion, gender, gender identity or expression, sexual orientation, national origin, heredity, disability, age, or veteran status.