Lead - Reliability Engineer (m/f/d)

Permanent
Dubai, United Arab Emirates
23.04.2025

Lead Reliability Engineer – Enterprise SaaS Platform (Dubai)

Core Mission
This role drives reliability and scalability for a global SaaS platform serving enterprise clients. The ideal candidate will balance hands-on technical leadership with team enablement, ensuring high availability, security, and performance across a cloud-native stack while fostering a culture of ownership and operational excellence.

Key Responsibilities
Team Leadership

  • Manage and mentor a growing team of reliability engineers and DevOps specialists, emphasizing psychological safety, professional growth, and collaborative problem-solving.

  • Define processes for planning, prioritization, and delivery in a fast-paced environment, balancing velocity with long-term system health.

  • Champion reliability principles across engineering teams, advocating for resilience strategies, incident preparedness, and blameless postmortems.

Technical Execution

  • Architect and optimize an AWS-native infrastructure (EKS, Aurora, Terraform) to support scalability, automation, and observability.

  • Lead CI/CD enhancements, release automation, and developer tooling to accelerate deployment cycles without compromising stability.

  • Advance monitoring maturity through improved dashboards, alerts (e.g., CloudWatch, Prometheus), and SLO-driven instrumentation to preemptively address risks.

Operational Resilience

  • Translate incidents into coaching opportunities, strengthening cross-team operational readiness and response protocols.

  • Partner with security teams to conduct audits and vulnerability assessments and to ensure compliance across cloud environments.

  • Mitigate technical debt by prioritizing high-impact infrastructure investments and automating repetitive tasks.

Strategic Impact

  • Align reliability initiatives with organizational goals, translating long-term vision into actionable engineering roadmaps.

  • Optimize vendor relationships (e.g., AWS, New Relic) to balance cost, capability, and innovation.

  • Promote a culture of urgency and ownership, encouraging proactive problem-solving and accountability during high-stakes scenarios.

Critical Qualifications

  • Proven experience in AWS-based SRE/DevOps roles at scale, ideally supporting B2B SaaS platforms with 10,000+ daily active users.

  • Dual expertise as a hands-on engineer and people leader, capable of coding complex solutions while mentoring junior team members.

  • Technical fluency in infrastructure-as-code (Terraform), Kubernetes (EKS), observability tooling, and incident management frameworks.

  • Mindset fit: Thrives in ambiguous environments, prioritizes impact over perfection, and fosters collaboration across security, QA, and product teams.

Location: Dubai, UAE

 
