Infrastructure Architect

Contract
Abu Dhabi, UAE
12.11.2024

Job Title: Infrastructure Architect

Location: Abu Dhabi, UAE

Employment Type: Contract

As the Principal Infrastructure Architect, you will be responsible for designing and implementing the infrastructure that supports our cutting-edge AI platforms. Drawing on a minimum of 15 years of experience, you will lead the technical strategy for infrastructure architecture, ensuring the scalability, reliability, and security of these platforms. You will work closely with engineering and operations teams to build and maintain high-performance systems for AI workloads, with a focus on optimizing cloud infrastructure, networking, and storage.

Key Responsibilities:

  • Lead the design and implementation of scalable, secure, and highly available infrastructure to support our AI platforms and applications.

  • Architect cloud and on-premise infrastructure solutions that meet performance, scalability, and security requirements for AI/ML workloads.

  • Collaborate with engineering, DevOps, and product teams to ensure that the infrastructure is optimized for AI model training, inference, and data processing.

  • Ensure high availability and disaster recovery capabilities are built into the infrastructure to protect critical AI platforms and services.

  • Design and implement networking solutions that provide secure, fast, and reliable connectivity for distributed AI systems and platforms.

  • Drive the adoption of automation and infrastructure as code (IaC) practices, using tools like Terraform, Ansible, and Kubernetes to streamline deployment and management.

  • Implement best practices in cloud architecture, ensuring efficient use of cloud resources while maintaining cost-effectiveness and security.

  • Evaluate and integrate new technologies and tools to enhance infrastructure capabilities, such as advanced storage solutions, networking optimizations, and cloud-native services.

  • Ensure that the infrastructure meets compliance, security, and regulatory requirements, with a focus on data privacy, encryption, and access controls.

  • Provide technical leadership and mentorship to the infrastructure engineering team, ensuring they adopt best practices and continue to improve infrastructure quality and efficiency.

  • Stay ahead of industry trends and emerging technologies, continuously driving innovation in infrastructure to support evolving AI technologies.

Qualifications & Requirements:

  • Minimum of 15 years of experience in infrastructure architecture, cloud platforms, and systems engineering, with a strong focus on building large-scale, mission-critical systems.

  • Proven track record in architecting and managing cloud-based infrastructures (AWS, Azure, GCP) for AI/ML workloads, with a focus on scalability and performance.

  • Deep expertise in cloud computing, networking, and storage, with hands-on experience in infrastructure as code (IaC) tools such as Terraform, Ansible, or CloudFormation.

  • Strong understanding of containerization (Docker) and orchestration (Kubernetes) technologies, with experience in deploying and managing AI/ML workloads.

  • Extensive knowledge of networking protocols, VPNs, firewalls, and security best practices, particularly in cloud and hybrid environments.

  • Expertise in storage systems and data management for AI/ML, including distributed storage and high-performance computing (HPC) architectures.

  • Experience designing disaster recovery, business continuity, and high-availability strategies for critical systems.

  • Ability to work with cross-functional teams, including DevOps, software engineering, and security, to ensure alignment on infrastructure goals.

Preferred Qualifications:

  • Advanced degree in Computer Science, Engineering, or a related field.

  • Certifications in cloud architecture (AWS Certified Solutions Architect, Google Cloud Professional Cloud Architect, etc.).

  • Experience with AI/ML infrastructure challenges, including high-performance computing (HPC), large-scale data pipelines, and AI model deployment at scale.

  • Expertise in cost optimization strategies for cloud infrastructures, ensuring efficiency without sacrificing performance or security.

Skills and Attributes for Success:

  • Strategic mindset with the ability to design and implement infrastructure solutions that align with business objectives and technical goals.

  • Excellent problem-solving skills, with the ability to navigate complex technical challenges related to infrastructure scalability, performance, and security.

  • Strong communication and leadership skills, with the ability to work effectively across teams and mentor junior infrastructure engineers.

  • Passion for innovation and staying at the forefront of infrastructure technologies, particularly in support of AI and ML applications.
