Live Jobs

Discover and apply for jobs

Data Engineer - Gen AI (m/f/d)

Contract

Amman, Egypt

24.01.2025

We are seeking an inventive and forward-thinking Data Engineer to join our innovative team. In this role, you will not just follow the traditional paths of data engineering; instead, you'll break new ground by bringing a fresh, creative perspective to every project. Your self-motivation and ability to think differently will be key as you design and implement smart data solutions that go beyond the ordinary.

Our Tech Stack:

Languages: SQL & Python
Pipeline orchestration tool: Dagster (Legacy: Airflow)
Data stores: Snowflake, ClickHouse
Platforms & Services: Docker, Kubernetes
PaaS: AWS (ECS/EKS, DMS, Kinesis, Glue, Bedrock, Athena, S3, and others)
ETL: Fivetran & dbt for transformation
IaC: Terraform (with Terragrunt)

Key Responsibilities:

Design and Implement Innovative Data Solutions: Develop and maintain advanced ETL pipelines using SQL, Python, and Generative AI, transforming traditional data processes into highly efficient and automated solutions.
Orchestrate Complex Data Workflows: Utilise tools such as Dagster and Airflow for sophisticated pipeline orchestration, ensuring seamless integration and automation of data processes.
Leverage Generative AI for Data Solutions: Create and implement smart data solutions using Generative AI techniques like Retrieval-Augmented Generation (RAG). This includes building solutions that retrieve and integrate external data sources with LLMs to provide accurate and contextually enriched responses.
Employ Prompt Engineering: Develop and refine prompt engineering techniques to effectively communicate with large language models (LLMs), enhancing the accuracy and relevance of generated responses in various applications.
Utilise Embeddings and Vector Databases: Apply embedding language models to convert data into numerical representations, storing them in vector databases. Perform relevancy searches using these embeddings to match user queries with the most relevant data.
Incorporate Semantic Search Techniques: Implement semantic search to enhance the accuracy and relevance of search results, ensuring that data retrieval processes are highly optimised and contextually aware.
Collaborate Across Teams: Work closely with cross-functional teams, including data science and business analytics, to understand and deliver on unique and evolving data requirements.
Ensure High-Quality Data Flow: Leverage stream, batch, and Change Data Capture (CDC) processes to ensure a consistent and reliable flow of high-quality data across all systems.
Enable Business User Empowerment: Use data transformation tools like dbt to prepare and curate datasets, empowering business users to perform self-service analytics.
Maintain Data Quality and Consistency: Implement rigorous standards to ensure data quality and consistency across all data stores, continuously innovating to improve data reliability.
Monitor and Enhance Pipeline Performance: Regularly monitor data pipelines to identify and resolve performance and reliability issues, using innovative approaches to keep systems running optimally.

Essential Experience:

3+ years of experience as a data engineer.
Proficiency in SQL and Python.
Experience with modern cloud data warehousing and data lake solutions such as Snowflake, BigQuery, Redshift, or Azure Synapse.
Expertise in ETL/ELT processes, and experience building and managing batch and streaming data processing pipelines.
Strong ability to investigate and troubleshoot data issues, providing both short-term fixes and long-term solutions.
Experience with Generative AI, including Retrieval-Augmented Generation (RAG), prompt engineering, and embedding techniques for creating and managing vector databases.
Knowledge of AWS services, including DMS, Glue, Bedrock, SageMaker, and Athena.
Familiarity with dbt or other data transformation tools.

Other Desired Experience:

Familiarity with AWS Bedrock Agents and experience in fine-tuning models for specific use cases, enhancing the performance of AI-driven applications.
Proficiency in implementing semantic search to enhance the accuracy and relevance of data retrieval.

