You will design and develop data processes and agentic systems to solve real-world problems.
Responsibilities
Design and develop data pipelines and robust data flows to handle complex interactions between AI agents and data sources.
Train and fine-tune large language models (LLMs) using structured and unstructured datasets.
Build data architecture spanning databases, data lakes, and vector databases, with vector stores optimized for storage and retrieval of embeddings.
Manage ELT processes and implement pipelines that facilitate feedback loops for human-in-the-loop systems.
Collaborate with data scientists to preprocess data, train models, and integrate AI into applications.
Required Skills
5+ years of experience in data engineering or related fields.
Proficiency in Python and AI/ML frameworks.
Experience with Apache Spark, including designing and implementing data partitioning schemes.
Hands-on experience with Azure Databricks and Azure AI services (Azure OpenAI, Azure AI Search, Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, and Azure Media Services), as well as Azure Blob Storage.
Expertise in integrating with AI agent frameworks and working with vector databases.
Strong understanding of ELT, big data frameworks, and cloud computing practices.
Knowledge of graph databases and core machine learning algorithms.
Proficiency with Git version control.
Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Data Science, or a related field.