You will design, deploy, and scale end-to-end machine learning systems and generative AI applications.
Responsibilities
- Engineer end-to-end ML pipelines covering data ingestion, feature engineering, training, and automated model promotion using MLOps stacks.
- Convert research code into production-grade microservices using Docker and Kubernetes with REST, gRPC, or event-driven APIs.
- Build full-stack AI applications by integrating model services with UI components and workflow engines to ensure low-latency delivery.
- Optimize model performance and cost through quantization, pruning, and tuned GPU/CPU auto-scaling policies.
- Implement comprehensive observability, including real-time metrics, distributed tracing, and drift/bias detection.
- Partner with data scientists to prototype algorithms and provide guidance on scalability and production-readiness.
Required Skills
- 8+ years of experience in machine learning and model development.
- Strong expertise in machine learning algorithms and large language models (LLMs).
- Hands-on experience with Databricks or similar distributed data platforms.
- Proficiency with MLOps frameworks such as Kubeflow or SageMaker.
- Experience using the OpenAI SDK and modern generative AI frameworks.
- Practical knowledge of containerization using Docker and orchestration with Kubernetes.
- Ability to build scalable APIs using REST or gRPC.
- Deep understanding of data engineering, feature engineering, and model evaluation.
- Experience implementing monitoring and model performance tracking mechanisms.
Education
- A graduate degree in Computer Science, Data Science, or a related field.