Description
Key Skills: LLM, Python, LLM Ops, Vector Database Integration, Retrieval-Augmented Generation, MLOps, FastAPI, Flask, REST API
Roles and Responsibilities:
1. LLM Ops & Orchestration
- Operationalize LLM workloads including prompt/version management, model routing, A/B testing, guardrails, and safety enforcement.
- Design and implement agent-based workflows using frameworks such as LangChain, LangGraph, and Google Agent Development Kit.
- Establish evaluation frameworks for hallucination detection, factuality scoring, toxicity filtering, and relevance validation.
- Implement observability including prompt/response tracing, latency monitoring, cost tracking, and drift detection.
2. RAG Pipeline Engineering (Design to Production)
- Build end-to-end RAG systems: ingestion → chunking → embeddings → retrieval → reranking → generation.
- Optimize context windows, grounding strategies, and hybrid retrieval (BM25 + dense search).
- Enable multi-document and multi-tenant retrieval using metadata filtering and security-aware access control.
3. Vector Database Engineering
- Design and operate vector databases such as Pinecone, FAISS, Milvus, or Weaviate.
- Configure indexing strategies (HNSW, IVF), sharding, scaling, and lifecycle management.
- Tune recall vs. latency trade-offs for production performance.
4. Python & Platform Engineering
- Develop scalable microservices and APIs using Python with FastAPI or Flask.
- Design and consume REST APIs with OpenAPI/Swagger standards, pagination, rate limiting, and structured error models.
- Containerize applications using Docker and deploy on Kubernetes clusters.
- Implement CI/CD pipelines using Jenkins, GitHub Actions, or Azure DevOps.
- Integrate with AWS/Azure/GCP services for storage, queues, monitoring, and model endpoints.
5. Security, Compliance & Governance
- Implement data privacy controls including PII masking, RBAC, and audit logging.
- Apply enterprise-grade guardrails and jailbreak detection mechanisms.
- Ensure compliance with platform security standards and regulatory requirements.
6. Collaboration & Documentation
- Partner with Data/ML, Product, and Security teams to deliver scalable AI solutions.
- Produce architecture diagrams, API contracts, and operational runbooks.
- Drive best practices for maintainable, observable, and secure AI platform development.
Skills Required:
- Strong expertise in LLM Ops including prompt engineering, orchestration, evaluation, and monitoring is essential.
- Practical experience building and optimizing production-grade RAG pipelines is required.
- Hands-on experience with at least one vector database platform is mandatory.
- Advanced Python engineering skills including async programming, packaging, and SOLID principles are expected.
- Experience with Docker, Kubernetes, CI/CD pipelines, and observability tooling is required.
- Familiarity with hybrid search (BM25 + dense retrieval), reranking models, and ANN configurations is advantageous.
- Exposure to model ecosystems including OpenAI, Azure OpenAI, Anthropic, Cohere, Llama, or Mistral is preferred.
- Understanding of cost optimization strategies including caching (Redis) and token usage monitoring is beneficial.
Education: Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.