Hyderabad, Telangana, India

Posted On: 30+ days ago

Experience: 7+ years

Availability: Onsite

Openings: 1

Category: Generative AI Engineer

Tenure: No Preference/Any

Key Skills: LLM, Python, LLM Ops, Vector Database Integration, Retrieval-Augmented Generation, MLOps, FastAPI, Flask, REST API

Roles and Responsibilities:

1. LLM Ops & Orchestration

Operationalize LLM workloads including prompt/version management, model routing, A/B testing, guardrails, and safety enforcement.
Design and implement agent-based workflows using frameworks such as LangChain, LangGraph, and Google Agent Development Kit.
Establish evaluation frameworks for hallucination detection, factuality scoring, toxicity filtering, and relevance validation.
Implement observability including prompt/response tracing, latency monitoring, cost tracking, and drift detection.

2. RAG Pipeline Engineering (Design to Production)

Build end-to-end RAG systems: ingestion → chunking → embeddings → retrieval → reranking → generation.
Optimize context windows, grounding strategies, and hybrid retrieval (BM25 + dense search).
Enable multi-document and multi-tenant retrieval using metadata filtering and security-aware access control.

3. Vector Database Engineering

Design and operate vector databases such as Pinecone, FAISS, Milvus, or Weaviate.
Configure indexing strategies (HNSW, IVF), sharding, scaling, and lifecycle management.
Tune recall vs. latency trade-offs for production performance.

4. Python & Platform Engineering

Develop scalable microservices and APIs using Python with FastAPI or Flask.
Design and consume REST APIs with OpenAPI/Swagger standards, pagination, rate limiting, and structured error models.
Containerize applications using Docker and deploy on Kubernetes clusters.
Implement CI/CD pipelines using Jenkins, GitHub Actions, or Azure DevOps.
Integrate with AWS/Azure/GCP services for storage, queues, monitoring, and model endpoints.

5. Security, Compliance & Governance

6. Collaboration & Documentation

Partner with Data/ML, Product, and Security teams to deliver scalable AI solutions.
Produce architecture diagrams, API contracts, and operational runbooks.
Drive best practices for maintainable, observable, and secure AI platform development.

Skills Required:

Strong expertise in LLM Ops including prompt engineering, orchestration, evaluation, and monitoring is essential.
Practical experience building and optimizing production-grade RAG pipelines is required.
Hands-on experience with at least one vector database platform is mandatory.
Advanced Python engineering skills including async programming, packaging, and SOLID principles are expected.
Experience with Docker, Kubernetes, CI/CD pipelines, and observability tooling is required.
Familiarity with hybrid search (BM25 + dense retrieval), reranking models, and ANN configurations is advantageous.
Exposure to model ecosystems including OpenAI, Azure OpenAI, Anthropic, Cohere, Llama, or Mistral is preferred.
Understanding of cost optimization strategies including caching (Redis) and token usage monitoring is beneficial.

Education: Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field