Description
You will design, build, and fine-tune NLP and LLM solutions for business use cases including classification, summarization, and Q&A. The role covers writing efficient, production-quality Python code for training, inference, and evaluation pipelines; building RAG applications with embeddings, vector databases, and prompt engineering; and integrating LLM applications into services and APIs with attention to performance, reliability, and scalability. You will also establish model evaluation, monitoring, and governance practices covering quality, safety, bias, and drift, and collaborate with data engineering and platform teams on data pipelines, deployments, and CI/CD.
Responsibilities
- Design and implement NLP/LLM solutions for classification, summarization, and Q&A tasks.
- Build RAG applications using embeddings, vector databases, and prompt engineering techniques.
- Develop production-grade Python code for training, inference, and evaluation pipelines.
- Integrate LLM applications into services/APIs with a focus on performance and scalability.
- Establish model evaluation, monitoring, and governance practices for quality and safety.
Required Skills
- 6+ years of overall experience in software development, data analytics, data science, or ML engineering.
- 2+ years of hands-on experience with deep learning for NLP/GenAI.
- Strong Python proficiency, including writing production-quality, testable, and maintainable code.
- Experience with deep learning frameworks (PyTorch or TensorFlow) and Hugging Face Transformers.
- Solid understanding of deep learning architectures, tokenization, attention/transformers, and fine-tuning.
- Experience building APIs and rapid prototypes with FastAPI, Flask, or Streamlit.
- Experience with MLOps: model packaging, CI/CD, Docker, Kubernetes, MLflow, and monitoring.
- Software engineering skills: Git, code reviews, unit/integration testing (pytest), REST APIs.
Preferred Skills
- Experience with LLM orchestration frameworks (LangChain, LlamaIndex, Semantic Kernel).
- Experience with vector databases and embedding workflows (FAISS, Pinecone, Weaviate, Chroma, Azure AI Search).
- Experience deploying and scaling ML/LLM workloads on cloud platforms (Azure preferred; GCP/AWS acceptable).