Rosemead, CA, USA
Design, build, and deploy production-ready Generative AI applications and APIs on GCP.
Implement Retrieval-Augmented Generation (RAG) pipelines using vector databases and unstructured data sources.
Develop and manage end-to-end MLOps pipelines (training, evaluation, deployment, monitoring) using Vertex AI, Kubeflow, Cloud Build, and Terraform.
Optimize model performance through fine-tuning, prompt engineering, and model compression techniques (e.g., GPTQ, AWQ).
Architect scalable and cost-efficient GCP infrastructure with a strong focus on security, IAM, and VPC design.
Collaborate cross-functionally to define AI/ML technical roadmaps and deliver business-impacting solutions.
Ensure adherence to data privacy, security, and ethical AI standards (HIPAA and GDPR, where applicable).
Provide technical leadership, guidance, and mentorship to team members.
5–8+ years of industry experience in Machine Learning.
3+ years of hands-on experience building and deploying Generative AI / LLM-based solutions in production.
Strong experience with Google Cloud Platform (GCP), including Vertex AI, BigQuery, Dataflow, and Cloud Run.
Advanced proficiency in Python and ML frameworks (TensorFlow, PyTorch, scikit-learn).
Hands-on experience with LLM frameworks/tools such as LangChain, LlamaIndex, or Hugging Face.
Solid understanding of SQL and unstructured data processing.
Experience with Docker, Kubernetes (GKE), and CI/CD pipelines.
Experience with multi-agent systems and orchestration (LangGraph, AutoGen, etc.).
Deep understanding of vector databases (Vertex AI Vector Search, Pinecone, Chroma).
Google Cloud Professional Machine Learning Engineer certification.
Proven ability to lead technical initiatives and mentor junior engineers.
Bachelor's degree