Distributed systems: Working knowledge required for debugging and end-to-end testing (not deep expertise)
Machine learning frameworks: TensorFlow, PyTorch, JAX, or similar
Must-Have:
Strong foundation in ML inference, deployment, and quality testing
Demonstrated ability to ramp up quickly on new and unfamiliar tech stacks — this is the single most important trait
End-to-end problem-solving mindset — can own a problem from model handoff to user-facing behavior
Core ML knowledge sufficient to benchmark models and collaborate with researchers
Experience deploying models in cloud environments, ideally GCP
Good to Have:
Exposure to Java or JVM-based systems (model integration happens in Java; deep expertise not required)
Familiarity with streaming data architectures
Experience in hybrid cloud/on-prem environments
What You Will Do:
Inference & Deployment
Evaluate and benchmark new ML inference frameworks to guide production decisions
Deploy models to GCP and integrate them into production applications and Java-based streaming pipelines
Own deployment automation end-to-end — from model handoff through live serving
Monitor model behavior in production as experienced by real end users
Performance & Quality
Design and execute benchmarking, performance testing, and quality testing on ML models
Perform model sampling to support quality evaluation and researcher feedback loops
Debug issues across the full stack, from the inference layer down to the streaming pipelines
Cross-functional Collaboration
Partner with ML researchers to provide benchmarking feedback and guide inference decisions; this requires enough core ML knowledge to hold substantive technical discussions
Adapt rapidly to non-standard and evolving tech stacks across hybrid (on-prem + GCP) infrastructure
Education:
Bachelor's or Master's degree in Computer Science, Computer or Electrical Engineering, Mathematics, or a related field