Description
You will lead the technical implementation of AI and GPU workloads.
Responsibilities
- Set up and manage Linux-based environments, including shell scripting and package management.
- Deploy and configure LLMs using PyTorch and Hugging Face Transformers.
- Run GPU-based inference workloads using NVIDIA CUDA.
- Manage GPU instances on cloud platforms such as AWS (EC2) and GCP.
- Collaborate with the team to integrate models into production pipelines.
Required Skills
- 5+ years of professional experience.
- Mastery of Linux, particularly Ubuntu.
- Proficiency in shell scripting and package management.
- Expertise in Python, PyTorch, and Hugging Face Transformers.
- Hands-on experience deploying LLMs (Llama preferred).
- Experience running GPU-based inference using CUDA.
- Familiarity with cloud GPU platforms including AWS and Google Cloud Platform (GCP).
- Ability to create and maintain isolated Python environments with Conda, and to manage their packages with pip.