Senior MLOps Engineer Job - USA (The X4 Group)

Senior MLOps Engineer

USA

Posted On: 30+ days ago

Experience: 7+ years

Availability: Onsite

Openings: 1

Category: MLOps Engineer

Tenure: Full-time Only

Description

You will design and implement ML training, evaluation, and deployment pipelines for LLM applications.

Responsibilities

Operate and manage SUSE Linux Enterprise (SLES) GPU clusters featuring NVIDIA H100 hardware, handling driver installation and CUDA/NCCL tuning.
Integrate MLflow tracking, Azure ML model registry, and Model Catalog to unify model versioning and promotion.
Deploy GPU-based inference endpoints using Azure ML managed online endpoints, AKS GPU node pools, or Azure Arc-enabled Kubernetes, managing traffic splits and rollbacks.
Automate CI/CD in Azure DevOps from data preparation through model deployment using Infrastructure as Code (Terraform / Bicep).
Monitor model performance, data drift, and GPU metrics via Azure Monitor, Log Analytics, and NVIDIA DCGM Exporter integration.

Required Skills

7+ years in ML/AI engineering or MLOps roles with significant GPU workload experience.
Hands-on experience with SUSE Linux (SLES) in production AI environments.
In-depth knowledge of NVIDIA H100 architecture (HBM3, NVLink, MIG, multi-GPU orchestration).
Proficiency in Azure ML, Azure AI Foundry, and Prompt Flow for LLM workflows.
Expertise in deploying on Kubernetes with GPU node support (AKS, Arc-enabled K8s).
Experience with Infrastructure as Code, specifically Terraform and Bicep.
Familiarity with CI/CD workflows using Azure DevOps Pipelines.
Knowledge of distributed training frameworks (DeepSpeed, Horovod, PyTorch DDP).
Experience implementing governance and security for ML platforms.

Preferred Skills

Proven track record deploying ML models in hybrid (cloud + on-prem) environments.
Experience applying Azure’s Well-Architected ML guidance.

Key Skills

Azure ML, Azure AI Foundry, Linux, Azure DevOps Pipelines, Terraform, Infrastructure as Code, Kubernetes

Education

Any Graduate
