Charlotte, NC, USA
Key Skills:
Must-Have Skills (Mandatory):
GCP, Azure (multi-cloud preferred)
Terraform (strong hands-on IaC)
Cloud Networking & Hybrid Connectivity (VPN, VPC/VNet peering, private endpoints)
Landing Zones & Cloud Governance (Org Policies, guardrails)
Kubernetes (GKE), OpenShift (OCP)
Platform Engineering / Internal Developer Platforms
Observability (monitoring, logging, tracing)
SRE concepts (SLOs, SLIs, reliability engineering)
Python (automation)
HashiCorp Vault (secrets management)
GenAI / Advanced Skills (Strongly Preferred):
GenAI Platforms / LLMs
RAG (Retrieval Augmented Generation)
MLOps / LLMOps pipelines
Key Responsibilities (Keywords for Search):
Build enterprise cloud platforms (GCP + Azure)
Implement Terraform-based reusable modules
Design landing zones & governance frameworks
Enable hybrid/multi-cloud connectivity
Manage Kubernetes platforms (GKE/OCP)
Build Internal Developer Portals (self-service infra)
Define SLOs, reliability patterns, observability
Support GenAI/LLM workloads and platform enablement
Responsibilities:
Design, build, and operate secure, scalable GCP (GKE) and OpenShift (OCP) platforms to support deployment of GenAI models, LLMs, and RAG workloads.
Provision and manage cloud infrastructure using Terraform, including landing zones, networking, org policies, and hybrid connectivity across GCP and Azure.
Enable MLOps/LLMOps pipelines for model deployment, monitoring, and lifecycle management, integrating Arize AI and GenAI platforms.
Implement platform engineering best practices, including Kubernetes-based abstractions, internal developer portals, and self-service environments.
Ensure platform security, governance, and secrets management using HashiCorp Vault, IAM, and policy-as-code.
Establish observability, SLOs, and SRE practices to ensure reliability and performance of GenAI and platform services.
Collaborate with data scientists, ML engineers, and application teams to onboard new LLMs, APIs, and inference services efficiently.
Any Graduate