You are a strong full-stack engineer who ships end-to-end — design, tests, CI/CD pipeline, deploy, and production monitoring — and treats every stage as first-class craft rather than overhead.
You have clear technical taste, articulate trade-offs well, and know when to reach for an AI agent versus a simpler tool.
You use modern AI development tools fluently in your daily workflow and have a grounded point of view on where they help and where they don't.
You iterate based on real usage data and telemetry, not intuition alone.
Key Responsibilities:
Platforms & Tooling
Build and operate CI/CD pipelines and integrations
Build "Golden Path" scaffolding with standards, security, and quality gates built in
Build and maintain a governed catalog (Agents, MCPs, Skills) with behavior, permission, and access controls
Build AI agents, LLM-powered tooling, and the frameworks and SDKs that accelerate them
Build integrations that give humans and AI agents deep context on our systems
Quality, Testing & Reliability
Own testing strategy across the platform: unit, integration, contract, and end-to-end
Apply mutation, property-based, or fuzz testing where it pays off
Define SLOs for platform services and AI tooling
Observability & Performance Analytics
Design metrics, logs, traces, and dashboards for productivity, adoption, and service health
Build alerting and anomaly detection to catch regressions early
Analyze telemetry to guide investment decisions
Collaboration & Impact
Partner across teams to drive adoption of platform tooling and AI-augmented workflows
Stay current with LLM and platform-engineering trends and coach colleagues on where they apply
Rapidly prototype solutions to validate use cases
Communicate insights to stakeholders
Required Qualifications:
5–7+ years building and operating production software systems
Full-stack proficiency across Java, Python, and TypeScript/JavaScript, with frameworks like Spring Boot and Angular
Deep CI/CD experience — Jenkins, GitLab, or equivalent; comfortable with IaC
Testing discipline — TDD, test pyramid design, integration and contract testing
AWS fundamentals — ECS, EC2, Fargate, Lambda, API Gateway
Observability — metrics, logs, traces; CloudWatch or Splunk
Daily use of AI coding tools — Claude Code, Kiro, Codex, or equivalent, with a clear point of view on their limits
Nice to Have:
Building internal developer platforms, SDKs, or CLIs at scale
LLM orchestration in production — MCP servers, agents, tool calling
Advanced testing (mutation, property-based)
Driving platform adoption across teams