Description

You will own the reliability and operational excellence of cloud-native systems.

Responsibilities

  • Deploy and manage containerized applications on Kubernetes.
  • Optimize cloud infrastructure using Terraform or Helm.
  • Support and troubleshoot Java microservices and APIs.
  • Handle incident and change management via ServiceNow.
  • Collaborate with DevOps, QA, and development teams on system stability.

Required Skills

  • 5 to 8 years of experience in Site Reliability Engineering.
  • Expertise with Kubernetes and container orchestration.
  • Strong proficiency in Java, including Spring Boot and REST APIs.
  • Hands-on experience with AWS, Azure, or GCP.
  • Familiarity with ITSM tools, specifically ServiceNow.
  • Experience implementing CI/CD pipelines using Jenkins or GitLab CI.
  • Proficiency with monitoring stacks such as Prometheus, Grafana, or ELK.

Preferred Skills

  • Experience with Terraform or Helm for infrastructure management.

Education

Bachelor’s or Master’s degree in Computer Science.