Site Reliability Engineer

People Prime Worldwide

 

Pune, Maharashtra, India

Posted On: 30+ days ago
Experience: 14+ years
Availability: Onsite
Openings: 1
Category: Site Reliability Engineer
Tenure: Full-time Only

Description

You will ensure the reliability, performance, and scalability of production systems.

Responsibilities

  • Own the reliability of central AI models, agent registry, deployment pipelines, AI SecOps products, and portfolio products.
  • Manage incidents, perform root cause analysis, and implement preventative measures.
  • Support capacity planning, disaster recovery planning, and cost management.
  • Collect and analyze operational data to define Service Level Objectives (SLOs) from key metrics and Service Level Indicators (SLIs).
  • Automate processes using predictive monitoring, auto-scaling, or self-healing mechanisms.

Required Skills

  • 14+ years of experience in Production Support and SRE roles.
  • Deep experience ensuring quality, security, reliability, and compliance using SRE best practices.
  • Proficiency in performance analysis, log analytics, and automated testing.
  • Experience collaborating with data scientists and stakeholders to incorporate feedback.
  • Ability to drive automation leveraging predictive monitoring and auto-scaling.
  • Familiarity with Agile methodologies and experience fostering collaboration between development and operations.

Education

Any Graduate
