Bangalore South, Bengaluru, Karnataka

Posted On: 30+ days ago

Experience: 5+ years

Availability: Onsite

Openings: 1

Category: Azure DevOps

Tenure: Full-time Only

Key Responsibilities:

Manage and support Azure-based production environments, ensuring stability, performance, and availability.
Implement SRE practices including SLIs, SLOs, error budgets, alert tuning, and toil reduction.
Develop automation scripts and internal tooling using Python for reporting, integrations, and operational workflows.
Work with Azure Monitor, Log Analytics, and Application Insights to create dashboards, alerts, and KQL queries.
Deploy, manage, and troubleshoot Azure services such as App Services, Functions, AKS, Storage, Key Vault, Azure SQL, Service Bus, and Event Hub.
Implement and maintain CI/CD pipelines using Azure DevOps, GitHub Actions, Jenkins, or similar tools.
Use Terraform/Bicep/ARM templates for infrastructure provisioning and automation.
Troubleshoot incidents across application, infrastructure, and network layers, performing root cause analysis.
Improve platform reliability through automation, observability enhancements, and performance optimization.
Collaborate with development, product, and platform engineering teams on deployments, releases, and reliability improvements.

Required Skills and Qualifications:

Strong hands-on experience with Microsoft Azure cloud services.
Proven experience in SRE, DevOps, platform engineering, or production support roles.
Strong Python scripting experience for automation, integrations, reporting, and tooling.
Good understanding of the Azure SDK, REST APIs, Azure CLI, Bash, or PowerShell.
Experience with Azure Monitor, Application Insights, Log Analytics, KQL, dashboards, and alerts.
Strong knowledge of Azure networking: VNets, subnets, NSGs, Private Endpoints, Load Balancers, App Gateway, Azure Front Door.
Experience with Terraform, Bicep, ARM templates, or similar IaC tools.
Hands-on experience building and maintaining CI/CD pipelines.
Good understanding of Linux and Windows environments.
Strong expertise in incident management, troubleshooting, RCA, and operational excellence.
Knowledge of SRE fundamentals (SLA, SLO, SLI, MTTR, MTTD, reliability engineering).

Good-to-Have Skills:

Experience with Kafka, Service Bus, Event Hub, or high‑volume messaging systems.
Exposure to FinOps, cost‑optimization, or cloud governance.
Experience working in financial services or other highly regulated environments.
Familiarity with container platforms (AKS, Docker) and distributed systems.
Knowledge of advanced observability tools (Prometheus, Grafana, Datadog).
Experience collaborating with globally distributed teams.

Any Graduate
