Description
You will lead technical efforts to maintain and enhance system reliability.
Responsibilities
- Identify and resolve issues that put service level agreements (SLAs) or system performance at risk.
- Troubleshoot and resolve complex issues in collaboration with the engineering team.
- Develop runbooks to document and standardize operational procedures.
- Optimize cloud platform infrastructure for availability and scalability.
- Create automation tools using shell scripting or Python to improve team efficiency.
Required Skills
- 8+ years of relevant experience.
- Bachelor's degree in Computer Science or Engineering, or equivalent experience.
- Strong Linux system administration experience with advanced troubleshooting skills.
- Demonstrable scripting and automation skills (Bash, Python).
- Experience with Software-Defined Storage (SDS).
- Familiarity with cloud services (compute, storage, network, automation).
- Experience with virtualization technologies.
- Familiarity with Istio and Envoy service mesh technologies.
- Self-motivated and capable of delivering results with minimal supervision.