Site Reliability Engineer - Networking Support

NVIDIA

Azusa, CA, U.S.

Posted On: 6 days ago
Experience: 4+ years
Availability: Remote
Openings: 1
Category: Sr. Site Reliability Engineer
Tenure: Full-time Only

Description

You will maintain the reliability and uptime of customer production environments through proactive monitoring and troubleshooting of data center networking equipment.

Responsibilities

  • Monitor equipment, applications, and processes using monitoring tools and consoles.
  • Debug and triage incidents and user-reported issues rapidly.
  • Collaborate with Tier 2 and Tier 3 support to resolve complex networking issues.
  • Develop documentation for operations processes.
  • Perform hardware tasks including racking, stacking, and replacing modules or cabling.

Required Skills

  • 4+ years of Site Reliability Engineering experience in a production environment with large-scale distributed microservices.
  • Hands-on experience operating network devices, including cabling, transceivers, and component replacement.
  • Proficiency with TCP/IP networks and standard troubleshooting tools.
  • Strong knowledge of the Linux operating system and associated tools.
  • Experience with incident management, organizational change, and problem management processes.
  • Ability to perform server, network switch, and system reboots, as well as file restores.
  • Experience providing Level 1 network and server support, including system backups and batch processing.
  • Bachelor’s Degree in Information Technology or equivalent experience.

Preferred Skills

  • Ability to work a rotating shift schedule including days, nights, weekends, and holidays.

Education

Bachelor’s Degree
