Own the mainframe automation strategy and roadmap to improve reliability, incident detection, and recovery.
Scale automated solutions across infrastructure domains (batch, storage, networking, middleware) using APIs, orchestration, and infrastructure-as-code.
Architect and govern multi-site failover automation; maintain and test DR playbooks and runbooks.
Define and operationalize SLOs/SLIs, error budgets, and alerting standards; reduce MTTA/MTTR through event correlation and automated remediation.
Instill disciplined engineering: peer reviews, version control, change management gates, and automation standards aligned with risk and compliance requirements.
Build and lead a high-performing team; develop talent in REXX, z/OS automation, DevOps, and integration.
Partner across platforms, applications, cyber, risk, and compliance to prioritize automation investments that reduce toil and operational risk.
Own KPIs and continuous improvement cycles; communicate performance, risks, and outcomes to senior stakeholders.
Responsibilities
Technical leadership and hands-on guidance in:
Mainframe automation frameworks and system state management
Expert-level REXX scripting; deep z/OS operating environment expertise
Designing/supporting automated failover in multi-site environments
RESTful APIs and systems integration for orchestration
DevOps tooling (e.g., Jenkins with the Zowe plug-in, Ansible) and CI/CD for mainframe workloads
ServiceNow ITSM workflows and automation
AIOps/event-management platforms (e.g., Moogsoft or equivalent)
SQL, JCL, ISPF/TSO, and related toolchains
Governance, risk, and controls:
Strong change control discipline (Git, peer reviews, deployment safeguards)