You will lead and support the migration of ETL jobs and databases from on-premises environments to the AWS Cloud.
Responsibilities
Design, develop, and implement AWS Glue PySpark ETL processes to extract, transform, and load data into AWS data warehouses and data lakes (Redshift and Aurora PostgreSQL).
Orchestrate and validate ETL processes, applications, and APIs across on-premises and cloud environments.
Conduct business analysis and perform data validation to ensure data quality and minimize inconsistencies.
Provide comprehensive production support for ETL processes, including monitoring, troubleshooting, and issue resolution.
Develop documentation covering data mappings, data flows, APIs, and support procedures.
Required Skills
8+ years of experience in ETL development, data warehousing, AWS cloud services, and API development.
Expertise in ETL tools (DataStage, Informatica) and scripting languages (Korn Shell, Perl, Python) on Unix/Linux platforms.
Comprehensive proficiency in AWS services and orchestration tools, including AWS Glue, Step Functions, Airflow DAGs, Lambda, S3, and EC2.
Proficiency in SQL with on-premises databases (e.g., DB2 on AIX) and AWS databases (e.g., Amazon RDS, Aurora PostgreSQL, Redshift).
Experience developing scalable full-stack applications, including back-end APIs (REST/SOAP).
Demonstrated experience in data analysis/validation using Python and Excel.
Experience with version control using Git and repository hosting platforms such as Bitbucket.
Familiarity with CI/CD processes, such as Jenkins pipelines.