You will lead and support the migration of ETL jobs and databases from on-premises environments to the AWS Cloud.
Responsibilities
Design, develop, and implement AWS Glue PySpark ETL processes to extract, transform, and load data into AWS data warehouses, databases, and data lakes (including Redshift and Aurora PostgreSQL).
Automate and orchestrate ETL processes, applications, and APIs across on-premises and cloud environments.
Conduct business analysis and perform data validation to minimize errors and support business decision-making.
Provide comprehensive production support, monitoring, and troubleshooting for ETL processes and data integrity.
Develop documentation covering data flows, mappings, APIs, and support procedures.
Required Skills
10+ years of experience in ETL development, data warehousing, AWS cloud services, and API development.
Expertise in ETL tools (DataStage or Informatica) and scripting languages (shell, Perl, and Python).
Strong proficiency in AWS services and orchestration tooling, including Glue, Step Functions, Airflow DAGs, Lambda, S3, and EC2.
Proficiency in SQL across on-premises (DB2 on AIX) and AWS databases (Amazon RDS, Aurora PostgreSQL, Redshift).
Experience developing full-stack applications, including REST and SOAP API development.
Strong understanding of data integration, transformation, and loading processes.
Experience with version control tools like Git.
Demonstrated experience in data analysis and validation using Python and Excel.