You will lead and support the migration of ETL jobs and databases from on-premises environments to the AWS Cloud.
Responsibilities
Design, develop, and implement AWS Glue PySpark ETL processes to extract, transform, and load data into AWS data warehouses and data lakes (Redshift, Aurora PostgreSQL).
Automate and orchestrate ETL processes, applications, and APIs across on-premises and cloud environments.
Conduct data analysis and validation to minimize errors, extract insights, and support business decisions.
Provide production support, monitoring, and troubleshooting for ETL processes and data integrity.
Develop comprehensive documentation covering processes, data mappings, and application code.
Required Skills
8+ years of experience in ETL development, data warehousing, AWS cloud services, and API development.
Expertise in ETL tools (DataStage or Informatica) and scripting languages (Korn Shell, Perl, and Python).
Comprehensive proficiency in AWS services, including AWS Glue, Step Functions, Lambda, S3, and EC2, and in workflow orchestration with Airflow DAGs.
Proficiency in SQL with on-premises (DB2 on AIX) and AWS databases (Amazon RDS, Aurora PostgreSQL, Redshift).
Experience developing full-stack applications, including REST and SOAP APIs.
Experience with version control tools such as Git or Bitbucket.
Familiarity with implementing Jenkins pipelines and CI/CD processes.
Demonstrated data analysis/validation skills using Excel, Python, and Tableau.