You will lead and support the migration of ETL jobs and databases from on-premises environments to the AWS Cloud.
Responsibilities
Design, develop, and implement AWS Glue PySpark ETL processes to extract, transform, and load data into AWS data stores such as Redshift and Aurora PostgreSQL.
Automate and orchestrate ETL processes, applications, and APIs to optimize performance across on-premises and cloud environments.
Conduct business analysis and data validation to minimize errors, extract insights, and support decision-making.
Provide comprehensive production support for ETL processes and data integrity, including monitoring, troubleshooting, and issue resolution.
Develop detailed documentation covering processes, data mappings, data flows, and support procedures.
Required Skills
8+ years of experience in ETL development, data warehousing, AWS cloud services, and API development.
Expertise in scripting languages, including Python, Perl, and Korn Shell, on Unix/Linux platforms.
Comprehensive proficiency in AWS services, including AWS Glue, Step Functions, Lambda, S3, and EC2, as well as Airflow DAGs.
Proficiency in SQL with relational databases, including DB2 on AIX, Amazon RDS, Aurora PostgreSQL, and Redshift.
Experience developing full-stack applications, including REST and SOAP APIs.
Familiarity with CI/CD processes and version control tools like Git or Bitbucket.
Demonstrated experience in data analysis and validation using Excel, Python, and Tableau.
Experience with ETL tools such as DataStage or Informatica.