Description
You will design and implement ETL processes using AWS Glue, Python, and SQL to ingest, transform, and load data.
Responsibilities
- Develop and maintain data lakes and data warehouses using Amazon S3 and Amazon Redshift.
- Optimize data storage, performance, and cost efficiency using Redshift Spectrum and query tuning.
- Build reusable data ingestion and transformation frameworks following best practices for scalability and security.
- Troubleshoot performance bottlenecks and ensure system reliability in line with the AWS Well-Architected Framework.
- Define data models, governance, and solution design in collaboration with cross-functional teams.
Required Skills
- 6+ years of hands-on experience with the AWS analytics stack, including AWS Glue, Amazon Redshift, and Amazon S3.
- Proficiency in Python for data processing and automation.
- Strong SQL expertise for complex queries, transformations, and data validation.
- Experience with data modeling and schema design, including dimensional modeling (star and snowflake schemas).
- Good understanding of data architecture, integration patterns, and solution design principles.
- Exposure to data governance, cataloging, and security best practices within the AWS environment.
- Experience implementing data quality checks and monitoring across ETL pipelines.
- Bachelor's degree.