Description
You will design and manage data pipelines and data warehousing solutions on AWS, working with relational and NoSQL databases.
Responsibilities
- Build and maintain ETL processes using tools like Apache NiFi, Talend, or Informatica.
- Design and implement data models using normalization and other database design techniques.
- Develop data integration workflows for streaming, batch, and error/replay scenarios.
- Improve database performance through query tuning and other optimization strategies.
- Manage data lifecycles using AWS services including S3, Redshift, and RDS.
Required Skills
- 5+ years of experience in data engineering or a related field.
- Proficiency in Python, PySpark, and Apache Spark.
- Strong expertise in SQL and relational databases such as Oracle, MySQL, PostgreSQL, or SQL Server.
- Hands-on experience with NoSQL databases including MongoDB and Cassandra.
- Experience with AWS data engineering services, specifically Redshift, S3, and RDS.
- Solid understanding of data warehousing and ETL methodologies.
- Proficiency with version control and CI/CD tools like GitHub and Jenkins.
- Bachelor's or Master's degree in Computer Science, IT, or a related field.