Description
You will design, build, and optimize data pipelines and ETL/ELT workflows to support analytics and reporting.
Responsibilities
- Define and evolve cloud-based data architecture, including data lakes, data warehouses, and streaming platforms.
- Partner with data scientists and analysts to deliver reliable, reusable data solutions.
- Develop and maintain scalable data storage solutions using Amazon S3, Amazon Redshift, and Snowflake.
- Implement data quality checks, validation processes, and metadata documentation.
- Monitor, troubleshoot, and improve overall pipeline performance and workflow efficiency.
Required Skills
- Expert proficiency in Python and SQL.
- Strong understanding of data integration, data modeling, and the software development life cycle (SDLC).
- Hands-on experience with AWS data services (Glue, Lambda, Athena, Step Functions, Lake Formation).
- Proficiency in at least one major cloud platform (AWS preferred).
- Knowledge of data warehousing and dimensional modeling (Kimball/star schema).
- Experience with relational databases (PostgreSQL, MySQL) and NoSQL databases.
- Experience working in Agile environments.
- Experience with big data processing technologies such as Spark, Hadoop, or Flink.