Description
You will design, develop, and maintain scalable data pipelines on Google Cloud Platform (GCP). You will translate business requirements into technical solutions while maintaining high standards of data quality and security.
Responsibilities
- Build and optimize end-to-end ETL/ELT pipelines using Apache Spark, Hive, and Python.
- Manage data ingestion and transformation workflows within BigQuery and Google Cloud Storage (GCS).
- Troubleshoot production pipeline issues and resolve them within defined SLAs.
- Conduct code reviews and mentor junior engineers on best practices.
- Collaborate with product owners and analysts in an Agile/Scrum environment.
Required Skills
- 6–8 years of hands-on experience in data engineering or a related field.
- Strong proficiency with Google Cloud Platform services, specifically BigQuery and GCS.
- Expertise in Apache Spark and Hive for large-scale distributed data processing.
- Solid programming skills in Python for data engineering tasks.
- Proven experience designing and maintaining robust data pipelines and ETL frameworks.
- Familiarity with Agile/Scrum delivery models and tools such as Jira.
- Bachelor's or Master's degree in Computer Science or a related field.