Description
You will build and maintain data engineering pipelines on Google Cloud Platform (GCP).
Responsibilities
- Develop ETL processes using SQL and scripting languages.
- Build reusable frameworks using Python to enhance existing data infrastructure.
- Implement and manage data workflows using tools such as Dataflow or Airflow.
- Set up and manage GCP IAM configurations.
- Own infrastructure definition and deployment using Terraform within CI/CD pipelines.
Required Skills
- 4+ years of Information Technology experience.
- Experience with GCP for data engineering tasks.
- Proficiency in Python, Scala, Java, or R for development.
- Experience with BigQuery, Hadoop, Hive, Spark, or Kafka.
- Strong SQL background for data transformation and querying.
- Experience with Git and GitHub for version control.
- Knowledge of core GCP services, including Dataproc and Composer.
- Familiarity with CI/CD pipeline concepts.
Preferred Skills
- Experience with relational or dimensional data modeling.
- Knowledge of Airflow DAG creation and monitoring.