You will build and maintain scalable data pipelines within the Google Cloud ecosystem.
Responsibilities
Design and develop data pipelines for ingestion, transformation, and loading into GCP services including BigQuery, Dataflow, Dataproc, Pub/Sub, and Cloud Storage.
Capture and process change data from various sources using Debezium and Apache Flink.
Implement data quality checks and procedures to ensure accuracy and consistency across datasets.
Automate data processing tasks using Python, Bash, or PowerShell scripting.
Monitor and troubleshoot pipelines to resolve technical issues and ensure stability.
Collaborate with architects and analysts to translate requirements into technical specifications.
Required Skills
5+ years of experience building and operationalizing large-scale enterprise data solutions on GCP.
Proficiency in Python development (mandatory).
3+ years of experience with ETL processes, data pipelines, and data warehousing concepts.
Expertise in Debezium and Apache Flink for change data capture (CDC).