EDW Modernization: Assist in offloading and migrating workloads from legacy systems such as Oracle, Teradata, Amazon Redshift, or Snowflake into the Google Cloud ecosystem. Knowledge of at least one on-premises EDW platform is a must.
Data Migration Execution: Migrate schemas and data from legacy on-premises or cloud databases to BigQuery using BigQuery Migration Service, BigQuery Data Transfer Service (BQ DTS), Storage Transfer Service (STS), Datastream, or other GCP products.
Build/Migrate Data Flows: Build and operationalize data ingestion and processing pipelines using cloud-native products (Dataflow, Dataproc with Spark, Cloud Composer).
Infrastructure Support: Provision and manage data-related cloud infrastructure, including BigQuery datasets, storage buckets, and IAM roles following least-privilege principles.
Quality & Governance: Develop and implement automated data quality checks and validation scripts to ensure the accuracy of migrated data.
Technical Documentation: Create and maintain technical documentation, including data mapping sheets, runbooks, and WBS task updates for internal and external stakeholders.
Troubleshooting: Identify, debug, and resolve issues within ETL/ELT processes and perform SQL performance tuning.
Qualifications:
Bachelor's degree in Computer Science, Engineering, Mathematics, or equivalent practical experience.
4+ years of experience in data engineering, developing and troubleshooting pipelines using Python, SQL, and Spark.
Hands-on experience with at least one major public cloud provider (GCP preferred/certified).
Proficiency in writing, translating, and optimizing complex SQL queries.
Experience with relational and non-relational database technologies.
Good communication skills and the ability to work effectively in a collaborative team environment.
Preferred Qualifications:
BigQuery Expertise [Must have]: Experience with BigQuery architecture, partitioned/clustered tables, and migrating from legacy EDWs (e.g., Oracle, Teradata, Netezza).
Migration Tooling [Must have]: Familiarity with migration tools and services such as BigQuery Migration Service (BQMS), Datastream, and Dataflow.
Orchestration [Must have]: Experience with Apache Airflow (Cloud Composer).
Automation [Preferred]: Familiarity with Infrastructure as Code (IaC) tools like Terraform for provisioning data resources.
CI/CD [Preferred]: Experience using Jenkins, GitLab, or Cloud Build for automating data pipeline deployments.
Certification [Must have]: Google Professional Data Engineer certification.