Description
Key Skills: SQL, Python, Spark, Airflow, dbt, AWS, GCP, Azure, ETL, Data Pipelines
Good to Have Skills: Scala, Java, Dagster, Luigi, Flink, Hadoop, Redshift, BigQuery, Snowflake, Databricks, dimensional modeling, normalization, star/snowflake schemas, batch and streaming architectures, data warehousing, data lake and lakehouse concepts, file formats (Parquet, Avro, ORC), Git, CI/CD, testing, modularity, documentation, Terraform, Docker, Kubernetes, data quality, observability, monitoring, data governance, lineage, privacy/compliance
Roles & Responsibilities:
- Design, build, and optimize scalable data pipelines and data platforms for enterprise-scale workloads.
- Collaborate with cross-functional teams to deliver reliable, high-quality, and performant data solutions.
- Implement ETL/ELT frameworks using distributed processing technologies to handle large-scale data.
- Develop and maintain automated data workflows using tools such as Airflow, dbt, Dagster, or Luigi.
- Work with cloud-based data services on AWS, GCP, or Azure.
- Maintain data governance, observability, and monitoring standards across all data processes.
- Translate ambiguous business requirements into reliable, efficient data pipeline solutions for stakeholders.
- Apply version control, CI/CD practices, and infrastructure-as-code to maintain code quality and reliable deployments.
Experience Required: 2 years