Description
You will design and implement medium to large enterprise data platforms and migrate data onto them.
Responsibilities
- Design and implement ingestion processes for structured and unstructured data sets.
- Develop data cleansing routines and transformation logic using PySpark.
- Build and manage data pipelines using Azure Data Factory, Azure Functions, and Databricks.
- Tune the performance of Spark jobs and SQL queries on multi-terabyte data sets.
- Translate product requirements into technical specifications for the engineering team.
Required Skills
- 7+ years of hands-on experience in data platform design, configuration, and migration.
- Expertise in SQL and T-SQL, including stored procedures and advanced query techniques.
- Proficiency in Python and PySpark for data ingestion and transformation.
- Experience with Databricks and Spark performance tuning.
- Practical knowledge of Azure data services including Azure Data Factory, Azure Functions, Azure Data Lake Storage, and Azure Synapse.
- Experience working with Avro, JSON, and CSV file formats.
- Hands-on experience with Terraform scripting and DevOps processes.
- Proficiency with Azure DevOps or GitHub Actions for CI/CD pipelines.
- Strong understanding of data modeling, including fact and dimension tables and logical/physical database design.
Preferred Skills
- Microsoft Azure certifications.