You will manage and secure data assets within the Databricks ecosystem while building scalable data processing pipelines.
Responsibilities
- Manage and secure data assets in Databricks using Unity Catalog.
- Develop Python and PySpark scripts for data processing and analysis.
- Configure CI/CD pipelines and automate workflows using GitHub Actions.
- Implement monitoring and observability using Datadog.
- Apply transactional and dimensional data models to complex datasets.
Required Skills
- 8+ years of experience in data engineering.
- Proficiency in Python and PySpark.
- Hands-on experience with Databricks architecture and design principles.
- Experience with Unity Catalog for data governance.
- Expertise in version control using GitHub.
- Experience configuring CI/CD pipelines and DevOps workflows.
- Proficiency with GitHub Actions for automation.
- Knowledge of Datadog for monitoring and observability.
- Familiarity with AI coding assistants such as GitHub Copilot and Databricks Assistant.
Preferred Skills
- Deep understanding of domain-specific datasets and dimensional modeling.