Description

You will manage and secure data assets within the Databricks ecosystem while building scalable data processing pipelines.

Responsibilities

  • Manage and secure data assets in Databricks using Unity Catalog.
  • Develop Python and PySpark scripts for data processing and analysis.
  • Configure CI/CD pipelines and automate workflows using GitHub Actions.
  • Implement monitoring and observability using Datadog.
  • Apply transactional and dimensional data models to complex datasets.

Required Skills

  • 8+ years of experience in data engineering.
  • Proficiency in Python and PySpark.
  • Hands-on experience with Databricks architecture and design principles.
  • Experience with Unity Catalog for data governance.
  • Expertise in Git version control using GitHub.
  • Experience configuring CI/CD pipelines and DevOps workflows.
  • Proficiency with GitHub Actions for automation.
  • Knowledge of Datadog for monitoring and observability.
  • Familiarity with AI coding assistants such as GitHub Copilot and Databricks Assistant.

Preferred Skills

  • Deep understanding of specific datasets and dimensional modeling.

Education

Bachelor's degree in any discipline.