Description

You will design and maintain data infrastructure using Azure services and Databricks to support batch and real-time processing.

Responsibilities

  • Build and optimize ETL pipelines for ingesting, transforming, and loading data into cloud data warehouses.
  • Implement data quality checks and automation to reduce errors and ensure reliable reporting.
  • Develop Python scripts to interact with REST APIs and streamline data acquisition workflows.
  • Containerize applications using Docker to improve deployment efficiency and scalability.
  • Orchestrate end-to-end data workflows with Apache Airflow, using PySpark for distributed processing.

Required Skills

  • 5+ years of experience in data engineering.
  • Expertise in Databricks, Azure Data Lake Storage, and Azure Data Factory.
  • Strong proficiency in Python for data processing and REST API integration.
  • Hands-on experience with Docker for containerization.
  • Experience with Apache Airflow for data orchestration.
  • Ability to tune the performance of distributed systems that process large datasets.
  • Graduate in any discipline.

Education

Graduate in any discipline