Description

You will design, build, and maintain scalable data processing pipelines.

Responsibilities

  • Design and develop efficient data processing pipelines using PySpark and Python.
  • Build ETL processes that ingest data from disparate sources.
  • Optimize PySpark applications and troubleshoot performance problems in existing code.
  • Ensure data integrity and quality across the entire data lifecycle.
  • Translate business requirements into technical solutions and participate in architecture discussions.

Required Skills

  • 5+ years of professional experience in data engineering.
  • Expert proficiency in PySpark.
  • Strong programming skills in Python.
  • Experience building and maintaining ETL workflows.
  • Ability to ingest and load data from various sources.
  • Experience optimizing big data processing jobs.

Education

Bachelor's degree in any discipline.