Senior Data Engineer

DATAECONOMY
Pune, Maharashtra, India

Description

You will design, implement, and optimize scalable data pipelines within distributed computing environments.

Responsibilities

Develop and maintain scalable data processing pipelines using Python and PySpark.
Optimize existing pipelines for performance, scalability, and efficient distributed computing.
Perform data wrangling, cleansing, and analysis on large datasets.
Conduct code reviews, mentor junior developers, and maintain technical documentation.
Troubleshoot, debug, and resolve data processing issues.

Required Skills

5+ years of experience in Python programming.
3+ years of hands-on experience with PySpark and distributed data processing.
Strong understanding of Hadoop, Spark, and Hive.
Proficiency with SQL and relational databases.
Experience with ETL processes and data pipelines.
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.

Preferred Skills

Experience with containerization and orchestration technologies such as Docker and Kubernetes.
Knowledge of DevOps practices and tools, including CI/CD pipelines, Jenkins, and Git.
Experience with Apache Kafka or real-time data streaming.

Key Skills

Python Development, PySpark, Hadoop, Spark, Hive, Docker, Kubernetes, CI/CD Pipelines, Jenkins, Git

Education

Bachelor’s or Master’s degree

Posted On: 2 days ago
Experience: 5+ years
Openings: 1
Category: Data Engineer
Tenure: Full-Time Position