Lead Data Engineer

SpectraMedix

Gurgaon, Haryana, India

Posted On: 15+ days ago

Experience: 5+ years

Availability: Hybrid

Openings: 2

Category: Lead Data Engineer

Tenure: No Preference/Any

Description

We are seeking a Lead Data Engineer with strong expertise in Python and PySpark to design, build, and migrate scalable data pipelines on Databricks using Apache Spark. The role focuses on implementing and managing Medallion Architecture (Bronze, Silver, Gold) with Delta Lake to deliver reliable, high-performance data solutions. The successful candidate will optimize Databricks jobs and Spark workloads, enhance ETL/ELT pipelines, support batch and streaming data processing, provide technical mentorship, and lead Databricks-focused proof-of-concept initiatives.

Roles & Responsibilities

Migration and Implementation: Lead the migration of legacy Java-based data pipelines to Databricks; design and maintain scalable data pipelines using Databricks and Spark.

Data Engineering and Management: Create, maintain, and update data models; build infrastructure for optimal data extraction, transformation, and loading (ETL).

Performance Analysis and Optimization: Analyze and optimize Databricks jobs and queries; monitor and tune Databricks environments for scalability and reliability; address data-related technical issues; assist teams with data transformation workloads; perform root cause analysis to identify improvement opportunities.

Technical Improvement: Implement best practices for efficient data processing and storage; develop processes that support data transformation, workload management, data structures, and metadata.

Technical Mentorship: Mentor a team of data engineers; serve as the go-to person for Databricks-related queries; conduct proofs of concept (POCs) to demonstrate Databricks capabilities.

Required Experience

Proficiency in Python and PySpark.
Proven experience with Databricks and Spark, including building and optimizing data pipelines.
Strong data modeling experience and advanced SQL knowledge.
Experience with Delta Lake and performance analysis in Databricks.
Knowledge of message queuing, stream processing, and scalable big data stores.
Experience with Talend is preferred.

Non-Technical / Behavioral Competencies Required

Must have worked with US-based clients in an onsite/offshore delivery model.
Strong verbal and written communication, technical articulation, listening, and presentation skills are essential.
Should have proven analytical and problem-solving skills.
Demonstrated expertise in prioritization, time management, and stakeholder management (both internal and external) is necessary.
Should be a quick learner, self-starter, proactive, and an effective team player.
Must have experience working under tight deadlines within a matrix organizational structure.

Key Skills

Python, PySpark, Databricks, Apache Spark, ETL/ELT Pipelines, Delta Lake

Education

Any Graduate
