Hallmark Global Technologies Inc
Big Data Developer
Posted On: Just posted
Experience: 5+ years
Availability: Onsite
Openings: 1
Category: Big Data Developer
Tenure: Contract - Corp-to-Corp
Description

Key Responsibilities

•       Design, develop, and maintain large-scale Spark applications using Scala and PySpark

•       Build and operate streaming-heavy data pipelines using Kafka and Spark Structured Streaming

•       Implement stateful streaming patterns including windowing, watermarking, late data handling, and checkpointing

•       Develop robust event replay and reprocessing workflows using Kafka offsets and partitions

•       Build ingestion and routing flows using Apache NiFi, including Kafka-based ingestion patterns

•       Implement end-to-end ETL/ELT pipelines with strong emphasis on low latency, fault tolerance, and scalability

•       Optimize Spark jobs through partitioning strategies, memory tuning, shuffle optimization, and efficient data formats

•       Integrate Spark workloads with distributed object storage systems such as Apache Ozone and Ceph

•       Ensure data quality, consistency, and auditability through validation, reconciliation, and metadata capture

•       Collaborate with platform, infrastructure, and operations teams on production readiness and capacity planning

•       Support production systems, including monitoring, incident analysis, and root cause resolution

•       Contribute to reusable frameworks, coding standards, and engineering best practices

•       Participate in architecture reviews, code reviews, and technical documentation

Required Qualifications

•       Bachelor’s degree in Computer Science or Engineering, or equivalent practical experience

•       Strong hands-on experience with Apache Spark in production environments

•       Advanced proficiency in Scala and PySpark

•       Solid understanding of distributed systems and data processing at scale

•       Strong experience with Kafka-based streaming architectures

•       Hands-on experience with Spark Structured Streaming

•       Experience building batch and real-time pipelines

•       Hands-on experience with Apache NiFi for data ingestion and flow management

•       Strong SQL skills and experience working with structured and semi-structured data

•       Experience working with object storage or distributed storage platforms

•       Proficiency with Linux, shell scripting, and Git-based version control

Preferred Qualifications

•       Experience with Apache Ozone and/or Ceph as storage backends for analytics workloads

•       Experience implementing exactly-once or at-least-once streaming semantics

•       Strong background in Spark performance tuning (CPU, memory, I/O, shuffle)

•       Experience supporting mission-critical production systems with strict SLAs

•       Familiarity with CI/CD pipelines and automated testing for data applications

•       Experience designing observability for streaming systems (lag, throughput, backpressure)

Education

Bachelor's degree
