Big Data Developer

Hallmark Global Technologies Inc

O'Fallon, MO, USA

Posted On: 30+ days ago

Experience: 5+ years

Availability: Onsite

Openings: 1

Category: Big Data Developer

Tenure: Contract - Corp-to-Corp

Description

Key Responsibilities

• Design, develop, and maintain large-scale Spark applications using Scala and PySpark

• Build and operate streaming-heavy data pipelines using Kafka and Spark Structured Streaming

• Implement stateful streaming patterns including windowing, watermarking, late data handling, and checkpointing

• Develop robust event replay and reprocessing workflows using Kafka offsets and partitions

• Build ingestion and routing flows using Apache NiFi, including Kafka-based ingestion patterns

• Implement end-to-end ETL/ELT pipelines with a strong emphasis on low latency, fault tolerance, and scalability

• Optimize Spark jobs through partitioning strategies, memory tuning, shuffle optimization, and efficient data formats

• Integrate Spark workloads with distributed object storage systems such as Apache Ozone and Ceph

• Ensure data quality, consistency, and auditability through validation, reconciliation, and metadata capture

• Collaborate with platform, infrastructure, and operations teams on production readiness and capacity planning

• Support production systems, including monitoring, incident analysis, and root cause resolution

• Contribute to reusable frameworks, coding standards, and engineering best practices

• Participate in architecture reviews, code reviews, and technical documentation

Required Qualifications

• Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience

• Strong hands-on experience with Apache Spark in production environments

• Advanced proficiency in Scala and PySpark

• Solid understanding of distributed systems and data processing at scale

• Strong experience with Kafka-based streaming architectures

• Hands-on experience with Spark Structured Streaming

• Experience building batch and real-time pipelines

• Hands-on experience with Apache NiFi for data ingestion and flow management

• Strong SQL skills and experience working with structured and semi-structured data

• Experience working with object storage or distributed storage platforms

• Proficiency with Linux, shell scripting, and Git-based version control

Preferred Qualifications

• Experience with Apache Ozone and/or Ceph as storage backends for analytics workloads

• Experience implementing exactly-once / at-least-once streaming semantics

• Strong background in Spark performance tuning (CPU, memory, I/O, shuffle)

• Experience supporting mission-critical production systems with strict SLAs

• Familiarity with CI/CD pipelines and automated testing for data applications

• Experience designing observability for streaming systems (lag, throughput, backpressure)

Key Skills

Spark, Scala, Python, PySpark, SQL, Kafka, Linux, Git, CI/CD

Education

Bachelor's degree
