Description

You will design, develop, and maintain data pipelines for ingesting, storing, and processing large datasets.

Responsibilities

  • Develop and maintain data pipelines for both batch and real-time processing.
  • Implement data analytics pipelines in collaboration with data science teams.
  • Process and cleanse data, and validate its integrity, to support analysis and machine learning algorithms.
  • Analyze large data stores to uncover patterns and propose technical solutions to business challenges.
  • Document technical and functional specifications and analyze system processing flows.

Required Skills

  • 5–7 years of experience in software development and data engineering.
  • Expertise in Hadoop and Spark architecture and working principles.
  • Hands-on experience with Big Data, Spark, and Hadoop technologies.
  • Proficiency in Python, Scala, or Core Java.
  • Strong SQL skills, including writing complex queries (Hive, PySpark DataFrames) and optimizing joins.
  • Experience with Informatica and Oracle.
  • Solid understanding of Data Warehousing concepts.
  • Proficiency in Unix shell scripting.
  • Experience in system application analysis, design, development, testing, and implementation.
  • Bachelor's degree in Computer Science or an equivalent degree.

Preferred Skills

  • Knowledge of the financial reporting ecosystem.

Education

Bachelor of Computer Science