Description

You will design and manage large-scale data systems to support complex data acquisition, modeling, and storage requirements.

Responsibilities

  • Define data retention policies and monitor system performance to advise on infrastructure changes.
  • Implement ETL/ELT processes and orchestration of data flows.
  • Mentor junior engineers and collaborate with other architects on solution delivery.
  • Recommend and drive the adoption of new tools and techniques from the big data ecosystem.
  • Scale data pipelines using open-source components and AWS services.

Required Skills

  • 10+ years of industry experience building and managing big data systems.
  • Expertise in Hadoop, Spark, Kafka, Pig, Hive, and Impala.
  • Proficiency with Core Java, Spring/IoC, and design patterns.
  • Strong experience with SQL and NoSQL databases, including Vertica and Redshift.
  • Hands-on experience building stream-processing systems using Storm or Spark Streaming.
  • Experience with messaging systems such as JMS, ActiveMQ, RabbitMQ, or Kafka.
  • Deep understanding of distributed computing principles and physical data modeling.
  • Ability to build, monitor, and optimize cost-efficient pipelines for SaaS.
  • Experience with AWS services, provisioning, capacity planning, and performance analysis.

Preferred Skills

  • Experience with Cloudera, Hortonworks, Spark, HDFS, and NiFi.
  • Knowledge of implementing web-based service-oriented architecture (SOA).
  • Familiarity with reporting solutions such as Pentaho, Power BI, or Looker.

Education

Any Graduate