You will design and manage large-scale data systems to support complex data acquisition, modeling, and storage requirements.
Responsibilities
- Define data retention policies and monitor system performance to advise on infrastructure changes.
- Implement ETL/ELT processes and orchestration of data flows.
- Mentor junior engineers and collaborate with other architects on solution delivery.
- Recommend and drive the adoption of new tools and techniques from the big data ecosystem.
- Scale data pipelines using open-source components and AWS services.
Required Skills
- 10+ years of industry experience building and managing big data systems.
- Expertise in Hadoop, Spark, Kafka, Pig, Hive, and Impala.
- Proficiency with Core Java, Spring/IoC, and design patterns.
- Strong experience with SQL and NoSQL databases, including Vertica and Redshift.
- Hands-on experience building stream-processing systems using Storm or Spark Streaming.
- Experience with messaging systems such as JMS, ActiveMQ, RabbitMQ, or Kafka.
- Deep understanding of distributed computing principles and physical data modeling.
- Ability to build, monitor, and optimize cost-efficient pipelines for SaaS.
- Experience with AWS services, provisioning, capacity planning, and performance analysis.
Preferred Skills
- Experience with Cloudera, Hortonworks, Spark, HDFS, and NiFi.
- Knowledge of implementing web-based service-oriented architectures (SOA).
- Familiarity with reporting solutions such as Pentaho, Power BI, or Looker.