Description

You will design and implement data processing workflows within an AWS environment.

Responsibilities

  • Build and maintain ETL pipelines using AWS Glue and PySpark.
  • Manage data warehousing tasks within Amazon Redshift.
  • Develop Hive jobs and orchestrate workflows using Oozie.
  • Provision and manage cloud infrastructure through CloudFormation.
  • Optimize data storage and retrieval using DynamoDB and EMR.

Required Skills

  • 5+ years of experience in data engineering.
  • Expert-level proficiency in Python and PySpark.
  • Hands-on experience with AWS Glue ETL.
  • Experience managing Amazon Redshift clusters.
  • Proficiency with Hive and with Oozie workflow orchestration.
  • Experience with AWS CloudFormation for infrastructure as code.
  • Working knowledge of Amazon EMR and DynamoDB.
  • Bachelor's degree or equivalent experience.

Education

Bachelor's degree in any discipline.