Description

JD:

Key Responsibilities:

  • Build and maintain ETL pipelines using Python and PySpark on AWS Glue and other compute platforms
  • Orchestrate workflows with AWS Step Functions and serverless components (Lambda)
  • Implement messaging and event-driven patterns using AWS SNS and SQS
  • Design and optimize data storage and querying in Amazon Redshift
  • Write performant SQL for data transformations, validation, and reporting
  • Ensure data quality and provide monitoring, error handling, and operational support for pipelines
  • Collaborate with data consumers, engineers, and stakeholders to translate requirements into solutions
  • Contribute to CI/CD, infrastructure-as-code, and documentation for reproducible deployments

Required skills and experience:

  • Strong experience with Python and PySpark for large-scale data processing
  • Proven hands-on experience with AWS services: Lambda, SNS, SQS, Glue, Redshift, Step Functions
  • Solid SQL skills and familiarity with data modeling and query optimization
  • Experience with ETL best practices, data quality checks, and monitoring/alerting
  • Familiarity with version control (Git) and basic DevOps and CI/CD workflows

Education

Any Graduate