Malvern, PA, USA
Key Responsibilities
Architect, build, and maintain event-driven data pipelines using AWS services such as Kinesis, MSK/Kafka, Lambda, Step Functions, SQS/SNS, and Glue/EMR.
Develop ETL/ELT workflows using Python and PySpark, ensuring performance, scalability, and cost efficiency.
Implement and optimize Spark-based data transformations, partitioning strategies, and data processing frameworks.
Design and manage data lake and warehouse structures using S3, Glue Catalog, Athena, and/or Redshift.
Build streaming solutions with checkpointing, stateful transformations, idempotency, and schema evolution.
Maintain high standards of data quality, observability, monitoring, and alerting, using tools such as CloudWatch and Datadog.
Implement data security best practices including IAM, encryption (KMS), networking, and governance.
Create reusable frameworks, internal libraries, and CI/CD pipelines for automated deployments.
Collaborate with data scientists, analysts, and business teams to deliver well-modeled, reliable datasets.
Lead design reviews, mentor junior engineers, and contribute to engineering best practices.
Required Qualifications
Bachelor's degree