Description
You will design and maintain scalable data pipelines and architectures for big data processing.
Responsibilities
- Design and maintain scalable data architectures and pipelines.
- Lead data projects and enforce engineering best practices.
- Optimize data systems for analytics and reporting workflows.
- Ensure data quality and system reliability in production environments.
Required Skills
- 8+ years of IT experience.
- 5+ years of experience with Python, PySpark, and SQL.
- Experience with data lakes built on the Apache Iceberg table format.
- Proficiency in ETL processes using Informatica.
- Hands-on experience with AWS services (S3, Glue, Redshift, Lambda, EMR) and Airflow.
- Experience with PostgreSQL and Bash/shell scripting.
- Background working with healthcare data.
- Experience leading data teams.
- Practical experience with Agile development methodologies.