You build and maintain data pipelines that move data into AWS environments and transform it for downstream use.
Responsibilities
- Develop and maintain ETL/ELT pipelines that extract data from Oracle and other systems and load it into AWS (S3, Redshift, Glue).
- Collaborate with data scientists to prepare and optimize data for SageMaker workloads.
- Implement and manage data ingestion frameworks covering both batch and streaming requirements.
- Automate and schedule data workflows using AWS Glue, Step Functions, or Airflow.
- Design and maintain data models, schemas, and cataloging processes to keep data consistent.
- Optimize data processes for performance and cost efficiency.
Required Skills
- 5+ years of experience in a data engineering role.
- Strong proficiency with SQL and hands-on experience with Oracle databases.
- Proficiency in Python, including pandas, boto3, and pyodbc.
- Hands-on experience with AWS data services: S3, Glue, Redshift, Lambda, and IAM.
- Solid understanding of data modeling, relational databases, and schema design.
- Experience designing and implementing ETL/ELT pipelines and data workflows.
- Familiarity with version control, CI/CD, and automation practices.
- Bachelor's degree in Computer Science, MIS, or a related field.