Description
You will design, build, and maintain cloud-based data infrastructure and pipelines to support analytics and business intelligence.
Responsibilities
- Design and implement scalable data architectures across AWS, Azure, or GCP environments.
- Build and manage cloud-based data warehouses such as Snowflake, Redshift, or BigQuery.
- Develop and optimize ETL/ELT pipelines for data ingestion, transformation, and processing.
- Automate workflows using Apache Airflow, AWS Glue, or Azure Data Factory.
- Implement real-time data processing and work with big data technologies such as Spark, Hadoop, or Kafka.
- Ensure data security and compliance through encryption and role-based access control.
- Monitor and optimize cloud data storage and compute costs.
Required Skills
- 5+ years of experience in data engineering.
- Proficiency in SQL and NoSQL databases such as PostgreSQL, MongoDB, or DynamoDB.
- Strong programming skills in Python, Java, or Scala.
- Experience with AWS services including S3, Redshift, Glue, Lambda, EMR, and Kinesis.
- Experience with Azure services including Data Factory, Synapse, Cosmos DB, and Blob Storage.
- Experience with GCP services including BigQuery, Dataflow, Pub/Sub, and Cloud Storage.
- Knowledge of Infrastructure as Code using Terraform or CloudFormation.
- Experience with CI/CD pipelines for deploying data pipelines and infrastructure.
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, IT, or a related field.
Preferred Skills
- Certifications such as AWS Certified Data Analytics - Specialty or Google Cloud Professional Data Engineer.
- Experience with Kubernetes and Docker for containerized data applications.
- Familiarity with MLOps and AI/ML model deployment in cloud environments.