You will design, develop, and maintain scalable data pipelines and back-end systems.
Responsibilities
- Design, develop, and maintain data pipelines using Spark, Python, Scala, and Java.
- Write efficient SQL queries for ETL processes.
- Manipulate and analyze large datasets using DataFrames.
- Implement data storage and processing solutions using AWS or GCP.
- Build and maintain real-time data streaming pipelines using Apache Kafka or Amazon MSK.
Required Skills
- 5+ years of experience in back-end development focused on data engineering.
- Strong proficiency in Spark, Python, Scala, and Java.
- Expertise in SQL and working with relational databases.
- Experience with cloud technologies, specifically AWS or GCP.
- Experience with message streaming platforms such as Apache Kafka or Amazon MSK.
- Experience with S3 or similar object storage solutions.
- Hands-on experience with data lake table formats such as Apache Iceberg.
- Solid understanding of data warehousing concepts.