Description
You will design and implement end-to-end data pipelines from ingestion to consumption to support web applications and databases.
Responsibilities
- Design and implement data warehouses and data marts to serve data consumers.
- Carry pipelines from design through implementation, operationalization, and maintenance.
- Build logical data models at both macro and micro levels for data consumers.
- Tune database performance and manage the data lifecycle.
- Assist in supporting and enhancing existing data pipelines and databases.
Required Skills
- 6-10 years of experience on data integration teams.
- 3+ years developing data pipelines in Apache Spark (Databricks preferred).
- 2+ years of active work with Databricks.
- 2+ years working with data warehouse modeling techniques.
- Strong knowledge of PySpark, Python, and SQL, including distributed computing principles.
- Experience designing and implementing ETL/ELT processes using SSIS or similar tools.
- Fluency in complex SQL and database performance tuning.
- Knowledge of cloud platforms (AWS or Azure) and big data technologies (Hadoop, Spark).
- Bachelor's degree required.