Lead the design and implementation of data engineering and analytics solutions using Python and PySpark.
Responsibilities
- Build and operate large-scale data warehouses and data lakes.
- Design, code, and optimize ETL processes for low-latency data streaming and processing.
- Refactor Python code to PySpark to improve scalability and performance.
- Ensure data quality, track data lineage, and improve data discoverability.
- Coordinate with development teams to define requirements and prioritize feature requests.
Required Skills
- 5+ years of experience in data engineering or backend development.
- Expertise in Python programming and PySpark.
- Experience building scalable data pipelines and optimizing ETL processes.
- Strong background in integrating data storage solutions and reprogramming databases.
- Proven ability to test and debug complex applications.
- Experience working within Agile and DevOps environments.
- Ability to coordinate projects and communicate effectively with senior leadership.
- Experience developing back-end components and integrating server-side logic.