You will work as an individual contributor on a data engineering team to develop and deploy automated data products.
Responsibilities
Design, develop, and review real-time and bulk data pipelines from sources including APIs, streaming data, data warehouses, messages, images, and video.
Apply DataOps best practices including version control, PR-based development, CI/CD, deployment automation, and test automation.
Develop documentation for data lineage and data dictionaries to support the enterprise data model.
Collaborate within cross-functional, distributed Agile teams to develop innovative analytic solutions.
Write production-quality Python and SQL code following established design patterns for data ingest, transformation, and egress.
Required Skills
Proven experience developing production-level Python and designing high-quality codebases.
Strong expertise in SQL development and designing production SQL codebases.
Experience with data modeling and software engineering best practices.
Knowledge of ML systems architecture and data science workflows, including data wrangling, model training, and deployment at scale.
Proficiency in applying CI/CD, schema change control, and monitoring within data engineering projects.
Experience working in Agile environments using Scrum or Kanban.
Bachelor’s degree in engineering, computer science, or an analytical field (such as statistics or mathematics) with 3 years of experience, or a Master’s degree or Ph.D. with 1 year of experience.
Familiarity with Azure DevOps and Azure environments.
Preferred Skills
Working knowledge of Azure Stream Architectures, dbt, and Azure Machine Learning Environment.
Experience with GIS data, schema change tools, and data dictionary tools.
Understanding of Object-Oriented Programming principles and distributed systems.