Description
Architect and implement enterprise-scale data solutions, with a focus on big data application deployment and system integration. This role is based in Santa Clara, CA.
Responsibilities
- Design and implement enterprise-wide data initiatives, including data migration, transformation, warehouse builds, and lake implementations.
- Develop and maintain streaming data pipelines using Spark Structured Streaming and Delta Live Tables.
- Optimize data processing through effective indexing, partitioning, and schema design.
- Debug, troubleshoot, and design solutions for complex technical issues within big data environments.
- Brief stakeholders and senior management on the benefits and constraints of proposed technology solutions.
Required Skills
- 8+ years of IT experience focused on enterprise data architecture and management.
- Hands-on experience with Databricks, Spark, Scala, and Java programming.
- Expertise in conceptual, logical, and physical data modeling, including relational and dimensional modeling.
- Proficiency in SQL, including joins, aggregations, window functions, CTEs, and RDBMS schema optimization.
- Deep understanding of ETL/ELT processes and data migration services.
- Experience with Delta Lake concepts, including time travel, schema evolution, and optimization.
- Knowledge of streaming data concepts such as tumbling windows, sliding windows, high watermarks, and handling late data.
- Experience with AWS services, including S3, Glue, Redshift, and Lambda, for data processing and configuration.
- Competency in using GitLab for CI/CD pipelines and CloudWatch for monitoring.
Preferred Skills
- Experience with Great Expectations or similar data quality and validation frameworks.
- Architecture experience within an AWS environment.