Description
You will design and implement large-scale data automation frameworks and data platform architectures.
Responsibilities
- Define and design large-scale data automation frameworks.
- Apply data quality engineering practices, including profiling, validation, cleansing, and anomaly detection.
- Lead and mentor engineering teams to foster technical excellence.
- Integrate quality gates into CI/CD pipelines using DevOps practices.
- Articulate complex technical concepts to senior leadership and non-technical stakeholders.
Required Skills
- 15+ years of progressive experience in software engineering.
- 5+ years in a Technical Architect, Lead Data Architect, or Principal Data Engineer role.
- Expert-level proficiency in Apache Spark using PySpark, Scala, or Java.
- Deep understanding of Spark SQL, DataFrames, and Datasets.
- Hands-on experience with distributed storage and table formats such as Hadoop HDFS, S3, Delta Lake, or Apache Iceberg.
- Experience with streaming technologies such as Apache Kafka or Amazon Kinesis.
- Extensive experience with cloud data platforms on AWS, Azure, or GCP.
- Expert-level Python for data engineering and automation.
- Expert-level SQL for complex data analysis and query optimization.
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related quantitative field.
Preferred Skills
- Experience with GCP tools including Dataproc, BigQuery, and Cloud Storage.