Lead data engineering efforts focused on Hadoop ecosystems and large-scale migrations to cloud environments.
Responsibilities
- Lead on-premises Hadoop to Azure Cloud migration projects.
- Design and implement Data Lake architectures, ETL processes, and data ingestion frameworks.
- Build and operate data pipelines for upstream and downstream systems.
- Manage the end-to-end data engineering lifecycle, including requirements analysis, design documentation, testing, and deployment.
- Integrate multiple databases such as Oracle and SQL Server for large-scale data extraction and processing.
Required Skills
- 10-12 years of experience in Data Engineering with extensive Hadoop expertise.
- Hands-on experience with Databricks, Spark, and PySpark.
- Proficiency in Python scripting for ETL and data pipeline implementation.
- Proven track record with on-premises-to-cloud migrations, specifically Hadoop to Azure.
- Experience with Hadoop query technologies such as Hive, Pig, Impala, Spark SQL, or Presto.
- Strong SQL skills including complex queries and joins.
- Experience with Qlik Replicate, SAP BusinessObjects Data Services (BODS), or webMethods.
- Background in implementing data hubs, enterprise data warehouses (EDW), and data ingestion frameworks.
Preferred Skills