Description
You will design and build scalable ETL solutions using big data technologies.
Responsibilities
- Design and implement scalable Hadoop architecture solutions.
- Develop Spark-based solutions for near real-time data ingestion, analytics, and reporting.
- Optimize Hadoop clusters for performance and resource utilization.
- Maintain and monitor Hadoop infrastructure, ensuring high availability.
- Troubleshoot and resolve issues within the Hadoop ecosystem.
Required Skills
- 5+ years of experience in big data environments.
- Expertise with Hadoop ecosystem components (HDFS, YARN, MapReduce, Hive, HBase, Spark).
- Strong hands-on knowledge of Python and PySpark.
- Proficiency in SQL and Unix environments.
- Experience with Data Warehousing and Data Modeling.
- Familiarity with cloud platforms (Google Cloud, Azure, AWS).
- Architectural knowledge of Hadoop implementations.