You will design and implement software engineering solutions across platform development, data processing, and report generation.
Responsibilities
- Design and develop data pipelines for ingestion and transformation using Spark.
- Implement data lake ETL processes using AWS Glue and Databricks.
- Participate in architectural refinement, software design, and concept definition.
- Troubleshoot performance bottlenecks and data skew issues in large-scale deployments.
- Provide technical recommendations for optimal configurations in enterprise environments.
Required Skills
- 5+ years of experience with Python, PySpark, and Apache Spark.
- 5+ years of experience with AWS services, including Glue, DynamoDB, Lambda, Redshift, and Elasticsearch.
- 5+ years of experience with API development and data integration, including streaming, batch, error-handling, and replay patterns.
- Strong proficiency in SQL and in both relational and non-relational database systems.
- Experience with data modeling and query optimization.
- Hands-on experience with GitHub, Jenkins, and Terraform.
- Knowledge of large-scale distributed database systems.
- Experience with ETL tools or an RDBMS such as Teradata (Vantage).
- Background in data analytics and healthcare data.