Big Data Engineer

You will build and orchestrate data pipelines and distributed computing applications within an AWS ecosystem.

Build and orchestrate data pipelines and ETL processes.
Develop distributed computing applications using PySpark.
Design and implement data models using normalization, denormalization, and schema design.
Write, maintain, and execute automated unit tests following Test-Driven Development (TDD) practices.
Build APIs and manage serverless architectures.

5+ years of experience in big data environments.
Proficiency in Python programming.
Strong expertise in SQL, Presto, Hive, and Spark.
Experience with PySpark and libraries including Pandas, Polars, and NumPy.
Extensive experience with AWS services: EMR, Lambda, Glue ETL, Step Functions, S3, ECS, Kinesis, IAM, RDS PostgreSQL, DynamoDB, CloudWatch Events/EventBridge, Athena, SNS, SQS, and VPC.
Experience with relational and NoSQL databases, including Amazon Redshift.
Knowledge of trading and investment data.
Experience with OneTick or KDB.
Understanding of CI/CD, source control, and data warehousing concepts.

Education: graduate in any discipline.
