You will design and build the foundational data platform components.
Responsibilities
Design and build data platform components, including event handling, system management tools, and query-optimized storage for large-scale commodities datasets.
Implement and maintain batch and streaming pipelines using Python, SQL, Airflow, and Kafka to ingest and transform data.
Develop and manage cloud-native infrastructure on AWS (S3, SQS, RDS), provisioned with Terraform, ensuring scalability and cost efficiency.
Build and maintain FastAPI-based services and APIs for data access and platform operations.
Optimize queries, workflows, and resource usage to deliver low-latency data access and high platform uptime.
Required Skills
4–8 years of software/data engineering experience, preferably building data platforms.
Strong proficiency in Python with solid software engineering practices.
Hands-on experience with SQL and relational databases or data warehouses (Postgres, Snowflake, or similar).
Practical experience with Airflow and message/streaming systems such as Kafka.
Experience with AWS services (S3, SQS, RDS) and infrastructure-as-code tools like Terraform.
Experience building RESTful services, ideally with FastAPI.
Familiarity with Git-based workflows, CI/CD tooling (e.g., GitHub Actions), and automated testing (pytest).
Understanding of data modelling and query optimization.
Preferred Skills
Experience with columnar/analytic data formats, table formats, and engines (e.g., Parquet, Iceberg, ClickHouse).
Exposure to monitoring/observability stacks (e.g., Prometheus, Grafana, OpenTelemetry).