You will build and maintain large-scale distributed systems and data pipelines.
Responsibilities
- Build multi-language text-processing, scraping, or data pipelines for information retrieval, machine learning, or data analytics.
- Design systems for high maintainability, testability, monitorability, and automation.
- Tune queries and analysis to achieve low latency and high relevance in search.
- Develop services and applications in Java and Spring, and use Python for tooling.
- Manage infrastructure and automation using CI/CD, Ansible, and Packer.
Required Skills
- 5+ years of software engineering experience.
- Expert-level programming experience in Java.
- Strong programming experience in Python.
- Experience with big data search technologies such as Elasticsearch, Lucene, Solr, or CloudSearch.
- Experience with messaging, queueing, or stream processing systems.
- Experience with AWS cloud infrastructure.
- Proficiency with DynamoDB and S3.
- Experience implementing CI/CD pipelines.
- Background in building real-time or batch data pipelines on large datasets.
Preferred Skills
- Experience with web-scale data and large-scale distributed systems on cloud infrastructure.