Description

You will design and implement data processing frameworks and pipelines on Google Cloud Platform (GCP). You will build high-performance systems for batch and real-time stream processing, covering data ingestion, transformation, and aggregation.

Responsibilities

  • Build and maintain ETL pipelines for data collection, storage, and curation.
  • Develop streaming and batch processing frameworks using GCP services.
  • Schedule complex workflows and orchestrate jobs using Airflow or Cloud Composer.
  • Automate deployments and testing through CI/CD pipelines.
  • Manage data ingestion and processing using industry-standard technology stacks.

Required Skills

  • 10+ years of application development experience.
  • Expertise in GCP services: BigQuery, Dataflow, Cloud Storage, Dataproc, Cloud Composer, Pub/Sub, and Cloud Monitoring.
  • Proficiency in Java and Python.
  • Experience with streaming ETL using Apache Beam and Kafka.
  • Strong SQL background with Teradata, BigQuery, or Bigtable.
  • Hands-on experience with Airflow or Cloud Composer.
  • Proficiency in Bash shell scripting and UNIX command-line utilities.
  • Experience implementing CI/CD automation pipelines.
  • Experience using Jira or similar project management tools.

Preferred Skills

  • Knowledge of Kubernetes, Docker, Spark, or PySpark.
  • Experience with Scrum/Agile methodologies, data mapping, and JSON data manipulation.

Education

Graduate in any discipline.