Description
Key Skills: Python, PySpark, Pandas, NLP, BERT, Flair, Flask, MLflow, Redis, Linux
Good to Have Skills: Experience with Polars or Dask for high-performance data processing; PyTorch or TensorFlow for model training; Docker, Kubernetes, or other containerized deployments; monitoring tools such as ITRS Geneos; and FastAPI, Airflow, or Prefect.
Roles & Responsibilities:
- Develop and optimize ETL and data processing jobs using PySpark, Pandas, PyArrow, and related libraries for large-scale data operations.
- Work with Parquet files using fastparquet or pyarrow.parquet for efficient data processing and storage optimization.
- Implement data parsing and serialization using json, ujson, or orjson for high-performance JSON handling in data pipelines.
- Build and maintain NLP pipelines using Flair, BERT, and LLM-based models for advanced text processing and analysis.
- Develop scalable ingestion and data transformation pipelines for AI and analytics use cases across the organization.
- Build and maintain Flask-based APIs for model inference and service integrations with platform services.
- Use regular expressions for text cleaning, parsing, and NLP preprocessing to ensure data quality and consistency.
- Integrate caching and fast lookups using Redis to improve application performance and response times.
- Manage and deploy ML models using MLflow for tracking, versioning, and model lifecycle management.
- Support CI/CD workflows using GitHub, LightSpeed Enterprise, and deployment pipelines for automated software delivery.
- Create and maintain Autosys JILs for job scheduling and automation of batch processing workflows.
- Use basic Linux commands for troubleshooting, operations, and deployment tasks in production environments.
- Monitor application and system health using ITRS Geneos to ensure optimal performance and availability.
- Write unit tests and improve automated test coverage using pytest or unittest.
- Work with REST APIs, microservices, and basic shell scripting for system integration and automation.
- Work with AWS cloud services, including ECS, using boto3 for service integration and management.
Experience Required: 8+ years of hands-on Python programming experience, with strong fundamentals in OOP and design patterns.