Data Engineer

Merican Inc

Overland Park, KS, USA

Posted On: 30+ days ago

Experience: 5+ years

Availability: Hybrid

Openings: 1

Category: Data Engineer

Tenure: Contract - Corp-to-Corp

Description

Key Responsibilities

As a Data Engineer, you will:

Data Ingestion & Pipeline Development

Build and enhance ingestion pipelines for large batch and event-driven paths (streaming may evolve over time).
Integrate data from: third-party enrichment vendors (identity + attributes, very large volumes); digital platforms via Conversion API (CAPI) integrations (through intermediary/middleware); and rewards/promotions systems (e.g., TMT) for offer issuance/redemption/consumption data.

Data Quality, Reliability & Operations

Implement strong data validation, idempotency, replay/backfill strategies, and deduplication to prevent quality drift.
Own monitoring, alerting, dashboarding, and operational readiness (wrappers around core pipelines).
Troubleshoot failures with root cause analysis, not just reruns: interpret Spark logs; diagnose performance issues (shuffle, skew, partitioning); improve stability and SLA adherence.

Governance & Compliance (First-class NFR)

Apply privacy, compliance, and governance requirements across pipelines and datasets.
Support governance standards such as: Unity Catalog, lineage, and access controls; managing PII vs. non-PII access; documentation of tables, schemas, catalogs, and cluster usage.

Cost Governance & Performance Optimization

Design pipelines with cost awareness from day one: cluster sizing, workload tuning, and efficient compute/storage usage; trade-off decisions balancing cost vs. quality vs. SLA.

Collaboration & Ownership

Work in a small, fast-moving team; be self-driven and ownership-oriented.
Raise and manage data quality escalations when issues are detected.
Contribute to evolving architecture (product is early-stage; first live month was recent).

Must-Have Skills (Screening Keywords)

Candidates with hands-on, recent experience in:
Strong coding: PySpark + SQL (hands-on, not only orchestration)
Databricks: notebooks/jobs, performance tuning fundamentals, medallion patterns
Spark fundamentals: partitioning, skew/shuffle optimization, understanding failures via logs
Snowflake: data modeling/usage for analytics/warehousing workloads
Azure ecosystem: Azure Data Factory (ADF) for orchestration; exposure to Azure-native integrations and services
Data engineering reliability patterns: validation, idempotency, replay/backfills, dedup, auditability
Data governance: Unity Catalog (preferred), lineage, access control patterns, PII handling
Ownership mindset: can execute independently without constant approvals/check-ins

Nice-to-Have Skills

Event-driven/streaming ingestion exposure (even if primary is batch today)
Delta/Databricks patterns such as Delta Live Tables (DLT) (some workflows exist)
Experience building config-driven export frameworks for multiple downstream consumers/vendors
Exposure/interest in identity resolution concepts (ML optional; ETL strength is priority)
Familiarity with CAPI integrations / marketing tech data signals
Experience implementing operational telemetry: dashboards, alerts, SLA monitoring

What Good Looks Like (Success Criteria)

Ships reliable, well-governed datasets with strong data quality practices
Can scale pipelines for very large volumes (hundreds of millions of records per vendor)
Prevents silent failures where quality degrades without obvious job failures
Balances delivery speed with compliance, governance, and cost controls

Key Skills

PySpark, Azure, Databricks, Snowflake, ADF, Data Governance

Education

Bachelor's degree
