About the Role
Design and build the data infrastructure, frameworks, and tooling that power the CDP platform. You build the systems that DataOps runs — connector frameworks, schema mapping engines, transformation pipelines, data quality tooling. You think in terms of “how do I solve this for all clients,” not for one client at a time.
What You Will Do
- Design and build a pluggable connector framework for client data sources (databases, APIs, file feeds, event streams); see the first sketch after this list
- Build automated schema mapping and detection tools to accelerate client onboarding
- Architect the transformation layer — cleaning, deduplication, normalization, enrichment as composable modules (second sketch below)
- Build a data quality framework — profiling, validation gates, anomaly detection, lineage tracking
- Design efficient data models for Snowflake and BigQuery — partitioning, clustering, materialization, cost-aware query patterns
- Build schema evolution handling — graceful adaptation when client source schemas change (third sketch below)
- Design the metadata layer — schema definitions, mapping rules, transformation logic per client
- Optimize stored procedures and transformation jobs for analytical workloads
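
To make the framework expectation concrete, here is a minimal sketch of what a pluggable connector framework can look like in Python. Every name in it (Connector, CONNECTOR_REGISTRY, register_connector, the csv_feed kind) is hypothetical, not an existing internal API; the point is the shape: one small contract plus a registry, so a new client source ships as a plugin rather than a one-off script.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator

# Registry of connector classes keyed by source kind; all names illustrative.
CONNECTOR_REGISTRY: Dict[str, type] = {}


def register_connector(kind: str):
    """Class decorator that makes a connector discoverable by kind."""
    def wrap(cls: type) -> type:
        CONNECTOR_REGISTRY[kind] = cls
        return cls
    return wrap


class Connector(ABC):
    """Contract every client data source implements."""

    def __init__(self, config: Dict[str, Any]):
        self.config = config

    @abstractmethod
    def extract(self) -> Iterator[Dict[str, Any]]:
        """Yield raw records, one dict per source row or event."""


@register_connector("csv_feed")
class CsvFeedConnector(Connector):
    """Example plugin: a flat-file feed."""

    def extract(self) -> Iterator[Dict[str, Any]]:
        import csv
        with open(self.config["path"], newline="") as f:
            yield from csv.DictReader(f)


def build_connector(kind: str, config: Dict[str, Any]) -> Connector:
    """Look up and instantiate a connector by kind."""
    return CONNECTOR_REGISTRY[kind](config)
```

Callers only ever write `build_connector("csv_feed", {"path": "feed.csv"})`; supporting a new source means registering one new class, with no changes to the callers.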
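The transformation-layer bullet imagines cleaning, deduplication, normalization, and enrichment as composable modules. A minimal sketch, assuming records are plain dicts and batches are lists; the step names and sample field are made up for illustration:

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Step = Callable[[List[Record]], List[Record]]


def normalize_emails(batch: List[Record]) -> List[Record]:
    # Lowercase and strip whitespace so downstream matching is consistent.
    return [{**r, "email": r.get("email", "").strip().lower()} for r in batch]


def dedupe_by_email(batch: List[Record]) -> List[Record]:
    # Keep the first record seen per email; real logic would pick by recency.
    seen: set = set()
    out: List[Record] = []
    for r in batch:
        if r["email"] not in seen:
            seen.add(r["email"])
            out.append(r)
    return out


def pipeline(*steps: Step) -> Step:
    # Compose independent steps into one callable, applied left to right.
    def run(batch: List[Record]) -> List[Record]:
        for step in steps:
            batch = step(batch)
        return batch
    return run


clean = pipeline(normalize_emails, dedupe_by_email)
print(clean([{"email": " A@X.COM "}, {"email": "a@x.com"}]))  # one record survives
```

Because each step is a pure function over a batch, steps can be unit-tested in isolation, reordered, and reused across clients' pipelines.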
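Schema evolution handling means the pipeline adapts when a client adds, drops, or renames columns instead of failing the run. One possible pattern, sketched under the assumption of dict records and a declared target schema; TARGET_SCHEMA, conform, and the _unmapped overflow key are hypothetical names:

```python
from typing import Any, Dict

# Known target schema with safe defaults; illustrative, not a real client's.
TARGET_SCHEMA: Dict[str, Any] = {"email": "", "country": "unknown", "age": None}


def conform(record: Dict[str, Any]) -> Dict[str, Any]:
    """Adapt an incoming record to the target schema without failing the job.

    Newly added source columns are preserved under an overflow key for later
    mapping review; dropped columns fall back to schema defaults.
    """
    known = {k: record.get(k, default) for k, default in TARGET_SCHEMA.items()}
    extras = {k: v for k, v in record.items() if k not in TARGET_SCHEMA}
    if extras:
        known["_unmapped"] = extras  # surface schema drift instead of dropping data
    return known


# A client silently adds "phone" and drops "age": the pipeline keeps running.
print(conform({"email": "a@x.com", "country": "DE", "phone": "+49"}))
```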
Must-Have
- 4+ years building data systems (not just running them)
- Strong SQL — complex queries, window functions, performance and cost optimization
- Experience building data pipelines with Python, Spark, or similar
- Snowflake or BigQuery at the architecture level — data modeling, performance tuning, cost optimization
- Kafka or similar for real-time ingestion
- Experience building reusable frameworks/tooling, not one-off scripts
- Data modeling — star schemas, slowly changing dimensions (SCD), event sourcing, EAV patterns
- Workflow orchestration — Airflow, dbt, or similar