Pull-quote: “Streaming pipelines that wake people at 3 AM are not real-time. They’re real-painful.”
Why this matters
Real-time pipelines are easy to demo and hard to operate. The pattern that fails: a clever Spark Structured Streaming job that works in dev, struggles in prod under skew, and breaks at the first schema evolution. The pattern that survives: Auto Loader for ingestion, DLT for transformations, expectations for quality, and SLOs that the team monitors like uptime.
The reference architecture
Sources Ingestion Transformation Consumption
─────── ───────── ────────────── ───────────
Cloud storage ──► Auto Loader (cloudFiles) ──► Bronze
Kafka / EH ──► Structured Streaming ──► Bronze
CDC (Debezium) ──► Auto Loader / SS ──► Bronze
│
DLT expectations
(drop / quarantine)
▼
Silver
│
joins / aggregations
▼
Gold ──► BI · ML · Apps
Auto Loader: incremental, schema-evolving, exactly-once
Auto Loader is the foundation. For file-based ingestion at scale, it handles:
- Incremental discovery of new files
- Schema inference with versioned schema files
- Schema evolution with rescued data column for unexpected fields
- Exactly-once semantics via durable file tracking
For event streams, Structured Streaming directly from Kafka, Event Hubs, or Kinesis covers the same role.
DLT: declarative streaming with managed dependencies
DLT lets you describe what the pipeline computes, not how. The runtime handles dependency ordering, retry semantics, schema validation, and metric capture. Expectations express data-quality contracts:
-- Pseudocode
CREATE STREAMING LIVE TABLE silver_orders
CONSTRAINT valid_id EXPECT (order_id IS NOT NULL) ON VIOLATION DROP ROW
CONSTRAINT valid_amt EXPECT (amount > 0) ON VIOLATION DROP ROW
CONSTRAINT plausible EXPECT (amount < 1e7) ON VIOLATION QUARANTINE
AS SELECT ... FROM STREAM(LIVE.bronze_orders);
The metrics on those expectations become part of the pipeline’s observability surface.
SLOs that survive production
| SLO | Target |
|---|---|
| End-to-end latency P95 | < 60 s for “near-real-time” use cases |
| Drop rate | < 0.5% of input records |
| Quarantine rate | < 2% of input records |
| Pipeline uptime | 99.9% monthly |
| Backfill capability | < 24 h for last-7-day reprocessing |
These are the right targets to commit to, not the latency benchmarks vendors quote in marketing.
Closing
Streaming on the Lakehouse is operationally feasible when you adopt Auto Loader, DLT, and expectations as the standard pattern. The team’s job becomes monitoring SLOs and reviewing quarantine, not babysitting jobs.


