Last modified: April 27, 2026
As data volumes grew and real-time analytics became a business requirement, engineers needed architectural patterns that could handle both historical re-computation and low-latency stream processing reliably. Lambda and Kappa architectures are the two dominant answers to that challenge.
Lambda architecture, popularised by Nathan Marz, splits processing into three independent layers that run simultaneously. The key insight is that immutable raw data stored in the batch layer is the single source of truth, while the speed layer only provides low-latency approximations that are eventually superseded by accurate batch results.
Lambda Architecture
Incoming Data Stream
|
v
+--------+--------+
| |
| Batch Layer | <-- immutable raw store (HDFS, S3, Delta Lake)
| | recomputes full views periodically
+--------+--------+
|
v (batch views, hours to days old)
+--------+--------+
| |
| Serving Layer | <-- merges batch view + speed view to answer queries
| |
+--------+--------+
^
| (real-time view, seconds to minutes old)
+--------+--------+
| |
| Speed Layer | <-- stream processor (Kafka + Flink / Spark Streaming)
| | low-latency, eventually replaced by batch view
+-----------------+
The batch layer stores all raw, immutable data and periodically runs large-scale compute jobs to produce batch views: pre-aggregated, correct, complete results.
The speed layer compensates for the high latency of batch jobs by processing new events in real time and producing speed views that cover the gap since the last batch run.
The serving layer merges batch views and speed views at query time, presenting a unified, consistent result to consumers.
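The division of labour between the batch and speed layers can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the event tuples, user names, and amounts are invented, and plain Python dictionaries stand in for distributed storage and compute.

```python
from collections import defaultdict

# Hypothetical events: (user_id, amount) pairs. The master dataset is the
# batch layer's immutable source of truth (standing in for HDFS / S3).
master_data = [("alice", 10), ("bob", 5), ("alice", 7)]

def batch_layer(events):
    """Recompute the complete batch view from ALL raw events."""
    view = defaultdict(int)
    for user, amount in events:
        view[user] += amount
    return dict(view)

def speed_layer(recent_events):
    """Aggregate only the events that arrived after the last batch run."""
    view = defaultdict(int)
    for user, amount in recent_events:
        view[user] += amount
    return dict(view)

batch_view = batch_layer(master_data)                   # {'alice': 17, 'bob': 5}
speed_view = speed_layer([("alice", 3), ("carol", 2)])  # {'alice': 3, 'carol': 2}
```

Note that `batch_layer` always starts from scratch over the full history, while `speed_layer` only folds in the recent tail; this is what makes the batch view authoritative and the speed view disposable.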
Query Answering in Lambda
User Query
|
v
Serving Layer
|
+--------> Batch View (complete, accurate, covers T-0 to T-batch)
| \
+--------> Speed View (recent, approximate, covers T-batch to now)
\
Merge & Return Result
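The merge step above can be sketched as follows, under the assumption that the metric is additive (a count or a sum), so the speed view's recent contributions can simply be added on top of the batch view. The data values are illustrative.

```python
def merge_views(batch_view, speed_view):
    """Serving-layer merge: batch values are authoritative for history up to
    T-batch; speed values cover T-batch to now and are added on top."""
    merged = dict(batch_view)
    for key, value in speed_view.items():
        merged[key] = merged.get(key, 0) + value
    return merged

batch_view = {"alice": 17, "bob": 5}   # complete, accurate up to T-batch
speed_view = {"alice": 3, "carol": 2}  # events since T-batch
result = merge_views(batch_view, speed_view)
# result == {'alice': 20, 'bob': 5, 'carol': 2}
```

Non-additive metrics (distinct counts, percentiles) need more careful merge logic, which is one of the hidden costs of the serving layer.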
| Aspect | Advantage | Challenge |
|---|---|---|
| Accuracy | Batch layer guarantees correctness over full history | Speed layer may be approximate until batch catches up |
| Latency | Speed layer provides low-latency real-time views | Batch layer has high latency (hours to days) |
| Complexity | Clear separation of concerns | Two separate codebases for batch and streaming logic |
| Re-processing | Trivial: recompute batch views from raw data | Speed view state must be carefully managed |
| Fault tolerance | Immutable batch store simplifies recovery | Speed layer state recovery can be complex |
Kappa architecture, proposed by Jay Kreps (co-creator of Kafka), eliminates the batch layer entirely. The central thesis is that a well-designed streaming system can replace batch processing if the stream log is retained long enough to enable historical re-computation by replaying it from the beginning.
Kappa Architecture
Incoming Data Stream
|
v
+--------+--------+
| |
| Stream Log | <-- durable, replayable log (Apache Kafka, Kinesis)
| (retained) | acts as the single source of truth
+--------+--------+
|
v
+--------+--------+
| |
| Stream Processor| <-- single processing layer (Flink, Kafka Streams)
| (current code) | handles both real-time and historical replay
+--------+--------+
|
v
+--------+--------+
| |
| Serving Store | <-- queryable output (Cassandra, Elasticsearch, etc.)
| |
+-----------------+
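The defining property of Kappa is that one processing function serves both live updates and full historical replay. A toy sketch, with an in-memory list standing in for a durable Kafka topic (class and function names are invented for illustration):

```python
class StreamLog:
    """Minimal stand-in for a durable, replayable log such as a Kafka topic:
    an append-only list addressed by offset."""
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def replay(self, from_offset=0):
        yield from self.records[from_offset:]

def build_view(log, from_offset=0):
    """The single processing layer: the same code computes the full view
    (replay from offset 0) or an incremental tail (later offset)."""
    view = {}
    for user, amount in log.replay(from_offset):
        view[user] = view.get(user, 0) + amount
    return view

log = StreamLog()
for event in [("alice", 10), ("bob", 5), ("alice", 7)]:
    log.append(event)

full_view = build_view(log)              # {'alice': 17, 'bob': 5}
tail_view = build_view(log, from_offset=2)  # {'alice': 7}
```

In a real deployment, `build_view` would be a Flink or Kafka Streams job writing into the serving store, and the offsets would be managed by the framework's checkpointing.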
When business logic changes or a bug must be corrected, Kappa re-processes by replaying the retained log through a new job version and cutting over once it has caught up:
Kappa Re-processing Cutover
Log (all history retained)
|
+-----> Old Job (v1) -------> Output Table v1 (serving live traffic)
|
+-----> New Job (v2) -------> Output Table v2 (catching up from offset 0)
|
(when caught up)
|
Atomic swap: v2 becomes primary
|
Old job + v1 table decommissioned
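The cutover can be simulated end to end in a few lines. This is a sketch under invented assumptions: a refund-handling bug in v1 that v2 fixes, and plain dictionaries standing in for the two output tables.

```python
# The log retains full history, so the corrected job can rebuild its own
# output table from offset 0 while v1 keeps serving live traffic.
log = [("alice", 10), ("bob", 5), ("alice", -7)]

def job_v1(events):
    """Hypothetical buggy v1: silently drops refunds (negative amounts)."""
    table = {}
    for user, amount in events:
        if amount > 0:
            table[user] = table.get(user, 0) + amount
    return table

def job_v2(events):
    """Fixed v2: refunds are included in the balance."""
    table = {}
    for user, amount in events:
        table[user] = table.get(user, 0) + amount
    return table

table_v1 = job_v1(log)  # serving live traffic: {'alice': 10, 'bob': 5}
table_v2 = job_v2(log)  # rebuilt from offset 0: {'alice': 3, 'bob': 5}
serving = table_v2      # atomic swap once v2 has caught up; v1 decommissioned
```

The key point is that no special backfill code was written: `job_v2` is the same kind of streaming job as `job_v1`, just pointed at offset 0.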
| Aspect | Advantage | Challenge |
|---|---|---|
| Simplicity | Single codebase for all processing | Streaming code can be harder to write than batch SQL |
| Latency | Consistently low latency for all queries | Log retention for full history can be expensive |
| Re-processing | Replay from any offset in the log | Re-processing large history takes significant time |
| Fault tolerance | Checkpointing and offset management | Stateful stream operators require careful design |
| Tooling | Mature streaming frameworks available | Historical replay at batch scale requires careful tuning |
Decision Guide
Is low latency (seconds or less) required?
|
+-- No --> Classic Batch ETL is sufficient (simpler)
|
Yes
|
v
Is exactly-correct re-computation of history important?
|
+-- Yes + team has capacity for two codebases --> Lambda
|
+-- Yes + team prefers single codebase + log retention feasible --> Kappa
|
+-- Approximate / near-correct is acceptable --> Speed layer only (simplified Lambda)
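The decision tree above can be encoded directly as a function, which makes the branch order explicit. The predicate names are invented to mirror the questions in the guide.

```python
def choose_architecture(low_latency_required: bool,
                        strict_correctness: bool,
                        two_codebases_ok: bool,
                        log_retention_feasible: bool) -> str:
    """Walk the decision guide top to bottom and return a recommendation."""
    if not low_latency_required:
        return "classic batch ETL"
    if strict_correctness:
        if two_codebases_ok:
            return "lambda"
        if log_retention_feasible:
            return "kappa"
    return "speed layer only (simplified lambda)"

# A few spot checks against the tree:
assert choose_architecture(False, True, True, True) == "classic batch ETL"
assert choose_architecture(True, True, True, False) == "lambda"
assert choose_architecture(True, True, False, True) == "kappa"
```

In practice, the inputs are rarely clean booleans; the factor table below captures the same trade-offs as tendencies rather than hard rules.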
| Factor | Favour Lambda | Favour Kappa |
|---|---|---|
| Team size / complexity | Larger teams with specialised roles | Smaller teams preferring unified code |
| Correctness requirements | Strict: batch layer guarantees accuracy | Achievable through idempotent streaming |
| Log retention cost | Not required to retain full history | Full history must be kept in stream log |
| Re-processing frequency | Infrequent, scheduled batch runs OK | Frequent logic changes need fast replay |
| Data volume | Petabyte-scale where streaming is hard | Manageable with stream parallelism |
| Existing infrastructure | Hadoop / Spark ecosystem in place | Kafka-centric organisation |
Newer frameworks blur the line between Lambda and Kappa: