baseball – Nodes and Edges

I’ve spent the last few months building event-sourced microservices on the Hazelcast platform: immutable event logs, materialized views, CQRS, the whole stack. If you’ve been following along, you’ve seen the why and the how. Event sourcing is elegant. It’s powerful. And it was invented in the 1870s.

Not by a software engineer. By a baseball scorekeeper.

The Scorecard Is an Event Log

Here’s the thing about keeping score at a baseball game. You don’t write down “the score is 3-2.” You write down what happened: Acuna singled to left. Olson doubled to right-center, Acuna to third. Riley grounded to short, Acuna scored, Olson to third. Play by play, pitch by pitch, every event appended in order to the record.

Sound familiar? It should. That’s an event log. Immutable, append-only, temporally ordered. The scorekeeper’s pencil is the world’s oldest event producer.

And here’s the part that made me sit up straight the first time I really thought about it: the scorekeeper never records the score directly. The score — along with batting averages, on-base percentages, pitcher stat lines, and everything else you see on the broadcast — is derived by replaying those events. You don’t store state. You compute it.
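To make that concrete, here is a minimal Python sketch of the half-inning above. The `PlayEvent` fields and the `current_score` function are names I made up for illustration (they are not from any scoring app or from the Hazelcast framework), and putting these plays in the bottom of the first is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an event cannot be changed once recorded
class PlayEvent:
    seq: int           # position in the log
    inning: int
    half: str          # "top" (visitors bat) or "bottom" (home team bats)
    batter: str
    description: str
    runs_scored: int = 0


def current_score(log):
    """Derive the score by replaying every event; no score is stored anywhere."""
    score = {"visitors": 0, "home": 0}
    for event in log:
        side = "visitors" if event.half == "top" else "home"
        score[side] += event.runs_scored
    return score


log = []  # append-only by convention: the only write we ever do is append
log.append(PlayEvent(1, 1, "bottom", "Acuna", "singled to left"))
log.append(PlayEvent(2, 1, "bottom", "Olson", "doubled to right-center, Acuna to third"))
log.append(PlayEvent(3, 1, "bottom", "Riley",
                     "grounded to short, Acuna scored, Olson to third", runs_scored=1))

print(current_score(log))
```

The score only exists as the return value of a function over the log. Throw it away and you can always compute it again.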

If you read the Hazelcast framework piece, you’ll recognize this immediately. We spent considerable effort setting up event stores and materialized views as separate concerns — events flow in, projections flow out. The scorekeeper has been doing this with a pencil and a printed card since before the lightbulb was invented.

Materialized Views Are Everywhere at the Ballpark

Once you see this pattern, you can’t unsee it. Every summary artifact at a baseball game is a materialized view — a read-optimized projection of the same underlying event stream.

The box score is a materialized view. It takes the full play-by-play event log and projects it into a compact batting line per player: at-bats, runs, hits, RBI. A completely different shape than the source data, optimized for a specific read pattern.

The line score is a materialized view. Same events, different projection — runs per inning instead of runs per player.

The pitcher’s stat line is a materialized view. IP, hits, walks, strikeouts, earned runs — all derived from the same at-bat events, filtered by which pitcher was on the mound.

The standings page is a materialized view built from an even larger event stream — every game’s final result across the entire season.

This is CQRS in its purest form. The write side is the scorekeeper recording events. The read side is every statistical summary, broadcast graphic, and newspaper box score derived from those events. Multiple consumers, each projecting the same stream into the shape they need.
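Here is a rough sketch of two of those projections reading the same stream. The event shape (a dict per plate appearance with `at_bat`, `hit`, `rbi`, and `scored` keys) is a simplification I am assuming for the example; a real scorecard records far more.

```python
from collections import defaultdict

# Hypothetical, simplified events: one dict per plate appearance.
events = [
    {"inning": 1, "batter": "Acuna", "at_bat": True, "hit": True,  "rbi": 0, "scored": []},
    {"inning": 1, "batter": "Olson", "at_bat": True, "hit": True,  "rbi": 0, "scored": []},
    {"inning": 1, "batter": "Riley", "at_bat": True, "hit": False, "rbi": 1, "scored": ["Acuna"]},
]


def box_score(events):
    """Projection 1: one batting line (AB, R, H, RBI) per player."""
    lines = defaultdict(lambda: {"AB": 0, "R": 0, "H": 0, "RBI": 0})
    for e in events:
        line = lines[e["batter"]]
        line["AB"] += 1 if e["at_bat"] else 0
        line["H"] += 1 if e["hit"] else 0
        line["RBI"] += e["rbi"]
        for runner in e["scored"]:
            lines[runner]["R"] += 1  # the run is credited to the runner, not the batter
    return dict(lines)


def line_score(events):
    """Projection 2: runs per inning, built from the very same events."""
    innings = defaultdict(int)
    for e in events:
        innings[e["inning"]] += len(e["scored"])
    return dict(innings)


print(box_score(events))
print(line_score(events))
```

Neither function knows the other exists, and neither writes anything back to `events`. That separation is the whole point of the read side.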

Where It Gets Really Interesting

The conceptual alignment goes deeper than “it’s a log and you read from it.” The hard problems are the same too.

Temporal queries. “What was the score after five innings?” In an event-sourced system, you replay events up to a point in time. In baseball, you read the line score up to the fifth column. Same operation. A scorekeeper can reconstruct the exact game state — who was on base, how many outs, what the count was — at any point in the game by replaying events to that moment.
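As a sketch, "the score after five innings" is just a replay that stops early. The run totals below are made-up sample data, and `runs_through` is a hypothetical helper name.

```python
# Made-up scoring events, already in game order.
events = [
    {"inning": 1, "side": "home", "runs": 1},
    {"inning": 3, "side": "visitors", "runs": 2},
    {"inning": 5, "side": "home", "runs": 2},
    {"inning": 7, "side": "visitors", "runs": 1},
]


def runs_through(events, inning):
    """Replay the log, stopping once we pass the requested inning."""
    total = {"visitors": 0, "home": 0}
    for e in events:
        if e["inning"] > inning:
            break  # safe only because the log is ordered
        total[e["side"]] += e["runs"]
    return total


print(runs_through(events, 5))  # state as of the end of the fifth
print(runs_through(events, 9))  # state at the end of the game
```

The same replay-and-stop approach would work for richer state such as outs or baserunners, as long as the events carry enough detail to rebuild it.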

Corrections and compensating events. The official scorer initially rules a play an error, then changes it to a hit. The original event isn’t erased — it’s annotated, and the downstream materialized views (batting averages, fielding percentages) recompute. If you’ve ever dealt with event versioning or compensating events in a production system, this should feel very familiar. The scorer is solving the same problem you are, just with a pencil eraser instead of an event migration framework.

MLB’s ball-strike challenge system makes this even more explicit. The umpire calls ball three: that’s an event. The catcher challenges: that’s another event. The review overturns the call to strike three: that’s the compensating event. You don’t go back and pretend the original call never happened. You record the whole sequence: the call, the challenge, the outcome. The materialized views (the count, the at-bat result, the pitcher’s stat line) all reflect the final state, but the event log preserves the full history of how you got there. That’s not an analogy for event sourcing. That is event sourcing.
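One way to sketch a scoring change is as a new event that points back at the play it corrects. The `scoring_change` event type and its `corrects` field are names I am assuming for illustration; real systems model corrections in several different ways.

```python
log = [
    {"seq": 1, "type": "play", "batter": "Olson", "ruling": "error"},
    {"seq": 2, "type": "play", "batter": "Riley", "ruling": "hit"},
]


def hits_by_batter(log):
    """Project hits per batter, honoring the latest ruling on each play."""
    rulings = {}  # play seq -> current ruling
    batters = {}  # play seq -> batter
    for e in log:
        if e["type"] == "play":
            rulings[e["seq"]] = e["ruling"]
            batters[e["seq"]] = e["batter"]
        elif e["type"] == "scoring_change":
            rulings[e["corrects"]] = e["ruling"]  # later ruling wins
    hits = {}
    for seq, ruling in rulings.items():
        hits.setdefault(batters[seq], 0)
        if ruling == "hit":
            hits[batters[seq]] += 1
    return hits


before = hits_by_batter(log)

# The official scorer changes the first play from an error to a hit.
# Nothing is edited in place; the correction is appended.
log.append({"seq": 3, "type": "scoring_change", "corrects": 1, "ruling": "hit"})

after = hits_by_batter(log)
print(before, after)
```

The original ruling is still sitting at `seq` 1, so you can always answer "what did we believe before the change?" by replaying without the last event.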

Multiple independent consumers. The home team’s scorer, the visiting team’s scorer, the press box statistician, and the broadcast team are all consuming the same stream of game events and maintaining their own projections. They might format them differently, update at slightly different times, even disagree on a ruling until the official scorer weighs in. Independent materialized views with eventual consistency.

Eventual consistency itself. The press box stats might lag a play behind the action. The official scorer’s ruling on a borderline hit-vs-error might not come until minutes after the play — sometimes not until after the game. During that window, different consumers have different projections of the truth. The system is eventually consistent. Always has been. Nobody panics about it because the mental model is intuitive: the ruling will come, the stats will update, and the final record will be correct.
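A small sketch of that lag, assuming each consumer simply remembers how far into the shared log it has read. The `Consumer` class and its `poll` method are hypothetical and not modeled on any particular messaging API.

```python
class Consumer:
    """Keeps its own projection and its own read position in the log."""

    def __init__(self, name):
        self.name = name
        self.offset = 0  # number of events already applied
        self.runs = 0    # this consumer's view of total runs

    def poll(self, log, max_events=None):
        """Apply unread events, optionally only some of them (simulated lag)."""
        end = len(log) if max_events is None else min(len(log), self.offset + max_events)
        for event in log[self.offset:end]:
            self.runs += event["runs"]
        self.offset = end


log = [{"runs": 0}, {"runs": 1}, {"runs": 2}]  # made-up events

broadcast = Consumer("broadcast")
press_box = Consumer("press box")

broadcast.poll(log)                # fully caught up
press_box.poll(log, max_events=2)  # one play behind
lagging_view = (broadcast.runs, press_box.runs)

press_box.poll(log)                # catches up later
final_view = (broadcast.runs, press_box.runs)
print(lagging_view, final_view)
```

For a moment the two views disagree. Once both have read the whole log, they converge, because they are applying the same events in the same order.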

Where the Analogy Breaks Down

Now, I’m not going to pretend this maps perfectly. The places where it doesn’t map are actually instructive — they highlight what makes distributed event sourcing genuinely hard.

Ordering is free in baseball. One batter at a time. One pitch at a time. Events are naturally, globally ordered with no effort. In a distributed system, you fight for ordering. You need Lamport clocks, vector clocks, consensus protocols, or a centralized sequencer. Baseball gets this for free because the game is inherently single-threaded. Your microservices are not.
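For a feel of what "fighting for ordering" involves, here is a bare-bones Lamport clock in Python. The two services and their event names are invented for the example, and this is only the textbook rule (increment locally, take the max plus one on receive), not how any particular framework does it.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Call before recording a local event or sending a message."""
        self.time += 1
        return self.time

    def receive(self, remote_time):
        """Call when a message arrives carrying the sender's timestamp."""
        self.time = max(self.time, remote_time) + 1
        return self.time


orders = LamportClock()    # hypothetical service A
payments = LamportClock()  # hypothetical service B

events = []
events.append((orders.tick(), "orders", "order placed"))
sent_at = orders.tick()  # timestamp travels with the message
events.append((sent_at, "orders", "payment requested"))
events.append((payments.tick(), "payments", "unrelated local work"))
events.append((payments.receive(sent_at), "payments", "payment request received"))

# Sort by (timestamp, service name) to get one agreed total order.
ordered = sorted(events, key=lambda e: (e[0], e[1]))
for timestamp, service, what in ordered:
    print(timestamp, service, what)
```

The resulting order respects cause and effect (the request is always before its receipt), but the tie-break between unrelated events is arbitrary. A scorekeeper never has to make that compromise.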

Single source of truth. Baseball has an official scorer — one authoritative human who makes the final call. Distributed systems don’t get a benevolent dictator. They have to negotiate consensus across nodes that might disagree, might be partitioned, might be lying. The official scorer never has a network partition. (Though I’ve seen some calls that made me wonder.)

Bounded, finite streams. A baseball game ends. The event stream is finite, relatively small, and complete. Production event stores grow without bound, need compaction strategies, deal with schema evolution over years. A scorekeeper’s biggest scaling challenge is a 19-inning game. Yours is a few billion events with a 5-year retention policy.
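One common way to cope with an ever-growing log is to snapshot the projected state periodically and replay only the events after the snapshot. The sketch below assumes a trivial running-total projection and made-up events; it illustrates the idea and is not a description of any specific product's compaction.

```python
def apply(state, event):
    """Pure function: old state + one event -> new state."""
    new_state = dict(state)
    new_state[event["side"]] = new_state.get(event["side"], 0) + event["runs"]
    return new_state


def replay(events, state=None):
    """Fold events onto a starting state (empty if none is given)."""
    state = dict(state) if state else {}
    for event in events:
        state = apply(state, event)
    return state


# Ten made-up events, seq 1..10.
log = [
    {"seq": i, "side": "home" if i % 2 else "visitors", "runs": i % 3}
    for i in range(1, 11)
]

# Snapshot taken after event 6: the state plus the position it reflects.
snapshot = {"last_seq": 6, "state": replay(log[:6])}

# Later, rebuild from the snapshot and only the newer events.
tail = [e for e in log if e["seq"] > snapshot["last_seq"]]
from_snapshot = replay(tail, snapshot["state"])
from_scratch = replay(log)
print(from_snapshot == from_scratch)
```

A snapshot speeds up rebuilds without touching the log. Actually discarding old events is a separate decision, and it costs you the ability to run temporal queries over the discarded range.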

These gaps are worth noting because they’re exactly the problems that make event sourcing hard to implement in software. Baseball had the luxury of solving the easy version first — sequential, single-authority, bounded. We got the distributed, multi-authority, unbounded version. Lucky us.

So I Built an App

This pattern recognition wasn’t just an intellectual exercise. I’ve been building BaseballScorer, an iOS app for scoring baseball games by hand — the way scorekeepers have always done it, but on a tablet instead of paper.

The app’s data model is, naturally, event-sourced. Pitches, plays, and substitutions are the events. The score, the scorecard, the line score, the inning summaries — all materialized views, derived by replaying the event stream. It’s the same architecture I’ve been writing about in the Hazelcast series, just running on a single iPad instead of a distributed cluster.

It’s available on the App Store now. If you’re the kind of person who keeps score at games — or the kind of person who’s curious why anyone would — I think you’ll find it interesting.

If this sounds interesting but you aren’t familiar with keeping score, we’ve got you covered: check out this detailed how-to-score site, which covers using the app or paper scorecards if that’s more your style.

And if you want to see what event sourcing looks like when you don’t have the luxury of a single-threaded, globally ordered event stream, check out the Hazelcast microservices framework. Same patterns, much harder problem.

This is part of an ongoing series on event-driven microservices at Nodes and Edges. Previous posts: Why Event-Driven Microservices? and The Hazelcast Microservices Framework.

Code: github.com/myawnhc/hazelcast-microservices-framework — clone it, docker-compose up, and the framework boots locally with sample data.

How Baseball Invented Event Sourcing 150 Years Ago
