Tag: baseball

  • Baseball Invented Event Sourcing 150 Years Ago

    I’ve spent the last few months building event-sourced microservices on the Hazelcast platform — immutable event logs, materialized views, CQRS, the whole stack. If you’ve been following along, you’ve seen the why and the how. Event Sourcing is elegant. It’s powerful. And it was invented in the 1870s.

    Not by a software engineer. By a baseball scorekeeper.

    The Scorecard Is an Event Log

    Here’s the thing about keeping score at a baseball game. You don’t write down “the score is 3-2.” You write down what happened: Acuna singled to left. Olson doubled to right-center, Acuna to third. Riley grounded to short, Acuna scored, Olson to third. Play by play, pitch by pitch, every event appended in order to the record.

    Sound familiar? It should. That’s an event log. Immutable, append-only, temporally ordered. The scorekeeper’s pencil is the world’s oldest event producer.

    And here’s the part that made me sit up straight the first time I really thought about it: the scorekeeper never records the score directly. The score — along with batting averages, on-base percentages, pitcher stat lines, and everything else you see on the broadcast — is derived by replaying those events. You don’t store state. You compute it.
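
    A minimal sketch of that idea in Python, purely for illustration. The `PlayEvent` shape and the `record` and `current_score` helpers are my own assumptions here, not code from the Hazelcast framework or from any real scoring system. The point is only that the log holds what happened, and the score is computed from it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: an event cannot be modified once recorded
class PlayEvent:
    """One entry on the scorecard: what happened, never the resulting score."""
    inning: int
    batter: str
    description: str
    runs_scored: int = 0  # runs that crossed the plate on this play

# Append-only, ordered log. Nothing in here stores "the score".
event_log = []

def record(event):
    """The only write operation: append to the end of the log."""
    event_log.append(event)

def current_score(events):
    """Derive state by replaying every event from the beginning."""
    return sum(e.runs_scored for e in events)

record(PlayEvent(1, "Acuna", "singled to left"))
record(PlayEvent(1, "Olson", "doubled to right-center, Acuna to third"))
record(PlayEvent(1, "Riley", "grounded to short, Acuna scored, Olson to third",
                 runs_scored=1))

print(current_score(event_log))  # 1
```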

    If you read the Hazelcast framework piece, you’ll recognize this immediately. We spent considerable effort setting up event stores and materialized views as separate concerns — events flow in, projections flow out. The scorekeeper has been doing this with a pencil and a printed card since before the lightbulb was invented.

    Materialized Views Are Everywhere at the Ballpark

    Once you see this pattern, you can’t unsee it. Every summary artifact at a baseball game is a materialized view — a read-optimized projection of the same underlying event stream.

    The box score is a materialized view. It takes the full play-by-play event log and projects it into a compact batting line per player: at-bats, runs, hits, RBI. A completely different shape than the source data, optimized for a specific read pattern.

    The line score is a materialized view. Same events, different projection — runs per inning instead of runs per player.

    The pitcher’s stat line is a materialized view. IP, hits, walks, strikeouts, earned runs — all derived from the same at-bat events, filtered by which pitcher was on the mound.

    The standings page is a materialized view built from an even larger event stream — every game’s final result across the entire season.

    This is CQRS in its purest form. The write side is the scorekeeper recording events. The read side is every statistical summary, broadcast graphic, and newspaper box score derived from those events. Multiple consumers, each projecting the same stream into the shape they need.
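
    To make the "same events, different projection" point concrete, here is a rough Python sketch. The dictionary-based event shape and the `line_score` and `box_score` functions are hypothetical names chosen for this example. Both functions read the same list and produce differently shaped views.

```python
from collections import defaultdict

# Hypothetical event shape: one dict per plate appearance.
events = [
    {"inning": 1, "batter": "Acuna", "at_bat": True, "hit": True,  "scored": [],        "rbi": 0},
    {"inning": 1, "batter": "Olson", "at_bat": True, "hit": True,  "scored": [],        "rbi": 0},
    {"inning": 1, "batter": "Riley", "at_bat": True, "hit": False, "scored": ["Acuna"], "rbi": 1},
]

def line_score(events):
    """Projection 1: runs per inning."""
    runs = defaultdict(int)
    for e in events:
        runs[e["inning"]] += len(e["scored"])
    return dict(runs)

def box_score(events):
    """Projection 2: AB / R / H / RBI per player."""
    lines = defaultdict(lambda: {"AB": 0, "R": 0, "H": 0, "RBI": 0})
    for e in events:
        line = lines[e["batter"]]
        line["AB"] += e["at_bat"]   # True counts as 1, False as 0
        line["H"] += e["hit"]
        line["RBI"] += e["rbi"]
        for runner in e["scored"]:  # runs are credited to the runner, not the batter
            lines[runner]["R"] += 1
    return dict(lines)

print(line_score(events))          # {1: 1}
print(box_score(events)["Acuna"])  # {'AB': 1, 'R': 1, 'H': 1, 'RBI': 0}
```

    Neither view is stored alongside the events. Either one can be thrown away and rebuilt from the log at any time.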

    Where It Gets Really Interesting

    The conceptual alignment goes deeper than “it’s a log and you read from it.” The hard problems are the same too.

    Temporal queries. “What was the score after five innings?” In an event-sourced system, you replay events up to a point in time. In baseball, you read the line score up to the fifth column. Same operation. A scorekeeper can reconstruct the exact game state — who was on base, how many outs, what the count was — at any point in the game by replaying events to that moment.
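
    A small Python sketch of that replay-to-a-point operation. The tuple layout and the `score_after` function are assumptions made up for this example.

```python
# Hypothetical events: (inning, half, runs), in the order they were recorded.
events = [
    (1, "top", 0), (1, "bottom", 2),
    (3, "top", 1),
    (5, "bottom", 1),
    (6, "top", 3),
    (8, "bottom", 2),
]

def score_after(events, inning):
    """Replay only the events up to and including the given inning."""
    score = {"top": 0, "bottom": 0}  # visitors bat in the top half
    for ev_inning, half, runs in events:
        if ev_inning > inning:
            break  # the log is ordered, so nothing later can qualify
        score[half] += runs
    return score

print(score_after(events, 5))  # {'top': 1, 'bottom': 3}
print(score_after(events, 9))  # {'top': 4, 'bottom': 5}
```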

    Corrections and compensating events. The official scorer initially rules a play an error, then changes it to a hit. The original event isn’t erased — it’s annotated, and the downstream materialized views (batting averages, fielding percentages) recompute. If you’ve ever dealt with event versioning or compensating events in a production system, this should feel very familiar. The scorer is solving the same problem you are, just with a pencil eraser instead of an event migration framework.

    MLB’s replay challenge system makes this even more explicit. The umpire calls ball three: that’s an event. The catcher challenges: that’s a second event. The replay review overturns the call to strike three: that’s the compensating event. You don’t go back and pretend the original call never happened. You record the whole sequence: the call, the challenge, the outcome. The materialized views (the count, the at-bat result, the pitcher’s stat line) all reflect the final state, but the event log preserves the full history of how you got there. That’s not an analogy for event sourcing. That is event sourcing.
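
    Here is one way the error-changed-to-hit correction might look as a compensating event, sketched in Python. The event types (`play`, `ruling_changed`) and both function names are hypothetical. The original ruling stays in the log, and the projection applies the later change during replay.

```python
log = [
    {"id": 1, "type": "play", "batter": "Riley", "ruling": "error"},
    {"id": 2, "type": "play", "batter": "Olson", "ruling": "hit"},
    # The scorer changes play 1 later. The original entry is left untouched.
    {"id": 3, "type": "ruling_changed", "play_id": 1, "ruling": "hit"},
]

def final_rulings(log):
    """Replay the log; a later ruling change overrides the original ruling."""
    rulings = {}
    for e in log:
        if e["type"] == "play":
            rulings[e["id"]] = (e["batter"], e["ruling"])
        elif e["type"] == "ruling_changed":
            batter, _ = rulings[e["play_id"]]
            rulings[e["play_id"]] = (batter, e["ruling"])
    return rulings

def hits_by_batter(log):
    """A downstream view that recomputes whenever the log grows."""
    hits = {}
    for batter, ruling in final_rulings(log).values():
        hits[batter] = hits.get(batter, 0) + (ruling == "hit")
    return hits

print(hits_by_batter(log[:2]))  # before the change: {'Riley': 0, 'Olson': 1}
print(hits_by_batter(log))      # after the change:  {'Riley': 1, 'Olson': 1}
```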

    Multiple independent consumers. The home team’s scorer, the visiting team’s scorer, the press box statistician, and the broadcast team are all consuming the same stream of game events and maintaining their own projections. They might format them differently, update at slightly different times, even disagree on a ruling until the official scorer weighs in. Independent materialized views with eventual consistency.

    Eventual consistency itself. The press box stats might lag a play behind the action. The official scorer’s ruling on a borderline hit-vs-error might not come until minutes after the play — sometimes not until after the game. During that window, different consumers have different projections of the truth. The system is eventually consistent. Always has been. Nobody panics about it because the mental model is intuitive: the ruling will come, the stats will update, and the final record will be correct.
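
    The lagging press box can be sketched as two consumers reading one log at their own pace. This is an illustrative Python toy, not how any real stats feed works; the `Consumer` class and its `catch_up` method are names invented for the example.

```python
class Consumer:
    """A reader with its own offset into the shared log and its own view."""

    def __init__(self, name):
        self.name = name
        self.offset = 0  # how many events this consumer has applied so far
        self.runs = 0    # this consumer's projection of the score

    def catch_up(self, log, limit=None):
        """Apply unread events, optionally at most `limit` of them."""
        end = len(log) if limit is None else min(len(log), self.offset + limit)
        for runs_scored in log[self.offset:end]:
            self.runs += runs_scored
        self.offset = end

log = [0, 1, 0, 2]  # runs scored on each play, in order

broadcast = Consumer("broadcast")
press_box = Consumer("press box")

broadcast.catch_up(log)           # fully up to date
press_box.catch_up(log, limit=3)  # one play behind
lagging_view = press_box.runs     # 1, while the broadcast already shows 3

press_box.catch_up(log)           # the lag closes
print(broadcast.runs, press_box.runs)  # 3 3
```

    The two views disagree for a while, then converge once the slower consumer has read the same events.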

    Where the Analogy Breaks Down

    Now, I’m not going to pretend this maps perfectly. The places where it doesn’t map are actually instructive — they highlight what makes distributed event sourcing genuinely hard.

    Ordering is free in baseball. One batter at a time. One pitch at a time. Events are naturally, globally ordered with no effort. In a distributed system, you fight for ordering. You need Lamport clocks, vector clocks, consensus protocols, or a centralized sequencer. Baseball gets this for free because the game is inherently single-threaded. Your microservices are not.
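
    For a feel of what "fighting for ordering" means, here is a bare-bones Lamport clock in Python. It is a teaching sketch with names of my own choosing (`LamportClock`, `tick`, `receive`), not a production component.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or message send: advance the counter."""
        self.time += 1
        return self.time

    def receive(self, sent_time):
        """Message receive: jump past the sender's timestamp."""
        self.time = max(self.time, sent_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()

t1 = a.tick()       # node a: local event         -> 1
t2 = a.tick()       # node a: sends a message     -> 2
t3 = b.receive(t2)  # node b: max(0, 2) + 1       -> 3
t4 = b.tick()       # node b: local event         -> 4

print(t1, t2, t3, t4)  # 1 2 3 4
```

    The timestamps respect cause and effect: the receive is always stamped later than the send. Sorting by timestamp, with a node ID as tiebreaker, gives every node the same total order. A scorekeeper never needs any of this.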

    Single source of truth. Baseball has an official scorer — one authoritative human who makes the final call. Distributed systems don’t get a benevolent dictator. They have to negotiate consensus across nodes that might disagree, might be partitioned, might be lying. The official scorer never has a network partition. (Though I’ve seen some calls that made me wonder.)

    Bounded, finite streams. A baseball game ends. The event stream is finite, relatively small, and complete. Production event stores grow without bound, need compaction strategies, deal with schema evolution over years. A scorekeeper’s biggest scaling challenge is a 19-inning game. Yours is a few billion events with a 5-year retention policy.
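
    One common answer to unbounded growth is to snapshot the derived state periodically and replay only the tail of the log. The Python below is a simplified sketch of that one strategy, chosen by me as an example; the `replay` function and the snapshot dictionary layout are assumptions.

```python
def replay(events, state=0):
    """Fold events into state, starting from an optional prior state."""
    for runs in events:
        state += runs
    return state

log = [0, 1, 0, 2, 0, 1]  # runs scored per event

# Take a snapshot after the first four events.
snapshot = {"offset": 4, "state": replay(log[:4])}

# Later: start from the snapshot and replay only what came after it.
current = replay(log[snapshot["offset"]:], snapshot["state"])

print(current)                 # 4
print(current == replay(log))  # True: same answer as a full replay
```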

    These gaps are worth noting because they’re exactly the problems that make event sourcing hard to implement in software. Baseball had the luxury of solving the easy version first — sequential, single-authority, bounded. We got the distributed, multi-authority, unbounded version. Lucky us.

    So I Built an App

    This pattern recognition wasn’t just an intellectual exercise. I’ve been building BaseballScorer, an iOS app for scoring baseball games by hand — the way scorekeepers have always done it, but on a tablet instead of paper.

    The app’s data model is, naturally, event-sourced. Pitches, plays, and substitutions are the events. The score, the scorecard, the line score, the inning summaries — all materialized views, derived by replaying the event stream. It’s the same architecture I’ve been writing about in the Hazelcast series, just running on a single iPad instead of a distributed cluster.

    It’s available on the App Store now. If you’re the kind of person who keeps score at games — or the kind of person who’s curious why anyone would — I think you’ll find it interesting.

    And if you want to see what event sourcing looks like when you don’t have the luxury of a single-threaded, globally ordered event stream, check out the Hazelcast microservices framework. Same patterns, much harder problem.


    This is part of an ongoing series on event-driven microservices at Nodes and Edges. Previous posts: Why Event-Driven Microservices? and The Hazelcast Microservices Framework.

  • If at first you don’t succeed

    In my previous post I mentioned that I wanted to tell a bit of the story of developing The Sorcerer’s Apprentice iPhone application. But The Sorcerer’s App was not my first crack at writing an iPhone app. Before we get to the new app, let’s turn the wayback machine to 2009. The App Store was only about a year old (it’s easy to forget that at the initial release, third-party developers could not write applications for the iPhone). And I had an idea for what I felt would be a great iPhone application.

    The idea was a baseball scoring application. This wasn’t a new idea for me; I had originally conceived it as an application that I thought would do well on the Apple Newton. I had even drawn up some screen mock-ups of the Newton app (I still have them in a file around here somewhere). But the Newton wasn’t a long-lived platform and was gone before I ever got a chance to make any serious attempt at developing an application for it.

    But the idea didn’t die, so when the iPhone opened up to third-party developers, I started thinking about it again, and then working on it. I bought a couple of developer’s guides, and even attended an iOS developers conference in San Jose. Soon pieces of the app were beginning to take shape: a display across the top of the screen for the line score (inning-by-inning runs scored), a lineup on the left, an area for scoring the current play on the right.

    Background image for the play scoring area

    As it turns out, this was an incredibly complex application, and in hindsight it was really too ambitious for a first project, especially for a single developer working part time. Things that were uninteresting but vitally necessary (handling the roster, lineup, substitutions, and so on) were very time-consuming to get right. The interesting part, scoring the plays, really required graphics skills that I didn’t possess if I was to give the app the polished look I wanted.

    I worked on the app pretty steadily for a number of months. At some point while I was doing this, another baseball scoring app showed up in the App Store, but I wasn’t too discouraged, because I looked at it and decided I could do better. Not too long after that, a second scoring app showed up: much more complete, better thought out. Well, I thought, I may have lost the first-mover advantage, but I could catch up. Then the newer, better app was rebranded: it became the ESPN scorecard app. At that point it really seemed like Game Over. If I had been confident that I was going to turn out an app that was everything I envisioned, perhaps I would have continued, but I was daunted by how long I’d worked on this and how much was still left to do. I knew it would be several more months before I could possibly have anything to market, and then it might very well be second-best.

    So my first iPhone development project was shelved. But I’d learned a lot, and I felt I would return to iPhone development when the right project came along. I really thought that would be in a matter of months rather than years, but in the intervening time nothing struck me as something I wanted to do badly enough to invest the hours required. So time marched on, while millions of new apps were developed and shipped. There had to be an idea still out there somewhere, waiting for me to find it.

    That’s where the story will pick up in the next post.