Tag: Hazelcast

  • Baseball Invented Event Sourcing 150 Years Ago

    I’ve spent the last few months building event-sourced microservices on the Hazelcast platform — immutable event logs, materialized views, CQRS, the whole stack. If you’ve been following along, you’ve seen the why and the how. Event Sourcing is elegant. It’s powerful. And it was invented in the 1870s.

    Not by a software engineer. By a baseball scorekeeper.

    The Scorecard Is an Event Log

    Here’s the thing about keeping score at a baseball game. You don’t write down “the score is 3-2.” You write down what happened: Acuna singled to left. Olson doubled to right-center, Acuna to third. Riley grounded to short, Acuna scored, Olson to third. Play by play, pitch by pitch, every event appended in order to the record.

    Sound familiar? It should. That’s an event log. Immutable, append-only, temporally ordered. The scorekeeper’s pencil is the world’s oldest event producer.

    And here’s the part that made me sit up straight the first time I really thought about it: the scorekeeper never records the score directly. The score — along with batting averages, on-base percentages, pitcher stat lines, and everything else you see on the broadcast — is derived by replaying those events. You don’t store state. You compute it.
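    To make that concrete, here's a minimal sketch in Java: the score is never stored, only folded out of the log on demand. The event shape and team code are invented for illustration.

    ```java
    import java.util.List;

    // A sketch of "don't store state, compute it": the score is a fold
    // over the event log. GameEvent and its fields are illustrative.
    public class Replay {

        record GameEvent(int sequence, String description, String scoringTeam) {}

        // Replay the log in order; the score is derived, never stored.
        static long runsFor(List<GameEvent> log, String team) {
            return log.stream()
                    .filter(e -> team.equals(e.scoringTeam()))
                    .count();
        }

        public static void main(String[] args) {
            List<GameEvent> log = List.of(
                    new GameEvent(1, "Acuna singled to left", null),
                    new GameEvent(2, "Olson doubled, Acuna to third", null),
                    new GameEvent(3, "Riley grounded to short, Acuna scored", "ATL"));

            System.out.println("ATL runs: " + runsFor(log, "ATL")); // prints: ATL runs: 1
        }
    }
    ```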

    If you read the Hazelcast framework piece, you’ll recognize this immediately. We spent considerable effort setting up event stores and materialized views as separate concerns — events flow in, projections flow out. The scorekeeper has been doing this with a pencil and a printed card since before the lightbulb was invented.

    Materialized Views Are Everywhere at the Ballpark

    Once you see this pattern, you can’t unsee it. Every summary artifact at a baseball game is a materialized view — a read-optimized projection of the same underlying event stream.

    The box score is a materialized view. It takes the full play-by-play event log and projects it into a compact batting line per player: at-bats, runs, hits, RBI. A completely different shape than the source data, optimized for a specific read pattern.

    The line score is a materialized view. Same events, different projection — runs per inning instead of runs per player.

    The pitcher’s stat line is a materialized view. IP, hits, walks, strikeouts, earned runs — all derived from the same at-bat events, filtered by which pitcher was on the mound.

    The standings page is a materialized view built from an even larger event stream — every game’s final result across the entire season.

    This is CQRS in its purest form. The write side is the scorekeeper recording events. The read side is every statistical summary, broadcast graphic, and newspaper box score derived from those events. Multiple consumers, each projecting the same stream into the shape they need.
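    Here's what that looks like as code, sketched with a hypothetical PlayEvent shape: two projections, a box score and a line score, each folded independently from the same event list.

    ```java
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;
    import java.util.stream.Collectors;

    // Two read-side projections folded from one event stream. PlayEvent
    // is a hypothetical shape, not a real scoring schema.
    public class Projections {

        record PlayEvent(int inning, String batter, boolean hit, int runsScored) {}

        public static void main(String[] args) {
            List<PlayEvent> events = List.of(
                    new PlayEvent(1, "Acuna", true, 0),
                    new PlayEvent(1, "Olson", true, 0),
                    new PlayEvent(1, "Riley", false, 1),
                    new PlayEvent(2, "Albies", true, 0));

            // Box-score projection: hits per batter.
            Map<String, Long> boxScore = events.stream()
                    .filter(PlayEvent::hit)
                    .collect(Collectors.groupingBy(
                            PlayEvent::batter, Collectors.counting()));

            // Line-score projection: runs per inning, from the very same events.
            Map<Integer, Integer> lineScore = events.stream()
                    .collect(Collectors.groupingBy(PlayEvent::inning, TreeMap::new,
                            Collectors.summingInt(PlayEvent::runsScored)));

            System.out.println("Box score (hits): " + boxScore);
            System.out.println("Line score (runs): " + lineScore);
        }
    }
    ```

    Neither projection knows the other exists. Each consumer owns its own shape.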

    Where It Gets Really Interesting

    The conceptual alignment goes deeper than “it’s a log and you read from it.” The hard problems are the same too.

    Temporal queries. “What was the score after five innings?” In an event-sourced system, you replay events up to a point in time. In baseball, you read the line score up to the fifth column. Same operation. A scorekeeper can reconstruct the exact game state — who was on base, how many outs, what the count was — at any point in the game by replaying events to that moment.
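    In code, the temporal query is just a replay with a cutoff. A sketch, again with an invented event shape:

    ```java
    import java.util.List;

    // Temporal query as log-prefix replay: reconstruct state "as of" the
    // fifth inning by ignoring everything after the cutoff.
    public class TemporalQuery {

        record PlayEvent(int inning, int runsScored) {}

        static int runsThrough(List<PlayEvent> log, int inningCutoff) {
            return log.stream()
                    .filter(e -> e.inning() <= inningCutoff) // replay only the prefix
                    .mapToInt(PlayEvent::runsScored)
                    .sum();
        }

        public static void main(String[] args) {
            List<PlayEvent> log = List.of(
                    new PlayEvent(1, 1), new PlayEvent(4, 2), new PlayEvent(7, 1));
            System.out.println(runsThrough(log, 5)); // prints 3: the 7th-inning run isn't replayed
        }
    }
    ```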

    Corrections and compensating events. The official scorer initially rules a play an error, then changes it to a hit. The original event isn’t erased — it’s annotated, and the downstream materialized views (batting averages, fielding percentages) recompute. If you’ve ever dealt with event versioning or compensating events in a production system, this should feel very familiar. The scorer is solving the same problem you are, just with a pencil eraser instead of an event migration framework.

    MLB’s ball-strike challenge system makes this even more explicit. The umpire calls ball three — that’s an event. The catcher challenges — that’s a compensating event. The review overturns the call to strike three — that’s the resolution event. You don’t go back and pretend the original call never happened. You record the whole sequence: the call, the challenge, the outcome. The materialized views — the count, the at-bat result, the pitcher’s stat line — all reflect the final state, but the event log preserves the full history of how you got there. That’s not an analogy for event sourcing. That is event sourcing.
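    Here's a sketch of that fold: the original ruling stays in the log, a later correction event for the same play supersedes it, and the derived view reflects the final state. The event fields are made up.

    ```java
    import java.util.List;

    // Compensating events, sketched: the original ruling is never erased.
    // A later correction event supersedes it in the derived view, while
    // the log keeps the whole story.
    public class Corrections {

        // kind is "RULING" or "CORRECTION"; playId ties the two together.
        record ScoringEvent(int playId, String kind, String ruling) {}

        static String finalRuling(List<ScoringEvent> log, int playId) {
            // Fold in order: the latest event for the play wins.
            String ruling = null;
            for (ScoringEvent e : log) {
                if (e.playId() == playId) ruling = e.ruling();
            }
            return ruling;
        }

        public static void main(String[] args) {
            List<ScoringEvent> log = List.of(
                    new ScoringEvent(42, "RULING", "ERROR"),     // original call
                    new ScoringEvent(42, "CORRECTION", "HIT"));  // official scorer's change

            System.out.println(finalRuling(log, 42)); // prints HIT; the ERROR call stays in the log
        }
    }
    ```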

    Multiple independent consumers. The home team’s scorer, the visiting team’s scorer, the press box statistician, and the broadcast team are all consuming the same stream of game events and maintaining their own projections. They might format them differently, update at slightly different times, even disagree on a ruling until the official scorer weighs in. Independent materialized views with eventual consistency.

    Eventual consistency itself. The press box stats might lag a play behind the action. The official scorer’s ruling on a borderline hit-vs-error might not come until minutes after the play — sometimes not until after the game. During that window, different consumers have different projections of the truth. The system is eventually consistent. Always has been. Nobody panics about it because the mental model is intuitive: the ruling will come, the stats will update, and the final record will be correct.

    Where the Analogy Breaks Down

    Now, I’m not going to pretend this maps perfectly. The places where it doesn’t map are actually instructive — they highlight what makes distributed event sourcing genuinely hard.

    Ordering is free in baseball. One batter at a time. One pitch at a time. Events are naturally, globally ordered with no effort. In a distributed system, you fight for ordering. You need Lamport clocks, vector clocks, consensus protocols, or a centralized sequencer. Baseball gets this for free because the game is inherently single-threaded. Your microservices are not.
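    For a taste of what that fight looks like, here's a minimal Lamport clock in Java, the simplest of the mechanisms just listed. A generic sketch, not tied to any particular event store:

    ```java
    import java.util.concurrent.atomic.AtomicLong;

    // Minimal Lamport clock: enough to impose a causal order on events
    // across nodes, and exactly the machinery the scorekeeper never needs.
    public class LamportClock {

        private final AtomicLong time = new AtomicLong();

        // Tick before any local event or send, and stamp the event with the result.
        public long tick() {
            return time.incrementAndGet();
        }

        // On receive, jump past the sender's timestamp to preserve causality.
        public long receive(long remoteTimestamp) {
            return time.updateAndGet(local -> Math.max(local, remoteTimestamp) + 1);
        }
    }
    ```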

    Single source of truth. Baseball has an official scorer — one authoritative human who makes the final call. Distributed systems don’t get a benevolent dictator. They have to negotiate consensus across nodes that might disagree, might be partitioned, might be lying. The official scorer never has a network partition. (Though I’ve seen some calls that made me wonder.)

    Bounded, finite streams. A baseball game ends. The event stream is finite, relatively small, and complete. Production event stores grow without bound, need compaction strategies, deal with schema evolution over years. A scorekeeper’s biggest scaling challenge is a 19-inning game. Yours is a few billion events with a 5-year retention policy.

    These gaps are worth noting because they’re exactly the problems that make event sourcing hard to implement in software. Baseball had the luxury of solving the easy version first — sequential, single-authority, bounded. We got the distributed, multi-authority, unbounded version. Lucky us.

    So I Built an App

    This pattern recognition wasn’t just an intellectual exercise. I’ve been building BaseballScorer, an iOS app for scoring baseball games by hand — the way scorekeepers have always done it, but on a tablet instead of paper.

    The app’s data model is, naturally, event-sourced. Pitches, plays, and substitutions are the events. The score, the scorecard, the line score, the inning summaries — all materialized views, derived by replaying the event stream. It’s the same architecture I’ve been writing about in the Hazelcast series, just running on a single iPad instead of a distributed cluster.

    It’s available on the App Store now. If you’re the kind of person who keeps score at games — or the kind of person who’s curious why anyone would — I think you’ll find it interesting.

    And if you want to see what event sourcing looks like when you don’t have the luxury of a single-threaded, globally-ordered event stream, check out the Hazelcast microservices framework. Same patterns, much harder problem.


    This is part of an ongoing series on event-driven microservices at Nodes and Edges. Previous posts: Why Event-Driven Microservices? and The Hazelcast Microservices Framework.

  • The Hazelcast Microservices Framework

    How a side project connecting Event Sourcing to Hazelcast sat unfinished for years — and why I decided to bring it back with an AI collaborator.


    In my previous post, I shared some of my thinking about Event-Driven Microservices — the coupling problems, the mental shift toward thinking in events, and the patterns (Event Sourcing, CQRS, materialized views) that make it all work. That post was conceptual. This one is personal.

    I’ve been playing around with design concepts in this area for some time. While I was an employee of Hazelcast, I frequently worked with customers and prospects to show how Hazelcast Jet — an event stream processing engine built into the Hazelcast platform — could be used to build event processing solutions that scale while keeping latency low. These conversations were always framed around stream processing, though. Even when the intended use case was microservices, we didn’t explicitly get into the Event Sourcing pattern. Coming from a database-centric background, I found the concept of events as the source of truth a bit much at first.

    The Light Bulb Moment

    It was a light bulb moment when I realized that Hazelcast Jet could fit naturally into an Event Sourcing architecture — and that Hazelcast IMDG (the in-memory data grid, or caching layer) could concurrently maintain materialized views representing the current state of domain objects.

    Think about it: Event Sourcing needs an event log and a processing pipeline. Hazelcast Jet is a processing pipeline. CQRS needs a fast read-side store that’s kept in sync with the event stream. Hazelcast IMDG is a fast read-side store. Event Sourcing + CQRS maps beautifully onto Jet + IMDG (even though that acronym is officially retired — it’s all just “Hazelcast” now).
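    The read side of that mapping is almost embarrassingly simple, which is the point. A sketch against the standard Hazelcast 5.x API, with a placeholder map name and value type:

    ```java
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    // The CQRS read side, sketched: queries hit the IMap holding the
    // materialized view and never touch the event log. The map name and
    // value type are placeholders.
    public class ReadSide {

        static String currentState(HazelcastInstance hz, String aggregateId) {
            IMap<String, String> views = hz.getMap("MaterializedView");
            return views.get(aggregateId); // one key lookup, no event replay
        }
    }
    ```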

    And from there, I really wanted to demonstrate this. That’s how the original Microservices Framework project began.

    Version 1: The Proof of Concept

    The first version was focused on proving the core idea worked. Could I wire up a Hazelcast Jet pipeline to process domain events, persist them to an event store, and update materialized views — all in a way that was generic enough to work across different services?

    The answer was yes. The central pattern that emerged was straightforward: a service’s handleEvent() method writes incoming events to a PendingEvents map, which triggers a Jet pipeline that persists events to the EventStore, updates materialized views, and publishes to an event bus for other services to consume. It worked, and it was fast.
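    To make that shape concrete, here's a minimal sketch of the pipeline against the current Hazelcast 5.x Jet API. The map names, the payload type, and the merge logic are placeholders rather than the framework's actual code, and it assumes the PendingEvents map has its event journal enabled in the cluster config.

    ```java
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.jet.pipeline.JournalInitialPosition;
    import com.hazelcast.jet.pipeline.Pipeline;
    import com.hazelcast.jet.pipeline.Sinks;
    import com.hazelcast.jet.pipeline.Sources;
    import com.hazelcast.jet.pipeline.StreamStage;

    import java.util.Map;

    // Placeholder sketch of the pattern: journal in, event store and
    // materialized view out.
    public class EventPipeline {

        public static void submit(HazelcastInstance hz) {
            Pipeline p = Pipeline.create();

            // Source: every entry a service's handleEvent() writes to PendingEvents.
            StreamStage<Map.Entry<Long, String>> events = p
                    .readFrom(Sources.<Long, String>mapJournal(
                            "PendingEvents", JournalInitialPosition.START_FROM_OLDEST))
                    .withoutTimestamps();

            // Sink 1: append to the event store, an IMap keyed by sequence number.
            events.writeTo(Sinks.map("EventStore"));

            // Sink 2: fold each event into a read-optimized materialized view.
            // The merge logic here is a stand-in for real projection code.
            events.writeTo(Sinks.mapWithUpdating("MaterializedView",
                    Map.Entry::getKey,
                    (String oldView, Map.Entry<Long, String> e) -> e.getValue()));

            // (The publish-to-event-bus step is elided; it would be a third sink.)
            hz.getJet().newJob(p);
        }
    }
    ```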

    Now, the central components of the architecture — the domain object, event class, controller, and pipeline — have survived relatively intact through multiple iterations of the implementation. The bones were good. But a lot of the specific implementation choices I made around those bones haven’t aged all that well.

    You know how it goes with side projects. Technical debt accumulates quietly, one “I’ll fix this later” at a time, until you’re looking at a codebase where you know you’d make different choices if you were starting over — but the sunk cost of time already invested keeps you from actually doing it. It’s the software equivalent of a kitchen renovation where you keep patching the old cabinets because ripping them out feels like too big a project for a weekend.

    That version of the framework is still hanging around on GitHub, although I decided not to link to it here as I may take it down at any time. (Upcoming posts will link to the improved version, so embedding links to the original will inevitably lead to someone grabbing the wrong one.)

    I got it to a working state, but there was a long list of things I wanted to add. Saga patterns for coordinating multi-service transactions. Observability dashboards. Comprehensive tests. Documentation that went beyond “read the code.” Each of these was a meaningful chunk of work, and progress slowed to a crawl.

    The Stall

    Let’s be honest about what happened: the project stalled. Not dramatically — it was never really abandoned. It just… stopped moving. Every few months, when I had some extra time, I’d open the codebase and make a few minor, inconsequential changes while thinking about the more ambitious refactorings and features I’d get to when time permitted.

    If you’ve ever maintained a passion project alongside a day job, you know this feeling. The ideas don’t go away — they sit in the back of your mind, periodically surfacing with a pang of “I should really get back to that.” But the activation energy to restart is high, especially when the next step isn’t a fun new feature but the grind of scaffolding, configuration, and test coverage. So you close the laptop and tell yourself next month will be different. (It won’t be.)

    Enter AI-Assisted Development

    In early 2025, I started using Claude for various coding tasks and was genuinely surprised by the results. This wasn’t autocomplete on steroids — I could describe an architectural pattern and get back code that understood the why, not just the what. I could say “this needs to work like an event journal with replay capability” and get something that actually accounted for ordering guarantees and idempotency.

    That’s when the thought crystallized: what if I could use this to break through the stall?

    Here’s the thing — the stuff that had been blocking me wasn’t the hard design work. I knew what the architecture should look like. The bottleneck was the sheer volume of implementation grind: scaffolding new services, writing comprehensive tests, wiring up Docker configurations, producing documentation. Exactly the kind of work where you need focused hours, and a side project never has enough of those.

    Now, I want to be clear about what I mean here, because “AI wrote my code” carries a lot of baggage. This wasn’t about handing off the project and checking back in when it was done. It was about having a collaborator who could take high-level design direction and turn it into working code at a pace that made the project viable again. I’d provide the domain expertise, the architectural decisions, and the quality bar. The AI would provide the throughput.

    Making the Decision

    I decided to move forward with a clean reimplementation rather than trying to evolve the existing codebase. The core patterns from the original work — the Jet pipeline architecture, the event store design, the materialized view update strategy — were proven and would carry forward. But the project structure, package naming, dependency versions, and framework abstractions would start fresh. Sometimes the best way to fix a kitchen is to actually rip out the cabinets.

    The plan was to use Claude’s desktop interface for iterative design discussions (requirements, architecture, implementation planning) and then hand off to Claude Code for the actual coding. Design first, then build — with comprehensive documentation at every step so the AI would have rich context to work from.

    What happened next — the design phase, the handoff to Claude Code, and the surprises along the way — is the subject of the next post.