Event sourcing (notes)


Event Sourcing turns events into the source of truth.

A bank account is the classic example. Instead of just storing the balance:

// Traditional approach (current state)
{
  "accountId": "123",
  "balance": 1000
}

// Event Sourcing (history of changes)
[
  {"type": "AccountOpened", "amount": 500, "timestamp": "2024-01-01"},
  {"type": "MoneyDeposited", "amount": 700, "timestamp": "2024-02-01"},
  {"type": "MoneyWithdrawn", "amount": 200, "timestamp": "2024-03-01"}
]

The current balance is just the result of replaying these events. This seems like extra work, but it buys you more: a perfect audit trail, time travel (see the state at any point), and business insights.
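To make the replay concrete, here is a minimal Python sketch (function and field names are illustrative, mirroring the example above) that folds the event history into the current balance:

```python
# A minimal sketch of rebuilding state by replaying events.
# Event shapes follow the bank-account example above.

def replay_balance(events):
    """Fold the event history into the current balance."""
    balance = 0
    for event in events:
        if event["type"] in ("AccountOpened", "MoneyDeposited"):
            balance += event["amount"]
        elif event["type"] == "MoneyWithdrawn":
            balance -= event["amount"]
    return balance

events = [
    {"type": "AccountOpened", "amount": 500, "timestamp": "2024-01-01"},
    {"type": "MoneyDeposited", "amount": 700, "timestamp": "2024-02-01"},
    {"type": "MoneyWithdrawn", "amount": 200, "timestamp": "2024-03-01"},
]

print(replay_balance(events))  # 1000
```

Note that replay is just a fold over the history: the same events always produce the same state.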

Note: First, we try to store the change in the database. When that succeeds, we then create the event. An update event will not happen if the database update didn’t go through.

Events are facts. They're immutable. You can't change history. If you make a mistake, you add a correction event. Just like in accounting, you never erase. You add correction entries.
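The accounting analogy can be sketched in code. In this hypothetical example (event names are made up), a mistaken deposit is fixed by appending a compensating event, never by editing the original:

```python
# Hypothetical correction flow: fix a mistaken deposit by appending
# a compensating event instead of editing history.

events = [
    {"type": "MoneyDeposited", "amount": 700},  # oops: should have been 70
]

# Wrong: events[0]["amount"] = 70   <- history is immutable, never do this.
# Right: append a compensating event.
events.append({"type": "DepositCorrected", "adjustment": -630})

# Replaying still yields the correct net amount:
net = sum(e.get("amount", 0) + e.get("adjustment", 0) for e in events)
print(net)  # 70
```

Both the mistake and the correction stay in the log, which is exactly what an auditor wants to see.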

Snapshotting

Storage grows forever since you keep all events (we'll dig into partitioning later).

What if we want to restore state from events? Say you're at event 1000: how do you know what the state was at that point?

This is where snapshots come in. Think of snapshots as save points in a game. Without them, you'd have to replay from the very start every time; with millions of events, that's slow and expensive. To get the state at a given date, you would literally loop through every event up to that point.

When to Create Snapshots:

// Common trigger points:
- Every N events (e.g., every 1000 events)
- Time-based (every hour)
- When the state changes significantly

// Example snapshot metadata
// For event 1000, we take a snapshot:
{
  "snapshotId": "snap_123",
  "eventCount": 1000,
  "timestamp": "2024-03-01T10:00:00Z",
  "lastEventId": "evt_1000",
  "state": {
    "balance": 1000,
    "lastTransaction": "2024-03-01",
    "status": "active"
  }
}

Snapshots are useful for:

  • Fast system recovery after crashes

  • Creating new read models quickly

  • Performance optimization

  • Reducing load during replay

Note: Snapshots aren't events. They're a performance optimization. You could delete all snapshots and rebuild from events (slow, but possible). Different services might need different snapshot frequencies depending on their needs.

Remember: Keep multiple snapshots. If the latest is corrupted, you can fall back to an earlier one and replay fewer events. It's a trade-off between storage space and recovery speed.
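Recovery with a snapshot looks like this in a minimal sketch (field names mirror the snapshot metadata above; the event types are from the bank example): restore the snapshotted state, then replay only the events recorded after it.

```python
# Sketch: recover state from the latest snapshot, then replay only
# the events that came after it, instead of the full history.

def restore(snapshot, events_after_snapshot):
    """Start from the snapshotted state and apply the remaining events."""
    state = dict(snapshot["state"])  # copy so the snapshot stays untouched
    for event in events_after_snapshot:
        if event["type"] == "MoneyDeposited":
            state["balance"] += event["amount"]
        elif event["type"] == "MoneyWithdrawn":
            state["balance"] -= event["amount"]
    return state

snapshot = {"lastEventId": "evt_1000", "state": {"balance": 1000}}
tail = [{"type": "MoneyDeposited", "amount": 50}]

print(restore(snapshot, tail)["balance"])  # 1050
```

Instead of replaying 1001 events, we replay one. That is the whole point of the save-point analogy.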

End of introductory notes

When to use Event Sourcing:

  • Financial systems (need audit trails)

  • Legal/compliance heavy systems

  • Complex business processes

  • When history and decision tracking matter

When not to use Event Sourcing:

  • Simple CRUD applications

  • When only current state matters

  • High-volume data with no audit needs

  • When explaining the system to others is hard

Note: Event Sourcing isn't just storing events. It's modeling your domain as a series of meaningful business events. "UserAddressChanged" is an event. A database UPDATE is not.

Remember: these events are different from CDC (change data capture) events. CDC captures database changes; Event Sourcing captures business decisions. They solve different problems, and sometimes you use both: Event Sourcing for your domain, CDC for syncing data.

Digging deeper: Command vs Event

In Event Sourcing, not everything becomes an event. When a user does something, it starts as a command. Commands can fail. Events can't. Events are facts that have already happened.

Example of a registration flow:

// Command (can fail)
{
  "type": "RegisterUser",
  "username": "alice",
  "email": "alice@email.com"
}

// System checks: Is username taken? Is email valid?
// Only after DB write succeeds:

// Event (recorded fact)
{
  "type": "UserRegistered",
  "username": "alice",
  "email": "alice@email.com",
  "timestamp": "2024-03-01T10:00:00Z"
}

The flow is important:

  1. Command arrives

  2. Validate business rules

  3. Write to database

  4. Only then create event

  5. Consumers can't reject events

Think of it like a bank transfer:

  • Command: "Transfer $100" (might fail - insufficient funds)

  • Event: "Transferred $100" (already happened, can't fail)

Remember: Once an event exists, it's true forever. That's why validation happens at the command level, before the database write. There's no "undo" in Event Sourcing. Only new compensating events.

This matches how businesses work in real life. You can't "undo" a bank transfer. You make a new transfer back. The history of both transactions remains.
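The five-step flow above can be sketched as a command handler. This is a toy illustration, assuming an in-memory "database" and made-up names, but the ordering is the point: validate, persist, and only then record the event.

```python
# Sketch of the command -> validate -> persist -> event flow.
# The in-memory "database" and all names are illustrative.

taken_usernames = {"bob"}   # stand-in for the database
event_log = []              # stand-in for the event store

def handle_register_user(command):
    # 1-2. Command arrives; validate business rules (commands can fail).
    if command["username"] in taken_usernames:
        raise ValueError("username taken")
    # 3. Write to the database.
    taken_usernames.add(command["username"])
    # 4. Only then record the fact as an event (events can't fail).
    event_log.append({"type": "UserRegistered",
                      "username": command["username"]})

handle_register_user({"type": "RegisterUser", "username": "alice"})
print(event_log)  # [{'type': 'UserRegistered', 'username': 'alice'}]
```

A rejected command (a taken username) raises before any write, so no event is ever emitted for it; consumers downstream only ever see facts.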

Digging deeper: Storage growth and partitioning

We can't store all events in one place. It would be too slow and expensive. We need to partition the data.

Common Partitioning Strategies:

// By Aggregate ID (most common)
partition_key = accountId
account_123: [event1, event2, event3]
account_456: [event1, event2]
account_789: [event1]

// By Time (for analytics)
2024_03: [events...]
2024_02: [events...]
2024_01: [events...]

// By Event Type (less common)
deposits: [events...]
withdrawals: [events...]
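A common way to implement aggregate-ID partitioning is to hash the ID into a fixed number of partitions. A minimal sketch (the partition count is an arbitrary example):

```python
# Sketch: route events by aggregate ID so every event for one account
# lands in the same partition. The partition count is illustrative.

from zlib import crc32

NUM_PARTITIONS = 8

def partition_for(account_id: str) -> int:
    # Hash the aggregate ID: the same account always maps to the
    # same partition, so its events stay together and stay ordered.
    return crc32(account_id.encode()) % NUM_PARTITIONS

# account_123's events all go to one partition, deterministically:
p = partition_for("account_123")
print(0 <= p < NUM_PARTITIONS)  # True
```

Deterministic routing is what keeps a single account's history in one place, which matters for the consistency point below.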

Storage is also tiered; hot storage, for example, is more expensive than cold:

Hot Storage (Recent/Active)
- Last 30 days
- Fast access
- In-memory/SSD

Warm Storage (Medium-term)
- Last 90 days
- Regular HDDs
- Slightly slower access

Cold Storage (Historical)
- Older events
- Cheap storage (S3)
- Slower access, but complete
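Tier selection can be as simple as a function of event age. This sketch mirrors the example cutoffs above (30 and 90 days are just the illustrative thresholds):

```python
# Sketch: pick a storage tier from an event's age in days,
# using the example 30/90-day cutoffs from the tiers above.

def tier_for(age_days: int) -> str:
    if age_days <= 30:
        return "hot"    # in-memory/SSD, fast access
    if age_days <= 90:
        return "warm"   # regular HDDs, slightly slower
    return "cold"       # cheap object storage, slowest

print(tier_for(10), tier_for(60), tier_for(400))  # hot warm cold
```

A background job can periodically re-run this check and migrate events whose tier has changed.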

Note: Unlike CDC, we can't use log compaction. Every event matters. But we can move older events to cheaper storage since they're accessed less frequently.

Remember: Partition key choice is critical. Related events must stay together to maintain consistency. For a bank account, all its transactions need to be in the same partition to calculate balance correctly.