What is a Circuit Breaker?

Introduction
Circuit Breakers protect services from each other. When one service calls another, things can go wrong. The other service might be slow. It might be down. It might be struggling. We need a way to handle this gracefully.
Think of calling another service. Maybe it's an API for weather data. Or a payment service. Or a database. We make a request and wait for an answer. But what if that answer never comes? What if the service is stuck? We can't wait forever.
That's where Circuit Breakers come in. They watch for problems. At first, they let all requests through. This is the "closed" state, like a closed circuit letting electricity flow. Everything works normally here.
States
But problems start happening. Maybe the service is responding with errors. Maybe it's not responding at all. The Circuit Breaker counts these failures. Three failures? Five failures? We decide what's too many. After too many failures, something's clearly wrong.
Now the Circuit Breaker "opens". Like flipping a switch to protect your house from an electrical surge. When open, it rejects all requests immediately. It doesn't even try to call the failing service. This is called "fail fast". Better to fail quickly than leave our users waiting. We can fall back to a different strategy, like a cached answer or a default response.
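But we can't stay open forever. Maybe the service fixed itself? The Circuit Breaker waits a bit (say 30 seconds), then lets a trial request through. This is the "half-open" state. It's like testing the water with your toe before jumping in. If the trial request succeeds, the circuit closes and traffic flows normally again. If it fails, the circuit opens again and the wait starts over.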
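The three states above can be sketched as a small state machine. This is a minimal, single-threaded illustration, not a production implementation. The class name `CircuitBreaker`, the `CircuitOpenError` exception, and the parameter names are all made up for this example. The defaults (3 failures, 30 seconds) are just the example numbers from the text.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it (fail fast)."""


class CircuitBreaker:
    # All names and defaults here are illustrative assumptions.
    def __init__(self, failure_threshold=3, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock          # injectable so the wait can be tested
        self.state = "closed"       # closed -> open -> half-open -> closed
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit is open, failing fast")
            self.state = "half-open"  # wait is over: let one trial call through

        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        # Any success (including a half-open trial) closes the circuit.
        self.failures = 0
        self.state = "closed"

    def _on_failure(self):
        self.failures += 1
        # A failed trial call, or too many failures, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()
```

A caller wraps each outgoing request in `breaker.call(...)` and catches `CircuitOpenError` to run its fallback. A real breaker would also need locking if several threads share it.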
Nuance to be aware of
Here's where it gets interesting. Sometimes services don't fail cleanly. They just... hang. Never responding. This is actually worse than a quick error! Our resources are stuck waiting. So we add timeouts. If a service takes too long (maybe 5 seconds), we count it as a failure. For really bad timeouts, we might even open the circuit immediately. If a service is so slow it's timing out, something is seriously wrong.
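Here is one way to turn a hang into a countable failure, sketched with Python's standard `concurrent.futures`. The helper `call_with_timeout`, the `CallTimedOut` exception, and the `FailureCounter` class are hypothetical names for this example. One caveat of this approach: a thread that is already running can't be interrupted, so the slow call keeps going in the background. We just stop waiting for it.

```python
import concurrent.futures
import time

# Small shared pool for outgoing calls (size is an arbitrary assumption).
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


class CallTimedOut(Exception):
    """The call did not answer within the allowed time."""


def call_with_timeout(func, timeout_seconds):
    future = _pool.submit(func)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        future.cancel()  # best effort: only helps if the call has not started yet
        raise CallTimedOut(f"no answer after {timeout_seconds}s") from None


class FailureCounter:
    """Counts errors and timeouts the same way: both are failures."""

    def __init__(self):
        self.failures = 0

    def guarded_call(self, func, timeout_seconds):
        try:
            return call_with_timeout(func, timeout_seconds)
        except Exception:
            self.failures += 1
            raise
```

In a full Circuit Breaker, the failure count here would feed the decision to open the circuit.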
Different operations need different rules. Reading data might be quick, so we time out after 2 seconds. Writing data might be slower, so maybe we wait 5 seconds. Bulk operations might need even longer. The Circuit Breaker should be configured for each case.
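Per-operation rules can be as simple as a lookup table. A small sketch follows. The names `OperationPolicy`, `POLICIES`, and `policy_for` are invented for this example. The 2 and 5 second values come from the text, while the bulk value and the default are assumed placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationPolicy:
    timeout_seconds: float


POLICIES = {
    "read": OperationPolicy(timeout_seconds=2.0),
    "write": OperationPolicy(timeout_seconds=5.0),
    "bulk": OperationPolicy(timeout_seconds=30.0),  # assumed: "even longer"
}

# Assumed fallback for operations we have not classified.
DEFAULT_POLICY = OperationPolicy(timeout_seconds=5.0)


def policy_for(operation):
    """Return the timeout policy for an operation type."""
    return POLICIES.get(operation, DEFAULT_POLICY)
```

The caller looks up `policy_for("read").timeout_seconds` before making the request, so each kind of call gets its own patience.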
This isn't just for web browsers calling APIs. APIs call other APIs. Services call databases. Everyone calls everyone. Circuit Breakers protect this whole chain. When one service starts failing, Circuit Breakers prevent a cascade of failures through the whole system.
This is why big systems stay reliable. One piece can fail without bringing everything down. Circuit Breakers give failing services time to recover. They give quick answers to users, even if those answers are "sorry, try later". And they protect resources from being wasted on calls that will probably fail anyway.
Remember: Systems will fail. Good systems fail gracefully. Circuit Breakers help make that happen.
No single rule
Depending on the system you're building, you might want different thresholds for different types of operations.
For reading account balances: maybe allow a 50% error rate over 100 requests
For processing trades: maybe allow only a 5% error rate over 20 requests
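The two thresholds above could be expressed with a sliding window over recent requests. This is a sketch under a few assumptions. The `ErrorRateWindow` class is hypothetical. "Over N requests" is read as "the last N requests". The breaker only trips once the window is full and the error rate goes above the allowed value.

```python
from collections import deque


class ErrorRateWindow:
    def __init__(self, window_size, max_error_rate):
        self.window_size = window_size
        self.max_error_rate = max_error_rate
        # Keeps only the most recent outcomes; True means the request failed.
        self.outcomes = deque(maxlen=window_size)

    def record(self, failed):
        self.outcomes.append(failed)

    def should_open(self):
        if len(self.outcomes) < self.window_size:
            return False  # assumption: don't judge until the window is full
        error_rate = sum(self.outcomes) / len(self.outcomes)
        return error_rate > self.max_error_rate


# Tolerant for reads, strict for trades (numbers from the text above).
balance_reads = ErrorRateWindow(window_size=100, max_error_rate=0.50)
trades = ErrorRateWindow(window_size=20, max_error_rate=0.05)
```

With these numbers, one failed trade in the last 20 is tolerated, but a second one trips the breaker. Reads have to fail more than half the time before anything happens.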
Stateful across system
Circuit breakers should be stateful across your entire system. If you have multiple servers calling the same downstream service, they should share circuit breaker state. Otherwise, each server learns about failures independently. Every server has to hit the failing service enough times to trip its own breaker, so the struggling service keeps getting calls it can't handle.
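To make the idea concrete, here is a sketch where the breaker keeps its counters in a shared store instead of in local memory. Everything here is illustrative. `InMemorySharedStore` is a stand-in for whatever external store your servers can all reach, and `SharedCircuitBreaker` with its key names is made up for this example. It uses wall-clock time because monotonic clocks can't be compared across machines, which means clock skew between servers is something a real design would have to consider.

```python
import time


class InMemorySharedStore:
    """Stand-in for a store shared by all servers (hypothetical interface)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def increment(self, key):
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


class SharedCircuitBreaker:
    def __init__(self, store, service, failure_threshold=3,
                 recovery_timeout=30.0, clock=time.time):
        self.store = store
        self.service = service
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock

    def allow_request(self):
        opened_at = self.store.get(self.service + ":opened_at")
        if opened_at is None:
            return True  # circuit is closed
        # Open: only allow traffic again after the recovery wait.
        return self.clock() - opened_at >= self.recovery_timeout

    def record_failure(self):
        count = self.store.increment(self.service + ":failures")
        if count >= self.failure_threshold:
            self.store.set(self.service + ":opened_at", self.clock())

    def record_success(self):
        self.store.set(self.service + ":failures", 0)
        self.store.set(self.service + ":opened_at", None)
```

Each server creates its own `SharedCircuitBreaker` pointing at the same store. Failures seen by one server count for all of them, so a server that never saw an error still stops calling the broken service.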






