Demystifying Latency and Throughput in System Design

Demystifying Latency and Throughput in System Design


In the world of system design, two critical performance metrics often come into play: latency and throughput.

Knowing about these metrics is important for making and improving efficient systems, especially when thinking about real-world uses.


Latency refers to the time it takes for data to travel from one point to another. It’s the delay between a request being initiated and the response being received.

For instance, in a video game, latency is the time from when you perform an action (like shooting a bullet) to when you see the result on your screen. Lower latency is important for real-time applications because they need a quick response.

Examples to understand latency in different contexts:

  • Reading 1MB of data sequentially from memory typically takes about 250 microseconds.

  • The same action from an SSD takes about 1 millisecond (1,000 microseconds), which is 4 times slower than from memory.

  • Transferring 1MB of data over a 1 Gbps network link takes around 10,000 microseconds.

  • A packet's round trip time from California to the Netherlands and back is roughly 150 milliseconds (150,000 microseconds).

These examples show that latency can change a lot depending on the medium and distance.


Throughput, on the other hand, measures how much data can be transferred from one point to another within a given time frame. It’s about volume and capacity, essentially answering the question, "How much data can we move at once?"

This is particularly relevant in scenarios where large volumes of data need to be processed or transferred.

For example:

  • A 1 Gbps (Gigabit per second) link can transfer 1 billion bits per second.

Throughput is an important measurement in applications and networks that deal with a lot of data. It shows how much data can be processed or moved in a short time. High throughput systems can manage more requests or transfer more data, which is important for services like streaming platforms, data centers, and big web applications.

Real-World Implications

Knowing the balance between latency and throughput is important in system design.

For example, an online video streaming service needs low latency for fast video start times and high throughput to show high-quality content without problems.