
What is Latency in System Design?

Put simply, latency is the time a system takes to respond to a request: the delay between the moment a user sends a request and the moment the response arrives. Lower latency makes for a faster, smoother user experience, while higher latency shows up as noticeable delays.

Latency is usually measured in milliseconds (ms). Even a small delay can make an application feel slow, especially in real-time systems like online payments, gaming, or video calls.
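To make the unit concrete, here is a minimal sketch of timing a single request in milliseconds. The `fake_request` function is a hypothetical stand-in for a real network call (it just sleeps for about 50 ms), and `measure_latency_ms` is an illustrative helper, not part of any library.

```python
import time

def measure_latency_ms(operation):
    """Call `operation` once and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = operation()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

def fake_request():
    # Hypothetical stand-in for a real request: waits roughly 50 ms, then responds.
    time.sleep(0.05)
    return "ok"

if __name__ == "__main__":
    response, latency_ms = measure_latency_ms(fake_request)
    print(f"response={response!r} latency={latency_ms:.1f} ms")
```

In a real system you would time many requests and look at the distribution rather than a single measurement, since individual requests vary.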

Real-Life Example of Latency

Imagine you are ordering food using a mobile app:

  • You tap the "Place Order" button.
  • The app sends your request to the server.
  • The server processes the order and sends a response back.

If the confirmation message appears instantly, the latency is low. If it takes 4–5 seconds to show the confirmation, the latency is high.

Even if the order is successful, high latency can make users feel that the app is slow or unreliable.

Technical Example in System Design

Consider an e-commerce website:

  • The user searches for a product.
  • The request travels from the browser to the backend server.
  • The server fetches data from the database.
  • The result is sent back to the user.

If this entire round trip takes about 200 milliseconds, the site feels responsive. If it takes 2 seconds, users may get frustrated and leave the website.
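The total latency of such a request is roughly the sum of the time spent in each step. The sketch below adds up a hypothetical breakdown of the search request; the stage names and the millisecond values are made-up numbers chosen to total 200 ms, not measurements.

```python
# Hypothetical per-stage timings in milliseconds (illustrative values only).
STAGES_MS = {
    "browser to server (network)": 40,
    "server processing": 30,
    "database query": 90,
    "server to browser (network)": 40,
}

def total_latency_ms(stages):
    """End-to-end latency is the sum of the sequential stages."""
    return sum(stages.values())

def slowest_stage(stages):
    """Return the name of the stage that contributes the most latency."""
    return max(stages, key=stages.get)

if __name__ == "__main__":
    print(f"total: {total_latency_ms(STAGES_MS)} ms")
    print(f"slowest stage: {slowest_stage(STAGES_MS)}")
```

A breakdown like this shows where optimization effort is likely to pay off first: in this made-up example, the database query is the largest contributor.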

This is why system designers focus heavily on reducing latency by using caching, faster databases, load balancers, and content delivery networks (CDNs).
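As one example of these techniques, the sketch below shows how caching can cut latency for repeated requests. It assumes a slow database lookup, simulated here with a 100 ms sleep; `fetch_product_from_db`, `get_product`, and `timed_ms` are hypothetical names used only for illustration. The cache itself is Python's standard `functools.lru_cache`.

```python
import time
from functools import lru_cache

def fetch_product_from_db(product_id):
    # Hypothetical slow database lookup, simulated with a 100 ms sleep.
    time.sleep(0.1)
    return {"id": product_id, "name": f"Product {product_id}"}

@lru_cache(maxsize=1024)
def get_product(product_id):
    # The first call per product_id hits the "database";
    # later calls with the same id are served from memory.
    return fetch_product_from_db(product_id)

def timed_ms(func, *args):
    """Return how long one call to func(*args) takes, in milliseconds."""
    start = time.perf_counter()
    func(*args)
    return (time.perf_counter() - start) * 1000.0

if __name__ == "__main__":
    print(f"first call:  {timed_ms(get_product, 42):.2f} ms")  # cache miss
    print(f"second call: {timed_ms(get_product, 42):.2f} ms")  # cache hit
```

The trade-off is freshness: a cached result can become stale if the underlying data changes, so real systems usually pair caching with an expiry or invalidation policy.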

Why Latency Matters

  • Lower latency improves the user experience
  • Faster responses increase customer satisfaction
  • Shorter waits reduce user drop-off
  • Low latency is critical for real-time systems like trading, gaming, and video streaming

Latency vs Speed (Simple Understanding)

Speed, in the sense of bandwidth or throughput, refers to how much data can be transferred per unit of time. Latency refers to how long it takes before the first piece of data arrives.

A system can have high speed but still feel slow if latency is high.
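A rough way to see this is to model the time for one response as latency plus the time needed to push the data through the link. The sketch below uses that simplified model, treating speed as bandwidth in megabits per second; the function name and all the numbers are hypothetical.

```python
def total_time_ms(latency_ms, size_megabits, bandwidth_mbps):
    """Simplified model: total time = latency + payload size / bandwidth."""
    transfer_ms = size_megabits / bandwidth_mbps * 1000.0
    return latency_ms + transfer_ms

if __name__ == "__main__":
    payload = 0.1  # a small response of 0.1 megabits (about 12.5 KB)

    # Fast link (1000 Mbps) with high latency (300 ms).
    fast_link = total_time_ms(latency_ms=300, size_megabits=payload, bandwidth_mbps=1000)

    # Slow link (10 Mbps) with low latency (20 ms).
    slow_link = total_time_ms(latency_ms=20, size_megabits=payload, bandwidth_mbps=10)

    print(f"fast link, high latency: {fast_link:.1f} ms")
    print(f"slow link, low latency:  {slow_link:.1f} ms")
```

Under these assumptions the high-bandwidth link takes about 300 ms while the low-bandwidth link takes about 30 ms, because for small payloads the wait is dominated by latency rather than transfer time.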

Summary

Latency in system design means the delay between a request and its response. It directly affects how fast and responsive an application feels to users. Low-latency systems provide smooth and quick interactions, while high-latency systems cause delays and frustration.

For modern applications, reducing latency can be just as important as adding new features. A fast, responsive system generally creates a better user experience.