Sessions

Meet the talks!

Rate Limiting Done Right

Mario Slatinac - Infinum / Matej Nedić - Infinum

Most systems we build today are distributed, and the core problems stay the same from team to team. Scaling, dependencies, and coupling mean that a problem in one service can cause downtime in another. This is exactly why rate limiting has become a must. It protects your service from sudden surges of calls, keeps critical paths stable, and gives you predictable behavior even when something on the outside goes wrong. If you’re running a multi-tenant system, it also becomes the basic tool for fairness, so one noisy client doesn’t starve everyone else.

In this talk, we will look at how to implement API rate limiting in Spring in a way that actually works in real systems. When should you enable it, and when does it only add overhead? Do you treat every endpoint the same, or do you apply more fine-grained rules? How do you decide when to drop, delay, or shape traffic? And how do you deal with distributed state when your limits have to work across multiple nodes?

We will go through advanced techniques, fair allocation of capacity across clients, and the challenges of consistent rate limiting in large architectures. The goal is to give you practical tools to keep your system healthy, stable, and predictable, even when everything around it isn’t.