
Scaling Identity to Millions: Session, Token, and Cache Design in Keycloak

  • Mayank Soni
  • May 5

Primary Audience: Platform engineers building identity infrastructure, security architects designing authentication systems, and teams operating large-scale SaaS or financial systems.

CNCF Alignment: Kubernetes, HA, Multi-Region


Abstract

At scale, identity systems are defined by how sessions are managed, tokens are designed, and validation is performed. In large Keycloak deployments, these decisions directly impact performance, availability, and security. This article explores practical patterns for handling high concurrency, focusing on token lifecycle, validation strategies, and distributed cache behavior required to support millions of active sessions.


Introduction

As identity platforms grow, they evolve from simple authentication services into distributed systems. In Keycloak deployments operating at scale, session state, token refresh cycles, and validation models continuously generate load across the system.

These elements are tightly connected. Token lifetime affects cache usage, validation strategy defines system dependencies, and session control determines how effectively the system responds to security and compliance events. Designing these components in isolation leads to instability under load. Designing them together enables systems to scale predictably.


Token Design and System Load

Token configuration determines how frequently Keycloak is involved after initial authentication. A common approach uses short-lived access tokens with longer-lived refresh tokens. This limits the exposure of compromised tokens while increasing the number of refresh operations.

This trade-off directly affects system behavior. Shorter token lifetimes improve security but increase pressure on cache and refresh mechanisms. Longer lifetimes reduce operational load but make it harder to invalidate sessions during incidents. At scale, token lifetime becomes a performance decision as much as a security one, shaping how much traffic the identity system must handle continuously.
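The relationship between token lifetime and refresh traffic can be approximated with simple arithmetic. The sketch below assumes every active session refreshes once per access token lifetime and that refreshes are spread evenly over time; real traffic is burstier. The function name and the figure of 2,000,000 sessions are illustrative, not taken from a specific deployment.

```python
def refresh_rate_per_second(active_sessions: int, access_token_lifespan_s: int) -> float:
    """Approximate steady-state refresh requests per second.

    Assumption: each active session refreshes exactly once per
    access token lifetime, evenly distributed over time.
    """
    if access_token_lifespan_s <= 0:
        raise ValueError("access token lifespan must be positive")
    return active_sessions / access_token_lifespan_s


if __name__ == "__main__":
    # Hypothetical population of two million active sessions.
    for lifespan_s in (60, 300, 900):
        rate = refresh_rate_per_second(2_000_000, lifespan_s)
        print(f"{lifespan_s:>4}s lifespan -> {rate:,.0f} refreshes/s")
```

Under these assumptions, shortening the lifetime multiplies the steady refresh load proportionally, which is why lifetime is a capacity input as well as a security setting.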


Validation Strategy and Dependency

Keycloak supports both introspection-based validation and local token validation. With introspection, every validation request depends on Keycloak, placing it in the critical path of all service interactions. This creates a strong dependency that can affect overall system availability.

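Local validation removes this dependency: each service verifies the token's signature and claims itself, using the realm's published signing keys, and contacts Keycloak only to fetch or refresh those keys. This approach improves resilience, as services can continue operating even when the identity system is under stress or temporarily unavailable. The trade-off is that a locally validated token remains acceptable until it expires, so revocation takes effect only as quickly as the access token lifetime allows. The choice between these models determines whether Keycloak acts as a runtime dependency or as a distributed authority.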
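A minimal sketch of local validation follows, using only the Python standard library. To stay self-contained it signs and verifies with HS256 and a shared secret, which is a simplification: Keycloak access tokens are typically signed with an asymmetric algorithm such as RS256 and verified against the realm's public keys, usually through a maintained JWT library. The issuer URL, audience, and helper names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a demo token (stands in for the identity provider)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def validate_locally(token: str, secret: bytes, issuer: str, audience: str,
                     now: Optional[float] = None) -> dict:
    """Validate a token without any network call; raise ValueError on failure."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None

    # Pin the expected algorithm instead of trusting the header blindly.
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")

    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")

    claims = json.loads(b64url_decode(payload_b64))
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("unexpected audience")
    return claims


if __name__ == "__main__":
    secret = b"demo-secret"  # hypothetical shared secret for the sketch
    issuer = "https://idp.example.com/realms/demo"  # hypothetical realm URL
    token = sign_hs256({"iss": issuer, "aud": "orders-api", "sub": "user-1",
                        "exp": int(time.time()) + 300}, secret)
    print(validate_locally(token, secret, issuer, "orders-api")["sub"])
```

Nothing in `validate_locally` contacts the identity provider, which is the source of both the resilience benefit and the delayed-revocation trade-off.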


Session State and Cache Design

At high concurrency, session state must be handled through a distributed cache. Keycloak relies on this cache to maintain consistency across nodes while supporting large volumes of active sessions.

Cache sizing is a key factor in performance. A small cluster of well-provisioned nodes can support millions of sessions, depending on the per-session memory footprint and the session timeout configuration. Session duration directly impacts memory usage because longer timeouts keep more sessions resident at once, making timeout tuning an important control for managing resource consumption.
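A rough sizing estimate can make the link between timeouts and memory concrete. The sketch below assumes a steady login rate, a fixed average footprint per session, and a cache that keeps a configurable number of copies of each entry spread evenly across nodes. All figures (100 logins per second, 2 KiB per session, two copies, three nodes) are hypothetical placeholders to be replaced with measured values.

```python
def steady_state_sessions(logins_per_second: float, session_lifetime_s: float) -> float:
    """Sessions resident at once, assuming a steady arrival rate.

    This is Little's law: items in system = arrival rate x time in system.
    """
    return logins_per_second * session_lifetime_s


def cache_memory_per_node_gib(active_sessions: float, bytes_per_session: int,
                              copies: int, nodes: int) -> float:
    """Approximate session memory per node, in GiB.

    Assumptions: every session has the same footprint, each entry is kept
    on `copies` nodes, and entries are evenly distributed.
    """
    if nodes <= 0 or copies <= 0:
        raise ValueError("nodes and copies must be positive")
    total_bytes = active_sessions * bytes_per_session * copies
    return total_bytes / nodes / 2**30


if __name__ == "__main__":
    # Compare a 30-minute and an 8-hour session lifetime at the same login rate.
    for lifetime_s in (30 * 60, 8 * 60 * 60):
        sessions = steady_state_sessions(100, lifetime_s)
        gib = cache_memory_per_node_gib(sessions, 2048, copies=2, nodes=3)
        print(f"lifetime {lifetime_s:>6}s: {sessions:,.0f} sessions, {gib:.2f} GiB per node")
```

The estimate covers session entries only; heap overhead, indexes, and other caches need separate headroom.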

In multi-region deployments, session data is typically replicated asynchronously between regions. This introduces replication lag, which must be monitored because it affects session validation: a session created or terminated in one region may not yet be visible in another. At scale, even small delays can lead to inconsistencies, making cache behavior a central part of system performance.


Session Revocation and Control

Session revocation is essential in regulated environments where access must be terminated immediately in response to events such as fraud or account closure. In Keycloak, this is typically handled using a revocation list stored in the cache, which is checked during session validation.

The reliability of this mechanism is critical. Failures in revocation, whether due to cache issues or misconfiguration, can allow invalid sessions to persist. This makes revocation not just a system feature but a compliance requirement, requiring continuous validation as part of operations.
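The revocation check and its continuous validation can be sketched as follows. This is an in-memory illustration, not Keycloak's internal implementation: the `RevocationList` class, the `session_is_valid` function, and the canary check are hypothetical names, and a real deployment would back the list with the distributed cache. Entries are kept only until the session would have expired anyway, which bounds the size of the list.

```python
import time
import uuid
from typing import Dict, Optional


class RevocationList:
    """Maps revoked session IDs to the time the session would expire."""

    def __init__(self) -> None:
        self._revoked: Dict[str, float] = {}

    def revoke(self, session_id: str, session_expires_at: float) -> None:
        self._revoked[session_id] = session_expires_at

    def is_revoked(self, session_id: str, now: float) -> bool:
        expires_at = self._revoked.get(session_id)
        if expires_at is None:
            return False
        if expires_at <= now:
            # The session is past its own expiry, so the entry is no longer
            # needed; expiry checks reject it independently.
            del self._revoked[session_id]
            return False
        return True


def session_is_valid(session_id: str, session_expires_at: float,
                     revocations: RevocationList, now: Optional[float] = None) -> bool:
    """A session is valid only if it is unexpired and not revoked."""
    now = time.time() if now is None else now
    if session_expires_at <= now:
        return False
    return not revocations.is_revoked(session_id, now)


def revocation_canary(revocations: RevocationList, now: Optional[float] = None) -> bool:
    """Operational check: revoke a synthetic session and confirm it is rejected."""
    now = time.time() if now is None else now
    canary_id = f"canary-{uuid.uuid4()}"  # unique per run so checks do not interfere
    expires_at = now + 60
    if not session_is_valid(canary_id, expires_at, revocations, now):
        return False  # the canary should be valid before it is revoked
    revocations.revoke(canary_id, expires_at)
    return not session_is_valid(canary_id, expires_at, revocations, now)


if __name__ == "__main__":
    print("revocation path healthy:", revocation_canary(RevocationList()))
```

Running a check like `revocation_canary` on a schedule turns revocation from an assumed property into one that is verified as part of operations.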


Device and Session Context

Keycloak supports device-based patterns such as session binding and silent authentication. Binding sessions to specific devices or contexts improves security by reducing the risk of session misuse. At the same time, it introduces practical constraints.

Strict binding can conflict with real-world usage, such as shared networks or multi-device access. Device-based authentication also requires careful handling of registration, credential storage, and revocation conditions. These patterns improve control but must be balanced against usability.


Performance Under High Concurrency

At scale, identity systems must handle constant activity, including token refresh, session validation, and cache operations. Performance depends on how efficiently these processes are distributed and how much load is placed on Keycloak.

Short token lifetimes increase refresh frequency, while validation strategy determines how often services depend on the identity system. Cache replication adds further complexity in distributed environments. These factors interact, and small configuration changes can significantly affect system behavior under load.


Observability Signals

Performance issues often appear first as increases in latency. Monitoring token issuance, validation latency, and authentication outcomes provides early insight into system stress.

These signals help detect issues such as cache delays or increased load before they impact users. In distributed identity systems, timely visibility is essential for maintaining reliability at scale.
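As one example of turning latency into an early-warning signal, the sketch below compares the 95th percentile of recent validation latencies against a baseline window. The function names, the choice of p95, and the factor of 2 are assumptions for illustration; in practice this logic usually lives in the monitoring system's alert rules.

```python
import statistics
from typing import Sequence


def p95(samples_ms: Sequence[float]) -> float:
    """95th percentile of latency samples (needs at least two samples)."""
    # With n=20, quantiles() returns 19 cut points; the last one is p95.
    return statistics.quantiles(samples_ms, n=20)[-1]


def is_degraded(current_ms: Sequence[float], baseline_ms: Sequence[float],
                factor: float = 2.0) -> bool:
    """Flag degradation when current p95 exceeds the baseline p95 by `factor`."""
    return p95(current_ms) > factor * p95(baseline_ms)


if __name__ == "__main__":
    baseline = [10.0] * 100                # hypothetical healthy window
    current = [10.0] * 90 + [200.0] * 10   # a slow tail appears
    print("degraded:", is_degraded(current, baseline))
```

A tail percentile is used here because averages can hide a slow minority of requests, which is often how cache delays first appear.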


Conclusion

Scaling Keycloak to support millions of sessions requires careful coordination between token design, validation strategy, and cache architecture. These components are interdependent, and decisions in one area affect the entire system.

Reliable systems are built by aligning these elements with expected load and operational requirements. When designed together, they enable identity platforms to scale efficiently while maintaining performance, control, and resilience.


Writer's Overview

Mayank Soni – DevSecOps Specialist  

Mayank leads DevSecOps initiatives at Midships, driving platform-level automation, CI/CD pipeline optimization, and secure infrastructure delivery. With a dual role as contributor and team lead, he focuses on scalable deployment strategies, cost-efficient infrastructure, and faster feedback loops across cloud environments. His hands-on experience spans infrastructure, security, and application layers—including building and deploying full-stack services using modern cloud-native architectures.

Short bio: Mayank is a DevSecOps expert with 6+ years of experience across AWS, Azure, and GCP. He specializes in infrastructure automation, secure CI/CD pipelines, and observability systems. With a strong foundation in both cloud engineering and application development, he supports Midships’ cloud transformation across Southeast Asia.
