Mastering Redis Caching Strategies for Scalable APIs

Backend

Written by: Anant

04.03.2026

Software Engineer | Systems & Web

Caching is one of the highest-leverage performance tools in backend engineering, but it is also one of the easiest ways to accidentally introduce stale data, hidden bugs, and confusing production behavior. Redis is fast, flexible, and battle-tested, which makes it a common choice for API caching. The hard part is not connecting Redis to your application. The hard part is deciding what to cache, how long to cache it, when to invalidate it, and how to observe whether the cache is actually helping.

This article is a practical guide to designing Redis caching for scalable APIs. We will walk through the mental model, common caching patterns, TTL strategy, invalidation, cache stampede prevention, key naming, code examples, and operational metrics that matter in production.

Why Redis for Caching?

Redis is an in-memory data store. Reads and writes are usually much faster than a relational database because Redis keeps data in memory and serves simple key-based lookups instead of planning and executing queries. For API workloads, that means Redis can sit between your application and slower systems like PostgreSQL, MongoDB, third-party APIs, or expensive computation.

Redis is useful because it gives you:

Sub-millisecond reads for hot data
Built-in TTL support for automatic expiration
Atomic operations for counters, locks, and rate limits
Flexible data structures like strings, hashes, sets, lists, and sorted sets
Simple deployment options for standalone, replicated, or clustered setups

But Redis is not magic. A cache is a copy of data, and copies can become wrong. Good Redis architecture starts with knowing which correctness tradeoff your endpoint can tolerate.

When Should You Cache?

Do not cache just because an endpoint exists. Cache when there is a measurable reason.

Good candidates:

Expensive database queries
Repeated reads of the same data
Public or mostly stable resources
Personalized dashboards with tolerable freshness windows
Third-party API responses with rate limits or high latency
Computed summaries, leaderboards, feeds, and recommendations

Poor candidates:

Highly volatile data that must always be exact
Security-sensitive authorization decisions without careful invalidation
Low-traffic endpoints where cache complexity gives little value
Data that is cheap to fetch and rarely requested

A simple rule: cache data when the cost of occasionally serving slightly old data is lower than the cost of repeatedly recomputing or refetching it.

Cache-Aside Pattern

Cache-aside is the most common Redis pattern for APIs. The application controls the cache directly.

The flow is:

Read from Redis first
If Redis has the value, return it
If Redis misses, read from the database
Store the database result in Redis with a TTL
Return the response

async function getUserProfile(userId: string) {
  const key = `user:${userId}:profile`;
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }
  const profile = await db.user.findUnique({
    where: { id: userId },
  });
  if (!profile) {
    return null;
  }
  await redis.set(key, JSON.stringify(profile), {
    EX: 300,
  });
  return profile;
}

Cache-aside is easy to reason about because the cache is filled lazily on the read path, only for data that is actually requested. The database remains the source of truth.

The weakness is that every miss still pays the full database cost plus a cache write, and concurrent misses on the same key each pay it separately. For most APIs, that is acceptable. For very hot keys, you may need pre-warming or request coalescing.
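Request coalescing can be sketched as an in-process "single-flight" map: concurrent callers asking for the same key share one in-flight promise instead of each running the loader. The names `singleFlight` and `inFlight` are illustrative, not part of any Redis client, and this sketch only coalesces requests inside one Node.js process. Multiple API instances would still each send one request on a miss.

```typescript
// Promises currently being resolved, indexed by cache key.
const inFlight = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, loader: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) {
    // Another caller is already loading this key: share its promise.
    return existing as Promise<T>;
  }

  const promise = loader();
  inFlight.set(key, promise);

  // Clear the entry on success or failure so later calls run the loader again.
  const clear = () => {
    inFlight.delete(key);
  };
  promise.then(clear, clear);

  return promise;
}

// Usage (getUserProfile is the cache-aside function shown earlier):
//   const profile = await singleFlight(`user:${id}:profile`, () => getUserProfile(id));
```

Because the map entry is removed as soon as the promise settles, this does not act as a second cache. It only collapses overlapping requests.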

Write-Through and Write-Behind

Cache-aside is read-focused. Write-through and write-behind are write-focused.

In write-through caching, every write goes to the database and cache together:

async function updateUserProfile(userId: string, input: UpdateProfileInput) {
  const profile = await db.user.update({
    where: { id: userId },
    data: input,
  });
  await redis.set(`user:${userId}:profile`, JSON.stringify(profile), {
    EX: 300,
  });
  return profile;
}

This keeps the cache warm after writes, but it adds complexity. If the database write succeeds and the Redis write fails, your system needs a fallback plan. Usually that fallback is simple: delete the cache key and allow the next read to repopulate it.
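The fallback described above can be sketched as a small helper. `ProfileCache` is a hypothetical structural type covering only the two calls the sketch needs, and `writeThrough` is an illustrative name. The important property is that a cache failure never fails a write the database has already committed.

```typescript
// Minimal structural type so the sketch does not depend on a specific client.
interface ProfileCache {
  set(key: string, value: string, options: { EX: number }): Promise<unknown>;
  del(key: string): Promise<unknown>;
}

async function writeThrough<T>(
  cache: ProfileCache,
  key: string,
  ttlSeconds: number,
  persist: () => Promise<T>,
): Promise<T> {
  // 1. The database write is the source of truth; let its errors propagate.
  const saved = await persist();
  try {
    // 2. Refresh the cache with the value the database returned.
    await cache.set(key, JSON.stringify(saved), { EX: ttlSeconds });
  } catch (setError) {
    console.error("Cache refresh failed, deleting key instead", { key, setError });
    try {
      // 3. Fallback: drop the possibly stale entry so the next read repopulates it.
      await cache.del(key);
    } catch (delError) {
      // If Redis is unreachable, the old value survives until its TTL expires.
      console.error("Cache delete failed", { key, delError });
    }
  }
  return saved;
}
```

If both the set and the delete fail, the stale entry lives until its TTL runs out, which is one more reason to keep TTLs short on write-heavy keys.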

Write-behind is different. The application writes to Redis first, then persists to the database later through a background process. This can improve write speed, but it is risky for normal API data because Redis becomes part of the durability path. Use it carefully, usually for analytics, counters, or buffered events rather than critical business records.

TTL Strategy: Freshness Is a Product Decision

Different endpoints need different freshness windows:

Static reference data: 30 minutes to 24 hours
User dashboards: 1 to 5 minutes
Feed pages: 30 seconds to 5 minutes
Public article data: 10 minutes to 1 hour
Critical metrics: 5 to 60 seconds
Authorization or billing state: avoid caching, or cache for very short windows

TTL is not only a technical setting. It defines how stale your product is allowed to be.

Use short TTLs when:

Data changes frequently
Users expect immediate updates
Incorrect data can cause support issues
The endpoint affects money, access, or permissions

Use longer TTLs when:

Data changes rarely
Users tolerate eventual consistency
The endpoint is expensive and high traffic
The data is public or shared across many users

Avoid setting the same TTL everywhere. A global 300-second cache policy is easy, but it is rarely correct.
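One way to keep TTLs endpoint-specific is to name them in a single policy table rather than scattering numbers through handlers. The cache classes and values below are hypothetical; each value is simply one point inside the ranges listed earlier and should be replaced by whatever your product can tolerate.

```typescript
// Hypothetical cache classes with TTLs in seconds.
const TTL_SECONDS = {
  referenceData: 6 * 60 * 60, // static reference data: 6 hours
  userDashboard: 2 * 60, // user dashboards: 2 minutes
  feedPage: 60, // feed pages: 1 minute
  publicArticle: 30 * 60, // public article data: 30 minutes
  criticalMetric: 15, // critical metrics: 15 seconds
} as const;

type CacheClass = keyof typeof TTL_SECONDS;

// Looking up by class name makes an unknown class a compile-time error.
function ttlFor(cacheClass: CacheClass): number {
  return TTL_SECONDS[cacheClass];
}

// Usage: await redis.set(key, payload, { EX: ttlFor("feedPage") });
```

Authorization and billing state are deliberately absent from the table, so caching them requires an explicit decision rather than a default.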

Invalidation Rules

Invalidation is where caching becomes real engineering.

TTL-based expiration is passive. It waits for time to pass. Invalidation is active. It removes or updates cached data when the source data changes.

For write-heavy domains, do not rely on TTL alone. If a user updates their profile, they expect the new profile to appear immediately. Waiting five minutes because the cache has not expired is a bad experience.

Common invalidation rules:

On write/update: invalidate related keys
On delete: remove dependent cache entries
On permission change: invalidate authorization-sensitive data
For aggregate data: use namespace versioning

async function invalidateUser(userId: string) {
  await redis.del(`user:${userId}:profile`);
  await redis.del(`user:${userId}:stats`);
  await redis.del(`user:${userId}:settings`);
}

The tricky part is aggregate data. Imagine a user updates their display name. You may need to invalidate:

user:{id}:profile
team:{teamId}:members
feed:{viewerId}:page:1
search:users:{query}

If you cannot reliably list every dependent key, use versioned namespaces.

async function getTeamMembers(teamId: string) {
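  // Default to "0": INCR on a missing key yields 1, so the first
  // invalidation must move readers off the default namespace.
  const version = (await redis.get(`team:${teamId}:members:version`)) ?? "0";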
  const key = `team:${teamId}:members:v${version}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);
  const members = await db.teamMember.findMany({
    where: { teamId },
  });
  await redis.set(key, JSON.stringify(members), { EX: 600 });
  return members;
}
async function invalidateTeamMembers(teamId: string) {
  await redis.incr(`team:${teamId}:members:version`);
}

Instead of deleting every old key, you bump a version. New reads use the new namespace. Old keys naturally expire through TTL.

Avoiding Cache Stampede

A cache stampede happens when a hot key expires and many requests try to rebuild it at the same time. All of them miss, all of them hit the database, and the load spike arrives exactly when the cache was supposed to protect it.

Techniques that help:

Add jitter to TTL values
Use request coalescing/single-flight
Pre-warm critical keys in background jobs
Use soft expiration for hot data
Lock cache regeneration with short-lived Redis locks

TTL jitter prevents many keys from expiring at the same instant.

function ttlWithJitter(baseSeconds: number, jitterSeconds = 60) {
  return baseSeconds + Math.floor(Math.random() * jitterSeconds);
}
await redis.set(key, JSON.stringify(data), {
  EX: ttlWithJitter(300, 45),
});

For hot keys, you can use a short lock so only one request rebuilds the cache while others wait briefly or serve stale data.

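async function getCachedReport(reportId: string) {
  const key = `report:${reportId}`;
  const lockKey = `${key}:lock`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);
  // SET with NX succeeds only for the one request that acquires the lock.
  const lock = await redis.set(lockKey, "1", {
    NX: true,
    EX: 10,
  });
  if (!lock) {
    // Another request is rebuilding: wait briefly, then re-check the cache.
    await new Promise((resolve) => setTimeout(resolve, 100));
    const retry = await redis.get(key);
    if (retry) return JSON.parse(retry);
    // Still missing: rebuild without touching a lock this request does not own.
    return buildExpensiveReport(reportId);
  }
  try {
    const report = await buildExpensiveReport(reportId);
    await redis.set(key, JSON.stringify(report), { EX: 300 });
    return report;
  } finally {
    // Only the lock holder reaches this point. If a rebuild can outlive the
    // 10-second lock TTL, store a unique token and check it before deleting.
    await redis.del(lockKey);
  }
}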

This pattern should be used sparingly. Locks add moving parts. Start with TTL jitter and only add locks for truly hot paths.
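Soft expiration, listed among the techniques above, can be sketched by storing a logical expiry time inside the cached value and giving the Redis key a longer hard TTL. The envelope shape and the names `SoftEntry`, `wrapWithSoftTtl`, and `classify` are assumptions made for illustration.

```typescript
// Envelope stored in Redis as JSON. The Redis TTL (hard expiry) should be
// longer than the soft TTL so stale values stay available during a refresh.
interface SoftEntry<T> {
  value: T;
  softExpiresAt: number; // epoch milliseconds
}

function wrapWithSoftTtl<T>(value: T, softTtlSeconds: number, now = Date.now()): SoftEntry<T> {
  return { value, softExpiresAt: now + softTtlSeconds * 1000 };
}

type Freshness = "miss" | "fresh" | "stale";

// Decides how to treat the raw string returned by redis.get(key).
function classify<T>(raw: string | null, now = Date.now()): { state: Freshness; value?: T } {
  if (raw === null) return { state: "miss" };
  const entry = JSON.parse(raw) as SoftEntry<T>;
  return now < entry.softExpiresAt
    ? { state: "fresh", value: entry.value }
    : { state: "stale", value: entry.value };
}
```

A caller would return a fresh value as is, return a stale value immediately while one background rebuild runs (guarded by a short-lived lock), and rebuild inline only on a true miss. Users then rarely wait on the expensive path, at the cost of serving data slightly past its soft TTL.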

Recommended Key Design

Keep keys predictable and namespaced:

user:{id}:profile
tenant:{tenantId}:plan
feed:{userId}:page:{n}
article:{slug}:summary
repo:{owner}:{name}:stats

Good naming avoids collisions and simplifies invalidation.

Use consistent conventions:

Put the entity first: user:123:profile
Include tenant IDs in multi-tenant systems: tenant:abc:user:123:settings
Include versions for aggregates: team:42:members:v3
Include pagination parameters: feed:123:page:1:limit:20
Avoid storing raw JSON query strings as keys without normalization

Bad keys create operational pain. If you cannot guess what a key stores by reading it, invalidation and debugging will be harder.
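These conventions are easier to keep when keys are built in one place. The helpers below are a minimal sketch with hypothetical names (`buildKey`, `queryKey`): the first joins validated segments, and the second normalizes query parameters so that equivalent requests map to the same key.

```typescript
// Joins key segments with ":" and rejects segments that would make the key ambiguous.
function buildKey(...segments: Array<string | number>): string {
  return segments
    .map((segment) => {
      const text = String(segment);
      if (text === "" || text.includes(":") || /\s/.test(text)) {
        throw new Error(`Invalid key segment: "${text}"`);
      }
      return text;
    })
    .join(":");
}

// Normalizes query parameters: undefined values are dropped, names are sorted,
// and values are URI-encoded, so parameter order never changes the key.
function queryKey(
  prefix: string,
  params: Record<string, string | number | boolean | undefined>,
): string {
  const normalized = Object.keys(params)
    .filter((name) => params[name] !== undefined)
    .sort()
    .map((name) => `${name}=${encodeURIComponent(String(params[name]))}`)
    .join("&");
  return `${prefix}:${normalized}`;
}

// buildKey("tenant", "abc", "user", 123, "settings") -> "tenant:abc:user:123:settings"
// queryKey("search:users", { q: "Ada L", limit: 20 }) -> "search:users:limit=20&q=Ada%20L"
```

For long or user-controlled queries, hashing the normalized string is a common next step, though it trades readability for bounded key length.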

What Should You Store?

For basic API caching, serialized JSON is fine.

await redis.set(key, JSON.stringify(response), { EX: 300 });

But Redis supports richer data structures:

Strings: JSON responses, tokens, feature flags
Hashes: object-like data where fields update independently
Sets: unique membership lists
Sorted sets: leaderboards, ranked feeds, time-ordered data
Lists/streams: lightweight queues and event flows

Do not force everything into JSON if Redis has a structure that better matches your access pattern. For example, a leaderboard is usually better as a sorted set than as one huge JSON blob.
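The leaderboard case can be sketched with two sorted-set commands. `SortedSetClient` is a structural type assumed to match the `zIncrBy` and `zRangeWithScores` methods of node-redis v4, so check the names and options against the client version you actually use. The key name is hypothetical.

```typescript
// Assumed to mirror the node-redis v4 sorted-set methods used below.
interface SortedSetClient {
  zIncrBy(key: string, increment: number, member: string): Promise<number>;
  zRangeWithScores(
    key: string,
    min: number,
    max: number,
    options?: { REV?: true },
  ): Promise<Array<{ value: string; score: number }>>;
}

const LEADERBOARD_KEY = "leaderboard:weekly"; // hypothetical key name

// ZINCRBY updates one member's score atomically, with no read-modify-write
// of a large JSON document.
async function addPoints(client: SortedSetClient, playerId: string, points: number): Promise<number> {
  return client.zIncrBy(LEADERBOARD_KEY, points, playerId);
}

// ZRANGE with REV and WITHSCORES returns the highest scores first
// (the REV option requires Redis 6.2 or later).
async function topPlayers(client: SortedSetClient, count: number) {
  return client.zRangeWithScores(LEADERBOARD_KEY, 0, count - 1, { REV: true });
}
```

With a JSON blob, every score change would require fetching, parsing, re-sorting, and rewriting the whole document, and concurrent updates could overwrite each other.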

Memory and Eviction Policy

Redis is fast because it uses memory. Memory is finite.

You need to understand:

How large your values are
How many keys you create
How long keys live
What Redis should do when memory is full
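The first three questions combine into a rough capacity estimate: at steady state, the number of live keys is about the rate of new keys multiplied by their TTL. The function below is a back-of-the-envelope sketch. The default per-key overhead of 100 bytes is an assumption, not a Redis constant, and real usage should be measured (for example with the `MEMORY USAGE` command).

```typescript
interface CacheSizing {
  newKeysPerSecond: number; // distinct keys written per second
  ttlSeconds: number; // how long each key lives
  avgKeyBytes: number; // average key length
  avgValueBytes: number; // average serialized value size
  perKeyOverheadBytes?: number; // assumed bookkeeping overhead per key
}

// Steady state: live keys are roughly write rate * TTL
// (keys that have been written and have not expired yet).
function estimateCacheBytes(sizing: CacheSizing): number {
  const overhead = sizing.perKeyOverheadBytes ?? 100; // rough assumption
  const liveKeys = sizing.newKeysPerSecond * sizing.ttlSeconds;
  return liveKeys * (sizing.avgKeyBytes + sizing.avgValueBytes + overhead);
}

// 50 new keys/s with a 300 s TTL is about 15,000 live keys.
// At 30 + 2,000 + 100 bytes each, that is 31,950,000 bytes (about 32 MB).
const estimate = estimateCacheBytes({
  newKeysPerSecond: 50,
  ttlSeconds: 300,
  avgKeyBytes: 30,
  avgValueBytes: 2000,
});
```

The estimate scales linearly with TTL, so doubling a TTL on a high-cardinality namespace roughly doubles its memory footprint.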

Common eviction policies include:

noeviction: writes fail when memory is full
allkeys-lru: evict least recently used keys from all keys
volatile-lru: evict least recently used keys only among keys with TTL
allkeys-lfu: evict least frequently used keys from all keys

For a general-purpose API cache, allkeys-lru or allkeys-lfu is often reasonable, but the best choice depends on whether Redis is only a cache or also stores durable-ish operational data like queues, locks, and sessions.

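If Redis stores both cache data and important coordination data, separate them by instance or cluster. Logical databases inside one instance share the same memory limit and eviction policy, so they do not isolate critical keys from cache pressure. Do not let a response cache evict critical locks or session state.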

Observability Checklist

Track these metrics:

Cache hit ratio
Evictions
Memory usage
P95/P99 latency
DB queries per request (before vs after caching)
Redis command errors
Hot keys
Key count by namespace
Average serialized value size

Also add application-level logging around cache misses for important paths. A sudden drop in hit ratio can point to a deployment bug, key naming change, broken serialization, or accidental invalidation loop.
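A server-side hit ratio can be derived from the `keyspace_hits` and `keyspace_misses` counters that Redis reports in its `INFO` output. The parsing helper below is a minimal sketch with illustrative function names. It takes the raw `INFO` text as a string, however your client returns it.

```typescript
// Extracts numeric fields from the text returned by the Redis INFO command.
function parseInfoNumbers(info: string): Record<string, number> {
  const fields: Record<string, number> = {};
  for (const line of info.split(/\r?\n/)) {
    if (line === "" || line.startsWith("#")) continue; // skip blanks and section headers
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    const value = Number(line.slice(separator + 1));
    if (!Number.isNaN(value)) fields[line.slice(0, separator)] = value;
  }
  return fields;
}

// keyspace_hits and keyspace_misses are cumulative since the last restart or
// CONFIG RESETSTAT, so compare deltas between two samples for a recent ratio.
function hitRatio(info: string): number | null {
  const { keyspace_hits: hits = 0, keyspace_misses: misses = 0 } = parseInfoNumbers(info);
  const total = hits + misses;
  return total === 0 ? null : hits / total;
}
```

These counters cover every key lookup on the server, including locks and sessions, so per-namespace hit ratios still need application-level counters.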

Useful questions:

Did the cache reduce database load?
Did P95 latency improve?
Are we serving stale data beyond the expected window?
Are evictions happening under normal traffic?
Which keys are hottest?
Which namespaces consume the most memory?

If you cannot answer these questions, you are operating the cache blindly.

Production Checklist

Before shipping Redis caching broadly, check these items:

Every cached endpoint has a clear freshness expectation
Keys are namespaced and predictable
TTLs are endpoint-specific
Writes invalidate or update related cache entries
Hot keys have stampede protection if needed
Cached values have a bounded size
Redis memory usage and evictions are monitored
Cache failures degrade gracefully
The database remains the source of truth

Graceful degradation matters. If Redis goes down, your entire API should not go down automatically. For many endpoints, the fallback should be reading from the database directly.

async function safeGetJson<T>(key: string): Promise<T | null> {
  try {
    const value = await redis.get(key);
    return value ? JSON.parse(value) : null;
  } catch (error) {
    console.error("Redis read failed", { key, error });
    return null;
  }
}

This is not an excuse to ignore Redis failures. It is a way to keep your product alive while alerts and dashboards tell you what broke.
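Catching errors covers a Redis that fails fast, but a slow or hanging connection may never throw. A hedged complement is to bound how long a request waits on the cache. `withTimeout` is an illustrative helper, and the 50 ms budget in the usage comment is an arbitrary example value.

```typescript
// Resolves with `fallback` if `operation` does not settle within `timeoutMs`.
// A rejected operation also resolves with `fallback`, so callers never throw.
function withTimeout<T>(operation: Promise<T>, timeoutMs: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), timeoutMs);
    operation.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        console.error("Cache operation failed", { error });
        resolve(fallback);
      },
    );
  });
}

// Usage with the helper above: treat a slow cache like a miss.
//   const profile = await withTimeout(safeGetJson<Profile>(key), 50, null);
```

The underlying command is not cancelled when the timeout wins. It simply stops blocking the request, so client-level connection and command timeouts are still worth configuring.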

Final Takeaway

Redis caching works best when it is designed as part of the system, not sprinkled on top after an endpoint becomes slow. Start with cache-aside for expensive read paths. Pick TTLs based on product freshness requirements. Invalidate aggressively where users expect immediate updates. Add jitter and stampede protection for hot keys. Monitor hit ratio, memory, evictions, and database load so you can prove the cache is helping.

The best cache is boring: predictable keys, understandable rules, clear fallbacks, and metrics that tell the truth.