Mastering Redis Caching Strategies for Scalable APIs
Backend
Written by: Anant
Software Engineer | Systems & Web
Caching is one of the highest-leverage performance tools in backend engineering, but it is also one of the easiest ways to accidentally introduce stale data, hidden bugs, and confusing production behavior. Redis is fast, flexible, and battle-tested, which makes it a common choice for API caching. The hard part is not connecting Redis to your application. The hard part is deciding what to cache, how long to cache it, when to invalidate it, and how to observe whether the cache is actually helping.
This article is a practical guide to designing Redis caching for scalable APIs. We will walk through the mental model, common caching patterns, TTL strategy, invalidation, cache stampede prevention, key naming, code examples, and operational metrics that matter in production.
Why Redis for Caching?
Redis is an in-memory data store. Reads and writes are usually much faster than a relational database because Redis keeps data in memory and uses simple access patterns. For API workloads, that means Redis can sit between your application and slower systems like PostgreSQL, MongoDB, third-party APIs, or expensive computation.
Redis is useful because it gives you:
- Sub-millisecond reads for hot data
- Built-in TTL support for automatic expiration
- Atomic operations for counters, locks, and rate limits
- Flexible data structures like strings, hashes, sets, lists, and sorted sets
- Simple deployment options for standalone, replicated, or clustered setups
But Redis is not magic. A cache is a copy of data, and copies can become wrong. Good Redis architecture starts with knowing which correctness tradeoff your endpoint can tolerate.
When Should You Cache?
Do not cache just because an endpoint exists. Cache when there is a measurable reason.
Good candidates:
- Expensive database queries
- Repeated reads of the same data
- Public or mostly stable resources
- Personalized dashboards with tolerable freshness windows
- Third-party API responses with rate limits or high latency
- Computed summaries, leaderboards, feeds, and recommendations
Poor candidates:
- Highly volatile data that must always be exact
- Security-sensitive authorization decisions without careful invalidation
- Low-traffic endpoints where cache complexity gives little value
- Data that is cheap to fetch and rarely requested
A simple rule: cache data when the cost of occasionally serving slightly old data is lower than the cost of repeatedly recomputing or refetching it.
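A rough back-of-the-envelope model can make that rule measurable. The sketch below assumes a cache-aside read where a hit costs one cache lookup and a miss costs the cache lookup plus the source query. The function name and the example latencies are illustrative assumptions, not measurements.

```typescript
// Expected read latency for a cache-aside endpoint.
// Assumption: hit = one cache lookup; miss = cache lookup + source query.
function expectedLatencyMs(hitRatio: number, cacheMs: number, sourceMs: number): number {
  if (hitRatio < 0 || hitRatio > 1) {
    throw new RangeError("hitRatio must be between 0 and 1");
  }
  // hit * cache + (1 - hit) * (cache + source) simplifies to:
  return cacheMs + (1 - hitRatio) * sourceMs;
}

// Hypothetical numbers: 1 ms cache lookup, 50 ms database query.
const mostlyHits = expectedLatencyMs(0.75, 1, 50);
const mostlyMisses = expectedLatencyMs(0.25, 1, 50);
console.log({ mostlyHits, mostlyMisses });
```

If the projected hit ratio is low, the cache adds a lookup to most requests without removing much database work, which is usually a sign the endpoint is a poor candidate.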
Cache-Aside Pattern
Cache-aside is the most common Redis pattern for APIs. The application controls the cache directly.
The flow is:
- Read from Redis first
- If Redis has the value, return it
- If Redis misses, read from the database
- Store the database result in Redis with a TTL
- Return the response
```typescript
async function getUserProfile(userId: string) {
  const key = `user:${userId}:profile`;

  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  const profile = await db.user.findUnique({
    where: { id: userId },
  });
  if (!profile) {
    return null;
  }

  await redis.set(key, JSON.stringify(profile), {
    EX: 300,
  });
  return profile;
}
```

Cache-aside is easy to reason about because Redis is only used on reads. The database remains the source of truth.
The weakness is that the first request after a miss still pays the full database cost. For most APIs, that is acceptable. For very hot keys, you may need pre-warming or request coalescing.
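Request coalescing can be sketched in a few lines. This is a minimal in-process version, assuming a single Node.js process. `singleFlight` and `inFlight` are hypothetical names, not part of any Redis client.

```typescript
// In-process request coalescing ("single-flight").
// Concurrent callers asking for the same key share one in-flight promise.
const inFlight = new Map<string, Promise<unknown>>();

async function singleFlight<T>(key: string, load: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) {
    return existing as Promise<T>;
  }
  const promise = load().finally(() => {
    // Clear the entry on success and on failure, so errors are never reused.
    inFlight.delete(key);
  });
  inFlight.set(key, promise);
  return promise;
}
```

A cache miss handler could wrap its database read in `singleFlight(key, ...)` so that ten simultaneous misses produce one query. This only coalesces requests inside one process. Coalescing across several API instances needs shared coordination such as a Redis lock.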
Write-Through and Write-Behind
Cache-aside is read-focused. Write-through and write-behind are write-focused.
In write-through caching, every write goes to the database and cache together:
```typescript
async function updateUserProfile(userId: string, input: UpdateProfileInput) {
  const profile = await db.user.update({
    where: { id: userId },
    data: input,
  });

  await redis.set(`user:${userId}:profile`, JSON.stringify(profile), {
    EX: 300,
  });
  return profile;
}
```

This keeps the cache warm after writes, but it adds complexity. If the database write succeeds and the Redis write fails, your system needs a fallback plan. Usually that fallback is simple: delete the cache key and allow the next read to repopulate it.
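One way to express that fallback is sketched here. `CacheClient` is an assumed minimal interface shaped after the `set` and `del` calls used in this article, and `writeThrough` is a hypothetical helper, not a library function.

```typescript
// Assumed minimal cache interface, modeled on the set/del calls used in this article.
interface CacheClient {
  set(key: string, value: string, options: { EX: number }): Promise<unknown>;
  del(key: string): Promise<unknown>;
}

// Write-through with a fallback: if refreshing the cache fails after the
// database write has committed, try to delete the key so readers do not
// keep seeing the old value until the TTL expires.
async function writeThrough<T>(
  cache: CacheClient,
  key: string,
  ttlSeconds: number,
  persist: () => Promise<T>,
): Promise<T> {
  const saved = await persist(); // the database stays the source of truth
  try {
    await cache.set(key, JSON.stringify(saved), { EX: ttlSeconds });
  } catch (setError) {
    console.error("Cache refresh failed, deleting key instead", { key, setError });
    try {
      await cache.del(key);
    } catch (delError) {
      // Both operations failed: the stale value lives until its TTL runs out.
      console.error("Cache delete failed as well", { key, delError });
    }
  }
  return saved;
}
```

The write still succeeds from the caller's point of view when Redis is unavailable. In the worst case the old value stays visible for at most one TTL, which is another reason to keep TTLs bounded.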
Write-behind is different. The application writes to Redis first, then persists to the database later through a background process. This can improve write speed, but it is risky for normal API data because Redis becomes part of the durability path. Use it carefully, usually for analytics, counters, or buffered events rather than critical business records.
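A buffered counter is the kind of data where write-behind tends to be acceptable. The sketch below assumes a store with `incrBy` and `getDel` methods, named after the Redis INCRBY and GETDEL commands. The `CounterStore` interface, the key layout, and both function names are hypothetical.

```typescript
// Assumed minimal counter store; method names mirror Redis INCRBY and GETDEL.
interface CounterStore {
  incrBy(key: string, amount: number): Promise<number>;
  getDel(key: string): Promise<string | null>;
}

const pendingKey = (articleId: string) => `article:${articleId}:pending_views`;

// Hot path: only touches the cache, never the database.
async function recordView(store: CounterStore, articleId: string): Promise<void> {
  await store.incrBy(pendingKey(articleId), 1);
}

// Background job: read and clear the buffer in one step, then persist the delta.
async function flushViews(
  store: CounterStore,
  articleId: string,
  persist: (delta: number) => Promise<void>,
): Promise<number> {
  const raw = await store.getDel(pendingKey(articleId));
  const delta = raw === null ? 0 : Number(raw);
  if (!Number.isFinite(delta) || delta === 0) return 0;
  try {
    await persist(delta);
  } catch (error) {
    // Best-effort restore so the buffered views are not dropped on a database error.
    await store.incrBy(pendingKey(articleId), delta);
    throw error;
  }
  return delta;
}
```

If the process crashes between `getDel` and `persist`, the buffered views are lost. That window is the durability risk described above, and it is why this shape suits view counts better than orders or payments.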
TTL Strategy: Freshness Is a Product Decision
Different endpoints need different freshness windows:
- Static reference data: 30 minutes to 24 hours
- User dashboards: 1 to 5 minutes
- Feed pages: 30 seconds to 5 minutes
- Public article data: 10 minutes to 1 hour
- Critical metrics: 5 to 60 seconds
- Authorization or billing state: avoid caching, or cache for very short windows
TTL is not only a technical setting. It defines how stale your product is allowed to be.
Use short TTLs when:
- Data changes frequently
- Users expect immediate updates
- Incorrect data can cause support issues
- The endpoint affects money, access, or permissions
Use longer TTLs when:
- Data changes rarely
- Users tolerate eventual consistency
- The endpoint is expensive and high traffic
- The data is public or shared across many users
Avoid setting the same TTL everywhere. A global 300-second cache policy is easy to set up, but it is rarely correct for every endpoint.
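One lightweight way to keep TTLs endpoint-specific is a named policy table instead of scattered magic numbers. The policy names and values below are hypothetical picks from the ranges listed earlier, not recommendations for every product.

```typescript
// Hypothetical per-endpoint TTL policy, in seconds.
const TTL_SECONDS = {
  referenceData: 60 * 60,  // static reference data: 1 hour
  userDashboard: 2 * 60,   // user dashboards: 2 minutes
  feedPage: 60,            // feed pages: 1 minute
  publicArticle: 30 * 60,  // public article data: 30 minutes
  criticalMetrics: 15,     // critical metrics: 15 seconds
} as const;

type CachePolicy = keyof typeof TTL_SECONDS;

// Call sites name a policy instead of hard-coding a number.
function ttlFor(policy: CachePolicy): number {
  return TTL_SECONDS[policy];
}
```

A call site then reads like `{ EX: ttlFor("feedPage") }`, and changing a freshness window becomes a one-line, reviewable edit.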
Invalidation Rules
Invalidation is where caching becomes real engineering.
TTL-based expiration is passive. It waits for time to pass. Invalidation is active. It removes or updates cached data when the source data changes.
For write-heavy domains, do not rely on TTL alone. If a user updates their profile, they expect the new profile to appear immediately. Waiting five minutes because the cache has not expired is a bad experience.
Common invalidation rules:
- On write/update: invalidate related keys
- On delete: remove dependent cache entries
- On permission change: invalidate authorization-sensitive data
- For aggregate data: use namespace versioning
```typescript
async function invalidateUser(userId: string) {
  await redis.del(`user:${userId}:profile`);
  await redis.del(`user:${userId}:stats`);
  await redis.del(`user:${userId}:settings`);
}
```

The tricky part is aggregate data. Imagine a user updates their display name. You may need to invalidate:
- user:{id}:profile
- team:{teamId}:members
- feed:{viewerId}:page:1
- search:users:{query}
If you cannot reliably list every dependent key, use versioned namespaces.
```typescript
async function getTeamMembers(teamId: string) {
  // A missing version key counts as "0", so the first INCR (which yields 1)
  // actually moves reads to a new namespace.
  const version = (await redis.get(`team:${teamId}:members:version`)) ?? "0";
  const key = `team:${teamId}:members:v${version}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const members = await db.teamMember.findMany({
    where: { teamId },
  });
  await redis.set(key, JSON.stringify(members), { EX: 600 });
  return members;
}

async function invalidateTeamMembers(teamId: string) {
  await redis.incr(`team:${teamId}:members:version`);
}
```

Instead of deleting every old key, you bump a version. New reads use the new namespace. Old keys naturally expire through TTL.
Avoiding Cache Stampede
When a hot key expires, many requests can miss at the same moment and all fall through to the database.
A cache stampede happens when many requests try to rebuild the same missing key at the same time. This can overwhelm your database exactly when the cache was supposed to protect it.
Techniques that help:
- Add jitter to TTL values
- Use request coalescing/single-flight
- Pre-warm critical keys in background jobs
- Use soft expiration for hot data
- Lock cache regeneration with short-lived Redis locks
TTL jitter prevents many keys from expiring at the same instant.
```typescript
function ttlWithJitter(baseSeconds: number, jitterSeconds = 60) {
  return baseSeconds + Math.floor(Math.random() * jitterSeconds);
}

await redis.set(key, JSON.stringify(data), {
  EX: ttlWithJitter(300, 45),
});
```

For hot keys, you can use a short lock so only one request rebuilds the cache while others wait briefly or serve stale data.
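```typescript
async function getCachedReport(reportId: string) {
  const key = `report:${reportId}`;
  const lockKey = `${key}:lock`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  // SET NX succeeds for exactly one caller; EX bounds how long a crashed holder can block others.
  const lock = await redis.set(lockKey, "1", {
    NX: true,
    EX: 10,
  });

  if (!lock) {
    // Another request is rebuilding: wait briefly, then check the cache again.
    await new Promise((resolve) => setTimeout(resolve, 100));
    const retry = await redis.get(key);
    if (retry) return JSON.parse(retry);
    // Still missing: fall through and rebuild without the lock rather than fail.
  }

  try {
    const report = await buildExpensiveReport(reportId);
    await redis.set(key, JSON.stringify(report), { EX: 300 });
    return report;
  } finally {
    // Release only a lock this request acquired, never another request's lock.
    if (lock) await redis.del(lockKey);
  }
}
```

This pattern should be used sparingly. Locks add moving parts. If a rebuild outlives the 10-second lock, the final delete can still remove a lock that a later request acquired; storing a unique token in the lock and checking it before deleting closes that gap. Start with TTL jitter and only add locks for truly hot paths.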
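Soft expiration, listed among the techniques above, can be sketched as an envelope around the cached value. The idea is to store a "fresh until" timestamp inside the value and give the Redis key a longer hard TTL, so a stale value is still available to serve while one request refreshes it. The type and function names here are hypothetical.

```typescript
// Envelope stored in the cache: the payload plus the time after which it counts as stale.
interface SoftEntry<T> {
  value: T;
  freshUntil: number; // epoch milliseconds
}

type SoftRead<T> =
  | { state: "miss" }
  | { state: "fresh"; value: T }
  | { state: "stale"; value: T };

// Serialize a value together with its soft expiry.
function wrapSoft<T>(value: T, softTtlSeconds: number, now = Date.now()): string {
  const entry: SoftEntry<T> = { value, freshUntil: now + softTtlSeconds * 1000 };
  return JSON.stringify(entry);
}

// Classify what came back from the cache.
function readSoft<T>(raw: string | null, now = Date.now()): SoftRead<T> {
  if (raw === null) return { state: "miss" };
  const entry = JSON.parse(raw) as SoftEntry<T>;
  return now < entry.freshUntil
    ? { state: "fresh", value: entry.value }
    : { state: "stale", value: entry.value };
}
```

A caller might write `wrapSoft(report, 300)` with `EX: 900`, return the value immediately on `"stale"`, and trigger a background rebuild guarded by a lock or single-flight. Users never wait on the rebuild, at the cost of serving data up to the hard TTL old if refreshes keep failing.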
Recommended Key Design
Keep keys predictable and namespaced:
- user:{id}:profile
- tenant:{tenantId}:plan
- feed:{userId}:page:{n}
- article:{slug}:summary
- repo:{owner}:{name}:stats
Good naming avoids collisions and simplifies invalidation.
Use consistent conventions:
- Put the entity first: user:123:profile
- Include tenant IDs in multi-tenant systems: tenant:abc:user:123:settings
- Include versions for aggregates: team:42:members:v3
- Include pagination parameters: feed:123:page:1:limit:20
- Avoid storing raw JSON query strings as keys without normalization
Bad keys create operational pain. If you cannot guess what a key stores by reading it, invalidation and debugging will be harder.
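The normalization point deserves a concrete example, because two requests with the same parameters in a different order should hit the same cache entry. This is a minimal sketch. `searchCacheKey` is a hypothetical helper, and lowercasing and trimming values is an assumption that only holds for case-insensitive queries.

```typescript
// Builds a deterministic cache key from a namespace and query parameters.
// Assumption: values are case-insensitive, so they are trimmed and lowercased.
function searchCacheKey(
  namespace: string,
  params: Record<string, string | number | undefined>,
): string {
  const parts = Object.keys(params)
    .filter((name) => params[name] !== undefined) // skip absent parameters
    .sort() // parameter order must not change the key
    .map((name) => {
      const normalized = String(params[name]).trim().toLowerCase();
      return `${name}=${encodeURIComponent(normalized)}`;
    });
  return `${namespace}:${parts.join("&")}`;
}

const keyA = searchCacheKey("search:users", { q: " Ada ", limit: 20 });
const keyB = searchCacheKey("search:users", { limit: 20, q: "ada" });
console.log(keyA === keyB, keyA);
```

Both calls produce `search:users:limit=20&q=ada`, so equivalent requests share one entry instead of fragmenting the cache.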
What Should You Store?
For basic API caching, serialized JSON is fine.
```typescript
await redis.set(key, JSON.stringify(response), { EX: 300 });
```

But Redis supports richer data structures:
- Strings: JSON responses, tokens, feature flags
- Hashes: object-like data where fields update independently
- Sets: unique membership lists
- Sorted sets: leaderboards, ranked feeds, time-ordered data
- Lists/streams: lightweight queues and event flows
Do not force everything into JSON if Redis has a structure that better matches your access pattern. For example, a leaderboard is usually better as a sorted set than as one huge JSON blob.
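The leaderboard case might look like the sketch below. `SortedSetClient` is an assumed minimal interface modeled on the `zAdd` and `zRangeWithScores` methods of node-redis v4, so check your client's documentation for the exact signatures. The key layout and function names are hypothetical.

```typescript
// Assumed minimal sorted-set interface, modeled on node-redis v4 method names.
interface SortedSetClient {
  zAdd(key: string, members: { score: number; value: string }[]): Promise<number>;
  zRangeWithScores(
    key: string,
    start: number,
    stop: number,
    options: { REV: true },
  ): Promise<{ score: number; value: string }[]>;
}

// Updating one player's score touches one member, not the whole board.
async function recordScore(
  client: SortedSetClient,
  board: string,
  playerId: string,
  score: number,
): Promise<void> {
  await client.zAdd(`leaderboard:${board}`, [{ score, value: playerId }]);
}

// Reads the highest scores first; start and stop are inclusive ranks.
async function topPlayers(client: SortedSetClient, board: string, count: number) {
  if (count <= 0) return [];
  const rows = await client.zRangeWithScores(`leaderboard:${board}`, 0, count - 1, { REV: true });
  return rows.map((row, index) => ({
    rank: index + 1,
    playerId: row.value,
    score: row.score,
  }));
}
```

With a JSON blob, every score change means reading, re-sorting, and rewriting the whole document. With a sorted set, Redis keeps the ordering and the application only sends the changed member.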
Memory and Eviction Policy
Redis is fast because it uses memory. Memory is finite.
You need to understand:
- How large your values are
- How many keys you create
- How long keys live
- What Redis should do when memory is full
Common eviction policies include:
- noeviction: writes fail when memory is full
- allkeys-lru: evict least recently used keys from all keys
- volatile-lru: evict least recently used keys only among keys with TTL
- allkeys-lfu: evict least frequently used keys from all keys
For a general-purpose API cache, allkeys-lru or allkeys-lfu is often reasonable, but the best choice depends on whether Redis is only a cache or also stores durable-ish operational data like queues, locks, and sessions.
If Redis stores both cache data and important coordination data, separate them by database, instance, or cluster. Do not let a response cache evict critical locks or session state.
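Value size, key count, and key lifetime combine into a rough capacity estimate. The sketch below assumes each distinct key is written about once per TTL window, which makes the result an approximate upper bound. The per-key overhead is an input you would have to measure yourself, for example with the Redis `MEMORY USAGE` command, and all names and numbers here are illustrative.

```typescript
interface CacheFootprintInput {
  newKeysPerSecond: number;    // distinct keys written per second
  ttlSeconds: number;          // how long each key lives
  avgKeyBytes: number;
  avgValueBytes: number;
  overheadBytesPerKey: number; // assumption: measure this on your own instance
}

// Rough estimate: keys alive at once multiplied by bytes per key.
function estimateCacheBytes(input: CacheFootprintInput): number {
  const liveKeys = input.newKeysPerSecond * input.ttlSeconds;
  const bytesPerKey = input.avgKeyBytes + input.avgValueBytes + input.overheadBytesPerKey;
  return Math.ceil(liveKeys * bytesPerKey);
}

// Hypothetical workload: 50 new keys per second, 5 minute TTL, about 2 KB values.
const estimate = estimateCacheBytes({
  newKeysPerSecond: 50,
  ttlSeconds: 300,
  avgKeyBytes: 40,
  avgValueBytes: 2000,
  overheadBytesPerKey: 100,
});
console.log(`${(estimate / 1_000_000).toFixed(1)} MB`);
```

Doubling the TTL doubles the estimate, which is a useful reminder that freshness windows are also a memory decision.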
Observability Checklist
Track these metrics:
- Cache hit ratio
- Evictions
- Memory usage
- P95/P99 latency
- DB queries per request (before vs after caching)
- Redis command errors
- Hot keys
- Key count by namespace
- Average serialized value size
Also add application-level logging around cache misses for important paths. A sudden drop in hit ratio can point to a deployment bug, key naming change, broken serialization, or accidental invalidation loop.
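Application-level tracking can start very small. The following sketch keeps hit and miss counts per key namespace in process memory. `CacheStats` and `namespaceOf` are hypothetical names, and a real deployment would more likely export these counts to its metrics system than keep them in a local object.

```typescript
// Per-namespace hit/miss counters kept in process memory.
class CacheStats {
  private readonly counts = new Map<string, { hits: number; misses: number }>();

  record(namespace: string, hit: boolean): void {
    const entry = this.counts.get(namespace) ?? { hits: 0, misses: 0 };
    if (hit) {
      entry.hits += 1;
    } else {
      entry.misses += 1;
    }
    this.counts.set(namespace, entry);
  }

  // Returns null when nothing has been recorded for the namespace yet.
  hitRatio(namespace: string): number | null {
    const entry = this.counts.get(namespace);
    if (!entry) return null;
    return entry.hits / (entry.hits + entry.misses);
  }
}

// Uses the first key segment ("user" in "user:123:profile") as the namespace.
function namespaceOf(key: string): string {
  const index = key.indexOf(":");
  return index === -1 ? key : key.slice(0, index);
}
```

Calling `stats.record(namespaceOf(key), cached !== null)` after each cache read is enough to see which namespace lost its hit ratio after a deploy.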
Useful questions:
- Did the cache reduce database load?
- Did P95 latency improve?
- Are we serving stale data beyond the expected window?
- Are evictions happening under normal traffic?
- Which keys are hottest?
- Which namespaces consume the most memory?
If you cannot answer these questions, you are operating the cache blindly.
Production Checklist
Before shipping Redis caching broadly, check these items:
- Every cached endpoint has a clear freshness expectation
- Keys are namespaced and predictable
- TTLs are endpoint-specific
- Writes invalidate or update related cache entries
- Hot keys have stampede protection if needed
- Cached values have a bounded size
- Redis memory usage and evictions are monitored
- Cache failures degrade gracefully
- The database remains the source of truth
Graceful degradation matters. If Redis goes down, your entire API should not go down automatically. For many endpoints, the fallback should be reading from the database directly.
```typescript
async function safeGetJson<T>(key: string): Promise<T | null> {
  try {
    const value = await redis.get(key);
    return value ? JSON.parse(value) : null;
  } catch (error) {
    console.error("Redis read failed", { key, error });
    return null;
  }
}
```

This is not an excuse to ignore Redis failures. It is a way to keep your product alive while alerts and dashboards tell you what broke.
Final Takeaway
Redis caching works best when it is designed as part of the system, not sprinkled on top after an endpoint becomes slow. Start with cache-aside for expensive read paths. Pick TTLs based on product freshness requirements. Invalidate aggressively where users expect immediate updates. Add jitter and stampede protection for hot keys. Monitor hit ratio, memory, evictions, and database load so you can prove the cache is helping.
The best cache is boring: predictable keys, understandable rules, clear fallbacks, and metrics that tell the truth.