Introduction
As the internet has grown faster, demand for web applications has grown with it. Web users have little patience: they won't wait more than a few seconds for a page to load, and even a delay of milliseconds can hurt user satisfaction, which is a leading factor in any business. To meet these expectations, web developers try to minimize repeated database queries and slow I/O operations, which are among the biggest application performance bottlenecks. Large enterprise-level applications, not just .NET applications, all face this challenge: when thousands of concurrent users request the same data, the database quickly becomes a performance choke point. The result? Slower response times, reduced throughput, and higher operational costs.
This is where caching comes in to save the day. Caching temporarily stores frequently accessed data in memory or in a fast cache component like Redis. That reduces repetitive data retrieval and dramatically boosts application responsiveness. Among the available caching solutions, Redis stands out as a powerful choice for its speed, flexibility, and scalability.
In this article, we’ll explore how developers can leverage in-memory and distributed caching to build scalable, high-performance systems. You’ll learn where caching fits in a system’s architecture, how to implement it in .NET, and which strategies to adopt for optimal performance in real-world scenarios.
Understanding caching in a software system
What is caching?
Caching is the process of temporarily storing frequently accessed data in a storage medium that is faster than the database, normally memory, so that future requests for the same data can be served quickly without calling the database.
Instead of repeatedly fetching information from a slower source such as a database or an external API, our application fetches it from the cache, which is much faster. Even though this is a simple concept, it can have a huge impact on performance. Think of it like remembering an answer you've already learned: rather than re-reading the entire textbook every time, you recall it instantly from memory. Similarly, a cache helps your system "remember" data it has already retrieved.
In .NET applications, caching comes in several forms:
- In-memory caching using IMemoryCache
- Distributed caching with technologies like Redis, which store data outside the application process so it is accessible across multiple servers
Why Caching Improves Scalability
Scalability isn’t just about adding more and more servers. It’s about handling increased load efficiently. That’s where caching directly contributes to scalability by reducing redundant work and optimizing data retrieval paths. Here’s how that’s happening:
- Reduces Database Load: Cached data means fewer database queries, minimizing read-heavy traffic and freeing up database resources for more critical operations.
- Improves Response Time: Retrieving data from memory takes microseconds, compared to milliseconds or seconds when fetching from a remote database or API.
- Supports High Concurrency: By offloading repeated reads to a cache, the system can serve thousands of users simultaneously without overwhelming the backend and database.
- Lowers Infrastructure Costs: Less database activity and fewer compute-related operations mean you can scale horizontally without a proportional increase in costs.
- Enhances User Experience: Faster response times create smoother, more responsive user interactions, which directly translates to higher satisfaction and retention.
Where caching can be applied in a system
Caching isn’t limited to a single layer of our application. A well-designed architecture applies caching at multiple layers that each address specific performance challenges. Here’s a breakdown of where caching can be applied in .NET applications or any other applications:
1. Client-Side (Browser/App)
- Static assets (HTML, CSS, JS), API responses
- Browser cache, Service Workers, LocalStorage
- Reduce server calls and improve page load speed
2. CDN / Edge Layer
- Images, videos, static content, API responses
- Cloudflare, AWS CloudFront, Azure CDN
- Deliver content faster from servers near users
3. Application Layer
- Frequently accessed or computed data
- IMemoryCache, IDistributedCache, Redis
- Reduce database hits and speed up logic execution
4. Database Layer
- Query results, lookups, static tables
- Redis, Memcached, EF Core 2nd-Level Cache
- Minimize expensive queries
5. External Service Layer
- Third-party API responses, tokens
- Redis, API Gateway Cache
- Reduce dependency latency and API costs
In this article, however, we will mainly focus on application-layer caching.
Caching in .NET systems
.NET provides powerful built-in caching support through two primary mechanisms: in-memory caching and distributed caching. The main difference lies in where and how the cached data is stored.
Let’s talk about each approach and see how they can be implemented in real-world .NET applications.
1. In-memory caching
In-memory caching stores data directly in the memory (RAM) of the application's server process. It's the fastest form of caching since it avoids network calls and external dependencies; the application simply reads from its own server's memory. In .NET, this is done through the IMemoryCache interface provided by the Microsoft.Extensions.Caching.Memory namespace.
This method is ideal for small-scale or single-instance applications where high-speed data retrieval is needed and the data doesn't have to be shared across multiple servers. That sharing limitation is one we have to accept: each instance keeps its own separate cache, so it doesn't suit load-balanced or multi-instance cloud environments, and if the server goes down, the cached data is lost with it.
Example: Using IMemoryCache in a .NET Service
Let’s assume you’re building a product API, and you want to cache product details for 5 minutes after the first retrieval.
using Microsoft.Extensions.Caching.Memory;

public class ProductService
{
    private readonly IMemoryCache _cache;
    private readonly ProductRepository _repository;

    public ProductService(IMemoryCache cache, ProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public Product GetProductById(int id)
    {
        // Try to get from cache
        if (_cache.TryGetValue($"product_{id}", out Product cachedProduct))
        {
            return cachedProduct;
        }

        // If not found, fetch from database
        var product = _repository.GetProductById(id);

        // Store in cache for 5 minutes
        _cache.Set($"product_{id}", product, TimeSpan.FromMinutes(5));

        return product;
    }
}
Then we need to register the cache service in Program.cs like this:
builder.Services.AddMemoryCache();
2. Distributed caching
This stores data in an external cache that can be shared among multiple servers or instances. This approach is critical for cloud-native or load-balanced applications, where requests can be handled by any node in a cluster. Even though the cache is external, it's still much faster than a database query.
In .NET, this is achieved through the IDistributedCache interface, which supports different providers such as:
- Redis (most common)
- SQL Server
- NCache
- Custom providers
Among these, Redis is the most popular due to its speed, scalability, and rich feature set.
Example: Using Redis as a Distributed Cache in .NET
First, install the Redis caching package using this command:
dotnet add package Microsoft.Extensions.Caching.StackExchangeRedis
Then configure the Redis instance in Program.cs:
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost:6379"; // Replace with your Redis connection string
    options.InstanceName = "MyApp_";
});
Use IDistributedCache in Your Service
using Microsoft.Extensions.Caching.Distributed;
using System.Text.Json;

public class ProductService
{
    private readonly IDistributedCache _cache;
    private readonly ProductRepository _repository;

    public ProductService(IDistributedCache cache, ProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task<Product> GetProductByIdAsync(int id)
    {
        string cacheKey = $"product_{id}";

        // Try to get cached data
        var cachedData = await _cache.GetStringAsync(cacheKey);
        if (cachedData != null)
        {
            return JsonSerializer.Deserialize<Product>(cachedData)!;
        }

        // Fetch from DB if not in cache
        var product = await _repository.GetProductByIdAsync(id);

        // Cache the data
        var options = new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5)
        };

        await _cache.SetStringAsync(
            cacheKey,
            JsonSerializer.Serialize(product),
            options);

        return product;
    }
}
With distributed caching, we can share data across multiple app instances, and the data persists even if one app node restarts or goes down, which makes it ideal for distributed systems and cloud deployments. It also scales well, since Redis supports clustering and replication. However, it is slightly slower than in-memory caching (network call overhead), requires additional infrastructure (Redis or SQL Server), and forces you to handle serialization and network failures.
Common Caching Patterns in .NET Applications
Caching isn’t just about where you store data. It’s about how and when you store and invalidate it. Over the past few years, several caching patterns have been introduced that help .NET developers design high performance, consistent, and scalable cache behavior.
Let’s explore the most widely used caching patterns in .NET one by one.
1. Cache-Aside Pattern (Lazy Loading)
The Cache-Aside (or Lazy Loading) pattern is the most common approach in .NET applications. The application first checks the cache before hitting the database; if the requested data isn't found, it's fetched from the data source, stored in the cache, and then returned to the client.
This pattern is very simple to implement, and expiration is easy to manage. Only data that is actually needed ends up in the cache, not everything. There are disadvantages as well: the first request after cache expiration still hits the database, and there is a risk of a "cache stampede" if many users request the same missing data simultaneously.
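One common mitigation for the stampede problem is to let only one caller rebuild a missing entry while the rest wait and re-check. Below is a minimal sketch built on the ProductService from earlier; the class name and the single shared SemaphoreSlim are illustrative choices, not part of the original example.

using Microsoft.Extensions.Caching.Memory;

public class StampedeSafeProductService
{
    // One shared lock for all keys: simple, but unrelated keys will queue behind each other.
    private static readonly SemaphoreSlim _rebuildLock = new(1, 1);
    private readonly IMemoryCache _cache;
    private readonly ProductRepository _repository;

    public StampedeSafeProductService(IMemoryCache cache, ProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task<Product> GetProductByIdAsync(int id)
    {
        string cacheKey = $"product_{id}";
        if (_cache.TryGetValue(cacheKey, out Product cachedProduct))
        {
            return cachedProduct;
        }

        await _rebuildLock.WaitAsync();
        try
        {
            // Re-check: another caller may have repopulated the entry while we waited.
            if (_cache.TryGetValue(cacheKey, out cachedProduct))
            {
                return cachedProduct;
            }

            var product = await _repository.GetProductByIdAsync(id);
            _cache.Set(cacheKey, product, TimeSpan.FromMinutes(5));
            return product;
        }
        finally
        {
            _rebuildLock.Release();
        }
    }
}

In a real system you'd typically hold one lock per key (for example, a ConcurrentDictionary of semaphores) so that a miss on one product doesn't block rebuilds of another.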
2. Write-Through / Write-Behind
These patterns control how updates flow between the cache and the underlying data store.
Write-Through
Every time data is created or updated, the application writes to both the cache and the database. This keeps the cache consistent with the database, so read operations always return fresh data. The trade-offs: write latency is slightly higher (two writes per operation), and data that is never read may still end up in the cache.
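As a rough illustration, here's a minimal write-through sketch; the UpdateProductAsync repository method and the Product.Id property are assumed for this example and don't appear in the earlier code.

using Microsoft.Extensions.Caching.Distributed;
using System.Text.Json;

public class ProductWriteService
{
    private readonly IDistributedCache _cache;
    private readonly ProductRepository _repository;

    public ProductWriteService(IDistributedCache cache, ProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
    }

    public async Task UpdateProductAsync(Product product)
    {
        // 1. Write to the database first; it remains the source of truth.
        await _repository.UpdateProductAsync(product); // assumed repository method

        // 2. Write the same data to the cache so reads stay fresh.
        await _cache.SetStringAsync(
            $"product_{product.Id}",
            JsonSerializer.Serialize(product),
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5)
            });
    }
}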
Write-Behind (Write-Back)
In this approach, data is first updated in the cache and later saved to the database asynchronously through a background job or message queue. Write operations become faster, since database writes happen separately, and the load on the database drops under heavy write traffic. But there is a real risk of data loss if the cache fails before the data is synced to the database, and it's more complex to implement since you need a queue or a background processor.
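To make the flow concrete, here's a minimal in-process sketch that uses System.Threading.Channels as the queue; the class name is illustrative, a production system would use a durable message queue instead, and (as warned above) pending updates are lost if the process dies before the background loop flushes them.

using Microsoft.Extensions.Caching.Distributed;
using System.Text.Json;
using System.Threading.Channels;

public class WriteBehindProductService
{
    // In-process queue of pending database writes (stand-in for a real message queue).
    private readonly Channel<Product> _pendingWrites = Channel.CreateUnbounded<Product>();
    private readonly IDistributedCache _cache;
    private readonly ProductRepository _repository;

    public WriteBehindProductService(IDistributedCache cache, ProductRepository repository)
    {
        _cache = cache;
        _repository = repository;
        _ = Task.Run(FlushLoopAsync); // background writer
    }

    public async Task UpdateProductAsync(Product product)
    {
        // 1. Update the cache immediately; this is what readers will see.
        await _cache.SetStringAsync(
            $"product_{product.Id}", JsonSerializer.Serialize(product));

        // 2. Enqueue the database write to happen later.
        await _pendingWrites.Writer.WriteAsync(product);
    }

    private async Task FlushLoopAsync()
    {
        // Drain the queue and persist each update to the database.
        await foreach (var product in _pendingWrites.Reader.ReadAllAsync())
        {
            await _repository.UpdateProductAsync(product); // assumed repository method
        }
    }
}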
3. Hybrid Caching
Hybrid caching combines in-memory caching (for extra-fast local reads) with distributed caching (for shared consistency across servers). For example, we can use IMemoryCache for per-instance short-term lookups and Redis as the shared distributed cache across multiple app instances. This is best for high-traffic applications that require fast reads with a consistent shared state (e.g., e-commerce sites).
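A minimal sketch of the idea, reusing the Product types from earlier (the class name and the short 30-second local TTL are illustrative choices): check the local cache first, then Redis, then the database, populating both layers on the way back.

using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;
using System.Text.Json;

public class HybridProductCache
{
    private readonly IMemoryCache _localCache;        // L1: per-instance, fastest
    private readonly IDistributedCache _sharedCache;  // L2: shared via Redis
    private readonly ProductRepository _repository;

    public HybridProductCache(
        IMemoryCache localCache, IDistributedCache sharedCache, ProductRepository repository)
    {
        _localCache = localCache;
        _sharedCache = sharedCache;
        _repository = repository;
    }

    public async Task<Product> GetProductByIdAsync(int id)
    {
        string key = $"product_{id}";

        // 1. Local in-memory cache: no network hop at all.
        if (_localCache.TryGetValue(key, out Product local))
        {
            return local;
        }

        // 2. Shared Redis cache: one network hop, still far cheaper than the DB.
        var json = await _sharedCache.GetStringAsync(key);
        if (json != null)
        {
            var product = JsonSerializer.Deserialize<Product>(json)!;
            // Keep the local copy short-lived so instances don't drift apart for long.
            _localCache.Set(key, product, TimeSpan.FromSeconds(30));
            return product;
        }

        // 3. Database: populate both cache layers on the way back.
        var fresh = await _repository.GetProductByIdAsync(id);
        await _sharedCache.SetStringAsync(key, JsonSerializer.Serialize(fresh));
        _localCache.Set(key, fresh, TimeSpan.FromSeconds(30));
        return fresh;
    }
}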
4. Output and Response Caching
Output Caching stores the entire response of a controller action or API endpoint. That way, the next request can be served instantly without re-executing any logic or database calls.
In .NET 8 Minimal APIs / MVC
app.MapGet("/products", async (IProductService service) =>
{
    return await service.GetAllProductsAsync();
}).CacheOutput(policy => policy.Expire(TimeSpan.FromMinutes(5)));
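For CacheOutput to take effect, the output caching middleware must also be registered and enabled in Program.cs:

builder.Services.AddOutputCache();

var app = builder.Build();
app.UseOutputCache();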
This massively improves performance for read-heavy endpoints and reduces CPU and I/O load, but it isn't suitable for highly dynamic or user-specific data.
Handling Cache–Database Inconsistency [Important]
Caching can improve performance, but it also introduces several challenges, like data inconsistency. This happens when the cache and the original database get out of sync. For example, the database gets updated, but the cache still holds the old data. If not handled properly, users might see stale or incorrect information, which can cause business or UX problems.
Let’s look at why this happens and how to prevent it.
The Problem: Stale Cache
Just think about this flow:
- A product’s price is stored in Redis:
Cache["Product_101"] = $49.99 - An admin updates the price in the database to
$59.99. - But the cache isn’t updated yet. So users continue to see
$49.99until the cache expires.
This stale data can persist for minutes or hours, depending on the expiration policy you've configured. In high-traffic systems, that can be a serious problem.
Why Inconsistencies Happen
There are several ways cache–database mismatches occur:
- Cache not invalidated on DB write: You update the DB but forget to remove or update the cache key.
- Race conditions: The cache is refreshed while the DB is being updated.
- Cache update failures: The cache write fails after the DB commit (e.g., Redis temporarily unavailable).
- Async write delays (Write-Behind): Updates are batched, and the database lags behind the cache.
Common Solutions
There’s no one-size-fits-all fix for this. The right strategy depends on your use case and consistency requirements. Here are the main patterns used in production systems.
1. Cache Invalidation on Write
The simplest and most common solution is to remove the cache entry immediately after updating the database. That way, the next read triggers a cache miss and repopulates the cache with fresh data. There is a slight performance hit, since the first read after the update is always a cache miss.
This approach is great for read-heavy systems with infrequent writes.
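In code, it's just two steps. A minimal sketch inside the distributed-cache ProductService from earlier, assuming a hypothetical UpdateProductAsync method on the repository:

public async Task UpdateProductAsync(Product product)
{
    // 1. Update the source of truth first.
    await _repository.UpdateProductAsync(product); // assumed repository method

    // 2. Remove the stale entry; the next read repopulates it with fresh data.
    await _cache.RemoveAsync($"product_{product.Id}");
}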
2. Cache Update After Write (Write-Through Pattern)
Instead of deleting the old cache entry, update both the database and the cache together (see the Write-Through sketch earlier).
3. Event-Driven Invalidation (Using Pub/Sub)
In distributed systems with multiple app servers, simply removing the cache locally isn’t enough. You need all instances to remove the same key. That’s where Redis Pub/Sub or change notifications help. When one instance updates data, it publishes an event that others can listen to and invalidate their local caches.
This approach keeps all distributed caches in sync and is event-driven and scalable, but it's slightly more complex to implement and requires messaging infrastructure.
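Here's a minimal sketch using the StackExchange.Redis package directly; the CacheInvalidationBus class and the channel name are illustrative, and it assumes an IConnectionMultiplexer is registered as a singleton in DI.

using Microsoft.Extensions.Caching.Memory;
using StackExchange.Redis;

public class CacheInvalidationBus
{
    private static readonly RedisChannel InvalidationChannel =
        RedisChannel.Literal("cache-invalidation");

    private readonly ISubscriber _subscriber;
    private readonly IMemoryCache _localCache;

    public CacheInvalidationBus(IConnectionMultiplexer redis, IMemoryCache localCache)
    {
        _subscriber = redis.GetSubscriber();
        _localCache = localCache;

        // Every app instance listens and evicts its own local copy of the key.
        _subscriber.Subscribe(InvalidationChannel,
            (channel, key) => _localCache.Remove(key.ToString()));
    }

    // Call this after a successful database write.
    public Task PublishInvalidationAsync(string cacheKey) =>
        _subscriber.PublishAsync(InvalidationChannel, cacheKey);
}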
4. Time-to-Live (TTL) as a Safety Net
Even with manual invalidation, always set a TTL (time-to-live) for cache entries. That way, even if invalidation fails, stale data won’t live forever.
5. Versioned Cache Keys (a Clean Alternative)
Instead of manually deleting keys, use version tokens or namespaced keys that change whenever data changes.
For example:
string version = await _cache.GetStringAsync("ProductVersion_101") ?? "v1";
string cacheKey = $"{version}:Product_101";
When a product updates:
await _cache.SetStringAsync("ProductVersion_101", $"v{Guid.NewGuid()}");
This automatically makes old cache entries obsolete, so there's no need to delete them.
Best Practices
Caching is powerful, but it can easily introduce complexity or bugs if we misuse it. So, follow these best practices to build a reliable, maintainable caching layer.
1. Choose the Right Expiration Policy
- Absolute Expiration: Fixed lifetime for all entries.
- Sliding Expiration: Extends the lifetime each time the entry is accessed.
Use sliding expiration for frequently requested data and absolute for static reference data.
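Both policies in code, as a minimal sketch (it assumes an IMemoryCache instance _cache and the data variables shown; the key names are illustrative):

using Microsoft.Extensions.Caching.Memory;

// Absolute expiration: the entry is evicted 30 minutes after creation, no matter what.
_cache.Set("countries_list", countries, new MemoryCacheEntryOptions
{
    AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(30)
});

// Sliding expiration: every read gives the entry another 5 minutes to live.
_cache.Set($"user_profile_{userId}", profile, new MemoryCacheEntryOptions
{
    SlidingExpiration = TimeSpan.FromMinutes(5)
});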
2. Handle Cache Invalidation Carefully
“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton
Make sure to invalidate or refresh the cache when the database data changes. Use event-based invalidation (e.g., publish-subscribe via Redis) in multi-server setups.
3. Avoid Over-Caching
We don’t need everything in our cache. Only extremely volatile data, sensitive user-specific data, and very large objects that can cause memory pressure should be stored in cache.
4. Use Namespacing or Versioned Cache Keys
Add a prefix or version tag to cache keys:
string cacheKey = $"v1:User_{userId}";
This helps you easily invalidate or migrate cache entries when the schema changes.
5. Monitor Cache Performance
You need to track the cache hit/miss ratio, eviction rate, and memory usage. Use tools like RedisInsight, Azure Cache Metrics, or Application Insights to visualize performance.
6. Combine Caching with Async Patterns
Always use async cache APIs (GetStringAsync, SetStringAsync) to prevent thread blocking in high-load environments.
7. Use a Multi-Layered Strategy
Combine browser caching for static content, CDN caching for global edge delivery, and app-layer caching for dynamic data to ensure both speed and resilience.
8. Implement Graceful Cache Fallbacks
Always have a fallback plan:
var data = await _cache.GetStringAsync(key);
if (data == null)
{
    data = await FetchFromDbAsync();
    await _cache.SetStringAsync(key, data);
}
This guarantees your app keeps functioning when the cache is empty. To keep it functioning during cache downtime as well, wrap the cache calls themselves in a fallback, as sketched below.
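A sketch of that hardening; in production you'd catch the provider's specific exceptions (e.g., RedisConnectionException) and log them rather than swallowing a blanket Exception.

string? data;
try
{
    // If the cache server is unreachable, treat the failure as a cache miss.
    data = await _cache.GetStringAsync(key);
}
catch (Exception)
{
    data = null;
}

if (data == null)
{
    data = await FetchFromDbAsync();
    try
    {
        await _cache.SetStringAsync(key, data);
    }
    catch (Exception)
    {
        // Cache write failed; serve the data anyway and let a later request repopulate.
    }
}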
Conclusion
Caching isn’t just an optimization. It’s a scalability enabler. When implemented strategically using patterns like Cache-Aside or Write-Through, and backed by a robust system like Redis, .NET applications can handle massive loads with minimal latency.
By following these patterns and best practices, you’re not just speeding up your app, you’re future-proofing it for scale, stability, and cost efficiency.
See you in the next blog post. Bye Bye🍻🍸❤️❤️