The cost of locking


I have a cache (used by a web application) which internally uses two caches: a short-term cache, which is only used within a single request, and a long-term cache, which is used "permanently" (across requests).
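For reference, the snippets below use a few members whose declarations are not shown in the post. One possible shape is sketched here purely as an assumption inferred from usage; the concrete types, the string key, and the IsValid signature are guesses, not something the post states:

    // Hypothetical declarations, inferred from how Get uses them.
    private readonly IDictionary<string, TCache> shortTermCache;  // per-request cache, some thread-safe implementation
    private readonly IDictionary<string, TCache> longTermCache;   // cross-request cache, some thread-safe implementation
    private readonly object _lockObj = new object();

    // Assumed meaning: true if the cached item is still up to date with the
    // dependency; reports the dependency's last-modified timestamp.
    private bool IsValid(CacheDependency dependency, TCache item, out DateTime lastModified)
    {
        throw new NotImplementedException(); // not shown in the post
    }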

I have the following code; note that all underlying data structures are thread-safe.

    public TCache Get(CacheDependency cachdeDependancy, Func<CacheDependency, TCache> cacheItemCreatorFunc)
    {
        TCache cacheItem;

        // 1. Per-request cache hit?
        if (shortTermCache.TryGetValue(cachdeDependancy.Id, out cacheItem))
        {
            return cacheItem;
        }

        // 2. Long-term cache hit that is still valid for this dependency?
        DateTime cacheDependancyLastModified;
        if (longTermCache.TryGetValue(cachdeDependancy.Id, out cacheItem)
            && IsValid(cachdeDependancy, cacheItem, out cacheDependancyLastModified))
        {
            cacheItem.CacheTime = cacheDependancyLastModified;
            shortTermCache[cachdeDependancy.Id] = cacheItem;
            return cacheItem;
        }

        // 3. Miss (or stale): create the item and store it in both caches.
        cacheItem = cacheItemCreatorFunc(cachdeDependancy);
        longTermCache.Add(cachdeDependancy.Id, cacheItem);
        shortTermCache[cachdeDependancy.Id] = cacheItem;
        return cacheItem;
    }

Obviously it's still possible (probably even likely) that the code above won't behave consistently when running concurrently (i.e. under multiple web requests). However, I wrote some unit tests, and what I saw is that an "exception" never occurs. What can happen is that the same item is created and added again even though it's already there, and so on; I think you can see what I mean when you look at the code.
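For illustration, a stress test along the following lines can surface that window: several threads can all miss the long-term cache before any of them adds the item, so the factory runs more than once for the same Id. This is a minimal sketch; the CacheDependency construction, the CreateExpensiveItem factory, and the cache instance are made up, and whether a given run actually hits the race is timing-dependent:

    // using System; using System.Threading; using System.Threading.Tasks;
    // Hypothetical stress test for the duplicate-creation race described above.
    int creations = 0;
    var dependency = new CacheDependency { Id = "report-42" };    // made-up dependency

    Parallel.For(0, 1000, _ =>
    {
        cache.Get(dependency, d =>
        {
            Interlocked.Increment(ref creations);    // count how often the factory runs
            return CreateExpensiveItem(d);           // made-up, slow item factory
        });
    });

    // Ideally 1; with the unlocked version above it can be greater than 1.
    Console.WriteLine(creations);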

Still, I thought it would be nice to have a solution which is always correct and consistent.

So I rewrote this code using a simple double-checked locking mechanism (maybe this could be improved further by adding a second lock for the other cache?):

    public TCache Get(CacheDependency cachdeDependancy, Func<CacheDependency, TCache> cacheItemCreatorFunc)
    {
        TCache cacheItem;

        // First check: lock-free fast path for the per-request cache.
        if (shortTermCache.TryGetValue(cachdeDependancy.Id, out cacheItem))
        {
            return cacheItem;
        }

        lock (_lockObj)
        {
            // Second check: another thread may have filled the cache while we
            // were waiting for the lock.
            if (shortTermCache.TryGetValue(cachdeDependancy.Id, out cacheItem))
            {
                return cacheItem;
            }

            DateTime cacheDependancyLastModified;
            if (longTermCache.TryGetValue(cachdeDependancy.Id, out cacheItem)
                && IsValid(cachdeDependancy, cacheItem, out cacheDependancyLastModified))
            {
                cacheItem.CacheTime = cacheDependancyLastModified;
                shortTermCache[cachdeDependancy.Id] = cacheItem;
                return cacheItem;
            }

            cacheItem = cacheItemCreatorFunc(cachdeDependancy);
            longTermCache.Add(cachdeDependancy.Id, cacheItem);
            shortTermCache[cachdeDependancy.Id] = cacheItem;
            return cacheItem;
        }
    }

I think this code now works correctly in multi-threaded environments.

However, what I'm not sure about is: won't this be terribly slow and therefore largely defeat the purpose of the cache? Would it maybe be better to live with the problem that the cache can sometimes behave "inconsistently"? Because if there are 1000 web requests at the same time, they all have to wait until they can enter the locked region. Or is this not really a problem at all, because a CPU only has a certain number of cores (and therefore "real" parallel threads), so this performance penalty will always be minor?

Accepted answer


If you use a ConcurrentDictionary, you already have a way to do what you want - you can simply use the GetOrAdd method:

    shortTermCache[cacheDependency.Id] =
        longTermCache.GetOrAdd(cacheDependency.Id, _ => cacheItemCreatorFunc(cacheDependency));

Quick and easy :)

You can even expand this to include the short-term cache check:

    return shortTermCache.GetOrAdd(
        cacheDependency.Id,
        _ => longTermCache.GetOrAdd(cacheDependency.Id, __ => cacheItemCreatorFunc(cacheDependency)));

Though it's kind of unnecessary to use a ConcurrentDictionary for a per-request cache - it doesn't really have to be thread-safe.
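Putting both snippets together with that last point, the whole method could collapse to something like the sketch below. It assumes longTermCache is a ConcurrentDictionary<string, TCache> and shortTermCache is a plain per-request Dictionary<string, TCache> (one thread per request), and it drops the IsValid staleness check from the original code, just as the snippets above do:

    // Sketch only; the types, the string key, and the omitted validity check are assumptions.
    public TCache Get(CacheDependency cacheDependency, Func<CacheDependency, TCache> cacheItemCreatorFunc)
    {
        TCache cacheItem;

        // Per-request cache: plain Dictionary, no locking needed when a single
        // request is served by a single thread.
        if (shortTermCache.TryGetValue(cacheDependency.Id, out cacheItem))
        {
            return cacheItem;
        }

        // Cross-request cache: GetOrAdd is atomic with respect to which value
        // ends up stored for the key.
        cacheItem = longTermCache.GetOrAdd(cacheDependency.Id, _ => cacheItemCreatorFunc(cacheDependency));

        shortTermCache[cacheDependency.Id] = cacheItem;
        return cacheItem;
    }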

As for your original code, yes, it is broken. The fact that you didn't see that during your testing isn't too surprising: multi-threading issues are often hard to reproduce. That's why you want the code to be correct first and foremost, and that means you have to understand exactly what is going on and what kinds of concurrency issues can possibly happen. In your case, there are two shared references: longTermCache and the cacheItem itself. Even if all the objects you're working with are thread-safe, you have no guarantee that your code is thread-safe as well; in your case, there might be contention over cacheItem (how thread-safe is that?), or someone might have added the same cache item in the meantime.

How exactly this breaks depends heavily on the actual implementations - for example, Add might throw an exception if an item with the same Id is already present, or it might not. Your code might expect all of the cache items to be the same reference, or it might not. cacheItemCreatorFunc might have horrible side-effects or be expensive to run, or it might not.
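One detail of ConcurrentDictionary.GetOrAdd is relevant here: the value factory runs outside the dictionary's internal locks, so under contention cacheItemCreatorFunc can still be invoked more than once for the same key, even though only one result ends up stored. If the factory is expensive or has side effects, a common mitigation (not part of this answer, just a well-known pattern) is to cache Lazy<TCache> values so the factory body runs at most once per key:

    // using System; using System.Collections.Concurrent; using System.Threading;
    // Sketch of the ConcurrentDictionary + Lazy<T> pattern; the string key and
    // the member names here are assumptions.
    private readonly ConcurrentDictionary<string, Lazy<TCache>> longTermCache =
        new ConcurrentDictionary<string, Lazy<TCache>>();

    public TCache GetOrCreate(string id, Func<TCache> create)
    {
        // GetOrAdd may build several Lazy wrappers under contention, but
        // ExecutionAndPublication guarantees 'create' itself executes only once;
        // every caller then observes the same published value.
        return longTermCache
            .GetOrAdd(id, _ => new Lazy<TCache>(create, LazyThreadSafetyMode.ExecutionAndPublication))
            .Value;
    }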

Your update with the added lock does fix those issues. However, it doesn't address the way you're leaking references to cacheItem all over the place, for example. Unless cacheItem is perfectly thread-safe as well, you could run into some hard-to-track bugs. And we already know it isn't immutable either: at the very least, you're changing the cache time.
