The .NET class ConcurrentDictionary<TKey, TValue> provides thread-safe methods to store and read objects under a unique key. This makes it a good candidate for implementing a memory-based “level 1 cache” that gives fast access to the results of long-running database requests.
When you implement such a cache with this class, be aware that methods like “GetOrAdd” really are thread-safe, but that does not mean they behave the way you might expect.
For demonstration purposes I will replace the long-running database request with a simple “Thread.Sleep(500);” followed by returning a unique value.
I have this little unit test to demonstrate the first issue with the ConcurrentDictionary:
[TestMethod]
As you can already see from the comments and asserts, the call to GetOrAdd runs the value generation for the key “Hello” twice. This is because the first result has not yet been added to the internal value collection when the second call starts. Even more disturbing to me: the second call provides a delegate that returns “Hello2”, and that delegate is executed, but the call still returns “Hello1” (even though “Hello2” is the “newer” value).
It depends on the timing:
| Time | What happens |
| --- | --- |
| 0 ms | We start a thread that takes 500 ms to generate a value for “Hello”. Once generated, that value is to be added to the internal collection of the ConcurrentDictionary. |
| 100 ms | While the first value is still being generated, we request the value for the key “Hello” again. This looks up the key in the internal collection, finds nothing, and starts the value creation a second time. |
| 500 ms | The value from the first call, running in the additionally started thread, is added to the internal collection. |
| 600 ms | The second call finishes generating its value and tries to insert it into the internal collection, but the value from the first call is already there. In this case it does NOT update the value inside the ConcurrentDictionary; it returns the value from the internal collection instead (“Hello1”). |
In this case, generating the second value was a total waste of time. If the second call finishes before the first one, both calls return the second value instead. Which value becomes “the one” depends on when each call starts (it has to start before a value for the key has been added) and when it ends.
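The ordering in the timeline can be sketched as a small console program. This is a minimal sketch and not the original unit test: the class name `GetOrAddRaceDemo`, the call counter, and the two events are assumptions made for illustration. The events stand in for the `Thread.Sleep(500)` delays so that the interleaving does not depend on the scheduler. The sketch assumes C# 8 or later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; not part of the original post.
public static class GetOrAddRaceDemo
{
    public static (int FactoryCalls, string First, string Second) Run()
    {
        var cache = new ConcurrentDictionary<string, string>();
        int factoryCalls = 0;
        using var firstStarted = new ManualResetEventSlim(false);
        using var secondStarted = new ManualResetEventSlim(false);

        // "0 ms": the first call starts generating its value on another thread.
        Task<string> firstCall = Task.Run(() => cache.GetOrAdd("Hello", _ =>
        {
            Interlocked.Increment(ref factoryCalls);
            firstStarted.Set();
            // Stay "slow" until the second factory is running as well.
            secondStarted.Wait();
            return "Hello1";
        }));

        // "100 ms": the second call arrives while the first value is still being generated.
        firstStarted.Wait();
        string second = cache.GetOrAdd("Hello", _ =>
        {
            Interlocked.Increment(ref factoryCalls);
            secondStarted.Set();
            // "500 ms" / "600 ms": finish only after the first call has stored its value.
            firstCall.Wait();
            return "Hello2";
        });

        return (factoryCalls, firstCall.Result, second);
    }

    public static void Main()
    {
        var (calls, first, second) = Run();
        // prints "factory calls: 2, first: Hello1, second: Hello1"
        Console.WriteLine($"factory calls: {calls}, first: {first}, second: {second}");
    }
}
```

Both factories run, yet the value produced by the second one is discarded. Blocking inside the factories does not deadlock here because GetOrAdd invokes the value factory outside the dictionary's internal locks.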
If you use the ConcurrentDictionary as simple storage for a server-side cache, you might get into trouble when your application starts up and receives multiple requests that need potentially cached information. They will all start at nearly the same time, so requesting a page 10 times a second that needs 2 seconds to load some “normally cached” data will cause 20 concurrent queries for that data, which is probably not what you want. You should carefully design and test your caching. There might be an open source caching library that already implements what you need, so you don’t risk wasting that amount of CPU time and IO load.
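One common way to avoid the duplicate work is to store `Lazy<TValue>` wrappers in the dictionary instead of the values themselves. The following is a minimal sketch under that assumption, not something the test above uses; `LazyCache` and `LazyCacheDemo` are hypothetical names chosen for illustration. Racing callers may still each create a cheap `Lazy` wrapper, but only the wrapper that ends up in the dictionary ever runs the expensive factory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper; not part of the original post.
public sealed class LazyCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, Lazy<TValue>> _entries =
        new ConcurrentDictionary<TKey, Lazy<TValue>>();

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> valueFactory)
    {
        // Only one Lazy per key is stored; creating a losing wrapper costs almost nothing.
        Lazy<TValue> lazy = _entries.GetOrAdd(key, k => new Lazy<TValue>(
            () => valueFactory(k), LazyThreadSafetyMode.ExecutionAndPublication));

        // The stored Lazy runs its factory once; other callers block until it is done.
        return lazy.Value;
    }
}

public static class LazyCacheDemo
{
    public static (int FactoryCalls, int DistinctResults) Run()
    {
        var cache = new LazyCache<string, string>();
        int factoryCalls = 0;

        // 20 concurrent requests, as in the "10 requests per second for 2 seconds" example.
        var tasks = new Task<string>[20];
        for (int i = 0; i < tasks.Length; i++)
        {
            tasks[i] = Task.Run(() => cache.GetOrAdd("Hello", _ =>
            {
                int n = Interlocked.Increment(ref factoryCalls);
                Thread.Sleep(500); // stand-in for the slow database request
                return "Hello" + n;
            }));
        }

        string[] results = Task.WhenAll(tasks).GetAwaiter().GetResult();
        return (factoryCalls, new HashSet<string>(results).Count);
    }

    public static void Main()
    {
        var (calls, distinct) = Run();
        // prints "factory calls: 1, distinct results: 1"
        Console.WriteLine($"factory calls: {calls}, distinct results: {distinct}");
    }
}
```

One trade-off to keep in mind: with `LazyThreadSafetyMode.ExecutionAndPublication`, an exception thrown by the factory is cached in the `Lazy` and rethrown to later callers until the entry is removed, so a real cache would need a strategy for failed loads.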
I’ve talked about the “first issue” with using this class in a self-implemented cache, so what’s the “second issue”? The second issue is security-related, and I will post about it in a week or two (depending on my other workload).
Posted by svenerikmatzen