You are doing it REALLY wrong, the shortest code review ever
We got a question from a customer that included the following code:
This was quite unrelated to the question being asked, but it was enough to trigger all sorts of alarms. Just based on this two lines, and the fact that we can assume that _session is a RavenDB session, we can tell quite a lot about the codebase.
To start with, it is broken, probably badly so.
Let us try to break down why. The fact that you are locking on the session means that you access it concurrently. And that is utterly forbidden. The session is not meant to be used concurrently, and Strange Stuff will happen if you try to.
Now, the session doesn’t have thread affinity, which means that if you are calling it serially, you can certainly move it between threads. In fact, with async sessions, that happens quite often. So one would expect that locking on the session would provide sufficient protection for the session from concurrent use.
However, nothing could be further from the truth. Look at the output of this method. It returns User instances, which are used outside the lock. However, if two threads will call this method concurrently, they will each run in turn, taking and releasing the lock as required. Which is great, but it is very likely that they will get the same User instance back (remember, the session implements Identity Map and ensure that within the scope of the session, the same document is always represented by the same object instance). That is when things start going boom, boom, BOOM!
Even if you are locking on the session, you aren’t locking on your entities, and that means that you have concurrent access to them. I haven’t actually reviewed the entity in question, but I’m 99.999999% certain that no attempt was made to make them safe for multi threading and 100% certain that it there was such an attempt, it would fail horribly in some manner quite easily.
Entities are not meant for concurrent usage, and even trying to reason about them effectively is going to be near impossible.
A session is meant to be a scope of work, in fact, look up the Unit of Work pattern, which is exactly how you are expected to use the session.
Comments
So what is the appropriate way to Query (read-only) your database (No UOW/Session required)?
John Doe, Why not use the session? That is what it is there for, and it makes things much easier. You can use it as a one off, for firing a single query. I have issues with that, but much less than the pattern above.
Appropriate way to query is using unit of work pattern to read or read and update items. Don't over do on repository pattern.
Some of bad way of use session I have seen was following.
The only reason that the session is locked is to prevent threading errors if you have code that can be unintentionally called from two threads. It isn't there in order to try and lock object instances either. If you have a large code base then it's possible you may have a few small scenarios where this could happen. The lock is there for robustness.
We don't intentionally share our sessions between threads. We don't have multiple instances of the same object in different threads either. By default then we create a new session in each thread and only process objects of a specific category in that thread. Object X is never processed simultaneously in two separate threads.
You may also have methods where there is code that needs to serialize access to other objects, not the session object. In this case then we lock an object that is common in all of the classes instead of creating a separate object and lock that. We decorate all methods in the class with the locking for robustness.
If you provide your code libraries to third parties then you have no real control over how they use it. Again the lock is there for robustness.
John Doe 2, Nope, the mere fact that this is possible or of concern to you indicates a huge architectural problem. The lock is an indication that someone thought that this is possible, which indicate a whole level of issues that needs to be addressed first. It is like saying, "watch out about the loose, hungry, tiger in the nursery" and not doing anything to fix it.
Session management is supposed to flow the unit of work pattern, with very clear boundaries on its usage. There should be no case in which this is even possible, certainly not of conern.
When providing code to others, you need to be clear about your expectations on usage.
Code isn't generally written to target a specific database product. Therefore your architecture shouldn't be designed with regards to the limitations of specific products. In theory then I should be able to have a repository instance and share it between multiple threads as long as it's agreed that repository instances are thread safe.
Your comments appear to assume that every unit of work can be accomplished using one thread. I may need to have multiple threads in order to complete a unit of work that has a 'big' workload (Not necessarily lots of DB work). Therefore the repository (and session) may need to be shared between threads.
Also, if your code is reusable by third parties then you have no real control how the code will be reused. You can provide them with guidelines but you can't always control whether they will be followed. For example, You have no control over whether they'll share a repository instance between threads.
John Doe 2, Pretty much all databases that I'm aware of have strict rules about the concurrency of session / connection / etc. A repository instance can be safely assumed to be non thread safe across a whole range of technologies.
In fact, given thread safeties issues in general, assuming that a repository of any kind is thread safe is a really bad default, as is any assumption that code needs to be multi threaded safe by default. In fact, this is the other way around, code is single threaded by default and you need to have explicit documentation that it is safe to use from multiple threads. A big workload that is used on multiple threads is a sign of bad architecture. If you need a unit of work that compose multiple threads, you need to have the architecture to support it. At a glance, I would do:
As for "no control over whether they'll share a repository". You also have no control over whatever they do error handling, and yet I assume you still use exceptions? If a user insist that they never call the
new
operator, will you design a system around that?Being thread unsafe by default is not something weird or extra ordinary. This is the natural order of things, and users who make baseless assumptions gets exactly what they are deemed worth.
I'm well aware of the rules about concurrency of session/connection etc. I've been a software developer for over 22 years (including being a Senior Developer & Technical Architect) and I've used database products heavily during this time.
We don't 'assume that a repository of any kind' is thread safe. We made an architectural decision in our particular product that we would make our repositories thread safe so that, in the event that there is rare code in our product that uses it from another thread, then it doesn't crash. In our system then robustness is important.
No, a big workload that is necessary on multiple threads is not a sign of bad architecture. As I stated previously, a big workload doesn't necessarily mean lots of DB work. (Even if did mean lots of DB work then it's perfectly legitimate when you have a data heavy system.) It could mean a process that takes a relatively long time to complete. For example, You have a method that accepts a file name and a single repository reference, uploads the file to 10 servers as quickly as possible and then updates a flag in Raven to indicate that the process is complete for the particular server. In this instance then it's perfectly legitimate to spawn a number of threads to do the work in order to speed up the process. Each thread may need to query the DB occasionally and so it will need to have access to the repository.
Your recommendation that each thread uses their own repository may not be practical. The method caller passed in a reference to a single repository. It shouldn't have to pass in a factory reference for creating repositories. The caller doesn't care that the method uses multi-threading. Also creating a new repository can be relatively slow in some database products. For example, In SQL database then creating a new database connection can be slow if you're not using connection pooling. Alternatively you may need to limit resources and not have one DB connection per thread.
Being thread-safe is a perfectly legitimate architectural decision in systems that require robustness or have methods that use multi-threading where the threads need to access a DB. (For the record, I quite agree that if each thread is doing lots of DB work then using separate repositories would ideally be best.)
That is a mistake. I know that this may sound harsh, but is is a big mistake. Literally no other product on the market has session / connections that are safe for multi threading. Your entities are not likely to be thread safe and the whole thing is going to require you to take upon yourself a lot of really complex stuff for what is likely to be marginal benefit. A proper architecture would give you the benefit of multi core scaling without the requirement to handle thread safety issues in this manner. Again, note that there is a big difference between thread safe and scaling up with parallelism via concurrent instances.
That is leaving aside the fact that you are trying to achieve higher performance but to ensure that the system operates you have to slap locks on multiple places that is going to be a perf killer.
That kind of system isn't robust. See: https://ayende.com/blog/180385/pr-review-encapsulation-stops-at-the-assembly-boundary
This is trying to hide (a very serious) issue by limping along. For that matter, this kind of thinking exposes you to Hyrum’s Law: “With a sufficient number of users of an interface, it doesn’t matter what you promised in the interface contracts, all observable behaviors of your class or function or whatnot will be depended upon by somebody.”
That leads you from an outright error to having to try to support it, because you have sufficent number of users who assume that this is safe, and you can't make it safe (at least not with good performance) in the general case.
You want it to crash. Crashing is a godsend in these cases. What you are describing is a race condition, and the last thing you want it to do is to continue running. Because that way you have madness. You have concurrently executing code that is doing strange stuff and violates a lot of assumptions in your domain in very fundemental manners.
Yes, absolutely agree with you up to the last two works. It isn't _the repository_. It is _the repositories_. A repository is stateful, and each thread should have one for the scope of the operation.
You want to debug this kind of code? Because I have done many mutli threaded debugging, and it is harsh. With RavenDB 4.0 we have made multiple passes to reduce this kind of sharing as an explicit goal to make debugging and operations easier.
Fix the method, then. You'll regret it otherwise.
Let us take your example and show the issue: "For example, You have a method that accepts a file name and a single repository reference, uploads the file to 10 servers as quickly as possible and then updates a flag in Raven to indicate that the process is complete for the particular server.".
We'll assume that you have code similar to this:
There are a few things wrong here. At first glance, this code may look proper, but it has multiple race conditions. You can get a record about the file upload before it is finished uploading or even if it failed to upload. The problem is apparent if you are using explicit locks, but what about this code? Now "properly" separated.
// in the repository
At a glance, this looks fine, proper locking, etc. But you are potentially mutating the same file record concurrently, that can lead to error, hangs, etc. If the
AddServer
method modify a dictionary, you can get into an infinite loop quite easily, for example, if there is a race.This is code that is easy to write, easy to read and really easy to miss the error if you don't consider it. It can also run, in production, for years on end, without any trouble. And then one day the entire world will come crashing down on your program because of something like this. The most recent example is here: https://ayende.com/blog/180769/production-postmortem-the-random-high-cpu
That particular codepath hasn't changed in literally years, it is deployed to tens of thousands or higher of nodes and gave us no trouble for years. To trigger it, an operation that should take 1 second should take more than 30 _minutes_. And somehow, it was triggered. For that matter, it was triggered several times in a particular environment.
Yes, it is. In the right place. A repository is not that place. You are taking an extremely high cost upon yourself with this kind of attempt, for very few benefits and opening a whole huge can of worms.
Comment preview