Concurrent Max
Can you think of a better way to implement this code?
private volatile Guid lastEtag; private readonly object lastEtagLocker = new object();internal void UpdateLastWrittenEtag(Guid? etag) { if (etag == null) return; var newEtag = etag.Value.ToByteArray(); // not the most recent etag if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0) { return; } lock (lastEtagLocker) { // not the most recent etag if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0) { return; } lastEtag = etag.Value; } }
We have multiple threads calling this function, and we need to ensure that lastEtag value is always the maximum value. This has the potential to be called often, so I want to make sure that I chose the best way to do this. Ideas?
Comments
You could store lastEtag as a Byte[] rather than a Guid which would save some conversions.
I could, yes. We are actually using the etag far more often as a Guid rather than a byte array elsewhere.
Depending on the read/write ratio. Off of your last paragraph I might guess that it is quite balanced, maybe even rending toward more writes than reads, then you might move the burden to the read side. i.e. - writes will add to a small (non-locking concurrent) collection of "potentially last etag" values, and reads will get the latest, and clear the collection.
You might even take advantage of a non-locking concurrent sorted-list (is there anything like that available?) , simply adding to the list on write, and getting the last on read. I'm not sure it would be very efficient though.
Change the type of lastEtag to Box<Guid> (reftype). Then use interlocked operations to atomically swap out the value in a retry loop:
while(true) { Box<Guid> lastEtagLocal = lastEtag; if(newer) { var newEtagBox = ...; if(Interlocked.CmpXchg(ref lastEtag, newEtagBox, lastEtagLocal) == lastEtagLocal) break; } }
pseudo-code.
PS: The preview has wrong code style. PSS: "An error occurred while___ posing___ your comment"
I concur with the notion of optimizing for dominant read or writes . Otherwise the SpinWait class introduced in .Net 4 is helpful for a lock free update of a field:
nternal void UpdateLastWrittenEtag (Guid? etag)
{
if(etag == null) return;
var spinWait = new SpinWait();
while (true)
{
var snapshot1 = lastEtag;
var snapshot2 = Interlocked.CompareExchange (ref lastEtag, etag, snapshot1);
if (snapshot1 == snapshot2) return;
spinWait.SpinOnce();
}
}
I meant Box[Guid]. Why are you stripping html tags? That is a bad practice. Encoding is simpler, more reliable and more user friendly.
I recommend you use this lib for BBCode in comments: http://bbcode.codeplex.com/ It is very robust, easy to use and fast.
You can use a CAS scheme like the following; however, it generates quiet a lot of garbage if writes are frequent, which may make the lock approach more adequate.
I believe I found a bug in Racoon Blogs so called "whitelisting" of html (it is really blacklisting because you are deleting disallowed stuff instead of adding allowed stuff): You can get around the img-regex by typing "[[img src="javascript:..."]] with [ replaced by left angle bracket. I did not test this.
God do I hate the practice of blacklisting. There is always something missing. It almost never works.
The only thing I see that may significantly improve performance is related to possible use of the "unsafe" keyword and to do comparison inline (instead of calling Buffers.Compare and ToByteArray methods that might be too expensive overhead) using pointers arithmetic directly in your method. Also if you know some additional info about your GUIDs, like for example you know that they are always generated on same PC you may optimize algorithm more, when do compare first for most significant GUID part that changed over time in your specific environment (i.e. question here where and how GUIDs are generated?) etc.
Ayende,
Why not use a jQuery markdown editor, then people might actually use markdown: http://markitup.jaysalvat.com/home/
Also, the current editor is a headache to use on some devices; I'm writing this comment on an iPhone and I think your preview script is making it extremely slow to type.
Sorry about the double post, maybe I should refrain from using my phone to post comments!
You will have to use a reference type field (either Holder<Guid> or just byte[]) - the current code is not thread-safe!
"volatile" just ensures the threads see the latest value; it doesn't ensure the access is atomic. Thus, is the current lastEtag is 00000000-0000-0000-FFFF-FFFFFFFFFFFF, and one thread is updating that to 00000000-0000-0001-0000-000000000000, then another thread might read 00000000-0000-0001-FFFF-FFFFFFFFFFFF, a value much larger than the maximum. Now if that second thread was trying to set the value to 00000000-0000-0001-0000-000000000001, it won't do that due to the incorrect read, and your maximum will be incorrect.
Tobi, You need to solve the "only replace if newer" problem, though.
Ayende, I did. You can exit the loop if your guid is not newer than the existing. In that case you just return.
Basically, the trick is that you can atomically swap any amount of data by stuffing it into a ref type. Then swap the reference instead all of the data.
Why do you need to compare? Can you ever get a less recent eTag? Just always overwrite.
Ayende, just checked and seems that Guid.ToByteArray is not thread safe so you could get some weird array of bytes
@petar - Can you describe why Guid.ToByteArray is not thread safe?
Guid appears to be an immutable struct, so unless you're doing some whacky reflection to change the private values, I don't see how this can produce invalid results.
Kevin, you can just overwrite a struct variable which will gradually overwrite the fields. The memory locations are not immutable.
@Kevin look at http://msdn.microsoft.com/en-us/library/system.guid.aspx it says that instance members are not thread safe
Thanks tobi & petar. I was able to repro the issue, confirming the comment from Daniel.
Zvolkov, What happen if two threads tries to update at the same time?
Petar, There is no way ToByteArray isn't safe. Put simply, Guid is an immutable data structure, since ToByeArray is working based on immutable data structure, it is safe to use in multiple threads.
@Ayende, jup - you are right. I should not rely on MS documentation so much, I guess.
Ayende, ToByteArray is not safe if invoked on a variable that is concurrently mutated. Struct variables can be overwritten which akes the memory locations of their readonly fields change.
Tobi, Good point, but not a scenario that is going to happen the way I am doing this.
Ayende, you are reading and writing directly from/to the field. I believe this is going to happen. The problem goes away when you implement the interlocked.cmpxchg loop that was proposed above because the ref type field will be accesses atomically.
Tobi, The code as posted cannot even compile (you can't make volatile Guid). I am using a reference to this via a pointer, that is atomic
How is your Buffers.Compare implemented? Perhaps you can calulate a value from it which fits into an integer which would make the comparison cheaper. Another option is to make lastEtag threadlocal and check against your threadlocal value. If the new value is bigger only then you need to take a lock and update the global value and your thread local value. This way your threads operate most of the time on their own data which is much more data cache friendly. If you try hard to avoid CPU cache misses (code and data) you can get speedups up to a factor 20 if your tight loop fits with its data entirely into the L1 (32 KB on modern CPUs) and L2 (4-8 MB) cache.
Yours, Alois Kraus
private volatile BigInteger lastEtag;
internal void UpdateLastWrittenEtag(Guid? etagGuid) { if (etagGuid == null) return; var etag = new BigInteger(etagGuid.ToByteArray()); if (etag < lastEtag) return;
}
How about storing additionally hash of a Guid. Getting Hash from newEtag should be faster than creating an array, and, if the Guid is well-hashable not creating so much collisions. If check would go to firstly check the hash, next guid bytes, next try replace if needed.
Comment preview