Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 5,949 | Comments: 44,548

filter by tags archive

Concurrent Max


Can you think of a better way to implement this code?

private volatile Guid lastEtag;
private readonly object lastEtagLocker = new object();
internal void UpdateLastWrittenEtag(Guid? etag)
{
    if (etag == null)
        return;

    var newEtag = etag.Value.ToByteArray();

    // not the most recent etag
    if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
    {
        return;
    }

    lock (lastEtagLocker)
    {
        // not the most recent etag
        if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
        {
            return;
        }

        lastEtag = etag.Value;
    }
}

We have multiple threads calling this function, and we need to ensure that lastEtag value is always the maximum value. This has the potential to be called often, so I want to make sure that I chose the best way to do this. Ideas?


Comments

Will

You could store lastEtag as a Byte[] rather than a Guid which would save some conversions.

Ayende Rahien

I could, yes. We are actually using the etag far more often as a Guid rather than a byte array elsewhere.

Ken Egozi

Depending on the read/write ratio. Off of your last paragraph I might guess that it is quite balanced, maybe even rending toward more writes than reads, then you might move the burden to the read side. i.e. - writes will add to a small (non-locking concurrent) collection of "potentially last etag" values, and reads will get the latest, and clear the collection.

You might even take advantage of a non-locking concurrent sorted-list (is there anything like that available?) , simply adding to the list on write, and getting the last on read. I'm not sure it would be very efficient though.

tobi

Change the type of lastEtag to Box (reftype). Then use interlocked operations to atomically swap out the value in a retry loop:

while(true) { Box lastEtagLocal = lastEtag; if(newer) { var newEtagBox = ...; if(Interlocked.CmpXchg(ref lastEtag, newEtagBox, lastEtagLocal) == lastEtagLocal) break; } }

pseudo-code.

PS: The preview has wrong code style. PSS: "An error occurred while___ posing___ your comment"

oskark

I concur with the notion of optimizing for dominant read or writes . Otherwise the SpinWait class introduced in .Net 4 is helpful for a lock free update of a field:

nternal void UpdateLastWrittenEtag (Guid? etag)

{

if(etag == null) return;

var spinWait = new SpinWait();

while (true)

{

var snapshot1 = lastEtag;

var snapshot2 = Interlocked.CompareExchange (ref lastEtag, etag, snapshot1);

if (snapshot1 == snapshot2) return;

spinWait.SpinOnce();

}

}

tobi

I meant Box[Guid]. Why are you stripping html tags? That is a bad practice. Encoding is simpler, more reliable and more user friendly.

I recommend you use this lib for BBCode in comments: http://bbcode.codeplex.com/ It is very robust, easy to use and fast.

Duarte Nunes

You can use a CAS scheme like the following; however, it generates quiet a lot of garbage if writes are frequent, which may make the lock approach more adequate.

internal class Holder<T> 
{  
    internal T Value;
}

private volatile Holder<Guid> lastEtag;

internal void UpdateLastWrittenEtag(Guid? etag)
{
    if (etag == null || etag.HasValue == false)
        return;

    var newEtag = etag.Value.ToByteArray();
    Holder<Guid> newHolder;
    SpinWait sw = new SpinWait();

    do 
    {
         var last = lastEtag;

         if (Buffers.Compare(last.Value.ToByteArray(), newEtag) <= 0)
         return;

         if (newHolder == null) 
         newHolder = new Holder<Guid> { Value = etag.Value };

         if (Interlocked.CompareExchange(ref lastEtag, newHolder, last) == last) 
         return;

         sw.SpinOnce();         
    } while (true);
}
tobi

I believe I found a bug in Racoon Blogs so called "whitelisting" of html (it is really blacklisting because you are deleting disallowed stuff instead of adding allowed stuff): You can get around the img-regex by typing "[[img src="javascript:..."]] with [ replaced by left angle bracket. I did not test this.

God do I hate the practice of blacklisting. There is always something missing. It almost never works.

EvereQ

The only thing I see that may significantly improve performance is related to possible use of the "unsafe" keyword and to do comparison inline (instead of calling Buffers.Compare and ToByteArray methods that might be too expensive overhead) using pointers arithmetic directly in your method. Also if you know some additional info about your GUIDs, like for example you know that they are always generated on same PC you may optimize algorithm more, when do compare first for most significant GUID part that changed over time in your specific environment (i.e. question here where and how GUIDs are generated?) etc.

Simon Bartlett

Ayende,

Why not use a jQuery markdown editor, then people might actually use markdown: http://markitup.jaysalvat.com/home/

Also, the current editor is a headache to use on some devices; I'm writing this comment on an iPhone and I think your preview script is making it extremely slow to type.

Simon Bartlett

Sorry about the double post, maybe I should refrain from using my phone to post comments!

Daniel Grunwald

You will have to use a reference type field (either Holder or just byte[]) - the current code is not thread-safe!

"volatile" just ensures the threads see the latest value; it doesn't ensure the access is atomic. Thus, is the current lastEtag is 00000000-0000-0000-FFFF-FFFFFFFFFFFF, and one thread is updating that to 00000000-0000-0001-0000-000000000000, then another thread might read 00000000-0000-0001-FFFF-FFFFFFFFFFFF, a value much larger than the maximum. Now if that second thread was trying to set the value to 00000000-0000-0001-0000-000000000001, it won't do that due to the incorrect read, and your maximum will be incorrect.

Ayende Rahien

Tobi, You need to solve the "only replace if newer" problem, though.

tobi

Ayende, I did. You can exit the loop if your guid is not newer than the existing. In that case you just return.

tobi

Basically, the trick is that you can atomically swap any amount of data by stuffing it into a ref type. Then swap the reference instead all of the data.

zvolkov

Why do you need to compare? Can you ever get a less recent eTag? Just always overwrite.

petar repac

Ayende, just checked and seems that Guid.ToByteArray is not thread safe so you could get some weird array of bytes

Kevin Pullin

@petar - Can you describe why Guid.ToByteArray is not thread safe?

Guid appears to be an immutable struct, so unless you're doing some whacky reflection to change the private values, I don't see how this can produce invalid results.

tobi

Kevin, you can just overwrite a struct variable which will gradually overwrite the fields. The memory locations are not immutable.

petar repac

@Kevin look at http://msdn.microsoft.com/en-us/library/system.guid.aspx it says that instance members are not thread safe

Kevin Pullin

Thanks tobi & petar. I was able to repro the issue, confirming the comment from Daniel.

Ayende Rahien

Zvolkov, What happen if two threads tries to update at the same time?

Ayende Rahien

Petar, There is no way ToByteArray isn't safe. Put simply, Guid is an immutable data structure, since ToByeArray is working based on immutable data structure, it is safe to use in multiple threads.

petar repac

@Ayende, jup - you are right. I should not rely on MS documentation so much, I guess.

tobi

Ayende, ToByteArray is not safe if invoked on a variable that is concurrently mutated. Struct variables can be overwritten which akes the memory locations of their readonly fields change.

Ayende Rahien

Tobi, Good point, but not a scenario that is going to happen the way I am doing this.

tobi

Ayende, you are reading and writing directly from/to the field. I believe this is going to happen. The problem goes away when you implement the interlocked.cmpxchg loop that was proposed above because the ref type field will be accesses atomically.

Ayende Rahien

Tobi, The code as posted cannot even compile (you can't make volatile Guid). I am using a reference to this via a pointer, that is atomic

Alois Kraus

How is your Buffers.Compare implemented? Perhaps you can calulate a value from it which fits into an integer which would make the comparison cheaper. Another option is to make lastEtag threadlocal and check against your threadlocal value. If the new value is bigger only then you need to take a lock and update the global value and your thread local value. This way your threads operate most of the time on their own data which is much more data cache friendly. If you try hard to avoid CPU cache misses (code and data) you can get speedups up to a factor 20 if your tight loop fits with its data entirely into the L1 (32 KB on modern CPUs) and L2 (4-8 MB) cache.

Yours, Alois Kraus

Mark Rendle

private volatile BigInteger lastEtag;

internal void UpdateLastWrittenEtag(Guid? etagGuid) { if (etagGuid == null) return; var etag = new BigInteger(etagGuid.ToByteArray()); if (etag < lastEtag) return;

lock (_sync)
{
    lastEtag = BigInteger.Max(etag, lastEtag);
}

}

Scooletz

How about storing additionally hash of a Guid. Getting Hash from newEtag should be faster than creating an array, and, if the Guid is well-hashable not creating so much collisions. If check would go to firstly check the hash, next guid bytes, next try replace if needed.

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. The RavenDB Comic Strip (3):
    28 May 2015 - Part III – High availability & sleeping soundly
  2. Special Offer (2):
    27 May 2015 - 29% discount for all our products
  3. RavenDB Sharding (3):
    22 May 2015 - Adding a new shard to an existing cluster, splitting the shard
  4. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  5. Interview question (2):
    30 Mar 2015 - fix the index
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats