Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive

Concurrent Max

time to read 2 min | 286 words

Can you think of a better way to implement this code?

private volatile Guid lastEtag;
private readonly object lastEtagLocker = new object();
internal void UpdateLastWrittenEtag(Guid? etag)
{
    if (etag == null)
        return;

    var newEtag = etag.Value.ToByteArray();

    // not the most recent etag
    if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
    {
        return;
    }

    lock (lastEtagLocker)
    {
        // not the most recent etag
        if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
        {
            return;
        }

        lastEtag = etag.Value;
    }
}

We have multiple threads calling this function, and we need to ensure that lastEtag value is always the maximum value. This has the potential to be called often, so I want to make sure that I chose the best way to do this. Ideas?

time to read 3 min | 442 words

Phil Haack has an interesting post about this topic, where he presents the following solution:

public static bool IsNullOrEmpty<T>(this IEnumerable<T> items) {
    return items == null || !items.Any();
}

This solution, unfortunately, suffers from a common problem related to handling IEnumerables. The assumption that you can iterate over enumerable more than once. This hold true for things like collections, but in many cases, this sort of code will silently hide data:

var files = Directory.EnumerateFiles(".","*.cs");
if(files.IsNullOrEmpty())
{
    Cosnole.WriteLine("No files");
}
else
{
   foreach(var file in files)
   {
          Console.WriteLine(file);
   }
}

The first file will never appear here.

A better solution is:

public static bool IsNullOrEmpty<T>(this IEnumerable<T> items, out IEnumerable<T> newItems) 
{
    newItems = items;
    if(items == null)
        return false;
    
    var enumerator = items.GetEnumerator();
    if(enumerator.MoveNext() == false)
        return false;
        
    newItems = new[]{enumerator.Current}.Concat(enumerator);
    
    return true;
}

That will not lose data.

time to read 2 min | 218 words

What is the output of this code?

 IDictionary<object,string> instanceToKey = new Dictionary<object, string>();

 IDictionary<string, int> keyToCost = new Dictionary<string, int>();

 var inst1 = new object();
 var inst2 = new object();

 instanceToKey[inst1] = "one";
 keyToCost["one"] = 1;

 instanceToKey[inst2] = "two";
 keyToCost["two"] = 2;

 string value = null;
 foreach (var key
     in (from inst in new[]{inst1, inst2, new object()}
         where instanceToKey.TryGetValue(inst, out value) 
         select value))
 {
     int cost;
     if(keyToCost.TryGetValue(key, out cost))
         Console.WriteLine("Key: {0}, Cost: {1}", key, cost);
 }
time to read 3 min | 472 words

Take a look at the following Erlang function:

count_promises(Id, N, {_, PromisesQueue}) ->
    rec_count_promises(0, Id, N, PromisesQueue).

rec_count_promises(Count, _, _, []) ->
    Count;
rec_count_promises(Count, Id, N, [{Id, N, _, _} | RestQueue]) ->
    rec_count_promises((Count + 1), Id, N, RestQueue);
rec_count_promises(Count, Id, N, [_ | RestQueue]) ->
    rec_count_promises(Count, Id, N, RestQueue).

I am reading a codebase full of this sort of things, and it is really painful. I keep thinking back to how I would do it in C#:

public int CountPromises(int id, int n, Tuple<List<Record>, List<Record>> phase1)
{
     int count=0;
     foreach(var record in phase1.Item2)
       {
            if(record.Id == id && record.N == n)
                   count ++;
     }
     return count ;
}

Yet that is imperative, and involve mutating state. I agree, it is a huge improvement, but it can be made both functionally safe and easier to read:

phase1.Items2.Count(record => record.Id == id && record.N == n);

As I said, I am currently going through a codebase full of these sort of functions, and it is painful, annoying and irritating. I am not experienced enough with Erlang to be able to tell conclusively if this is idiomatic Erlang code, but I think so.

time to read 4 min | 612 words

Today I had to modify a piece of JavaScript code. The code used return a single string, and I needed to modify it to return an array of objects. Using C#, it would have been easy, change the return type, hit the compile button, fix the errors, rinse & repeat until it compiles.

In JavaScript code, however, it was much more complex, I had to find all the places where that method was called, and that particular parameter would pass unchanged throughout several functions before it was used, so I had to track that down. Pretty annoying.

And since I know that I’ll get questions on that, here is the actual example:

getDocument: function (id, operation, successCallback) {
    $.ajax({
        url: settings.server + 'docs/' + id,
        dataType: 'json',
        complete: function(xhr) {
            switch(xhr.status) 
            {
                case 200:
                    var data = JSON.parse(xhr.responseText);
                    var etag = xhr.getResponseHeader("Etag");
                    var template = xhr.getResponseHeader('Raven-' + operation + '-Template');
                    successCallback(data, etag, template);
                    break;
                case 404:
                    successCallback(null, null, null);
                    break;
            }
        }
    });
},

I needed to change the template variable from to a dictionary of headers, since it needed to be processed elsewhere.

As for where it was actually used, here is one such example, which shows several layer of indirection (because of continuations):

image

time to read 2 min | 208 words

Not just because it is concurrent, because of this wonderful method:

public class Storage : IStorage
{
    private readonly ConcurrentDictionary<Guid, ConcurrentDictionary<int, List<object>>> values =
        new ConcurrentDictionary<Guid, ConcurrentDictionary<int, List<object>>>();

    public void Store(Guid taskId, int level, object value)
    {
        values.GetOrAdd(taskId, guid => new ConcurrentDictionary<int, List<object>>())
            .GetOrAdd(level, i => new List<object>())
            .Add(value);
    }
}

Doing this with Dictionary is always a pain, but this is an extremely natural way of doing things.

time to read 2 min | 378 words

Here is an interesting little problem:

public class Program
{
    private static void Main()
    {
        foreach (int i in RobustEnumerating(Enumerable.Range(0, 10), FaultyFunc))
        {
            Console.WriteLine(i);
        }
    }

    public static IEnumerable<T> RobustEnumerating<T>(
        IEnumerable<T> input,Func<IEnumerable<T>, IEnumerable<T>> func)
    {
        // how to do this?
        return func(input);

    }

    public static IEnumerable<int> FaultyFunc(IEnumerable<int> source)
    {
        foreach (int i in source)
        {
            yield return i/(i%2);
        }
    }
}

This code should not throw, but print:

1
3
5
7
9

Can you make this happen? You can only change the RobustEnumerating method, nothing else in the code

time to read 1 min | 198 words

How do you make this code legal?

var foo = new IFoo(1);

And yes, IFoo is an interfacae.

The answer is quite simple, actually. It was there since C# 1.0, I am told, and I just stumbled upon it. Take a look at this code:

class Program
{
	static void Main(string[] args)
	{
		var foo = new IFoo(1);
		foo.Do();
	}
}

[
	ComImport, 
	Guid("C906C002-B214-40d7-8941-F223868B39A5"), 
	CoClass(typeof(FooImpl))
]
public interface IFoo
{
	void Do();
}

public class FooImpl : IFoo
{
	private readonly int i;

	public FooImpl(int i)
	{
		this.i = i;
	}

	public void Do()
	{
		Console.WriteLine(i);	
	}
}

We have an interface, and we specify the co class that implements it and is the default implementation. The rest is just required to make the compiler happy about it.

What it means, in turn, is that you can instantiate an interface and have a default implementation selected. You can even use constructor parameters. It has quite a lot of implications, if you think about it right.

Not sure it is a wise feature to use, but it is certainly an interesting tidbit.

time to read 2 min | 261 words

A while ago in the NHibernate mailing list we got a report that NHibernate is making use of a dictionary with an enum as the key, and that is causing a big performance issue.

The response was almost unanimous, I think, “what?! how can that be?!!?”. Several people went in to and tried to figure out what is going on there. The answer is totally non oblivious, Dictionary<K,V> force boxing for any value type that is used as the key.

That sound completely contradictory to what you would expect, after all, one of the major points in generics was the elimination of boxing, so what happened?

Well, the issue is that Dictionary<K,V> has to compare the keys, and for that, it must make some assumptions about the actual key. It is abstracted into EqualityComparer, and that is where the actual problem starts. EqualityComparer has some special cases for the common types (anything that is IEquatable<T>, which most of the usual suspects implements), to speed this up.

The problem is that the fall back is to an ObjectComparer, and that, of course, will box any value type.

And enum does not implements IEquatable<T>…

Omer has a good coverage on the subject, with really impressive results. Take a look at his results.

image

I am not going to steal his thunder, but I suggest going over and reading the code, it is very elegant.

Elegant code

time to read 7 min | 1359 words

I just like this code, so I thought I would publish it.

   1: public static class ArrayExtension
   2: {
   3:     public static T[] GetOtherElementsFromElement<T>(this T[] array , T element)
   4:     {
   5:         var index = Array.IndexOf(array, element);
   6:         if (index == -1)
   7:             return array;
   8:         return array.Skip(index + 1).Union(array.Take(index)).ToArray();
   9:     }
  10: }

And the unit test:

   1: public class ReplicationUnitTest
   2: {
   3:     [Fact]
   4:     public void Will_distribute_work_starting_with_next_node()
   5:     {
   6:         var nodes = new[] { 1, 2, 3 };
   7:         Assert.Equal(new[] { 3, 1 }, nodes.GetOtherElementsFromElement(2));
   8:         Assert.Equal(new[] { 1, 2 }, nodes.GetOtherElementsFromElement(3));
   9:         Assert.Equal(new[] { 2, 3 }, nodes.GetOtherElementsFromElement(1));
  10:         Assert.Equal(new[] { 1, 2, 3 }, nodes.GetOtherElementsFromElement(4));
  11:     }
  12: }

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}