Ayende @ Rahien

It's a girl

Concurrent Max

Can you think of a better way to implement this code?

private volatile Guid lastEtag;
private readonly object lastEtagLocker = new object();
internal void UpdateLastWrittenEtag(Guid? etag)
{
    if (etag == null)
        return;

    var newEtag = etag.Value.ToByteArray();

    // not the most recent etag
    if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
    {
        return;
    }

    lock (lastEtagLocker)
    {
        // not the most recent etag
        if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
        {
            return;
        }

        lastEtag = etag.Value;
    }
}

We have multiple threads calling this function, and we need to ensure that the lastEtag value is always the maximum value. This has the potential to be called often, so I want to make sure that I choose the best way to do this. Ideas?
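
One idea, as a sketch only: replace the lock with a compare-and-swap loop. Interlocked cannot swap a 16-byte Guid directly, so the value is wrapped in an immutable reference object. The EtagTracker and Holder names are hypothetical, and CompareBytes is a stand-in for Buffers.Compare, which I am assuming orders byte arrays lexicographically.

```csharp
using System;
using System.Threading;

public sealed class EtagTracker
{
    // Immutable holder: a reference can be swapped atomically, a 16-byte Guid cannot.
    private sealed class Holder
    {
        public readonly Guid Value;
        public readonly byte[] Bytes;

        public Holder(Guid value)
        {
            Value = value;
            Bytes = value.ToByteArray();
        }
    }

    private Holder last = new Holder(Guid.Empty);

    public Guid LastEtag
    {
        get { return Volatile.Read(ref last).Value; }
    }

    public void UpdateLastWrittenEtag(Guid? etag)
    {
        if (etag == null)
            return;

        var candidate = new Holder(etag.Value);
        while (true)
        {
            var current = Volatile.Read(ref last);

            // candidate is not larger than what we already have: nothing to do
            if (CompareBytes(candidate.Bytes, current.Bytes) <= 0)
                return;

            // publish only if nobody changed the value since we read it; otherwise retry
            if (Interlocked.CompareExchange(ref last, candidate, current) == current)
                return;
        }
    }

    // Stand-in for Buffers.Compare (assumption): lexicographic byte comparison.
    private static int CompareBytes(byte[] x, byte[] y)
    {
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i])
                return x[i] - y[i];
        }
        return 0;
    }
}
```

The trade-off is one small allocation per call in exchange for never blocking. Whether that is actually faster than the double-checked lock depends on contention, so it is worth measuring.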

Checking For Empty Enumerations

Phil Haack has an interesting post about this topic, where he presents the following solution:

public static bool IsNullOrEmpty<T>(this IEnumerable<T> items) {
    return items == null || !items.Any();
}

This solution, unfortunately, suffers from a common problem related to handling IEnumerables: the assumption that you can iterate over an enumerable more than once. This holds true for things like collections, but in many cases, this sort of code will silently hide data:

var files = Directory.EnumerateFiles(".", "*.cs");
if (files.IsNullOrEmpty())
{
    Console.WriteLine("No files");
}
else
{
    foreach (var file in files)
    {
        Console.WriteLine(file);
    }
}

The first file will never appear here.

A better solution is:

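public static bool IsNullOrEmpty<T>(this IEnumerable<T> items, out IEnumerable<T> newItems)
{
    newItems = items;
    if (items == null)
        return true;

    var enumerator = items.GetEnumerator();
    if (enumerator.MoveNext() == false)
    {
        enumerator.Dispose();
        return true;
    }

    // hand back a sequence that replays the item we already read, then the rest
    newItems = Resume(enumerator);
    return false;
}

private static IEnumerable<T> Resume<T>(IEnumerator<T> enumerator)
{
    using (enumerator)
    {
        do
        {
            yield return enumerator.Current;
        } while (enumerator.MoveNext());
    }
}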

That will not lose data.

C# Coding Challenge: What will this code do?

What is the output of this code?

 IDictionary<object,string> instanceToKey = new Dictionary<object, string>();

 IDictionary<string, int> keyToCost = new Dictionary<string, int>();

 var inst1 = new object();
 var inst2 = new object();

 instanceToKey[inst1] = "one";
 keyToCost["one"] = 1;

 instanceToKey[inst2] = "two";
 keyToCost["two"] = 2;

 string value = null;
 foreach (var key
     in (from inst in new[]{inst1, inst2, new object()}
         where instanceToKey.TryGetValue(inst, out value) 
         select value))
 {
     int cost;
     if(keyToCost.TryGetValue(key, out cost))
         Console.WriteLine("Key: {0}, Cost: {1}", key, cost);
 }

Sometimes imperative is so much easier

Take a look at the following Erlang function:

count_promises(Id, N, {_, PromisesQueue}) ->
    rec_count_promises(0, Id, N, PromisesQueue).

rec_count_promises(Count, _, _, []) ->
    Count;
rec_count_promises(Count, Id, N, [{Id, N, _, _} | RestQueue]) ->
    rec_count_promises((Count + 1), Id, N, RestQueue);
rec_count_promises(Count, Id, N, [_ | RestQueue]) ->
    rec_count_promises(Count, Id, N, RestQueue).

I am reading a codebase full of this sort of thing, and it is really painful. I keep thinking back to how I would do it in C#:

public int CountPromises(int id, int n, Tuple<List<Record>, List<Record>> phase1)
{
    int count = 0;
    foreach (var record in phase1.Item2)
    {
        if (record.Id == id && record.N == n)
            count++;
    }
    return count;
}

Yes, that is imperative, and it involves mutating state. Even so, it is a huge improvement, and it can be made both functionally safe and easier to read:

phase1.Item2.Count(record => record.Id == id && record.N == n);

As I said, I am currently going through a codebase full of this sort of function, and it is painful, annoying and irritating. I am not experienced enough with Erlang to be able to tell conclusively if this is idiomatic Erlang code, but I think so.

I like strong typing and compilation errors

Today I had to modify a piece of JavaScript code. The code used to return a single string, and I needed to modify it to return an array of objects. Using C#, it would have been easy: change the return type, hit the compile button, fix the errors, rinse & repeat until it compiles.

In JavaScript code, however, it was much more complex. I had to find all the places where that method was called, and that particular parameter would pass unchanged through several functions before it was used, so I had to track that down. Pretty annoying.

And since I know that I’ll get questions on that, here is the actual example:

getDocument: function (id, operation, successCallback) {
    $.ajax({
        url: settings.server + 'docs/' + id,
        dataType: 'json',
        complete: function(xhr) {
            switch(xhr.status) 
            {
                case 200:
                    var data = JSON.parse(xhr.responseText);
                    var etag = xhr.getResponseHeader("Etag");
                    var template = xhr.getResponseHeader('Raven-' + operation + '-Template');
                    successCallback(data, etag, template);
                    break;
                case 404:
                    successCallback(null, null, null);
                    break;
            }
        }
    });
},

I needed to change the template variable from a single string to a dictionary of headers, since it needed to be processed elsewhere.

As for where it was actually used, here is one such example, which shows several layers of indirection (because of continuations):

image

I love ConcurrentDictionary!

Not just because it is concurrent, but because of this wonderful method:

public class Storage : IStorage
{
    private readonly ConcurrentDictionary<Guid, ConcurrentDictionary<int, List<object>>> values =
        new ConcurrentDictionary<Guid, ConcurrentDictionary<int, List<object>>>();

    public void Store(Guid taskId, int level, object value)
    {
        values.GetOrAdd(taskId, guid => new ConcurrentDictionary<int, List<object>>())
            .GetOrAdd(level, i => new List<object>())
            .Add(value);
    }
}

Doing this with Dictionary is always a pain, but this is an extremely natural way of doing things.
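
For comparison, a sketch of the same Store method on top of plain Dictionary might look like the following. The PlainStorage name, the Count helper and the single lock object are assumptions made for illustration. The lock is there because Dictionary is not thread safe.

```csharp
using System;
using System.Collections.Generic;

public class PlainStorage
{
    private readonly object locker = new object();
    private readonly Dictionary<Guid, Dictionary<int, List<object>>> values =
        new Dictionary<Guid, Dictionary<int, List<object>>>();

    public void Store(Guid taskId, int level, object value)
    {
        lock (locker)
        {
            // first level: look up or create the per-task dictionary
            Dictionary<int, List<object>> levels;
            if (values.TryGetValue(taskId, out levels) == false)
            {
                levels = new Dictionary<int, List<object>>();
                values.Add(taskId, levels);
            }

            // second level: look up or create the per-level list
            List<object> items;
            if (levels.TryGetValue(level, out items) == false)
            {
                items = new List<object>();
                levels.Add(level, items);
            }

            items.Add(value);
        }
    }

    // Helper used only to observe the result in this sketch.
    public int Count(Guid taskId, int level)
    {
        lock (locker)
        {
            Dictionary<int, List<object>> levels;
            List<object> items;
            if (values.TryGetValue(taskId, out levels) && levels.TryGetValue(level, out items))
                return items.Count;
            return 0;
        }
    }
}
```

Each GetOrAdd call in the ConcurrentDictionary version replaces one of the TryGetValue/Add pairs here.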

Challenge: Robust enumeration over external code

Here is an interesting little problem:

public class Program
{
    private static void Main()
    {
        foreach (int i in RobustEnumerating(Enumerable.Range(0, 10), FaultyFunc))
        {
            Console.WriteLine(i);
        }
    }

    public static IEnumerable<T> RobustEnumerating<T>(
        IEnumerable<T> input,Func<IEnumerable<T>, IEnumerable<T>> func)
    {
        // how to do this?
        return func(input);

    }

    public static IEnumerable<int> FaultyFunc(IEnumerable<int> source)
    {
        foreach (int i in source)
        {
            yield return i/(i%2);
        }
    }
}

This code should not throw, but print:

1
3
5
7
9

Can you make this happen? You can only change the RobustEnumerating method, nothing else in the code.
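
One possible approach, offered as a sketch rather than the definitive answer, is to feed the function one element at a time, so a failure only costs that element. The Robust class name is hypothetical. The sketch assumes func processes each element independently, so it would not suit a function that aggregates across the sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Robust
{
    public static IEnumerable<T> RobustEnumerating<T>(
        IEnumerable<T> input, Func<IEnumerable<T>, IEnumerable<T>> func)
    {
        foreach (var item in input)
        {
            List<T> results;
            try
            {
                // run func over a one-element sequence so a failure only loses this item
                results = func(new[] { item }).ToList();
            }
            catch (Exception)
            {
                continue; // skip the item that made func throw
            }

            // yield outside the try block: yield return is not allowed
            // inside a try block that has a catch clause
            foreach (var result in results)
                yield return result;
        }
    }
}
```

The price is that func is invoked once per element, which may matter if setting up the function is expensive.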

Instantiating interfaces

How do you make this code legal?

var foo = new IFoo(1);

And yes, IFoo is an interface.

The answer is quite simple, actually. It was there since C# 1.0, I am told, and I just stumbled upon it. Take a look at this code:

using System;
using System.Runtime.InteropServices;

class Program
{
	static void Main(string[] args)
	{
		var foo = new IFoo(1);
		foo.Do();
	}
}

[
	ComImport, 
	Guid("C906C002-B214-40d7-8941-F223868B39A5"), 
	CoClass(typeof(FooImpl))
]
public interface IFoo
{
	void Do();
}

public class FooImpl : IFoo
{
	private readonly int i;

	public FooImpl(int i)
	{
		this.i = i;
	}

	public void Do()
	{
		Console.WriteLine(i);	
	}
}

We have an interface, and we specify the co class that implements it and is the default implementation. The rest is just required to make the compiler happy about it.

What it means, in turn, is that you can instantiate an interface and have a default implementation selected. You can even use constructor parameters. It has quite a lot of implications, if you think about it right.

Not sure it is a wise feature to use, but it is certainly an interesting tidbit.

Dictionary Puzzler

A while ago in the NHibernate mailing list we got a report that NHibernate is making use of a dictionary with an enum as the key, and that is causing a big performance issue.

The response was almost unanimous, I think: “what?! how can that be?!!?”. Several people went in and tried to figure out what is going on there. The answer is totally not obvious: Dictionary<K,V> forces boxing for any value type that is used as the key.

That sounds completely contradictory to what you would expect. After all, one of the major points of generics was the elimination of boxing, so what happened?

Well, the issue is that Dictionary<K,V> has to compare the keys, and for that, it must make some assumptions about the actual key. It is abstracted into EqualityComparer, and that is where the actual problem starts. EqualityComparer has some special cases for the common types (anything that is IEquatable<T>, which most of the usual suspects implement), to speed this up.

The problem is that the fall back is to an ObjectComparer, and that, of course, will box any value type.

And enum does not implement IEquatable<T>…
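
One common way around this, shown here as a minimal sketch, is to hand the dictionary an explicit comparer so it never falls back to the object-based one. The SampleKey enum and the SampleKeyComparer and EnumKeyExample classes are hypothetical names. Newer runtimes may already handle enum keys without boxing, so measure before adopting this.

```csharp
using System;
using System.Collections.Generic;

public enum SampleKey
{
    First,
    Second
}

// Hypothetical comparer: compares the enum values directly, so no boxing is involved.
public sealed class SampleKeyComparer : IEqualityComparer<SampleKey>
{
    public static readonly SampleKeyComparer Instance = new SampleKeyComparer();

    public bool Equals(SampleKey x, SampleKey y)
    {
        return x == y;
    }

    public int GetHashCode(SampleKey obj)
    {
        return (int)obj;
    }
}

public static class EnumKeyExample
{
    public static Dictionary<SampleKey, string> Create()
    {
        // Passing the comparer explicitly bypasses EqualityComparer<SampleKey>.Default.
        var names = new Dictionary<SampleKey, string>(SampleKeyComparer.Instance);
        names[SampleKey.First] = "first";
        names[SampleKey.Second] = "second";
        return names;
    }
}
```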

Omer has good coverage of the subject, with really impressive results. Take a look at his results.

image

I am not going to steal his thunder, but I suggest going over and reading the code, it is very elegant.

Elegant code

I just like this code, so I thought I would publish it.

public static class ArrayExtension
{
    public static T[] GetOtherElementsFromElement<T>(this T[] array, T element)
    {
        var index = Array.IndexOf(array, element);
        if (index == -1)
            return array;
        return array.Skip(index + 1).Union(array.Take(index)).ToArray();
    }
}

And the unit test:

public class ReplicationUnitTest
{
    [Fact]
    public void Will_distribute_work_starting_with_next_node()
    {
        var nodes = new[] { 1, 2, 3 };
        Assert.Equal(new[] { 3, 1 }, nodes.GetOtherElementsFromElement(2));
        Assert.Equal(new[] { 1, 2 }, nodes.GetOtherElementsFromElement(3));
        Assert.Equal(new[] { 2, 3 }, nodes.GetOtherElementsFromElement(1));
        Assert.Equal(new[] { 1, 2, 3 }, nodes.GetOtherElementsFromElement(4));
    }
}

Why NH Prof isn't functional

One of the things that I wanted to do with NH Prof is to build it in a way that would be very close to the Erlang way. That is, functional, immutable message passing. After spending some time trying to do this, I backed off, and used mutable OO with message passing.

The reason for that is quite simple. State.

Erlang can get away with being a functional language with immutable state because it has a framework that manages that state, and allows you to replace your state all the time. With C#, while I can create immutable data structures, if I want to actually create a large scale application in this manner, I have to write the state management framework, which is something that I didn't feel like doing.

Instead, I am using a more natural model for C#, and using the bus model to manage thread safety and multi threading scenarios.

A case study of bad API design: ASP.Net MVC Routing

I am doing a spike in ASP.Net MVC now (and I'll talk about this at length at another time). I hit a wall when I wanted to do something that is trivially simple in MonoRail: limit a routing parameter to be a valid integer.

Luckily, just looking at the API signature told me that this is a supported scenario:

image

Unfortunately, that is all that it told me. This method accepts an object, and there is no hint of documentation to explain what I am supposed to do with it. A bit of thinking suggested that I am probably supposed to pass an anonymous type, with the route parameter as the key and some sort of constraint as the value. But what sort of constraint?

Type information is one of those things that static languages actually do, and from experience in both dynamic and static languages, while it is often a PITA to specify types, it actually helps people who read the code. Not often, I'll admit, but it is helpful for the uninitiated.

I am... unused to having this type of problem in C#.

So I did what any developer would do: hit Google and tried to find some information about it. Didn't work.

I pulled out Reflector and started to track down what is going on there. Following a maze of untyped paths, the like of which I have not seen since the 1.1 days, I finally figured out that the value that I need to pass is an instance of IRouteConstraint.

Obvious, isn't it?

In short, and the reason for this post: I am seeing a lot of parameter signatures that look like that and have barely defined semantics. I would file this under C#.Abuse();

Reading MEF code

Okay, here is the deal. There is a feature in MEF that I find interesting, the ability to dynamically recompose the imports that an instance has. Well, that is not accurate. That doesn't really interest me. What does interest me is some of the implementation details. Let me explain a bit better.

As I understand the feature, MEF can load the imports from an assembly, and if I drop another file into the appropriate location, it will be able to update my imports collection. Now, what I am interested in is whether MEF allows me to update the file itself and pick up the change on the fly. The reason that I am interested in that is to know how this is done without locking the file (loading an assembly usually locks the file, unless you use shadow copy assemblies, which means that you have to use a separate AppDomain).

As you can imagine, this is a very specific need, and I want to go in, figure out if this is possible, and go away.

I started by checking out the MEF code:

svn co https://mef.svn.codeplex.com/svn mef

I just love the SVN integration that CodePlex has.

Now, the only way that MEF can implement this feature is by watching the file system, and that can be done using a FileSystemWatcher. Looking for that, I can see that it appears that DirectoryPartCatalog is using it, which isn't really surprising.

But, going there and reading the code gives us this:

image

Note what isn't there. There is no registration to Changed. This is likely not something that MEF supports.

Okay, one more try. Let us see how it actually loads an assembly. We start from Export<T> and GetExportedObject(), which calls GetExportedObjectCore(), which shells out to a delegate. Along the way I looked at CompositionException, just to make sure that it doesn't have the same problem as TypeLoadException and the hidden information. It doesn't.

I tried to follow the reference chain, but I quickly got lost. I then tried to figure out how MEF does delayed assembly loading, to see if it is doing anything special there, but I am currently hung at ComposablePartDefinition.Create, which seems promising, but it accepts a delegate and no one is calling it.

So this looks like it for now.

More code review errors

Take a look at this method:

image

Now, let us make this simple, shall we?

image

Same meaning, and a significant reduction of complexity. Damn, but this is annoying.

Common issues found in code review

I am going over a code base that I haven't seen in a while, and I am familiarizing myself with it by doing a code review to see that I understand what the code is doing now.

I am going to post code samples of changes that I made, along with some explanations.

image

This code can be improved by introducing a guard clause, like this:

image

This reduces nesting and makes the code easier to read in the long run.
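
As a minimal sketch of the guard clause idea (the method and class names here are hypothetical and are not the code under review):

```csharp
using System;

public static class GuardClauseExample
{
    // Nested version: the interesting work is buried inside the conditionals.
    public static string DescribeNested(string name)
    {
        string result = null;
        if (name != null)
        {
            if (name.Length > 0)
            {
                result = "Hello, " + name;
            }
        }
        return result;
    }

    // Guard clause version: bail out early, then do the work at the top level.
    public static string DescribeGuarded(string name)
    {
        if (string.IsNullOrEmpty(name))
            return null;

        return "Hello, " + name;
    }
}
```

Both methods return the same results. The second simply handles the uninteresting cases first.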

image

I hope you recognize the issue. The code is using reflection to do an operation that is already built into the CLR.

This is much better:

image

Of course, there is another issue here: why the hell do we have those if statements on type instead of pushing this into polymorphic behavior? No answer yet, I am currently just doing a blind code review.

Here is another issue, using List explicitly:

image

It is generally better to rely on the most abstract type that you can use:

image

This is a matter of style more than anything else, but it drives me crazy:

image

I would much rather have this:

image

Note that I added braces for both clauses, because it also bothers me if one has them and the other doesn't.

Another issue is hanging ifs:

image

Which we can rewrite as:

image

I think that this is enough for now...

Emulating Java Enums

Java Enums are much more powerful than the ones that exist in the CLR. There are numerous ways of handling this issue, but here is my approach.

Given this enum (defined in Java):

private static enum Layer {
    FIRST,
    SECOND;

    public boolean isRightLayer(WorkType type) {
        if (this == FIRST && type != WorkType.COLLECTION) return true;
        return this == SECOND && type == WorkType.COLLECTION;
    }
}

And the C# version is:

private class Layer
{
    public static readonly Layer First = new Layer(delegate(WorkType type)
    {
        return type != WorkType.Collection;
    });
    public static readonly Layer Second = new Layer(delegate(WorkType type)
    {
        return type == WorkType.Collection;
    });

    public delegate bool IsRightLayerDelegate(WorkType type);

    private readonly IsRightLayerDelegate isRightLayer;

    protected Layer(IsRightLayerDelegate isRightLayer)
    {
        this.isRightLayer = isRightLayer;
    }

    public bool IsRightLayer(WorkType type)
    {
        return isRightLayer(type);
    }
}

Setting out to break the compiler...

I looked at a bit of code that dealt with traversing an expression tree, using recursion, of course. The edge condition that immediately popped to mind was an unbounded expression. I decided to see if I could kill the compiler using this. Why? Because.

The first thing to do is to find out how deep a stack we usually need. I wrote this simple test:

class Program
{
	static void Main(string[] args)
	{
		Recursive(1);
	}

	static void Recursive(int i)
	{
		Console.WriteLine(i);
		Recursive(i+1);
	}
}

The last result was: 79994

Obviously this changes based on how much stack space each function takes, but it is a good number to go with. I started with this code:

class Program
{
	static void Main(string[] args)
	{
		using(var fw = File.CreateText("text.txt"))
		{
			for (int i = 0; i < 80000; i++)
			{
				fw.Write(" a > "+i +" && ");
				if(i%10==0)
					fw.WriteLine();
			}
		}
	}
}

I then took the file (slightly over 1 MB in size) and pasted the content into Visual Studio.

That was a mistake:

image

Okay, I can deal with this, let us try a different approach:

class Program
{
	static void Main(string[] args)
	{
		using(var fw = File.CreateText("text.cs"))
		{
			fw.WriteLine("public class Program {");
			fw.WriteLine("	static void Main(string[] args) {");
			fw.WriteLine("		var a = -1;");
			fw.Write    ("		var test = ");
			for (int i = 0; i < 80000; i++)
			{
				fw.Write(" a > "+i +" && ");
				if(i%10==0)
					fw.WriteLine();
			}
			fw.WriteLine(" a < 0;");
			fw.WriteLine("System.Console.WriteLine(test);");
			fw.WriteLine("	}");
			fw.WriteLine("}");

		}
	}
}

Trying to compile that produces:

fatal error CS1647: An expression is too long or complex to compile near ''

The help for CS1647 is:

There was a stack overflow in the compiler processing your code. To resolve this error, simplify your code. If your code is valid, contact Product Support.

The code is valid, I guess, just not really reasonable. What is scary is that this is something that was added for 2.0, so in the 1.0 days, someone actually ran into this issue.

Some experimentation showed that the C# compiler can handle expressions composed of 23,553 nodes. Now it is the time to get to the next stage, now the code is this:

class Program
{
	static void Main(string[] args)
	{
		using(var fw = File.CreateText("text.cs"))
		{
			fw.WriteLine("using System;");
			fw.WriteLine("using System.Linq.Expressions;");
			fw.WriteLine("public class Program {");
			fw.WriteLine("	static void Main(string[] args) {");
			fw.Write    ("		Expression<Predicate<int>> test = (a) => ");
			for (int i = 0; i < 11700; i++)
			{
				fw.Write(" a > "+i +" && ");
				if(i%10==0)
					fw.WriteLine();
			}
			fw.WriteLine(" a < 0;");
			fw.WriteLine("System.Console.WriteLine(test);");
			fw.WriteLine("	}");
			fw.WriteLine("}");

		}
	}
}

Note that I had to dramatically simplify the expression. Before, it handled 23 thousand and change, but now it chokes on merely 12 thousand.

What is really surprising is that after compiling the code, it is running and seems to do the expected thing. Amazing.

Anyway, here is a completely useless post, but now I know that the C# compiler has well defined behavior for stack overflows. :-)

Not all objects are created equal

I found something extremely surprising while profiling a project. Take a look at this piece of code:

Stopwatch stop = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
	new WeakReference(null);
}
stop.Stop();
Console.WriteLine("WeakRef: " + stop.ElapsedMilliseconds);

stop = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
	new string('a', 5);
}
stop.Stop();
Console.WriteLine("'aaaaa': " + stop.ElapsedMilliseconds);

On my machine, this has the following output:

WeakRef: 980
'aaaaa': 35

Creating a WeakReference is much more costly than creating a normal object. That is not surprising when you think about it, since WeakReference has deep ties to the CLR, but I couldn't really believe it when I saw it the first time.

Challenge: What does this code do?

Without compiling this, can you tell me whether this piece of code will compile? And if so, what does it do?

var dummyVariable1 = 1;
var dummyVariable2 = 3;
var a = dummyVariable1
+-+-+-+-+ + + + + + +-+-+-+-+-+
dummyVariable2;

Oh, and I want to hear reasons, too.

ReSharper is smarter than me

Given the following block of code:

if (presenter.GetServerUrlFromRequest!=null)
	GetServerUrlFromRequest.Checked = presenter.GetServerUrlFromRequest.Value;
else
	GetServerUrlFromRequest.Checked = true;

ReSharper gave me the following advice:

image

And turned the code to this piece:

GetServerUrlFromRequest.Checked = !(presenter.GetServerUrlFromRequest!=null) || 
presenter.GetServerUrlFromRequest.Value;

And while it has the same semantics, I actually had to decipher the code to figure out what it was doing.

I chose to keep the old version.

Csc.exe and delegate inference, or: Why C# has awkward syntax

I just tried to do a major revamp of Rhino Mocks' interface. It was intended to make it easier to work with C# 3.0 and use the smarter compiler to get better syntax.

I shouldn't have bothered. Take a look at this.

	public class TestCsc
	{
		public static void TestMethod()
		{
			Execute(Bar); // fail to compile
			Execute(delegate(int ia, string x) { }); // compiles fine
			Execute((int i, string x) => { return; }); // Compiles fine
			Execute((int i, string x) => { return true; }); // fail to compile
			Execute(Foo);// fail to compile
			Execute(delegate(int ia, string x) { return true; }); // fail to compile
		}

		public static bool Foo(int ia, string x)
		{
			return false;
		}

		public static void Bar(int ia, string x)
		{
		}

		public static void Execute<T, K>(Action<T, K> e)
		{
			
		}

		public static void Execute<T, K>(Func<bool, T, K> e)
		{

		}
	}

Annoyed.

Generic extension methods

I was playing around with the compiler when I hit this interesting feature. I was very surprised to see that this has compiled successfully.

static class Program
{
    static void Main(string[] args)
    {
        IProcesser<GZipStream> p = null;
        p.HasTimeout();
    }
}

public static class Extensions
{
    public static bool HasTimeout<T>(this IProcesser<T> s)
        where T : Stream
    {
        return s.Desitnation.CanTimeout;
    }
}

public interface IProcesser<TDestination>
    where TDestination : Stream
{
    TDestination Desitnation { get; }
}