Ayende @ Rahien

Refunds available at head office

Troubleshooting, when F5 debugging can’t help you

You might have noticed that we have been doing a lot of work on the operational side of things, to make sure that we give you as good a story as possible with regard to the care & feeding of RavenDB. This post isn’t about that. This post is about your applications and systems, and how you are going to react when !@)(*#!@(* happens.

In particular, the question is what do you do when this happens?

This situation can crop up in many disguises. For example, you might be seeing high memory usage in production, or experiencing growing CPU usage over time, or seeing request times go up, or any of a hundred and one different production issues that make for a hell of a night (somehow, they almost always happen at nighttime).

Here is how people usually think about it.

The first thing to do is to understand what is going on. The hardest thing to handle in these situations is when we have an issue (high memory, high CPU, etc.) and no idea why. Usually all the effort is spent just figuring out what is happening and why. The problem with this process for troubleshooting issues is that it is very easy to jump to conclusions and settle on an utterly wrong hypothesis. Then you have to go through the rest of the steps to realize it isn’t right.

So the first thing that we need to do is gather information. And this post is primarily about the various ways that you can do that. In RavenDB, we have actually spent a lot of time exposing information to the outside world, so we’ll have an easier time figuring out what is going on. But I’m going to assume that you don’t have that.

The end-all tool for this kind of error is WinDbg. This is the low-level tool that gives you access to pretty much anything you could want. It is also very archaic and not friendly at all. The good thing about it is that you can load a dump into it. A dump is a capture of the process state at a particular point in time. It gives you the ability to see the entire memory contents and all the threads. It is an essential tool, but also the last one I want to use, because it is pretty hard to do so. Dump files can be very big; multiple GB is very common. That is because they contain the full memory dump of the process. There are also mini dumps, which are easier to work with but don’t contain the memory contents, so you can look at the threads, but not at the data.

The .NET Memory Profiler is another great tool for figuring things out. It isn’t usually so good for production analysis, because it uses the Profiler API to figure things out, but it has a wonderful feature of loading dump files (ironically, it can’t handle very large dump files because of memory issues) and gives you a much nicer view of what is going on there.

For high CPU situations, I like to know what is actually going on, and looking at the stack traces is a great way to do that. WinDbg can help here (take a few mini dumps a few seconds apart), but again, it isn’t so nice to use.

Stack Dump is a tool that takes away a lot of the pain of having to deal with that. It just outputs all the threads’ information, and we have used it successfully in the past to figure out what is going on.

For general performance issues (“requests are slow”), we need to figure out where the slowness actually is. We have had reports that run the gamut from “things are slow, the client machine is loaded” to “things are slow, the network QoS settings throttle us”. I like to start with Fiddler to figure those things out. In particular, the statistics window is very helpful:

[Figure: Fiddler statistics window]

The obvious things are the bytes sent & bytes received. We have had a few cases where a customer was actually sending hundreds of MB in either or both directions, and was surprised it took some time. If those values are fine, you want to look at the actual performance listing. In particular, look at things like TCP/IP connect time, the time from the client sending the request to the server starting to receive it, etc.

If you find that the problem is actually at the network layer, you might not be able to handle it immediately. You might need to go a level or two lower and look at the actual TCP traffic. This is where something like Wireshark comes into play, and it is useful for figuring out if you have specific errors at that level (for example, a bad connection that causes a lot of packet loss will impact performance, but things will still work).

Other tools that are very important include Resource Monitor, Process Explorer and Process Monitor. Those give you a lot of information about what your application is actually doing.

Once you have all of that information, you can form a hypothesis and try to test it.

If you own the application in question, the best way to improve your chances of figuring out what is going on is to add logging. Lots & lots of logging. In production, having the logs to support what is going on is crucial. I usually have several levels of logging. For example, what is the traffic in/out of my system. Next, there are the actual system operations, especially anything that happens in the background. Finally, there are the debug/trace endpoints that will expose internal state and allow you to tweak various things at runtime.
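
As a minimal, illustrative sketch of what I mean by those levels (shown with NLog-style calls, but any logging framework works; the messages and the debug endpoint are hypothetical):

// Illustrative only: the three "levels" of visibility described above.
var log = LogManager.GetCurrentClassLogger();

// 1. Traffic in/out of the system.
log.Info("GET /orders/1234 -> 200 in 87ms");

// 2. System operations, especially background work.
log.Debug("Index rebuild for 'Orders/ByCustomer' processed 5,120 documents");

// 3. Debug/trace endpoints are code, not log lines; the idea is an HTTP
//    endpoint such as /debug/internal-state that dumps queue lengths,
//    cache sizes, etc., and lets you tweak settings at runtime.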

Having good working knowledge of how to properly utilize the above-mentioned tools is very important, and should be considered much more useful than learning a new API or a language feature.

Async event loops in C#

I’m designing a new component, and I want to reduce the amount of complexity involved in dealing with it. This is a networked component, and after designing several of those, I wanted to remove one area of complexity: the use of explicitly concurrent code. Because of that, I decided to go with the following architecture:

[Figure: architecture diagram – a network reader feeding an in-memory queue, consumed by a single-threaded event loop]

 

The network code just reads messages from the network and puts them in an in-memory queue. Then we have a single-threaded event loop that simply goes over the queue and processes those messages.

All of the code that actually processes messages is single threaded, which makes it oh so much easier to work with.

Now, I can do this quite easily with a BlockingCollection<T>, which is how I have usually done this sort of thing so far. It is simple, robust and easy to understand. It also ties down a full thread for the event loop, which can be a shame if you don’t get a lot of messages.
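
For contrast, here is a minimal sketch of the BlockingCollection<T> version I am describing (the names are illustrative, and ProcessMessage is the same handler used in the async version below):

var queue = new BlockingCollection<object>();

// The event loop owns a dedicated thread for its entire lifetime,
// which is exactly the cost mentioned above.
var eventLoopThread = new Thread(() =>
{
    foreach (var msg in queue.GetConsumingEnumerable())
    {
        ProcessMessage(msg);
    }
});
eventLoopThread.Start();

// The network code simply calls queue.Add(msg) for each incoming message,
// and queue.CompleteAdding() on shutdown to let the loop exit.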

So I decided to experiment with async approaches, in particular using the BufferBlock<T> from the TPL Dataflow assemblies.

I came up with the following code:

var q = new BufferBlock<int>(new DataflowBlockOptions
{
    CancellationToken = cts.Token,
});

This just creates the buffer block, and the nice thing here is that I can set up a “global” cancellation token for all operations on it. The problem is that this actually generates a bad exception (InvalidOperationException, instead of TaskCanceledException). Well, I’m not sure if bad is the right term, but it isn’t the one I would expect here, at least. If you pass a cancellation token directly to the method, you get the behavior I expected.

At any rate, the code for the event loop now looks like this:

private static async Task EventLoop(BufferBlock<object> bufferBlock, CancellationToken cancellationToken)
{
    while (true)
    {
        object msg;
        try
        {
            msg = await bufferBlock.ReceiveAsync(TimeSpan.FromSeconds(3), cancellationToken);
        }
        catch (TimeoutException)
        {
            NoMessagesInTimeout();
            continue;
        }
        catch (Exception e)
        {
            break;
        }
        ProcessMessage(msg);
    }
}

And that is pretty much it. We have a good way to handle timeouts and process messages, and we don’t take up a thread. We can also be easily cancelled. I still need to run this through a lot more testing, in particular to verify that this doesn’t cause issues when we need to debug this sort of system, but it looks promising.
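
For completeness, here is a hedged sketch of the producer side; Post and SendAsync are the standard BufferBlock<T> entry points, while the surrounding names (cts, messageFromNetwork) are illustrative:

var cts = new CancellationTokenSource();
var bufferBlock = new BufferBlock<object>(new DataflowBlockOptions
{
    CancellationToken = cts.Token,
});

// Start the consumer.
var eventLoopTask = EventLoop(bufferBlock, cts.Token);

// The network read loop just hands messages over; no explicit locking anywhere.
bufferBlock.Post(messageFromNetwork);
// or, to get back-pressure when using a bounded block:
// await bufferBlock.SendAsync(messageFromNetwork, cts.Token);

// Shutting the whole thing down:
cts.Cancel();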

DSLs in Boo, and a look back

About 6 years ago, I started writing the DSLs in Boo book; it came out in 2010, and today I got an email saying that it is now officially out of print. It was never a hugely popular book, so I’m not really surprised, but it got me thinking.

I got to build several DSLs for production during the time I was writing this book, but afterward I pretty much pivoted hard to RavenDB and haven’t done much with DSLs since. However, the knowledge acquired while writing this book has actually been quite helpful when writing RavenDB itself.

I’m not talking about the design aspects of writing a DSL, or the business decisions that are involved with that, although that is certainly a factor. I’m talking about the actual technical details of working with a language, a parser, etc.

In fact, you probably won’t see it, but RavenDB indexes and transformers are actually DSLs, and they use a lot of the techniques that I talk about in the book. We start with something that looks like C# code, but what ends up running is actually something far different. The Linq provider, too, relies heavily on those same techniques. We show you one thing but actually do something quite different under the covers.

It is interesting to see how the actual design of RavenDB was influenced by my own history and the choices I made in various places. If I weren’t well versed in abusing a language, I would probably have had to go with something like CouchDB’s views, for example.

Windows Overlapped I/O and TPL style programming

I really like the manner in which C# async tasks work. And while building Voron, I ran into a scenario in which I could really make use of the Windows async API, which is exposed via Overlapped I/O. The problem is that those are pretty different models, and they don’t appear to want to play together very nicely.

Since I don’t feel like having those two cohabitate in my codebase, I decided to see if I could write a TPL wrapper that would provide a nice API on top of the underlying Overlapped I/O implementation.

Here is what I ended up with:

public unsafe class Win32DirectFile : IDisposable
{
    private readonly SafeFileHandle _handle;

    public Win32DirectFile(string filename)
    {
        _handle = NativeFileMethods.CreateFile(filename,
            NativeFileAccess.GenericWrite, NativeFileShare.None, IntPtr.Zero,
            NativeFileCreationDisposition.CreateAlways,
            NativeFileAttributes.Write_Through | NativeFileAttributes.NoBuffering | NativeFileAttributes.Overlapped, IntPtr.Zero);

        if (_handle.IsInvalid)
            throw new Win32Exception();

        if (ThreadPool.BindHandle(_handle) == false)
            throw new InvalidOperationException("Could not bind the handle to the thread pool");
    }

Note that I create the file with overlapped enabled, as well as write_through & no buffering (I need them for something else, not relevant for now).

It is important to note that I bind the handle (which effectively issues a BindIoCompletionCallback under the covers, I think), so we won’t have to use events, but can use callbacks. This is a much more natural way to work when using the TPL.

Then, we can just issue the actual work:

public Task WriteAsync(long position, byte* ptr, uint length)
{
    var tcs = new TaskCompletionSource<object>();

    var nativeOverlapped = CreateNativeOverlapped(position, tcs);

    uint written;
    var result = NativeFileMethods.WriteFile(_handle, ptr, length, out written, nativeOverlapped);

    return HandleResponse(result, nativeOverlapped, tcs);
}

As you can see, all the actual details are handled in the helper functions; we can just run the code we need, passing it the overlapped structure it requires. Now, let us look at those functions:

private static NativeOverlapped* CreateNativeOverlapped(long position, TaskCompletionSource<object> tcs)
{
    var o = new Overlapped((int) (position & 0xffffffff), (int) (position >> 32), IntPtr.Zero, null);
    var nativeOverlapped = o.Pack((code, bytes, overlap) =>
    {
        try
        {
            switch (code)
            {
                case ERROR_SUCCESS:
                    tcs.TrySetResult(null);
                    break;
                case ERROR_OPERATION_ABORTED:
                    tcs.TrySetCanceled();
                    break;
                default:
                    tcs.TrySetException(new Win32Exception((int) code));
                    break;
            }
        }
        finally
        {
            Overlapped.Unpack(overlap);
            Overlapped.Free(overlap);
        }
    }, null);
    return nativeOverlapped;
}

private static Task HandleResponse(bool completedSynchronously, NativeOverlapped* nativeOverlapped, TaskCompletionSource<object> tcs)
{
    if (completedSynchronously)
    {
        Overlapped.Unpack(nativeOverlapped);
        Overlapped.Free(nativeOverlapped);
        tcs.SetResult(null);
        return tcs.Task;
    }

    var lastWin32Error = Marshal.GetLastWin32Error();
    if (lastWin32Error == ERROR_IO_PENDING)
        return tcs.Task;

    Overlapped.Unpack(nativeOverlapped);
    Overlapped.Free(nativeOverlapped);
    throw new Win32Exception(lastWin32Error);
}

The complexity here is that we need to handle 3 cases:

  • Successful completion
  • Error (no pending work)
  • Error that is actually success (ERROR_IO_PENDING, the work completes asynchronously).

But that seems to be working quite nicely for me so far.
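
As a usage sketch (the file name and buffer handling here are illustrative; no-buffering writes also require sector-aligned buffers, which this glosses over), the wrapper ends up being consumed like any other task-returning API:

// Pointers are not allowed inside async methods, so the write is kicked off
// from an unsafe helper and the resulting task is awaited normally.
static Task StartWrite(Win32DirectFile file, IntPtr buffer, uint length)
{
    unsafe
    {
        return file.WriteAsync(0, (byte*)buffer, length);
    }
}

static async Task WriteSomething()
{
    using (var file = new Win32DirectFile("test.journal"))
    {
        const uint length = 4096;
        var buffer = Marshal.AllocHGlobal((int)length); // scratch buffer, freed below
        try
        {
            await StartWrite(file, buffer, length);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}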

Cache, it ain’t just remembering stuff

I mentioned that this piece of code has an issue:

public class LocalizationService
{
    MyEntities _ctx;
    Cache _cache;

    public LocalizationService(MyEntities ctx, Cache cache)
    {
        _ctx = ctx;
        _cache = cache;
        Task.Run(() =>
        {
            foreach(var item in _ctx.Resources)
            {
                _cache.Set(item.Key + "/" + item.LanguageId, item.Text);
            }
        });
    }    

    public string Get(string key, string languageId)
    {
        var cacheKey = key +"/" + languageId;
        var item = _cache.Get(cacheKey);
        if(item != null)
            return item;

        item = _ctx.Resources.Where(x=>x.Key == key && x.LanguageId == languageId).SingleOrDefault();
        _cache.Set(cacheKey, item);
        return item;
    }
}

And I am pretty sure that a lot of you will be able to find plenty of additional issues that I haven’t thought about.

But there are at least three major issues in the code above. It doesn’t do anything to solve the missing value problem, it doesn’t handle expiring values well, and it has no way to handle changing values.

Look at the code above, and assume that I am making continuous calls to Get(“does not exists”, “nh-YI”), or something like that. The way the code is currently written, it will always hit the database to get that value.

The second problem is that if we have had a cache cleanup run, which expired some values, we will actually load them one at a time, in pretty much the worst possible way from a performance point of view.

Then we have the problem of how to actually handle updating values.

Let us see how we can at least approach this. We will replace the Cache with a ConcurrentDictionary. That means that the data cannot just go away from under us, and since we expect the number of resources to be relatively low, there is no issue with holding all of them in memory.

Because we know we hold all of them in memory, we can be sure that if the value isn’t there, it isn’t in the database either, so we can immediately return null, without checking with the database.

Last, we will add a StartRefreshingResources task, which will do the actual refreshing in an async manner. In other words:

public class LocalizationService
{
    MyEntities _ctx;
    ConcurrentDictionary<Tuple<string,string>,string> _cache = new ConcurrentDictionary<Tuple<string,string>,string>();

    Task _refreshingResourcesTask;

    public LocalizationService(MyEntities ctx)
    {
        _ctx = ctx;
        StartRefreshingResources();
    }

    public void StartRefreshingResources()
    {
        _refreshingResourcesTask = Task.Run(() =>
        {
            foreach(var item in _ctx.Resources)
            {
                _cache[Tuple.Create(item.Key, item.LanguageId)] = item.Text;
            }
        });
    }

    public string Get(string key, string languageId)
    {
        var cacheKey = Tuple.Create(key, languageId);
        string item;
        if(_cache.TryGetValue(cacheKey, out item) || _refreshingResourcesTask.IsCompleted)
            return item;

        item = _ctx.Resources
            .Where(x => x.Key == key && x.LanguageId == languageId)
            .Select(x => x.Text)
            .SingleOrDefault();
        _cache[cacheKey] = item;
        return item;
    }
}

Note that there is a very subtle thing going on here: as long as the async process is running, if we can’t find the value in the cache, we will go to the database to find it. This gives us a good balance between stopping the system entirely during startup/refresh and having the values immediately available.
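
The remaining problem, changing values, can be approached by simply re-running the refresh. Here is a hedged sketch; the trigger (a timer, an admin endpoint, a change notification) is up to you, and deleted resources would still need separate eviction:

// Illustrative: re-run the refresh whenever resources are known to have changed.
public void ResourcesChanged()
{
    // While the new refresh task is running, Get() falls back to the database
    // for any key it cannot find, exactly as it does during startup.
    StartRefreshingResources();
}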

Optimizing the space & time matrix

The following method comes from the nopCommerce project. Take a moment to read it.

public virtual string GetResource(string resourceKey, int languageId,
    bool logIfNotFound = true, string defaultValue = "", bool returnEmptyIfNotFound = false)
{
    string result = string.Empty;
    if (resourceKey == null)
        resourceKey = string.Empty;
    resourceKey = resourceKey.Trim().ToLowerInvariant();
    if (_localizationSettings.LoadAllLocaleRecordsOnStartup)
    {
        //load all records (we know they are cached)
        var resources = GetAllResourceValues(languageId);
        if (resources.ContainsKey(resourceKey))
        {
            result = resources[resourceKey].Value;
        }
    }
    else
    {
        //gradual loading
        string key = string.Format(LOCALSTRINGRESOURCES_BY_RESOURCENAME_KEY, languageId, resourceKey);
        string lsr = _cacheManager.Get(key, () =>
        {
            var query = from l in _lsrRepository.Table
                        where l.ResourceName == resourceKey
                        && l.LanguageId == languageId
                        select l.ResourceValue;
            return query.FirstOrDefault();
        });

        if (lsr != null) 
            result = lsr;
    }
    if (String.IsNullOrEmpty(result))
    {
        if (logIfNotFound)
            _logger.Warning(string.Format("Resource string ({0}) is not found. Language ID = {1}", resourceKey, languageId));
        
        if (!String.IsNullOrEmpty(defaultValue))
        {
            result = defaultValue;
        }
        else
        {
            if (!returnEmptyIfNotFound)
                result = resourceKey;
        }
    }
    return result;
}

I am guessing, but I assume that the intent here is to have a tradeoff between startup time and system responsiveness. If you have LoadAllLocaleRecordsOnStartup set to true, it will load all the data from the database and access it from there. Otherwise, it will load the data a piece at a time.

That is nice, but it offers a single tradeoff, and that isn’t a really good idea. Not only that, but look at how it uses the cache. There are separate entries in the cache for the resources depending on whether they were loaded via GetAllResourceValues() or individually. That leaves the cache with a lot fewer options when it needs to evict items. The cache deciding that it can remove the single bulk entry would result in a very expensive and long query taking place.

Instead, we can do it like this:

public class LocalizationService
{
    MyEntities _ctx;
    Cache _cache;

    public LocalizationService(MyEntities ctx, Cache cache)
    {
        _ctx = ctx;
        _cache = cache;
        Task.Run(() =>
        {
            foreach(var item in _ctx.Resources)
            {
                _cache.Set(item.Key + "/" + item.LanguageId, item.Text);
            }
        });
    }    

    public string Get(string key, string languageId)
    {
        var cacheKey = key +"/" + languageId;
        var item = _cache.Get(cacheKey);
        if(item != null)
            return item;

        item = _ctx.Resources.Where(x=>x.Key == key && x.LanguageId == languageId).SingleOrDefault();
        _cache.Set(cacheKey, item);
        return item;
    }
}

Of course, this has a separate issue, but I’ll discuss that in my next post.

Diagnosing RavenDB problem in production

We have run into a problem in our production system. Luckily, this is a pretty obvious case of misdirection.

Take a look at the stack trace that we have discovered:

[Figure: the exception stack trace]

The interesting bit is that this is an “impossible” error. In fact, this is the first time that we have actually seen this error ever.

But looking at the stack trace tells us pretty much everything. The error happens in Dispose, which is called from the constructor. Because we are using native code, we need to make sure that an exception in the ctor will properly dispose of all unmanaged resources.

The code looks like this:

[Figure: the constructor code]

And here we can see the probable cause of the error. We try to open a transaction, then an exception happens, and Dispose is called, but it isn’t ready to handle this scenario, so it throws.

The original exception is masked, and you have a hard debug scenario.
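
A minimal sketch of the pattern and the fix, with hypothetical helpers standing in for the actual RavenDB code:

public class StorageSession : IDisposable
{
    private IntPtr _handle = IntPtr.Zero;   // stand-in for an unmanaged resource
    private bool _txOpened;

    public StorageSession()
    {
        _handle = AcquireHandle();          // hypothetical native call
        try
        {
            OpenTransaction();              // hypothetical, may throw
            _txOpened = true;
        }
        catch (Exception)
        {
            // Clean up, but never let Dispose() mask the original exception,
            // which is the hard-to-debug scenario described above.
            try { Dispose(); } catch { }
            throw;
        }
    }

    public void Dispose()
    {
        // Must tolerate being called on a half-constructed object.
        if (_txOpened)
            RollbackTransaction();          // hypothetical
        if (_handle != IntPtr.Zero)
        {
            ReleaseHandle(_handle);         // hypothetical
            _handle = IntPtr.Zero;
        }
    }

    // Stubs so the sketch stands alone; the real calls are native.
    private static IntPtr AcquireHandle() { return new IntPtr(1); }
    private void OpenTransaction() { }
    private void RollbackTransaction() { }
    private static void ReleaseHandle(IntPtr handle) { }
}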

The Candy Crush Challenge

Here is an interesting challenge. In Candy Crush (which I do not have a problem with), you have 5 lives to try. Lives renew at a rate of about 1 per 30 minutes. So it is pretty common to get to this stage:

[Figure: Candy Crush out-of-lives screen]

Now, you can go and change your system clock, and then you’ll get 5 more lives, and you can play some more.

Now, there are probably very good reasons why this is done in this manner: it ensures players stay hooked, and it is just inconvenient enough that there is still meaning to the number of lives you have.

However, let us say that we wanted to stop that. How would you go about approaching this? Remember, we are talking about an app on a phone, and it isn’t something super serious if it gets broken, but we want to avoid the super easy workaround.

How would you solve that if this was on a computer, instead of a phone? What are the different considerations? What if this was something that was very important?

Trivial Lru Cache impl

It has been a while since I actually posted some code here, and I thought that this implementation was quite nice, in that it is simple & works for what it needs to do.

 

public class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();

    private class Node
    {
        public TValue Value;
        public volatile Reference<long> Ticks;
    }

    private readonly ConcurrentDictionary<TKey, Node> _nodes = new ConcurrentDictionary<TKey, Node>();

    public LruCache(int capacity)
    {
        Debug.Assert(capacity > 10);
        _capacity = capacity;
    }

    public void Set(TKey key, TValue value)
    {
        var node = new Node
        {
            Value = value,
            Ticks = new Reference<long> { Value = _stopwatch.ElapsedTicks }
        };

        _nodes.AddOrUpdate(key, node, (_, __) => node);
        if (_nodes.Count > _capacity)
        {
            foreach (var source in _nodes.OrderBy(x => x.Value.Ticks.Value).Take(_nodes.Count / 10))
            {
                Node _;
                _nodes.TryRemove(source.Key, out _);
            }
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        Node node;
        if (_nodes.TryGetValue(key, out node))
        {
            node.Ticks = new Reference<long> { Value = _stopwatch.ElapsedTicks };
            value = node.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }
}

Code crimes, because even the Law needs to be broken

In 99.9% of the cases, if you see this, it is a mistake:

[Figure: code snippet – a catch block for OutOfMemoryException]

But I recently came to a place where this is actually a viable solution to a problem. The issue occurs deep inside RavenDB, and it boils down to a managed application not really having any way of assessing how much memory it is using.

What I would really like to be able to do is to specify a heap (using HeapCreate) and then say: “The following objects should be created on this heap”, which would allow me to control exactly how much memory I am using for a particular task. In practice, that isn’t really possible (if you know how, please let me know).

What we do instead is use as much memory as we can, based on the workload that we have. At a certain point, we may hit an Out of Memory condition, which we take to mean not that there is no more memory, but that this is the system’s way of telling us to calm down.

By the time we reach the exception handler, we have already lost all of the current workload, which means that there is now a LOT of garbage around for the GC. Before, when it tried to clean up, we were actually holding on to a lot of that memory, but not anymore.

Once an OOME has happened, we can adjust our own behavior, knowing that we are consuming too much memory and should be more conservative about our operations. We basically reset the clock and become twice as conservative about using more memory. And if we get another OOME? We become twice as conservative again, and so on.

Eventually, even under high workloads and small amount of memory, we complete the task, and we can be sure that we make effective use of the system’s resources available to us.
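
A minimal sketch of that back-off, with hypothetical names; the real RavenDB code is considerably more involved:

public class BackoffProcessor
{
    private int _maxBatchSize = 128 * 1024;

    public void ProcessWithBackoff(List<object> workload)
    {
        try
        {
            // Hypothetical worker that does the memory-hungry part of the job.
            ProcessBatch(workload.Take(_maxBatchSize).ToList());
        }
        catch (OutOfMemoryException)
        {
            // The "calm down" signal: become twice as conservative.
            // The current batch is lost and will be retried with the smaller size.
            _maxBatchSize = Math.Max(256, _maxBatchSize / 2);
        }
    }

    private void ProcessBatch(List<object> batch)
    {
        // Hypothetical: index the documents, build the in-memory structures, etc.
    }
}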

Beware of big Task Parallel Library Operations

Take a look at the following code:

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list, 0);


        Console.WriteLine(task.Result);

    }

    private static Task<int> ProcessList(List<int> list, int pos, int acc = 0)
    {
        if (pos >= list.Count)
        {
            var tcs = new TaskCompletionSource<int>();
            tcs.TrySetResult(acc);
            return tcs.Task;
        }

        return Task.Factory.StartNew(() => list[pos] + acc)
            .ContinueWith(task => ProcessList(list, pos + 1, task.Result))
            .Unwrap();
    }
}

This is a fairly standard piece of code, which does a “complex” async process and then moves on. It is important in this case to do the operations in the order they were given, and the real code is actually doing something that needs to be async (going and fetching some data from a remote server).

It is probably easier to figure out what is going on when you look at the C# 5.0 code:

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list, 0);

        Console.WriteLine(task.Result);

    }

    private async static Task<int> ProcessList(List<int> list, int pos, int acc = 0)
    {
        if (pos >= list.Count)
        {
            return acc;
        }

        var result = await Task.Factory.StartNew(() => list[pos] + acc);

        return await ProcessList(list, pos + 1, result);
    }
}

I played with user-mode scheduling in .NET a few times in the past, and one of the things that I was never able to resolve properly was the issue of stack depth. I hoped that the TPL would resolve it, but it appears that it didn’t. Both code samples here will throw a StackOverflowException when run.

It sucks, quite frankly. I understand why this is done this way, but I am quite annoyed by this. I expected this to be solved somehow. Using C# 5.0, I know how to solve this:

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list);

        Console.WriteLine(task.Result);

    }

    private async static Task<int> ProcessList(List<int> list)
    {
        var acc = 0;
        foreach (var i in list)
        {
            var currentAcc = acc;
            acc += await Task.Factory.StartNew(() => i + currentAcc);
        }
        return acc;
    }
}

The major problem is that I am not sure how to translate this code to C# 4.0. Any ideas?

Memorable code

public class Program
{
    static List<Thread> list = new List<Thread>();
    private static void Main(string[] args)
    {
        var lines = File.ReadAllLines(args[0]);

        foreach (var line in lines)
        {
            var t = new Thread(Upsert)
            {
                Priority = ThreadPriority.Highest,
                IsBackground = true
            };
            list.Add(t);
            t.Start(line);
        }

        foreach (var thread in list)
        {
            thread.Join();
        }

    }

    private static void Upsert(object o)
    {
        var args = o.ToString().Split(',');
        try
        {
            using(var con = new SqlConnection(Environment.CommandLine.Split(' ')[1]))
            {
                var cmd = new SqlCommand
                {
                    Connection = con, 
                    CommandText = "INSERT INTO Accounts VALUES(@p1, @p2, @p3, @p4,@p5)"
                };

                for (var index = 0; index < args.Length; index++)
                {
                    cmd.Parameters.AddWithValue(@"@p" + (index + 1), args[index]);
                }

                try
                {
                    cmd.ExecuteNonQuery();
                }
                catch (SqlException e)
                {
                    if(e.Number == 2627 )
                    {
                        cmd.CommandText = "UPDATE Accounts SET Name = @p2, Email = @p3, Active = @p4, Birthday = @p5 WHERE ID = @p1";
                        cmd.ExecuteNonQuery();
                    }
                }
            }
        }
        catch (SqlException e)
        {
            if(e.Number == 1205)
            {
                var t = new Thread(Upsert)
                {
                    Priority = ThreadPriority.Highest,
                    IsBackground = true
                };
                list.Add(t);
                t.Start(o);
            }
        }
    }
}
Negative hiring decisions, Part II

Another case of a candidate completing a task at home and sending it to me, which resulted in a negative hiring decision, is this:

protected void Button1_Click(object sender, EventArgs e)
{
    string connectionString = @"Data Source=OFFICE7-PC\SQLEXPRESS;Integrated Security=True";
    string sqlQuery = "Select UserName From  [Users].[dbo].[UsersInfo] Where UserName = ' " + TextBox1.Text + "' and Password = ' " + TextBox2.Text+"'";
    
    using (SqlConnection connection = new SqlConnection(connectionString))
    {
        SqlCommand command = new SqlCommand(sqlQuery, connection);
        connection.Open();
        SqlDataReader reader = command.ExecuteReader();
        try
        {
            while (reader.Read())
            {
           
            }
        }
        finally
        {
            if (reader.HasRows)
            {
                reader.Close();
                Response.Redirect(string.Format("WebForm2.aspx?UserName={0}&Password={1}", TextBox1.Text, TextBox2.Text));
            }
            else
            {
                reader.Close();
                Label1.Text = "Wrong user or password";
            }
        }
    }
}

The straw that really broke the camel’s back in this case was the naming of WebForm2. I could sort of figure out the rest, but not bothering to give a real name to the page was over the top.

Is Node.cs a cure for cancer?

This is mainly a tongue-in-cheek post, in reply to this guy. I decided to take his scenario and try it using my Node.cs “framework”. Here is the code:

 

public class Fibonaci : AbstractAsyncHandler
{
    protected override Task ProcessRequestAsync(HttpContext context)
    {
        return Task.Factory.StartNew(() =>
        {
            context.Response.ContentType = "text/plain";
            context.Response.Write(Fibonacci(40).ToString());
        });
    }

    private static int Fibonacci(int n)
    {
        if (n < 2)
            return 1;
        
        return Fibonacci(n - 2) + Fibonacci(n - 1);
    }
}

We start by just measuring how long it takes to serve a single request:

$ time curl http://localhost/Fibonaci.ashx
165580141
real    0m2.763s
user    0m0.000s
sys     0m0.031s

That is 2.7 seconds for a highly compute-bound operation. Now, let us see what happens when we use ApacheBench (ab) to test things a little further:

ab.exe -n 10 -c 5 http://localhost/Fibonaci.ashx

(Make a total of ten requests, maximum of 5 concurrent ones)

And this gives us:

Requests per second:    0.91 [#/sec] (mean)
Time per request:       5502.314 [ms] (mean)
Time per request:       1100.463 [ms] (mean, across all concurrent requests)

Not bad, considering the best the node.js version (on a different machine and hardware configuration) was able to do was 0.17 requests per second.

Just for fun, I decided to try it with a hundred requests, 25 of them concurrent.

Requests per second:    0.97 [#/sec] (mean)
Time per request:       25901.481 [ms] (mean)
Time per request:       1036.059 [ms] (mean, across all concurrent requests)

Not bad at all.

What is the cost of try/catch

I recently got a question about the cost of try/catch, and whether it was prohibitive enough to make you want to avoid using it.

That caused some head scratching on my part, until I got the following reply:

But, I’m still confused about the try/catch block not generating an overhead on the server.

Are you sure about it?

I learned that the try block pre-executes the code, and that’s why it causes a processing overhead.

Take a look here: http://msdn.microsoft.com/en-us/library/ms973839.aspx#dotnetperftips_topic2

Maybe there is something that I don’t know? It is always possible, so I went and checked and found this piece:

Finding and designing away exception-heavy code can result in a decent perf win. Bear in mind that this has nothing to do with try/catch blocks: you only incur the cost when the actual exception is thrown. You can use as many try/catch blocks as you want. Using exceptions gratuitously is where you lose performance. For example, you should stay away from things like using exceptions for control flow.

Note that the emphasis is in the original. There is no cost to try/catch; the only cost is when an exception is thrown, and that is the case regardless of whether there is a try/catch around it or not.

Here is the proof:

var startNew = Stopwatch.StartNew();
var mightBePi = Enumerable.Range(0, 100000000).Aggregate(0d, (tot, next) => tot + Math.Pow(-1d, next)/(2*next + 1)*4);
Console.WriteLine(startNew.ElapsedMilliseconds);

Which results in: 6015 ms of execution.

Wrapping the code in a try/catch resulted in:

var startNew = Stopwatch.StartNew();
double mightBePi = Double.NaN;
try
{
    mightBePi = Enumerable.Range(0, 100000000).Aggregate(0d, (tot, next) => tot + Math.Pow(-1d, next)/(2*next + 1)*4);
}
catch (Exception e)
{
    Console.WriteLine(e);
}
Console.WriteLine(startNew.ElapsedMilliseconds);

And that run in 5999 ms.

Please note that the perf difference is pretty much meaningless (only a 0.26% difference) and is well within the range of deviation between test runs.

A surprise TaskCanceledException

All of a sudden, my code started getting a lot of TaskCanceledExceptions. It took me a while to figure out what was going on. Imagine that the code looked like this:

var unwrap = Task.Factory.StartNew(() =>
{
    if (DateTime.Now.Month % 2 != 0)
        return null;

    return Task.Factory.StartNew(() => Console.WriteLine("Test"));
}).Unwrap();

unwrap.Wait();

The key here is that when Unwrap gets a null task, it will throw a TaskCanceledException, which was utterly confusing to me. It makes sense, because if the task is null there isn’t anything that the Unwrap method can do about it. Although I do wish it would throw something like ArgumentNullException with a better error message.

The correct way to write this code is to have:

var unwrap = Task.Factory.StartNew(() =>
{
    if (DateTime.Now.Month % 2 != 0)
    {
        var taskCompletionSource = new TaskCompletionSource<object>();
        taskCompletionSource.SetResult(null);
        return taskCompletionSource.Task;
    }

    return Task.Factory.StartNew(() => Console.WriteLine("Test"));
}).Unwrap();

unwrap.Wait();

Although I do wish that there was an easier way to create a completed task.
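
As a hedged note, later framework versions made this easier: .NET 4.5 added Task.FromResult, and a tiny helper covers older code:

// .NET 4.5 and later: a completed task in one line.
Task completed = Task.FromResult<object>(null);

// On .NET 4.0, a small helper does the same thing.
static Task CompletedTask()
{
    var tcs = new TaskCompletionSource<object>();
    tcs.SetResult(null);
    return tcs.Task;
}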

Concurrent Max

Can you think of a better way to implement this code?

private volatile Guid lastEtag;
private readonly object lastEtagLocker = new object();
internal void UpdateLastWrittenEtag(Guid? etag)
{
    if (etag == null)
        return;

    var newEtag = etag.Value.ToByteArray();

    // not the most recent etag
    if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
    {
        return;
    }

    lock (lastEtagLocker)
    {
        // not the most recent etag
        if (Buffers.Compare(lastEtag.ToByteArray(), newEtag) <= 0)
        {
            return;
        }

        lastEtag = etag.Value;
    }
}

We have multiple threads calling this function, and we need to ensure that the lastEtag value is always the maximum value. This has the potential to be called often, so I want to make sure that I choose the best way to do this. Ideas?

ConcurrentDictionary.GetOrAdd may call the valueFactory method more than once

Originally posted at 3/28/2011

When you assume, you are making an ass out of yourself, you stupid moronic idiot with no sense whatsoever. The ass in question, if anyone cares to think about it, is yours truly.

Let us take a look at the following code snippet:

var concurrentDictionary = new ConcurrentDictionary<int, int>();
var w = new ManualResetEvent(false);
int timesCalled = 0;
var threads = new List<Thread>();
for (int i = 0; i < Environment.ProcessorCount; i++)
{
    threads.Add(new Thread(() =>
    {
        w.WaitOne();
        concurrentDictionary.GetOrAdd(1, i1 =>
        {
            Interlocked.Increment(ref timesCalled);
            return 1;
        });
    }));
    threads.Last().Start();
}

w.Set(); // release all threads to start at the same time
foreach (var thread in threads)
{
    thread.Join();
}

Console.WriteLine(timesCalled);

What would you say would be the output of this code?

Well, I assumed that it would behave in an atomic fashion, and that the implementation was something like:

if(TryGetValue(key, out value))
   return value;

lock(this)
{
   if(TryGetValue(key, out value))
      return value;

   AddValue( key, valueFactory());
}

Of course, the whole point of the ConcurrentDictionary is that there are no locks. Well, that is nice, except that because I assumed that the factory is only called once, I called it with a function that had side effects when called twice.

That was pure hell to figure out, because in my mind, of course, there was no error in this function.
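
A common way to get once-only semantics back, shown here as a hedged sketch, is to store Lazy<T> values: GetOrAdd may still race on creating the wrappers, but the factory inside the winning Lazy runs at most once (ExpensiveFactoryWithSideEffects is hypothetical):

var dictionary = new ConcurrentDictionary<int, Lazy<int>>();

int value = dictionary.GetOrAdd(1,
    key => new Lazy<int>(() => ExpensiveFactoryWithSideEffects(key),
                         LazyThreadSafetyMode.ExecutionAndPublication)).Value;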

Synchronization primitives, MulticastAutoResetEvent

I have a very interesting problem within RavenDB. I have a set of worker processes that all work on top of the same storage. Whenever a change happens in the storage, they wake up and start working on it. The problem is that this change may happen while the worker process is busy doing something other than waiting for work, which means that using Monitor.PulseAll, which is what I was using, isn’t going to work.

AutoResetEvent is what you are supposed to use in order to avoid losing updates on the lock, but in my scenario I don’t have a single worker; I have a set of workers. And I really wanted to be able to use PulseAll to release all of them at once. I started looking at holding arrays of AutoResetEvents, keeping track of all changes in memory, etc. But none of it really made sense to me.

After thinking about it for a while, I realized that we are actually looking at a problem of state, and we can solve that by having the client hold the state. This led me to write something like this:

public class MultiCastAutoResetEvent 
{
    private readonly object waitForWork = new object();
    private int workCounter = 0;
    
    
    public void NotifyAboutWork()
    {
        Interlocked.Increment(ref workCounter);
        lock (waitForWork)
        {
            Monitor.PulseAll(waitForWork);
            Interlocked.Increment(ref workCounter);
        }
    }
    
    
    public void WaitForWork(TimeSpan timeout, ref int workerWorkCounter)
    {
        var currentWorkCounter = Thread.VolatileRead(ref workCounter);
        if (currentWorkCounter != workerWorkCounter)
        {
            workerWorkCounter = currentWorkCounter;
            return;
        }
        lock (waitForWork)
        {
            currentWorkCounter = Thread.VolatileRead(ref workCounter);
            if (currentWorkCounter != workerWorkCounter)
            {
                workerWorkCounter = currentWorkCounter;
                return;
            }
            Monitor.Wait(waitForWork, timeout);
        }
    }
}

By forcing the client to pass us the most recently visible state, we can efficiently tell whether they still have work to do or whether they have to wait.
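
A hedged sketch of how a worker would consume this; the worker-side names are illustrative:

// Illustrative worker loop: each worker owns its own counter, which is the
// "state held by the client" described above.
var notifier = new MultiCastAutoResetEvent();
int myWorkCounter = 0;
var keepRunning = true; // flipped to false by whatever shuts the worker down

while (keepRunning)
{
    // Returns immediately if NotifyAboutWork was called since we last looked,
    // otherwise blocks for up to one second.
    notifier.WaitForWork(TimeSpan.FromSeconds(1), ref myWorkCounter);

    ProcessAnyPendingWork(); // hypothetical: drain whatever work is in storage
}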

JAOO: The Go Programming Language

Currently sitting in the Go language keynote in JAOO.

Go was created to handle Google’s needs:

  • Compilation speed
  • Efficient
  • Fit for the current set of challenges we face

Go is meant to be simple, orthogonal, succinct. It uses C syntax, but leaves a lot of the complexity behind.

This makes sense, because C/C++ is horrible for a lot of reasons, largely because of the complexity added to it over the years. Actually, the major problem is that the C/C++ mindset was literally designed for a different era. Take into account how long it can take to build a C++ application (I have seen not-too-big apps that had half-hour build times!), and you can figure out why Google wanted a better model.

Nice quotes:

  • Programming shouldn’t be a game of Simon Says
  • Difference between seat belts & training wheels – I am going to use that a lot.

I find Go’s types annoying:

  • [variable name] [variable type] vs. [variable type] [variable name]
  • And []int vs. int[]

I like the fact that it uses garbage collection in native code, especially since if you reference a variable (including a local), it will live as long as there is a reference to it. That avoids a common mistake in C.

It also has the notion of deferring code. This is similar to how you can use RAII in C++, but much more explicit, which is good, because RAII is always tricky. It deals very cleanly with the need to dispose of things.

All methods look like extension methods. Data is held in structures, not in classes. There is support for embedding structures inside one another, but Go isn’t an OO language.

Interfaces work for all types, and interfaces are satisfied implicitly! Much easier to work with; think about them like super dynamic interfaces, but strongly typed. It implies that we can retrofit things afterward.

Error handling is important, and it is one of the major reasons that I moved from C++ to .NET. Go has two modes: the first is error codes, which they use for most errors. This is especially nice since Go has multiple return values, so it is very simple to use. But it also has the notion of panics, which are similar to exceptions. Let us take a look at the following test code:

package main

func test() {
  panic("ayende")
}

func main() {
    test()
}

Which generates the following panic:

panic: ayende

panic PC=0x2ab34dd47040
runtime.panic+0xb2 /sandbox/go/src/pkg/runtime/proc.c:1020
    runtime.panic(0x2ab300000000, 0x2ab34dd470a0)
main.test+0x47 /tmp/gosandbox-a9aaff6c_68e6b411_26fb5255_aa397bb1_6299a954/prog.go:5
    main.test()
main.main+0x18 /tmp/gosandbox-a9aaff6c_68e6b411_26fb5255_aa397bb1_6299a954/prog.go:8
    main.main()
mainstart+0xf /sandbox/go/src/pkg/runtime/amd64/asm.s:78
    mainstart()
goexit /sandbox/go/src/pkg/runtime/proc.c:145
    goexit()

I got my stack trace, so everything is good. Recovering is more tricky. There isn’t a notion of try/catch, because panics aren’t really exceptions, but you can recover from panics:

package main

import (
    "fmt"
)

func badCall() {
    panic("ayende")
}

func test() {
    defer func() {
        if e := recover(); e != nil {
            fmt.Printf("Panicing %s\r\n", e)
        }
    }()
    badCall()
    fmt.Printf("After bad call\r\n")
}

func main() {
    fmt.Printf("Calling test\r\n")
    test()
    fmt.Printf("Test completed\r\n")
}

Which will result in the following output:

Calling test
Panicing ayende
Test completed

All in all, if I need to write native code, I’ll probably go with Go instead of C now, especially since the ideas that it has about multi-threading are so compelling. I just wish that the Windows port would be completed soon.

It ain’t no simple feature, mister

I recently got a bug report about NH Prof in a multi-monitor environment. Now, I know that NH Prof works well in multi-monitor environments, because I frequently run in such an environment myself.

The problem turned out to be not multiple monitors in and of themselves, but rather how NH Prof handles the removal of a monitor. It turns out that NH Prof has a nice little feature that remembers the last position the window was at and returns to it on start. When the monitor NH Prof was located on was removed, NH Prof would, on start, put itself beyond the edge of the screen.

That led to me having to figure out how to find the available monitor space, so I could detect whether the saved position was still valid. What I found interesting in this is that what seemed to be a very trivial feature (save two numbers) turned out to be somewhat more complex, and I am pretty sure that there are other scenarios that I am missing (in the very same feature).
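
A hedged sketch of the kind of check involved, using the WinForms Screen API (NH Prof is a WPF application, so the actual code is likely somewhat different):

using System.Drawing;
using System.Windows.Forms;

static class WindowPlacement
{
    // True if the saved top-left point still falls on some attached monitor.
    public static bool IsSavedPositionVisible(Point savedTopLeft)
    {
        foreach (var screen in Screen.AllScreens)
        {
            if (screen.WorkingArea.Contains(savedTopLeft))
                return true;
        }
        // The monitor was removed; fall back to a default position.
        return false;
    }
}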

Implementing CreateSequentialUuid()

We ran into an annoying problem in Raven regarding the generation of sequential GUIDs. Those are used internally to represent the etag of a document.

For a while, we used the Win32 UuidCreateSequential() method to generate them. But we ran into a severe issue with that: it creates sequential GUIDs only as long as the machine is up. After a reboot, the GUIDs are no longer sequential. That is bad, but it also means that two systems calling this API can get drastically different results (duh! that is the point, pretty much, isn’t it?). That wouldn’t bother me, except that we use etags to calculate the freshness of an index, so we have to have an always-incrementing number.

How would you implement this method?

public static Guid CreateSequentialUuid()

A few things to note:

  • We really actually care about uniqueness here, but only inside a single process, not globally.
  • The results must always be incrementing.
  • The always incrementing must be consistent across machine restarts and between different machines.

Yes, I am fully aware of NHibernate’s guid.comb implementation that creates sequential GUIDs. It isn’t applicable here, since it doesn’t create truly sequential GUIDs, only GUIDs that sort near one another.

LightSwitch: Initial thoughts

As promised, I intend to spend some time today with LightSwitch and see how it works. Expect a series of posts on the topic. In order to make this a realistic scenario, I decided that a simple app recording animals and their feed schedule is appropriately simple.

I created the following table:

[Figure: the table definition]

Note that it has a calculated field, which is computed using:

[Figure: the calculated field code]

There are several things to note here:

  • ReSharper doesn’t work with LightSwitch, which is a big minus to me.
  • The decision to use partial methods has resulted in really ugly code.
  • Why is the class called Animals? I would expect to find an inflector at work here.
  • Yes, the actual calculation is crap, I know.

This error kept appearing at random:

[Figure: the error dialog]

It appears to be a known issue, but it is incredibly annoying.

This is actually really interesting:

[Figure: debug mode notice]

  • You can’t really work with the app unless you are running in debug mode. That isn’t the way I usually work, so it is a bit annoying.
  • More importantly, it confirms that this is indeed KittyHawk, which was a secret project in 2008 MVP Summit that had some hilarious aspects.

There is something else that is really interesting: it takes roughly 5 – 10 seconds to start a LS application. That is a huge amount of time. I am guessing, but I would say that a lot of that is because the entire UI is built dynamically from the data source.

That would be problematic, but acceptable, except that it takes seconds to load data even after the app has been running for a while. For example, take a look here:

[Figure: a partially loaded screen]

This is running on a quad core, 8 GB machine, in 2-tier mode. It takes about 1 – 2 seconds to load each screen. I was actually able to capture a screen halfway loaded. Yes, it is a beta, I know. Yes, perf probably isn’t a priority yet, but that is still worrying.

Another issue is that Visual Studio is very slow, busy about 50% of the time, whether the LS app is running or not. As a side issue, it is hard to know if the problem is with LS or with VS, because of all the problems that VS has normally.

[Figure: Visual Studio busy indicator]

As an example of that, this is me trying to open the UserCode; it took about 10 seconds to do so.

What I like about LS is that getting to a working CRUD sample is very quick. But the problems are pretty big, even on a cursory examination. More detailed posts touching each topic are coming shortly.

Runtime code compilation & collectible assemblies are no go

The problem is quite simple: I want to be able to support certain operations in Raven. In order to support those operations, the user needs to be able to submit a linq query to the server. In order to allow this, we need to accept a string, compile it and run it.

So far, it is pretty simple. The problem begins when you consider that assemblies can’t be unloaded. I was very hopeful when I learned about collectible assemblies in .NET 4.0, but they focus exclusively on assemblies generated from System.Reflection.Emit, while my scenario is compiling code on the fly (I invoke the C# compiler to generate an assembly, then use that).

Collectible assemblies don’t help in this case. Maybe, in C# 5.0, the compiler will use SRE, which would help, but I don’t hold much hope there. I also checked out the Mono.CSharp assembly, hoping that maybe it could do what I wanted, but it suffers from the memory leak as well.

So I turned to the one solution that I knew would work: generating those assemblies in another app domain, and unloading that app domain when it became too full. I kept thinking that I couldn’t do that because of the slowdown of cross app domain communication, but then I figured that I was violating one of the first rules of performance: you don’t know until you measure it. So I set out to test it.

I am only interested in testing the speed of cross app domain communication, not anything else, so here is my test case:

public class RemoteTransformer : MarshalByRefObject
{
    private readonly Transformer transformer = new Transformer();

    public JObject Transform(JObject o)
    {
        return transformer.Transform(o);
    }
}

public class Transformer
{
    public JObject Transform(JObject o)
    {
        o["Modified"] = new JValue(true);
        return o;
    }
}

Running things in the same app domain (base line):

static void Main(string[] args)
{
    var t = new RemoteTransformer();
    
    var startNew = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        var jobj = new JObject(new JProperty("Hello", "There"));

        t.Transform(jobj);

    }

    Console.WriteLine(startNew.ElapsedMilliseconds);
}

This consistently gives results under 200 ms (185ms, 196ms, etc). In other words, we are talking about over 500 operations per millisecond.

What happens when we do this over an AppDomain boundary? The first problem I ran into was that the Json objects weren’t serializable, but that was easy to fix. Here is the code:

 static void Main(string[] args)
 {
    var appDomain = AppDomain.CreateDomain("remote");
    var t = (RemoteTransformer)appDomain.CreateInstanceAndUnwrap(typeof(RemoteTransformer).Assembly.FullName, typeof(RemoteTransformer).FullName);
    
    var startNew = Stopwatch.StartNew();
     
     for (int i = 0; i < 100000; i++)
     {
         var jobj = new JObject(new JProperty("Hello", "There"));

         t.Transform(jobj);

     }

     Console.WriteLine(startNew.ElapsedMilliseconds);
 }

And that runs in close to 8 seconds (7,871 ms). That is over 40 times slower, or just about 12 operations per millisecond.

To give you some indication about the timing, this means that an operation over 1 million documents would spend about 1.3 minutes just serializing data across app domains.

That is… long, but it might be acceptable, I need to think about this more.

LightSwitch: The Return Of The Secretary

Microsoft LightSwitch is a new 4GL tool from Microsoft; this is another in the series of “you don’t have to write any code” tools that I have seen.

Those are the tools that will give the secretary the ability to create applications and eliminate the need for coders. The industry has been chasing such tools since the 80s (does anyone remember the promises of CASE tools?). We have seen many attempts at this, and all of them have run into a wall pretty quickly.

Oh, you can build a tool that gives you a UI on top of a data store pretty easily. And you can go pretty far with it, but eventually your ability to point & click hits its limit, and you have to write code. And that is where things totally break down.

LightSwitch is not yet publicly available, so I have to rely on the presentation that Microsoft published. And I can tell you that I am filled with dread, based on what I have seen.

First of all, I strongly object to the following slide, because I have the experience to know that working with a tool like that is akin to doing back flips with a straitjacket on.

[Figure: presentation slide]

The capabilities of the tool that were shown in the presentation strongly underwhelmed me in terms of newness, complexity or applicability.

Yeah, a metadata-driven UI. Yeah, it can do validation on a phone number automatically (really, what happens with my Israeli phone number?), etc. What is worse, even through the demo I got the very strong feeling that the whole thing is incredibly slow; you can see multi-second delays between screen repaints in the presentation.

Then there are things like “it just works as a web app or a Windows app”, which is another pipe dream that the industry has been chasing for a while. And the only piece of code that I have seen is this guy:

[Figure: code snippet from the presentation]

Which makes me want to break down and cry.

Do you know why? Because this is going to be the essence of a SELECT N+1 in any system, because this code is going to run once per row in the grid. And when I can find bugs just from watching a presentation, you know that there are going to be more issues.

So, just for fun’s sake, since I don’t have the bits and can rely only on the presentation, I decided to make a list of all the things that are likely to be wrong with LightSwitch.

I’ll review it when it comes out, and if it does manage to do everything that it promises and still be a tool usable by developers, I’ll have to eat crow (well, Raven :-) ), but I am not overly worried.

Here are a few areas where I am feeling certain things will not work right:

  • Source control – how do I diff two versions of the app to see what changed? Are all changes diffable?
  • Programmatic modifications:
    • what happens when I want to write some code to do custom validation of a property (for instance, calling a web service)?
    • what happens when I want to put a custom control on the screen (for instance, a Google Maps widget)?
  • Upsizing – when it gets to 1,000 users and we need a full-blown app, how hard is it to do?
  • Performance – as I said, I think it is slow from the demo.
  • Data access behavior – from what I have seen so far, I am willing to bet that it hits its data store pretty heavily.

I fully recognize that there is a need for such a tool, make no mistake. And giving users the ability to do that is important. What I strongly object to is the notion that it would be useful for developers writing real apps, rather than just forms over data. To put it simply, simple forms over data is a solved problem. There are a large number of tools out there to do that, from Access to Oracle Apex to FoxPro. Hell, most CRM solutions will give you just that.

My concern is that there seems to be an emphasis on it being useful for developers as well, and I strongly doubt that.