Ayende @ Rahien

Refunds available at head office

Bug hunting in a massively multi threaded environment

We got a really nasty bug report from a user. Sometimes, out of the blue, RavenDB would throw an error:

System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
at Raven.Json.Linq.RavenJArray.WriteTo(JsonWriter writer, JsonConverter[] converters) in c:\Builds\RavenDB-Stable\Raven.Abstractions\Json\Linq\RavenJArray.cs:line 174
at Raven.Json.Linq.RavenJObject.WriteTo(JsonWriter writer, JsonConverter[] converters) in c:\Builds\RavenDB-Stable\Raven.Abstractions\Json\Linq\RavenJObject.cs:line 275 at Raven.Json.Linq.RavenJObject.WriteTo(JsonWriter writer, JsonConverter[] converters) in c:\Builds\RavenDB-Stable\Raven.Abstractions\Json\Linq\RavenJObject.cs:line 275

Unfortunately, this error only happened once in a while, usually when the system was under load. But they weren’t able to provide a repro for that.

Luckily, they were able to tell us that they suspected that this is related to the replication support. I quickly setup a database with replication and wrote the following code to try to reproduce this:

using(var store = new DocumentStore
{
    Url = "http://localhost:8080",
    DefaultDatabase = "hello"
}.Initialize())
{
    using(var session = store.OpenSession())
    {
        session.Store(new ReadingList
        {
            UserId = "test/1",
            Id = "lists/1",
            Books = new List<ReadingList.ReadBook>()
        });
        session.SaveChanges();
    }
    Parallel.For(0, 100, i =>
    {
        while (true)
        {
            try
            {
                using (var session = store.OpenSession())
                {
                    session.Advanced.UseOptimisticConcurrency = true;
                    session.Load<ReadingList>("lists/1")
                            .Books.Add(new ReadingList.ReadBook
                            {
                                ReadAt = DateTime.Now,
                                Title = "test " + i
                            });
                    session.SaveChanges();
                }
                break;
            }
            catch (ConcurrencyException)
            {
                
            }
        }
    });
}

And that reproduced the bug! Hurrah! Done deal, we can move on, right?

Except that the bug was only there when we have massive amount of threads hitting the db at once, and trying to figure out what is actually going on there was next to impossible using standard debugging tools. Instead, I reached down to my tracing toolbelt and starting pulling stuff out. First, we identified that the problem occurred when iterating over RavenJArray, which is our own object, so we added the following:

        ConcurrentQueue<StackTrace>  addStackTraces = new ConcurrentQueue<StackTrace>();

        public void Add(RavenJToken token)
        {
            if (isSnapshot)
                throw new InvalidOperationException("Cannot modify a snapshot, this is probably a bug");

            addStackTraces.Enqueue(new StackTrace(true));

            Items.Add(token);
        }

And this one (where the exception is raised):

public override void WriteTo(JsonWriter writer, params JsonConverter[] converters)
{
    writer.WriteStartArray();

    if (Items != null)
    {
        try
        {
            foreach (var token in Items)
            {
                token.WriteTo(writer, converters);
            }
        }
        catch (InvalidOperationException e)
        {
            foreach (var stackTrace in addStackTraces)
            {
                Console.WriteLine(stackTrace);
            }
            throw;
        }
    }

    writer.WriteEndArray();
}

With the idea that we would actually be able to get what is going on there. By tracking down who added items to this particular instance, I hoped that I would be able to figure out why we have an instance that is shared among multiple threads.

When I had that, it was pretty easy to see that it was indeed the replication bundle that was causing the issue. The problem was that the replication bundle was modifying an inner array inside the document metadata. We protected the root properties of the metadata from concurrent modifications, and most of the time, it works just fine. But the problem was that now we had a bundle that was modifying a nested array, which wasn’t protected.

This is one of those bugs that are really hard to catch:

  • My understanding of the code said that this is not possible, since I believed that we protected the nested properties as well*.
  • This bug will only surface if and only if:
    • You have the replication bundle enabled.
    • You have a great deal of concurrent modifications (with optimistic concurrency enabled) to the same document.
    • You are unlucky.

I was grateful that the user figured out the replication connection, because I already sat on that bug previously, and there was no way I could figure out what is going on unless I had the trace to point me to where the actual problem was.

Production issue: ASP.Net Cache kills the application

In one of our production deployments, we occasionally get a complete server process crash. Investigating the event log, we have this:

Exception: System.InvalidOperationException

Message: Collection was modified; enumeration operation may not execute.

StackTrace:    at System.Collections.Generic.Dictionary`2.KeyCollection.Enumerator.MoveNext()

   at System.Web.Hosting.ObjectCacheHost.TrimCache(Int32 percent)

   at System.Web.Hosting.HostingEnvironment.TrimCache(Int32 percent)

   at System.Web.Hosting.HostingEnvironment.TrimCache(Int32 percent)

   at System.Web.Hosting.ApplicationManager.TrimCaches(Int32 percent)

   at System.Web.Hosting.CacheManager.CollectInfrequently(Int64 privateBytes)

   at System.Web.Hosting.CacheManager.PBytesMonitorThread(Object state)

   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)

   at System.Threading._TimerCallback.PerformTimerCallback(Object state)

As you can see, this is a case of what appears to be a run of the mill race condition, translated to a process killing exception because it was thrown from a separate thread.

This thread, by the way, is the ASP.Net Cache cleanup thread, and we have no control whatsoever over that. To make things worse, this application doesn’t even use the ASP.NET Cache in any way shape or form.

Any ideas how to resolve this would be very welcome.

Tags:

Published at

Originally posted at

Comments (33)

Silverlight streaming - the race condition is already included

I am not sure how to call this issue, except maddening. For a simple repro, you can check this github repository.

The story is quite simple, let us assume that you need to send a set of values from the server to the client. For example, they might be tick values, or updates, or anything of this sort.

You can do this by keeping an HTTP connection open and sending data periodically. This is a well known technique and it works quite well. Except in Silverlight, where it works, but only if you put the appropriate Thread.Sleep() in crucial places.

Here is an example of the server behavior:

var listener = new HttpListener
{
    Prefixes = {"http://+:8080/"}
};
listener.Start();
while (true)
{
    var ctx = listener.GetContext();
    using (var writer = new StreamWriter(ctx.Response.OutputStream))
    {
        writer.WriteLine("first\r\nsecond");
        writer.Flush();
    }
    Console.ReadKey();
}

In this case, note that we are explicitly flushing the response, then just wait. If you look at the actual network traffic, you can see that this will actually be sent, the connection will remain open, and we can actually send additional data as well.

But how do you consume such a thing in Silverlight?

var webRequest = (HttpWebRequest)WebRequestCreator.ClientHttp.Create(new Uri("http://localhost:8080/"));
webRequest.AllowReadStreamBuffering = false;
webRequest.Method = "GET";

Task.Factory.FromAsync<WebResponse>(webRequest.BeginGetResponse, webRequest.EndGetResponse, null)
    .ContinueWith(task =>
    {
        var responseStream = task.Result.GetResponseStream();
        ReadAsync(responseStream);
    });

 

We start by making sure that we disable read buffering, then we get the response and start reading from it. The read method is a bit complex, because is has to deal with partial response, but it should still be fairly obvious what is going on:

byte[] buffer = new byte[128];
private int posInBuffer;
private void ReadAsync(Stream responseStream)
{
    Task.Factory.FromAsync<int>(
        (callback, o) => responseStream.BeginRead(buffer, posInBuffer, buffer.Length - posInBuffer, callback, o),
        responseStream.EndRead, null)
        .ContinueWith(task =>
        {
            var read = task.Result;
            if (read == 0) 
                throw new EndOfStreamException();
            // find \r\n in newly read range

            var startPos = 0;
            byte prev = 0;
            bool foundLines = false;
            for (int i = posInBuffer; i < posInBuffer + read; i++)
            {
                if (prev == '\r' && buffer[i] == '\n')
                {
                    foundLines = true;
                    // yeah, we found a line, let us give it to the users
                    var data = Encoding.UTF8.GetString(buffer, startPos, i - 1 - startPos);
                    startPos = i + 1;
                    Dispatcher.BeginInvoke(() =>
                    {
                        ServerResults.Text += data + Environment.NewLine;
                    });
                }
                prev = buffer[i];
            }
            posInBuffer += read;
            if (startPos >= posInBuffer) // read to end
            {
                posInBuffer = 0;
                return;
            }
            if (foundLines == false)
                return;

            // move remaining to the start of buffer, then reset
            Array.Copy(buffer, startPos, buffer, 0, posInBuffer - startPos);
            posInBuffer -= startPos;
        })
        .ContinueWith(task =>
        {
            if (task.IsFaulted)
                return;
            ReadAsync(responseStream);
        });
}

While I am sure that you could find bugs in this code, that isn’t the crucial point.

If we run the server, then run the SL client, we could see that we get just one lousy byte, and that is it. Now, reading about this, it appears that in some versions of some browsers, you need to send 4KB of data to get the connection going. But that isn’t what I have observed. I tried sending 4KB+ of data, and I still saw the exact same behavior, we got called for the first byte, and nothing else.

Eventually, I boiled it down to the following non working example:

writer.WriteLine("first");
writer.Flush();
writer.WriteLine("second");
writer.Flush();

Versus this working example:

writer.WriteLine("first");
writer.Flush();
Thread.Sleep(50);
writer.WriteLine("second");
writer.Flush();

Yes, you got it right, if I put the thread sleep in the server, I’ll get both values in the client. Without the Thread.Sleep, we get only the first byte. It seems like it isn’t an issue of size, but rather of time, and I am at an utter loss to explain what is going on.

Oh, and I am currently awake for 27 hours straight, most of them trying to figure out what the )(&#@!)(DASFPOJDA(FYQ@YREQPOIJFDQ#@R(AHFDS:OKJASPIFHDAPSYUDQ)(RE is going on.

Tags:

Published at

Originally posted at

Comments (22)

The HIGH cost of ConcurrentBag in .NET 4.0

I got some strange results when using concurrent collections, so I decided to try to track it down, and wrote the following code:

var count = ?;
var list = new List<object>(count);
var sp = Stopwatch.StartNew();
for (int i = 0; i < count; i++)
{
    list.Add(new ConcurrentBag<int>());
}
sp.Stop();
Console.WriteLine("{0} {2} items in {1:#,#;;0}ms = {3:#,#;;0}ms per item",
    sp.Elapsed, sp.ElapsedMilliseconds, count, sp.ElapsedMilliseconds / count);

And then I started to play with the numbers, and it is not good.

  • 10 items in 2ms = 0ms per item

This is incredibly high number, you have to understand. Just to compare, List<int> takes 8 ms to create 100,000 items.

Let us see how it works when we use more of this.

  • 100 items in 5ms = 0ms per item
  • 1,000 items in 37ms = 0ms per item
  • 10,000 items in 2,319ms = 0ms per item

Note the numbers, will you?

1,000 items in 37 ms, but 10,000 items? 2.3 seconds!

  • 20,000 items in 21,331ms = 1ms per item

And doubling the amount took ten times as long?

  • 25,000 items in 32,588ms = 1ms per item

And at this point, I stopped trying, because I didn’t have the patience.

Note that the other concurrent collection, ConcurrentStack, ConcurrentQueue and ConcurrentDictionary do not suffer from the same problem.

I contacted Microsoft about this, and this is already resolved in .NET 4.5. The underlying issue was that ThreadLocal, which ConcurrentBag uses, didn’t expect to have a lot of instances. That has been fixed, and now can run fairly fast.

Tags:

Published at

Originally posted at

Comments (10)

And this is Ironic

I was trying to press the Yes button, and this happened:

image

To be fair, I am trying to get it to read my mail, and it is a bit… largish, I guess:

image

Tags:

Published at

Originally posted at

Comments (4)

A bad test

image

This is a bad test, because what it does is ensuring that something does not works. I just finished implementing the session.Advaned.Defer support, and this test got my attention by failing the build.

Bad test, you should be telling me when I broke something, not when I added new functionality.

Tags:

Published at

Originally posted at

Comments (16)

Non overlapping time periods–because I like the pain of 2 AM wakeup calls

This post is partly in response for this post, discussing the Azure problem with leap year. But it is actually a bit more general than that.

In my code, here is how I define “one year from now”:

DateTime.Today.AddYears(1).AddDays(3);

As it turned out, this tend to have a lot of implications on your business, most of them are actually pretty good ones.

For a start, you will never get hit with a leap year bug, but more importantly, you are never going to have to deal with an immediate cutoff. This is important because it gives you time. Mostly, it gives you time to screw up, but having the time to do so without having an egg all of your face is a really nice thing.

For example, all of our subscriptions are using a similar method of calculation, and this is why we can take the ordering system down for a few hours or even a day or two and no one will actually notice. We have a big grace period in which we can work things out.

Sure, a user gets 3 “extra” days for free out of this, but frankly, I don’t give a damn. It is more important that I get the buffer, and most users like it much better when you don’t slam the doors in their faces on the first chance.

One of the things that is important is style, and giving a grace period for those sort of things is crucial.

Tags:

Published at

Originally posted at

Comments (5)

Rotten Scheduling: Don’t roll your own

“We need to run a specific task every 72 hours, I thought about this approach…”

public class TimedTask
{
  public static Timer Timer;
  
  public static void Init()
  {
    Timer = new Timer(ExecuteEvery72Hours, null, TimeSpan.FromHours(72), TimeSpan.FromHours(72));
  }
  
  public static void ExecuteEvery72Hours()
  {
    // do something important
  }
}

This is a bloody rotten idea, let us see why…

  • What happens if your application is recycled every 29 hours?
  • What happens if your application is always on, but during that 72 hour call, it was offline?
  • What happens if your task actually takes more than 72 hours to run?
  • What happens if the task fails?
  • How do you report errors, warnings, etc?

Scheduling is a hard problem. There are a lot of things that you actually need to consider. And the code above is really considering none of them. I would be very surprised if something like that ever run. in production. It most certainly can’t be made to run reliably.

Things that run every X time, where X is a long time (hours / days) tend to be pretty important. In some of the systems that we wrote, that include doing things like updating VAT and interest rates, pulling from external source, generating the weekly report, etc.

You do not want this to be messed up.

If you need to do anything like that, make use of the builtin scheduling features of the OS you are running on (Windows Task Scheduler is an amazingly full featured, and cron isn’t bad if you are running on Linux). If you still insist on doing this in code, at the very least do something like this:

public class TimedTask
{
  public static Timer Timer;
  
  public static void Init()
  {
    Timer = new Timer(()=>
    {
      if( (DateTime.UtcNow - GetLastExecutedTime()) > 72)
        ExecuteEvery72Hours();
    }, null, TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1));
  }
  
  public static void ExecuteEvery72Hours()
  {
    // do something important
  }
}

This still has a lot of problems, but at least it it solving some crucial problems for you (note that GetLastExecutedTime has to be a persisted value).

Of course, if you need something like this in your code, you have better use something like Quartz, instead. Don’t roll your own. Sure, this is ten lines of code to do so, but as I said, this is the very basics, and it gets complex really fast.

Tags:

Published at

Originally posted at

Comments (19)

Rotten Scheduling

“We need to run a specific task every 72 hours, I thought about this approach…”

public class TimedTask
{
  public static Timer Timer;
  
  public static void Init()
  {
    Timer = new Timer(ExecuteEvery72Hours, null, TimeSpan.FromHours(72), TimeSpan.FromHours(72));
  }
  
  public static void ExecuteEvery72Hours()
  {
    // do something important
  }
}

Let us assume that this code is being called properly. Why is this a bloody rotten idea?

Tags:

Published at

Originally posted at

Comments (28)

The best multi threading debugging tool is Microsoft Excel

In any system that gets to a certain size, especially one that is multi threaded, there are a certain class of bugs that are really a bitch to figure out.

They are usually boil down to some sort of a race condition. I have been doing that recently, trying to fix a bunch of race conditions in RavenDB. The problem with race conditions is that they are incredibly hard to reproduce, and even when you can reproduce them, you can’t really debug them.

I came up with the following test harness to try to narrow things down. Basically, create a test case that sometimes fails, and run it a thousand times.

Odds are that you’ll get the failure, but that isn’t the important bit. The important bit is that you setup logging properly. In my case, I set things up so the logging output to CSV and each test run had a different file.

This gives me output that looks like this:

23:54:34.5952,Raven.Database.Server.HttpServer,Debug,Request #  10: GET     -    12 ms - <default>  - 200 - /indexes/dynamic/Companies?query=&start=0&pageSize=1&aggregation=None,,11
23:54:34.5952,Raven.Client.Document.SessionOperations.QueryOperation,Debug,"Stale query results on non stale query '' on index 'dynamic/Companies' in 'http://reduction:8079/', query will be retried, index etag is: bcf0fed1-d975-43b4-bfb7-65221ef06b99",,1
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 10, for: TasksExecuter, will wait for additional work",,6
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,Incremented work counter to 11 because: WORK BY IndexingExecuter,,12
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 11, for: TasksExecuter, will wait for additional work",,6
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 11, for: ReducingExecuter, will wait for additional work",,21
23:54:34.5952,Raven.Database.Indexing.WorkContext,Debug,"No work was found, workerWorkCounter: 11, for: IndexingExecuter, will wait for additional work",,12
23:54:34.6952,Raven.Client.Document.SessionOperations.QueryOperation,Debug,Executing query '' on index 'dynamic/Companies' in 'http://reduction:8079/',,1
23:54:34.8172,Raven.Database.Indexing.Index.Querying,Debug,Issuing query on index Temp/Companies for all documents,,11
23:54:34.8172,Raven.Storage.Esent.StorageActions.DocumentStorageActions,Debug,Document with key 'companies/1' was found,,11

Not really helpful in figure out what is going on, right? Except for one tiny thing, we load them to Excel, and we use conditional formatting to get things to look like this:

image

The reason this is so helpful? You can actually see the threads interleaving. This usually help me get to roughly the right place, and then I can add additional logging so I can figure out better what is actually going on.

I was able to detect and fix several race conditions using this approach.

And yes, I know that this is basically printf() debugging. But at least it is printf() debugging with pretty colors.

Tags:

Published at

Originally posted at

Comments (11)

Thou shall not delete

Ouch!

public ActionResult DeleteComment(int id)
{
  var userComment = RavenSession.Load<UserComment>(id);

  if (userComment == null)
    return new HttpStatusCodeResult(204);

  var user = RavenSession.GetUser(User.Identity.Name);
  if(user == null || (user.Role != UserRole.Moderator && user.Role != UserRole.Admin))
    return new HttpStatusCodeResult(403, "You must be logged in as moderator or admin to be able to delete comments");

  RavenSession.Delete(user);

  return new HttpStatusCodeResult(204);
}
Tags:

Published at

Originally posted at

Comments (34)

What is wrong here? Solution

In my previous post, I showed the database schema and the UI and asked what was wrong with that. Before we move on, here is what I showed.

image

image

If you looked carefully, you might have noticed that there are duplicate PO# and Tracking# in the UI. More than that, we somehow double charged the customer for shipping.

What is going on?

It is actually fairly obvious, when you think about it. Look at the schema, there isn’t actually any association between the Tracking # and the PO #. In most orders, we have only 1 PO #, so it was easy to just add this information by just pulling it from the DB and adding a few columns. But when we got an order that has multiple POs… that is when all hell breaks lose.

This is a classic Cartesian Product problem.

The solution?  Actually model the UI to avoid suggesting that there is a relationship between the tracking and purchase orders, like this:

image

Tags:

Published at

Originally posted at

Comments (11)

What is wrong here?

This case, it isn’t code that I am going to show, rather, I am going to show the final UI and the database structure, and let you figure:

  • What is wrong.
  • How to fix this.

Here is the database schema:

image

And here is the problem:

image

Hints, this used to work for a long time, and suddenly it doesn’t, and the customer is pissed, annoyed and threatening to sue.

Tags:

Published at

Originally posted at

Comments (38)

Ask Ayende: What about the QA env?

Matthew Bonig asks, with regards to a bug in RavenDB MVC Integration (RavenDB Profiler) that caused major slow down on this blog.:

I'd be very curious to know how this code got published to a production environment without getting caught. I would have thought this problem would have occurred in any testing environment as well as it did here. Ayende, can you comment on where the process broke down and how such an obvious bug was able to slip through?

Well, the answer for that comes in two parts. The first part  is that no process broke down. We use our own assets for final testing of all our software, that means that whenever there is a stable RavenDB release pending (and sometimes just when we feel like it) we move our infrastructure to the latest and greatest.

Why?

Because as hard as you try testing, you will never be able to catch everything. Production is the final test ground, and we have obvious incentives of trying to make sure that everything works.  It is dogfooding, basically. Except that if we get a lemon, that is a very public one.

It means that whenever we make a stable release, we can do that with high degree of confidence that everything is going to work, not just because all the tests are passing, but because our production systems had days to actually see if things are right.

The second part of this answer is that this is neither an obvious bug nor one that is easy to catch. Put simply, things worked. There wasn’t even an infinite loop that would make it obvious that something is wrong, it is just that there was a lot of network traffic that you would notice only if you either had a tracer running, or were trying to figure out why the browser was suddenly so busy.

Here is a challenge, try to devise some form of an automated test that would catch something like this error, but do so without actually testing for this specific issue. After all, it is unlikely that someone would have written a test for this unless they run into the error in the first place. So I would be really interested in seeing what sort of automated approaches would have caught that.

Tags:

Published at

Originally posted at

Comments (19)

Bug Hunt: What made this blog slow?

A while ago the blog start taking 100% CPU on the client machines. Obviously we were doing something very wrong there, but what exactly was it?

We tracked down the problem to the following code:

image_thumb

image_thumb[1]

As you can probably guess, the problem is that we have what is effective an infinite loop. On any Ajax request, we will generate a new Ajax request. And that applies to our own requests as well.

The fix was pretty obvious when we figured out what was going on, but until then…

image

Tags:

Published at

Originally posted at

Comments (14)

Bug Hunt: What made this blog slow?

A while ago the blog start taking 100% CPU on the client machines. Obviously we were doing something very wrong there, but what exactly was it?

We track down the problem to the following code, can you figure out what the problem?

image

image

Tags:

Published at

Originally posted at

Comments (11)

What is up with all those 404?

Recently I had a spate of posts showing up, and then reporting 404. The actual reason behind that was that the server and the database disagreed with one another with regards to the timezone of the post (one was using UTC, the other local). Sorry for the trouble, and this shouldn’t happen any longer.

Tags:

Published at

Originally posted at

Comments (8)

Mixing Integrated Authentication and Anonymous Authentication with PreAuthenticated = true doesn’t work

This StackOverflow question indicate that it is half a bug and half a feature, but that it sure as hell looks like a bug to me.

Let us assume that we have a couple of endpoints in our application, called http://localhost:8080/secure and http://localhost:8080/public. As you can imagine, the secure endpoint is… well, secure, and requires authentication. The public endpoint does not.

We want to optimize the number of request we make, so we specify PreAuthenticated = true; And that is where all hell break lose.

The problem is that it appears that when using request with entity body (in other words, PUT / POST) with PreAuthenticate = true, the .NET framework will issue a PUT / POST request with empty body to the server. Presumably to get the 401 authentication information. At that point, if the endpoint that it happened to have reached is public, it will be accepted as a standard request, and processing will be tried. The problem here is that it has an empty body, so that has a very strong likelihood of failing.

This error cost me a day and a half or so. Here is the full repro:

static void Main()
{
    new Thread(Server)
    {
        IsBackground = true
    }.Start();

    Thread.Sleep(500); // let the server start

    bool secure = false;
    while (true)
    {
        secure = !secure;
        Console.Write("Sending: ");
        var str = new string('a', 621);
        var req = WebRequest.Create(secure ? "http://localhost:8080/secure" : "http://localhost:8080/public");
        req.Method = "PUT";

        var byteCount = Encoding.UTF8.GetByteCount(str);
        req.UseDefaultCredentials = true;
        req.Credentials = CredentialCache.DefaultCredentials;
        req.PreAuthenticate = true;
        req.ContentLength = byteCount;

        using(var stream = req.GetRequestStream())
        {
            var bytes = Encoding.UTF8.GetBytes(str);
            stream.Write(bytes, 0, bytes.Length);
            stream.Flush();
        }

        req.GetResponse().Close();

    }

}

And the server code:

public static void Server()
{
    var listener = new HttpListener();
    listener.Prefixes.Add("http://+:8080/");
    listener.AuthenticationSchemes = AuthenticationSchemes.IntegratedWindowsAuthentication | AuthenticationSchemes.Anonymous;
    listener.AuthenticationSchemeSelectorDelegate = request =>
    {

        return request.RawUrl.Contains("public") ? AuthenticationSchemes.Anonymous : AuthenticationSchemes.IntegratedWindowsAuthentication;
    };

    listener.Start();

    while (true)
    {
        var context = listener.GetContext();
        Console.WriteLine(context.User != null ? context.User.Identity.Name : "Anonymous");
        using(var reader = new StreamReader(context.Request.InputStream))
        {
            var readToEnd = reader.ReadToEnd();
            if(string.IsNullOrEmpty(readToEnd))
            {
                Console.WriteLine("WTF?!");
                Environment.Exit(1);
            }
        }

        context.Response.StatusCode = 200;
        context.Response.Close();
    }
}

If we remove pre authenticate is set to false, everything works, but then we have twice as many requests. The annoying thing is that if it would be trying to authenticate to a public endpoint, nothing would happen, if it were sending the bloody entity body along as well.

This is quite annoying.

Tags:

Published at

Originally posted at

Comments (13)

Dynamic resolution rules joy

In the following code, what do you believe the output should be?

class Program
{
    static void Main(string[] args)
    {
        dynamic stupid = new Stupid{Age = 3};

        Console.WriteLine(stupid.Age);
    }
}

public class Stupid : DynamicObject
{
    public int Age { get; set; }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = 1;
        return true;
    }
}

Argh!

The default is to use the actual type members, and then fall back to the dynamic behavior. Whereas I would expect it to first check the dynamic behavior and then fall back to the actual members if it can’t find dynamic stuff.

Quite annoying.

Tags:

Published at

Originally posted at

Comments (24)

What is wrong with this code?

Calling the Compare method will (sometimes) crash your entire application in a very rude fashion, why?

static bool Compare(byte[] b1, byte[] b2)
  {
    IntPtr retval = memcmp(b1, b2, new IntPtr(b1.Length));
    return retval == IntPtr.Zero;
  }
 
  [DllImport("msvcrt.dll")]
  static extern IntPtr memcmp(byte[] b1, byte[] b2, IntPtr count);

In fact, there are actually two different problems that I can see here. The easy one would consistently crash, the hard one would usually pass, but sometime crash you in an apparently random ways.

Tags:

Published at

Originally posted at

Comments (14)

Twitter doesn’t like my identity

Recently I noticed that a very important feature in twitter website is broken for me. I can’t follow the entire conversation any longer, which drive me crazy.

image

It didn’t take me long to figure out what the actual issue is:

image

It appears that even though I am logged in, some parts of twitter doesn’t like this.

I then thought that this has something to do with chrome, so I used the incognito mode to try it:

image

It works!

I logged out and in again, nada. I cleared all browser history and restarted chrome, nada.

I uninstalled chrome (selecting delete all browsing history) and re-installed, and now it is working.

No idea why, and I don’t like the heavy handed approach, but at least now I got my twitter tracking back.

Tags:

Published at

Originally posted at

Comments (4)

Debugging Security Exceptions

One of the horrible things about SecurityException is just how little information you are given.

For example, let us take a look at:

image

Yes, that is informative. The real problem is that most security analysis is done at the method level, which means that _something_ in this method caused this problem. Which meant that in order to debug this, I have to change my code to be like this:

image

Which gives me this:

image

Just to point out, the first, second, etc are purely artificial method names, meant just to give me some idea about the problem areas for this purpose only.

Then we need to go into the code in the First method and figure out what the problem is there, and so on, and so forth.

Annoying, and I wish that I knew of a better way.

Tags:

Published at

Originally posted at

Comments (12)

High memory usage with WCF Discovery

Yes, I know that this is basically saying that select is broken, but I am seeing some very strange stuff here. The code in question is this:

for (int i = 0; i < 15; i++)
{
    var discoveryClient = new DiscoveryClient(new UdpDiscoveryEndpoint());
    var findCriteria = new FindCriteria(typeof(IDiscoverableService))
    {
        Duration = TimeSpan.FromSeconds(1)
    };
    discoveryClient.Find(findCriteria);
    discoveryClient.Close();
}

The full repro can be found here.

The problem is, put simply, that before this code, the working set is:  Working Set: 57,640 kb, after this have executed, it is 90,288 kb.

I went over the code with a fine tooth comb, but I don’t really see where all of this memory have gone.

The actual memory reported by GC.GetTotalMemory does go down after the cleanup, so I guess it could be that .NET isn’t releasing the memory back to the OS, but it still worries me somewhat.

I tried it with higher numbers for the loop, and it seems like eventually it settles down on some number and doesn’t grow from there. My main problem is what happens when you start doing async stuff. Let us take a look here:

int count = 150;
var countdown = new CountdownEvent(count);

for (int i = 0; i < count; i++)
{
    var discoveryClient = new DiscoveryClient(new UdpDiscoveryEndpoint());
    var findCriteria = new FindCriteria(typeof(IDiscoverableService))
    {
    };
    discoveryClient.FindProgressChanged += (sender, eventArgs) => {  }; // do nothing
    discoveryClient.FindCompleted += (sender, eventArgs) =>
    {
        countdown.AddCount();
        discoveryClient.Close();
        PrintMemory("Complete: ");
    };
    discoveryClient.FindAsync(findCriteria);
}

countdown.Wait();

At peek, I am seeing 600 MB (!) of memory used. Note that we are now using the default duration of 20 seconds, and it seems that DiscoveryClient is very heavy on memory.

If you know how, can you take a look at the code and tell me that I am crazy?

Tags:

Published at

Originally posted at

Comments (25)