Ayende @ Rahien

Refunds available at head office

What is wrong with this code? Answers

Wow, that was a popular topic, just take a look at the comment thread. I'll do more of those in the future.

The code in question is:

public byte[] DownloadBytes(string url,
                            ICredentials credentials)
{
    WebRequest request = Util.SetupWebRequest(WebRequest.Create(url), credentials);

    using (WebResponse response = request.GetResponse())
    {
        using (Stream stream = GetResponseStream(response))
        {
            byte[] buffer = new byte[response.ContentLength];
            int current = 0;
            int read;
            do
            {
                read = stream.Read(buffer, current, buffer.Length - current);
                current += read;
            } while (read != 0);
            return buffer;
        }
    }
}

The code as written will result in data corruption. Now, people have found a lot of issues with this code. So many that I am considering doing a "production code" post about this.

The problem that I had in mind, which El Guapo correctly pointed at, is that this code relies on response.ContentLength to be accurate. There are innumerous reasons for it to be inaccruate (server not sending the header, chucking, etc).

The main issue, however, is that ContentLegnth represent the size of the payload on the wire. This is very important, because size_on_wire != actual_size. This happened to me when the server was using GZip compression to send files. The CLR Web API will automatically notify the server that they can accept gzip and deflate compressions, and they will decompress it behind the scene when it comes the time to read it. The result is that size on the wire is significantly smaller than the actual file size, but I blithely ignored everything else and read only the size on the wire.

It took me a while to track that, because the code that had this issue was an async file reader, and I initially assume that I had a threading issue there.

Comments

Eber Irigoyen
04/01/2008 10:22 PM by
Eber Irigoyen

so now that you have a real world scenario... let's see the tests for this

Ayende Rahien
04/01/2008 10:34 PM by
Ayende Rahien

Here they are:

http://www.codeplex.com/CodePlexClient/SourceControl/FileView.aspx?itemId=210197&changeSetId=17338

Jeremy Gray
04/02/2008 10:33 PM by
Jeremy Gray

"This happened to me when the server was using GZip compression to send files. The CLR Web API will automatically notify the server that they can accept gzip and deflate compressions, and they will decompress it behind the scene when it comes the time to read it."

Since when? I've always had to layer up a bit of code to add the deflate request header and decompress the response body. Are you perhaps somehow hiding some of these details down in "Util.SetupWebRequest(...)"?

Ayende Rahien
04/02/2008 10:39 PM by
Ayende Rahien

Here is the code in full:

public static WebRequest SetupWebRequest(WebRequest result,

                                     ICredentials credentials)

{

result.Timeout = Timeout.Infinite;


HttpWebRequest httpResult = result as HttpWebRequest;


if (httpResult != null)

{

    httpResult.Credentials = credentials;

    httpResult.ServicePoint.ConnectionLimit = 15;

    httpResult.ServicePoint.UseNagleAlgorithm = false;

    httpResult.SendChunked = false;

    httpResult.Pipelined = false;

    httpResult.KeepAlive = true;

    httpResult.PreAuthenticate = false;

    httpResult.UserAgent = "CodePlexClient";

}


return result;

}

Comments have been closed on this topic.