What is wrong with this code? Answers

time to read 3 min | 482 words

Wow, that was a popular topic, just take a look at the comment thread. I'll do more of those in the future.

The code in question is:

public byte[] DownloadBytes(string url,
                            ICredentials credentials)
    WebRequest request = Util.SetupWebRequest(WebRequest.Create(url), credentials);

    using (WebResponse response = request.GetResponse())
        using (Stream stream = GetResponseStream(response))
            byte[] buffer = new byte[response.ContentLength];
            int current = 0;
            int read;
                read = stream.Read(buffer, current, buffer.Length - current);
                current += read;
            } while (read != 0);
            return buffer;

The code as written will result in data corruption. Now, people have found a lot of issues with this code. So many that I am considering doing a "production code" post about this.

The problem that I had in mind, which El Guapo correctly pointed at, is that this code relies on response.ContentLength to be accurate. There are innumerous reasons for it to be inaccruate (server not sending the header, chucking, etc).

The main issue, however, is that ContentLegnth represent the size of the payload on the wire. This is very important, because size_on_wire != actual_size. This happened to me when the server was using GZip compression to send files. The CLR Web API will automatically notify the server that they can accept gzip and deflate compressions, and they will decompress it behind the scene when it comes the time to read it. The result is that size on the wire is significantly smaller than the actual file size, but I blithely ignored everything else and read only the size on the wire.

It took me a while to track that, because the code that had this issue was an async file reader, and I initially assume that I had a threading issue there.