The case of the hanging request in the cache
A user just reported that after a short time of using RavenDB, all requests started to hang and eventually timed out. We are talking about making several dozen requests in the space of a minute or so, as a user operating against a web application.
That was a cause of concern, obviously, and something that was very strange. Luckily, the user was able to send us a stripped down version of his application and we were able to reproduce this state locally. On that particular application only. We couldn’t reproduce this anywhere else.
In fact, we fixed the problem and still don’t know how to reproduce it. The underlying issue is that we are making a GET request using HttpClient and specifying HttpCompletionOption.ResponseHeadersRead in the SendAsync method. Now, the server might return us a 304 response, in which case we’ll use the cached value we have. If we make a few requests like that, all future requests will hang. In order to fix this problem we need to dispose the response, or read the (empty) content of the response.
Now, the crazy thing is that I can probably speculate pretty accurately on why this is going on, we have fixed it, but I can’t reproduce this problem except within the application the user sent us. So no independent reproduction.
The likely issue is that the HttpClient considers the request open until we read all its content (even if there is non), and that as we make more requests that have the same 304 and therefor no access to the content of the request, we have more and more requests that are considered to be open, up to the maximum number of requests that are allowed against the server concurrently. At this point, we are waiting for timeout, and it look like we are hanging.
All very reasonable and perfectly understandable (if annoying because you only need to dispose it sometimes is a trap waiting to happen), but I don’t know why we couldn’t reproduce it.
Comments
Maybe it is related to the new System.Net.Http (4.3.2)? It looks like that could be the cuprit, as the above behavior looks just wrong. What happens with 204 (no content)?
I just encountered and fix this same problem today. Oren, is this proof that the Protoss psionic link does in fact exist?
So exactly what was the fix? Just removing that option in the SendAsync?
Steven, We had to dispose the response.
Comment preview