Ayende @ Rahien

Hendry Luk commented on API Design: Sharding Status for failure scenarios–explicit failure management

Mon, 04 Jun 2012 03:43:58 GMT

What about the standard .NET TryXxx(out) API convention, which behaves almost exactly like its Xxx() counterpart, except that it returns its success/failure result in lieu of throwing exceptions?

Justin commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 15:10:52 GMT

I would hope shards going down are just as rare ;). Both issues are time based, specifically *transaction time*: in both situations the transaction has already occurred in the past and the index is not showing the committed transaction, for two different technical reasons, but logically the issue is *identical* to the application. Regardless of how long it takes for the indexing to complete or the shard to come up, the application must handle these situations in a similar manner. If you code your application to assume indexing only takes <1 second and then a large amount of data is re-indexed, what happens? You must handle this possibility somehow. Once you've handled the long-running index operation, you've just handled a down shard too, at least for queries against indexes.

Ayende Rahien commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 14:46:30 GMT

Justin, Staleness that takes hours to go away is REALLY rare. We are usually talking about ms under normal load, seconds under very heavy load. And there is a big difference between "those results are accurate as of TIME" vs. "those results may be partial".

Justin commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 14:44:25 GMT

That's why Raven doesn't wait by default, right? What does it matter to the user/application *why* the results are incomplete? It probably matters a lot to the admin, but either way the user/application didn't get the expected results and can't make certain application-level decisions until it does, and may not for an *unknown* amount of time. If the DB has recently been loaded from an ETL process, the indexing may take *hours*, which is probably why WaitForNonStaleResults has a timeout, right? All this has already been handled. I would imagine whatever you do for a down shard will look very similar to how a stale index is handled currently, since you want the same tenets to apply (system/world doesn't stop, no waiting).

Ayende Rahien commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 14:33:23 GMT

Justin, Because that would mean _waiting_. It means that you have to stop and wait for a result, and that may increase your latency. Also, that depends on what type of waiting you are doing. But in general, showing results from a few ms ago is more than good enough.

Justin commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 14:29:13 GMT

@Patrick, If there is no danger in waiting for an index to catch up, why is the default not to wait and to return incomplete results? Indexes being rebuilt on large databases can take quite a while (>1 min), so the "danger" of waiting on a re-index may be just as bad as waiting for a down shard to come up. Either way, Raven already provides a boolean status of possibly incomplete results on a query, and RavenQueryStatistics can be extended to describe in more detail why those results are incomplete.

Patrick Huizinga commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 07:30:19 GMT

@Justin, The big difference between a stale index and a down shard, is that the index is expected to catch up quickly (< 1 sec.), while a down shard is 'expected' to remain unavailable for a while (> 1 min.). So there is no danger waiting for the index to catch up, while it's a bad idea to wait (block) for the shard to come back.

Patrick Huizinga commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 07:25:08 GMT

@Martin Doms, Where did you get the idea the discussion was about an extension method?

Martin Doms commented on API Design: Sharding Status for failure scenarios–explicit failure management

Thu, 31 May 2012 02:07:13 GMT

An extension method with a side effect actually makes me slightly sick to my stomach.

Justin commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 16:41:44 GMT

How is this any different than a stale index? RavenDB already has a way to communicate that your query results may be incomplete. The reason the results are incomplete is really secondary; either way you have incomplete results that can cause business logic issues. Just use IsStale and WaitForNonStaleResults and add something to RavenQueryStatistics to describe the stale reason (still indexing or shard down or ...). Think of it this way: how should the application handle a down shard vs. a long-running index process? They both cause missing results for an indeterminate amount of time, and the application should respond the same regardless, by either waiting to see if the results become complete or failing the operation and notifying the user.
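A minimal sketch of the "one flag, plus a reason for diagnostics" idea described above. It is written in Python for brevity and is not RavenDB's API: `StaleReason`, `QueryStatistics` and `handle` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class StaleReason(Enum):
    """Hypothetical reasons a result set may be incomplete."""
    NOT_STALE = "not stale"
    STILL_INDEXING = "still indexing"
    SHARD_DOWN = "shard down"


@dataclass
class QueryStatistics:
    """Hypothetical stand-in for a per-query statistics object."""
    reason: StaleReason = StaleReason.NOT_STALE

    @property
    def is_stale(self) -> bool:
        # A single boolean covers every cause of incompleteness.
        return self.reason is not StaleReason.NOT_STALE


def handle(stats: QueryStatistics) -> str:
    # The application branches on the flag only; the reason is
    # carried along for logging or for the admin.
    if stats.is_stale:
        return f"incomplete ({stats.reason.value})"
    return "complete"
```

With this shape, application code treats a rebuilding index and a down shard identically unless it chooses to inspect `reason`.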

steve commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 15:48:15 GMT

Is there a possibility of taking some design ideas from RAID and building the sharding out in a way that, if a server goes down, the remaining machines in the cluster can rebuild themselves to return full results if space allows?

Ju commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 14:58:15 GMT

What about a Maybe-like monad? I mean a session.ShardQuery method could return an enhanced type, so that the user must explicitly get the underlying collection by matching on whether it is partial or not. The strategy to apply next is up to him.
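A rough sketch of such a wrapper type, in Python rather than C#. Everything here is an assumption made for illustration: `ShardResult`, `match` and `describe` are hypothetical names, not part of any RavenDB client.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


@dataclass(frozen=True)
class ShardResult(Generic[T]):
    """Hypothetical wrapper: the items are only reachable through match()."""
    _items: List[T]
    _failed_shards: List[str]

    def match(self,
              complete: Callable[[List[T]], R],
              partial: Callable[[List[T], List[str]], R]) -> R:
        # The caller must supply a handler for both cases.
        if self._failed_shards:
            return partial(self._items, self._failed_shards)
        return complete(self._items)


def describe(result: ShardResult) -> str:
    return result.match(
        complete=lambda items: f"{len(items)} items",
        partial=lambda items, failed: f"{len(items)} items, missing {failed}",
    )
```

Because `match` takes both handlers as required arguments, the partial case cannot be ignored by accident; what to do with it is left to the caller.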

Scooletz commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 12:49:15 GMT

It's wrong if you want to make users always use it on a per-query basis. Raven should allow introducing a cross-cutting setting, registered once, to handle this situation (like in ISessionFactory, if there is one), with the option of overriding it when needed. Handling the majority of cases in one way is what you should go for.
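One way a register-once default with a per-query override might look, sketched in Python. `OnShardFailure`, `Store` and `Query` are hypothetical names for this illustration only and do not correspond to real RavenDB types.

```python
from enum import Enum
from typing import Optional


class OnShardFailure(Enum):
    """Hypothetical policies for a query that hits a down shard."""
    THROW = "throw"
    ALLOW_PARTIAL = "allow_partial"


class Query:
    def __init__(self, policy: OnShardFailure):
        self.policy = policy


class Store:
    """Hypothetical store: the policy is registered once, here."""

    def __init__(self, on_shard_failure: OnShardFailure = OnShardFailure.THROW):
        self.on_shard_failure = on_shard_failure

    def query(self, on_shard_failure: Optional[OnShardFailure] = None) -> Query:
        # Fall back to the store-wide default unless the caller overrides it.
        if on_shard_failure is None:
            on_shard_failure = self.on_shard_failure
        return Query(on_shard_failure)


store = Store(OnShardFailure.ALLOW_PARTIAL)
default_query = store.query()                       # uses the store default
strict_query = store.query(OnShardFailure.THROW)    # per-query override
```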

Rafal commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 11:36:21 GMT

update: wrong, the out parameter can't be used here because it needs to be returned after the query is executed, not before

Rafal commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 10:19:16 GMT

The desired API depends on the application. Sometimes you'll want to silently ignore a shard's failure and sometimes you'll explicitly handle it. But it's not an error; it's a normal situation that some shard may be unavailable, therefore Raven's API shouldn't throw an exception. This approach (with the ShardingStatus out parameter) is better than an exception, but it's not very elegant, as it requires you to remember to call the ShardingStatus method with each query and then to add some code for handling the status returned. Besides, the out parameter is not so great for a fluent interface because you don't know when it will be set. A callback function would IMHO be better.
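A small sketch of the callback alternative, in Python. The function `run_query`, its `on_shard_failure` parameter and the in-memory `shards` mapping are assumptions for illustration; a real client would talk to servers rather than call local functions.

```python
def run_query(shards, on_shard_failure=None):
    """Collect rows from every shard; report failures through a callback.

    shards: mapping of shard name -> zero-argument callable returning rows.
    on_shard_failure: optional callable(name, exception), invoked at the
    moment a shard turns out to be unavailable.
    """
    rows = []
    for name, fetch in shards.items():
        try:
            rows.extend(fetch())
        except ConnectionError as exc:
            if on_shard_failure is not None:
                on_shard_failure(name, exc)
    return rows


def offline():
    raise ConnectionError("shard offline")


shards = {"a": lambda: [1, 2], "b": offline}

failed = []
rows = run_query(shards, on_shard_failure=lambda name, exc: failed.append(name))
```

Unlike an out parameter, the callback fires when the query actually executes, so there is no question about when the status becomes available. Callers that do not care simply omit it.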

Shaddix commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 10:06:52 GMT

I meant List<Error> (List of Error), but the parser broke my C# :)

Shaddix commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 10:05:08 GMT

One way could be storing something like a List inside a session (or maybe even propagating it to the Store), so every careful site owner could display a big red cross in the top-right corner of the site, meaning that something bad happened :) That requires no friction at every .Query() call, but will give sensible information about an error with only a one-time setup.

Knaģis commented on API Design: Sharding Status for failure scenarios–explicit failure management

Wed, 30 May 2012 09:49:56 GMT

This approach could be OK if omitting the method would cause an exception (a method name like AllowPartialResults() could be slightly better). It would be easier to implement than catching a specific exception (where the partial results are in the exception details), the solution that was proposed in the comments on the last post. But it would still ensure that the developer has to make a conscious decision that the data he is retrieving is allowed to be incomplete (which is OK for blog posts, but is not OK when calculating financials). This approach could also enable certain entities to specify this automatically when the Query is called, so that the decision is left to the author of the model instead of the consumer.
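A minimal sketch of that opt-in behaviour, in Python. `ShardQuery`, `allow_partial_results`, `to_list` and `PartialResultsError` are hypothetical names chosen for this illustration and are not taken from the RavenDB client.

```python
class PartialResultsError(Exception):
    """Raised when a shard is down and the caller did not opt in."""


class ShardQuery:
    def __init__(self, shards):
        # shards: mapping of shard name -> zero-argument callable returning rows
        self._shards = shards
        self._allow_partial = False

    def allow_partial_results(self):
        # Explicit, per-query decision that incomplete data is acceptable.
        self._allow_partial = True
        return self  # returned for fluent chaining

    def to_list(self):
        rows, failed = [], []
        for name, fetch in self._shards.items():
            try:
                rows.extend(fetch())
            except ConnectionError:
                failed.append(name)
        if failed and not self._allow_partial:
            raise PartialResultsError(f"shards unavailable: {failed}")
        return rows


def offline():
    raise ConnectionError("shard offline")


shards = {"a": lambda: [1, 2], "b": offline}

partial = ShardQuery(shards).allow_partial_results().to_list()
```

Without the `allow_partial_results()` call the same query raises, so a financial calculation cannot silently run on incomplete data.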