Ayende @ Rahien

http://ayende.com
Copyright (C) Ayende Rahien 2004 - 2021 (c) 2026

pete w commented on Fun with a non relational databases
There is a big company well-entrenched in this kind of approach to databases. They have derived answers to many of the questions raised in this discussion, and developed a very mature product. This company is named FileMaker. I have worked a few engagements in which the company's needs for BI surpassed the abilities of FileMaker, prompting us to build "replication" bridges to a data warehouse. The "replication" approach is a viable answer to the problems with aggregation and general reporting; however, if your primary goal was "NOSQL", then you have only halfway succeeded.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment17
Sun, 28 Feb 2010 17:57:26 GMT

Ayende Rahien commented on Fun with a non relational databases
iLude, Agreed. In any such scenario that I am aware of, this is the job of background tasks.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment16
Fri, 26 Feb 2010 21:07:47 GMT

iLude commented on Fun with a non relational databases
@Oren, I was referring to the time it takes to compute the average, not the time it takes for the index to be updated once the new data is ready to be written to the data store. Computing the average of all ratings on a book is straightforward and shouldn't take a lot of time regardless of the size of your dataset. But if you want to find users who share similar taste in books with the current user and show their average rating for the currently viewed book, the answer is going to take a bit more time to figure out. My point was that figuring out the answers to these questions is not the job of the data store. Sure, SQL can be used to figure out these answers, but it only scales so far.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment15
Fri, 26 Feb 2010 21:04:58 GMT

Ayende Rahien commented on Fun with a non relational databases
iLude, The question is how you define "instantly". My current testing shows a latency of ~25 - 50 ms between update and index sync. That is with zero perf work and under tough conditions.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment14
Fri, 26 Feb 2010 18:56:24 GMT

Ayende Rahien commented on Fun with a non relational databases
Josh, At that point, you include the average user rating inside the document and index that. From there, it is a very simple issue.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment13
Fri, 26 Feb 2010 18:54:53 GMT

Ayende Rahien commented on Fun with a non relational databases
Jordan, I would like to use it where I would use Couch / Mongo, yes. I think that this is very feasible.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment12
Fri, 26 Feb 2010 18:53:57 GMT

Ayende Rahien commented on Fun with a non relational databases
Vadim, The issue isn't with deletion; the issue is with references to deleted documents. The DB won't handle that automatically for you.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment11
Fri, 26 Feb 2010 18:53:22 GMT

Rafal commented on Fun with a non relational databases
@Jordan Small to mid-size applications only? Why not large databases? I think large databases are where NoSQL shows its superiority over RDBMS.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment10
Fri, 26 Feb 2010 18:25:13 GMT

iLude commented on Fun with a non relational databases
@Josh, As Oren said, aggregation isn't supported, nor is it a problem that I believe he's trying to solve. In the case of the average user rating, I think you would fire off an event/message/command to the system when a user provides a rating on a book. The interface responds saying thanks for your feedback. Meanwhile, the system queues the event and processes it, computing the new average based on the new data. This new aggregate value is stored in a data structure for queries that need to use that average. That value may be stored in the doc db with the book, and as such it can be queried using Lucene. Or it might be stored in a SQL table that has book ids and averages; it doesn't really matter. The main problem people usually have is that the user's feedback is not instantly available in the view. And to paraphrase Udi: "So what?" It will be available soon enough, and the user who provided the feedback isn't waiting on the interface while this new aggregate is calculated. Some will say that computing the average should not take long. True, but what if I would rather compute the average review of this book by users who like the same books as me? This is a much more interesting question and cannot be calculated in a few milliseconds.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment9
Fri, 26 Feb 2010 17:41:13 GMT

guy commented on Fun with a non relational databases
I think anonymous types might be a bit nicer to look at as far as the query API goes.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment8
Fri, 26 Feb 2010 17:08:50 GMT

josh commented on Fun with a non relational databases
Hi Ayende, The first thing that popped into my head was: what if you only want to show a list of books with the average user rating? Let's say you have a fairly large number of books, and some books with a lot of reviews/ratings. To show a listing you wouldn't necessarily want to fetch all those books with all of their reviews. That's a lot of overhead and data you wouldn't use in that view. The second thing I thought was: cool, it even uses Lucene. Very nice. -josh
http://ayende.com/4410/fun-with-a-non-relational-databases#comment7
Fri, 26 Feb 2010 15:54:41 GMT

Jordan commented on Fun with a non relational databases
I must admit, I would really like to see this project work out to be a good, in-process document database for small to mid-size applications. However, my question is this: is your end goal to "compete" in the same space as CouchDB or MongoDB? Or is this more an experiment that might yield a useful tool?
http://ayende.com/4410/fun-with-a-non-relational-databases#comment6
Fri, 26 Feb 2010 15:49:09 GMT

Vadim Kantorov commented on Fun with a non relational databases
OK, you don't delete, but you mark objects with 'bad' or 'not actual anymore' marks. Anyway, you have to fix the indexes...
http://ayende.com/4410/fun-with-a-non-relational-databases#comment5
Fri, 26 Feb 2010 15:47:51 GMT

Nick Meldrum commented on Fun with a non relational databases
I read Udi's article and agree with his example; however, does that really extend to never deleting *anything*?
Surely it means never delete anything that is not really deleted, but has merely changed state (as in Udi's examples). Sometimes delete really does make sense. A poor example I have now is, "What if I add an item to be persisted that I realise later shouldn't exist?" A pretty poor example, I admit... I will try to think of a better one... I am quite excited at the prospect of using a .Net based object graph database, and this project seems the closest so far to my needs. I would definitely vote in favour of implementing delete!
http://ayende.com/4410/fun-with-a-non-relational-databases#comment4
Fri, 26 Feb 2010 14:43:23 GMT

Ayende Rahien commented on Fun with a non relational databases
Bunter, Paging is implemented in an overload for Query:

    Query("booksByAuthor", "author:weber", N, M);

Ordering for paging is something that I need to think about, probably something like:

    Query("booksByAuthor", "author:weber", N, M, function(x,y) { return x.name.CompareTo(y.name); });

Aggregation isn't supported, because it isn't something that you can do in a doc db. Well, you can, but it provides interesting challenges and stops working when you try to think about a distributed db. Since you can't really do aggregation in a distributed fashion easily, I might not do it. I actually have a couple of interesting ideas about how to implement this, though.

Why Json? Because I am thinking about a very low level API, not a user facing one.

> Getting N top rated products under categories that are direct or indirect child of category X?

There are several options. The easy one is to include the entire category hierarchy in the book. The idea is that I can then pull the entire list in a single request, vs. having to do hierarchical stuff.
If that is the case, you would need a view like:

    var booksByCategory =
        from doc in docs
        where doc.type == "book"
        from cat in doc.categories
        select new { cat.name };

    Query("booksByCategory", "name:foo");

Another approach, which I am not overly fond of, is to support hierarchy in the view generation, something like (highly tentative):

    var booksByCategory =
        from doc in docs.include_matching(d=>d.category && d.type == "book")
        where doc.type == "book"
        from cat in doc.categories
        select new { cat.name };

This would build the same index without requiring the book to carry the whole hierarchy, but since you generally need the entire hierarchy anyway, it doesn't really make sense not to have it.

The final option is to generate the categories key as:

    1. Books
    1.2 Books > Sci Fi
    1.2.3 Books > Sci Fi > Military

In this case you can do this sort of query very easily because of the nature of the queries.
http://ayende.com/4410/fun-with-a-non-relational-databases#comment3
Fri, 26 Feb 2010 12:17:52 GMT

Javi commented on Fun with a non relational databases
Hi Ayende, I see that the read operations use a JsonDoc entity... What about serializing/deserializing the json docs into .NET entities? Are you leaving this out because of performance concerns? Javi
http://ayende.com/4410/fun-with-a-non-relational-databases#comment2
Fri, 26 Feb 2010 11:25:27 GMT

Bunter commented on Fun with a non relational databases
How does the query interface handle paging, ordering and aggregation? Getting N - M books matching a query? Getting N top rated products under categories that are direct or indirect children of category X? Getting books with more than 1 author?
http://ayende.com/4410/fun-with-a-non-relational-databases#comment1
Fri, 26 Feb 2010 11:17:27 GMT
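
Bunter's question about the N top rated products under a category and all of its descendants can be illustrated with the dotted category keys from the reply (1, 1.2, 1.2.3). The following is only a minimal sketch in Python, not the database's API: the in-memory list, the field names `category_key` and `rating`, and the function `top_rated_under` are all assumptions made for illustration. It shows why such keys make the query easy: a descendant's key always starts with the ancestor's key followed by a dot.

```python
from typing import Dict, List

# Hypothetical documents; the field names are illustrative assumptions,
# not part of any real document database schema.
books: List[Dict] = [
    {"name": "Book A", "category_key": "1.2.3", "rating": 4.5},
    {"name": "Book B", "category_key": "1.2", "rating": 3.9},
    {"name": "Book C", "category_key": "1.20", "rating": 5.0},
    {"name": "Book D", "category_key": "1.4", "rating": 4.8},
    {"name": "Book E", "category_key": "1.2.7", "rating": 4.1},
]


def top_rated_under(docs: List[Dict], category_key: str, n: int) -> List[Dict]:
    """Return the n highest-rated docs in a category or any of its descendants."""
    # Appending "." keeps "1.2" from accidentally matching "1.20".
    prefix = category_key + "."
    matches = [
        d for d in docs
        if d["category_key"] == category_key
        or d["category_key"].startswith(prefix)
    ]
    # Sort by rating, highest first, then keep the first n results.
    return sorted(matches, key=lambda d: d["rating"], reverse=True)[:n]


top_two = top_rated_under(books, "1.2", 2)
print([d["name"] for d in top_two])
```

A real index would answer the same question with a prefix or range lookup on the key rather than a scan over every document; the scan here only keeps the sketch self-contained.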