Ayende @ Rahien

http://ayende.com
Copyright (C) Ayende Rahien 2004 - 2021 (c) 2026

Fabien commented on RavenDB & FreeDB: An optimization opportunity
Thanks for the answer, Ayende. I'm feeling much better looking at those numbers :) Daniel, I understand indexing runs in the background, and I was not really concerned about the performance of the queries not using the new index. I was just wondering what would happen to the queries needing it. Now I know they will just use a stale index, which is a problem if the process takes hours, not so much when it's a few minutes.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment11
Tue, 17 Apr 2012 07:29:25 GMT

Ayende Rahien commented on RavenDB & FreeDB: An optimization opportunity
Micha, Sharding would split the load, yes. But the numbers you are seeing there are not _real_ numbers. They are PRE-optimization numbers. Our current numbers are _much_ faster.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment10
Tue, 17 Apr 2012 07:20:50 GMT

Ayende Rahien commented on RavenDB & FreeDB: An optimization opportunity
Chris, Actually, no. There are no such things as slow queries in RavenDB :-)
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment9
Tue, 17 Apr 2012 07:19:38 GMT

Daniel Lang commented on RavenDB & FreeDB: An optimization opportunity
Fabien, indexing is a background operation. So on a production system, everything stays online and responds to requests while the indexing is done behind the scenes. To all the others: Re-read the title of this post. There _will_ be optimization, and today's indexing performance is much faster than at the time Oren wrote this post.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment8
Tue, 17 Apr 2012 07:19:30 GMT

Ayende Rahien commented on RavenDB & FreeDB: An optimization opportunity
Fabien, As I noted in the post, those were PRE-optimization numbers. The numbers now are FAR better. You can see this in today's post. In production, you are likely to see RavenDB keeping up with your workload. If you are creating a new index, RavenDB will split the load between the new index and the current ones, making sure that they are all up to date. Searches keep being fast, but you might not be able to search the entire data set until the new index has caught up. In our testing, even for complex indexing, for your data set size, this should take roughly 5 - 10 minutes.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment7
Tue, 17 Apr 2012 07:19:20 GMT

Micha Schopman commented on RavenDB & FreeDB: An optimization opportunity
Let's assume that you've applied sharding. What would have happened then with the performance? Would indexing time be shortened to total time / instances? Would that also mean the query of 1 sec over 31 million records with 52 million fields can be made much faster when letting multiple instances do the actual work, maybe with a map / reduce on all instances?
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment6
Tue, 17 Apr 2012 06:33:04 GMT

Chris Wright commented on RavenDB & FreeDB: An optimization opportunity
Fabien, my understanding is that indexing proceeds in the background and you deal with a slow query in the meantime. Best you can hope for -- except maybe making use of partial indices, and indexing on query, and the like.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment5
Tue, 17 Apr 2012 04:31:27 GMT

Fabien commented on RavenDB & FreeDB: An optimization opportunity
Interesting stats. I work on an ASP.Net site that currently uses FAST as the search engine. We've had all sorts of issues with it and, to put it simply, it's just not reliable enough. So we are considering using RavenDB as a replacement. All our needs seem to be covered, and because we are using .Net it would be easy to use RavenDB. However, I have to say that I am a bit concerned with the time indexes take to create and was hoping you could clarify a few things for me. Imagine this scenario: we currently have 800,000 documents needing to be in RavenDB. We create the indexes before adding the documents so that the indexes are populated as we go. Testing goes great and everything is performing well. Now the problem is that we missed one scenario and there is another index needed that we didn't know about. On production, because of the auto-tuning in RavenDB, when this scenario is encountered the new index will be automatically created, and because of the 800,000 documents it will take a while... What is going to happen to my search query at this point? The same goes if there is a code change producing queries with a different query profile but we don't realise that before going to production. I know it's our fault in both cases, but I'm just curious about the consequences.
Thanks
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment4
Tue, 17 Apr 2012 01:31:25 GMT

Rafal commented on RavenDB & FreeDB: An optimization opportunity
Nathan, I suspect you will find many databases faster than Raven in such benchmarks; imho the Raven guys concentrate on ease of use and on features rather than speed. Ayende, I don't understand how you get 'index entry for every single track in the world'. From what I see, you have a Lucene document for each CD with multivalued Query and Track fields, both of them analyzed. This means you have an index entry for each distinct word in track and disk titles. And there are not so many such words; song names don't use too rich a vocabulary... Out of curiosity, could you post the list of the top 100 most frequently used words in song titles? Such a list should be possible to get from the Lucene index.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment3
Mon, 16 Apr 2012 13:22:55 GMT

Nathan Palmer commented on RavenDB & FreeDB: An optimization opportunity
Not sure if you are going down the route of comparisons, but these tests would be interesting side-by-side on other document databases such as MongoDB or Redis. You obviously can't compare performance metrics alone when evaluating a product, but they prove to be a useful metric.
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment2
Mon, 16 Apr 2012 12:59:55 GMT

jumbo commented on RavenDB & FreeDB: An optimization opportunity
COMPARE TO SQL DB? EG. POSTGRESQL?
http://ayende.com/154401/ravendb-freedb-an-optimization-opportunity#comment1
Mon, 16 Apr 2012 11:08:00 GMT
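
For readers following the stale-index discussion in this thread (Fabien's question and the replies from Ayende and Daniel), here is a minimal toy sketch of the behaviour they describe: writes return immediately, an indexer catches up in batches, and queries are answered from whatever has been indexed so far. Everything in it (`ToyStore`, `index_batch`, the `is_stale` flag, the sample titles) is a hypothetical illustration, not RavenDB's API or implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy model of a store with a lagging index. Hypothetical, not RavenDB."""
    docs: list = field(default_factory=list)   # every document written so far
    index: dict = field(default_factory=dict)  # word -> set of document positions
    indexed_upto: int = 0                      # docs[0:indexed_upto] are indexed

    def put(self, text: str) -> None:
        # Writes only append; they never wait for the index.
        self.docs.append(text)

    def index_batch(self, batch_size: int) -> None:
        # Stand-in for one step of a background indexer.
        end = min(self.indexed_upto + batch_size, len(self.docs))
        for pos in range(self.indexed_upto, end):
            for word in self.docs[pos].lower().split():
                self.index.setdefault(word, set()).add(pos)
        self.indexed_upto = end

    def query(self, word: str):
        # Answered from the index alone: cheap, but possibly missing documents
        # the indexer has not reached yet. The flag reports that condition.
        hits = sorted(self.index.get(word.lower(), set()))
        is_stale = self.indexed_upto < len(self.docs)
        return [self.docs[p] for p in hits], is_stale


store = ToyStore()
for title in ["Blue Moon", "Blue Train", "Kind of Blue"]:
    store.put(title)

store.index_batch(2)                        # indexer has seen 2 of 3 documents
partial, stale_before = store.query("blue")
store.index_batch(2)                        # indexer catches up
full, stale_after = store.query("blue")
```

In this toy model the first query returns only the two indexed titles and reports itself as stale; after the indexer catches up, the same query returns all three. That matches the point made in the thread: the query stays fast either way, and the open question is only how long the catch-up window lasts.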