Ayende @ Rahien

http://ayende.com
Copyright (C) Ayende Rahien 2004 - 2021 (c) 2026

Adam Langley commented on The RavenDB indexing process: Optimization–De-parallelizing work
The wonderful thing about that scenario is that if the region of code you are optimising is modular (which I'm sure it is), the problem space is not variable once the software is installed. Hence, you could provide two indexing modules, one designed for single-core, and one for multi-core parallelism. Of course you would increase your code maintenance, but that's just another decision...
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment11
Sun, 22 Apr 2012 21:00:26 GMT

Ayende Rahien commented on The RavenDB indexing process: Optimization–De-parallelizing work
Frank, No, we didn't do that. We handle the control with a much simpler concept: partitioning the work before starting the parallel work.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment10
Sat, 21 Apr 2012 10:57:53 GMT

Frank Quednau commented on The RavenDB indexing process: Optimization–De-parallelizing work
Have you implemented your own Task Scheduler for the Task library?
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment9
Sat, 21 Apr 2012 10:53:03 GMT

Rafal commented on The RavenDB indexing process: Optimization–De-parallelizing work
A nice thing about indexing in Raven is that usually you have all recently modified documents in memory, so you can index them without reading from the storage. You will not have such luxury when the Lucene index is external to the application.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment8
Sat, 21 Apr 2012 10:37:21 GMT

Ayende Rahien commented on The RavenDB indexing process: Optimization–De-parallelizing work
Madhav, RavenDB is DivanDB.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment7
Sat, 21 Apr 2012 08:18:09 GMT

Ayende Rahien commented on The RavenDB indexing process: Optimization–De-parallelizing work
Matthew, It is using a lot of CPU for full text indexing, it requires a lot of memory and it writes a lot to disk. It wouldn't be workable to do this in the cloud, because the cost of actually sending the data up there and then getting it back would be too high.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment6
Sat, 21 Apr 2012 08:17:57 GMT

Matthew Sullivan commented on The RavenDB indexing process: Optimization–De-parallelizing work
Sorry if this question is too naive, but is the indexing primarily CPU bound, memory bound or I/O bound? Would it be helpful or possible to use cloud computing to create indexes in a speedy fashion? I've just heard that slow indexing speed is a major drawback of doc dbs, a prime reason why reporting etc. needs to be done on SQL... just wondering if you can throw a little cloud money at the problem to get faster turnaround on ad hoc reporting or index fixes.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment4
Fri, 20 Apr 2012 17:20:10 GMT

Gene Hughson commented on The RavenDB indexing process: Optimization–De-parallelizing work
Performance tuning on a line of business app is expensive and labor-intensive to get right: you really need a comprehensive suite of load tests on the same hardware profile as production using an equivalent network profile - easy-peasy. I can only imagine the headache involved with a more general-purpose tool like Raven. Like Daniel above, I'm enjoying the peek into your world.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment3
Fri, 20 Apr 2012 14:25:07 GMT

Daniel O commented on The RavenDB indexing process: Optimization–De-parallelizing work
Enjoying these posts on the ongoing development of RavenDB. Could you imagine if the SQL Server or Oracle devs did posts like this?
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment2
Fri, 20 Apr 2012 12:47:18 GMT

gandjustas commented on The RavenDB indexing process: Optimization–De-parallelizing work
What actual API did you use for parallel computations? Parallel.ForEach? It's not suitable for IO-bound concurrency. For IO-bound you should use Tasks.
http://ayende.com/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work#comment1
Fri, 20 Apr 2012 09:27:25 GMT
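
Editor's sketch: Ayende's reply to Frank in this thread mentions partitioning the work before starting the parallel work, rather than writing a custom task scheduler. The thread does not show how RavenDB does this, so the following is only a minimal Python illustration of the general idea, not RavenDB's code. The names `partition`, `index_chunk`, and `index_all` are hypothetical, and the "indexing" step is a trivial stand-in.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Split items into at most `parts` contiguous, near-equal chunks."""
    parts = max(1, min(parts, len(items)))
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def index_chunk(chunk):
    # Hypothetical stand-in for indexing one batch of documents.
    return [doc.upper() for doc in chunk]


def index_all(docs, workers=4):
    # Partition up front: each worker receives a fixed share, so nothing
    # has to rebalance work while the job is running.
    chunks = partition(docs, workers)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # pool.map yields results in the same order as the input chunks.
        results = pool.map(index_chunk, chunks)
        return [entry for part in results for entry in part]
```

Calling `index_all(["a", "b", "c"], workers=2)` splits the three documents into two chunks and processes each chunk on its own worker. The trade-off of fixing the split in advance is that one slow chunk cannot be rebalanced once the work has started.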