﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com/blog/</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2012 (c) 2013</copyright><ttl>60</ttl><item><title>Why scalability matters?</title><description>&lt;p&gt;Otherwise, you get this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Why-scalability-matters_76E8/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Why-scalability-matters_76E8/image_thumb.png" width="992" height="565"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;And that is one sale that isn’t going to happen!&lt;/p&gt;</description><link>http://ayende.com/blog/153217/why-scalability-matters?key=a23938a1-bb81-438e-973a-e24e7c2c961a</link><guid>http://ayende.com/blog/153217/why-scalability-matters?key=a23938a1-bb81-438e-973a-e24e7c2c961a</guid><pubDate>Wed, 13 Jun 2012 09:00:00 GMT</pubDate></item><item><title>The RavenDB indexing process: Optimization–Tuning? Why, we have auto tuning</title><description>&lt;p&gt;The final aspect of RavenDB’s x7 jump in indexing performance is the fact that we made it &lt;em&gt;freakishly smart&lt;/em&gt;.&lt;/p&gt; &lt;p&gt;During standard operation, most indexes only update when new information comes in, we are usually talking about a small number of documents for every indexing run. The problem is what happens when you have a sudden outpour of documents into RavenDB? 
For example, during a nightly ETL batch, or if you suddenly have a flood of users doing write operations.&lt;/p&gt; &lt;p&gt;The problem here is that we actually have to balance a lot of variables at the same time:&lt;/p&gt; &lt;ul&gt; &lt;li&gt;The number of documents that we have to index*.&lt;/li&gt; &lt;li&gt;The current memory utilization**.&lt;/li&gt; &lt;li&gt;How many cores do I have available to do the indexing work?&lt;/li&gt; &lt;li&gt;How much time do I have to do this?&lt;/li&gt;&lt;/ul&gt; &lt;p&gt;Basically, the idea goes like this: if I have a small batch size, I am able to index more quickly, ensuring that we have fresher results. If I have a big batch size, I am able to index more documents, and my overall indexing time goes down.&lt;/p&gt; &lt;p&gt;There is a non trivial cost associated with every indexing run, so reducing the number of indexing runs is good, but the more documents I shove into a single run, the more memory I will use, and the more time it will take before the results are visible to the users.&lt;/p&gt; &lt;p&gt;* It is non trivial because there is no easy way for us to even know how many documents we have left to index (finding out is costly).&lt;/p&gt; &lt;p&gt;** Memory utilization is &lt;em&gt;hard &lt;/em&gt;to figure out in a managed world. I don’t actually have a way to &lt;em&gt;know&lt;/em&gt; how much memory I am using for indexing and how much for other stuff, and there is no real way to say “free the memory from the last indexing run”, or even estimate how much memory that took.&lt;/p&gt; &lt;p&gt;What we have decided to do is start from a very small (low hundreds) indexing batch size, and see what is actually going on live. If we see that we have more documents to index than the current batch size, we will slowly double the size of the batch. Slowly, because bigger batches require more memory, and we also have to take into account current utilization, memory usage, and a bunch of other factors as well. 
We also go the other way around, and are able to reduce the indexing batch size on demand based on how much work we have to do right now.&lt;/p&gt; &lt;p&gt;We also provide an upper limit, because at some point it makes more sense to just do a big batch and make the indexing results visible than to try to do everything all at once. &lt;/p&gt; &lt;p&gt;The fun part in all of that is that once we have found the appropriate algorithm for this, it means that RavenDB will automatically adjust itself based on real production load. If you have a low update rate, it will favor small indexing batches and immediately execute indexing on the new documents. However, if you suddenly have a spike in traffic and the update rate goes up, RavenDB will adjust the indexing batch size so it will be able to keep up with your rate.&lt;/p&gt; &lt;p&gt;We have done some (read: a huge amount of) testing with regard to this new optimization, and it turns out that under a slow update frequency, we are seeing an average of 15 – 25 ms between a document update and it showing up in the indexes. That is pretty good, but what is going on when we have data just pouring in?&lt;/p&gt; &lt;p&gt;We tested this with 3 million documents and 3 indexes. It turns out that under this scenario, where we are trying to shove data into RavenDB as fast as it can accept it, we do see an increase in index latency. Under those conditions, latency rose all the way to 1.5 seconds.&lt;/p&gt; &lt;p&gt;This is actually something that I am &lt;em&gt;very&lt;/em&gt; happy about, because we were able to automatically adjust to the changing conditions, and were still able to index things at a reasonable rate (note that under this scenario, the batch size was usually 8 – 16 thousand documents, vs. 
the 128 – 256 that it normally is).&lt;/p&gt; &lt;p&gt;Because we were able to adjust the batch size on the fly, we could handle sustained writes at this rate with no interruption in service and no real need to think about this from the user’s perspective. Exactly what the RavenDB philosophy calls for.&lt;/p&gt;</description><link>http://ayende.com/blog/155425/the-ravendb-indexing-process-optimization-tuning-why-we-have-auto-tuning?key=e8935e72-3d8c-4b63-a000-2e4f35b2fc57</link><guid>http://ayende.com/blog/155425/the-ravendb-indexing-process-optimization-tuning-why-we-have-auto-tuning?key=e8935e72-3d8c-4b63-a000-2e4f35b2fc57</guid><pubDate>Tue, 24 Apr 2012 09:00:00 GMT</pubDate></item><item><title>The RavenDB indexing process: Optimization–Getting documents from disk</title><description>&lt;p&gt;As I noted in my &lt;a href="http://ayende.com/blog/154721/the-ravendb-indexing-process-optimization?key=c5c0347883c34378b5bae4c17d05a292"&gt;previous post&lt;/a&gt;, we have done major optimizations for RavenDB. One of the areas where we improved the performance was reading the documents from the disk for indexing.&lt;/p&gt; &lt;p&gt;In pseudo code, it looks like this:&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;while&lt;/span&gt; database_is_running:
  stale = find_stale_indexes()
  lastIndexedEtag = find_last_indexed_etag(stale)
  docs_to_index = &lt;font style="background-color: #ffff00"&gt;get_documents_since&lt;/font&gt;(lastIndexedEtag, batch_size)
  &lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;As it turned out, we had a major optimization option here, because of the way the data is actually structured on disk. In simple terms, we have an on disk index that lists the documents in the order in which they were updated, and then we have the actual documents themselves, which may be anywhere on the disk.&lt;/p&gt;
&lt;p&gt;Instead of loading the documents in the order in which they were modified, we decided to try something different. We first query the index for the information we need to find each document on disk, then we sort those locations based on the optimal access pattern, to reduce disk movement and ensure that our reads are as sequential as possible. Then we take those results in memory and sort them based on their last update time again.&lt;/p&gt;
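&lt;p&gt;The two sorts described above can be sketched in a few lines of Python. This is only an illustration and not RavenDB’s actual code: the IndexEntry type, its offset field and the read_at callback are hypothetical names, and the "disk" is a plain dictionary.&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;
```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    """Hypothetical entry in the on-disk 'by update order' index."""
    etag: int      # position in update order
    doc_id: str
    offset: int    # hypothetical byte offset of the document on disk


def read_batch(entries, read_at):
    """Read documents in disk order, then hand them back in update order."""
    # 1. Sort by physical position so reads are as sequential as possible.
    by_position = sorted(entries, key=lambda e: e.offset)
    loaded = [(entry, read_at(entry.offset)) for entry in by_position]
    # 2. Re-sort in memory by etag so indexing still sees update order.
    loaded.sort(key=lambda pair: pair[0].etag)
    return [(entry.doc_id, body) for entry, body in loaded]


# Fake "disk": offset -> document body, plus a log of the access pattern.
disk = {4096: "users/1", 0: "users/2", 8192: "users/3"}
access_log = []


def read_at(offset):
    access_log.append(offset)
    return disk[offset]


entries = [
    IndexEntry(etag=1, doc_id="users/1", offset=4096),
    IndexEntry(etag=2, doc_id="users/2", offset=0),
    IndexEntry(etag=3, doc_id="users/3", offset=8192),
]
docs = read_batch(entries, read_at)
```
&lt;/pre&gt;&lt;/blockquote&gt;
&lt;p&gt;In this sketch the access log records the offsets in ascending order, while the returned documents come back in etag order, which is the whole point of sorting twice.&lt;/p&gt;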
&lt;p&gt;This seems to be a perfectly obvious thing to do, assuming that you are aware of such things, but it is actually something that is very easy not to notice. The end result is quite promising, and it contributed to the 7+ times improvement in perf that we had for indexing costs.&lt;/p&gt;
&lt;p&gt;But surprisingly, it wasn’t the major factor. I’ll discuss a &lt;em&gt;huge&lt;/em&gt; perf boost in this area tomorrow.&lt;/p&gt;</description><link>http://ayende.com/blog/155201/the-ravendb-indexing-process-optimization-getting-documents-from-disk?key=83100102-0777-4529-9bfd-9f98734fea82</link><guid>http://ayende.com/blog/155201/the-ravendb-indexing-process-optimization-getting-documents-from-disk?key=83100102-0777-4529-9bfd-9f98734fea82</guid><pubDate>Mon, 23 Apr 2012 07:00:00 GMT</pubDate></item><item><title>The RavenDB indexing process: Optimization–De-parallelizing work</title><description>&lt;p&gt;One of the major dangers in doing perf work is that you have a scenario, and you optimize the &lt;em&gt;hell&lt;/em&gt; out of that scenario. It is actually pretty easy to do without even noticing it. The problem is that when you do things like that, you are likely to be optimizing a single scenario to perform really well, but you are hurting the overall system performance.&lt;/p&gt; &lt;p&gt;In this example, we have moved heaven and earth to make sure that we are indexing things as fast as possible, and we tested with 3 indexes, on a 4 core machine. As it turned out, we actually &lt;em&gt;had&lt;/em&gt; improved things, for &lt;em&gt;that particular scenario&lt;/em&gt;.&lt;/p&gt; &lt;p&gt;Using the same test case on a single core machine was suddenly far more heavy weight, because we were pushing a lot of work at the same time. More than the machine could process. The end result was that it actually got there, but much more slowly than if we had run things sequentially.&lt;/p&gt; &lt;p&gt;Of course, I am giving you the outliers, but those are good indicators of what we found out. Initially, we thought that we could resolve that by using the TPL’s MaxDegreeOfParallelism, but it turned out to be more complex than that. 
We have both IO bound and CPU bound tasks that we need to execute, and trying to execute IO heavy tasks with this would actually cause issues in this scenario.&lt;/p&gt; &lt;p&gt;We had to manually throttle things ourselves, both to limit the amount of parallel work, and because we have a lot more information about the actual tasks than the TPL has. We can schedule them in a way that is far more efficient because we can tell what is actually going on.&lt;/p&gt; &lt;p&gt;The end result is that we are actually using less parallelism, overall, but in a more efficient manner.&lt;/p&gt; &lt;p&gt;In my next post, I’ll discuss the auto batch tuning support, which allows us to do some really amazing things from the point of view of system performance. &lt;/p&gt;</description><link>http://ayende.com/blog/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work?key=ca7a266b-fb3e-4642-bf6d-f38b9357bc86</link><guid>http://ayende.com/blog/155393/the-ravendb-indexing-process-optimization-de-parallelizing-work?key=ca7a266b-fb3e-4642-bf6d-f38b9357bc86</guid><pubDate>Fri, 20 Apr 2012 09:00:00 GMT</pubDate></item><item><title>The RavenDB indexing process: Optimization–Parallelizing work</title><description>&lt;p&gt;
	One of the things that we are doing during the indexing process for RavenDB is applying triggers and deciding whether, and how, a document will be indexed. The actual process is a bit more involved, because we have to do additional things (like figure out which indexes have already indexed those particular documents).&lt;/p&gt;
&lt;p&gt;
	At any rate, the interesting thing is that this is a process which is pretty basic:&lt;/p&gt;
&lt;blockquote&gt;
	&lt;pre class="csharpcode"&gt;
&lt;span class="kwrd"&gt;for&lt;/span&gt; doc &lt;span class="kwrd"&gt;in&lt;/span&gt; docs:
    matchingIndexes = FindIndexesFor(doc)
    &lt;span class="kwrd"&gt;if&lt;/span&gt; matchingIndexes.Count &amp;gt; 0:
       doc = ExecuteTriggers(doc) 
       &lt;span class="kwrd"&gt;if&lt;/span&gt; doc != &lt;span class="kwrd"&gt;null&lt;/span&gt;:
          yield doc&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
	&lt;style type="text/css"&gt;
.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }	&lt;/style&gt;
&lt;/p&gt;
&lt;p&gt;
	The interesting thing about this is that this is a set of operations that only works on a single document at a time, and the result is the modified documents.&lt;/p&gt;
&lt;p&gt;
	We were able to gain a &lt;em&gt;significant&lt;/em&gt; perf boost by simply moving to a Parallel.ForEach call.&amp;nbsp; This seems simple enough, right? Parallelize the work, get better performance.&lt;/p&gt;
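&lt;p&gt;
	As a rough analogue of that change, here is the same loop in Python, fanned out over a thread pool. It is a sketch under stated assumptions rather than the RavenDB code: find_indexes_for and execute_triggers are hypothetical stand-ins for the real functions, and ThreadPoolExecutor plays the role that Parallel.ForEach plays in .NET.&lt;/p&gt;
&lt;blockquote&gt;
	&lt;pre class="csharpcode"&gt;
```python
from concurrent.futures import ThreadPoolExecutor


def find_indexes_for(doc):
    # Hypothetical stand-in: only documents with a "type" match an index.
    return ["by_type"] if "type" in doc else []


def execute_triggers(doc):
    # Hypothetical stand-in: a trigger may veto a document by returning None.
    if doc.get("deleted"):
        return None
    return {**doc, "triggered": True}


def process(doc):
    """The per-document body of the loop; it touches no shared state."""
    if not find_indexes_for(doc):
        return None
    return execute_triggers(doc)


def filter_docs(docs, max_workers=4):
    # Each document is handled independently, so the work can be fanned out.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(process, docs)  # map() preserves input order
        return [doc for doc in results if doc is not None]


docs = [
    {"id": 1, "type": "user"},
    {"id": 2},                                   # matches no index
    {"id": 3, "type": "user", "deleted": True},  # vetoed by a trigger
    {"id": 4, "type": "order"},
]
filtered = filter_docs(docs)
```
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;
	This only works because each call handles a single document and shares nothing with the others. Note that in CPython, threads mostly help when the per-document work is IO bound; for CPU bound triggers a process pool would be the closer match.&lt;/p&gt;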
&lt;p&gt;
	Except that there are issues with this as well, which I&amp;rsquo;ll touch on in my next post.&lt;/p&gt;
</description><link>http://ayende.com/blog/155233/the-ravendb-indexing-process-optimization-parallelizing-work?key=1b0d214b-677a-422a-8e17-fa739f5d2804</link><guid>http://ayende.com/blog/155233/the-ravendb-indexing-process-optimization-parallelizing-work?key=1b0d214b-677a-422a-8e17-fa739f5d2804</guid><pubDate>Thu, 19 Apr 2012 09:00:00 GMT</pubDate></item><item><title>The RavenDB indexing process: Optimization</title><description>&lt;p&gt;The actual process done by RavenDB to index documents is a fairly complex one. In order to understand what exactly happened, I decided to break it apart to pseudo code.&lt;/p&gt; &lt;p&gt;It looks something like this:&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;while&lt;/span&gt; database_is_running:
  stale = find_stale_indexes()
  lastIndexedEtag = find_last_indexed_etag(stale)
  docs_to_index = get_documents_since(lastIndexedEtag, batch_size)
  
  filtered_docs = execute_read_filters(docs_to_index)
  
  indexing_work = []
  
  &lt;span class="kwrd"&gt;for&lt;/span&gt; index &lt;span class="kwrd"&gt;in&lt;/span&gt; stale:
    
    index_docs = select_matching_docs(index, filtered_docs)
    
    &lt;span class="kwrd"&gt;if&lt;/span&gt; index_docs.empty:
      set_indexed(index, lastIndexedEtag)
    &lt;span class="kwrd"&gt;else&lt;/span&gt;:
      indexing_work.add(index, index_docs)
      
  &lt;span class="kwrd"&gt;for&lt;/span&gt; work &lt;span class="kwrd"&gt;in&lt;/span&gt; indexing_work:
  
     work.index(work.index_docs)&lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;And now let me show you the areas in which we did some perf work:&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;while&lt;/span&gt; database_is_running:
  stale = find_stale_indexes()
  lastIndexedEtag = find_last_indexed_etag(stale)
  docs_to_index = &lt;font style="background-color: #ffff00"&gt;get_documents_since&lt;/font&gt;(lastIndexedEtag, batch_size)
  
  filtered_docs = &lt;font style="background-color: #ffff00"&gt;execute_read_filters&lt;/font&gt;(docs_to_index)
  
  indexing_work = []
  
  &lt;span class="kwrd"&gt;for&lt;/span&gt; index &lt;span class="kwrd"&gt;in&lt;/span&gt; stale:
    
    index_docs = &lt;font style="background-color: #ffff00"&gt;select_matching_docs&lt;/font&gt;(index, filtered_docs)
    
    &lt;span class="kwrd"&gt;if&lt;/span&gt; index_docs.empty:
      set_indexed(index, lastIndexedEtag)
    &lt;span class="kwrd"&gt;else&lt;/span&gt;:
      indexing_work.add(index, index_docs)
      
  &lt;font style="background-color: #ffff00"&gt;&lt;span class="kwrd"&gt;for&lt;/span&gt; work &lt;span class="kwrd"&gt;in&lt;/span&gt; indexing_work:&lt;/font&gt;
  
     work.index(work.index_docs)&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;All of which gives us a &lt;em&gt;major&lt;/em&gt; boost in the system performance. I’ll discuss each part of that work in detail, don’t worry &lt;img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-winkingsmile" alt="Winking smile" src="http://ayende.com/blog/Images/Windows-Live-Writer/The-RavenDB-indexing-process_1177B/wlEmoticon-winkingsmile_2.png"&gt;&lt;/p&gt;</description><link>http://ayende.com/blog/154721/the-ravendb-indexing-process-optimization?key=c5c03478-83c3-4378-b5ba-e4c17d05a292</link><guid>http://ayende.com/blog/154721/the-ravendb-indexing-process-optimization?key=c5c03478-83c3-4378-b5ba-e4c17d05a292</guid><pubDate>Wed, 18 Apr 2012 09:00:00 GMT</pubDate></item><item><title>Performance implications of method signatures</title><description>&lt;p&gt;In my previous post, I asked: What are the performance implications of the two options?&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Performance-implications-of-method-signa_AE1F/image_thumb_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image_thumb" border="0" alt="image_thumb" src="http://ayende.com/blog/Images/Windows-Live-Writer/Performance-implications-of-method-signa_AE1F/image_thumb_thumb.png" width="891" height="85"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Versus:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Performance-implications-of-method-signa_AE1F/image_thumb1_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image_thumb1" border="0" alt="image_thumb1" 
src="http://ayende.com/blog/Images/Windows-Live-Writer/Performance-implications-of-method-signa_AE1F/image_thumb1_thumb.png" width="973" height="73"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;And the answer is quite simple: the chance to optimize how it works.&lt;/p&gt; &lt;p&gt;In the first example, we have to return an unknown amount of information. In the second example, we know how much data we need to return. That means that we can optimize ourselves based on that.&lt;/p&gt; &lt;p&gt;What do I mean by that?&lt;/p&gt; &lt;p&gt;Look at the method signatures: they require us to scan a secondary index, and get the results back. From there, we need to get back to the actual data. If we knew how much data we need to return, we could fetch just the locations from the index, then optimize our disk access pattern to take advantage of sequential reads. &lt;/p&gt; &lt;p&gt;In the first example, we have to assume that every read is the last read. Callers may request one item, or 25 or 713, so we don’t really have a way to optimize things. The moment that we have the amount that the caller wants, things change.&lt;/p&gt; &lt;p&gt;We can scan the index to get just the actual position of each document on disk, and then load the documents from the disk based on the optimal access pattern in terms of disk access. 
It is a very small change, but it allowed us to make a &lt;em&gt;big&lt;/em&gt; optimization.&lt;/p&gt;</description><link>http://ayende.com/blog/154529/performance-implications-of-method-signatures?key=0c3cfa8b-83a7-487d-a30b-697972ad1b1c</link><guid>http://ayende.com/blog/154529/performance-implications-of-method-signatures?key=0c3cfa8b-83a7-487d-a30b-697972ad1b1c</guid><pubDate>Tue, 03 Apr 2012 10:00:00 GMT</pubDate></item><item><title>Compare and contrast: Performance implications of method signatures</title><description>&lt;p&gt;What are the performance implications of the two options?&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Compare-and-contrast-Performance-implica_AD35/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Compare-and-contrast-Performance-implica_AD35/image_thumb.png" width="891" height="85"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Versus:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Compare-and-contrast-Performance-implica_AD35/image_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Compare-and-contrast-Performance-implica_AD35/image_thumb_1.png" width="973" 
height="73"&gt;&lt;/a&gt;&lt;/p&gt;</description><link>http://ayende.com/blog/154497/compare-and-contrast-performance-implications-of-method-signatures?key=838a499a-dc07-43c7-89d6-726b970ac449</link><guid>http://ayende.com/blog/154497/compare-and-contrast-performance-implications-of-method-signatures?key=838a499a-dc07-43c7-89d6-726b970ac449</guid><pubDate>Mon, 02 Apr 2012 10:00:00 GMT</pubDate></item><item><title>If you throttle me any me I am going to throttle you back!</title><description>&lt;p&gt;&lt;img style="display: inline; float: right" align="right" src="http://www.richielottoutdoors.com/2009/full_throttle_group.jpg" width="212" height="240"&gt;&lt;/p&gt; &lt;p&gt;It is interesting to note that for a long while, what we were trying to do with RavenDB was make it use fewer and fewer resources. One of the reasons for that is that using fewer resources is &lt;em&gt;obviously&lt;/em&gt; better, because we aren’t wasting anything.&lt;/p&gt; &lt;p&gt;The other reason is that we have users running us on 512MB/650 MHz Celeron 32 bit machines. So we really need to be able to fit into a small box (and also allow enough processing power for the user to actually do something with the machine).&lt;/p&gt; &lt;p&gt;We have gotten &lt;em&gt;really &lt;/em&gt;good at doing that, actually. &lt;/p&gt; &lt;p&gt;The problem is that we also have users running RavenDB on standard server hardware (32 GB / 16 cores, RAID and what not) in which case they (rightly) complain that RavenDB isn’t actually using all of their hardware.&lt;/p&gt; &lt;p&gt;Now, being conservative about resource &lt;em&gt;usage&lt;/em&gt; is generally good, and we do have the configuration in place which can tell RavenDB to use more memory. It is just that this isn’t &lt;em&gt;polite&lt;/em&gt; behavior. &lt;/p&gt; &lt;p&gt;RavenDB in most cases shouldn’t require anything special for you to run; we want it to be truly a zero admin database. 
The solution?&amp;nbsp; Take into account the system state and increase the amount of work that we do to get things done. And yes, I am aware of the &lt;a href="http://blogs.msdn.com/b/oldnewthing/archive/2012/01/18/10257834.aspx"&gt;pitfalls&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;As long as there is enough free RAM available, we will increase the number of documents that we are going to index in a single batch. That is subject to some limits (for example, if we just created a new index on a big database, we need to make sure we aren’t trying to load it entirely into memory), and it knows how to reserve some room for other things, and how to throttle down as well as up.&lt;/p&gt; &lt;p&gt;This post was written before I had the chance to actually test this on a production size dataset, but I am looking forward to seeing how it works. &lt;/p&gt; &lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: Okay, that is encouraging, it looks like what we did just made things over &lt;strong&gt;7 times faster&lt;/strong&gt;. And this isn’t a micro benchmark, this is when you throw this on a multi GB database with full text search indexing.&lt;/p&gt; &lt;p&gt;Next, we need to investigate what we are going to do about multiple running indexes and how this optimization affects them. Fun &lt;img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://ayende.com/blog/Images/Windows-Live-Writer/If-you-throttle-me-any-me-I-am-going-to-_10FBD/wlEmoticon-smile_2.png"&gt;.&lt;/p&gt;</description><link>http://ayende.com/blog/154625/if-you-throttle-me-any-me-i-am-going-to-throttle-you-back?key=d67eada4-4791-49f1-a046-631941264a36</link><guid>http://ayende.com/blog/154625/if-you-throttle-me-any-me-i-am-going-to-throttle-you-back?key=d67eada4-4791-49f1-a046-631941264a36</guid><pubDate>Thu, 29 Mar 2012 10:00:00 GMT</pubDate></item><item><title>Watch your 6, or is it your I/O? 
It is the I/O, yes</title><description>&lt;p&gt;As I said in my previous post, I was tasked with loading 3.1 million files into RavenDB, most of them in the 1 – 2 KB range. &lt;/p&gt; &lt;p&gt;Well, the &lt;em&gt;first &lt;/em&gt;thing I did had absolutely nothing to do with RavenDB. It had to do with avoiding dealing with this:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Watch-your-6-or-is-it-your-IO-It-is-the-_94BB/image_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Watch-your-6-or-is-it-your-IO-It-is-the-_94BB/image_thumb_1.png" width="278" height="199"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;As you can see, that is a &lt;em&gt;lot. &lt;/em&gt;&lt;/p&gt; &lt;p&gt;But when the freedb dataset is distributed, what we &lt;em&gt;have&lt;/em&gt; is actually:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/Watch-your-6-or-is-it-your-IO-It-is-the-_94BB/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/Watch-your-6-or-is-it-your-IO-It-is-the-_94BB/image_thumb.png" width="443" height="46"&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;This is a tar.bz2, which we can read using the SharpZipLib library.&lt;/p&gt; &lt;p&gt;The really interesting thing is that reading the archive (even after adding the cost of decompressing it) is &lt;em&gt;far&lt;/em&gt; faster than reading directly from the file system. 
Most file systems do badly with a large number of small files, and at any rate, it is &lt;em&gt;very&lt;/em&gt; hard to optimize the access pattern to a lot of small files.&lt;/p&gt; &lt;p&gt;However, when we are talking about something like reading a single large file? That is really easy to optimize and significantly reduces the cost of the input I/O.&lt;/p&gt; &lt;p&gt;Just this step has reduced the cost of importing by a significant factor (roughly a factor of two), and with a &lt;em&gt;lot&lt;/em&gt; less disk activity.&lt;/p&gt;</description><link>http://ayende.com/blog/154465/watch-your-6-or-is-it-your-i-o-it-is-the-i-o-yes?key=0d9ee7ac-e864-4fef-b97b-3f0f1d2bd77d</link><guid>http://ayende.com/blog/154465/watch-your-6-or-is-it-your-i-o-it-is-the-i-o-yes?key=0d9ee7ac-e864-4fef-b97b-3f0f1d2bd77d</guid><pubDate>Wed, 28 Mar 2012 10:00:00 GMT</pubDate></item><item><title>Watch your 6, or is it your I/O?</title><description>&lt;p&gt;One of the interesting things about the freedb dataset is that it is distributed as 3.1 million &lt;em&gt;separate files, &lt;/em&gt;most of them in the 1 – 2 KB range.&lt;/p&gt; &lt;p&gt;Loading that into RavenDB took a while, so I set out to fix that. Care to guess what was absolutely the first thing that I did?&lt;/p&gt;</description><link>http://ayende.com/blog/154433/watch-your-6-or-is-it-your-i-o?key=75f640e5-87b0-4714-9cbc-031d1b886d41</link><guid>http://ayende.com/blog/154433/watch-your-6-or-is-it-your-i-o?key=75f640e5-87b0-4714-9cbc-031d1b886d41</guid><pubDate>Tue, 27 Mar 2012 10:00:00 GMT</pubDate></item><item><title>When you pit RavenDB &amp; SQL Server against one another…</title><description>&lt;p&gt;Here is how it works. I &lt;em&gt;hate&lt;/em&gt; benchmarks, because they are very easily manipulated. 
Whenever I am testing performance stuff, I post numbers, but they are usually relative to our own earlier results (showing improvements).&lt;/p&gt; &lt;p&gt;That said… &lt;/p&gt; &lt;p&gt;Mark Rodseth is a .Net Technical Architect at &lt;a href="http://www.fortunecookie.co.uk/"&gt;Fortune Cookie&lt;/a&gt; in London, UK, and he did a really interesting comparison between RavenDB &amp;amp; SQL Server. I feel good about posting this because Mark is a totally foreign agent (hm…. well, maybe not that &lt;img style="border-bottom-style: none; border-left-style: none; border-top-style: none; border-right-style: none" class="wlEmoticon wlEmoticon-smile" alt="Smile" src="http://ayende.com/blog/Images/Windows-Live-Writer/RavenDB-and_11869/wlEmoticon-smile_2.png"&gt; ) but he has no association with RavenDB or Hibernating Rhinos.&lt;/p&gt; &lt;p&gt;Also, &lt;a href="http://tech-rash.blogspot.com/2012/02/is-raven-db-all-its-cracked-up-to-be.html"&gt;this post&lt;/a&gt; really made my day.&lt;/p&gt; &lt;blockquote&gt; &lt;p&gt;&lt;strong&gt;Update: &lt;/strong&gt;Mark posted &lt;a href="http://tech-rash.blogspot.com/2012/02/ravendb-vs-sql-follow-up.html"&gt;more details&lt;/a&gt; on his test case.&lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;Mark set up a load test for two identical applications, one using RavenDB, the other one using SQL Server. The results:&lt;/p&gt; &lt;blockquote&gt; &lt;p&gt;&lt;b&gt;SQL Load Test&lt;/b&gt;&lt;br&gt;Transactions: 111,014 (Transaction = Single Get Request)&lt;br&gt;Failures: 110,286 (Any 500 or timeout)&lt;br&gt; &lt;p&gt;&lt;img src="http://3.bp.blogspot.com/-s8OGQmWkVGw/TzEEukU_EeI/AAAAAAAAAOE/KlBMa9Kx7jI/s400/SQLDbThroughPut.jpg"&gt;&lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;And for RavenDB?  
&lt;blockquote&gt; &lt;p&gt;&lt;b&gt;RavenDB Load Test&lt;/b&gt;&lt;br&gt;Transactions: 145,554 (Transaction = Single Get Request)&lt;br&gt;Failures: 0 (Any 500 or timeout)  &lt;p&gt;&lt;img src="http://3.bp.blogspot.com/-4zPzgzwfI8U/TzEE2mxsNBI/AAAAAAAAAOM/G-f6XoqRMYE/s400/RavenDbThroughPut.jpg"&gt;&lt;/p&gt;&lt;/blockquote&gt; &lt;p&gt;And now &lt;em&gt;that&lt;/em&gt; is pretty cool.&lt;/p&gt;</description><link>http://ayende.com/blog/154593/when-you-pit-ravendb-sql-server-against-one-another?key=66b0a94c-e178-4429-9d4a-238e1eec1699</link><guid>http://ayende.com/blog/154593/when-you-pit-ravendb-sql-server-against-one-another?key=66b0a94c-e178-4429-9d4a-238e1eec1699</guid><pubDate>Thu, 09 Feb 2012 16:07:00 GMT</pubDate></item><item><title>Stupid smart code: Solution</title><description>&lt;p&gt;The reason that I said that this is very stupid code?&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;static&lt;/span&gt; &lt;span class="kwrd"&gt;void&lt;/span&gt; WriteDataToRequest(HttpWebRequest req, &lt;span class="kwrd"&gt;string&lt;/span&gt; data)
{
    var byteCount = Encoding.UTF8.GetByteCount(data);
    req.ContentLength = byteCount;
    &lt;span class="kwrd"&gt;using&lt;/span&gt; (var dataStream = req.GetRequestStream())
    {
        &lt;span class="kwrd"&gt;if&lt;/span&gt;(byteCount &amp;lt;= 0x1000) &lt;span class="rem"&gt;// small size, just let the system allocate it&lt;/span&gt;
        {
            var bytes = Encoding.UTF8.GetBytes(data);
            dataStream.Write(bytes, 0, bytes.Length);
            dataStream.Flush();
            &lt;span class="kwrd"&gt;return&lt;/span&gt;;
        }

        var buffer = &lt;span class="kwrd"&gt;new&lt;/span&gt; &lt;span class="kwrd"&gt;byte&lt;/span&gt;[0x1000];
        var maxCharsThatCanFitInBuffer = buffer.Length / Encoding.UTF8.GetMaxByteCount(1);
        var charBuffer = &lt;span class="kwrd"&gt;new&lt;/span&gt; &lt;span class="kwrd"&gt;char&lt;/span&gt;[maxCharsThatCanFitInBuffer];
        &lt;span class="kwrd"&gt;int&lt;/span&gt; start = 0;
        var encoder = Encoding.UTF8.GetEncoder();
        &lt;span class="kwrd"&gt;while&lt;/span&gt; (start &amp;lt; data.Length)
        {
            var charCount = Math.Min(charBuffer.Length, data.Length - start);

            data.CopyTo(start, charBuffer, 0, charCount);
            var bytes = encoder.GetBytes(charBuffer, 0, charCount, buffer, 0, &lt;span class="kwrd"&gt;false&lt;/span&gt;);
            dataStream.Write(buffer, 0, bytes);
            start += charCount;
        }
        dataStream.Flush();
    }
}&lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;Because all of this lovely code can be replaced with a simple:&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;static&lt;/span&gt; &lt;span class="kwrd"&gt;void&lt;/span&gt; WriteDataToRequest(HttpWebRequest req, &lt;span class="kwrd"&gt;string&lt;/span&gt; data)
{
    req.ContentLength = Encoding.UTF8.GetByteCount(data);

    &lt;span class="kwrd"&gt;using&lt;/span&gt; (var dataStream = req.GetRequestStream())
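    &lt;span class="kwrd"&gt;using&lt;/span&gt;(var writer = &lt;span class="kwrd"&gt;new&lt;/span&gt; StreamWriter(dataStream, &lt;span class="kwrd"&gt;new&lt;/span&gt; UTF8Encoding(&lt;span class="kwrd"&gt;false&lt;/span&gt;))) &lt;span class="rem"&gt;// no BOM, otherwise we write more bytes than ContentLength&lt;/span&gt;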
    {
        writer.Write(data);
        writer.Flush();
    }
}
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;And that is &lt;em&gt;so&lt;/em&gt; much better.&lt;/p&gt;</description><link>http://ayende.com/blog/148481/stupid-smart-code-solution?key=2945f3ff-de0a-4088-885d-a977668dd40c</link><guid>http://ayende.com/blog/148481/stupid-smart-code-solution?key=2945f3ff-de0a-4088-885d-a977668dd40c</guid><pubDate>Wed, 21 Dec 2011 08:00:00 GMT</pubDate></item><item><title>Stupid smart code</title><description>&lt;p&gt;We had the following code:&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;static&lt;/span&gt; &lt;span class="kwrd"&gt;void&lt;/span&gt; WriteDataToRequest(HttpWebRequest req, &lt;span class="kwrd"&gt;string&lt;/span&gt; data)
{
    var byteArray = Encoding.UTF8.GetBytes(data);

    req.ContentLength = byteArray.Length;

    &lt;span class="kwrd"&gt;using&lt;/span&gt; (var dataStream = req.GetRequestStream())
    {
        dataStream.Write(byteArray, 0, byteArray.Length);
        dataStream.Flush();
    }
}&lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;And that is a problem, because it allocates the memory twice, once for the string, once for the buffer. I changed that to this:&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;static&lt;/span&gt; &lt;span class="kwrd"&gt;void&lt;/span&gt; WriteDataToRequest(HttpWebRequest req, &lt;span class="kwrd"&gt;string&lt;/span&gt; data)
{
    var byteCount = Encoding.UTF8.GetByteCount(data);
    req.ContentLength = byteCount;
    &lt;span class="kwrd"&gt;using&lt;/span&gt; (var dataStream = req.GetRequestStream())
    {
        &lt;span class="kwrd"&gt;if&lt;/span&gt;(byteCount &amp;lt;= 0x1000) &lt;span class="rem"&gt;// small size, just let the system allocate it&lt;/span&gt;
        {
            var bytes = Encoding.UTF8.GetBytes(data);
            dataStream.Write(bytes, 0, bytes.Length);
            dataStream.Flush();
            &lt;span class="kwrd"&gt;return&lt;/span&gt;;
        }

        var buffer = &lt;span class="kwrd"&gt;new&lt;/span&gt; &lt;span class="kwrd"&gt;byte&lt;/span&gt;[0x1000];
        var maxCharsThatCanFitInBuffer = buffer.Length / Encoding.UTF8.GetMaxByteCount(1);
        var charBuffer = &lt;span class="kwrd"&gt;new&lt;/span&gt; &lt;span class="kwrd"&gt;char&lt;/span&gt;[maxCharsThatCanFitInBuffer];
        &lt;span class="kwrd"&gt;int&lt;/span&gt; start = 0;
        var encoder = Encoding.UTF8.GetEncoder();
        &lt;span class="kwrd"&gt;while&lt;/span&gt; (start &amp;lt; data.Length)
        {
            var charCount = Math.Min(charBuffer.Length, data.Length - start);

            data.CopyTo(start, charBuffer, 0, charCount);
            var bytes = encoder.GetBytes(charBuffer, 0, charCount, buffer, 0, &lt;span class="kwrd"&gt;false&lt;/span&gt;);
            dataStream.Write(buffer, 0, bytes);
            start += charCount;
        }
        dataStream.Flush();
    }
}
&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;And I was quite proud of myself.&lt;/p&gt;
&lt;p&gt;Then I realized that I was stupid. Why?&lt;/p&gt;</description><link>http://ayende.com/blog/147457/stupid-smart-code?key=07155aa2-9aa6-4f67-8782-b3a353b06dd6</link><guid>http://ayende.com/blog/147457/stupid-smart-code?key=07155aa2-9aa6-4f67-8782-b3a353b06dd6</guid><pubDate>Tue, 20 Dec 2011 10:00:00 GMT</pubDate></item><item><title>You can’t cache DateTime.Now</title><description>&lt;p&gt;One of the things that was bugging me was that not all of the queries in RaccoonBlog seemed to be hitting the cache. Oh, it is more than fast enough, but I couldn’t really figure out what was going on there. Then again, it was never important enough for me to dig in.&lt;/p&gt; &lt;p&gt;I was busy doing the profiling stuff for RavenDB and I used RaccoonBlog as my testing ground, when I realized what the problem was:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/You-cant-cache-DateTime.Now_F5AF/image_2.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/You-cant-cache-DateTime.Now_F5AF/image_thumb.png" width="1044" height="61"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Everything was working just fine; the problem was here:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/You-cant-cache-DateTime.Now_F5AF/image_4.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/You-cant-cache-DateTime.Now_F5AF/image_thumb_1.png" width="635" height="92"&gt;&lt;/a&gt;&lt;/p&gt; &lt;p&gt;Do you get the light bulb moment that I had? 
I was using Now in a query, and since Now by &lt;em&gt;definition&lt;/em&gt; changes, we kept generating new queries, which couldn’t be cached, etc.&lt;/p&gt; &lt;p&gt;I changed all of the queries that contained Now to:&lt;/p&gt; &lt;p&gt;&lt;a href="http://ayende.com/blog/Images/Windows-Live-Writer/You-cant-cache-DateTime.Now_F5AF/image_8.png"&gt;&lt;img style="background-image: none; border-bottom: 0px; border-left: 0px; margin: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="image" border="0" alt="image" src="http://ayende.com/blog/Images/Windows-Live-Writer/You-cant-cache-DateTime.Now_F5AF/image_thumb_3.png" width="712" height="83"&gt;&lt;/a&gt;&lt;/p&gt;   &lt;p&gt;This means that the query only gets a different value once a minute. Once I fixed that, I &lt;em&gt;still&lt;/em&gt; saw that there was a caching problem, which led me to discover that there was an error in how we calculated etags for dynamic indexes after they had been promoted. Even a very basic profiling tool helped us fix two separate bugs (in RaccoonBlog and in RavenDB).&lt;/p&gt;</description><link>http://ayende.com/blog/26625/you-cant-cache-datetime-now?key=6924bb46-75cc-4724-810c-163d02aab58d</link><guid>http://ayende.com/blog/26625/you-cant-cache-datetime-now?key=6924bb46-75cc-4724-810c-163d02aab58d</guid><pubDate>Tue, 28 Jun 2011 09:00:00 GMT</pubDate></item><item><title>RavenDB Aggressive Caching Mode</title><description>&lt;p&gt;RavenDB has the notion of HTTP caching out of the box. What this means is that by default, without you having to take any action, RavenDB will cache as much as possible for you. It can get away with doing this because it relies on the 304 Not Modified response. 
The second time that we load an entity or execute a query, we can simply ask the server whether the data has been modified since the last time we saw it, and if it wasn’t, we can just skip executing the code and get the data directly from the cache.&lt;/p&gt; &lt;p&gt;This saves a lot in terms of bandwidth and processing power. It also means that we don’t have to worry about RavenDB’s caching returning any stale results, because we checked for that. It does mean, however, that we have to send a request to the server. There are situations where we want to squeeze even better performance from the system, and for those we can move to RavenDB’s aggressive caching mode.&lt;/p&gt; &lt;p&gt;Aggressive caching means that RavenDB won’t even ask the server whether anything has changed, it will simply return the reply directly from the local cache if it is there. This means that you might get stale data, but it also means that you’ll get it &lt;em&gt;fast&lt;/em&gt;.&lt;/p&gt; &lt;p&gt;You can activate this mode using:&lt;/p&gt; &lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;using&lt;/span&gt; (session.Advanced.DocumentStore.AggressivelyCacheFor(TimeSpan.FromMinutes(5)))
{
    session.Load&amp;lt;User&amp;gt;(&lt;span class="str"&gt;"users/1"&lt;/span&gt;);
}
&lt;/pre&gt;
&lt;style type="text/css"&gt;.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }
&lt;/style&gt;
&lt;/blockquote&gt;
&lt;p&gt;Now, if there is a value in the cache for users/1 that is at most 5 minutes old, we can directly use that.&lt;/p&gt;
&lt;p&gt;It works on queries too:&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;using&lt;/span&gt; (session.Advanced.DocumentStore.AggressivelyCacheFor(TimeSpan.FromMinutes(5)))
{
    session.Query&amp;lt;User&amp;gt;().ToList();
}&lt;/pre&gt;
&lt;/blockquote&gt;
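&lt;p&gt;To make the idea concrete, here is a minimal sketch of what “use the cached value if it is young enough” looks like. This is &lt;em&gt;not&lt;/em&gt; RavenDB’s actual client code: the AggressiveCache class, its Get method and the loadFromServer delegate are all names that I made up for illustration, and the sketch ignores thread safety and eviction.&lt;/p&gt;
&lt;blockquote&gt;&lt;pre class="csharpcode"&gt;using System;
using System.Collections.Generic;

// Hypothetical sketch, not RavenDB's client code: all names here are made up.
public class AggressiveCache&amp;lt;T&amp;gt;
{
    // key -&amp;gt; (time we fetched the value, the value itself)
    private readonly Dictionary&amp;lt;string, KeyValuePair&amp;lt;DateTime, T&amp;gt;&amp;gt; entries =
        new Dictionary&amp;lt;string, KeyValuePair&amp;lt;DateTime, T&amp;gt;&amp;gt;();

    public T Get(string key, TimeSpan maxAge, Func&amp;lt;string, T&amp;gt; loadFromServer)
    {
        KeyValuePair&amp;lt;DateTime, T&amp;gt; entry;
        if (entries.TryGetValue(key, out entry) &amp;amp;&amp;amp;
            DateTime.UtcNow - entry.Key &amp;lt;= maxAge)
        {
            // Young enough: answer locally, without even asking the server.
            return entry.Value;
        }

        // Missing or too old: make the remote call and remember when we did.
        var value = loadFromServer(key);
        entries[key] = new KeyValuePair&amp;lt;DateTime, T&amp;gt;(DateTime.UtcNow, value);
        return value;
    }
}&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;With something like this, calling Get for users/1 twice within five minutes would invoke loadFromServer only once. The regular caching mode would instead go to the server every time and only skip the work when it gets a 304 back.&lt;/p&gt;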
&lt;p&gt;This is an explicit step beyond the normal caching, and it does mean that you might get out of date information, but if you really want to reduce the number of remote calls, it is a really nice feature.&lt;/p&gt;</description><link>http://ayende.com/blog/25601/ravendb-aggressive-caching-mode?key=45166843-5a5c-43ca-8d9d-c8854978285a</link><guid>http://ayende.com/blog/25601/ravendb-aggressive-caching-mode?key=45166843-5a5c-43ca-8d9d-c8854978285a</guid><pubDate>Mon, 27 Jun 2011 09:00:00 GMT</pubDate></item><item><title>Performance numbers in the pub</title><description>&lt;blockquote&gt;Originally posted at 3/31/2011&lt;/blockquote&gt;&lt;p&gt;I am currently sitting with 3 guys in the pub, and the discussion naturally turned to performance. I asked all three the following question: “How many CLR objects can you create in one second?”&lt;/p&gt;  &lt;p&gt;I got the following replies:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;2,000 objects per second&lt;/li&gt;    &lt;li&gt;50,000 objects per second&lt;/li&gt;    &lt;li&gt;100,000 objects per second&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Then I sat down to check:&lt;/p&gt;  &lt;blockquote&gt;   &lt;pre class="csharpcode"&gt;&lt;span class="kwrd"&gt;class&lt;/span&gt; Program
{
    &lt;span class="kwrd"&gt;static&lt;/span&gt; &lt;span class="kwrd"&gt;void&lt;/span&gt; Main(&lt;span class="kwrd"&gt;string&lt;/span&gt;[] args)
    {
        var sp = Stopwatch.StartNew();

        &lt;span class="kwrd"&gt;int&lt;/span&gt; i = 0;
        &lt;span class="kwrd"&gt;while&lt;/span&gt;(sp.ElapsedMilliseconds &amp;lt; 1000)
        {
            &lt;span class="kwrd"&gt;new&lt;/span&gt; MyClass();
            i++;
        }
        sp.Stop();
        Console.WriteLine(&lt;span class="str"&gt;"Created {0} in {1}"&lt;/span&gt;, i, sp.Elapsed);
    }
}

&lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;class&lt;/span&gt; MyClass
{
    &lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;string&lt;/span&gt; A;
    &lt;span class="kwrd"&gt;public&lt;/span&gt; &lt;span class="kwrd"&gt;int&lt;/span&gt; B;
    &lt;span class="kwrd"&gt;public&lt;/span&gt; DateTime C;
}&lt;/pre&gt;
  &lt;style type="text/css"&gt;&lt;![CDATA[
.csharpcode, .csharpcode pre
{
	font-size: small;
	color: black;
	font-family: consolas, "Courier New", courier, monospace;
	background-color: #ffffff;
	/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt 
{
	background-color: #f4f4f4;
	width: 100%;
	margin: 0em;
}
.csharpcode .lnum { color: #606060; }]]&gt;&lt;/style&gt;&lt;/blockquote&gt;

&lt;p&gt;The result?&lt;/p&gt;

&lt;p&gt;Created 7,715,305 in 00:00:01&lt;/p&gt;</description><link>http://ayende.com/blog/4811/performance-numbers-in-the-pub?key=37b089f9-0fb2-42f8-a68a-d7b949d03b8e</link><guid>http://ayende.com/blog/4811/performance-numbers-in-the-pub?key=37b089f9-0fb2-42f8-a68a-d7b949d03b8e</guid><pubDate>Thu, 14 Apr 2011 09:00:00 GMT</pubDate></item><item><title>You are only as fast as your slowest bottleneck</title><description>&lt;p&gt;Chris points out &lt;a href="http://ayende.com/Blog/archive/2010/10/23/what-is-the-cost-of-storage-again.aspx#42991"&gt;something very important&lt;/a&gt;:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;“A much better solution would have been to simply put the database on a compressed directory, which would slow down some IO ..."     &lt;br /&gt;&lt;/p&gt;    &lt;p&gt;I don't agree.     &lt;br /&gt;Compression needs CPU. We got a lot of more IO by switching on compression (it's just less to write and read). Previous our CPU was about 40%, now averaging at 70%. Compression rate saves us about 30% per file. After switching on compression our IO bound application was about 20% faster.      &lt;br /&gt;We are currently planning switching on compression on all our production servers over Christmas, because using cpu-cores for compression is even cheaper than adding hard disks and raid for performance.&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;In general, most operations today are mostly IO bound, with the CPU mostly sitting there twiddling  the same byte until that byte threatens to sue for harassment. 
It makes sense to spend CPU time in order to save IO, because our systems are being starved for IO.&lt;/p&gt;  &lt;p&gt;In fact, you can just turn on compression at the file system level in most OSes, and it is likely to result in a significant improvement in application performance, assuming that the data does not already fit in memory.&lt;/p&gt;</description><link>http://ayende.com/blog/4683/you-are-only-as-fast-as-your-slowest-bottleneck?key=89038690-ed57-490e-ab22-6cc45c954be7</link><guid>http://ayende.com/blog/4683/you-are-only-as-fast-as-your-slowest-bottleneck?key=89038690-ed57-490e-ab22-6cc45c954be7</guid><pubDate>Wed, 03 Nov 2010 10:00:00 GMT</pubDate></item><item><title>How to become a speaker?</title><description>&lt;p&gt;I get asked that quite frequently. More to the point, how to become an international speaker?&lt;/p&gt;  &lt;p&gt;I was recently at a gathering where no less than three different people asked me this question, so I thought that it might be a good post.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;Note: this post isn’t meant for someone who isn’t already speaking. And if you are speaking but are bad at it, this isn’t for you. The underlying assumption here is that you can speak and are reasonably good at it.&lt;/p&gt;    &lt;p&gt;Note II: For this post, speaking is used to refer to presenting some technical content in front of an audience. &lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;strong&gt;Why would you want to be a speaker anyway?&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;I heard that it is actually possible to make a living as a speaker. I haven’t found it to be the case, but then again, while I speak frequently, I don’t speak &lt;em&gt;that&lt;/em&gt; frequently. 
&lt;/p&gt;  &lt;p&gt;There are several reasons to want to be a speaker:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;reputation (and in the end, a good reputation means you get to raise your rates, get more work, etc).&lt;/li&gt;    &lt;li&gt;contacts (speaking puts you in front of dozens or hundreds of people, and afterward you get to talk with the people who are most interested in what you talked about)&lt;/li&gt;    &lt;li&gt;advertising for your product (all those “lap around Visual Studio 2010” talks are actually hour long ads that you paid to see :-) ).&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;I’ll focus on the first two. Reputation &amp;amp; contacts give you a much wider pool of potential work that you can choose from, increase the money you can make, etc.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;So how do I do that, damn it?&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;Honestly, I have no idea. The first time that I presented at a technical conference, it was due to a mixup in communication. Apparently in the US “it would have been delightful” means “we regret to inform”, but in Israel we read that as “great, let us do it”, and put the guy on the spot, so he had to scramble and do something.&lt;/p&gt;  &lt;p&gt;Okay, I lied, I do have some idea about how to do this.&lt;/p&gt;  &lt;p&gt;Again, I am assuming you are a reasonably good speaker (for myself, I know that my accent is a big problem when speaking English), but there are a &lt;em&gt;lot&lt;/em&gt; of reasonably good speakers out there.&lt;/p&gt;  &lt;p&gt;So, what is the answer? &lt;em&gt;Make yourself different&lt;/em&gt;.&lt;/p&gt;  &lt;p&gt;Pick a topic that is near &amp;amp; dear to your heart (or to your purse, which also works) and prepare a few talks on it. Write about it in a blog, comment on other people’s blogs about the topic. Your goal should be that when people think about topic X, your name is on the list that comes to mind.  Forums like Stack Overflow can help, as can writing articles (whether for pay or in places like CodeProject). 
Join a mailing list and be active there (and helpful). Don’t focus on regionally associated forums / mailing lists, though. The goal is international acknowledgement. &lt;/p&gt;  &lt;p&gt;It will probably take at least a year for people to start recognizing your name (it took over 2 years for me). If it is possible, produce a set of tools that relate to your topic. Publish them for free, and write it off as an investment in your future. &lt;/p&gt;  &lt;p&gt;For myself, NHibernate Query Analyzer was a huge boost in terms of getting recognized. And Rhino Mocks was probably what clinched the deal. I honestly have no idea how much time &amp;amp; effort I put into Rhino Mocks, but &lt;a href="http://www.ohloh.net/p/8830"&gt;Ohloh&lt;/a&gt; estimates that project at $ 12,502,089(!). While I disagree with that number, I did put a &lt;em&gt;lot&lt;/em&gt; of effort into it, and it paid for itself several times over.&lt;/p&gt;  &lt;p&gt;If you don’t have a blog, get one. Don’t get one at a community site, either. Community sites like blogs.microsoft.co.il are good to get your stuff read, but they have a big weakness in terms of branding yourself. You &lt;em&gt;don’t&lt;/em&gt; want to get lost in a crowd, you want people to notice who you are. And most people are going to read your posts in a feed reader, and they are going to notice that the &lt;em&gt;community feed&lt;/em&gt; is interesting, not that &lt;em&gt;you&lt;/em&gt; are interesting.&lt;/p&gt;  &lt;p&gt;Post regularly. I try to have a daily post, but that would probably not be possible for you. Try to post at least once a week, and try to time it so it is always on the same day &amp;amp; time. 
Monday’s midnight usually works.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Okay, I did all of that, what now?&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;Another note, this is something that you may want to do in parallel to the other efforts.&lt;/p&gt;  &lt;p&gt;Unless you become &lt;em&gt;very&lt;/em&gt; well known, you won’t be approached, you’ll have to submit session suggestions. Keep an eye on the conferences that interest you, and wait until they have a call for sessions. Submit your stuff. Don’t get offended if they reject you.&lt;/p&gt;  &lt;p&gt;If you live in a place that hosts international conferences (which usually rules Israel out), a good bet is to try to get accepted as a speaker there. You would be considerably cheaper than bringing someone from out of town/country, and that also plays a role. Usually, if you managed to get into a conference once, they’ll be much more likely to have you again. They have your speaker eval, and unless you truly sucked (like going on stage and starting to speak in Hebrew in Denmark), that gives them more confidence in bringing you a second time.&lt;/p&gt;  &lt;p&gt;And that is about it for now.&lt;/p&gt;</description><link>http://ayende.com/blog/4596/how-to-become-a-speaker?key=4aa231c3-0501-4c2b-a253-165ef4b0fbcb</link><guid>http://ayende.com/blog/4596/how-to-become-a-speaker?key=4aa231c3-0501-4c2b-a253-165ef4b0fbcb</guid><pubDate>Thu, 19 Aug 2010 09:00:00 GMT</pubDate></item><item><title>Paxos enlightment</title><description>&lt;p&gt;Paxos is an algorithm used to reach consensus among a group of machines, which is resilient to failures. For a long time, I had a really hard time understanding Paxos. Or, to be rather more exact, I didn’t have an issue with Paxos per-se, I understood the protocol. What I had trouble with was its application.&lt;/p&gt;  &lt;p&gt;My problem always was that I couldn’t figure out what you &lt;em&gt;do&lt;/em&gt; with it. 
That had to do with a basic problem on my part, I failed to understand how to go from a shared consensus on a numeric value to something that is actually useful. After a while, I had enough of feeling stupid, and I started reading all the material that I could on that, including going through the source code of available Paxos implementations (such as libpaxos). The main problem was that I had a huge misconception in my head.&lt;/p&gt;  &lt;p&gt;I kept thinking that Paxos is a consensus algorithm to arrive at a value in a distributed system. It isn’t, and because I kept thinking that it is, I had a &lt;em&gt;really&lt;/em&gt; hard time understanding what it does and how to apply it.&lt;/p&gt;  &lt;p&gt;Leslie Lamport [pdf], the original author of the algorithm, describes the goal of Paxos as follows:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;Assume a collection of processes that can propose values. A consensus algorithm ensures that a single one among the proposed values is chosen. If no value is proposed, then no value should be chosen. If a value has been chosen, then processes should be able to learn the chosen value.&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;&lt;img style="display: inline; margin-left: 0px; margin-right: 0px" align="right" src="http://humanityquest.com/themes/inspiration/Comics/images_Microsoft/%20BeanManIdeaLight.gif" /&gt;This is from the &lt;a href="http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf"&gt;Paxos Made Simple paper&lt;/a&gt;. I don’t know about you, but I have a hard time going from a proposed value to something useful. What finally made everything click was the &lt;a href="http://www.inf.usi.ch/faculty/pedone/MScThesis/marco.pdf"&gt;Paxos Made Code&lt;/a&gt; paper, which describes the implementation of &lt;a href="http://libpaxos.sourceforge.net/"&gt;libpaxos&lt;/a&gt;. 
While reading the paper, I had a light bulb moment.&lt;/p&gt;  &lt;p&gt;Paxos is an algorithm used to ensure consistent ordering semantics over a set of values in a cluster of machines. Why is this important? Because if you have consistent ordering over a set of values, and the values are events (or commands, or states), you can be sure that all machines in the cluster have either the same state or a previous version of the state.&lt;/p&gt;  &lt;p&gt;Let us see how this can be useful. We have the following scenario:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Jane has no money in her bank account.&lt;/li&gt;    &lt;li&gt;Joe sends 100$ to Jane.&lt;/li&gt;    &lt;li&gt;When Jane is notified that she got the 100$, she sends a 50$ check to the IRS.&lt;/li&gt;    &lt;li&gt;The IRS cashes the check and spends it on something stupid.&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;&lt;a href="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_6.png"&gt;&lt;img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_thumb_2.png" width="620" height="431" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p /&gt;  &lt;p /&gt;  &lt;p&gt;Without a consistent ordering, each machine in the cluster may view the events in any order, which means that the following three timelines are allowed:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_8.png"&gt;&lt;img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_thumb_3.png" width="644" height="437" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;As you can imagine, Jane isn’t very happy about that overdraft fee she was just charged 
with. It is important to note that Paxos’ sole use here is to ensure that all machines in the cluster will have a &lt;em&gt;consistent&lt;/em&gt; view of events. That view might not be the same as the order in which those events showed up. It is possible to give that guarantee as well, on top of Paxos, but that isn’t the topic of this post.&lt;/p&gt;  &lt;p&gt;Now that we all (hopefully) understand what Paxos is and what it is used for, let us talk about the algorithm itself.&lt;/p&gt;  &lt;p&gt;Paxos has three actor types (usually, the different actors are all part of the same system, either all together or two of the roles together). We will assume that we have 3 of each in our cluster and that we are interested in events ordered by sequential integer event ids with no gaps:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_16.png"&gt;&lt;img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_thumb_7.png" width="414" height="173" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;When you want to make a change in the system, you go to the proposer and tell it that you want to add event &lt;em&gt;SendCheck { To= “IRS”, Amount = 50 } &lt;/em&gt;to the system.&lt;/p&gt;  &lt;p&gt;The proposer then checks the latest event id that it knows about, increments it, and then asks all the acceptors in the cluster to reserve that event id for its use. 
(Please note that I am intentionally skipping the details of how this is done; I am trying to give a high level description here. You can read the actual algorithm description for all the details.)&lt;/p&gt;  &lt;p&gt;There are several options here:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Another proposer is currently trying to reserve that event id.&lt;/li&gt;    &lt;li&gt;Another proposer successfully claimed this event id.&lt;/li&gt;    &lt;li&gt;Another proposer tried to claim this id and then crashed midway.&lt;/li&gt;    &lt;li&gt;Etc… :-)&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;What Paxos ensures is that in the end, even in the presence of network and machine failures, the proposer will be able to write the &lt;em&gt;SendCheck { To= “IRS”, Amount = 50 } &lt;/em&gt;to an event id in such a way that no other machine will see another value in that location and that eventually all machines in the cluster will see that value in that location.&lt;/p&gt;  &lt;p&gt;It is important to understand that even with Paxos, the following timelines are possible:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_18.png"&gt;&lt;img style="border-bottom: 0px; border-left: 0px; display: inline; border-top: 0px; border-right: 0px" title="image" border="0" alt="image" src="http://ayende.com/Blog/images/ayende_com/Blog/WindowsLiveWriter/Paxosenlightment_CCBF/image_thumb_8.png" width="645" height="441" /&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;That is, due to some error, we may not be fully up to date on some machines (as seen on machine #2), or we may have missing events (as seen on machine #3).&lt;/p&gt;  &lt;p&gt;What Paxos guarantees is that there will never be a scenario in which we have missing events and are not explicitly aware of that. 
At that point, we can defer cashing the check until we know what event we are missing, explicitly decide to ignore the discontinuity in the timeline, or do something else that fits the business needs.&lt;/p&gt;  &lt;p&gt;In order to understand how this all works, I had to write my own implementation of Paxos in C#. There doesn’t seem to be anything like that available to the public, and I (at least) find the code I wrote much easier to understand than libpaxos’ C or Erlang implementations. You can find the implementation here: &lt;a href="http://github.com/ayende/Paxos.Demo"&gt;http://github.com/ayende/Paxos.Demo&lt;/a&gt;&lt;/p&gt;</description><link>http://ayende.com/blog/4496/paxos-enlightment?key=702e05e1-50af-4220-8531-219899cd98e5</link><guid>http://ayende.com/blog/4496/paxos-enlightment?key=702e05e1-50af-4220-8531-219899cd98e5</guid><pubDate>Wed, 12 May 2010 09:00:00 GMT</pubDate></item></channel></rss>