﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2021 (c) 2026</copyright><ttl>60</ttl><item><title>Greg young  commented on Get thou out of my head, damn idea</title><description>+1 Clemens</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment26</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment26</guid><pubDate>Wed, 10 Oct 2012 06:20:00 GMT</pubDate></item><item><title>Ayende Rahien commented on Get thou out of my head, damn idea</title><description>Kelly,
Let us say that you want to read an existing event stream.
It has 10,000,000 events in it. And it takes an hour to iterate over it.
During that hour, you have what is effectively snapshot isolation over this.
You can only see it as it was when you started the read.

This is because of the way the system works. You get the latest file offset from the in-memory data structure.
Then you start moving backward. Because the file is immutable and you are always moving backward, there is no chance of you seeing other data.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment25</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment25</guid><pubDate>Tue, 09 Oct 2012 17:12:04 GMT</pubDate></item><item><title>Ayende Rahien commented on Get thou out of my head, damn idea</title><description>Kelly,
Isolation here works by having the reads always work on top of the idToPos dictionary, which contains the file positions.
Only after we flush to disk will we update that dictionary.
Therefor, reads see the "old" state, and only after the disk flush "tx commit", is that state visible.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment24</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment24</guid><pubDate>Tue, 09 Oct 2012 17:09:54 GMT</pubDate></item><item><title>Kelly Sommers commented on Get thou out of my head, damn idea</title><description>If isolation guarantees were provided, if I were to start a read operation where I want to iterate over the stream of 100 million events to make some sort of calculation that may take an hour there would be no possible way for me to retrieve events committed via other transactions during the same time period. Isolation by a cursor position isn't good enough either because you could guess what the next sequence is and get an event that didn't exist at the time the reads began.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment23</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment23</guid><pubDate>Tue, 09 Oct 2012 16:10:51 GMT</pubDate></item><item><title>Clemens Vasters commented on Get thou out of my head, damn idea</title><description>I'll observe that the use of ACID here seems to be pure marketing. Since there is no batching, this is obviously about single records that are written into a fairly simple store. That's basic consistency. The common notion of transactions starts at two or more things that need to happen at the same time.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment22</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment22</guid><pubDate>Tue, 09 Oct 2012 15:51:01 GMT</pubDate></item><item><title>Kelly Sommers commented on Get thou out of my head, damn idea</title><description>Ayende,

That statement about isolation doesn't make sense to me. Perhaps you can elaborate? 

If isolation is guaranteed, how are you isolating reads from writes?
</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment21</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment21</guid><pubDate>Tue, 09 Oct 2012 15:22:55 GMT</pubDate></item><item><title>Greg Young commented on Get thou out of my head, damn idea</title><description>to be clear what you are describing here has no form of dependency between any two things being written or read. 

What I read has no dependency on what I write
When I write multiple things they can have no dependency on each other

Well yeah, you can make this very fast. It's just a log file. You can get it even faster very simply. Put it on 5 drives. No write or read from any drive can possibly have a dependency on a read/write from another drive.
</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment20</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment20</guid><pubDate>Tue, 09 Oct 2012 15:18:00 GMT</pubDate></item><item><title>Greg Young commented on Get thou out of my head, damn idea</title><description>Atomicity - We don't support the notion of a batch, so either we have successfully written an even to the disk, or we didn't. There isn't a middle ground.

Most of the time you have to support a batch. 

Consistency - We don't make the newly written event visible to the app code until we have saved that to disk, therefore, we are consistent.

This is an interesting version of consistency. There are basically no rules. Add a small rule. I can put an expected version (optimistic concurrency) for the stream when writing and things get quite a bit trickier especially with supporting batches</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment19</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment19</guid><pubDate>Tue, 09 Oct 2012 15:12:54 GMT</pubDate></item><item><title>Ayende Rahien commented on Get thou out of my head, damn idea</title><description>Kelly &amp; Greg,

Atomicity - We don't support the notion of a batch, so either we have successfully written an event to the disk, or we didn't. There isn't a middle ground.

Consistency - We don't make the newly written event visible to the app code until we have saved that to disk, therefore, we are consistent.

Isolated - Same as before, an event can't touch another event, or impact it in any way.

Durable - Data goes to disk, and you can verify when it is actually flushed there.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment18</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment18</guid><pubDate>Tue, 09 Oct 2012 14:46:01 GMT</pubDate></item><item><title>Kelly Sommers commented on Get thou out of my head, damn idea</title><description>Also from my understanding it doesn't have Atomicity from ACID either since a batch doesn't fail as a unit. As far as I can tell you can get partial writes.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment17</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment17</guid><pubDate>Tue, 09 Oct 2012 13:26:30 GMT</pubDate></item><item><title>Kelly Sommers commented on Get thou out of my head, damn idea</title><description>Greg,

+1 about consistency.

From my understanding, I didn't see any implementation of the isolation part of ACID either.

If I'm incorrect please point it out to me as I'd like to learn how it's implemented but as far as I could tell I couldn't see any.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment16</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment16</guid><pubDate>Tue, 09 Oct 2012 13:15:40 GMT</pubDate></item><item><title>Greg Young commented on Get thou out of my head, damn idea</title><description>To be clear on my consistency comment:

"The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof."

I guess if you have no rules at all to make consistent, then you are consistent, but most systems have some rules to actually make consistent :)

</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment15</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment15</guid><pubDate>Tue, 09 Oct 2012 08:43:41 GMT</pubDate></item><item><title>Greg Young commented on Get thou out of my head, damn idea</title><description>Ayende,

This is how most systems are built under the covers. Check out Cassandra, Bitcask, or even Esent, which you use in RavenDB. Bitcask is OSS, so you can see that it does essentially what the code you have written does (with more concerns, but the same general idea). The Event Store works this way as well and is OSS (BSD license); you could even use core bits in RavenDB.

"The entire thing work with ACID guarantees"

There is no C here. If by ACID you mean Durable, then yes, I agree durability has been reached. What do you intend to make C? For us, our indexes are C. As you know from Raven, indexes that are C can be expensive to get.

Even doing something as simple as keeping a current version number of, say, the document you are writing to, in order to provide basic optimistic concurrency, is actually reasonably difficult and slow to provide on top once we start talking about not being able to fit all your keys in memory. Especially since, in order to maintain your consistency, it would either have to run on the same thread as your write or you would end up with lots of very intricate locking code.

</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment14</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment14</guid><pubDate>Tue, 09 Oct 2012 08:26:11 GMT</pubDate></item><item><title>nick commented on Get thou out of my head, damn idea</title><description>Still disagree that weekend hacks ALWAYS outperform established software. I don't buy into context-insensitive absolutes. Lazy thinking.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment13</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment13</guid><pubDate>Tue, 09 Oct 2012 05:55:35 GMT</pubDate></item><item><title>Ayende Rahien commented on Get thou out of my head, damn idea</title><description>Nick,
No, that is actually pretty good.
It is _easy_ to get good perf when you don't have to consider all of the other stuff that you want done.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment12</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment12</guid><pubDate>Mon, 08 Oct 2012 22:12:06 GMT</pubDate></item><item><title>Ayende Rahien commented on Get thou out of my head, damn idea</title><description>Roy,
The entire thing work with ACID guarantees. And the latency is max 200 ms under very heavy load with read latency that is _very_ low.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment11</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment11</guid><pubDate>Mon, 08 Oct 2012 22:11:14 GMT</pubDate></item><item><title>Nick commented on Get thou out of my head, damn idea</title><description>@Roy, Ayende blog about those cases and more in a few posts last week about how certain features in RavenDb were implemented.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment10</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment10</guid><pubDate>Mon, 08 Oct 2012 18:12:44 GMT</pubDate></item><item><title>Nick commented on Get thou out of my head, damn idea</title><description>"That's why weekend hacks always out perform well established software"

Might want to rephrase that just a touch.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment9</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment9</guid><pubDate>Mon, 08 Oct 2012 18:10:07 GMT</pubDate></item><item><title>Kelly Sommers commented on Get thou out of my head, damn idea</title><description>I agree with Roy. People make careers out of writing production quality transactional storage engines. The benchmark numbers aren't interesting to me on a storage backend hacked in a weekend.

The compaction strategy and failure handling concern me. I don't think it would do well in a busy system over time.

All of these intricate details affect benchmarks quite a bit. It's easy to get favourable numbers by taking liberties with things that production-grade software requires but a benchmark can ignore.

That's why weekend hacks always out perform well established software.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment8</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment8</guid><pubDate>Mon, 08 Oct 2012 16:32:34 GMT</pubDate></item><item><title>Roy Jacobs commented on Get thou out of my head, damn idea</title><description>Not to be snarky, but 'just getting throughput' is the easy part, usually. Especially with the excellent language support you get in .NET. Getting everything to work with low latencies, transactionally and without corrupting when the power goes out is decidedly non-trivial. Or is this just an exercise in what the upper bounds of the performance can be?</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment7</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment7</guid><pubDate>Mon, 08 Oct 2012 13:45:18 GMT</pubDate></item><item><title>Rasmus Schultz commented on Get thou out of my head, damn idea</title><description>Now you're thinking like me, which is dangerous - you will eventually end up writing your storage engine for RavenDB or something, heh ;-)</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment6</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment6</guid><pubDate>Mon, 08 Oct 2012 13:25:12 GMT</pubDate></item><item><title>Frisian commented on Get thou out of my head, damn idea</title><description>The title should read "Get thee..." or "Get thyself...". Just sayin'</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment5</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment5</guid><pubDate>Mon, 08 Oct 2012 12:33:10 GMT</pubDate></item><item><title>Frank Quednau commented on Get thou out of my head, damn idea</title><description>The write model should also work rather nicely with old magnetic-tape storage. 
If we use the numbers provided by Wikipedia on the UNISERVO (a density of 128 chars per inch), then in order to sustain the write speed you observed, based on a message size of 50 chars, you would need 456.4 metres of tape per second. Hence the tape would have to move at a speed of roughly Mach 1.34.

Assuming those tapes were available in lengths of 730 metres, you would have to change the tape every 1.6 seconds.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment4</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment4</guid><pubDate>Mon, 08 Oct 2012 12:26:05 GMT</pubDate></item><item><title>cocowalla commented on Get thou out of my head, damn idea</title><description>Is the code in question publicly available? Would be interesting to see it in context.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment3</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment3</guid><pubDate>Mon, 08 Oct 2012 12:18:26 GMT</pubDate></item><item><title>Greg Young commented on Get thou out of my head, damn idea</title><description>I should add that another beauty of the model shows if you DON'T want to flush: you can just allow read-aheads.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment2</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment2</guid><pubDate>Wed, 03 Oct 2012 05:44:51 GMT</pubDate></item><item><title>Greg Young commented on Get thou out of my head, damn idea</title><description>Ayende,

Yes, the mechanism is age-old and works very well (lots of systems use similar mechanisms: SQL Server, Bitcask in Riak, Cassandra). A few comments on your implementation. In the code I have, you are flushing once per second or when the queue is empty. This second one is a great optimization for lowering request latency when you don't have tons of requests.

if (hadWrites)
{
    if ((DateTime.UtcNow - lastWrite).TotalSeconds &gt; 1)
    {


While this will give you very high throughput, it also introduces a massive amount of latency under the load you gave it (assuming a dual append commit model, you are talking 1-2 seconds of latency per transaction, assuming durability). If you look in the Event Store code, there is actually a heuristic for working around this by looking at the queue and how long your disk takes to fsync.

Another comment worth mentioning is that you use

streamSource.Flush(file);

From looking at the code you sent me, Flush is implemented as:

		public void Flush(Stream stream)
		{
			//((FileStream) stream).Flush(true);
		}

This means that your test is not durable and is likely running solely in memory (kernel-level OS memory, but memory nonetheless). Turning the flush on will knock down your performance quite a bit. The other thing you have to be very careful about here is that consumer hardware will very often deliberately lie when you tell it to flush and will say "sure, I did it" even though it didn't. :)

In general though yes reading and writing sequential streams is very very fast.</description><link>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment1</link><guid>http://ayende.com/159009/get-thou-out-of-my-head-damn-idea#comment1</guid><pubDate>Wed, 03 Oct 2012 05:42:05 GMT</pubDate></item></channel></rss>