Machine bias in profiler-based optimizations
Take a look at the following profiler results. Both come from pretty much the same codebase, with no major changes in between. However, they were run on different machines, and they point to very different performance optimization paths.
Main machine:
Laptop:
As you can see, when running the code on my main machine, the most expensive thing is actually writing to disk (WriteToJournal). On my laptop, however, writing to disk is very cheap. The likely reason is, again, that my laptop is faking the writes to disk in favor of buffering them, even though we explicitly tell it not to. That means it behaves as if we had already optimized writing to disk. My main machine behaves more like a server, so we see a lot more I/O cost, which pretty much masks any other costs that we might have.
Using the profiler to guide how you optimize the system would lead you down very different paths depending on which of those machines you ran it on. With Voron, we test in a variety of scenarios to make sure that we aren’t optimizing for one particular machine and hurting another.
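A rough way to check whether a given machine is buffering writes it was asked to flush is to time a series of small writes, each followed by a flush, and compare the rate against what the hardware could plausibly deliver. The sketch below is not Voron code: it uses Python's os.fsync rather than the unbuffered, write-through flags discussed here, the function names are made up for illustration, and the threshold of twice the rotation rate is an arbitrary assumption. The heuristic only makes sense for spinning disks (a 7200 RPM disk rotates 120 times per second); an SSD can legitimately exceed it.

```python
import os
import tempfile
import time


def synced_writes_per_second(directory, count=200, size=4096):
    """Time `count` small writes, each followed by fsync, and return the rate."""
    payload = b"\0" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush this file to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    # Guard against a zero reading from a very coarse timer.
    return count / max(elapsed, 1e-9)


def looks_buffered(rate, rpm=7200):
    """Heuristic for rotational disks only: flag rates far above the rotation rate."""
    rotations_per_second = rpm / 60
    # The factor of 2 is an arbitrary safety margin (an assumption, not a spec).
    return rate > rotations_per_second * 2


if __name__ == "__main__":
    rate = synced_writes_per_second(tempfile.gettempdir())
    print(f"{rate:.0f} synced writes/sec, looks buffered: {looks_buffered(rate)}")
```

Running this on both machines before profiling gives a quick hint about whether the I/O numbers in the profiler reflect the disk or a cache sitting in front of it.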
Comments
Couldn't quite follow how your laptop is "faking" it by buffering writes. Are you running different configurations? Is it right to assume the server is probably talking to a NAS?
"Even though we explicitly tell it not to"...
I understand from your posts that you are using o-direct. Using direct I/O (unbuffered I/O) does not explicitly tell the disk not to cache. It tells the file system not to cache. I have mentioned this before, but you have to turn off caching in Windows. That little check box does not just control Windows caching; if enabled, it also turns on caching on your controller. You can see this easily by grabbing a USB stick with Linux on it and using hdparm. You can leave it enabled if you use disk-specific tools to ensure that the controller caching is disabled.
Further, it gets more complicated than this. Many commodity-grade disks will still cache (and reorder writes) even if you explicitly turn off their caching! They just ignore the fact that you told them. Benchmarking disk I/O is hard :)
Vadi, I am not sure how to characterize it better. It is drastically faster than it should be, even when we tell it not to buffer.
Greg, Yes, I _know_. I have done that, and I still see perf that is way above what the disk is supposed to be able to give me. That is why I said that I believe it is faking it.
Greg, This is also why we make certain assumptions about our disks, and document them. We can't work around what they will do if they explicitly violate their own docs.
The real question you need to ask is whether it has a super cap on it. If it does, what it's doing is totally valid!
Greg, super cap?
He means super capacitors which support write coalescing
http://thessdguy.com/how-controllers-maximize-ssd-life-external-data-buffering/#more-628