Some notes about fsync

On my laptop, fsync has effectively no cost. That is probably because of some configuration setting (it is a battery-backed system, so there is no need to pay for the fsync call). On my desktop machine (significantly more powerful than my laptop), I see fsync times that are an order of magnitude or more higher.

In practice, if you work with fsync, you can expect to get a maximum of about 200 – 300 fsync calls per second on SSD, and significantly fewer on HDD. If you are seeing higher numbers than that, you are probably not really doing fsync, and you are exposed to data integrity issues if you have a hard crash.
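If you want to check where your own machine falls relative to those numbers, a rough measurement is easy to put together. The following is a minimal Python sketch, not the benchmark I used for the numbers above. The function name `fsync_rate`, the 4 KB payload and the number of calls are all assumptions made for illustration. Python's `os.fsync` goes through the C runtime on Windows, which should end up in FlushFileBuffers, but treat the result as an estimate only.

```python
import os
import tempfile
import time


def fsync_rate(n_calls=50, payload=b"x" * 4096, directory=None):
    """Return the approximate number of write+fsync calls per second.

    Hypothetical helper: appends `payload` to a temp file and calls
    fsync after every write, the way a naive transaction log would.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(n_calls):
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to push the data to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.remove(path)
    # Guard against a zero elapsed time on very coarse clocks.
    return n_calls / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    print(f"{fsync_rate():.0f} fsync calls/sec")
```

Pass `directory` to point the temp file at the drive you actually care about. If this reports thousands of calls per second, that is the hint that something between you and the disk is not really flushing.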

In particular, it appears that for high-performance code, you really want to stop relying on fsync to meet your transactional needs.

And that is before we start talking about the cost of fsync (FlushFileBuffers, to be more accurate) as the file size grows. It appears that there is at least some correlation between the size of the file and the cost of calling fsync/FlushFileBuffers on it.
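One way to probe that correlation is to time the same small write followed by a flush against files of different sizes. This is only a sketch under assumptions: `fsync_cost_by_size` is a made-up name, the sizes and sample count are arbitrary, and a handful of samples on one machine proves nothing about the general case.

```python
import os
import tempfile
import time


def fsync_cost_by_size(sizes_bytes, samples=5, directory=None):
    """Return {file size: average seconds per fsync} after a 4 KB write.

    Hypothetical helper: each file is filled with real data first, so
    the timed fsync only has to flush the small write that precedes it.
    """
    chunk = b"\0" * (1024 * 1024)
    results = {}
    for size in sizes_bytes:
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            # Fill the file to the requested size with actual writes.
            remaining = size
            while remaining > 0:
                remaining -= os.write(fd, chunk[:remaining])
            os.fsync(fd)  # flush the fill so it does not skew the timings

            timings = []
            for _ in range(samples):
                os.lseek(fd, 0, os.SEEK_SET)
                os.write(fd, b"x" * 4096)  # dirty a single small region
                start = time.perf_counter()
                os.fsync(fd)
                timings.append(time.perf_counter() - start)
            results[size] = sum(timings) / len(timings)
        finally:
            os.close(fd)
            os.remove(path)
    return results


if __name__ == "__main__":
    mb = 1024 * 1024
    for size, cost in fsync_cost_by_size([mb, 16 * mb, 64 * mb]).items():
        print(f"{size // mb:>4} MB: {cost * 1000:.2f} ms per fsync")
```

The sizes here are tiny compared to a real database file, so scale them up (and mind your free disk space) if you want to see whether the trend holds for the very large files we are worried about.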

Considering that we are talking about potentially very large files, we really want to be careful about it. All in all, I think that we need to say goodbye to relying on fsync for ensuring ACID.

But how can we ensure that we’ll be properly ACID? Well, the answer is in the previous post: look at how other people are doing it. I’ll expand on that in my next post.