Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.


Esent, identity and the case of the duplicate key


Following up on a bug report that I got from a user of Rhino Queues, I figured out something very annoying about the way Esent handles auto increment columns.

Let us take the following bit of code:

using (var instance = new Instance("test.esent"))
{
	instance.Init();

	using (var session = new Session(instance))
	{
		JET_DBID dbid;
		Api.JetCreateDatabase(session, "test.esent", "", out dbid, CreateDatabaseGrbit.OverwriteExisting);

		JET_TABLEID tableid;
		Api.JetCreateTable(session, dbid, "outgoing", 16, 100, out tableid);
		JET_COLUMNID columnid;

		Api.JetAddColumn(session, tableid, "msg_id", new JET_COLUMNDEF
		{
			coltyp = JET_coltyp.Long,
			grbit = ColumndefGrbit.ColumnNotNULL |
					ColumndefGrbit.ColumnAutoincrement |
					ColumndefGrbit.ColumnFixed
		}, null, 0, out columnid);

		Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
	}
}

for (int i = 0; i < 3; i++)
{
	using (var instance = new Instance("test.esent"))
	{
		instance.Init();

		using (var session = new Session(instance))
		{
			JET_DBID dbid;
			Api.JetAttachDatabase(session, "test.esent", AttachDatabaseGrbit.None);
			Api.JetOpenDatabase(session, "test.esent", "", out dbid, OpenDatabaseGrbit.None);

			using (var table = new Table(session, dbid, "outgoing", OpenTableGrbit.None))
			{
				var cols = Api.GetColumnDictionary(session, table);
				var bytes = new byte[Api.BookmarkMost];
				int size;
				using (var update = new Update(session, table, JET_prep.Insert))
				{
					update.Save(bytes, bytes.Length, out size);
				}
				Api.JetGotoBookmark(session, table, bytes, size);
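				// Named msgId rather than i, which would clash with the loop variable and fail to compile.
				var msgId = Api.RetrieveColumnAsInt32(session, table, cols["msg_id"]);
				Console.WriteLine(msgId);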

				Api.JetDelete(session, table);
			}

			Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
		}
	}
}

What do you think is going to be the output of this code?

If you guessed:

1
1
1

I have a cookie for you.

One of the problems of working with low level libraries is that they are… well, low level. As such, they don’t provide all the features that you think they would. Most databases keep track of auto incrementing columns outside of the actual table. Esent keeps the counter in memory and reads max(id) from the table on init. The loop above deletes the row it just inserted, so the table is empty on every init and the counter starts over at 1.
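To make that concrete, here is a minimal sketch that does not touch Esent at all. SimulatedTable and its members are hypothetical names made up for illustration, and the list standing in for the database file is an assumption of the sketch. The only thing it mimics is the rule described above: the counter lives in memory and is seeded from max(id) when the table is opened.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a table with an auto increment column.
// This is NOT Esent code; it only imitates "seed the counter from max(id) on open".
class SimulatedTable
{
    private readonly List<int> rows; // plays the role of the data on disk
    private int nextId;              // the counter, kept only in memory

    public SimulatedTable(List<int> persistedRows)
    {
        rows = persistedRows;
        // Seed from the largest id still present, or start at 1 for an empty table.
        nextId = rows.Count == 0 ? 1 : rows.Max() + 1;
    }

    public int Insert()
    {
        int id = nextId++;
        rows.Add(id);
        return id;
    }

    public void Delete(int id) => rows.Remove(id);
}

static class Program
{
    static void Main()
    {
        var disk = new List<int>(); // survives across "restarts"

        for (int run = 0; run < 3; run++)
        {
            // Creating a new SimulatedTable stands in for re-initializing the instance.
            var table = new SimulatedTable(disk);
            int id = table.Insert();
            Console.WriteLine(id);
            table.Delete(id); // the row is gone before the next "restart"
        }
    }
}
```

If the Delete call is removed, each run finds the previous row still in place and the sketch hands out 1, 2, 3 instead.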

It is an… interesting bug* to track down, I have to say.

* Bug in my code, not in Esent, just to be clear.


Comments

Thomas Eyde

Aren't you recreating the table in the loop? Then it's not so strange anymore. I would expect a new table to start counting from scratch.

Ayende Rahien

Thomas,

Look again, the creation of the table is outside the loop

Laurion Burchall

In the code above the table is being opened each time and Esent behaves just as described.

This is done for performance reasons. Tracking the max autoinc value in the catalog would require updating a second database page on every update, generating more log and database I/O. If the table has a lot of inserts then the autoinc value would become 'hot', meaning that a lot of different threads would want to update this page. This makes checkpoint advancement harder because the page can only be written to disk when it isn't latched. That problem can be solved by our page versioning but creating versions uses up more buffer manager space.

One note about this -- when Esent looks for max(id) from the table it will look for an index where the first index segment is on id. If no such index exists then a full table scan has to be performed. No-one has made that mistake yet, but it is something to consider.

Fabio Maulo

NH has the same behavior for the increment strategy.

Benny Thomas

It looks like your bookmark is the same and you will hit the first column each loop. Giving the same result?

Ayende Rahien

Laurion,

I actually assumed that the reason for that was to keep things simple, but that makes sense as well.

I don't mind, but I had some code that relied on being able to copy identity value to another table, without taking into account that it can be reused.

Just something to be aware of
