Ayende @ Rahien

Refunds available at head office

Esent, identity and the case of the duplicate key

Following up on a bug report that I got from a user of Rhino Queues, I figured out something very annoying about the way Esent handles auto increment columns.

Let us take the following bit of code:

using (var instance = new Instance("test.esent"))
{
	instance.Init();

	using (var session = new Session(instance))
	{
		JET_DBID dbid;
		Api.JetCreateDatabase(session, "test.esent", "", out dbid, CreateDatabaseGrbit.OverwriteExisting);

		JET_TABLEID tableid;
		Api.JetCreateTable(session, dbid, "outgoing", 16, 100, out tableid);
		JET_COLUMNID columnid;

		Api.JetAddColumn(session, tableid, "msg_id", new JET_COLUMNDEF
		{
			coltyp = JET_coltyp.Long,
			grbit = ColumndefGrbit.ColumnNotNULL |
					ColumndefGrbit.ColumnAutoincrement |
					ColumndefGrbit.ColumnFixed
		}, null, 0, out columnid);

		Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
	}
}

for (int i = 0; i < 3; i++)
{
	using (var instance = new Instance("test.esent"))
	{
		instance.Init();

		using (var session = new Session(instance))
		{
			JET_DBID dbid;
			Api.JetAttachDatabase(session, "test.esent", AttachDatabaseGrbit.None);
			Api.JetOpenDatabase(session, "test.esent", "", out dbid, OpenDatabaseGrbit.None);

			using (var table = new Table(session, dbid, "outgoing", OpenTableGrbit.None))
			{
				var cols = Api.GetColumnDictionary(session, table);
				var bytes = new byte[Api.BookmarkMost];
				int size;
				using (var update = new Update(session, table, JET_prep.Insert))
				{
					update.Save(bytes, bytes.Length, out size);
				}
				Api.JetGotoBookmark(session, table, bytes, size);
				var i = Api.RetrieveColumnAsInt32(session, table, cols["msg_id"]);
				Console.WriteLine(i);

				Api.JetDelete(session, table);
			}

			Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
		}
	}
}

What do you think is going to be the output of this code?

If you guessed:

1
1
1

I have a cookie for you.

One of the problems of working with low level libraries is that they are… well, low level. As such, they don’t provide all the features that you think they would. Most databases keep track of the auto incrementing columns outside of the actual table. But Esent keep it in memory, and read max(id) from the table on init.

It is an… interesting bug* to track down, I have to say.

* Bug in my code, no in Esent, just to be clear.

Comments

Thomas Eyde
04/24/2009 12:35 AM by
Thomas Eyde

Aren't you recreating the table in the loop? Then it's not so strange anymore. I would expect a new table to start counting from scratch.

Ayende Rahien
04/24/2009 01:40 AM by
Ayende Rahien

Thomas,

Look again, the creation of the table is outside the loop

Laurion Burchall
04/24/2009 01:43 AM by
Laurion Burchall

In the code above the table is being opened each time and Esent behaves just as described.

This is done for performance reasons. Tracking the max autoinc value in the catalog would require updating a second database page on every update, generating more log and database I/O. If the table has a lot of inserts then the autoinc value would become 'hot', meaning that a lot of different threads would want to update this page. This makes checkpoint advancement harder because the page can only be written to disk when it isn't latched. That problem can be solved by our page versioning but creating versions uses up more buffer manager space.

One note about this -- when Esent looks for max(id) from the table it will look for an index where the first index segment is on id. If no such index exists then a full table scan has to be performed. No-one has made that mistake yet, but it is something to consider.

Fabio Maulo
04/24/2009 04:55 AM by
Fabio Maulo

NH has the same behavior for <increment strategy.

Benny Thomas
04/24/2009 06:42 AM by
Benny Thomas

It looks like your bookmark is the same and you will hit the first column each loop. Giving the same result?

Ayende Rahien
04/24/2009 07:08 AM by
Ayende Rahien

Laurion,

I actually assumed that the reason for that was to keep things simple, but that makes sense as well.

I don't mind, but I had some code that relied on being able to copy identity value to another table, without taking into account that it can be reused.

Just something to be aware of

Comments have been closed on this topic.