Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Esent, identity and the case of the duplicate key


Following up on a bug report that I got from a user of Rhino Queues, I figured out something very annoying about the way Esent handles auto increment columns.

Let us take the following bit of code:

using (var instance = new Instance("test.esent"))
{
	instance.Init();

	using (var session = new Session(instance))
	{
		JET_DBID dbid;
		Api.JetCreateDatabase(session, "test.esent", "", out dbid, CreateDatabaseGrbit.OverwriteExisting);

		JET_TABLEID tableid;
		Api.JetCreateTable(session, dbid, "outgoing", 16, 100, out tableid);
		JET_COLUMNID columnid;

		Api.JetAddColumn(session, tableid, "msg_id", new JET_COLUMNDEF
		{
			coltyp = JET_coltyp.Long,
			grbit = ColumndefGrbit.ColumnNotNULL |
					ColumndefGrbit.ColumnAutoincrement |
					ColumndefGrbit.ColumnFixed
		}, null, 0, out columnid);

		Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
	}
}

for (int i = 0; i < 3; i++)
{
	using (var instance = new Instance("test.esent"))
	{
		instance.Init();

		using (var session = new Session(instance))
		{
			JET_DBID dbid;
			Api.JetAttachDatabase(session, "test.esent", AttachDatabaseGrbit.None);
			Api.JetOpenDatabase(session, "test.esent", "", out dbid, OpenDatabaseGrbit.None);

			using (var table = new Table(session, dbid, "outgoing", OpenTableGrbit.None))
			{
				var cols = Api.GetColumnDictionary(session, table);
				var bytes = new byte[Api.BookmarkMost];
				int size;
				using (var update = new Update(session, table, JET_prep.Insert))
				{
					update.Save(bytes, bytes.Length, out size);
				}
				Api.JetGotoBookmark(session, table, bytes, size);
				// Named msgId because `i` is already taken by the enclosing for loop.
				var msgId = Api.RetrieveColumnAsInt32(session, table, cols["msg_id"]);
				Console.WriteLine(msgId);

				Api.JetDelete(session, table);
			}

			Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
		}
	}
}

What do you think is going to be the output of this code?

If you guessed:

1
1
1

I have a cookie for you.

One of the problems of working with low level libraries is that they are… well, low level. As such, they don’t provide all the features that you think they would. Most databases keep track of auto incrementing columns outside of the actual table. Esent keeps the counter in memory and reads max(id) from the table on init. Since the loop deletes the row it just inserted, the table is empty on every restart, and the counter starts from 1 again.
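To make the mechanism concrete, here is a toy model of that behavior. It is a sketch only and does not use Esent at all: `ToyTable`, `Open`, `Insert` and `Delete` are hypothetical names made up for illustration, and the only assumption carried over is that the counter lives in memory and is seeded from max(id) when the table is opened.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: this is NOT Esent's implementation or API.
// It mimics one behavior: the autoincrement counter lives in memory
// and is re-seeded from max(id) whenever the table is opened.
class ToyTable
{
    // Stands in for the rows persisted on disk; survives "restarts".
    private readonly SortedSet<int> persistedIds = new SortedSet<int>();

    // In-memory counter; its value is recomputed on every Open().
    private int nextId;

    // Simulates init: seed the counter from max(id), or start at 1 when empty.
    public void Open()
    {
        nextId = persistedIds.Count == 0 ? 1 : persistedIds.Max + 1;
    }

    // Hands out the next id and "persists" the row.
    public int Insert()
    {
        int id = nextId++;
        persistedIds.Add(id);
        return id;
    }

    // Removes the row; the counter itself is not touched.
    public void Delete(int id)
    {
        persistedIds.Remove(id);
    }
}

static class Program
{
    static void Main()
    {
        var table = new ToyTable();
        for (int i = 0; i < 3; i++)
        {
            table.Open();              // fresh instance: counter re-seeded from max(id)
            int id = table.Insert();
            Console.WriteLine(id);     // prints 1 on every iteration
            table.Delete(id);          // table is empty again before the next "restart"
        }
    }
}
```

If the `Delete` call is dropped, the same loop hands out 1, 2, 3, because max(id) then grows between restarts. The reuse only shows up when the highest row is gone by the time the table is opened again.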

It is an… interesting bug* to track down, I have to say.

* Bug in my code, not in Esent, just to be clear.


Comments

Thomas Eyde

Aren't you recreating the table in the loop? Then it's not so strange anymore. I would expect a new table to start counting from scratch.

Ayende Rahien

Thomas,

Look again, the creation of the table is outside the loop

Laurion Burchall

In the code above the table is being opened each time and Esent behaves just as described.

This is done for performance reasons. Tracking the max autoinc value in the catalog would require updating a second database page on every update, generating more log and database I/O. If the table has a lot of inserts then the autoinc value would become 'hot', meaning that a lot of different threads would want to update this page. This makes checkpoint advancement harder because the page can only be written to disk when it isn't latched. That problem can be solved by our page versioning but creating versions uses up more buffer manager space.

One note about this -- when Esent looks for max(id) from the table it will look for an index where the first index segment is on id. If no such index exists then a full table scan has to be performed. No-one has made that mistake yet, but it is something to consider.

Fabio Maulo

NH has the same behavior for the increment strategy.

Benny Thomas

It looks like your bookmark is the same and you will hit the first column each loop. Giving the same result?

Ayende Rahien

Laurion,

I actually assumed that the reason for that was to keep things simple, but that makes sense as well.

I don't mind, but I had some code that relied on being able to copy identity value to another table, without taking into account that it can be reused.

Just something to be aware of
