Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,640
|
Comments: 51,260
Privacy Policy · Terms
filter by tags archive
time to read 2 min | 314 words

For a few days, some (~4) of SvnBridge integration tests would fail. Not always the same ones (but usually the same group), and only if I run them all as a group, never if I run each test individually, or if I run the entire test class (which rules out most of the test dependencies that causes this). This was incredibly annoying, but several attempts to track down this issue has been less than successful.

Today I got annoyed enough to say that I am not leaving until I solve this. Considering that a full test run of all SvnBridge's tests is... lengthy, that took a while, but I finally tracked down what was going on. The fault was with this method:

protected string Svn(string command)
{
	StringBuilder output = new StringBuilder();
	string err = null;
	ExecuteInternal(command, delegate(Process svn)
	{
		ThreadPool.QueueUserWorkItem(delegate

		{
			err = svn.StandardError.ReadToEnd();
		});
		ThreadPool.QueueUserWorkItem(delegate
		{
			string line;
			while ((line = svn.StandardOutput.ReadLine()) != null)
			{
				Console.WriteLine(line);
				output.AppendLine(line);
			}
		});
	});
	if (string.IsNullOrEmpty(err) == false)
	{
		throw new InvalidOperationException("Failed to execute command: " + err);
	}
	return output.ToString();
}

This will execute svn.exe and gather its input. Only sometimes it would not do so.

I fixed it by changing the implementation to:

protected static string Svn(string command)
{
	var output = new StringBuilder();
	var err = new StringBuilder();
	var readFromStdError = new Thread(prc =>
	{
		string line;
		while ((line = ((Process)prc).StandardError.ReadLine()) != null)
		{
			Console.WriteLine(line);
			err.AppendLine(line);
		}
	});
	var readFromStdOut = new Thread(prc =>
	{
		string line;
		while ((line = ((Process) prc).StandardOutput.ReadLine()) != null)
		{
			Console.WriteLine(line);
			output.AppendLine(line);
		}
	});
	ExecuteInternal(command, svn =>
	{
		readFromStdError.Start(svn);
		readFromStdOut.Start(svn);
	});

	readFromStdError.Join();
	readFromStdOut.Join();

	if (err.Length!=0)
	{
		throw new InvalidOperationException("Failed to execute command: " + err);
	}
	return output.ToString();
}

And that fixed the problem. What was the problem?

time to read 2 min | 306 words

There was a known, but irreproducible, imagein SvnBridge for a long while now.

The essence of the bug is that under some circumstances, updating from SvnBridge would fail. The reason for failure was an assertion failure. Somehow, someone made a change to a file that doesn't exists.

Did you note the some circumstances part? That is the critical issue there. It is the circumstances that I wasn't able to reproduce. As a matter of fact, I wasn't able to narrow what happened. I sometimes got it, and when I had a repro, I could fix it. But all too often, trying to reproduce the same steps would not cause the bug, but work as expected.

Annoying as hell, as you might imagine.

Today I finally managed to figure out what was going on. You can see the repro on the right. This is a timeline, where each user is colored differently.

As you can see, it require a failure unusual set of circumstances, and the way the SVN protocol work make it harder to figure out. (When you ask for changes on an item, you are actually requesting all changes on its parent, which is a critical issue here).

It is also important which side of the rename you are asking about, and... but now I am probably boring you with technical details that are not interesting even to geeks.

I have been in weird bug fixing mode for SvnBridge for a while now, and it is... interesting to see what crops up. Right now most bugs takes quite a long time to track and fix, unfortunately.

Oh, and as a parting shot, take a look here at how I tracked that one:

image

time to read 3 min | 499 words

The bug: SvnBridge will not accept a change to a filename using multi byte format.

Let us start figuring out what is going on, shall we?

Hm... looks like when TortoiseSVN is PUT-ing the file, it uses the directory name instead of the actual file name. Obviously, this is a client bug, and I can close this bug and move on to doing other things. Except that I am pretty sure that SVN can handle non ASCII characters...

Let us compare the network trace from SVN and SvnBridge and see what is going on...

Oh, I got the problem. TortoiseSVN is requesting a CHECKOUT, and we return the wrong URL. Obviously I did something astoundingly stupid there when I processed that request. Indeed, taking a look at what is going on there is... interesting.

Finally I realize that the problem is that while TortoiseSVN sends the URL in seemingly reasonable format, it is obviously being read wrongly by SvnBridge. It is using ASCII encoding to read this, while it probably should use UTF8. Because of that, the URL looks something like /tfsrtm08/test/????.txt, and ? is the query string terminator, no wonder I have issues.

But, how does it work for standard SVN servers. Hm, looks like when TortoiseSVN uses Url Encoding for the URL when taking to a standard SVN server. Why is it doing this?

Hm... looks like the URL that TortoiseSVN uses when it is request a CHECKOUT is the one that is returned in the response to a PROPFIND request.

Sure, that is easy to fix, we will just Url Encode the response from the PROPFIND handler.

Except that it doesn't work...

Oh, we also handle some of that in the file node class, so we will fix it there.

Yeah!

Except that it still doesn't work!

Grr! Looks like the way we handle Url Encoding is too invasive, we need to only UrlEncode non ASCII characters, this means that we should not encode characters such as '/', for example.

Let us try this again, shall we?

Hm... it still doesn't work. What is going on.

Excuse me while I am having a nervous breakdown, please.image

Oh! When we send the response to the CHECKOUT request, we send the URL for the PUT request in the location header, and we haven't Url Encoded that yet.

And now it works... :-D

Let us run the test and commit...

We have a broken test.

*Sob*

Well, okay, that is actually a test that is supposed to be broken (it is testing the way we Url Encode urls, after all).

Number of code lines changed, less than 20.

Number of hours spent of this bug: over eight hours.

Conclusion, I am not really being productive today. I am off to visit the code generation wizard...

time to read 1 min | 73 words

My personal success criteria for SvnBridge is when I no longer care what I am working against. And I think we got to that point. There are still issues, to be fair, but they won't hit you anywhere near mainline development.

I am listening to Rob Conery MVC Storefront 9 in which he makes the mistake of referring to CodePlex as Subversion. With that, I think that I can say, we are there.

time to read 2 min | 371 words

I was very impressed when I saw how Subversion handles the complexity of having data of hierarchical nature that needs to be serialized to XML. Check this out.

<?xml version="1.0" encoding="utf-8"?>
<S:editor-report xmlns:S="svn:">
  <S:target-revision rev="11"/>
  <S:open-root rev="-1"/>
  <S:open-directory name="tags" rev="-1"/>
  <S:add-directory name="tags/asd"/>
  <S:close-directory/>
  <S:close-directory/>
  <S:close-directory/>
</S:editor-report>

And here is another one:

<?xml version="1.0" encoding="utf-8"?>
<S:editor-report xmlns:S="svn:">
  <S:target-revision rev="15"/>
  <S:open-root rev="-1"/>
  <S:open-directory name="trunk" rev="-1"/>
  <S:open-file name="trunk/a.txt" rev="-1"/>
  <S:apply-textdelta checksum="eabc96676e7defda414a1eed33bdfb09">
    U1ZOAAAQEwETk2FzZDENCjINCjMNCjQNCjUNCjY=
  </S:apply-textdelta>
  <S:close-file checksum="c6301e5dad1330a7b9bd5491702c801b"/>
  <S:close-directory/>
  <S:close-directory/>
</S:editor-report>

I was, as they say incredibly happy with this Work Time Fun.

time to read 1 min | 99 words

It is amazing how much time you can hunt for the exact cause of a bug. In this case, it took me almost two days (intersperse with other work, however) to track down and find the issue.

Remember, there is no reason to use ASCII, ever. I actually run a blame on the code (a new feature for SvnBridge!) to find out who wrote it. And then I sent a nasty email about it to /dev/null, just to clear my mind.

image

time to read 2 min | 228 words

Let us see how much of a furor that will cause :-)

The reasons that I don't like distributed source control systems have nothing to do with technical reasons, and everything to do with how they are used and promoted.

The main reason that I don't like them is that they encourage isolated work. I don't want anyone in my project to just go off and work on their own for a few weeks, then come back with a huge changeset that now needs to be merged. The way DVCS are promoted, this is a core scenarios, "no one have to see your mistakes".

I completely disagree. I want to see everyone's mistake, and I want them to see mine. There is little to learn from successes, and much to learn from failures. Far worse, I want to be able to peak into other people work, so I can give my input on it, even if it is just "wow, nice", or "yuck, sucks!"

The main advantage of DVCS is speed, trying to browse the repository when you have the entire thing on your HD is lightning fast, compared to doing the same with a remote repository. I tend to use SVK now, but I use it strictly as a local cache, and nothing more.

And that is why I don't like DVCS, I don't want people to work in isolation.

time to read 3 min | 574 words

Let me start by stating that you really don't want to do that. This is not something that you want to do, period.

Now that we are over that, I had the chance lately to go fairly deeply into SCM and how they are implemented from two fairly different perspectives. This is a randomly collected set of observations about SCM systems. As usual, the order is arbitrary, and no attempt was made to make any coherent idea out of this.

  • It is all about the client. The client in an SCM system has significant responsibilities. It is in charge of reporting the client state, managing all the errors hat the user can cause, and shoulder a lot of the burden.
  • It is all about the protocol. Anyone who designs a SCM system should be given a lousy DSL line with disconnects every 15 minutes. Oh, and they should also have to work on a plane a lot.
  • On the wire, it is all so simple. It is really surprising to see how the SCM complexity is really just a lot of tiny, easy to handle, details.
  • The devil is in the details, though.
  • Complexities on the server side:
    • Space management - do you save the diff or the whole file?
    • If just the diff, how do you construct an arbitrary version
    • Keeping history around for branches and copies
    • Cheap copies

  • Complexities on the client side:
    • Do you have one version on the client, per the working copy?
    • Do you have multiple versions, one per each file?
    • Handling inconsistencies between server version, working copy version and original version.

  • What do you optimize for? Bandwidth? Roundtrips?
    • I know of one SCM product who is lousy optimizer for both

  • Distributed SCM can be handled on top of centralized SCM.
  • It is not hard at all, except for all the details.
  • Don't write your own SCM.
  • Trust matters, and you really don't want to be in the situation where you don't trust your SCM.
  • Remember that SCM is temporal, you can go backward in time, and even sideway, to a branch.
  • There are only three types of operations in SCM:
    • Generate a change set between two paths at two versions
    • Apply a diff to a path, generating a new version
    • Reporting (logs, mostly, and outputting various formats of a changeset)
Overall, it is very simple endeavor. It gets complex when you start talking beyond the wire protocol. As a simple example, how much does it cost you to branch? How much does it cost you to find out if there has been any changes to the working copy?
The other major issue is: How do you ensure that it is reliable?
Now, let me repeat myself, do not write your own source control!
time to read 1 min | 95 words

I have just had to make a modification to SvnBridge because someone decided to checkin a change set that included 1,695 changes..

Even ignoring the issue of "your checkin broke my source control" mentality that I have now, this is bad from other perspective.

How do you review such a change? How do you even know what is going on there? In changes of this size, you aren't making a single change, you are making a lot of changes. Now you can't mix & match them either.

Prefer small checkins, they are easier to deal with.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. API Design (10):
    29 Jan 2026 - Don't try to guess
  2. Recording (20):
    05 Dec 2025 - Build AI that understands your business
  3. Webinar (8):
    16 Sep 2025 - Building AI Agents in RavenDB
  4. RavenDB 7.1 (7):
    11 Jul 2025 - The Gen AI release
  5. Production postmorterm (2):
    11 Jun 2025 - The rookie server's untimely promotion
View all series

Syndication

Main feed ... ...
Comments feed   ... ...