Ayende @ Rahien

Refunds available at head office

What happened to the blog?

You might have noticed that during the last day we had duplicated posts. When I first saw that, I was shocked. There was absolutely no reason for this to have happened, and it seemed that somehow the index or the database was corrupted.

We started investigating this, and we were able to reproduce this locally by using the database export. That was a great relief, because it meant that at least we could debug this. Once we were able to do that, we found out what the problem was. And basically, there was no problem. It was a configuration error (actually, two of them) that caused the problem.

We accidentally enabled versioning for the blog’s database. You can read more about RavenDB’s versioning bundle if you want, but basically, it keeps copies of modified documents so you have an audit trail. So far, so good, and no one noticed anything. But then we removed versioning from the blog’s database, since after all, we didn’t need it there in the first place.

The problem is that in the meantime, the versioning bundle created all of those additional documents. While it was in operation, it would hide those documents (since they were there for historical purposes only), but once we removed it, all those historical documents showed up. The reason for the duplicate posts was that we really did have duplicate posts; they were just historical copies of the same post.

Why did we have so many? Whenever you comment, we update the CommentCount field, so we had as many historical copies for most posts as we had comments.
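To make the mechanics concrete, here is a minimal in-memory sketch of the effect. It is not the RavenDB API: `ToyStore`, its `VersioningEnabled` switch and the `/revisions/` id scheme are all made up for illustration. It only shows how copies that are hidden while versioning is on become visible once it is switched off.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a document store; NOT the RavenDB API.
public class ToyStore
{
    private readonly Dictionary<string, string> docs = new Dictionary<string, string>();
    private int revisionCounter;

    // While true, every overwrite keeps the old copy and queries hide those copies.
    public bool VersioningEnabled { get; set; }

    public void Put(string id, string body)
    {
        if (VersioningEnabled && docs.ContainsKey(id))
        {
            // Keep the previous version around as a separate document (made-up id scheme).
            revisionCounter++;
            docs[id + "/revisions/" + revisionCounter] = docs[id];
        }
        docs[id] = body;
    }

    public List<string> QueryIds()
    {
        // The filter that hides historical copies only exists while versioning is on.
        return docs.Keys
            .Where(id => VersioningEnabled == false || id.Contains("/revisions/") == false)
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToList();
    }
}
```

With versioning on, writing `posts/1` once and then updating its comment count three times still yields a single query result. Turn `VersioningEnabled` off and the same query returns four documents: the post plus one historical copy per comment.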

Fixed now, and I apologize for the trouble.

As a general point of interest, bundles allow great flexibility in the database, but they are designed to be set up along with the database. Removing and adding bundles to a database on the fly is not something that is going to just work, as we have just re-learned.

I apologize for the problem. It was a simple misconfiguration that caused all the historical records to show up when they didn’t need to. Nothing to see here, move along :-)

Raccoon Blog and RavenDB–One month later

One of the fun parts about RavenDB is that it will optimize itself for you depending on how you are using your data.

With this blog, I decided when going live with RavenDB that I would not follow the best practices of ensuring static indexes for everything, but would let it figure it out on its own.

Today, I got curious and decided to check up on that:

image

What you see is pretty interesting.

  • The first three indexes were automatically created by RavenDB in response to queries made on the database.
  • The Raven/* indexes are created by RavenDB itself, for the Raven Studio.
  • The MapReduce indexes are for statistics on the blog, and are the only two that were actually created by the application explicitly.

Elegant Code, Raccoon Blog’s CSS Controller

The following piece of code is responsible for the CSS generation for this blog:

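using System;
using System.IO;
using System.Text;
using System.Web;
using System.Web.Mvc;
using dotless.Core.configuration;

public class CssController : Controller
{
     // Merges the requested CSS/LESS files into a single cacheable response.
     public ActionResult Merge(string[] files)
     {
         // Only files under this directory may be served. The trailing separator
         // keeps sibling directories such as "css-old" from passing the prefix check.
         var pathAllowed = Server.MapPath(Url.Content("~/Content/css"))
             .TrimEnd(Path.DirectorySeparatorChar) + Path.DirectorySeparatorChar;

         var builder = new StringBuilder();
         foreach (var file in files)
         {
             var normalizeFile = Server.MapPath(Url.Content(Path.Combine("~/Content/css", file)));
             if (normalizeFile.StartsWith(pathAllowed, StringComparison.OrdinalIgnoreCase) == false)
             {
                 return HttpNotFound("Path not allowed");
             }
             if (System.IO.File.Exists(normalizeFile))
             {
                 // Registering the dependency drops the cached output when the file changes.
                 Response.AddFileDependency(normalizeFile);
                 builder.AppendLine(System.IO.File.ReadAllText(normalizeFile));
             }
         }

         // Cache one entry per distinct "files" list, validated against the file dependencies.
         Response.Cache.VaryByParams["files"] = true;
         Response.Cache.SetLastModifiedFromFileDependencies();
         Response.Cache.SetETagFromFileDependencies();
         Response.Cache.SetCacheability(HttpCacheability.Public);

         // Run the merged text through the LESS compiler to produce plain CSS.
         var css = dotless.Core.Less.Parse(builder.ToString(), new DotlessConfiguration());

         return Content(css, "text/css");
     }
}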

There are a lot of things going on in a very short amount of code. The CSS for this blog is defined as:

<link rel="stylesheet" type="text/css" href="/blog/css?files=ResetCss.css&files=custom/ayende.settings.less.css&files=base.less.css&files=custom/ayende.less.css">

This means that while we have multiple CSS files that make maintaining things easier, we only make a single request to the server to get all of them.

Next, and importantly, we have a security check that ensures that only files from the appropriate path can be served. If a file isn’t in the CSS directory, it won’t be returned.

Then we have a lot of code that is related to caching, which basically means that we will rely on the ASP.Net output cache to do everything for us. The really nice thing is that unless the files change, we will not be executing this code again, rather, it would be served directly from the cache, without any computation on our part.

All in all, I am more than happy with this code.

And now… Raccoon Blog

One of the things that people kept bugging me about is that I am not building applications any longer. Another annoyance was my current setup for this blog.

Basically, my own usage pattern is quite different from most people’s. I tend to post to the blog in batches, usually of five or ten blog posts in a short amount of time. That means that future posts are pretty important to me, and that scheduling posts is something that I care deeply about. I managed to hack Subtext to do my bidding in that regard, but it was never perfect, and re-scheduling posts was always a pain in the ass.

Then there is RavenDB, and my desire to run on it…

What this means, in short, is that we now have the Raccoon Blog, which is a blog application suited specifically for my own needs, running on MVC 3 and with a RavenDB backend.

By the time you read it, it will already be running our company blog and it is scheduled to run ayende.com/blog this week.

What is Raccoon Blog?

It is a blog application running on top of a RavenDB store and tailored specifically for our needs.

  • Strong scheduling features
  • Strong re-scheduling features
  • Support multiple authors in a single blog
  • Single blog per site (no multi blog support)
  • Recaptcha support
  • Markdown support for comments (you’ll be able to post code!)
  • Easy section support (for custom sidebar content)
  • Smart tagging support

And just for fun:

  • Fully HTML 5 compliant

Goodbye, ayende.com

Originally posted at 5/5/2011

The last time that I updated ayende.com was 2009, and I don’t see a lot of value in keeping it there. I am considering doing something drastic about that, and simply moving the blog to ayende.com.

Do you have anything there that you really care about?

Just to be clear, the blog will still be here, we are talking about the site available at ayende.com, not ayende.com/blog.

Smart post re-scheduling with Subtext

As you know, I use future posting quite heavily, which is awesome, as long as I keep to the schedule. Unfortunately, when you have posts scheduled two or three weeks in advance, it is actually quite common to need to post something more immediately.

And that is just a pain. I just added smart re-scheduling to my fork of Subtext. Basically, it is very simple. If I post now, I want the post now. If I post it in the future, move everything one day ahead. If I post with no date, put it as the last item on the queue.
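The three rules can be sketched roughly as follows. This is not the code from my Subtext fork; `ScheduledPost` and `Rescheduler` are hypothetical names, and the sketch assumes one post per day, that "move everything" means every post scheduled on or after the requested day, and that an empty queue means "publish now".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; not Subtext's data model.
public class ScheduledPost
{
    public string Title;
    public DateTime PublishAt;
}

public static class Rescheduler
{
    // requested == null means "no date given".
    public static ScheduledPost Add(List<ScheduledPost> queue, string title,
                                    DateTime? requested, DateTime now)
    {
        DateTime publishAt;
        if (requested == null)
        {
            // No date: last item in the queue, one day after the latest post.
            publishAt = queue.Count == 0 ? now : queue.Max(p => p.PublishAt).AddDays(1);
        }
        else if (requested.Value <= now)
        {
            // Post now: publish immediately and leave the queue untouched.
            publishAt = now;
        }
        else
        {
            // Post in the future: take the requested slot and push everything
            // scheduled on or after that day one day ahead.
            publishAt = requested.Value;
            foreach (var post in queue.Where(p => p.PublishAt.Date >= publishAt.Date))
            {
                post.PublishAt = post.PublishAt.AddDays(1);
            }
        }

        var added = new ScheduledPost { Title = title, PublishAt = publishAt };
        queue.Add(added);
        return added;
    }
}
```

For example, with posts queued for the 2nd, 3rd and 4th, adding a post dated the 3rd leaves the first one alone and moves the other two to the 4th and 5th. A post added with no date then lands on the 6th.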

This is the test for this feature.

I need my own blog software, damn it

I am well aware that I am… outside the curve for bloggers. For a long while I handled that by simply dumping the posts as soon as I wrote them, but that turned out to be quite a burden for some readers, and pieces that I think deserve more attention were skipped, because they were simply drowning in the noise of so many blog posts.

I am much happier with the future posting concept. It makes things more predictable, both for me and for the readers. The problem happens when you push this to its logical conclusion. At the time of this writing, I have a month of scheduled posts ahead of me, and this is the third or fourth blog post that I wrote in the last 24 hours.

In essence, I created a convoy for my own blog. At some point, if this trend progresses, it will be a problem. But I kinda like the fact that I can relax for a month and the blog will function on auto pilot. There is also the nice benefit that by the time that the blog post is published, I forgot what it said (I use the write & forget method), so I need to read the post again, which helps, a lot.

But there are some disadvantages to this as well. My current system will simply schedule a post on the day after the last scheduled one. This works great if I have posts that are not time sensitive. But what actually happens is that there are a lot of scenarios in which I want to set the date of the post to the near future. I still try to keep it to one post a day, so that means that I need to shuffle the rest of the items in the queue. This is especially troubling when you consider that I usually write a series of posts that interconnect into a full story.

So I can’t just take one of them and bump it to the end, I might have to do rearranging of the entire timeline. And there is no support for that, I have to go and manually update the timing for everything else.

It is pretty clear why this feature is missing: it is an outlier. But it probably means that I am going to fork SubText and add those things. And the real problem is that I would really like to avoid doing any UI work there. So I need to think about a system that would let me do that without any UI work on my part.

On comments and social interaction

I got a request in email to add something like Disqus to my blog, which would allow a richer platform for the commenting that goes on here. I think that the request and my reply are interesting enough to warrant this blog post.

My comment system is the default Subtext one, but there are several advantages to the way it works. You can read the full explanation in the Joel on Software post about the matter, but basically, threading encourages people to go off on tangents, while a single thread of conversation makes it significantly easier to have only one conversation.

There is another reason, which is personally important to me, which is that I want to "own" the comments. Not own in terms of copyright, but own in terms of having control of the data itself. Having the comments (a hugely important part of the blog) being managed by a 3rd party which might shut down and take all the comments with it is not acceptable.

That is probably a false fear, but it is something that I take under consideration. The reasoning about the type of interaction going on in the comments is a lot more important. There is also something else to consider, if a post gets too hot (generating too many comments), I am either going to close comments on it, or open a new post with summary of what went on in the previous post comment thread anyway, so it does have some checks & balances that keep a comment thread from growing too large.

What happened to technorati?

Recently all my technorati feeds started to give me stuff like this:

image

It looks like someone managed to crack the way that technorati is searching feeds, and I am getting what amounts to spammed search results. If this continues, it looks like I’ll just have to give up on it completely.

Any good alternatives?

Impleo – a CMS I can tolerate

If you head out to http://hibernatingrhinos.com/, you will see that I finally had the time to set up the corporate site. This is still very early, and I have a lot of content to add there, but it is a start.

Impleo, the CMS running the site, doesn’t have any web based interface, instead, it is built explicitly to take advantage of Windows Live Writer and similar tools. The “interface” for editing the site is the MetaWeblog API. This means that in order to edit the site, there isn’t any Wiki syntax to learn, or XML files to edit, or anything of this sort.
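For a feel of what that means on the wire, here is a rough sketch of the XML-RPC request an editor sends for `metaWeblog.newPost(blogid, username, password, struct, publish)`. The helper names are hypothetical, and nothing here shows how Impleo itself handles the call. It only builds the request body.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; builds the request body only, it does not send anything.
public static class MetaWeblogSketch
{
    public static XDocument NewPostRequest(string blogId, string user, string password,
                                           string title, string html, bool publish)
    {
        return new XDocument(
            new XElement("methodCall",
                new XElement("methodName", "metaWeblog.newPost"),
                new XElement("params",
                    Param(new XElement("string", blogId)),
                    Param(new XElement("string", user)),
                    Param(new XElement("string", password)),
                    // The post itself travels as a struct of named members.
                    Param(new XElement("struct",
                        Member("title", title),
                        Member("description", html))),
                    // XML-RPC booleans are written as 1 or 0.
                    Param(new XElement("boolean", publish ? "1" : "0")))));
    }

    private static XElement Param(XElement value)
    {
        return new XElement("param", new XElement("value", value));
    }

    private static XElement Member(string name, string text)
    {
        return new XElement("member",
            new XElement("name", name),
            new XElement("value", new XElement("string", text)));
    }
}
```

The editor posts this document to the site's endpoint, which is why a tool like Windows Live Writer can act as the whole editing interface.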

You have a powerful editor at your fingertips, one that properly handles things like adding images and other content. This turns the whole experience around. I usually find documentation boring, but I am used to writing in WLW, it is fairly natural to do, and it removes all the pain from the equation.

One of the things that I am trying to do with it is to set up a proper documentation repository for all my open source projects. This isn’t something new, and it is something that most projects have a hard time doing. I strongly believe in making things simple, in reducing friction. What I hope to do is to be able to accept documentation contributions from the community for the OSS projects.

I think that having a full-fledged rich text editor in your hands is a game changer, compared to the usual way OSS projects handle documentation. Take a look at what is needed to make this work: it should take three minutes to get started, with no learning curve and no “how do they do this”.

So here is the deal, if you would like to contribute documentation (which can be anything that would help users with the projects), I just made things much easier for you. Please contact me directly and I’ll send you the credentials to be able to edit the site.

Thanks in advance for your support.

ayende.com move process completed

The server is now hosted at GoGrid, it took longer than I anticipated because I also moved it to EC2 to test that (post about this is already in the queue, and will show up in about 2 weeks).

Commenting is now enabled, and it all should just work. Please let me know if something is broken.

ayende.com is moving servers – some interruption may result

Well, the blog has grown a bit too large for my current host, and I decided that I need to move it elsewhere.

In order to make the move easier, I am disabling commenting site-wide. I’ll try to make this as fast as possible.

Automation, Mark 1

I just got this email:

I seem to get automatic E-Mail notifications if you answer a comment from me on your blog. I asked our admin to enable this for our team blog, but he came up empty. Any hints?

My reply was a bit disappointing, I guess.

Yes, 

It works like this.

  1. You post a comment to my blog.
  2. I get an email.
  3. I answer the comment in gmail, send it to you.
  4. I then copy the reply to the blog and post it.

Sorry, no magic here :-)

It would be a nice feature, but by now this is simply part of my workflow for answering comments.

How do you manage to blog so much?

In a recent email thread, I was asked:

How come that you manage to post and comment that much? I bet you spend really loads of time on your blog.

The reason why I ask is because roughly a month ago I've decided to roll out my own programming blog. I've made three or four posts there (in 5 days) and abandoned the idea because it was consuming waaaaay too much time (like 2-3 hours per post)

The only problem is time. If I'm gonna to post that much every day (and eventually also answer to comments), it seems that my effective working time would be cut by 3+ hours. Daily.  So here comes my original question. How come that you manage to post and comment that much?

And here is my answer:

I see a lot of people in a similar situation. I have been blogging for 6 years now (Wow! how the time flies), and I have gone from a blog that had zero readers to one that has a respectable readership. There are only two things that I do that are in any way unique. First, my "does it fit to be blogged about?" level is pretty low. If it is interesting (to me), it will probably go to the blog.

Second, I don't mind pushing a blog post that requires fixing later. It usually takes me ten minutes to put out a blog post, so I can literally have a thought, post it up and move on, without really noticing it hurting my workflow. And the mere act of putting things in writing for others to read is a significant one, it allows me to look at things in writing in a way that is hard to do when they are just in my head.

It does take time, make no mistake about that. The 10 minute blog post is about 30% of my posts; in a lot of cases, a post is something that I have to work on for half an hour. In some rare cases, it goes to an hour or two. There have been several dozen posts that took days. But while it started out as a hobby, it has become part of my work now. The blog is my marketing effort, so to speak. And it is an effective one.

Right now, I have set it up so about once a week I am spending four or five hours crunching out blog posts and future posting them. Afterward, I can blog whenever I feel like, and that takes a lot of the pressure off. It helps that I am trying hard to schedule most of them day after day. So I get a lot of breathing room, but there is new content every day.

That doesn’t mean that I actually blog once a week, though. I push stuff to the blog all the time, but it is usually short notes, not posts that take a lot of time.

As for comments, take a look at my commenting style, I am generally commenting only when I actually have something to add to the conversation, and I very rarely have long comments (if I do, they turn into posts :-) ).

It also doesn’t take much time to reply to most of them, and it creates a feedback cycle that means that more people are participating and reading the blog. It is rare that I post a topic that really stirs people up and that I feel obligated to respond to all/most of the comments. That does bother me, because it takes too much time. In those cases, I’ll generally close the comment threads with a note about that.

One final thought, the time I spend blogging is not wasted. It is well spent. Because it is an investment in reputation, respectability and familiarization.

Blog posts ideas, remapped

Wow,

I got a lot of suggestions for new blog posts. I am not promising that I would do all of them, but I am going to try.

The experiment went well enough that I decided to create a dedicated forum for this, which would allow people to also vote for individual post topics, and give me a better idea of what people want to see.

I also moved all the previous suggestions to the new forum, and I think that I’ll make it a regular feature.

Extending SubText to report Future Posts

Since I am making so much use of future posts recently, I decided that it would be interesting to have a sidebar that shows the future posts. The problem is that I really don’t want to mess around with SubText.

This is not a slight against SubText, it has served me well for a long time. It is simply that for what I wanted, the number of steps that I would have to go through is way too long:

  • Get the relevant source
  • Compile it on my machine
  • Figure out SubText’s architecture and where I should make my changes
  • Upload a new version of the blog engine

It is possible, but it just takes too long. Moreover, it would most certainly break the next time that I update SubText, because I would forget all about it.

But that is not the only way to get stuff done. Here is my solution:

Update: I actually had a bug here related to time zone handling, fixed now.

Update 2: I used the 3.5 TimeZone semantics for this, but my server is running 2.0, so I use an okay hack instead. I also fixed a potential issue with < & > in the titles.

<%@ Page Language="C#" %>
<%@ OutputCache Duration="60" VaryByParam="None" %>

<%@ Import Namespace="System.Data.SqlClient" %>
<%@ Import Namespace="System.Configuration" %>
<ul>
<%
string connectionString = ConfigurationManager.ConnectionStrings["subtextData"].ConnectionString;
using(SqlConnection connection = new SqlConnection(connectionString))
{
	connection.Open();
	
	using(SqlCommand cmd = connection.CreateCommand())
	{
		cmd.CommandText = @"
select top 15 Id, DateSyndicated, Title from subtext_Content
where DateSyndicated > @nowIsrael
order by DateSyndicated
";
		// The "okay hack": approximate Israel time as a fixed UTC+3 offset (ignores DST changes).
		DateTime nowIsrael = DateTime.UtcNow.AddHours(3);
		
		cmd.Parameters.AddWithValue("@nowIsrael", nowIsrael);
		
		using(SqlDataReader reader = cmd.ExecuteReader())
		{
			if(reader.HasRows == false)
			{
				%><li>Queue is empty</li><%
			}
			while(reader.Read())
			{
				DateTime dateSyndicated = (DateTime)reader["DateSyndicated"];
				string title = (string) reader["Title"];
				string formattedSyndication;
				// Compare against the same clock the query used, not the server's local time.
				TimeSpan timeLeft = dateSyndicated - nowIsrael;
				
				if (timeLeft.Days > 7)
					formattedSyndication = timeLeft.Days / 7 + " weeks " + timeLeft.Days % 7 + " days";
				else if (timeLeft.Days > 0)
					formattedSyndication = timeLeft.Days + " days";
				else if (timeLeft.Hours > 0)
					formattedSyndication = timeLeft.Hours + " hours";
				else if (timeLeft.Minutes > 0)
					formattedSyndication = timeLeft.Minutes + " minutes";
				else
					formattedSyndication = "In a moment";
				
				%>
				<li><%= formattedSyndication %> <br /> <%= Server.HtmlEncode(title) %></li>
				<%
			}			
		}
	}
}
%>
</ul>

And then it was just a matter of changing the template to include the following JavaScript:

$('#futurePosts').load('http://ayende.com/Blog/FuturePosts.aspx');

And that is it. It works, it is safe, it can’t really break anything, and it will probably survive blog upgrades.

Blog posts ideas

Here is something a little different, I am going to open the suggestion box for blogging ideas.

Just jot them down here, and I’ll try to get to them.

Blog Analytics

This is for the blog, for the last month:

image

Some more interesting stats are when you go into the keywords:

image

And if I try to break it up a bit, it gets even more interesting:

  • NHibernate related searches are responsible for ~13% of the traffic on the site
  • Rhino Mocks is responsible for just ~1.7%
  • Generic Rhino Tools stuff is responsible for ~4.7%

But looking at my traffic sources is even more interesting:

image

I expected Google to be a major contributor to the traffic, but I am quite surprised by reddit being so important there, mostly because I don’t think that I submitted stories there.

Post scheduling

This is a general announcement about a change in the way that I am posting to this blog.

One of the more frequent feedback items about the blog was that people find it hard to catch up with my rate of posting. This is especially true since I tend to spend some days posting a large number of posts, and I feel that the sheer quantity reduces the amount of time people dedicate to each post (hence reducing its quality).

I have started making use of future posting to a high degree (almost all of the NHibernate mapping posts were written in a day or two, for example, but spaced over about a month). I don’t really try to keep any sort of organization, except that I am going to try to keep the maximum number of posts per day to no more than two. Each new post is just going to the back of the queue, and will be posted then.

Currently I have scheduled posts all the way to mid May, but I think it will get higher. This is good news in the sense that you are almost always going to get at least one post per day from me, but it does mean that sometimes posts that are written together are stretched over a period of time. Or I may refer (usually in comments) to posts that will be posted in the future.

There is no real meaning behind the timing of the posts, unless there is something special that happens on that date, so you may leave the conspiracy theories to rest :-) .

Posts vs. Comments

Just some interesting correlation:

image

Yesterday was the second most active day comment-wise, eclipsed only by my single foray into politics.

Another interesting statistic:

# Posts / Day    # of Days
1                336
2                287
3                200
4                165
5                104
6                64
7                28
8                32
9                11
10               11
11               5
12               7
13               3
14               1
17               1
22               1

This table shows the relation between the number of posts per day and the number of times that I posted that number.

So, for example, you can see that my record is (once and never again!) 22 posts per day, and that I quite frequently post more than a single post per day.
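A quick sketch of the arithmetic behind that claim, using only the two columns of the table. The class and method names are made up for illustration, and days with zero posts are not in the table, so they are not counted.

```csharp
using System;
using System.Linq;

// Hypothetical helper that sums up the posts-per-day table.
public static class PostingStats
{
    // First column: posts per day. Second column: number of days with that count.
    public static readonly int[] PostsPerDay =
        { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 22 };
    public static readonly int[] NumberOfDays =
        { 336, 287, 200, 165, 104, 64, 28, 32, 11, 11, 5, 7, 3, 1, 1, 1 };

    // Days on which at least one post went out.
    public static int TotalDays()
    {
        return NumberOfDays.Sum();
    }

    // Each row contributes (posts per day) x (number of such days).
    public static int TotalPosts()
    {
        return PostsPerDay.Zip(NumberOfDays, (posts, days) => posts * days).Sum();
    }

    public static int DaysWithMoreThanOnePost()
    {
        return PostsPerDay.Zip(NumberOfDays, (posts, days) => posts > 1 ? days : 0).Sum();
    }
}
```

Run against the table, this gives 3,966 posts over 1,256 posting days, with 920 of those days (roughly 73%) having more than one post.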

Saying goodbye to proprietary software

I have been doing Open Source stuff for five years plus now, you could say that it defines a lot of what I do. Increasingly, but for a long time, I have been feeling uncomfortable in my position as an Open Source developer that works on a proprietary operating system and proprietary platform.

I have been spending more and more time investigating alternative technologies, as you probably noticed, and by now I think that I have accumulated enough knowledge to deal with the shift without causing too much pain for me.

I am writing this post on my new machine, a Linux box running Debian. This seems like a good match for my position, since Debian doesn't allow non free software on their distribution. I repaved my Macs as well. OS X started life as free software, but it is no longer free.

I had some issues along the way, but nothing that was truly serious. Now, to the development environment. I considered moving to Java, since it is now truly Open Source and is very similar to .Net. However, its roots are non Free, and I would like to make a clean break from non Free software. I considered Ruby as well, but I don't like it very much. Nothing specific beyond the fact that it is too much of a buzzword.

Combining both personal preference and political stance, I decided to go with Erlang, which has been Free and Open Source for over a decade. I have been digging in CouchDB's code lately, and it is very interesting.

That is all for now.

Oh, and you might experience some problems with the blog. I am moving it from Subtext (which is OSS, but running on non OSS platform) to Drupal on Linux.

Newsflash to commenters: it is my blog

I can't believe that I actually have to spell this out.

This is my blog.

You can double check the URL, to make sure that it clearly states that.

As such, I am going to write about whatever topic I feel like writing. And if I care enough about Chinese Porcelain Kittens, I am going to write about them.

If you don't like a particular post, feel free to skip it.