Ayende @ Rahien

Refunds available at head office

Challenge: Recent Comments with Future Posts

We were asked to implement comments RSS for this blog, and many people asked about the recent comments widget. That turned out to be quite a bit more complicated than it appeared on first try, and I thought it would make a great challenge.

On the face of it, it looks like a drop dead simple feature, right? Show me the last 5 comments in the blog.

The problem with that is that it ignores something that is very important to me, the notion of future posts. One of the major advantages of RaccoonBlog is that I am able to post stuff that would go on the queue (for example, this post would be scheduled for about a month from the date of writing it), but still share that post with other people. Moreover, those other people can also comment on the post. To make things interesting, it is quite common for me to re-schedule posts, moving them from one date to another.

Given that complication, let us try to define how we want the Recent Comments feature to behave with regard to future posts. The logic is fairly simple:

  • Until the post is public, do not show the post comments.
  • If a post had comments on it while it was a future post, then when it becomes public, the comments that were already posted there should also be included.

The last requirement is a bit tricky. Allow me to explain. It would be easier to understand with an example, which luckily I have:

[Image: a post created on the 1st of July and published on the 12th, with its comments]

As you can see, this is a post that was created on the 1st of July, but was published on the 12th. Mike and I commented on the post shortly after it was created (while it was still hidden from the general public). Then, after it was published, grega_g and Jonty commented on it as well.

If we query at 12 Jul, 08:55 AM, we will not get any comments from this post, but at 12 Jul, 09:01 AM, we should get both comments from this post. To make things more interesting, those should rank as more recent than comments that were posted (chronologically) after them. Confusing, isn’t it? Again, let us go with a visual aid for explaining things.

Let us say that we also have this:

[Image: comments by jdn and Matt Warren on another post]

Here is what we should see in the Recent Comments:

12 Jul, 2011 – 08:55 AM:
  1. 07/07/2011 07:42 PM – jdn
  2. 07/08/2011 07:34 PM – Matt Warren

12 Jul, 2011 – 09:05 AM:
  1. 07/03/2011 05:26 PM – Ayende Rahien
  2. 07/03/2011 05:07 PM – Mike Minutillo
  3. 07/07/2011 07:42 PM – jdn
  4. 07/08/2011 07:34 PM – Matt Warren

Note that the 1st and 2nd entries at 09:05 should sort after the 3rd and 4th by comment date, but are sorted before them because of the post publish date, which we also take into account.
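To make the two rules concrete, here is a minimal Python sketch of the visibility check alone, using the example above. The post names, the 09:00 AM publish time and the publish date of the second post are assumptions made for illustration; none of this is taken from RaccoonBlog itself.

```python
from datetime import datetime

# Hypothetical data modelled on the example above. The 09:00 AM publish time
# and the earlier publish date of the second post are assumptions.
posts = {
    "future-post": datetime(2011, 7, 12, 9, 0),
    "older-post": datetime(2011, 7, 1, 9, 0),
}
comments = [
    ("Mike Minutillo", "future-post", datetime(2011, 7, 3, 17, 7)),
    ("Ayende Rahien", "future-post", datetime(2011, 7, 3, 17, 26)),
    ("jdn", "older-post", datetime(2011, 7, 7, 19, 42)),
    ("Matt Warren", "older-post", datetime(2011, 7, 8, 19, 34)),
]


def visible_authors(now):
    """Authors of comments whose post is already public at `now`."""
    return {author for author, post, _ in comments if posts[post] <= now}


before = visible_authors(datetime(2011, 7, 12, 8, 55))
after = visible_authors(datetime(2011, 7, 12, 9, 5))
print(sorted(before))  # only the comments on the already public post
print(sorted(after))   # all four comments
```

This covers only which comments may appear; the ordering is the hard part of the challenge.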

Given all of that, and regardless of the actual technology that you use, how would you implement this feature?

Comments

08/23/2011 09:53 AM by Patrick Huizinga

I guess I would go for an extra date field, called 'sort-by' or something.

If a comment is posted on a future post, its sort-by would become the post date of the post + 1 tick. Each subsequent comment would become lastComment.sort-by + 1 tick (to maintain absolute ordering). Comments posted on released posts would have a sort-by equal to posted-at.

When fetching the recent comments you would sort on sort-by, but filter out any comment whose sort-by is in the future. When displaying the recent comments, you would still use the posted-at date.
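A minimal Python sketch of this sort-by idea, under stated assumptions: the `Post` class, the `recent_comments` helper and the sample dates are hypothetical, and one microsecond stands in for a "tick".

```python
from datetime import datetime, timedelta

# Assumption: one microsecond stands in for a "tick".
TICK = timedelta(microseconds=1)


class Post:
    def __init__(self, publish_at):
        self.publish_at = publish_at
        self.comments = []

    def add_comment(self, author, posted_at):
        if posted_at < self.publish_at:
            # Future post: place the comment just after the publish date, one
            # tick after the previous comment to keep an absolute order.
            previous = self.comments[-1]["sort_by"] if self.comments else self.publish_at
            sort_by = previous + TICK
        else:
            # Released post: the comment sorts by its own date.
            sort_by = posted_at
        self.comments.append({"author": author, "posted_at": posted_at, "sort_by": sort_by})


def recent_comments(posts, now, count=5):
    """Sort on sort_by, hiding any comment whose sort_by is still in the future."""
    visible = [c for p in posts for c in p.comments if c["sort_by"] <= now]
    visible.sort(key=lambda c: c["sort_by"], reverse=True)
    return [c["author"] for c in visible[:count]]


future_post = Post(publish_at=datetime(2011, 7, 12, 9, 0))
future_post.add_comment("Mike Minutillo", datetime(2011, 7, 3, 17, 7))
future_post.add_comment("Ayende Rahien", datetime(2011, 7, 3, 17, 26))
older_post = Post(publish_at=datetime(2011, 7, 1, 9, 0))
older_post.add_comment("jdn", datetime(2011, 7, 7, 19, 42))
older_post.add_comment("Matt Warren", datetime(2011, 7, 8, 19, 34))

at_0855 = recent_comments([future_post, older_post], datetime(2011, 7, 12, 8, 55))
at_0905 = recent_comments([future_post, older_post], datetime(2011, 7, 12, 9, 5))
print(at_0855)
print(at_0905)
```

In this sketch the stored sort_by is computed once, when the comment is added, so it would go stale if the post were later moved to another date.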

08/23/2011 09:56 AM by Ayende Rahien

Patrick, A small detail, we also have to deal with post rescheduling. That is actually fairly common (I reshuffle things frequently).

08/23/2011 10:36 AM by Felipe Fujiy Pessoto

I think I would create an extra date field, publicationdate, for comments.

Then, the first time the post shows on the blog (or RSS), this field is filled (you just need to check whether it is null and set it to DateTime.Now).

If you reschedule the post, the comment keeps that date and doesn't show in the RSS again. But if you want that behavior, you just need to set the comment's publicationdate to null on reschedule.

Sorry for my bad english =/

08/23/2011 10:40 AM by Felipe Fujiy Pessoto

Oh, the "order by" would be, PublicationDate(the new field), OriginallyPosted

08/23/2011 10:44 AM by Ayende Rahien

Felipe, Nice. The way we did that, we save the actual post date in the comments, and then do this sort of two staged order by.

08/23/2011 11:21 AM by tobi

You can order on computed values as well, so order on

WasCommentCreatedBeforePublish desc, Comment.CreateDateTime

With WasCommentCreatedBeforePublish = Comment.CreateDateTime < Post.PublishDateTime
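A rough Python sketch of ordering on that computed flag. The `Comment` type, the sample dates and the filter on public posts are assumptions, and so is the descending order on the comment date, which the comment above does not specify.

```python
from datetime import datetime
from typing import NamedTuple


class Comment(NamedTuple):
    author: str
    created: datetime
    post_published: datetime


def recent_comments(comments, now, count=5):
    public = [c for c in comments if c.post_published <= now]
    # Sort on the computed flag first, then on the comment date. Descending
    # order for the date is an assumption; the comment above does not say.
    public.sort(key=lambda c: (c.created < c.post_published, c.created), reverse=True)
    return [c.author for c in public[:count]]


just_published = datetime(2011, 7, 12, 9, 0)  # assumed publish time
long_public = datetime(2011, 7, 1, 9, 0)      # assumed publish time
comments = [
    Comment("Mike Minutillo", datetime(2011, 7, 3, 17, 7), just_published),
    Comment("Ayende Rahien", datetime(2011, 7, 3, 17, 26), just_published),
    Comment("jdn", datetime(2011, 7, 7, 19, 42), long_public),
    Comment("Matt Warren", datetime(2011, 7, 8, 19, 34), long_public),
]

ordered = recent_comments(comments, datetime(2011, 7, 12, 9, 5))
print(ordered)
```

Note that in this sketch the flag does not depend on how long ago the post was published, so pre-publish comments on any public post sort ahead of all other comments.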

08/23/2011 11:24 AM by Matt McElheny

Perhaps I'm oversimplifying it, but this seems like just a simple conditional order by. If the blog entry was posted publicly AFTER the comment, then the blog post date is used, otherwise use the date the comment was posted. A secondary sort is needed in order to help sort the comments posted prior to the publish date.

var mostRecentComments =
    (from c in BlogComments
     orderby (c.BlogPost.PublishDate > c.CommentDate ? c.BlogPost.PublishDate : c.CommentDate) descending,
             c.CommentDate descending
     select c).Take(5);
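The same conditional order by can be sketched in Python. The `Comment` type and the sample dates are hypothetical, and the filter on already public posts is added here from the first requirement in the post; it is not part of the query above.

```python
from datetime import datetime
from typing import NamedTuple


class Comment(NamedTuple):
    author: str
    posted_at: datetime
    post_published_at: datetime


def recent_comments(comments, now, count=5):
    # Only comments on posts that are already public may appear.
    public = [c for c in comments if c.post_published_at <= now]
    # Primary key: the later of publish date and comment date.
    # Secondary key: the comment date, to order pre-publish comments.
    public.sort(
        key=lambda c: (max(c.post_published_at, c.posted_at), c.posted_at),
        reverse=True,
    )
    return [c.author for c in public[:count]]


just_published = datetime(2011, 7, 12, 9, 0)  # assumed publish time
long_public = datetime(2011, 7, 1, 9, 0)      # assumed publish time
comments = [
    Comment("Mike Minutillo", datetime(2011, 7, 3, 17, 7), just_published),
    Comment("Ayende Rahien", datetime(2011, 7, 3, 17, 26), just_published),
    Comment("jdn", datetime(2011, 7, 7, 19, 42), long_public),
    Comment("Matt Warren", datetime(2011, 7, 8, 19, 34), long_public),
]

at_0855 = recent_comments(comments, datetime(2011, 7, 12, 8, 55))
at_0905 = recent_comments(comments, datetime(2011, 7, 12, 9, 5))
print(at_0855)
print(at_0905)
```

Because the key is computed from the post's current publish date at query time, nothing stored on the comment needs updating when a post is rescheduled, at least in this in-memory sketch.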

08/23/2011 02:12 PM by Joe Marquardt

You have 3 dates: blog posted, blog published to public, comment posted.

I don't see the need for a double order by. Just filter out any comments that are attached to an unpublished blog post and order the rest by comment posted date desc.

Do you show the unpublished recent comments to those privileged users? If so, then omit the filtering out of unpublished posts based on their rights.

08/23/2011 03:02 PM by Patrick Huizinga

Joe, the problem lies in the fact that at the moment a blog post becomes public, all the comments on it should appear at the top of the list of recent comments, even if such comments were actually posted last week.

So just sorting on comment posted date isn't going to cut it, because then all the old comments of the just-became-public blog post will never appear in the recent comments.
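A small Python sketch of that failure mode, with made-up authors and dates: two comments written before the post became public compete with five later comments on posts that were public all along. All posts are public at query time, so no filter is needed here.

```python
from datetime import datetime

PUBLISHED = datetime(2011, 7, 12, 9, 0)  # the post that just became public
LONG_AGO = datetime(2011, 6, 1, 9, 0)    # assumption: other posts were public all along

# (author, comment date, publish date of the comment's post); names are made up.
comments = [
    ("early-1", datetime(2011, 7, 3, 17, 7), PUBLISHED),
    ("early-2", datetime(2011, 7, 3, 17, 26), PUBLISHED),
] + [(f"reader-{day}", datetime(2011, 7, day, 12, 0), LONG_AGO) for day in range(7, 12)]


def top5(key):
    """Authors of the five comments that rank highest under `key`."""
    return [author for author, _, _ in sorted(comments, key=key, reverse=True)[:5]]


# Sorting on the comment date alone never surfaces the early comments.
by_comment_date = top5(lambda c: c[1])
# Sorting on the later of comment date and publish date puts them on top.
by_later_date = top5(lambda c: (max(c[1], c[2]), c[1]))
print(by_comment_date)
print(by_later_date)
```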

08/23/2011 03:16 PM by Joe Marquardt

ahhh! I thought I had to be missing something.

08/23/2011 03:18 PM by Harry M

So there are two groups, public and nonpublic (the group with Mike in?). Can't you just write a query for each group which runs against the set of posts that that group can access?

08/23/2011 04:24 PM by jdn

Well, it is especially confusing since you have my comment as being posted at 7/8/2011, 08:42PM on the screenshot, but then as 7/7/2011, 07:42 in the text, so I have no idea what the proper order should be.

08/23/2011 04:35 PM by Alexei Kopylov

I don't understand what the problem is. In your very last example (after the second screenshot) you just have comments listed chronologically.

So the logic is: if(IsPostHidden) ignore else show comments chronologically (ascending/descending)

Where is the problem?

08/23/2011 04:37 PM by Ayende Rahien

Alexei, Make this into a query.

08/23/2011 05:41 PM by Ajai

Nice to see there is a special Ayende Inner Circle :)

The dates towards the end of the post for jdn is confusing as hell, but would this work?

The ordering is based on "how soon" somebody commented when the post was made visible to them. Inner circle guy Mike got to see and comment on the post say 10 mins after it was shared with him, where the less privileged Eager Joe posted 2 mins after it was made public.

So the ordering in SQL would look something like this (skipping SQL date functions but you get the idea)

order by
  case
    when c.commentdate < p.publishdate
      then p.publishdate + (c.commentdate - p.postdate) -- how soon would Mike have commented if he did not belong
    else c.commentdate
  end

Ajai
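The date arithmetic in that case expression can be sketched in Python. The `sort_key` helper is hypothetical, and the 09:00 AM creation and publish times are assumptions; only the dates come from the example in the post.

```python
from datetime import datetime


def sort_key(comment_date, post_date, publish_date):
    """Shift a pre-publish comment by its delay after the post was created."""
    if comment_date < publish_date:
        return publish_date + (comment_date - post_date)
    return comment_date


post_date = datetime(2011, 7, 1, 9, 0)      # assumption: creation time
publish_date = datetime(2011, 7, 12, 9, 0)  # assumption: publish time

mike_key = sort_key(datetime(2011, 7, 3, 17, 7), post_date, publish_date)
later_key = sort_key(datetime(2011, 7, 12, 9, 30), post_date, publish_date)
print(mike_key)   # 2011-07-14 17:07:00
print(later_key)  # 2011-07-12 09:30:00
```

With these assumed times, Mike's key lands on 14 Jul, later than the key of a reader who comments half an hour after publication, so the early comment would keep ranking above such comments for a while.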

08/23/2011 05:59 PM by Ayende Rahien

Ajai, Inner circle? Hardly, but I'll make a post about RavenDB available to anyone on the ravendb group, instead of having to wait a month or so until they show up.

08/23/2011 06:05 PM by Ajai

Just kidding about inner circle :) Looking forward to seeing how you ended up solving this...

08/23/2011 06:17 PM by Kaare Skovgaard

Isn't it just filtering out comments from unpublished posts (keeping only those where post.PublishTime < DateTime.Now) and then ordering by max(post.PublishTime, comment.PostedAt)? I guess if the DBMS can't do this kind of sort efficiently, then an extra field is needed.

08/23/2011 08:34 PM by Chris Martin

Why not just use a second property in your post document?

post: :unpublishedComments :date :comments :date

Your view can loop through unpublishedComments and then comments. Absolute order preserved. ;)

08/23/2011 10:51 PM by Alessandro Riolo

I am afraid I have to confess a weakness of mine: I found this post quite hard to follow, due to its use of the middle-endian date format.

I am also sorry to sound like the usual preacher, as on my own blog I haven't cared much about date formats so far either. But I am quite sure that if my blog were a technical one (it is not), or if I posted something about dates, I would set the dates as YYYY-MM-DD (the ISO 8601 calendar date format), and I really should have already, as I tend to do in my programming whenever possible.

08/24/2011 03:46 AM by Matthew

Seems to me like what you're saying is you want all comments on published posts and if the comment's publish date is before the post's publication date then you want to use the post's publication date. Wouldn't that be:

from c in comments where c.post.isPublished order by max(c.post.publishDate, c.commentDate) take 5

Comments have been closed on this topic.