Ayende @ Rahien

Refunds available at head office

Waiting for the Service Pack? I don't think so

Here are a few interesting things that I found about .NET 3.5 Service Pack 1:

  1. Serialization hangs or throws an OutOfMemoryException with static delegate and ISerializable on 3.5 SP1
  2. ExecutionEngineException with ParameterInfo.GetOptionalCustomModifiers and GetRequiredCustomModifiers on 3.5 SP1
  3. .NET 3.5 SP1 breaks use of WPF under IIS
  4. .NET 3.5 SP1 seems to break .NET 2.0 applications with assembly loading error.
  5. .NET Framework 3.5 SP1 breaks type verification

I just googled for "3.5 SP1" on connect.microsoft.com, and took only the verified (by Microsoft) items for the SP1 RTM. There are more, but I didn't feel like spending a lot of time digging there.

I don't remember previous service packs having regression bugs. Certainly not regression bugs as big as these. Number 2 is the one responsible for breaking Rhino Mocks, by the way.

Those are all regressions. That is, they used to work in the previous version, and now they don't.

Not happy at all.

Here is a challenge to Microsoft: Fix This. And fix it in a way that ensures that everyone gets the fix. If this means an SP1 Refresh, great. But fix it in a way that means that I don't have to answer "you need to call PSS and get KB32423 before you can run" for the next two years.

And fix this from Connect, if you ever want Connect to be useful for something.

Comments

glueball
08/15/2008 12:57 PM by glueball

http://codebetter.com/blogs/patricksmacchia/archive/2008/08/13/net-3-5-sp1-changes-overview.aspx

This guy has counted up the changes within SP1 with the help of NDepend. It seems 3.5 SP1 is closer to a 3.6 =)

Summary:

Assemblies now: 112

Namespaces 919 to 935 (+16 +1.7%)

Types 39 988 to 40 513 (+525 +1.3%)

Methods 387 421 to 386 790 (-631 -0.2%)

Fields 241 567 to 246 795 (+5 228 +2.2%)

IL instructions 8 598 933 to 8 620 940 (+22 007 +0.3%)

1.393 new public methods

79 new public types

No public types removed (hopefully!)

14 non-public methods became public

6.384 methods where code was changed

2.485 types where code was changed

Gabriel Schenker
08/15/2008 02:27 PM by Gabriel Schenker

It's a shame for Microsoft that this happens. They had enough time to do proper testing first. Why do a lot of groups inside MS still not listen to the community???

Mike Brown
08/15/2008 02:34 PM by Mike Brown

@Gabriel Did you see the numbers posted up above? Over 40 thousand types in the framework. There are bound to be bugs. The issue is that they need to be fixed as well.

@Ayende I agree, they need to push a refresh rather than having some hidden knowledge base fix for the issue.

Frans Bouma
08/15/2008 04:29 PM by Frans Bouma

A few more:

http://oakleafblog.blogspot.com/2008/08/serious-fail-of-aspnet-dynamic-data-sp1.html

http://oakleafblog.blogspot.com/2008/08/upgrading-databound-projects-to-entity.html

Both are EF related, but still... silly issues.

Steve Bohlen
08/15/2008 04:34 PM by Steve Bohlen

@Gabriel:

I don't think they had enough time at all -- I think there is a real likelihood that they were pressed to release this to coincide with the long-delayed RTM of SQL Server 2008. Recall MSSQL2008 was originally supposed to be released LAST NOVEMBER with the VS2008 + .NET 3.5 release, so it's 'conceptually' nearly a year 'late' already.

Since .NET 3.5 SP1 is where the driver support comes from for MSSQL2008 and VS2008 SP1 is where the designer integration (server explorer, etc.) comes from for the IDE, I think there must have been intense internal pressure to ship SP1 of both on or at the release of MSSQL2008.

@Mike:

NEW bugs are one thing; regression bugs (of which nearly all of these seem to be) are inexcusable. Period. If they tested them before the RTM of the base 3.5 framework, then for them to appear in SP1 is ridiculous. And if they didn't test them for the base 3.5 release, then shame on them for just crossing their fingers and praying all is well :)

Mike Brown
08/15/2008 05:36 PM by Mike Brown

Are you saying that these were bugs that existed in 3.5, were fixed, and then came back in SP1? Or are you saying that the bugs didn't exist in 3.5 but now do in SP1?

The former is inexcusable and is the true definition of a regression (if you fix it you should have made a test to cover that specific bug so it won't show up again).

The latter is just something that happens...new code causes new bugs.

Maybe Ayende can help out here. Are they true regressions or are they new bugs?

Ayende Rahien
08/15/2008 06:11 PM by Ayende Rahien

The bugs did NOT exist in 3.5, they appeared in 3.5 SP1.

This is not a case of relying on a bug; this is a case of a bug appearing in the SP.

Mike Brown
08/15/2008 06:17 PM by Mike Brown

Yes, but it is a new bug. It's not a regression. A regression is an existing bug being fixed and then reappearing in a later release.

I'm not trying to trivialize the impact of the bug. However, they are not regressions because they never existed before.

Ayende Rahien
08/15/2008 06:20 PM by Ayende Rahien

Regression in this case means that working code stopped working.

Not that a bug resurfaced.

Mike Brown
08/15/2008 06:37 PM by Mike Brown

I understand and understood what you meant by regression, but in general when someone hears "regression bug" they take the other meaning. Hence you have people coming out of the woodwork saying that Microsoft doesn't care about code quality and that they should have caught a bug that they weren't even aware existed.

That's why I put it in perspective. Show me someone with 40 thousand + types in their codebase without a single exposed bug and I'll show you someone with 40 thousand + useless code files.

Ayende Rahien
08/15/2008 06:43 PM by Ayende Rahien

Mike,

The problem isn't that there is a bug. The problem is that there is a new bug in SP1, and Microsoft has been so quiet about it.

A service pack in general should be safe. Breaking something as basic as type loading or the actual CLR code is... surprising.

Mike Brown
08/15/2008 07:15 PM by Mike Brown

Ayende,

I FULLY agree with you. I'm sorry if I haven't made that clear up to now. My issue is with the slashdot mentality that people are popping up with here.

Steve Bohlen appears to think that these were known and fixed bugs that reappeared in SP1. But they're not.

Working with a framework yourself, I'm sure that you've made a change to a piece of code that broke something on the opposite side of the framework when the two are used together. If you're lucky, you have a unit test that catches this interaction. Otherwise, a user of your framework catches it for you and bashes you for not caring enough about quality.

I'm not saying that these bugs are something that could not have been avoided, I don't know enough about the root cause of the problems. However, some of them look to be a bit extreme. Especially the type verification bug.

I can't even follow the inheritance chain for that bug. DoesntWork inherits from Works<T,T> which inherits from Works and implements IBlank. Works also implements IBlank. I could theoretically see this construct being useful maybe...then again...I tend to avoid deep inheritance trees. (Yes 3 levels is deep to me).
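The shape described above can be sketched in C# roughly as follows. This is reconstructed only from the description in this comment, not from the Connect report itself; the type parameter names and the choice to give DoesntWork its own type parameter are assumptions.

```csharp
// Hypothetical reconstruction of the hierarchy described above.
public interface IBlank { }

// Non-generic base class that implements the marker interface.
public class Works : IBlank { }

// Generic class that inherits from Works and lists IBlank again.
public class Works<T1, T2> : Works, IBlank { }

// Assumed: DoesntWork is generic and passes its one parameter twice.
public class DoesntWork<T> : Works<T, T> { }
```

Nothing here says whether this exact declaration fails verification on 3.5 SP1; it only shows that the chain is three levels deep, with IBlank implemented at two of them.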

Like I said, I understand how frustrating the issue could be. I've sat through eight hours on the horn with PSS because of an error with the DTC on Windows DataCenter that happens with "large amounts of RAM, and large amounts of Processors." There is a patch for this bug...but the patch is only available if you are actually affected by it.

Ayende Rahien
08/15/2008 07:24 PM by Ayende Rahien

Mike,

I think that we all agree that such things can slip past the tightest net. That isn't the problem; the problem is with the expected response. As you noticed, having "secret" patches is really not a good way of handling this.

The inheritance chain is quite common if you are using hierarchies of commands to handle common things.

I am actually much more worried by how you can cause an ExecutionEngineException.

Steve Bohlen
08/15/2008 08:55 PM by Steve Bohlen

@Mike:

I definitely didn't take the meaning that these were bugs that were fixed in 3.5 and reappeared in SP1. I am considering a regression bug anything that is introduced that breaks past WORKING behavior.

My statement about their needing to have tested adequately is (I think) just as relevant for this situation as for a more 'pure' regression bug per your definition (e.g., a bug that was once fixed and has since returned to a bug state).

Can we agree that this is a 'behavior-regression-bug' perhaps, where functionality that once worked AS DESIGNED no longer does, and this means that an unbroken set of behaviors is now broken? And this is behavioral regression, no?

I am proceeding from an assumption that there should be a test in the test suite that actually tests each and every DESIGNED BEHAVIOR, and that the only explanation for being able to release new code that breaks past working behavior would have to be that they either didn't have these tests or didn't bother to run them. Either is inexcusable for a commercial software product like .NET that forms the foundation for so many other things upon which your customers depend, IMHO.

Bill Barry
08/15/2008 10:04 PM by Bill Barry

@Steve:

As for the GetOptionalCustomModifiers bug:

Type ifc = typeof (ISomeInterface);

this is designed functionality and most definitely is tested.

Type ifc = typeof (IGenericInterface<int, int>);

this happens to work because IGenericInterface<int, int> is a specific interface (and that is designed functionality and as such is tested)

The fact that interfaces can have methods is designed and tested.

The fact that methods can have type parameters is designed and tested.

But it is an assumption that because interfaces can have methods and that methods can have type parameters, interfaces can have methods with type parameters and this may or may not be tested (it is after all an integration test of two orthogonal concepts).

On top of that it is a further assumption that an interface with type parameters can have methods with type parameters.
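A minimal sketch of the combination described above: an interface with type parameters that declares a method with its own type parameter, plus reflection over the custom modifiers of that method's parameters. IGenericInterface is the name used above; the Convert method, its signature, and the Program class are made up for illustration. Whether this particular shape is the one that triggers the ExecutionEngineException on 3.5 SP1 is an assumption, not something stated here.

```csharp
using System;
using System.Reflection;

// Hypothetical interface: two type parameters on the interface,
// one more on the method.
public interface IGenericInterface<T1, T2>
{
    TResult Convert<TResult>(T1 first, T2 second, TResult fallback);
}

public static class Program
{
    public static void Main()
    {
        // Open generic definition, as opposed to IGenericInterface<int, int>.
        Type ifc = typeof(IGenericInterface<,>);

        foreach (MethodInfo method in ifc.GetMethods())
        {
            foreach (ParameterInfo parameter in method.GetParameters())
            {
                // These are the two calls named in the Connect item title.
                Type[] optional = parameter.GetOptionalCustomModifiers();
                Type[] required = parameter.GetRequiredCustomModifiers();

                Console.WriteLine("{0}({1}): {2} optional, {3} required",
                    method.Name, parameter.Name,
                    optional.Length, required.Length);
            }
        }
    }
}
```

Each piece (generic interface, generic method, reflection over parameters) is ordinary on its own; the sketch only puts them together in one place.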

I think it is rather absurd that these separate concerns can interact with each other to produce this bug (at the very least it is an indication of poor design). The fact that this bug exists shows that someone at MS isn't very good with their OOP principles (as if that wasn't already painfully obvious in various parts of the framework). This is what they should be shamed for, not the symptoms. Without acknowledging and fixing the problem they can't hope to ever be able to ship free of mistakes like this.

firefly
08/15/2008 10:22 PM by firefly

We really need more people like Oren to speak up like this. As developers, I think it's safe to say that we all understand that bugs happen, but what matters is how the bugs are handled.

Especially with something like Connect. What's the point of having Connect and having users vote on issues if they don't act on them? Personally I would imagine that most developers would love to fix bugs... especially somebody who works on the CLR team... So it's probably a management issue.

Brad Abrams
08/16/2008 10:48 PM by Brad Abrams

Thanks Ayende for highlighting this issue -- believe me teams at Microsoft are working hard on the issues you pointed out... We will be addressing these and getting it out to the community as soon as we can...

Ayende Rahien
08/16/2008 10:54 PM by Ayende Rahien

Thanks,

It is good to know that this is being taken care of.

Kamran Shahid
08/18/2008 09:34 AM by Kamran Shahid

It is inexcusable.

Only the lateness of the SQL Server 2008 RTM might have forced such an early release of VS 2008 SP1.

David Nelson
08/19/2008 04:12 PM by David Nelson

I reported number 5. Mike, I can see how it might look a little arcane to you, but it is actually an integral part of our DAL; it allows significant flexibility in defining the caching and retrieval behaviors of various data sources, and it has been invaluable as the application has evolved. And yes, I would define that as a regression bug, since existing behavior which worked correctly according to spec no longer works correctly after the "service pack" is installed.

Although it may not be the most common scenario, in my opinion this is not that complicated, and should definitely have been a part of the standard testing of the framework. If something this simple can slip through, it makes me wonder about what else might have broken that we don't even know about yet.

I agree that these bugs are surprising. In the past, Microsoft has been almost fanatically obsessed with backward compatibility, to their own detriment I sometimes feel. But in this "service pack" (which to any developer is obviously not a service pack but a minor version release), there have been numerous major breakages. I don't know if it is related to the release of SQL Server 08, although that would make sense; but I do know that if they make a habit of it they are suddenly going to find themselves with a shrinking developer base.

I am going to withhold judgment about how Microsoft handles these issues until we actually see how they handle them. I agree, optional hotfixes are clearly not sufficient; hopefully they realize that and release these fixes as critical updates.
