Ayende @ Rahien

Refunds available at head office

Cross Site Scripting and letting the framework deal with it

Rob Conery asks how the MS MVC platform should handle XSS attacks. In general, I feel that frameworks should do their best to ensure that to be secure by default. This means that I feel that by default, you should encode everything that comes from the user to the app. People seems to think that encoding inbound data will litter your DB with encoded text that isn’t searchable and consumable by other applications.

That may be the case, but consider, what exactly is getting encoded? Assuming that this is not a field that require rich text editing, what are we likely to have there?

Text, normal text, text that can roundtrip through HTML encoding without modifications.

HTML style text in most of those form fields are actually rare. And if you need to have some form of control over it, you can always handle the decoding yourself. Safe by default is a good approach. In fact, I have a project that uses just this approach, and it is working wonderfully well.

Another approach for that would be to make outputting HTML encoded strings very easy. In fact, it should be so easy that it would be the default approach for output strings.

Here, the <%= %> syntax fails. It translate directly to Response.Write(), which means that you have to take an extra step to get secured output. I would suggest changing, for MS MVC, the output of <%= %> to output HTML encoded strings, and provide a secondary way to output raw text to the user.

In MonoRail, Damien Guard has been responsible for pushing us in this direction. He had pointed out several places where MonoRail was not secure by default. As a direct result of Damien's suggestions, Brail has gotten the !{post.Author} syntax, which does HTML encoding. This is now considered the best practice for output data, as well as my own default approach.

Due to backward comparability reasons, I kept the following syntax valid: ${post.Author}, mainly because it is useful for doing things like output HTML directly, such as in: ${Form.HiddenField("user.id")}. For the same reason, we cannot automatically encode everything by default, which is controversial, but very useful.

Regardless, having a very easy way ( !{post.Author} ) to do things in a secure fashion is a plus. I would strongly suggest that the MS MVC team would do the same. Not a "best practice", not "suggested usage", simply force it by default (and allow easy way out when needed).

Comments

Andrew Hallock
12/20/2007 08:33 PM by
Andrew Hallock

I agree with the latter part of your post completely. I couldn't quite understand what you were saying about encoding input data. I've always been of the mindset that you html decode input data and html encode user-generated output; otherwise you'll double encode input.

Text, normal text, text that can roundtrip through HTML encoding without modifications.

People use the & entity more than I would've thought.

Matei
12/20/2007 09:16 PM by
Matei

To address the <%= %> issue you could maybe build a custom ExpressionBuilder that would encode your output. <%$ Encode:myText %> I wonder if that would work.

Kiliman
12/20/2007 10:02 PM by
Kiliman

Let's say you have a text field for company name.

User enters: AT&T

Data is encoded in database as AT&T

User wants to edit so we encode on output and we have

which looks like: AT&T (double-encoding :-O)

Now database is stored as AT&mp;T UGH!

This is why I feel that encoding should be done on output only.

P.S. I hope the example comes out right since there's no preview.

Andrey Shchekin
12/20/2007 10:07 PM by
Andrey Shchekin

Secure by default is an approach I tend to disagree with (though I like 'easily securable').

I see two problems with it.

The first one is that framework should not do too much assumptions about my code. The ASP.NET likes to assume that < and > are dangerous, and it is extremelly annoying for users -- since they are not trusted to write a > b, and extremelly annoying for me, since I have to remember to turn it off. I had similar problem in PHP, where it required me to dig at least two arcane variables to turn off the automatic escaping of singe quotes.

Another assumption is that I only do html or prefer html. I want to see ASP.NET view engine as a content-type-independent engine, and I always saw <%= %> as 'here be it'. Engine should not be unusable if I use text/plain instead of text/html and should not require workarounds to do it.

The second problem is that secure by default gives a developer a belief in automatic security. But it is much easier for me to show someone that they should think about security when I can do it by SQL injection in their login form or by implanting alerts in their comment pages. But when all simple bases are automatically covered, it is too easy to forget about more advanced steps that could cause the same breaches.

Kiliman
12/20/2007 10:07 PM by
Kiliman

Sorry, that should be AT&amp;T

But you get the point.

Ayende Rahien
12/20/2007 10:18 PM by
Ayende Rahien

Matei,

That is possible, it is just six times as awkward as simple <%= %>

Ayende Rahien
12/20/2007 10:26 PM by
Ayende Rahien

Andrey,

1/ I am using ASP.Net in web world for 99.9% of the time. All considerations for that should take this into account. I have no issue in making it harder to use if for other stuff is I get better results for the web.

2/ Saying that secure by default makes is "too easy to forget about more advanced steps that could cause the same breaches" means that you are now putting all the burden on the developer. They will be so busy implementing the simple layer of security that they will never have time to the more advance scenarios.

Ayende Rahien
12/20/2007 10:28 PM by
Ayende Rahien

Kiliman,

Good point, and I agree. But I would rather have "dirty input" problem rather than security issues.

Damien Guard
12/20/2007 11:10 PM by
Damien Guard

Encoding on input is totally unnecessary and causes confusion and problems trying to understand what to output.

It's not just HTML but " ' <> etc. A lot of people have ' in their name and if people have written HTML 4 style entities such as '> then something can easily break out without the encoding.

[)amien

Sergio Pereira
12/21/2007 01:07 AM by
Sergio Pereira

I agree with the proposition. How often do you have the opportunity to pick the safer approach by design....

@matei

It would be simpler to add an extension method:

public static string H(this Control control, string text){

return HttpUtility.HtmlEncode(text);

}

Then your pages/ascx: (provided the namespace was included in your pages or web.config)

<%= H( unsafeHere ) %>

Brian Chavez
12/21/2007 01:07 AM by
Brian Chavez

typeo...

XSS is cross-site scripting, not cross-side. :)

Ayende Rahien
12/21/2007 01:10 AM by
Ayende Rahien

But my security is already alerted to look for cross eyed scripts!

Andrey Shchekin
12/21/2007 07:37 AM by
Andrey Shchekin

1/ I am using ASP.Net in web world for 99.9% of the time. All considerations for that should take this into account. I have no issue in making it harder to use if for other stuff is I get better results for the web.

This is where I have no arguments but still would never agree. I can not think of a place in ASP.NET basic markup that is specifically made for HTML. I may like to generate CSS, I may even like to generate JS. They are Web as well, just with different mime types.

2/ Saying that secure by default makes is "too easy to forget about more advanced steps that could cause the same breaches" means that you are now putting all the burden on the developer. They will be so busy implementing the simple layer of security that they will never have time to the more advance scenarios.

No, because I am saying that simple security should be also simple to implement and be the default best practice aproach (but not framework-enforced default). Like using SqlCommand arguments in ADO.NET is an easy best practice to prevent SQL Injection (which may occasionally teach developer on what the SQL Injection is).

It is not too hard to write <%= H( unsafeHere ) %> each time you want to escape something, isn't it?

You have worked with framework that likes to think for you and hide simple details -- that's ASP.NET Web Forms.

Ayende Rahien
12/21/2007 07:43 AM by
Ayende Rahien

Andrey,

Implementing simple security may be simple to implement, but it is tedious.

Passing parameters to the DB is a good example of very tedious coding.

Andrey Shchekin
12/21/2007 01:26 PM by
Andrey Shchekin

Ok, it seems my conceptual problem is with levels of abstraction. I do not have to pass paramaters by hand, I would definitely pass parameters NHibernate or my own NIH framework. But the lowest level (ADO.NET, ASP.NET markup) should be as simple as a blackboard -- I do not want to configure and mess with it, I want to build on it.

Steve
12/21/2007 03:06 PM by
Steve

I would love it if the default behaviour of <%= ... %> was to HTML-encode the result, since that's (a) safe and (b) what you want 90% of the time. Why should the default be to do a dangerous and unusual thing?

Here's a quick mechanism for changing the <%= ... %> behaviour yourself: http://blog.codeville.net/2007/12/19/aspnet-mvc-prevent-xss-with-automatic-html-encoding/

The nice thing about the above method is that the developer doesn't even need to choose whether or not to encode the output the vast majority of times - the decision is made in terms of the parameter passed. Normal strings are encoded, but the output from HTML helper methods isn't.

Jeremy Gray
12/21/2007 05:35 PM by
Jeremy Gray

Html-encoding on input is a an example of a badly-leaked abstraction that creates a variety of downstream problems. The leak itself is that you've allowed the fact that your resulting presentation is html-based to leak into the data captured in your database. If your presentation technology requires that application data be encoded for display, encode it only at the point of displaying it.

Comments have been closed on this topic.