Ayende @ Rahien

http://ayende.com
Copyright (C) Ayende Rahien 2004 - 2021 (c) 2026

James_2JS commented on Finding chrome bugs
1) I ran the code, and did this with a fresh install of Chrome... by default the page encoding was set to Unicode (UTF-8). Choosing auto-detect and re-running removed the BOM.
2) You can force the removal of the BOM by changing the code to this: using (StreamWriter writer = new StreamWriter(gzip, new UTF8Encoding(false)))
http://ayende.com/4600/finding-chrome-bugs#comment17
Mon, 23 Aug 2010 14:44:58 GMT

Ayende Rahien commented on Finding chrome bugs
I love it how no one has actually run the code. Guys, if I use the charset=utf-8, there is still a problem. So yes, it is a bug.
http://ayende.com/4600/finding-chrome-bugs#comment16
Mon, 23 Aug 2010 09:40:00 GMT

Daniel Fernandes commented on Finding chrome bugs
I experienced nasty bugs with Chrome in the past too. I guess Chrome could do with more if statements ;)
http://ayende.com/4600/finding-chrome-bugs#comment15
Mon, 23 Aug 2010 08:33:54 GMT

PandaWood commented on Finding chrome bugs
I'd like to shake it up a bit and go with the argument that this is a Chrome bug (or unacceptably dumb behaviour). Software (text editors) that can't handle the BOM is usually referred to as "Older Software" - from Wikipedia: "Older text editors may display the BOM as "ï»¿" at the start of the document, even if the UTF-8 file contains only ASCII and would otherwise display correctly". So in this case, the document would display correctly if Chrome were simply able to recognise the BOM, ignore it and read the remaining text. That doesn't sound like much to expect from software written sometime after 2000...?
So, I would ask: is there any real excuse for modern software to fail to interpret the BOM and therefore leave the page in the state shown above (i.e. completely broken)? Is it not an "obvious" requirement to be able to interpret BOM and no BOM in UTF-8?
http://ayende.com/4600/finding-chrome-bugs#comment14
Mon, 23 Aug 2010 04:53:13 GMT

Frank commented on Finding chrome bugs
Hmmm, I'll have to correct myself about the BOM. The browser needs to check it based upon the null characters.
http://ayende.com/4600/finding-chrome-bugs#comment13
Sun, 22 Aug 2010 17:03:38 GMT

Frank commented on Finding chrome bugs
My goodness, what a bunch of crap commentary directed at Ayende. Have a look at the RFC 4627 standard, third part about encoding. http://www.faqs.org/rfcs/rfc4627.html
"JSON text SHALL be encoded in Unicode. The default encoding is UTF-8. Since the first two characters of a JSON text will always be ASCII characters [RFC0020], it is possible to determine whether an octet stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking at the pattern of nulls in the first four octets."
In other words, no charset that you need to specify in the headers. The BOM will specify the encoding.
http://ayende.com/4600/finding-chrome-bugs#comment12
Sun, 22 Aug 2010 17:02:11 GMT

humpbacked lout commented on Finding chrome bugs
The best part is the title: how to elevate responsibility from your own lameness to someone else (Chrome in this case). The more posts I read from this person, the more I see a disguised lamer. Not only was there a lack of charset in the declaration (which is a violation of the standard) but also a lack of the (lame but working) solution with the StreamWriter constructor that explicitly specifies no BOM. I think there also was no clue what BOM is...
http://ayende.com/4600/finding-chrome-bugs#comment11
Sun, 22 Aug 2010 15:43:28 GMT

tobi commented on Finding chrome bugs
Maybe I am the real Joel Spolsky in disguise of a nickname... You will never know for sure ;-)
http://ayende.com/4600/finding-chrome-bugs#comment10
Sun, 22 Aug 2010 14:09:59 GMT

Daniel Steigerwald commented on Finding chrome bugs
Tobi, do you remember that nice article http://www.joelonsoftware.com/items/2008/03/17.html :-)
http://ayende.com/4600/finding-chrome-bugs#comment9
Sun, 22 Aug 2010 13:14:29 GMT

tobi commented on Finding chrome bugs
Ok, I am the 10th person to confirm: It is the BOM-header^^ Such bugs make me believe that it would be very beneficial for most standards to have a reference implementation. That way the standards body can detect mistakes by themselves and implementers hopefully get even such details right.
http://ayende.com/4600/finding-chrome-bugs#comment8
Sun, 22 Aug 2010 12:49:24 GMT

Mike Scott commented on Finding chrome bugs
Itamar is right, content type should include the charset: context.Response.ContentType = "application/json; charset=utf-8"; See http://www.w3.org/International/O-HTTP-charset
http://ayende.com/4600/finding-chrome-bugs#comment7
Sun, 22 Aug 2010 11:57:21 GMT

configurator commented on Finding chrome bugs
Like everyone said it's the BOM. Chrome shows everything for encodings that don't have specific rules about not showing them, like text/plain; application/json is good for applications, not for showing the text. Why is this a problem? Does the json not get parsed properly? A charset header should fix it - chrome is probably using the wrong charset here.
http://ayende.com/4600/finding-chrome-bugs#comment6
Sun, 22 Aug 2010 11:56:20 GMT

Itamar Syn-Hershko commented on Finding chrome bugs
@Rik, BOM identifies the encoding used for a stream of text. It is good to have whenever you are fetching a textual stream - from FS or not. @Ayende, try adding a charset header. Apparently all other browsers detect the BOM even when it isn't provided, although Chrome is perfectly alright when not doing so. Not providing a BOM is possible, but you may hit walls later on when this code is used with other encodings (UTF16/32 for CJK for example).
http://ayende.com/4600/finding-chrome-bugs#comment5
Sun, 22 Aug 2010 10:52:28 GMT

Rik Hemsley commented on Finding chrome bugs
I haven't checked the relevant RFCs, but as others have said, looks like a BOM where there shouldn't be one. As far as I am aware, BOMs are for files only. Oh and this blog software still doesn't remember me properly. And no it's not a bug in my browser.
http://ayende.com/4600/finding-chrome-bugs#comment4
Sun, 22 Aug 2010 10:13:14 GMT

13xforever commented on Finding chrome bugs
I agree with anton. You'd be better off with new UTF8Encoding(false)
http://ayende.com/4600/finding-chrome-bugs#comment3
Sun, 22 Aug 2010 09:58:30 GMT

anton commented on Finding chrome bugs
seems like a UTF8 BOM, clean up the files that introduce this and you'll be good to go
http://ayende.com/4600/finding-chrome-bugs#comment2
Sun, 22 Aug 2010 09:30:30 GMT

Igal Tabachnik commented on Finding chrome bugs
I had something like this happen when I saved a batch file using Notepad2, which defaulted to "UTF-8 with Signature". What you're seeing is the BOM (byte order mark)...
http://ayende.com/4600/finding-chrome-bugs#comment1
Sun, 22 Aug 2010 09:13:00 GMT
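
For readers following the thread, here is a rough byte-level sketch of the two points that keep coming up: what the UTF-8 BOM looks like when it is misread as a single-byte encoding, and the "pattern of nulls" check quoted from RFC 4627. It is illustrative Python rather than the C# from the post. The function names detect_json_encoding and decode_json_text are hypothetical, and treating a leading UTF-8 BOM as "utf-8-sig" is an assumption added on top of the RFC's table, which does not mention a BOM.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF. Read as Latin-1 they show up
# as the stray characters quoted in the comments above.
BOM_AS_LATIN1 = codecs.BOM_UTF8.decode("latin-1")


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON octet stream (RFC 4627, section 3)."""
    # Assumption: a leading UTF-8 BOM is tolerated and stripped on decode.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    head = data[:4]
    if len(head) < 4:
        return "utf-8"  # too short to apply the null-pattern table
    nulls = tuple(b == 0 for b in head)
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    return patterns.get(nulls, "utf-8")           # xx xx xx xx


def decode_json_text(data: bytes) -> str:
    """Decode JSON bytes using the guessed encoding."""
    return data.decode(detect_json_encoding(data))


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + b'{"a":1}'
    print(repr(BOM_AS_LATIN1))
    print(detect_json_encoding(with_bom), decode_json_text(with_bom))
```

On the writer side, the thread's suggestion of new UTF8Encoding(false) avoids emitting those three bytes in the first place, so a reader never has to make this guess.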