Ayende @ Rahien

Strange production errors

The following code caused a really strange error in production:

new MailAddress("test@gmail.​com");

The specified string is not in the form required for an e-mail address.

Huh?!

Obviously it is!

After immediately leaping to the conclusion that .NET is crap and I should immediately start writing my own virtual machine, I decided to dig a little deeper:

Character Code
t 116
e 101
s 115
t 116
@ 64
g 103
m 109
a 97
i 105
l 108
. 46
? 8203
c 99
o 111
m 109

8203 is the decimal value of U+200B, the zero width space.
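
For anyone who wants to reproduce a table like the one above, here is a minimal sketch of how such a dump could be produced. The class and method names (CharDump, Describe) are made up for illustration and are not the code that generated the table. It also assumes the runtime classifies U+200B as a Unicode "Format" character, which is what current Unicode data does.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not taken from the original post.
public static class CharDump
{
    // Returns one entry per UTF-16 code unit: the character, or '?' when it
    // is a format/control character with no visible glyph, followed by its
    // decimal code.
    public static List<string> Describe(string input)
    {
        var lines = new List<string>();
        foreach (char c in input)
        {
            UnicodeCategory category = char.GetUnicodeCategory(c);
            bool invisible = category == UnicodeCategory.Format
                          || category == UnicodeCategory.Control;
            lines.Add((invisible ? '?' : c) + " " + (int)c);
        }
        return lines;
    }
}
```

Calling CharDump.Describe("test@gmail.\u200Bcom") and printing each entry makes the extra character show up, because the escape \u200B spells out in source what the pasted address hides.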

I guess that someone with a software testing background decided to get medieval on one of our systems.

Comments

Phil Jones (05/04/2012 09:11 AM)

Holy crap!

I just debugged the exact same issue on my client's system.

We were all similarly scratching our heads until I resorted to viewing the source.

My solution:

// Remove HTML characters
email = Regex.Replace(email, "&#[0-9]+;", "");

(A bit hacky)
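
The regex above only helps when the character arrives as an HTML numeric entity such as &#8203;. When the raw character itself is in the string, one possible variation is to drop every Unicode "Format" character before validating. This is only a sketch under assumptions: EmailCleaner and StripFormatCharacters are hypothetical names, and whether silently removing characters is acceptable depends on the system.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of .NET or of the original post.
public static class EmailCleaner
{
    // Drops every character in the Unicode "Format" (Cf) category, which
    // includes U+200B (zero width space), U+200C (zero width non-joiner)
    // and U+FEFF (byte order mark). All other characters are kept as-is.
    public static string StripFormatCharacters(string input)
    {
        var cleaned = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (char.GetUnicodeCategory(c) != UnicodeCategory.Format)
                cleaned.Append(c);
        }
        return cleaned.ToString();
    }
}
```

The cleaned string can then be passed to the MailAddress constructor as usual. Because this also removes characters that are legitimate in ordinary text, it is probably best kept to fields like e-mail addresses rather than applied to free text.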

Itamar (05/04/2012 09:11 AM)

This usually happens when you copy-paste from Word. That guy isn't too sophisticated; he is just lazy...

Nic Wise (05/04/2012 09:17 AM)

We've just been dealing with something similar.

select id, catnum from table;

1 ABCD-1234
2 ABCD-1234

select id, '[' + catnum + ']' from table;

1 [ABCD-1234]
2 [ABCD-1234

(catnum is meant to be unique, too!)

Got some Unicode nonsense going on in there somewhere... I suspect a newline, but we still can't find it.

Mikkel Christensen (05/04/2012 09:26 AM)

Another problem to watch out for when using the MailAddress constructor:

http://social.msdn.microsoft.com/forums/en-US/netfxnetcom/thread/2217c413-968f-4dcf-8035-45eaf2a3c609

Matt (05/04/2012 09:52 AM)

I get this quite a lot in our databases. The source is usually legacy processes that rely on Excel spreadsheets/VBA for data loading (yuck).

Sina (05/04/2012 12:28 PM)

This is a perfectly valid and common character in some languages, such as Persian (it is called the Zero-Width Non-Joiner), and it joins different parts of a single word when you don't want them to get separated by word wrapping. E.g., the following word contains a ZWNJ: می‌روم

Since it is a very common character in some languages, it can easily happen that somebody accidentally switches the keyboard language and enters it unintentionally.
