Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.

Strange production errors


The following code causes a really strange error in production:

new MailAddress("test@gmail.​com");

The specified string is not in the form required for an e-mail address.

Huh?!

Obviously it is!

After immediately leaping to the conclusion that .NET is crap and I should immediately start writing my own virtual machine, I decided to dig a little deeper:

Character Code
t 116
e 101
s 115
t 116
@ 64
g 103
m 109
a 97
i 105
l 108
. 46
? 8203
c 99
o 111
m 109

8203 is the decimal code for U+200B, the zero-width space.
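
A table like the one above can be produced with a few lines of C#. This is only a minimal sketch: the `CharDump` class and its `Describe` helper are hypothetical names chosen for illustration, not something from the original investigation. The zero-width space is written as the escape `\u200B` so that it is visible in the source.

```csharp
using System;

public static class CharDump
{
    // Hypothetical helper: describes one character as
    // "<char> <decimal code> U+<hex code>".
    // Anything outside printable ASCII is shown as '?',
    // because it may be invisible on screen.
    public static string Describe(char c)
    {
        char shown = (c >= 32 && c < 127) ? c : '?';
        int code = c;
        return $"{shown} {code} U+{code:X4}";
    }

    public static void Main()
    {
        // Same address as in the post, with the invisible character made explicit.
        string address = "test@gmail.\u200Bcom";

        foreach (char c in address)
        {
            Console.WriteLine(Describe(c));
        }
    }
}
```

Running a suspicious string through something like this makes any character outside the printable ASCII range stand out immediately, which is usually enough to explain a "valid looking" input that fails validation.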

I guess that someone with a software testing background decided to get medieval on one of our systems.


Comments

Phil Jones

Holy crap!

I just debugged the exact same issue on my client's system.

We were all similarly scratching our heads until I had to use view source.

My solution:

// Remove HTML characters
email = Regex.Replace(email, "&#[0-9]+;", "");

(A bit hacky)

Itamar

This usually happens when you copy-paste from Word. That guy isn't too sophisticated, he is just lazy...

Nic Wise

We've just been dealing with something similar.

select id, catnum from table;

1 ABCD-1234
2 ABCD-1234

select id, '[' + catnum + ']' from table;

1 [ABCD-1234]
2 [ABCD-1234

(catnum is meant to be unique, too!)

Got some unicode nonsense going on in there somewhere.... I suspect a newline, but we still can't find it.

Mikkel Christensen

Another problem to watch out for when using the MailAddress constructor:

http://social.msdn.microsoft.com/forums/en-US/netfxnetcom/thread/2217c413-968f-4dcf-8035-45eaf2a3c609

Matt

I get this quite a lot in our databases. The source is usually legacy processes that rely on Excel spreadsheets/VBA for data loading (yuck).

Sina

This is quite a valid and common character in some languages, such as Persian (it is called Zero-Width Non-Joiner); it joins different parts of a single word when you don't want them to be separated by word wrapping. E.g., the following word contains a ZWNJ: می‌روم

Since it is a very common character in some languages, it often happens that somebody accidentally switches the keyboard language and enters it unintentionally.
