ChallengeCan you spot the bug?
I have this piece of code:
public static string FirstCharacters(this string self, int numOfChars) { if (self == null) return ""; if (self.Length < numOfChars) return self; return self .Replace(Environment.NewLine, " ") .Substring(0, numOfChars - 3) + "..."; }
And the following exception was sent to me:
System.ArgumentOutOfRangeException: Index and length must refer to a location within the string. Parameter name: length at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy) at StringExtensions.FirstCharacters(String self, Int32 numOfChars)
It actually took me a while to figure out what is going on. Partly that is because we have an “impossible” stack trace (the JIT seems to have inlined the Substring call). When I did figure what the bug was, i is a head slapping moment.
More posts in "Challenge" series:
- (03 Feb 2025) Giving file system developer ulcer
- (20 Jan 2025) What does this code do?
- (01 Jul 2024) Efficient snapshotable state
- (13 Oct 2023) Fastest node selection metastable error state–answer
- (12 Oct 2023) Fastest node selection metastable error state
- (19 Sep 2023) Spot the bug
- (04 Jan 2023) what does this code print?
- (14 Dec 2022) What does this code print?
- (01 Jul 2022) Find the stack smash bug… – answer
- (30 Jun 2022) Find the stack smash bug…
- (03 Jun 2022) Spot the data corruption
- (06 May 2022) Spot the optimization–solution
- (05 May 2022) Spot the optimization
- (06 Apr 2022) Why is this code broken?
- (16 Dec 2021) Find the slow down–answer
- (15 Dec 2021) Find the slow down
- (03 Nov 2021) The code review bug that gives me nightmares–The fix
- (02 Nov 2021) The code review bug that gives me nightmares–the issue
- (01 Nov 2021) The code review bug that gives me nightmares
- (16 Jun 2021) Detecting livelihood in a distributed cluster
- (21 Apr 2020) Generate matching shard id–answer
- (20 Apr 2020) Generate matching shard id
- (02 Jan 2020) Spot the bug in the stream
- (28 Sep 2018) The loop that leaks–Answer
- (27 Sep 2018) The loop that leaks
- (03 Apr 2018) The invisible concurrency bug–Answer
- (02 Apr 2018) The invisible concurrency bug
- (31 Jan 2018) Find the bug in the fix–answer
- (30 Jan 2018) Find the bug in the fix
- (19 Jan 2017) What does this code do?
- (26 Jul 2016) The race condition in the TCP stack, answer
- (25 Jul 2016) The race condition in the TCP stack
- (28 Apr 2015) What is the meaning of this change?
- (26 Sep 2013) Spot the bug
- (27 May 2013) The problem of locking down tasks…
- (17 Oct 2011) Minimum number of round trips
- (23 Aug 2011) Recent Comments with Future Posts
- (02 Aug 2011) Modifying execution approaches
- (29 Apr 2011) Stop the leaks
- (23 Dec 2010) This code should never hit production
- (17 Dec 2010) Your own ThreadLocal
- (03 Dec 2010) Querying relative information with RavenDB
- (29 Jun 2010) Find the bug
- (23 Jun 2010) Dynamically dynamic
- (28 Apr 2010) What killed the application?
- (19 Mar 2010) What does this code do?
- (04 Mar 2010) Robust enumeration over external code
- (16 Feb 2010) Premature optimization, and all of that…
- (12 Feb 2010) Efficient querying
- (10 Feb 2010) Find the resource leak
- (21 Oct 2009) Can you spot the bug?
- (18 Oct 2009) Why is this wrong?
- (17 Oct 2009) Write the check in comment
- (15 Sep 2009) NH Prof Exporting Reports
- (02 Sep 2009) The lazy loaded inheritance many to one association OR/M conundrum
- (01 Sep 2009) Why isn’t select broken?
- (06 Aug 2009) Find the bug fixes
- (26 May 2009) Find the bug
- (14 May 2009) multi threaded test failure
- (11 May 2009) The regex that doesn’t match
- (24 Mar 2009) probability based selection
- (13 Mar 2009) C# Rewriting
- (18 Feb 2009) write a self extracting program
- (04 Sep 2008) Don't stop with the first DSL abstraction
- (02 Aug 2008) What is the problem?
- (28 Jul 2008) What does this code do?
- (26 Jul 2008) Find the bug fix
- (05 Jul 2008) Find the deadlock
- (03 Jul 2008) Find the bug
- (02 Jul 2008) What is wrong with this code
- (05 Jun 2008) why did the tests fail?
- (27 May 2008) Striving for better syntax
- (13 Apr 2008) calling generics without the generic type
- (12 Apr 2008) The directory tree
- (24 Mar 2008) Find the version
- (21 Jan 2008) Strongly typing weakly typed code
- (28 Jun 2007) Windsor Null Object Dependency Facility
Comments
Environment.NewLine is a two-character string so Replace shortens the original...
It checks if the length of the string is the minimum length necessary to return the version ending in "...". However, after that you replace the newlines with spaces, which means the string you work on will be shorter (depending on the platform possibly). So, kaboom.
Correct?
I wonder if the bug would have happened in Mono (on Unix).
There's another simple one in there too, yes? Your "numOfChars - 3" isn't checking if numOfChars is greater than 3, which it must be or a diff. exception is thrown. The error you got would not indicate that this was the bug, but the previous two got it, I think, so I thought I'd point out this extra one.
Why do you replace newlines when the string is cut, but not when it's whithin max characters? isn't that an unexpected side effect?
ie.
max. characters is 144,
if I write 5 lines with -144 chars, newlines are remained.
if i write 5 lines with +144 chars all newlines are replaced.
huge impact when showed on screen.
what Rafal says.
Something like myString.ToSingleLine().FirstCharacters(100) would probably be more explicit and less error prone.
I know it... i get it everytime :D
you didn't check if the actual string length is greater than (or egual to) 3
substring can't manage a negative range (0 to -1/-2/-3, in that example)
in the length check, you have to put too another expression, if lenght is greater than 3 (and maybe use a constant for this number)
" ". FirstCharacters(1);
Roberto, SubString accepts 0 as its range; it simply takes no characters.
the solutions mentioned so far were pretty easy to spot so i guess it was more complex than that.
a) when there are many new lines the second parameter is larger than the length of the string.
b) when num of chars was less than 3 the length was negative
here is a proof that no other explanation can exist. from the error message we know that the substring call is the culprit. as the first argument is an always valid constant the second argument must be the reason. it can either be < 0 or > string.length. so a) and b) are exhaustive.
Although most bugs were already found:
Replacing newline after checking string length
The offset of 3 for the ellipsis isn't taken into account when checking the length
I think I've got another one. Nothing prevents the method to be called with a negative int, isn't it? This would also blow up, I guess.
Safer to use an unsigned int for the number of characters?
On a side note, I also don't like some things about the implementation in general.
For instance that a null reference returns an empty string. I'd prefer to see it return null instead.
Also, if the length == numOfChars, why don't you just return the string? It's not too long in this case.
@Roberto: I think it's not that much the length of the string that should be greater than 3. Rather than the length of the ellipsis that should depend on numOfChars. If someone requests numOfChars == 1, you cannot add an ellipsis of 3 (and take it in account for your substring). The requested numOfChars should be > 3 for this implementation to work.
I think Roberto is right for negative substring ...
I wrote a test program and to my surprise the following code
asdf
returns " ..."
don't know why
This is exactly the sort of code which is easy to cover with unit tests. The fact it obviously isn't is a big WTF.
Rik,
Why do you assume that it isn't covered in tests?
Because it's very easy to make it fail. Or are the tests useless?
Rik,
I am watching the thread going on and growing more & more amused by the second.
People are trying to make this into a very generic routine, when in fact it is used only in very specific circumstances.
The tests are there for those circumstances, not for generic behavior.
In which case the code isn't covered by unit tests... which is the WTF.
I am not quite sure how to respond to that, but I think I'll refer you to my recent posts about unit testing.
Substring(0, numOfChars - 3) will fail for numOfChars=0,1,,2 hence FirstCharacters(0), FirstCharacters(1),FirstCharacters(2) will give you exception , provided your original string is long enough to bypass first two if conditions
@ Ayende,
Well, in all honesty it should be treated as a very generic routine. You are adding it as an extension method to String so it's as public and available as it could possibly be.
You've tripped one bug, but as mentioned by Roberto, there are at least 2 issues in there. (newline replacement reduces the length by 1 for each replacement so it will crash after 4 or more CRLFs, and numOfChars cannot be less than 3.)
For your specific error, this will occur when numOfChars-3 > self.Length after the replacement of the NewLine with " ".
This requires that there be 4+ NewLines in the string (and using multibyte NewLine), and numOfChars is between string.Length and (string.Length - numOfNewLine) + 4
I didn't take time to read the comments, but here are my two cents:
It's preety easy to Spot: You don't check numOfChars > 3
Despite the Length & Offset issues, i find something a lot more troublesome:
The fact that a method "FirstCharacters" removes spaces and that it appends "..." in some situations is something that i would consider a very naste side-effect of a method that returns the first characters of a string.
The string length could be less than 3 charachters
Well I had Pex running over the code and it has found the ArgumentOutOfRangeException 2 times. Once for numOfChars 0 and once for numOfChars as a negative number.
Did I cheat, or did I find a use for Pex? ;-)
The second parameter of SubString() would be negative if numOfChars == 1 or 2...
You are missing another pre-condition:
If(numOfChars <0) { throw ArgumentException(); } or
If(numOfChars <0) { return "" }
depending on desired solution...
You check for a shorter string length than numOfChars but not for one with the same length (which you could immediately return without change). Another one is that numOfChars could be smaller than 3, which would also cause the same exception as the length parameter for the substring would negative.
if (self == null)
if(self.Length == 0)
the first problem I would find with the method is that it doesn't describe what it does in it's name.
@ Ayende,
So, are you saying the possible negative parameters for SubString is not the problem since this routine is not designed for that purpose anyway? If so, I am out of clue why it throws that error. Can you shed some light?
Yes, the bug is that on Win Environment.NewLine is 2-bytes, and you replace it with one-byte space, so, for instance, this code (5 line breaks in it):
@"
".FirstCharacters(10);
gives you an exception you've listed above.
M?
Get creative people! I say Ayende needs to replace what he has with code that calls Regex.Replace successively, each time building on the last Regex.Replace. Incremental design at its best!
So what's the answer I'm curious now!?
A string of size 256 with a "\r\n" somewhere in it
Comment preview