Must resist... decoding
Rob Conery has posted an interesting poem. I wouldn't really mind, except that he posted it in binary, which meant that I really had to figure it out.
Can you resist the urge?
1110111 1101000 1111001 100000 1100100 1101001 1100100 100000 1111001 1101111 1110101 100000 1101101 1100001 1101011 1100101 100000 1101101 1100101 100000 1100100 1100101 1100011 1101111 1100100 1100101 100000 1100010 1101001 1101110 1100001 1110010 1111001 100000 1100100 1100001 1110100 1100001 111111
Comments
No, I couldn't resist the urge... but it did give me an opportunity to give my Python scripting a quick refresher course.
...no I couldn't either...
StringBuilder sb = new StringBuilder();
// Assumes each line of the file holds one character code written in base 2.
foreach (string line in File.ReadAllLines(@"C:\temp\poem.txt"))
    sb.Append((char) Convert.ToByte(line, 2));
Console.Out.WriteLine("{0}", sb);
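For comparison, here is a minimal Python sketch of the same decoding idea. It assumes the codes are separated by whitespace (spaces or newlines) rather than strictly one per line, and the function name `decode_binary` is made up for this example.

```python
def decode_binary(text: str) -> str:
    """Decode whitespace-separated base-2 character codes into a string."""
    # int(token, 2) parses a base-2 string, much like Convert.ToByte(line, 2).
    return "".join(chr(int(token, 2)) for token in text.split())


if __name__ == "__main__":
    # The first three codes of the message in the post.
    print(decode_binary("1110111 1101000 1111001"))  # prints "why"
```

Because `str.split()` with no arguments splits on any whitespace, the same function also handles a file with one code per line.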
"There's no earthly way of knowing..."
I noticed that Rob's message contains only characters from the first half of the ASCII table, so the eighth bit of every byte is always 0.
...and it seems Ayende noticed it too here... :) His message uses 7 bits per character (not 8).
For anyone who wants the answer, it's here (come on... decode it yourself... it's fun...):
http://dnagir.blogspot.com/2007/10/matrix-can-you-read-it.html
Cheers.
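The observation about 7 bits per character is easy to check. The Python sketch below is only an illustration (the helper name `to_bits` is an assumption of this example): converting a code point to base 2 without padding drops leading zeros, so ASCII letters come out as 7 digits and a space as 6.

```python
def to_bits(ch: str, width: int = 0) -> str:
    """Return the code point of ch in base 2, zero-padded to width digits."""
    return format(ord(ch), "b").zfill(width)


if __name__ == "__main__":
    print(to_bits("w"))     # 1110111  (7 digits, leading zero dropped)
    print(to_bits(" "))     # 100000   (6 digits)
    print(to_bits("w", 8))  # 01110111 (padded to a full byte)
    # Every ASCII code is below 128, so the eighth bit is never set.
    print(all(ord(c) < 128 for c in "why did you"))  # True
```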
1000010 1100101 1100011 1100001 1110101 1110011 1100101 100000 1100010 1101001 1101110 1100001 1110010 1111001 100000 1101001 1110011 100000 1110100 1101000 1100101 100000 1101110 1100101 1110111 100000 1010010 1110101 1100010 1111001 101110 100000 1010010 1110101 1100010 1111001 100000 1110111 1100001 1110011 100000 1110011 1101111 100000 1010011 1100101 1110000 1110100 1100101 1101101 1100010 1100101 1110010 100000 110010 110000 110000 111000
Russel, LOL.
Dmitriy,
That is because we are using Unicode, but talking in English.
I don't think it has anything to do with Unicode. The eighth bit is just what Extended ASCII uses: http://en.wikipedia.org/wiki/Extended_ASCII
Unicode would in most cases take more than one byte per character, and it would not be so easy to decode here :)
But it does have something to do with the fact that we are all talking in English :)
Hm, I just tested it with Hebrew, this is the output.
1110111 1101000 1111001 100000 1111111111111101 1111111111111101 100000 1111111111111101 1111111111111101 1111111111111101 1111111111111101 100000 1111111111111101 1111111111111101 100000 1111111111111101 1111111111111101 1111111111111101 1111111111111101 1111111111111101 111111
If only I could read Hebrew :) Maybe I'll meet someone one of these days who will translate it for me :)
This time you used some kind of Unicode (probably variable-length UTF-8?). It's still possible to fit each Hebrew character in a single byte: http://en.wikipedia.org/wiki/Windows-1255
Decoding can get a bit more complicated if you choose a "secret" encoding :)
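To make the single-byte versus multi-byte point concrete, here is a small Python sketch. It uses the letter "ח" purely as a sample and relies on Python's built-in `utf-8`, `utf-16-le` and `cp1255` (Windows-1255) codecs; the helper name `encoded_bits` is invented here, and none of this is taken from the code used in the thread.

```python
SAMPLE = "\u05d7"  # the Hebrew letter het, used only as an example


def encoded_bits(text: str, encoding: str) -> str:
    """Encode text and show each resulting byte as 8 binary digits."""
    return " ".join(format(b, "08b") for b in text.encode(encoding))


if __name__ == "__main__":
    for name in ("utf-8", "utf-16-le", "cp1255"):
        print(name, encoded_bits(SAMPLE, name))
    # utf-8 and utf-16-le need two bytes for this letter; cp1255 needs one.
```

Whoever decodes the message has to know which of these encodings was used, which is what makes a "secret" encoding harder to crack.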
I think you did something wrong there...
Also, I searched endlessly for the simplest way to parse a binary number. You'd think it'd be somewhere in Byte.Parse or Int32.Parse, but no... of course it's in Convert.ToByte(string, int). Go figure.
Here is my secret algorithm:
cy = "string to code"
for c in cy:
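The snippet above stops at the loop header, so the actual loop body is unknown. As a guess at the general shape only, a Python encoder along these lines produces output in the same format as the messages in this thread; the function name `encode_binary` and the choice of a space separator are assumptions.

```python
def encode_binary(text: str) -> str:
    """Write each character's code point in base 2, separated by spaces."""
    return " ".join(format(ord(ch), "b") for ch in text)


if __name__ == "__main__":
    print(encode_binary("why"))  # 1110111 1101000 1111001
```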
Hebrew has the longest bytes I've ever seen: 1111111111111101 LOL

Bil,
you realize that you should have put spaces there, right? You had me re-write my decoding code.
And yes, as a matter of fact, I would. Do you do dry cleaning as well?
@Ayende: your "secret algorithm" has a bug, which makes decoding less than intuitive. You should've cast your chars to short rather than int, since the native encoding is UTF-16.
Now repeat after me as I repeat after you: System.Text.Encoding.UTF8.GetBytes().
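A rough Python counterpart to going through `System.Text.Encoding.UTF8.GetBytes()`: encode the text to UTF-8 bytes first, write every byte as 8 binary digits, and reverse the steps to decode. The helper names are invented for this sketch, and it assumes the reader knows the message is UTF-8.

```python
def to_utf8_bits(text: str) -> str:
    """Encode text as UTF-8 and write every byte as 8 binary digits."""
    return " ".join(format(b, "08b") for b in text.encode("utf-8"))


def from_utf8_bits(bits: str) -> str:
    """Reverse to_utf8_bits: parse each token as a byte, then decode UTF-8."""
    return bytes(int(token, 2) for token in bits.split()).decode("utf-8")


if __name__ == "__main__":
    message = "why \u05d7"  # ASCII plus one Hebrew letter as a sample
    bits = to_utf8_bits(message)
    print(bits)
    print(from_utf8_bits(bits) == message)  # True
```

Working on bytes rather than on characters sidesteps the question of how wide a character is, since every token is exactly one byte.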
You are correct, except that Convert.ToString() will take care of it.
I still think this will get you signed/unsigned problems. Anyway, the only non-ASCII character in your code is the repeated 1111111111111101, which doesn't translate to a Hebrew character. I'm getting this:
"why ?? ???? ?? ??????" (the question marks are unidentified Unicode characters)
My secret algorithm is this:
for line in System.IO.File.OpenText(argv[0]):
Avish, you cannot print Hebrew characters to the console.
Write them to a file, it should come up okay.
I was writing them to a file (piping not shown in original code). Gimme a little credit here :)
Also, if they are Hebrew characters, they're all the same single Hebrew character. The only Hebrew character I can think of that can come in so many repetitions is "ח", and I doubt you use that expression.
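The repeated value can be identified directly: 1111111111111101 is 0xFFFD, the Unicode replacement character that decoders substitute for bytes they cannot interpret. The Python sketch below checks that and shows one way such characters can appear. The Windows-1255-read-as-UTF-8 mismatch is only an assumed scenario, not something the thread confirms, and `misread_as_utf8` is a hypothetical helper.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def misread_as_utf8(text: str, actual_encoding: str = "cp1255") -> str:
    """Encode text with one codec, then (wrongly) decode the bytes as UTF-8."""
    return text.encode(actual_encoding).decode("utf-8", errors="replace")


if __name__ == "__main__":
    code = int("1111111111111101", 2)
    print(hex(code))                 # 0xfffd
    print(chr(code) == REPLACEMENT)  # True

    # Assumed scenario: Hebrew text saved as Windows-1255 but read as UTF-8.
    garbled = misread_as_utf8("\u05d7\u05d7")
    print(" ".join(format(ord(ch), "b") for ch in garbled))
```

If something like this happened, the original letters were already lost before the text was converted to binary, which would explain why every non-ASCII position decodes to the same character.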