GCHQ Challenge Solution – Stage 1

I heard about “the code” at www.canyoucrackit.co.uk during the first friday of december (2011-12-02), and cracked the final stage on sunday two days later. The reason for not cracking it all during friday evening was, unfortunately, not because it presented much of a challenge but because I was out partying pretty much 24/7 during friday and saturday (including after parties until dawn both nights). 🙂

I didn’t want to publish any writeup until the challenge was over, even though I heard that there were others who didn’t bother waiting. As a man who loves challenges of all kinds, I would hate to be the one spoiling it for someone who actually wanted to give it a try though.

The first part of the code was this picture:

I recognized it as x86-assembler at the first glance, because of the “eb 04” (jmp $+6), the multiple “31 cX” (xor reg, reg), the “90”:s in the end (NOPs) and last but not least the “cd 80” (int 0x80) to mention a few of the immediately recognizable opcodes. The int 0x80 also tells me that this is most likely a Linux-based payload, since 0x80 is the interrupt used for making system calls in Linux. Looking at the code, I could see that it actually just calls exit(0) after the relevant code has been executed.

By looking at the code in a disassembler I could see that it has a decode-loop, that looks very much like the RC4-cipher (initializing a 256 byte buffer with values 0 to 255, swapping values around based on what corresponds to the RC4-key (0xdeadbeef) followed by a decryption loop. Not that it really matters which crypto is being used, since the key is embedded into the code.

Looking closer at the decryption loop we can see that it is actually slightly different from the original RC4 cipher, even though the same key scheduling is used.

Original RC4 can be implemented like this:

This implementation changes j to the xor-byte in each iteration:

This is actually pretty funny, considering that GCHQ themselves have published an (very brief) explanation of the challenge now after it’s over, which describes the crypto as RC4. So from what I can tell, this is an unintentional bug rather than by design. A mistake that is easily made when you’re hand coding assembler, and not being careful of which registers are used for what, but a rather silly mistake nonetheless.

Anyway. Analyzing the code further we can see that the payload seems to be missing some important parts, like the buffer that is to be decrypted for instance. This, on the other hand, is definitely by design. 🙂

See the “AAAA” (41 41 41 41) at the end of the payload? That’s a tag that is checked by this piece of code:

So far so good. If the tag isn’t there, it jumps ahead to the part of the code that calls exit(0):

Then it continues with checking for another tag (“BBBB” = 42 42 42 42), this tag is _not_ included in the payload:

Of course, we can just append this tag to the code to bypass it. This doesn’t really help us much though, since the code then continues with reading a byte count to decrypt, followed by the actual encrypted data:

After this follows the code that actually performs the decryption, and finally it calls exit(0) just like when the tag on the stack is not found. The exit(0) could be replaced by code that writes the decrypted data to stdout, or we can simply set a breakpoint there (or manually insert a 0xcc = int3 there instead) so we can read the decrypted data in a debugger.

So, where is the missing data? My first thought was that the data is probably either hidden in the HTML code of the www.canyoucrackit.co.uk page, or that it is hidden within the image of the payload. Hiding it within the image of the payload seemed pretty likely, since it actually was a bit odd that they used an image at all instead of plain text that would allow for copy & paste instead of manually writing the payload down from the image.

There are several ways to hide data within an image, some more refined than others. The easiest way to do it is to simply inject the data to the hide into some form of meta data. Using “exiftool” we can extract the following text field (although I first noticed it with a simple “strings”):

This is quite obviously base64-encoded data, judging from the “==” at the end and the range of characters being used. The following command line reveals its decoded contents:

As you can see, it starts with the “BBBB” that the payload looked for after the end of the payload, followed by an 32-bit LSB integer (32 00 00 00 = 0x00000032 = 50 = The size of the encrypted buffer), and finally the 50 bytes of encrypted data.

In other words, the full payload is as follows:

By embedding this payload into a small C program  and injecting a breakpoint at the exit(0) call by manually changing the “cd” in “cd 80” (int 0x80, the system call interrupt) to “cc” (int3) lets me read the decrypted string in a debugger.

This gives us the URL for the next stage of the challenge