Tutorial: Extending Proteus
This page presents a simple extension to the processor in the form of an assignment. Completing this assignment can be a good way of getting familiar with the processor and its ecosystem.
For this project, you will be implementing a processor extension to protect against return address smashing attacks. The idea is simple: whenever a function call is made, the return address is encrypted. Before returning from a function, the return address is decrypted again.
The extension will be implemented in two steps. First, the encryption and decryption functionality is added while using a hard-coded key. Then, a new instruction should be implemented to dynamically set the key from software. But before creating this return address smashing defence, you are asked to create a proof-of-concept attack.
First, create a file smash.c with the following contents:
#include <stdio.h>
#include <stdint.h>
#include "interrupts.h"
void smash_this() {
    char buf[128] = {0};
    gets(buf);
    printf("stdin: ");
    for (size_t i = 0; buf[i] != 0; ++i)
        printf("%02x ", buf[i]);
    printf("\n");
}

int main() {
    enable_interrupts();
    smash_this();
}
This file contains a function that reads a string from the standard input
and then prints the hexadecimal ASCII values of all characters to the standard
output.
Since gets stores all input from the standard input until the next newline
into the destination buffer, this function is clearly vulnerable to a buffer
overflow attack.
Task 1: Craft a payload that, when provided as the standard input of the exploitable program, prints the string "p0wned!" to the standard output.
Some hints (which you do not have to follow):
- Embed the string in your payload and call puts with a pointer to this string;
- Run the program in the simulator and use GTKWave to find the value of the stack pointer when entering the smash_this function. This will give you the address of buf;
- Certain addresses (e.g., puts) need to be hard-coded;
- Make sure the only 0x0a (newline) byte in your payload is at the very end.
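The last hint can be checked mechanically before running the simulation. The following is a minimal sketch, not part of the Proteus repository; the function name newline_only_at_end is hypothetical. It relies only on the fact stated above that gets reads up to the next newline, so any earlier 0x0a byte would cut the payload short.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the payload ends with a newline (0x0a)
 * and contains no other 0x0a byte, and 0 otherwise. */
int newline_only_at_end(const unsigned char *payload, size_t len)
{
    /* An empty payload, or one without a trailing newline, is rejected. */
    if (len == 0 || payload[len - 1] != 0x0a)
        return 0;

    /* Scan every byte except the last one for a stray newline. */
    for (size_t i = 0; i + 1 < len; ++i)
        if (payload[i] == 0x0a)
            return 0;

    return 1;
}
```

One way to use it is to read the generated payload file into a buffer and reject it when the function returns 0.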
You can use the files in the newlib directory as a template for running the
simulation (run make smash.bin in this directory if smash.c was saved here).
The standard input of the simulator is attached to the standard input of the
program running in the simulator. Therefore, the program can be run as follows
(which might take a second or two):
$ echo 'Hello' | ../sim/build/sim smash.bin
stdin: 48 65 6c 6c 6f
If you implement your payload in shellcode.s, you can compile it to a
binary file that you can then supply as the input to the simulation.
First, create a linker script shellcode.ld:
OUTPUT_FORMAT("elf32-littleriscv", "elf32-littleriscv", "elf32-littleriscv")
OUTPUT_ARCH(riscv)
SECTIONS
{
    /* Read-only sections, merged into text segment: */
    PROVIDE (__executable_start = 0x0);
    . = 0x0;
    .text :
    {
        *(.text)
    }
}
Then, you can compile your payload as follows:
make shellcode.o
riscv32-unknown-elf-gcc -ffreestanding -nostdlib -T shellcode.ld -o shellcode.elf shellcode.o
make shellcode.bin
To be able to encrypt and decrypt return addresses, we have to identify when function calls and returns are made. Unfortunately, RISC-V does not define dedicated instructions for calls and returns and instead uses unconditional jumps in both cases. This means we need a heuristic to distinguish calls and returns from normal unconditional jumps.
All unconditional jump instructions in RISC-V store the return address in the
destination register. The RISC-V calling convention specifies the use of the
x1 (ra) register for the return address of function calls. Therefore,
unconditional jump instructions that store their return address in x1 can be
considered function calls. Try to find a similar heuristic for identifying
function returns (hint: look at the assembly generated by GCC).
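The call heuristic described above can be modelled in software before it is written as hardware. The following C function is a sketch for illustration only, not Proteus code; the name is_call is hypothetical. It uses the standard RV32I encoding, in which JAL has opcode 0x6f, JALR has opcode 0x67, and the destination register occupies bits 11 to 7 of the instruction word.

```c
#include <stdint.h>

#define OPCODE_JAL  0x6fu /* major opcode of JAL  */
#define OPCODE_JALR 0x67u /* major opcode of JALR */

/* Hypothetical model of the call heuristic: returns 1 if the instruction
 * word is an unconditional jump (JAL or JALR) whose destination register
 * is x1 (ra), and 0 otherwise. */
int is_call(uint32_t insn)
{
    uint32_t opcode = insn & 0x7fu;        /* bits [6:0]  */
    uint32_t rd     = (insn >> 7) & 0x1fu; /* bits [11:7] */

    return (opcode == OPCODE_JAL || opcode == OPCODE_JALR) && rd == 1;
}
```

For example, the word 0x000000ef (jal x1, 0) is classified as a call, while 0x0000006f (jal x0, 0) is a plain jump. The heuristic for returns is deliberately left out, since finding it is part of the assignment.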
For the jumps that are identified as calls, their return address should be
encrypted before being stored in their destination register. Similarly, for
returns, the target address should be decrypted before jumping to it. The goal
of this project is not to implement cryptographic primitives, so a simple xor
or similar operation suffices.
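To see why xor suffices for this purpose, here is a small software model of the scheme. It is a sketch under the assumption of 32-bit addresses and a 32-bit key; the function names encrypt_ra and decrypt_ra are hypothetical and do not appear in Proteus.

```c
#include <stdint.h>

/* Hypothetical model: encrypt a return address before it is written to
 * the destination register of a call. */
uint32_t encrypt_ra(uint32_t return_address, uint32_t key)
{
    return return_address ^ key;
}

/* Hypothetical model: decrypt the target address of a return. XOR is its
 * own inverse, so this is the same operation as encryption. */
uint32_t decrypt_ra(uint32_t encrypted_address, uint32_t key)
{
    return encrypted_address ^ key;
}
```

A return address that went through encrypt_ra decrypts back to the original value. An address that an attacker writes to the stack in plaintext is still passed through decrypt_ra on return, so with a non-zero key the jump target differs from the address the attacker chose. With a key of zero, both functions leave the address unchanged.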
Task 2: Add logic to the BranchUnit plugin for encrypting and decrypting return addresses. Use a simple cryptographic primitive, such as an xor operation. The key to be used for encrypting and decrypting values should be hard-coded.
In principle, the only code that needs to be changed is that of the BranchUnit
plugin in src/main/scala/riscv/plugins/BranchUnit.scala.
To be able to set the key used for encrypting return addresses from software, we need to add a new instruction. The RISC-V unprivileged ISA specification dedicates a full chapter (Chapter 35) to extending RISC-V. For our current purposes, however, Table 70 contains all necessary information. Every cell in this table marked with "custom-i" may be used to encode custom instructions.
The new instruction needs the 32-bit key as input and has no output value. This
means the R-type encoding can be used while ignoring the rs2 and rd fields
(which means setting them to fixed values).
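As an illustration of such an encoding, the sketch below assembles an R-type instruction word in C. Every concrete choice in it is an assumption made for this example and not something the assignment prescribes: the custom-0 major opcode (0x0b), funct3 and funct7 set to zero, rs2 and rd fixed to zero, and the function name encode_set_key.

```c
#include <stdint.h>

#define OPCODE_CUSTOM0 0x0bu /* assumed: the custom-0 major opcode */

/* Hypothetical encoder for a "set key" instruction in R-type format:
 * funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0].
 * Only rs1 carries information; all other fields are fixed to zero. */
uint32_t encode_set_key(uint32_t rs1)
{
    const uint32_t funct7 = 0; /* assumed fixed value */
    const uint32_t rs2    = 0; /* ignored field, fixed to zero */
    const uint32_t funct3 = 0; /* assumed fixed value */
    const uint32_t rd     = 0; /* ignored field, fixed to zero */

    return (funct7 << 25) | (rs2 << 20) | ((rs1 & 0x1fu) << 15) |
           (funct3 << 12) | (rd << 7) | OPCODE_CUSTOM0;
}
```

Under these assumptions, encode_set_key(10) yields 0x0005000b, an instruction that reads the key from x10 (a0). Since the toolchain does not know the new instruction, one option is to place the resulting word in an assembly file with a .word directive.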
The default value for the key should be zero and this should be interpreted as disabling the encryption and decryption of return addresses.
Task 3: Implement a new instruction that reads a key from rs1 and makes sure that key is used for any subsequent encryptions and decryptions of return addresses. When the key is set to zero, the return address protection feature is disabled.
As with the previous task, the implementation of the new instruction can be done
by modifying the BranchUnit plugin.