Shellcode creation and binary execution through execve
In this guide I will show you how to create shellcode and execute binaries using the execve function.
Motivation
Well, Assembly language is amazing, so why don’t we learn some fancy ways to generate shellcode and execute programs?
When developing exploits, we sometimes have to write our own shellcode, and this technique is the way to go.
Another use case is a shellcode runner: we can put our Assembly inline in a higher-level programming language and execute it to get a reverse shell, for example.
Pre-concepts before we continue
I know what you are thinking: “Assembly is hard”, “Do we really have to do it in Assembly?” The answer is YES! The learning curve of this language is steep, but it’s always possible to learn more about it.
Here I leave some courses for you:
- Assembly Language Adventures
- SLAE32 - Pentester Academy
Why 32 bits first?
Because it’s easier, and there are still applications out there that use 32 bits.
Some concepts change in x64, so it’s better to learn the logic and concepts in 32 bits first; 64 bits will then be easier to assimilate.
Assembly x86 lightning course
Let’s go over some basic concepts before we continue.
Registers
Registers are the processor’s fast internal storage, used to hold and move data. There are several types of registers and each type has its own purpose (nowadays most of the general-purpose registers can be used interchangeably).
You can also think of registers as variables.
Let’s look at some examples.
- EAX -> 32-bit register; it can store a DWORD and be further divided into AX, AH and AL. (Commonly used as an accumulator for arithmetic operations, and to hold return values and syscall numbers)
EAX (32 bits) -> 0x12345678
AX (16 bits, lower half of EAX) -> 0x5678
AH (8 bits, "higher" half of AX) -> 0x56
AL (8 bits, "lower" half of AX) -> 0x78
- EIP -> Stores the address of the next instruction to be executed (gain control of EIP and you control the execution flow of the program; you can imagine why that matters).
- ESP -> Pointer to the top of the stack.
- EFLAGS -> Contains status flags such as ZF and OF (they are used, for example, in conditional branching).
- More info on registers
Zeroing Registers (And removing null bytes)
There are a lot of situations in which we need to zero a register or remove specific characters from a shellcode; one example is removing null bytes so the shellcode can be used later in exploits.
Some techniques:
1. Using the lowest part of the register
This is very interesting: we can zero the register with a XOR operation and then put some content in AL.
Example
xor eax,eax ; Make the register contain 0
mov al, 5 ; Move 0x05 to AL
EAX now contains 5.
2. Negative numbers
Another cool trick: it’s possible to load the value we want in situations where we can’t write the number directly.
Imagine we are writing an exploit and can’t use the number 0x00000001 because it contains null bytes (the leading zeros in the value).
To solve this problem, we can use the NEG instruction, which replaces the register’s content with its two’s complement (What?)
Let’s illustrate this.
mov eax, 0xffffffff
neg eax
EAX will contain 0x01, and neither instruction contains a null byte.
What happens here is that the NEG instruction flips the bits and adds 1 to the result (also known as two’s complement).
Binary representation of the NEG operation (lowest 8 bits shown)
11111111
After flip
00000000
Add 1
00000001
This way we can put the number inside EAX without actually writing it in the code.
You can use your imagination; there are a lot of tricks like this. One caveat: NOP (0x90) is itself the one-byte encoding of XCHG EAX,EAX, so when the 0x90 byte must be avoided, pick another instruction that effectively does nothing, such as MOV EAX,EAX.
Segments
Another important concept is sections (also called segments): the regions where our program’s code and data are placed.
Some examples:
- .text segment -> Where the executable code is stored
- .bss -> Uninitialized variables
- .data -> Initialized variables
Here is an example of a program that prints our lovely “Hello World” message
global _start
; Define an area for our code, in this case the text section
section .text
; Define program entry point
_start:
; Print the message on the screen
mov eax, 0x04 ; Syscall number (Write = 4)
mov ebx, 0x01 ; Function argument 1 (Stdout)
mov ecx, message ; Function argument 2 (Pointer to the message)
mov edx, mlen ; Function argument 3 (Message length)
int 0x80 ; Invoke Interruption (syscall)
; Exit the program
mov eax, 0x01 ; Exit syscall number
mov ebx, 0x01 ; Arbitrary return value
int 0x80 ;
; Define an area for initialized data (.data section)
section .data
message: db "Hello World!" ; Here, we define a label with our string
mlen equ $-message ; Using the equ directive to compute the length of our message
It’s easier to see how the sections are used with this piece of code.
Stack
The stack is a data structure where we can store values for further processing. It’s a LIFO (last in, first out) structure: values are added with the PUSH instruction and removed with the POP instruction.
Think of it as a pile of plates. You put one plate on top of the other, and when you need to remove one, you take the plate on top (last in, first out).
Endianness
Sometimes your brain will get so confused that you will want to stop and take a break; in many cases the reason is endianness.
Endianness is the order in which the bytes of a multi-byte value are stored in memory; for our purposes, we will take a look at little endian.
A little-endian system stores the least-significant byte at the smallest address.
What does it mean?
Let’s look at an example.
When we have a value like 0x12345678, the least significant byte is the rightmost one (0x78), followed by 0x56, 0x34 and 0x12, so when this value is stored in memory (for example, pushed onto the stack) it is laid out as:
Lower addresses
78
56
34
12
Higher addresses
The least significant byte always goes to the lowest address. This is important when we develop exploits, as we need to write addresses in this byte order for them to be interpreted correctly in memory.
There is also big endian, but I will leave researching it as an exercise for you.
Syscalls
In Linux, syscalls are the way to ask the kernel to perform specific actions, like writing something to the screen, exiting a program, creating files and so on. (On Windows, programs usually reach the kernel through the Windows API rather than invoking syscalls directly.)
To execute a syscall on 32-bit Linux, you put its number in EAX and its arguments in EBX, ECX, EDX and so on. Then you issue the instruction int 0x80, which makes the kernel look up the number in the syscall table and execute the action.
Example
The exit syscall is used to terminate the program and set a status code. It has the following definition:
void exit(int status);
It receives one argument, the status. Converting this to assembly:
; Exit the program
mov eax, 0x01 ; Exit syscall number
mov ebx, 0x01 ; Arbitrary return value
int 0x80 ;
We pass the syscall number (1) in EAX and the status in EBX, then issue int 0x80 to invoke the syscall.
EXECVE
From the man pages (man execve):
execve() executes the program referred to by pathname
Definition
int execve(const char *pathname, char *const argv[],char *const envp[]);
Interesting info
argv is an array of pointers to strings passed to the new program as its command-line arguments. By convention, the first of these strings (i.e., argv[0]) should contain the filename associated with the file being executed. The argv array must be terminated by a NULL pointer.
(Thus, in the new program, argv[argc] will be NULL.)
envp is an array of pointers to strings, conventionally of the form key=value, which are passed as the environment of the new program. The envp array must be terminated by a NULL pointer.
We will use this function to execute our program with the following values.
- The pathname is the path of the program we want to execute, terminated by a null byte (Example: /bin/bash, 0x00)
- The second argument will be a pointer to an array that holds the address of the /bin/bash string followed by a null pointer (as this value is an array of pointers)
- The third argument will point to a null pointer as well.
One interesting fact is that our payload cannot contain null bytes. So how will we deal with that?
Building our assembly
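To solve the problem we faced earlier, we will use the first technique to zero out the register. Pushing the zeroed register then gives us the string terminator and the null pointers without writing a single 0x00 byte in the code.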
/bin/bash trickery
The stack is aligned to 4 bytes on x86 processors, so we need to push the string in chunks of 4 bytes, last chunk first, with the bytes of each chunk reversed (little endian).
/bin/bash has 9 characters, so we pad it with extra slashes to 12 (////bin/bash) and split it into three chunks, pushed in the following order:
bash (4 bytes)
bin/ (4 bytes)
//// (4 bytes)
No matter how many / we put at the start of the path, Linux will interpret them as one.
Final Assembly code
This is our final code
; Execute programs through execve in Assembly
; Author: pop3ret
global _start
; .text section, where our code resides
section .text
; The _start label defines the entry point of our program
_start:
xor eax,eax ; Zero out the register
push eax ; Put 0x0 onto the stack (null terminator of the string)
; Put ////bin/bash onto the stack, remember that this value needs to be
; a multiple of 4 to keep the stack aligned. (First argument)
push 0x68736162 ; bash (Reversed)
push 0x2f6e6962 ; bin/ (Reversed)
push 0x2f2f2f2f ; ////
; Put the other arguments onto the stack
mov ebx, esp ; EBX now points to the string ////bin/bash,0x0 (First argument). Each push, esp = esp - 4
push eax ; PUSH 0 (null pointer)
mov edx,esp ; EDX now points to a null pointer (Third argument)
push ebx ; Push the pointer to ////bin/bash
mov ecx,esp ; ECX points to the array {pointer to ////bin/bash, null} (Second argument)
; Calling execve (Syscall 11)
mov al,11
int 0x80
Let’s compile this ASM with this simple bash script
#!/bin/bash
#############################################
# Simple compiler and linker for nasm (x86) #
# Author: pop3ret #
# Usage: ./compiler.sh asm_name #
#############################################
## @ Main
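# Assembling (32-bit ELF object; NASM's default output format is a flat binary)
nasm -f elf32 "$1.asm" -o "$1.o"
# Linking (32-bit output, needed on 64-bit hosts)
ld -m elf_i386 "$1.o" -o "$1"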
And execute
./execve
After the execution we get our /bin/bash shell.
Shellcode
Shellcode is basically the machine code bytes (opcodes and operands) that our assembly is translated into.
Example with objdump:
objdump -D execve -M intel
The shellcode is the second column, the raw bytes of each instruction (31 c0 ...).
To grab and format this information, I will use a shell one-liner from commandlinefu
objdump -d ./execve|grep '[0-9a-f]:'|grep -v 'file'|cut -f2 -d:|cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '|sed 's/ $//g'|sed 's/ /\\x/g'|paste -d '' -s |sed 's/^/"/'|sed 's/$/"/g'
And put the result inside this C file
#include <stdio.h>
char shellcode[] = "\x31\xc0\x50\x68\x62\x61\x73\x68\x68\x62\x69\x6e\x2f\x68\x2f\x2f\x2f\x2f\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80";
int main(void)
{
int (*ret)() = (int(*)())shellcode;
ret();
}
Compiling (as a 32-bit binary, since the shellcode is 32-bit)
gcc -m32 -fno-stack-protector -z execstack shellcode.c -o shellcode
And execute to get a shell!