Reverse Engineering with Ghidra
Some notes about Ghidra
Overview
Open-source software reverse engineering (SRE) tool developed by the NSA
Provides a disassembler and decompiler
Large library of supported processors / architectures
Custom processors can be added via SLEIGH modules
Loading a Binary
We can inform Ghidra about the target binary
Architecture / Language
File Format
Ghidra will attempt to autodetect features based on the file format
In our case these features are provided by the ELF header
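As a rough illustration of the kind of information a loader can read from the ELF header, the sketch below inspects the e_ident bytes at the start of a file. The helper names elf_class_bits and elf_data_encoding are hypothetical; the offsets (EI_CLASS at byte 4, EI_DATA at byte 5) and their values come from the ELF specification. This is not how Ghidra itself is implemented.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 32 or 64 for a recognized ELF class,
 * or 0 if the buffer does not start with the ELF magic. */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (buf == NULL || len < 6 || memcmp(buf, magic, 4) != 0)
        return 0;                 /* missing "\x7fELF" magic */

    switch (buf[4]) {             /* EI_CLASS */
    case 1:  return 32;           /* ELFCLASS32 */
    case 2:  return 64;           /* ELFCLASS64 */
    default: return 0;
    }
}

/* Hypothetical helper: returns the EI_DATA byte,
 * 1 = little-endian, 2 = big-endian, 0 = not recognized. */
int elf_data_encoding(const unsigned char *buf, size_t len)
{
    if (elf_class_bits(buf, len) == 0)
        return 0;
    return (buf[5] == 1 || buf[5] == 2) ? buf[5] : 0;
}
```

For a typical x86-64 Linux binary, both helpers would be expected to report a 64-bit, little-endian file.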
Initial Analysis
During auto-analysis Ghidra will also attempt to:
Create and label functions
Identify cross references in memory (xrefs)
Navigation
Once auto-analysis has finished, the program can be explored
Some of the default CodeBrowser windows include:
Program Tree: shows the segments of the ELF file
Symbol Tree: lists and displays all currently defined symbols
Data Type Manager: shows data types inferred during auto-analysis
Listing: the resulting assembly code from auto-analysis
Console: tool output / debugging information
Double clicking on Xrefs will navigate to that location
Decompiler
One of Ghidra's most powerful features is the decompiler
Implemented on top of Ghidra's P-Code intermediate representation
The decompiler generates C-like pseudocode from the analyzed P-Code
How to identify basic C constructs in assembly language
Program Startup
There is additional code outside of our main function. These additional blocks of code are used to properly launch the binary. Program startup and behavior is defined by the System V ABI.
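The e_entry field in the ELF header holds the program's entry point, which is the address of the _start() function. _start sets up the runtime and eventually calls main (on glibc-based Linux systems, through __libc_start_main).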
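To make the e_entry field concrete, here is a minimal sketch that reads it from a raw header buffer. It assumes a 64-bit, little-endian ELF file, where e_entry is an 8-byte value at offset 0x18 of the header. The function name elf64_le_entry is hypothetical, and the sketch does not validate the ELF magic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads e_entry from a 64-bit little-endian ELF
 * header. Returns 0 if the buffer is too short to contain the field. */
uint64_t elf64_le_entry(const unsigned char *hdr, size_t len)
{
    uint64_t entry = 0;

    if (hdr == NULL || len < 0x20)
        return 0;

    /* Assemble the value byte by byte, most significant byte first,
     * so the result does not depend on the host's endianness. */
    for (int i = 7; i >= 0; i--)
        entry = (entry << 8) | hdr[0x18 + i];

    return entry;
}
```

The address returned here is where Ghidra's entry symbol should point after import.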
TIP: Function Signatures
Function signatures can be edited in Ghidra, altering:
Argument count
Argument types
Return type
Fixing the function signature can greatly improve decompiler output
The C standard defines the arguments passed to a main function:
int argc = Argument Count
char **argv = Argument vector
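Once main has been given this signature in Ghidra, accesses such as argv[1] become much easier to recognize in the decompiler. The sketch below uses the same parameter shape; find_arg is a hypothetical helper written for illustration only.

```c
#include <string.h>

/* Hypothetical helper with the same parameter shape as main.
 * Returns the index of the first argument equal to `flag`, or -1. */
int find_arg(int argc, char **argv, const char *flag)
{
    /* argv[0] is the program name, so the search starts at index 1. */
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], flag) == 0)
            return i;
    }
    return -1;
}
```

In the listing, each argv[i] access typically shows up as a pointer-sized load from argv plus i times the pointer size.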
TIP: Imports and Exports
Imports and exports can be viewed from the Symbol tree
Imports: functions and symbols from external libraries that are used by your target binary
Exports: symbols that our binary exposes to the operating system loader, such as the entry point
Note: If you can't find main, start with _start!
Control Flow
Control flow is the order in which instructions are executed
if/else, while, and for are good examples
RIP contains the address of the next instruction to execute
The JMP instruction (and others) can alter RIP:
JMP ADDR
Conditional jumps (Jcc) are taken only when the RFLAGS register is in the matching state, usually after a CMP or TEST:
JE: Jump if equal/zero
JNE: Jump if not equal/nonzero
JG: Jump if greater (signed)
JL: Jump if less (signed)
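As a small sketch of where these instructions come from, the C function below performs a three-way signed comparison. The function name compare_signed is hypothetical, and the instruction names in the comments describe what an unoptimized x86-64 build commonly emits; the exact jumps chosen depend on the compiler and optimization level.

```c
/* Hypothetical example: three-way signed comparison. */
int compare_signed(int a, int b)
{
    if (a == b)        /* commonly: cmp, then je or jne */
        return 0;
    if (a > b)         /* signed compare: commonly jg or jle */
        return 1;
    return -1;         /* fall-through path when a < b */
}
```

Compilers often invert the condition, so an `if (a == b)` in C may appear as a JNE that skips over the body.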
TIP: Graph View
When looking at multiple branches, graph view can be helpful
Window -> Function Graph
Switch Cases
A switch statement compares a variable against a list of values; each value being compared against is a case. Each case expression must be a constant whose type is compatible with the variable in the switch.
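A minimal switch to compile and load into Ghidra might look like the sketch below. The function apply_op is a hypothetical example. Depending on the compiler and on how dense the case values are, the result may be a chain of compares and conditional jumps or a jump table indexed by the switch variable.

```c
/* Hypothetical example: dispatch on an operator character. */
int apply_op(char op, int a, int b)
{
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  return 0;      /* unknown operator */
    }
}
```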
TIP: Converting Data
In the listing view, data types can be converted
This can be used to make the decompiler output more readable
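To show why conversion helps, the two hypothetical functions below are identical once compiled (assuming an ASCII character set), yet the second form is what a reader would prefer to see. Converting the constant 0x41 to a character in Ghidra moves the displayed code from the first form toward the second.

```c
/* Hypothetical example: what the code may look like before conversion. */
int matches_hex(int c)
{
    return c == 0x41;
}

/* The same comparison after the constant is displayed as a character. */
int matches_char(int c)
{
    return c == 'A';
}
```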
Loops
Loops allow repeated execution of a block of code
for and while loops compile to nearly identical assembly code.
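One way to see this is to compile both forms and compare them in the listing. The sketch below uses two hypothetical functions, sum_for and sum_while, that compute the same result; in each, the loop usually becomes an initializer, a compare, a conditional jump, and a backward jump.

```c
/* Hypothetical example: sum of 0..n-1 using a for loop. */
int sum_for(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += i;
    return total;
}

/* The same computation written as a while loop. */
int sum_while(int n)
{
    int total = 0;
    int i = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}
```

The decompiler may render either function with whichever loop form it judges most natural, so the recovered C will not always match the original keyword.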
TIP: Highlighting / Slicing
When viewing the assembly listing or decompiler view, items can be highlighted
Slicing can be applied in the decompiler window
Ghidra will attempt to synchronize highlights between disassembly / decompiler views
Variables
When a variable is declared, it is declared within a particular scope
Local Variables: Defined within a function and only accessible within that function. Ghidra gives them default names derived from their stack offset, such as local_10 or local_14. In assembly they usually appear as an offset from the frame pointer, for example:
DWORD PTR [rbp-0xc]
Global Variables: Declared outside of a function and usable from all functions. In position-independent code they are usually addressed relative to RIP, for example:
DWORD PTR [rip+0x2009e6]
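The sketch below contains one variable of each kind, so the two addressing patterns can be compared side by side after compiling it. The names call_count and next_id are hypothetical. With optimizations enabled, the local may live only in a register and never get a stack slot.

```c
/* Global: stored in the binary's data area and visible to all functions. */
int call_count = 0;

/* Hypothetical example: returns base plus the number of earlier calls. */
int next_id(int base)
{
    /* Local: typically a stack slot addressed relative to rbp. */
    int id = base + call_count;

    call_count++;    /* global access, often RIP-relative */
    return id;
}
```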
TIP: Labelling/Renaming
Variables and offsets can be labelled in Ghidra. Ghidra will attempt to synchronize the variable names between the listing view and decompiler view.
Functions
Functions are called using the call instruction. call pushes the return address onto the stack and then jumps to the target function.
ret pops that return address off the stack and resumes execution there.
Functions: Calling Conventions
Calling conventions define how function calls are implemented:
How arguments are passed to functions
How return values are passed back from functions
Stack management and register cleanup
ABI = Application Binary Interface
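As an example of a calling convention in practice, the System V AMD64 ABI passes the first six integer or pointer arguments in RDI, RSI, RDX, RCX, R8, and R9, places further arguments on the stack, and returns integer results in RAX. The hypothetical function sum7 below takes seven arguments so that both cases are visible at the call site.

```c
/* Hypothetical example: seven integer arguments under System V AMD64.
 * a..f arrive in RDI, RSI, RDX, RCX, R8, R9; g is passed on the stack.
 * The result is returned in RAX. */
long sum7(long a, long b, long c, long d, long e, long f, long g)
{
    return a + b + c + d + e + f + g;
}
```

Other platforms and ABIs use different registers, so the expected layout should be checked against the target's own convention.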
Functions: Prologue/Epilogue
Functions can be thought of as three components:
Prologue
Body
Epilogue
The prologue reserves space for variables on the stack
The epilogue cleans up the stack frame and returns it to its original state
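The three components can be pictured with a small function that needs stack space for locals. The function sum_squares is a hypothetical example, and the instructions in the comments are what an unoptimized x86-64 build commonly produces; optimized builds and leaf functions may omit the frame pointer or the stack adjustment.

```c
/* Hypothetical example: sums the squares of 0..n-1, for at most 8 terms. */
int sum_squares(int n)
{
    /* Prologue (typical): push rbp; mov rbp, rsp; sub rsp, N */
    int squares[8];
    int total = 0;

    /* Body */
    if (n > 8)
        n = 8;                    /* stay within the local array */
    for (int i = 0; i < n; i++) {
        squares[i] = i * i;
        total += squares[i];
    }

    /* Epilogue (typical): leave (or pop rbp); ret */
    return total;
}
```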
Heap Memory
The heap is used for dynamic memory allocations:
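Used when the size of the data is only known at runtime or can change
malloc/calloc: used to allocate memory
Heap allocations are not tied to a function's scope; any code that holds the pointer can access them until they are freed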
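A minimal sketch of a heap allocation that outlives the function creating it is shown below. The function make_range is a hypothetical example. In Ghidra, the call to malloc appears as an imported function, and the returned pointer is what later code uses to reach the data.

```c
#include <stdlib.h>

/* Hypothetical example: allocates `count` ints on the heap and fills
 * them with 0..count-1. The caller owns the result and must free it. */
int *make_range(size_t count)
{
    int *values = malloc(count * sizeof *values);

    if (values == NULL)
        return NULL;              /* allocation failed */

    for (size_t i = 0; i < count; i++)
        values[i] = (int)i;

    return values;                /* still valid after this function returns */
}
```

A caller would use it as `int *r = make_range(4);` and later release the memory with `free(r);`.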
Array Accesses