Let's debug a kernel!
One of the major benefits of using a virtual environment is the ability to inspect and modify the state of the whole system under user control. The primary interface to these features is a TCP-based debug stub that is compatible with the GDB remote protocol.
You will need an AArch64-compatible GDB. The versions that ship with Ubuntu 16.04 on AArch64, as well as the Linaro public releases since 7.1.1 (x86_64 Linux), have been tested to work. The AArch64 debug protocol was somewhat in flux until 2015, so very old GDB versions may not interoperate correctly.
If you are using the cloud product, you will need to be connected to the VPN.
Note: The address and port provided here are for example purposes only. You will need to use the address and port for your particular virtual device. You can find the address and port for your device at the end of the "kernel gdb" link, located at the bottom of the virtual device page.
Example
(gdb) target remote 10.11.1.1:44219
Remote debugging using 10.11.1.1:44219
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
0x00000008030e40c8 in ?? ()

(gdb) thread 2
[Switching to thread 2 (Thread 2)]
#0 0x00000008030e40c8 in ?? ()

(gdb) monitor sr ttbr1_el1=0x0000000034d4593d
CPU 1, ttbr1_el1 := 0x0000000034d4593d (before: 0x0000000000000000)
Otherwise, use regular GDB commands to control the debug stub.
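For orientation, the wire format behind these commands can be sketched in a few lines. In the GDB Remote Serial Protocol, each packet is framed as `$payload#xx`, where `xx` is the sum of the payload bytes modulo 256 written as two hex digits, and a `monitor` command travels as a `qRcmd,` packet carrying the hex-encoded command text. The helper names below are hypothetical, and the sketch assumes payloads that need no escaping or run-length encoding; it illustrates the protocol framing only and says nothing about how the stub itself is implemented.

```python
def rsp_checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<checksum as two hex digits>."""
    return b"$" + payload + b"#" + b"%02x" % rsp_checksum(payload)


def rsp_unframe(packet: bytes) -> bytes:
    """Return the payload of a framed packet after verifying its checksum."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("not a framed packet")
    payload = packet[1:-3]
    received = int(packet[-2:].decode("ascii"), 16)
    if rsp_checksum(payload) != received:
        raise ValueError("checksum mismatch")
    return payload


def monitor_packet(command: str) -> bytes:
    """Frame a monitor command as GDB sends it: qRcmd,<hex of command text>."""
    hex_text = command.encode("ascii").hex().encode("ascii")
    return rsp_frame(b"qRcmd," + hex_text)


if __name__ == "__main__":
    print(rsp_frame(b"?"))        # b'$?#3f'
    print(monitor_packet("sr"))   # b'$qRcmd,7372#f6'
```

This is also why front ends such as IDA accept the command without the word monitor: the keyword is a GDB command-line convention, and only the command text itself is carried in the packet.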
The following instructions are for IDA 7.0 versions.
To access monitor commands from IDA, locate the GDB command line bar at the bottom of the window (just above the status bar, next to a GDB button). Enter the monitor commands there, without the word monitor itself. For instance, instead of monitor sr, simply write sr and press Enter. The output will appear in IDA's text output window above.
The platform supports the version of LLDB that ships with Xcode on macOS. Debugging with LLDB is beneficial for virtual devices running iOS because LLDB has built-in knowledge of the Mach-O format and recognizes that it is dealing with an XNU kernel. You can use the debugger with or without the Mach-O kernel binary. Using the binary increases load time, but gives you access to the symbols that are in the binary.
Note: The address and port provided here are for example purposes only. You will need to use the address and port for your particular virtual device. You can find the address and port for your device at the end of the "kernel gdb" link, located at the bottom of the virtual device page.
Example
$ lldb
(lldb) gdb-remote 10.11.1.1:44219
Kernel UUID: B99BA98C-3AAA-30E9-B733-07845A9EC56B
Load Address: 0xfffffff007004000
WARNING: Unable to locate kernel binary on the debugger system.
Process 1 stopped
* thread #1, stop reason = signal SIGINT
    frame #0: 0xfffffff0071a513c
->  0xfffffff0071a513c: bl     0xfffffff00709fc00
    0xfffffff0071a5140: mrs    x8, TPIDR_EL1
    0xfffffff0071a5144: ldr    x8, [x8, #0x428]
    0xfffffff0071a5148: str    xzr, [x8, #0xf0]
Target 0: (No executable module.) stopped.
(lldb)
Example
$ lldb ~/Documents/Work/kerneli61125
(lldb) gdb-remote 10.11.1.31:39535
Kernel UUID: B99BA98C-3AAA-30E9-B733-07845A9EC56B
Load Address: 0xfffffff007004000
Kernel slid 0x0 in memory.
Loaded kernel file /Users/planetbeing/Documents/Work/kerneli61125
Loading 140 kext modules warning: Can't find binary/dSYM for com.apple.kec.corecrypto (A6668145-C49A-3D84-96E6-80AE4AA9E4B4)
.warning: Can't find binary/dSYM for com.apple.kec.Libm (51AFA03E-8041-3D11-BD40-A6D1AED1C667)
.warning: Can't find binary/dSYM for com.apple.kec.pthread (75DF2E44-845A-3C15-987F-D53AC36CFD72)
...
.warning: Can't find binary/dSYM for com.apple.driver.usb.cdc (0B64EFC9-CF03-37AF-8C86-4A0C419BC87B)
.warning: Can't find binary/dSYM for com.apple.driver.AppleUSBDeviceMux (84B68AED-C037-3166-BA42-906FE26FA76F)
. done.
Process 1 stopped
* thread #1, stop reason = signal SIGINT
    frame #0: 0xfffffff0071a513c kerneli61125`___lldb_unnamed_symbol1665$$kerneli61125 + 296
kerneli61125`___lldb_unnamed_symbol1665$$kerneli61125:
->  0xfffffff0071a513c <+296>: bl     0xfffffff00709fc00 ; ___lldb_unnamed_symbol102$$kerneli61125
    0xfffffff0071a5140 <+300>: mrs    x8, TPIDR_EL1
    0xfffffff0071a5144 <+304>: ldr    x8, [x8, #0x428]
    0xfffffff0071a5148 <+308>: str    xzr, [x8, #0xf0]
Target 0: (kerneli61125) stopped.
(lldb)
The gdb stub represents CPU cores as threads.
(lldb) thread list
Process 1 stopped
* thread #1: tid = 0x0001, 0xfffffff0071a513c, stop reason = signal SIGINT
  thread #2: tid = 0x0002, 0xfffffff0071a513c
(lldb) thread select 2
* thread #2
    frame #0: 0xfffffff0071a513c
->  0xfffffff0071a513c: bl     0xfffffff00709fc00
    0xfffffff0071a5140: mrs    x8, TPIDR_EL1
    0xfffffff0071a5144: ldr    x8, [x8, #0x428]
    0xfffffff0071a5148: str    xzr, [x8, #0xf0]
(lldb)
You can use the regular lldb commands to control the debug stub.
Your debugger will work as if it were attached to a hardware debugger (think OpenOCD).
The hypervisor can trap certain instructions in the guest kernel and execute short pieces of user-supplied code before either executing or skipping the trapped instruction.
The hooks are written in a simple programming language and run in the hypervisor environment without pausing the VM or engaging the debugger.
The monitor command form shown here is appropriate for GDB. In IDA, you can enter monitor commands without the leading mon. In LLDB they are generally not available because of LLDB deficiencies.
How to set a hook:
mon patch 0xfffffff0061b8ca4 print("hello world\n");
A hook can be deactivated with:
mon patch 0xfffffff0061b8ca4 -
The virtual address is translated to a physical address using the current page table. Before the guest kernel has booted, translation is disabled, so you have to supply a physical address instead.
Not all instructions can be hooked. The set of legal instructions includes all branches, most typical function prologue instructions and simple register MOVs.
Hooks run without locking on multiple cores. This is a major difference from using GDB breakpoint scripts, and (apart from the obviously better performance) allows investigating race issues in the guest.
Let's say the function at 0xfffffff001234570 is a memory allocator, with the interface of malloc, and we want to tag every allocation with the calling address. This can be done with kernel hooks with fairly low effort.
First, we need to enlarge the allocation by 8 bytes to leave space for the tag, and save the size for later:
mon patch 0xfffffff001234570 global[0] = cpu.x[0]; cpu.x[0] += 8;
Then, we find the RET opcode the allocator function uses - let's say it's at 0xfffffff0012346e4. We can patch this opcode to write LR (return address) before actually returning:
mon patch 0xfffffff0012346e4 if(cpu.x[0]) { *(u64 *)(cpu.x[0] + global[0]) = cpu.x[30]; }
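However, this is not fully reentrant: global[0] is shared, so a second call that enters the allocator before the first one has returned overwrites the saved size. The issue can be solved by adding a stack frame (remember that the AArch64 SP must be 16-byte aligned) and not using global[0]: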
mon patch 0xfffffff001234570 cpu.sp -= 16; *(u64 *)cpu.sp = cpu.x[0]; cpu.x[0] += 8;
mon patch 0xfffffff0012346e4 size = *(u64 *)cpu.sp; if(cpu.x[0]) { *(u64 *)(cpu.x[0] + size) = cpu.x[30]; } cpu.sp += 16;
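To see why the stack-based variant matters, the two hook pairs can be modelled outside the VM. The following Python sketch is a toy simulation, not hook code: the `Machine` and `Cpu` classes, the bump allocator, and the placeholder return addresses 0xAAAA and 0xBBBB are all hypothetical stand-ins, and the interleaving is replayed step by step rather than run on real concurrent cores. It shows two cores entering the allocator back to back, so that the second entry hook runs before the first core reaches its RET.

```python
import struct

MASK64 = (1 << 64) - 1


class Cpu:
    """Per-core state visible to a hook: x0..x30 and sp."""
    def __init__(self, sp):
        self.x = [0] * 31
        self.sp = sp


class Machine:
    """Toy guest: flat memory, a bump allocator and the shared global[] array."""
    def __init__(self):
        self.mem = bytearray(4096)
        self.glob = [0] * 64      # models global[0]..global[63]
        self.heap = 64            # next free address (non-zero, so never NULL)

    def read64(self, addr):
        return struct.unpack_from("<Q", self.mem, addr)[0]

    def write64(self, addr, value):
        struct.pack_into("<Q", self.mem, addr, value & MASK64)

    def allocator_body(self, cpu):
        """Stand-in allocator: size in x0 on entry, pointer in x0 on return."""
        ptr, self.heap = self.heap, self.heap + cpu.x[0]
        cpu.x[0] = ptr


# Variant 1: the requested size is kept in global[0].
def entry_global(m, cpu):
    m.glob[0] = cpu.x[0]
    cpu.x[0] += 8

def exit_global(m, cpu):
    if cpu.x[0]:
        m.write64(cpu.x[0] + m.glob[0], cpu.x[30])


# Variant 2: the requested size is kept on the calling core's own stack.
def entry_stack(m, cpu):
    cpu.sp -= 16
    m.write64(cpu.sp, cpu.x[0])
    cpu.x[0] += 8

def exit_stack(m, cpu):
    size = m.read64(cpu.sp)
    if cpu.x[0]:
        m.write64(cpu.x[0] + size, cpu.x[30])
    cpu.sp += 16


def interleaved(entry, exit_):
    """Core A asks for 16 bytes, core B for 32; both enter before either returns."""
    m = Machine()
    a, b = Cpu(sp=2048), Cpu(sp=4096)
    a.x[0], a.x[30] = 16, 0xAAAA
    b.x[0], b.x[30] = 32, 0xBBBB
    entry(m, a)
    entry(m, b)                   # runs before core A has reached its RET
    allocations = []
    for cpu, size in ((a, 16), (b, 32)):
        m.allocator_body(cpu)
        allocations.append((cpu.x[0], size))
        exit_(m, cpu)
    # Read back the word where each allocation's tag is expected.
    return [m.read64(ptr + size) for ptr, size in allocations]


if __name__ == "__main__":
    print([hex(t) for t in interleaved(entry_global, exit_global)])  # ['0x0', '0xbbbb']
    print([hex(t) for t in interleaved(entry_stack, exit_stack)])    # ['0xaaaa', '0xbbbb']
```

In the global[0] variant, core A's tag is written 32 bytes past its pointer instead of 16, so the expected slot stays empty and the stray write lands inside core B's allocation. With the per-core stack frame, each core reads back its own size and both tags end up where they belong.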
The language used to write hooks is superficially similar to C, but it differs in the following ways:
The supported control structures are: if, if-else, while, do-while, break and for.
Built-in integer types can be referred to with standard C names (unsigned long etc.), stdint-style names (uint64_t) or short names (u64).
The unary & (reference) operator is not supported.
Variables declared with static keep their values between invocations of the hook, just like static variables inside a C function. There is a limit of 8 such variables in a given hook, imposed by the execution environment.
Accessing processor state is done by using a pseudo-struct cpu. The supported fields in that structure are:
Writing these fields will modify the corresponding processor state upon return to VM.
Accessing VM memory (kernel virtual address view) happens by dereferencing pointers. For instance,
print_int("foo", *(u64 *)0xfffffff001234568);
will print the value of a 64-bit word at 0xFFFFFFF001234568 in the VM.
The following functions are available:
void print(strlit s);
prints a literal string.
void print_int(strlit s, u64 a);
print an integer with optional string prefix s (pass 0 to not print a string).
void print_str(strlit s, void *p, u64 z);
print a zero-terminated string from VM memory at p, maximum z bytes, optional prefix s.
void print_buf(strlit s, void *p, u64 z);
print a buffer from VM memory at p, maximum size z bytes, with optional prefix s.
void print_thread(strlit s, u64 t);
print info on thread t (pass 0 for current) with optional string prefix s.
void print_backtrace(strlit s, u64 d);
print backtrace, maximum depth d, with optional string prefix s.
void usleep(u32 usec);
delay for usec microseconds.
void debug(void);
cause the VM to take a debug trap immediately after the hook returns.
u64 mapped(type *p);
check if VM memory location is valid; the type determines the size of location.
u64 mapped(void *p, u64 z);
check if VM memory range is valid; the size is defined by z.
u64 min(u64 a, u64 b); u64 max(u64 a, u64 b);
s64 min(s64 a, s64 b); s64 max(s64 a, s64 b);
minimum/maximum for unsigned and signed values.
Additionally, the constant NULL is defined as 0.
The global[0] to global[63] are 64-bit numbers shared between all hooks installed on the VM.
Sometimes it's desirable to break into the kernel debugger from user software running in EL0. Inserting a BRK opcode usually results in it being intercepted by the operating system kernel. If you want to end up directly in the kernel debugger instead, you can use one of the following opcode sequences:
hvc #0x7242

mrs xzr, cntpct_el0
hvc #0x7242

mrs xzr, pmcr_el0
hvc #0x7242

mrc p15,#0,r0,c9,c12,#0
hvc #0x7242
After running this code, the VM will enter a paused state. If a debugger is attached, this will result in the debugger registering a debug event (similar to a breakpoint).
Sometimes, when writing patches to kernel or user applications, it's nice to be able to print output without having to ask the operating system to do so. In fact, it can be fairly hard to get to debug output from some places, such as EL0 software that has no standard output and disabled debugging calls.
This feature helps you get debug output from patches, shellcodes or maybe just helper programs without having to deal with operating system constraints.
Both EL0 and EL1 code can print directly to the VM console via a special HVC (hypervisor call). The form of the HVC itself depends on the environment:
hvc #0x6C43

mrs xzr, cntpct_el0
hvc #0x6C43

mrs xzr, pmcr_el0
hvc #0x6C43

mrc p15,#0,r0,c9,c12,#0
hvc #0x6C43
To use the HVC, set registers x0 .. x2 (or r0 .. r2 for 32-bit code) to the following values:
CONSLOG_REQ_STR: prints a zero-terminated (C) string
x0 = 0xFFFF0000
x1 = pointer to string
CONSLOG_REQ_U64: prints a number as hex
x0 = 0xFFFF0001
x1 = number
CONSLOG_REQ_S64: prints a number as signed decimal
x0 = 0xFFFF0002
x1 = number
CONSLOG_REQ_HEX: prints a buffer as hex dump
x0 = 0xFFFF0003
x1 = pointer to buffer
x2 = size in bytes
The memory printing calls (CONSLOG_REQ_STR and CONSLOG_REQ_HEX) return a status in x0: a negative value means the call failed, and a non-negative value is the number of bytes retrieved from the buffer (or string). The other calls simply return zero in x0.
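The request codes and the return convention can be summarized in a small host-side model. This Python sketch only captures the values listed above and the sign check on x0; it does not issue the HVC. The names `ConslogReq`, `as_signed` and `decode_status` are hypothetical, and treating r0 as a 32-bit two's-complement value for 32-bit code is an assumption, since specific error codes are not documented here.

```python
from enum import IntEnum


class ConslogReq(IntEnum):
    """Request codes passed in x0 (or r0), as listed above."""
    STR = 0xFFFF0000   # x1 = pointer to a zero-terminated string
    U64 = 0xFFFF0001   # x1 = number, printed as hex
    S64 = 0xFFFF0002   # x1 = number, printed as signed decimal
    HEX = 0xFFFF0003   # x1 = pointer to buffer, x2 = size in bytes


def as_signed(value: int, bits: int = 64) -> int:
    """Reinterpret a raw register value as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def decode_status(req: ConslogReq, x0: int, bits: int = 64):
    """Return (ok, bytes_retrieved) for the value left in x0 after the call."""
    if req in (ConslogReq.STR, ConslogReq.HEX):
        status = as_signed(x0, bits)
        return (status >= 0, max(status, 0))
    return (True, 0)   # the numeric calls simply return zero


if __name__ == "__main__":
    print(decode_status(ConslogReq.STR, 12))                   # (True, 12)
    print(decode_status(ConslogReq.HEX, 0xFFFFFFFFFFFFFFFF))   # (False, 0)
```

The sign conversion matters because a register holds an unsigned bit pattern: a failure status of -1 appears as 0xFFFFFFFFFFFFFFFF in x0 and has to be reinterpreted before comparing against zero.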
Text printed through this output path will show up in the VM console, in cyan color using ANSI control characters.
An example of the string printing function, including handling of EL0 page faults, is in the guest-tools repository on Corellium's GitHub. Its conslog program echoes its standard input to the VM console (in the highly visible cyan color of hypervisor messages).