Exploring UNIX pipes for iOS kernel exploit primitives

Written by Chris Williams | Nov 6, 2023 3:58:57 PM

Disclaimer: All technical explanations are to the best of my knowledge and subject to human fallibility. Concepts may be overly simplified intentionally or otherwise.

While playing with Corellium to practice developing exploits with previously-patched bugs, I started to think about how Corellium's hypervisor magic could be used to practice on generalized techniques even without an underlying vulnerability. A particular paragraph by Brandon Azad inspired the concept:

"Second, I wanted to evaluate the technique independently of the vulnerability or vulnerabilities used to achieve it. It seemed that there was a good chance that the technique could be made deterministic (that is, without a failure case); implementing it on top of an unreliable vulnerability would make it hard to evaluate separately."

In the browser world, a typical exploit strategy would take two ArrayBuffer objects and point the backing store pointer from one at the other, such that arrayBuffer1 can change arrayBuffer2->backing_store_pointer arbitrarily and safely, such as in this example from my Tesla Browser exploit:

The important part of the above diagram is the green box, corresponding to arrayBuffer1, and its backing store pointer containing the address of arrayBuffer2 (the standalone gray box on the right). By indexing into arrayBuffer1, fields within arrayBuffer2 can be modified, especially arrayBuffer2->backing_store_pointer. Indexing into arrayBuffer2 will now read/write the desired arbitrary address.

The iOS kernel, having a BSD component, contains an obvious equivalent: UNIX pipes. The pipe APIs are used much like files in typical UNIX fashion, but rather than being backed by a file on disk, their contents are stored in the kernel's address space in the form of a "pipe buffer" which is a separate allocation (by default 512 bytes, but can be expanded by writing more data to the pipe). Controlling the pipe buffer pointer creates arbitrary read/write primitives in the same way as controlling an ArrayBuffer's backing store pointer in a Javascript engine.

For example, this snippet will create a pipe, which is represented as a pair of file descriptors (one "read end" and one "write end"), and then write 32 bytes of A:

This creates at least two kernel allocations: A struct pipe and the pipe buffer itself. To build out the technique, we first need a simulated vulnerability.

Corellium is Magic

Corellium has a very neat feature that allows userland code to arbitrarily read/write kernel memory. While this will be perfectly reliable, for the sake of argument we'll pretend that there's a chance of failure leading a kernel panic. Thus, the whole point of the pipe technique is to "promote" from unreliable primitives to better primitives. Our example primitives will be an arbitrary read of 0x20 bytes (randomly chosen) and arbitrary 64-bit write^1:

For additional realism we could add a random chance of failure, for example 10% chance of causing a kernel panic for each usage, or increasing the probability of failure each time. For the purpose of building out the technique I decided to keep it at 100% reliable, however.

Importantly, these primitives don't provide a KASLR^2 leak, so part of the development process will be working around that weakness. Corellium does have another magic hvc call that gives the kernel base address, but I chose not to use it.

Building up the pipe primitives

To start, we need two pipes, with allocated buffers. This is very similar to the basic pipe example above:

Now we need to locate these structures in kernel memory. One approach would be to use the arbitrary read to walk the struct proc linked list to find the exploit process, then walk its p_fd->fd_ofiles array to find the pipe's fileglob, and finally read fileglob->fg_data, which will be a struct pipe. Unfortunately, that requires many reads, and we're pretending that the read primitive is unreliable. It also requires knowing the KASLR slide in order to find the head of the struct proc list. We need a different approach.

Fileports: The Reese's Peanut Butter Cup of XNU

There's an API for sharing a UNIX file descriptor via Mach ports, and spraying Mach ports has been a common technique for quite some time. The fileport creation API is very simple:

By making a huge number of these (say, 100k), the odds of one of the Mach ports landing at a predictable address are quite high. The port's kobject field points to the pipe's fileglob object. This contains two very useful fields:

fg_ops: a pointer to an array of function pointers. This is how the kernel knows to call pipe_read rather than vn_read (used for regular files on disk). This pointer is within the kernel's __DATA_CONST section, which means that it's a KASLR leak!
fg_data: a pointer to the struct pipe, which is what we wanted in the first place.

The struct pipe then contains an embedded structure (struct pipebuf) which holds the address of the pipe buffer^3. With two uses of the arbitrary read, we can identify the address of a struct pipe. For our purposes, we have to do it again to locate pipe2, so a total of four uses of the arbitrary read. But how do we figure out which kernel address to guess?

MORE CORELLIUM MAGIC: HYPERVISOR HOOKS

Rather than wildly guessing, we can use hypervisor hooks to output the address of each fileport allocation, and then pick one that shows up in multiple runs.

The hooks are placed via a debugger command, but run independently of the debugger after that. Consequently, they run much faster than breakpoints, and can log directly to the device's virtual console, which will make it easy to extract the data for later analysis.

Our hook will be about as trivial as possible: Simply print the value of a register when a particular address is executed. This is performed with a limited C-like syntax in the form of a one-liner:

process plugin packet monitor is lldb's overly verbose syntax for sending raw "monitor" commands^4 to the remote debugger stub. The hooks documentation says that these commands are "generally not available" with lldb, but at least this basic hook seems to work.

The rest of the command hooks the desired address and prints the contents of the X0 register to the device's console. Fortunately, the output of hooks are displayed in a different text color, so it's easy to spot.

To prepare the hook, we need to identify an address to patch where the address of the new allocation will be in a register. Looking at the implementation of fileport_makeport:

At mark #1, the file descriptor is received from the arguments structure, and will match the same integer representation of the pipe's fd as seen in userspace.

Mark #2 performs the translation of the fd (e.g. 3) to a pointer to the fileproc object representing the pipe in the kernel's memory. Then at mark #3, the fp_glob pointer is dereferenced, retrieving the fileglob for the pipe.

Mark #4 creates the Mach port, which wraps the fileglob object, placing its pointer in the kobject field. fileport is the address we want to log, and it's the return value from fileport_alloc, so it'll be in the X0 register. Let's take a look at fileport_alloc:

This function is short and only referenced once, so it'll likely be inlined. Now that we know the lay of the land, we need to find the equivalent code inside the kernelcache. Fortunately, jtool2 can help with that. After downloading the kernelcache from the Corellium web interface's "Connect" tab, jtool2's analyze feature can be used to create a symbol cache file:

And then that file can be grepped to find the two symbols we need:

Now we simply locate the call to ipc_kobject_alloc_port from within fileport_makeport:

The instruction after the call is the one to hook, so 0xFFFFFFF00756F4F8 (unslid). Since KASLR is enabled, patching this address directly won't work^5. Fortunately, as previously mentioned there's yet another bit of hypervisor magic: a way to obtain the slid kernel base from userspace by calling their provided get_kernel_addr function:

By placing this snippet at the beginning of the exploit, it provides a moment to get the debugger attached and install the hook, providing the correct slid address for the given kernelcache.

Once the hook is in place, we perform the spray of 100k fileports and select an allocation to use as the guess going forward. I simply scrolled up a bit and picked one at random about 3/4 of the way down the list, and that seems to work well enough for a proof of concept. A more serious implementation would track ranges over multiple runs and try to pick an address with a known high probabilty of landing the spray, such as in Justin Sherman's IOMobileFrameBuffer exploit.

Now that we have a guess, we can perform the same spray twice (once per pipe read-end fd) and read the kobject field to locate the struct pipe. Here's the full implementation:

Plumbing the pipes together

Now that we know where our pipes are, we can simply write a single 64-bit value and have a reliable method of arbitrary read/write! struct pipe contains an embedded structure, struct pipebuf, which contains all of the fields we care about:

The in and out fields are used as cursors to keep track of the current offsets for write and read operations on a pipebuf, and the buffer field points to the kernel memory containing the pipe's data. The next step is very simple, just set pipe1's buffer address (offset +0x10 from the struct pipe) to the address of pipe2's struct proc:

And now by reading from and writing to pipe1, we can control the buffer pointer and in/out fields of pipe2 reliably and safely:

Where prw is a structure matching the layout of struct pipebuf:

And then the write primitive works similarly:

Now that the new primitives are set up, we can test them out by reading and writing some known values, for example the version string^6 and a sysctl that has been used in the past for flagging previous exploitation:

Putting it all together and running the exploit looks like this:

Note that when the pipe file descriptors are closed (which happens automatically when the process terminates), the kernel will panic. This is because it will try to free the pipe buffer, which for pipe2 will point to wherever was last read/written, and for pipe1 will point to pipe2. This creates a chicken-and-egg scenario, as the pipes can't be used to fix themselves up before being closed. For testing purposes, I opted to simply hang forever.

At this point a full proof-of-concept would go through the standard procedure of escalating privileges and unsandboxing.

Testing on iOS 15.1

We'd like the technique to be generalized and work on newer versions of iOS as well, so the next step is to create a virtual iPhone 7 with iOS 15.1 and find the parts that are different. Of course static kernel addresses like the fg_ops field on a pipe and the kernel version string will be different, and likely the guessed kernel address for the spray will change. After performing the same steps of examining the kernelcache and sampling the fileports spray, here are the two sets of parameters together, first from iOS 14.6 and then from 15.1 (both on iPhone 7):

The only unexpected change was the kobject offset within the Mach port object. In theory, with all of this filled in the technique should "just work."

Well, almost:

This appears to be a new, albeit small mitigation specifically designed to counter this technique!

Opening up the kernelcache in a disassembler and finding the panic call by cross-referencing the string, it appears that this is only used within pipe_read and pipe_write. This appears to be conceptually similar to zone_require, integrating an element of kheaps. Essentially, pipe buffers under normal circumstances should only contain "data", or blobs that have no particular meaning to the kernel.

The decompilation is relatively straightforward (although not entirely accurate, but it's sufficient for a high-level understanding): looking up the relevant page in the zone metadata and checking a flag that indicates whether the allocation is from a KHEAP_DATA_BUFFERS zone.

For more accuracy, here's the disassembly^7 with some notes:

In earlier versions of XNU, pipe buffers would be allocated by kalloc:

This would result in an allocation in one of the kalloc zones rounded up from the size of the initial write. In iOS 14.x, this changed to allocating from the KHEAP_DATA_BUFFERS submap:

By itself, this only prevents pipe buffers from being used to build fake objects (e.g. as the replacer object for use-after-free), because most interesting objects would be allocated from the KHEAP_DEFAULT/KHEAP_KEXT submaps, or from a dedicated zone.

This new call to kalloc_data_require expands on this to enforce that the pipe buffer must be allocated from KHEAP_DATA_BUFFERS. This breaks the technique of pointing one pipe at another because the dedicated pipe zone is definitely not in KHEAP_DATA_BUFFERS.

At the time of this writing, there are zero Google results for kalloc_data_require (Update: The source code is now available!), which indicates that perhaps this pipe technique isn't particularly relevant anymore (especially having been already affected by data PAC). It's possible that changing a pipe buffer pointer to some other type of KHEAP_DATA_BUFFERS object could pan out, but that's an open research question. If such an object exists then it likely doesn't belong in KHEAP_DATA_BUFFERS and that itself could be considered a vulnerability.

This new mini-mitigation was a fun discovery, and shows Apple's strategy of hardening to break techniques as a form of defense-in-depth. Looking through Brandon Azad's excellent survey of public iOS kernel exploits, many of them use pipe buffers either as "replacer" objects in a use-after-free scenario or placed after another type of object and used as the target object of an overflow. Since those involve keeping the pipe buffer pointer untouched (i.e. pointing at a legitimate pipe buffer allocation), this mitigation wouldn't affect those techniques. Perhaps Apple has seen the technique used in the wild, or they've simply identified it as a fairly obvious technique and decided to eliminate it preemptively.

The full source code is available on Github.

Are these realistic primitives? Perhaps not, but the purpose here is to practice on a technique, so the underlying bug (real or otherwise) is less important.
Corellium devices by default have KASLR disabled. Be sure to edit the settings before booting.
The buffer pointer is now subject to data PAC, which unfortunately breaks the technique on A12+. The rest of this post is focused on pre-PAC devices.
There are a bunch of other cool monitor commands exposed by Corellium, run process plugin packet monitor help for a list!
Perhaps Corellium will add a way to hook kernel_base+offset in the future which would make this much easier.
Neither of these are great examples for real exploitation since there are other ways to read the version string, and the kern.maxfilesperproc sysctl is both readable and writable from userspace, but they demonstrate the point.
IDA Pro had some bizarre disassembly issues with this function, but Binary Ninja handled it quite well.

Advance Your Mobile Security Research with Corellium

Experience Corellium’s groundbreaking virtualization technology for mobile devices and discover never-before-possible mobile vulnerability and threat research for iOS and Android phones. Book a meeting today to explore how our platform can optimize mobile security research and malware analysis.

View full post