SearchKit & RevelariOS

Taking a dive into the vm_region()

by PsychoBird - September 28, 2020

Introduction

Searching through memory on iOS to find a specific value isn't a new idea. There are plenty of tools available for searching memory on iOS, and source code is available for most of them. However, the source code only explains the how behind a project, not the why. The goal of RevelariOS was to create a technical version of these tools with both the how and the why explained through a combination of this blog post and the source code. SearchKit is meant to be a development toolkit so developers can add memory searching to their own projects, while RevelariOS is built to show off the power of SearchKit and its potential uses in research and development. In order to explain how RevelariOS and similar programs work, we have to take a dive into the vm_region().

Before beginning, let's discuss the logical process behind searching memory. First, we'll need to find the exact memory region we're searching. We can't start at 0x0 and search up until we hit our desired result because the searching process would take hours, especially if our input is not actually in memory and we end up searching until iOS won't let us anymore. Second, we'll need to read the valid memory regions we find and look for our input in them. Reading must be efficient or else searches will take far too long to be usable. Lastly, we'll need a medium for storing the addresses where the memory scanner found a match.

Pt. 1 - Getting the searchable region from vm_region()

The searchable region of memory is different for each device, so we can't find the searchable region on one device, hard code it into the tool, and expect it to work on others. Fortunately, we can implement a function to find the base and end address through vm_region().

Because Apple doesn't provide much documentation for internal functions, we'll need to look at the MIT Mach IPC Documentation for vm_region():

    kern_return_t vm_region (vm_task_t target_task,
                             vm_address_t address,
                             vm_size_t size,
                             vm_region_flavor_t flavor,
                             vm_region_info_t info,
                             mach_msg_type_number_t info_count,
                             memory_object_name_t object_name);

Since we're building this for newer devices, we'll need to use vm_region_64(). Since there is no documentation yet (that's something I may get to) for updated mach functions, the code below had to be strung together using the XNU source code as a reference.

    #include <mach/mach.h>
    #include <stdbool.h>

    kern_return_t get_region_size(mach_port_t task, vm_address_t *baseaddr, vm_address_t *endaddr) {
        vm_address_t addr = 0;
        vm_size_t size = 0;
        vm_region_flavor_t flavor = VM_REGION_BASIC_INFO_64;
        vm_region_basic_info_data_64_t info;
        mach_msg_type_number_t count = VM_REGION_BASIC_INFO_COUNT_64;
        mach_port_t object = 0;
        kern_return_t kret;
        int id = 0;
        bool found = false;

        while (1) {
            // step past the previous region (a no-op on the first pass)
            addr += size;
            kret = vm_region_64(task, &addr, &size, flavor, (vm_region_info_64_t) &info, &count, &object);
            if (kret != KERN_SUCCESS) {
                // a failure before three regions were found is treated as an error;
                // this check has to sit inside the failure branch to be reachable
                if (id < 3) {
                    return KERN_FAILURE;
                }
                break;
            }
            id++;
            if (addr > 0 && !found) {
                // first valid region: remember where it starts
                found = true;
                *baseaddr = addr;
            }
        }
        // addr is now the end of the last valid region
        *endaddr = addr;
        return KERN_SUCCESS;
    }

The only parts of the above code that we need to focus on are task, addr, and size. The other vm_region_64() fields are not useful for what we're currently doing.

task is simply a send right to the task port we're searching in. For simplicity I will be using "task" and "send right to the task port" interchangeably. It doesn't matter too much what task we input for getting the searchable region because the range will mostly be the same, but to keep this blog post simple we're using the same task that we're searching in.

How do we get a send right to the task we're using throughout this process? There are a few ways, depending on what's needed. To get the self task, assuming you're working with a dynamic library linked inside of a process / app or your own individual process, it's fairly simple:

    /*
       get a send right of the task port of our own task
       note: mach_task_self() returns the send right to your task port
    */
    mach_port_t task = mach_task_self();

We don't need the task_for_pid-allow entitlement for getting the send right to our own task port. If we need a send right to a separate task, we'll need the task_for_pid-allow entitlement. Unfortunately, to sign a binary with that entitlement, we need to be jailbroken and sign it with ldid or have some sort of exploit. I was seriously considering bundling the RevelariOS app into a sideloadable IPA using the Psychic Paper exploit by Siguza, but that's for another day. Anyway, here's the code using task_for_pid():

    // check if code is not run as root
    if (geteuid() && getuid()) {
        exit(-1);
    }

    mach_port_t task = MACH_PORT_NULL;
    kern_return_t kret; // return to determine if tfp() succeeded
    pid_t pid = 1; // process id (pid) 1 is launchd. Just use some sort of input method for your desired pid.

    kret = task_for_pid(mach_task_self(), pid, &task);
    if (kret != KERN_SUCCESS) {
        exit(-1);
    }

Finally, we can move onto vm_address_t addr.

vm_address_t addr is the address of the first available valid region given to us by vm_region_64(). You may have noticed the & operator. I could be here forever explaining the & operator, so if you don't know what & means google "address of operator in C". Basically, it allows vm_region_64() to use the value of addr and change it. Once the function returns it'll spit out a new value for addr which we can use. I'll be referring to & as an in-pointer in these functions going forward.

vm_size_t size is an in-pointer which comes out as the size of the region at addr. As explained above, once vm_region_64() is executed a new value for size will be in place assuming it returns successfully.

If we input addr = 0 and size = 0 into vm_region_64(), we can get the address of the first valid region and its size. Once the function executes, let's say that addr = 0x10086c000 and size = 0x4000. This means that the region found at 0x10086c000 is 0x4000 bytes long. Since 0x10086c000 is our first valid region, we'll set *baseaddr to addr (0x10086c000).

So how is this useful? We can add addr += size and loop through the code using an infinite while (1) loop. Since we're adding size to addr, the new addr value will be the start of the next valid region. This process is repeated until vm_region_64() doesn't return successfully.

The size of the region found by vm_region_64() can be variable and is often fairly large. For a searchable range of 0x10086c000 - 0x280000000, vm_region_64() will only run a few dozen times. Once we ask for a region at an address with nothing mapped at or above it, which in this example happens when addr reaches 0x280000000, vm_region_64() will return KERN_INVALID_ADDRESS and we will be kicked out of the while loop by hitting the break. Since 0x280000000 is the end of the last valid region, *endaddr will be set to addr (0x280000000). The function will now return with our *baseaddr and *endaddr found.
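To see the addr += size walk without needing a device, here is a minimal, portable sketch. It is not SearchKit code: mock_region_t, mock_vm_region() and mock_get_region_bounds() are hypothetical names, and the lookup only imitates the one behavior of vm_region_64() that the walk relies on (return the region containing or following the address), assuming the region table is sorted by start address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped region: [start, start + size). */
typedef struct {
    uint64_t start;
    uint64_t size;
} mock_region_t;

/*
 * Imitates the lookup: find the first region that contains *addr or begins
 * after it, then overwrite *addr and *size with that region's start and
 * length. Returns 0 on success and -1 when nothing is mapped at or above
 * *addr (our stand-in for KERN_INVALID_ADDRESS).
 */
int mock_vm_region(const mock_region_t *regions, size_t count,
                   uint64_t *addr, uint64_t *size)
{
    assert(regions != NULL && addr != NULL && size != NULL);
    for (size_t i = 0; i < count; i++) {
        if (regions[i].start + regions[i].size > *addr) {
            *addr = regions[i].start;
            *size = regions[i].size;
            return 0;
        }
    }
    return -1;
}

/* Same walk as get_region_size(): hop region to region until lookup fails. */
int mock_get_region_bounds(const mock_region_t *regions, size_t count,
                           uint64_t *baseaddr, uint64_t *endaddr)
{
    uint64_t addr = 0, size = 0;
    int found = 0;

    for (;;) {
        addr += size;                 /* first byte past the previous region */
        if (mock_vm_region(regions, count, &addr, &size) != 0)
            break;                    /* nothing mapped at or above addr */
        if (!found) {
            found = 1;
            *baseaddr = addr;         /* start of the first region */
        }
    }
    if (!found)
        return -1;
    *endaddr = addr;                  /* end of the last region */
    return 0;
}
```

Feeding it three made-up regions, the first starting at 0x10086c000 and the last ending at 0x280000000, gives back a base of 0x10086c000 and an end of 0x280000000, matching the walk-through above.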

Pt. 2 - Reading our searchable region with vm_read_overwrite()

When reading back iOS memory in the searchable region, we'll need another logical start and end point. If we read back one byte at a time throughout our searchable region, the entire read process will take hours. If we don't pick the correct bounds to read back, the input that we're searching for may be cut off in between reads. So how can we read memory properly?

The easiest and most efficient way to start reading through iOS memory is to read back one memory page at a time. Starting on A7 and A8 devices, memory pages have a 16 KB (0x4000) size. We can read back a memory page as an array of bytes and compare our input to the read result one byte at a time. Although reading a byte at a time is slow, comparing a byte at a time isn't. If our input isn't found, we can just advance the read address by one page size (0x4000) and restart the process. baseaddr always starts at a number divisible by the page size, so every read lines up with exactly one page. Note that alignment alone doesn't stop a match from straddling two pages; the thinned-down snippet below only finds matches that sit entirely inside one page.
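The page arithmetic is easy to check by hand. The helpers below are a small sketch of it, assuming the page size is a power of two; page_floor() and page_reads_needed() are hypothetical names and are not part of SearchKit.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to the start of the page that contains it.
   page_size must be a power of two (0x4000 in the running example). */
uint64_t page_floor(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Number of page-sized reads needed to cover [baseaddr, endaddr). */
uint64_t page_reads_needed(uint64_t baseaddr, uint64_t endaddr,
                           uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    if (endaddr <= baseaddr)
        return 0;
    uint64_t first = page_floor(baseaddr, page_size);
    uint64_t last = page_floor(endaddr - 1, page_size);
    return (last - first) / page_size + 1;
}
```

With the example range from Pt. 1 and a 0x4000 page size, page_reads_needed(0x10086c000, 0x280000000, 0x4000) comes out to 392,677 page reads, compared with roughly 6.4 billion reads if we went one byte at a time.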

Here's a snippet of the function for searching:

    #include <mach/mach.h>
    #include <stdint.h>
    #include <unistd.h>

    typedef int search_t;
    typedef unsigned char byte_t;
    typedef uint8_t result_t;

    #define SEARCH_SUCCESS 0
    #define SEARCH_FAILURE 1
    #define READ_PAGE_SIZE getpagesize()

    search_t search_data(mach_port_t task, vm_address_t baseaddr, vm_address_t endaddr, vm_address_t *outaddr, uint8_t cmpbyte[100]) {
        size_t bytes = READ_PAGE_SIZE;
        byte_t readOut[READ_PAGE_SIZE];
        kern_return_t kret;
        int accuracy = 0;
        unsigned long scannum = 20;

        for (; baseaddr < endaddr; baseaddr+=READ_PAGE_SIZE) {
            // the call below writes the byte count back into bytes, so reset it
            bytes = READ_PAGE_SIZE;
            // read one page of the target task into readOut
            kret = vm_read_overwrite(task, baseaddr, bytes, (vm_offset_t) &readOut, (vm_size_t*) &bytes);
            int i;
            for (i=0; i < READ_PAGE_SIZE; i++) {
                if (kret != KERN_SUCCESS) {
                    break;
                }
                accuracy = 0;
                if (cmpbyte[0] == readOut[i]) {
                    accuracy++;
                    for (int j=(i+0x1); j<READ_PAGE_SIZE; j++) {
                        if (cmpbyte[accuracy] == (uint8_t) readOut[j]) {
                            accuracy++;
                        } else {
                            break;
                        }
                        if ((unsigned long) accuracy == scannum) {
                            // match address = page base + offset into the page
                            // (plain integer math; the old pointer cast scaled i)
                            *outaddr = baseaddr + i;
                            return SEARCH_SUCCESS;
                        }
                    }
                }
            }
        }
        return SEARCH_FAILURE;
    }

This function has been thinned down; the full version of the search function is available on the GitHub repository for SearchKit / RevelariOS. The only major difference between the code snippet above and the real function is that we search for 255 total addresses and that the input formatter hasn't been removed.

Let's start by looking at the code from the beginning. The typedefs result_t (uint8_t), byte_t (unsigned char) and search_t (int) are simply added for readability.

READ_PAGE_SIZE getpagesize() is the page size of the device. As mentioned above, for my device the page size is 0x4000.

Since the above code is abbreviated and only an example, I set unsigned long scannum to 20. That just assumes that our input length (uint8_t cmpbyte[100]) is 20 bytes. This is normally handled by getting the length of the input through the formatter in the real function.

byte_t readOut[READ_PAGE_SIZE] is where we're storing the memory page we read, and size_t bytes is the number of bytes we'll be reading (one memory page).

int accuracy = 0 is a counter for how many consecutive bytes read back match cmpbyte[]. Since we loop through readOut[] one byte at a time and compare it to cmpbyte[], we increment accuracy when the bytes match and set it back to 0 when they don't.

We start our for loop by beginning at baseaddr and looping all the way to endaddr. If our comparisons inside of the for loop are not met, we add READ_PAGE_SIZE to baseaddr and start searching at the beginning of the next memory page.

Next, we run vm_read_overwrite. Here is the function declaration, courtesy of the MIT Documentation:

    kern_return_t vm_read_overwrite (vm_task_t target_task,
                                     vm_address_t address,
                                     vm_size_t size,
                                     pointer_t data_in,
                                     target_task data_count);

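Previously in the blog post I described what goes into target_task (or task) and its meaning, so let's move onto address. We enter baseaddr into the address field to signify that we're starting our read at that page of memory. size is where we enter our variable bytes - that's just the size of the region we're reading, which, once again, is one memory page. data_in is an in-pointer to the array readOut where our read bytes will be stored. Lastly, data_count receives the number of bytes actually read; the snippet passes &bytes for it. (The declaration above lists that argument's type as target_task, which looks like a typo; the header in the XNU source declares it as vm_size_t *outsize.)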

Next, we enter the for loop with the iterator i that increases by one until it reaches READ_PAGE_SIZE. If vm_read_overwrite() fails and doesn't equal KERN_SUCCESS, we break the for loop so we don't waste time comparing garbage data.

Inside of the same loop we set accuracy to 0 and compare the first byte of our input (cmpbyte[0]) to the byte at readOut[i]. If the values are equal, accuracy is increased by 1 and we enter another for loop with the iterator j. This iterator is set to the value of i+1 so we don't accidentally compare the same bytes twice. Immediately after initializing the loop, cmpbyte[accuracy] is compared to readOut[j]. If the values are equal, accuracy is increased by 1 once again. If they're not equal, we break out of the loop and go back to comparing bytes. This loop will continue until accuracy equals scannum, which in this example is 20. Once they're both equal, we know that our input has been found successfully and that *outaddr can be set to baseaddr+i, which is the address of where the input matched an array of bytes read from memory. The function then returns SEARCH_SUCCESS. If no matching bytes are found, the function returns SEARCH_FAILURE.
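The comparison step can be tried on any machine by running it over a plain byte buffer instead of a page read from another task. This is a simplified sketch rather than the SearchKit function: find_in_page() is a hypothetical name, the pattern length is passed in instead of being fixed at 20, and memcmp() stands in for the inner j loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Look for `pattern` inside one page-sized buffer.
 * Returns 0 and stores the offset of the first match in *out_offset,
 * or returns 1 if the pattern doesn't occur entirely within the buffer.
 */
int find_in_page(const uint8_t *page, size_t page_len,
                 const uint8_t *pattern, size_t pattern_len,
                 size_t *out_offset)
{
    assert(page != NULL && pattern != NULL && out_offset != NULL);
    if (pattern_len == 0 || pattern_len > page_len)
        return 1;

    /* stop early enough that the whole pattern still fits in the page */
    for (size_t i = 0; i + pattern_len <= page_len; i++) {
        /* cheap first-byte check, then compare the rest in one call */
        if (page[i] == pattern[0] &&
            memcmp(page + i, pattern, pattern_len) == 0) {
            *out_offset = i;
            return 0;
        }
    }
    return 1;
}
```

When this returns 0 for a page read at baseaddr, the matching address is baseaddr plus the stored offset, the same value the function above writes to *outaddr.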

Conclusion (pt. 3) - RevelariOS and using the result addresses

So we did all of this work, but for what? What usage does a bunch of addresses found in memory actually have?

For usage, let's look at RevelariOS, which is SearchKit wrapped inside of a tool I developed. RevelariOS is a command line utility and an iOS application for searching memory. With it, we can find information in memory by searching through the command line or inside of the app. Trying to find instances of a string in memory? RevelariOS can do that. Want to find pointers that hold a specific address? RevelariOS can do that. As mentioned before, the full version of the function search_data supports up to 255 result addresses. That number can be increased to 65536, but it is only useful in some scenarios. For example, holding that many addresses means that we can track down every instance of a loaded mach-o binary in memory by searching for 0xfeedfacf, the mach-o magic number. Just make sure that the header is in Little Endian! Finding where binaries are loaded can be used for dumping applications, for example. The only real limitation of RevelariOS / SearchKit is your imagination. I've used it countless times in my own research.
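Here is a rough sketch of those two ideas together: turning 0xfeedfacf into the byte order it has in little-endian memory, and collecting more than one result address. The names u32_to_le_bytes(), collect_matches() and MAX_RESULTS are made up for this example and don't come from SearchKit; the 255 cap simply mirrors the limit mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RESULTS 255   /* mirrors the 255-result limit mentioned above */

/* Write a 32-bit value as the four bytes it occupies in little-endian
   memory, so 0xfeedfacf becomes cf fa ed fe. */
void u32_to_le_bytes(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value & 0xff);
    out[1] = (uint8_t)((value >> 8) & 0xff);
    out[2] = (uint8_t)((value >> 16) & 0xff);
    out[3] = (uint8_t)((value >> 24) & 0xff);
}

/*
 * Record the address of every occurrence of `pattern` in `buf`, up to
 * MAX_RESULTS. `buf_addr` is the address the buffer was read from.
 * Returns the number of addresses stored in `results`.
 */
size_t collect_matches(const uint8_t *buf, size_t buf_len, uint64_t buf_addr,
                       const uint8_t *pattern, size_t pattern_len,
                       uint64_t results[MAX_RESULTS])
{
    size_t found = 0;

    assert(buf != NULL && pattern != NULL && results != NULL);
    if (pattern_len == 0 || pattern_len > buf_len)
        return 0;

    for (size_t i = 0; i + pattern_len <= buf_len && found < MAX_RESULTS; i++) {
        if (memcmp(buf + i, pattern, pattern_len) == 0)
            results[found++] = buf_addr + i;
    }
    return found;
}
```

Searching a buffer for the four bytes produced by u32_to_le_bytes(0xfeedfacf, ...) returns one address per mach-o header found in it, until the results array is full.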

RevelariOS was developed to show off the power of SearchKit. It can also read / write memory and pause the task we're searching, making it a multipurpose utility. If you made it this far, I recommend checking out the GitHub repository for RevelariOS. You might find it useful yourself or even develop it further. Innovation breeds from sharing ideas. :)