iOS, LLDB & Introductory Debugging Skills

Take a second to think about a 'low-level language'...C, C++, a few others too? Today we'll gain a little insight into (arguably) the lowest level language there is - assembly.

If you've ever opened a binary in a disassembler (whether to attempt some reverse engineering or out of pure curiosity), what you were looking at was assembly instructions. It'll look something like this:

What you see here are instructions - which can appear differently depending on the architecture, target operating system, and even the compiler which compiled the binary.

Source Code != X86/ARM Instructions

The instructions you see are not the source code, but the result of a compiler's translation of the source code.

The reason I point this out is that not all compilers will produce exactly the same instructions. For example, there are 'obfuscation' compilers, designed specifically to make the compiled binary more difficult to reverse engineer with a disassembler (see Useful Terms), often by adding unnecessary instructions/functions to distract from the real purpose of the binary.

The target architecture matters too: a binary compiled for ARM (see Useful Terms) will use a different 'instruction set' (see Useful Terms) than a binary compiled for X86 (see Useful Terms), so the same software will complete exactly the same task using different low-level instructions.

We describe the process of a binary being compiled in my book, iOS Research & Exploration Volume I. That won't be necessary to follow this article though!

Patching The Binary Isn't Always Suitable...

So...with a disassembler and a little bit of time, we can load a binary, understand its flow of execution, and modify the instructions to our advantage.

Patching the binary prior to installation can cause a variety of issues depending on the protection mechanisms the application has implemented. For example, at the OS level we could have issues with code-signing (see Useful Terms) and having the application actually install. At the application level, we could have issues such as...

  • Shared App Data Storage not functioning (where the application relies on it, and we sign the application with a different Bundle ID). An example of this would be the Google applications, which use shared app storage to maintain your login status across all installed Google applications on your device. A different Bundle ID (used while re-signing) would place the application outside the scope of access to this shared data.
  • Encrypted Strings - it's possible that an application could decrypt the specific value you would like to modify at runtime. Hardcoding a modification would require an understanding of how the values are encrypted/decrypted in order to produce a new valid patched value.
  • Complex Authentication - Some applications, such as banking applications, could implement 'checks' of sorts to ensure that the application has not been modified prior to a successful launch. For example, a banking app could initialise a connection to the remote bank server, sending a data blob (see Useful Terms) comprising various pieces of information about the running application. This could include the iOS Version, Display Resolution, hashes of Strings/resources inside the application, and other values which can be cross-checked server-side to establish whether a device should be 'trusted' to access the service.
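
To make that last point a little more concrete, here's a minimal Python sketch of the idea. Everything in it is hypothetical (the function names, the blob fields and the resource contents are my own inventions for illustration, not how any real banking app works): the app hashes its resources into a blob, and the server compares those hashes with known-good values.

```python
import hashlib
import json


def build_attestation_blob(os_version, resolution, resources):
    """Build a hypothetical integrity blob an app might send to its server."""
    blob = {
        "os_version": os_version,
        "resolution": resolution,
        # Hash each bundled resource so the server can spot modified contents.
        "resource_hashes": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(resources.items())
        },
    }
    return json.dumps(blob, sort_keys=True)


def server_trusts(blob_json, expected_hashes):
    """Server-side cross-check: every reported hash must match the known-good value."""
    reported = json.loads(blob_json)["resource_hashes"]
    return reported == expected_hashes


# Hypothetical resource contents, before and after patching the binary on disk.
original = {"strings.plist": b"license=trial"}
patched = {"strings.plist": b"license=full"}
known_good = {name: hashlib.sha256(data).hexdigest() for name, data in original.items()}

print(server_trusts(build_attestation_blob("13.5", "750x1334", original), known_good))
print(server_trusts(build_attestation_blob("13.5", "750x1334", patched), known_good))
```

The unmodified resources pass the check, while the patched copy produces a different hash and is rejected.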

Patching At Runtime Isn't Perfect Either...

When we patch at runtime, we'll be manipulating values which are in memory (in comparison to modifying the binary on your storage device). This presents an immediate issue...

We need to be absolutely sure of the data we are modifying, whether that's an integer suggesting a license is activated, or a command string which will alter the return value (see Useful Terms) of a function, to ensure we are not modifying anything outside of the data that's of interest to us.

In an ideal situation, we'll launch the binary with knowledge of either a specific memory location where the data is accessed, or where the data/value is initialised (using a static memory address (see Useful Terms) that we'd find using a disassembler).

Just taking note of the static address we find in a disassembler and overwriting the value we are interested in isn't quite enough... There are a few other security features in our way. Address Space Layout Randomization (ASLR) is the one we'll talk about today.

ASLR & Why Static Addresses Won't Work

While your device boots, it'll generate an ASLR 'slide'.

In the case of the iOS Kernel, a new slide is generated each time the device boots, and allows for an extra level of security against malicious actors.

In the case of an iOS Userland Process (the binaries you execute from the terminal are an example) a new slide is generated for each individual process.

Rather than using the static address to reference a certain function/object/data, the device uses the formula 'Static Address + ASLR Slide' to calculate a new address which is used to reference said data.

iOS handles this in the background. As the slide changes, the formula produces a different address for the same function/object/data each time the device is booted (for the kernel) or the process is launched (for a userland process).
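
The arithmetic is simple enough to sketch in a few lines of Python. The static address and slide values below are made up purely for illustration, and the function names are my own. The second function just inverts the formula: if a single runtime address is known, the slide falls out by subtraction.

```python
def runtime_address(static_address, slide):
    """Apply the 'Static Address + ASLR Slide' formula."""
    return static_address + slide


def recover_slide(runtime, static_address):
    """Invert the formula: a known runtime address reveals the slide."""
    return runtime - static_address


# Hypothetical values, for illustration only.
STATIC = 0x100004000

# The same static address lands somewhere different for every slide.
for slide in (0x0, 0x4A8C000, 0x1F2C4000):
    print(hex(slide), "->", hex(runtime_address(STATIC, slide)))
```

Note that a slide of 0 (ASLR disabled) leaves the static address unchanged.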

To 'bypass' ASLR, we need to have the device 'leak' the ASLR slide. If you've observed how a modern jailbreak works, you'll probably see an area where it'll report 'SLIDE: 0xXXXXXXXX' at some point, where the X's will report the unique slide value.

LLDB…?

LLDB is the debugger (see Useful Terms) that we will be using today for a practical example of patching at runtime.

LLDB allows for us to gain an insight into how our application changes state over time... Let's talk a little about what we can actually do using a debugger!

  • We are able to set 'breakpoints' - Breakpoints allow us to pause our application's execution where a certain address/function is hit. This is really useful as we can modify register states at the point where the breakpoint is hit (to modify 'return values', 'arguments' passed to functions, and more).
  • Reading and writing to process memory - This is extremely valuable (and risky) as it allows us to bypass most validation mechanisms (depending on at what point we modify the value). For example, if a value is hashed, and the hash verified during execution, we could set a breakpoint after this validation has completed and modify the value to one of our choosing. Or just patch the return value of the validator function :-)
  • Disassemble instructions - As it says, it'll allow you to disassemble instructions at an address! Sort of like a disassembler within LLDB itself. We're also able to 'step' through instructions and analyse register values at the point of each instruction being hit.
  • Much, much more!

That was quite a mouthful (I confused myself a little on the way), but really what we need to understand is that LLDB is a debugger that allows us to interface with a running process. Let's check it out in action, and everything should make a little more sense.

The Exercise

I've carefully considered an exercise for this blog post. Most users cannot reproduce this specific example with a physical device (as we'll be debugging the kernel, we'd need a prototype iPhone), but you'll be able to very easily transfer these same skills to other processes, including user applications (on iOS and the Mac!), or follow this tutorial using the Corellium Platform!

Our task is as follows...

While connected to an iOS Device over SSH, we can print basic kernel information using the command 'uname -a' - the output will look just like this:

iPhone:~ root# uname -a
Darwin iPhone 19.5.0 Darwin Kernel Version 19.5.0: Tue Apr 28 22:25:04 PDT 2020; root:xnu-6153.122.1~1/RELEASE_ARM64_S8000 iPhone8,1 arm64 N71AP Darwin

Sort of bland...What if we could make it look like this:

iPhone:~ root# uname -a
Darwin iPhone 19.5.0 h4ck3d Kernel Version 19.5.0: Tue Apr 28 22:25:04 PDT 2020; root:xnu-6153.122.1~1/RELEASE_ARM64_S8000 iPhone8,1 arm64 N71AP Darwin

More interesting? I thought so too! A great application of debugging? Not really... But it's a cool demo, and we can transfer the skills elsewhere!

Almost all of the commands and steps I talk about here are applicable to both user application binaries and the kernel, although there's a minor difference in the way we initialise a connection to the Kernel compared to a user binary.

Here's the step for initialising LLDB with a user binary of our choosing:

lldb binary-name-here

Keep in mind that if you execute this on your iOS device, you'll need LLDB installed on the iOS Device via Cydia.

I'll be debugging the iOS Kernel, and so the command we use to initialise is slightly different here and specific to the platform I am using... I'll omit it here to remove any confusion.

Your terminal window should look just like this...

We're in!

We can safely assume that the 'Darwin Kernel Version' text is probably hard-coded somewhere within the Kernel itself, right?

Though user applications don't have individual kernel versions, you'll probably be able to execute most user applications with '-h' or '-v' in the case of a command line application and it'll give basic usage information alongside a version number. We can use this same method to overwrite the version information instead! It’s exactly the same concept.

The first step I'll take is opening the binary using a disassembler on my Mac (Hopper Disassembler) so that I can find the address of this Kernel information text.

With a quick search of the string 'Darwin' using Hopper, we can identify very quickly where the long kernel information string is stored.

In the case of an iPhone 6S iOS 13.5 Kernel, the address of the kernel information is fffffff007032d57.

You'll remember I mentioned the ASLR security feature earlier on and how that would cause issues for us if referencing the static memory address. This applies here, and you'll need to identify the ASLR Slide!

On the device emulation platform I am using, ASLR is disabled by default. On a real iOS Device, you'll have to identify the slide and calculate the new address as per the formula I mentioned earlier on.

Here's an interesting Stack Overflow thread I read on the matter - How Can I Obtain Another Process ASLR Slide.

Now that we have the address of the text, let's use LLDB to read data at that memory address! For this, we use the command 'memory read' followed by the memory address we'd like to read, as follows:

(lldb) memory read fffffff007032d57

I like to shorten this to simply 'mem read'...

You should be presented with 32 bytes of data, starting at that memory address...

(lldb) mem read fffffff007032d57
0xfffffff007032d57: 44 61 72 77 69 6e 20 4b 65 72 6e 65 6c 20 56 65  Darwin Kernel Ve
0xfffffff007032d67: 72 73 69 6f 6e 20 31 39 2e 35 2e 30 3a 20 54 75  rsion 19.5.0: Tu

This default of reading 32 bytes forward is okay for very small amounts of data. What if we want to explore a little further?

Luckily we can use the --count argument in LLDB, which allows us to specify a custom number of bytes to read from the address we specify!

(lldb) mem read fffffff007032d57 --count 100
    0xfffffff007032d57: 44 61 72 77 69 6e 20 4b 65 72 6e 65 6c 20 56 65  Darwin Kernel Ve
    0xfffffff007032d67: 72 73 69 6f 6e 20 31 39 2e 35 2e 30 3a 20 54 75  rsion 19.5.0: Tu
    0xfffffff007032d77: 65 20 41 70 72 20 32 38 20 32 32 3a 32 35 3a 30  e Apr 28 22:25:0
    0xfffffff007032d87: 34 20 50 44 54 20 32 30 32 30 3b 20 72 6f 6f 74  4 PDT 2020; root
    0xfffffff007032d97: 3a 78 6e 75 2d 36 31 35 33 2e 31 32 32 2e 31 7e  :xnu-6153.122.1~
    0xfffffff007032da7: 31 2f 52 45 4c 45 41 53 45 5f 41 52 4d 36 34 5f  1/RELEASE_ARM64_
    0xfffffff007032db7: 53 38 30 30                                      S800
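
If the layout of that output is new to you, each row is an address, then 16 bytes in hex, then the same bytes as ASCII. Here's a rough Python sketch that rebuilds the same layout from raw bytes. The function name is my own, and showing non-printable bytes as '.' is an assumption about the display rather than something taken from the output above.

```python
def hexdump(data, base, width=16):
    """Format bytes as rows of: address, hex bytes, ASCII text."""
    pad = width * 3 - 1  # width of a full hex column, e.g. 47 for 16 bytes
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; anything else becomes '.' (assumption).
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Short final rows are padded so the text column stays aligned.
        lines.append(f"0x{base + offset:016x}: {hex_part:<{pad}}  {text}")
    return "\n".join(lines)


print(hexdump(b"Darwin Kernel Version 19.5.0: Tu", 0xFFFFFFF007032D57))
```

Reading it the other way round is the useful skill: 44 61 72 77 69 6e in the hex column is simply 'Darwin' in the text column.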

Maybe you're thinking 'this makes sense! we'll just use mem write and type our replacement string' - and this is sort of true! There are a couple of considerations...

  • String = HEX - As we're writing raw bytes into memory, we must use the hex representation of each character. Should you learn the hex representation of every ASCII character by heart? Absolutely not! Head to GCHQ CyberChef and convert your ASCII string to hex.
  • New string <= old string - For simplicity and stability, do not replace a value with one longer than the original. Otherwise we risk writing into memory used by other data or functions, which would likely result in a crash if the data at that address becomes invalid.
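
If you'd rather not leave the terminal, both considerations can be sketched in a few lines of Python. This is a helper of my own (the function name is hypothetical), not something LLDB provides: it converts the replacement string to space-separated hex bytes and refuses to produce anything longer than the string being replaced.

```python
def to_lldb_hex(replacement, original):
    """Return space-separated hex bytes for `replacement`, refusing to outgrow `original`."""
    new_bytes = replacement.encode("ascii")
    old_bytes = original.encode("ascii")
    if len(new_bytes) > len(old_bytes):
        raise ValueError(
            f"replacement is {len(new_bytes)} bytes but only "
            f"{len(old_bytes)} are safe to overwrite"
        )
    return " ".join(f"{b:02x}" for b in new_bytes)


print(to_lldb_hex("h4ck3d", "Darwin"))  # prints: 68 34 63 6b 33 64
```

Keep in mind that a shorter replacement only overwrites the bytes you supply, so the tail of the old string stays in place (writing 'abc' over 'Darwin' would leave 'abcwin').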

For clarity - you'll receive an error like this if you attempt to overwrite with ASCII characters.

(lldb) mem write fffffff007032d57 h4ck3d
error: 'h4ck3d' is not a valid hex string value.

Try this (use the To Hex recipe in CyberChef)...

(lldb) mem write fffffff007032d57 68 34 63 6b 33 64

And read the memory address again...

(lldb) mem read fffffff007032d57 --count 32
    0xfffffff007032d57: 68 34 63 6b 33 64 20 4b 65 72 6e 65 6c 20 56 65  h4ck3d Kernel Ve
    0xfffffff007032d67: 72 73 69 6f 6e 20 31 39 2e 35 2e 30 3a 20 54 75  rsion 19.5.0: Tu

Awesome!

Now we can resume the process, which in this case is the kernel, using the lldb command 'continue'

(lldb) continue
    Process 1 resuming

And, as if nothing ever happened...execute uname -a once more!

iPhone:~ root# uname -a
Darwin iPhone 19.5.0 h4ck3d Kernel Version 19.5.0: Tue Apr 28 22:25:04 PDT 2020; root:xnu-6153.122.1~1/RELEASE_ARM64_S8000 iPhone8,1 arm64 N71AP Darwin

And there we have it, our new patched kernel information!

Register Values!

Although we may have covered overwriting data at a memory address, we didn't cover breakpoints or register values.

Breakpoints allow us to stop the execution at a specific point. Maybe we want the process to pause where a certain string is accessed, or a certain function is called - or maybe just as it's about to return. With breakpoints, that all becomes a reality!

As this article is already quite long, I'll save the majority of this content for a Part 2. However, I'll leave you with some commands to experiment with in the meantime...

step

step allows us, while a process is paused, to step forward one line of code (si steps forward a single ‘instruction’ instead). We'll then be able to observe some potentially changed register values using...

reg read


reg read allows us to dump the values in the 'registers' for the process. This includes memory addresses in use at the time of the dump, the address of the next instruction to be executed (held in the pc register), the 'stack pointer' (sp) and the general purpose registers.

We can also overwrite register values... take this example where we overwrite the pc with 0x41414141 (the byte 0x41, ASCII 'A', repeated four times):

reg write pc 0x41414141

Let's read the register values now using reg read...

(lldb) reg read
General Purpose Registers:
        x0 = 0x0000000000000001
        x1 = 0x0000000000000000
        ...
       x30 = 0xfffffff0071eb484
        sp = 0xffffffe03e9a7fc0
        pc = 0x0000000041414141
      cpsr = 0x600003c4

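and continue...

crashed! The pc now points at 0x41414141, which is not the address of any real instruction, so execution has nowhere valid to go.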

In fact, this is the aim of a buffer overflow attack: it targets areas of a program which do not implement 'bounds-checking', which means we can purposely write so far into memory that we overwrite function/return pointers and, in turn, control the value of the pc.
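
Python won't let us overflow a real buffer, so the sketch below only simulates the idea with a toy 'stack frame': a 16-byte buffer followed directly by an 8-byte saved return address. The layout, sizes and function names are all assumptions made for illustration (a real arm64 frame is laid out differently), and the starting return address is just the x30 value from the register dump above.

```python
import struct

BUF_SIZE = 16  # hypothetical size of the local buffer


def make_frame(return_address):
    """Toy stack frame: a buffer followed by a little-endian saved return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<Q", return_address))


def unchecked_copy(frame, data):
    """Copy with no bounds-checking: writes run past the buffer if data is too long."""
    for i, byte in enumerate(data):
        frame[i] = byte


def checked_copy(frame, data):
    """The same copy with bounds-checking: oversized input is rejected."""
    if len(data) > BUF_SIZE:
        raise ValueError("input larger than buffer")
    unchecked_copy(frame, data)


def saved_return_address(frame):
    """Read back the value that would eventually be loaded into the pc."""
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]


frame = make_frame(0xFFFFFFF0071EB484)
print(hex(saved_return_address(frame)))  # prints: 0xfffffff0071eb484

# 16 bytes fill the buffer; the next 8 land on the saved return address.
payload = b"A" * BUF_SIZE + struct.pack("<Q", 0x41414141)
unchecked_copy(frame, payload)
print(hex(saved_return_address(frame)))  # prints: 0x41414141
```

With checked_copy, the same payload is rejected before a single byte is written, which is exactly the protection that 'bounds-checking' provides.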

For more practical exercises for you to work on, I recommend my friend Billy Ellis's Exploit Challenges.

-J