The Case for SMEP – Exploiting a Kernel Vulnerability

By Gal Badishi | September 20, 2013

Suppose you manage to exploit a vulnerability by directing a user to a malicious web site, or by sending someone a specially crafted document. You can execute code at the permission level of the user who was unfortunate enough to get owned, but that user is not an administrator, and you’re feeling greedy. Now what?

If you want administrator access while already running as an unprivileged user on the machine, you need a privilege-elevation vulnerability. Today we are going to discuss a privilege-elevation kernel vulnerability that was presented by Gilad Bakas at Ruxcon 2011. The vulnerability was reported to have been silently fixed by MS in February 2011. Our contributions will be as follows:

  1. We will highlight some differences between the presentation and the disassembled binaries that we work with.
  2. We will dive much deeper into the code and see the exact steps needed for the attack to succeed.
  3. We will not use the exact same way to trigger the vulnerability as was presented in Ruxcon, but rather a variant thereof. We will also not need the page 0 allocation that is described in the presentation.

Unless noted otherwise, the pseudo-code and disassembled code relate to the 32-bit Windows XP’s user32.dll and win32k.sys binaries, v5.1.2600.5512, dated 14/4/2008. The vulnerability affects Windows 7 as well.

Background

Whenever a window is created by using some user32.dll API function, a kernel object that represents that window is created by win32k.sys. This means that “regular” windows, buttons, menus, tooltips, etc. all have a kernel object that represents them. Each window object is created from a specific class. You can create classes of your own, or use built-in classes of the OS. The kernel class object can be described as follows:

The following fields are of interest to us in the context of the exploit:

  1. atomClassName: An ATOM that uniquely identifies the class. Some ATOMs are predefined and reserved by the OS for internal classes.
  2. fnid: A number that is supposed to identify the type of window that will be created from this class. For user classes, this is supposed to be 0. For OS classes (e.g., button, menu, etc.) this should be a non-zero number identifying the window type.
  3. cbwndExtra: The number of bytes to allocate in the kernel beyond the basic WND structure that represents a basic window. This way, both the OS and the programmer can create window classes that are based on a “regular” window, but contain extra data that is needed for working with windows of that class.

The basic internal structure of a window is largely as follows:

The size of this structure is 0xa4 bytes. Whenever a new window object is created in the kernel, the OS allocates sizeof(WND) + cbwndExtra bytes for the window. If the corresponding window class was created by the OS, it’s the OS that knows what to do with the extra bytes, and is in charge of manipulating them as needed. If the window class was registered by the user, the OS has no knowledge of what the extra bytes mean, and the user is responsible for handling them. Usually, in user-defined windows, cbwndExtra is 0.
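As a rough illustration of the allocation rule above, the following Python sketch computes the size of a window object from the 0xa4-byte WND size stated in the text. The function name `window_alloc_size` is made up for this example and is not a win32k.sys symbol; the rejection of negative values is an assumption that mirrors the registration check discussed later in this post.

```python
SIZEOF_WND = 0xA4  # size of the basic WND structure on this XP build, per the text


def window_alloc_size(cb_wnd_extra: int) -> int:
    """Bytes allocated for a window: the WND header plus the class's extra bytes."""
    if cb_wnd_extra < 0:
        # Assumption: negative sizes never reach the allocator.
        raise ValueError("cbWndExtra must not be negative")
    return SIZEOF_WND + cb_wnd_extra


print(hex(window_alloc_size(0)))  # a typical user-defined class with no extra bytes
print(hex(window_alloc_size(4)))  # a class that asks for one extra LONG
```

The point of the sketch is only that the extra bytes sit directly after the WND header, so "offset 0 of the extra bytes" is byte 0xa4 of the object.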

To create classes of your own, you may use the functions RegisterClass or RegisterClassEx. Let’s look at what RegisterClassEx does:

So basically, all the APIs for registering a class end up in a call to RegisterClassExWOW(A/W). The user controls only one parameter to that function, and that’s the first parameter. It turns out that the 3rd parameter is the fnid. Like so:

Here’s an example where RegisterClassExWOWW is called with non-zero fnids:

So we see that the control classes are being registered, and the fnid values are not 0. Let’s take a look at what the array for initializing the controls contains:

We can see that each control class has its own unique fnid, and unlike the user-registered classes, that fnid is not 0. Another interesting thing to note is that every window here has extra bytes for more information other than the regular WND structure.

The function RegisterClassExWOWW is not exported. However, we can easily get to it by dynamically disassembling the code of RegisterClassExW and finding the call to RegisterClassExWOWW. By calling RegisterClassExWOWW directly, we are able to control its 3rd argument – the fnid. But how does this help us?

When a user registers a class and asks for extra bytes, he needs a way to change them later. Since the real window object is in kernel memory, the user cannot access the extra bytes directly, and so must use an API function such as SetWindowLong. SetWindowLong can be used to change values in the WND structure itself (like style), or in the extra bytes that come after it, with one exception: you cannot change the extra bytes allocated by the system for internal classes (fnid != 0). These are considered private bytes and can only be changed by the system. The way the system determines whether we are trying to overwrite the private bytes is by consulting a private table (in a global SERVERINFO structure) that contains the cbWndExtra values for all system fnids. This mechanism allows support for both private bytes and user bytes in the same internal window class.

SetWindowLong in user32.dll eventually calls NtUserSetWindowLong in win32k.sys, which in turn calls the function that does the real work, xxxSetWindowLong. The pseudo-code for the relevant parts of the function goes something like this:

So, in order to successfully set a window’s long beyond the WND structure part, disregarding dialog windows, we must make sure that:

  1. 0 <= nIndex <= pwnd->cbWndExtra - sizeof(LONG).
  2. If pwnd->fnid != 0, then nIndex >= gpsi->cbFnidWndSize[fnid] - sizeof(WND) (or nIndex == 0 and the window is currently being created or destroyed).
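The two conditions above can be sketched as a small Python model. This is not the win32k.sys code: the function name `may_set_extra_long`, the dictionary standing in for gpsi->cbFnidWndSize, and the sample fnid value 0x2A0 are all hypothetical, and dialog windows are ignored just as in the text.

```python
SIZEOF_WND = 0xA4   # size of the basic WND structure, per the text
SIZEOF_LONG = 4


def may_set_extra_long(n_index, cb_wnd_extra, fnid, fnid_wnd_size,
                       creating_or_destroying=False):
    """Return True if a write at nIndex into the extra bytes would be accepted."""
    # Condition 1: the LONG must fit entirely inside the extra bytes.
    if not (0 <= n_index <= cb_wnd_extra - SIZEOF_LONG):
        return False
    # Condition 2: for system classes, the leading private bytes are off limits.
    if fnid != 0:
        private_bytes = fnid_wnd_size[fnid] - SIZEOF_WND
        if n_index < private_bytes:
            return n_index == 0 and creating_or_destroying
    return True


# Hypothetical system class: 4 private bytes followed by 4 user bytes.
table = {0x2A0: SIZEOF_WND + 4}
print(may_set_extra_long(0, 8, 0x2A0, table))  # private bytes
print(may_set_extra_long(4, 8, 0x2A0, table))  # user bytes
```

Note that the second condition trusts the table entry rather than anything stored in the window itself.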

Enough background for now. Let’s go smash some pointers, shall we?

The Vulnerability

The background part has already given us two interesting facts:

  1. We can use RegisterClassExWOWW to supply a non-zero fnid.
  2. xxxSetWindowLong uses values associated with the fnid to determine whether or not we’re trashing private data.

The second point is very important: there can be a disparity between the fnid table holding the assumed size of the object, and sizeof(WND) + pWnd->cbwndExtra, representing the actual size of the object. If we can create such a disparity, we can trick the checking code into thinking that we’re not overwriting private data, but rather just writing some “regular” extra bytes. In order to do so, we’ll need to find a way to modify the fnid table.

RegisterClassExWOWW in user32.dll calls NtUserRegisterClassExWOW in win32k.sys. Eventually, NtUserRegisterClassExWOW does the following:

Note that the structure used for registering the class is checked, and the function fails if either cbClsExtra or cbWndExtra is negative. This is in contrast to what is implied in the Ruxcon presentation, where a negative value is used for cbWndExtra when registering the malicious class. This is one place where we have to diverge from the presentation. Other places will soon follow.

xxxRegisterClassEx calls InternalRegisterClassEx, where we find this piece of code (pcls is the pointer to the class object that we’re currently registering):

Let me put that in an easier-to-digest form:

Which means that if we register a class using a system fnid, the value in the table that saves the assumed size of the corresponding windows will be modified, but the cbWndExtra field in the corresponding system class is going to stay the same. Disparity achieved.
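To make the disparity concrete, here is a toy Python model of the behavior just described. It is a sketch under stated assumptions, not a reconstruction of InternalRegisterClassEx: the `WindowClass` type, the `register_class` helper, the fnid value 0x2A0 and the 8-byte private area are all invented for illustration, and the table update is assumed to be sizeof(WND) plus the new class's cbWndExtra.

```python
SIZEOF_WND = 0xA4          # per the text
EXAMPLE_FNID = 0x2A0       # hypothetical system fnid, for illustration only


class WindowClass:
    """Minimal stand-in for a kernel class object."""

    def __init__(self, fnid, cb_wnd_extra):
        self.fnid = fnid
        self.cb_wnd_extra = cb_wnd_extra


def register_class(fnid_wnd_size, fnid, cb_wnd_extra):
    """Create a class; as a side effect, rewrite the shared per-fnid size entry."""
    cls = WindowClass(fnid, cb_wnd_extra)
    if fnid != 0:
        # Assumed behavior: the table entry follows the most recent registration.
        fnid_wnd_size[fnid] = SIZEOF_WND + cb_wnd_extra
    return cls


fnid_wnd_size = {}
system_class = register_class(fnid_wnd_size, EXAMPLE_FNID, 8)  # OS registration
user_class = register_class(fnid_wnd_size, EXAMPLE_FNID, 0)    # later registration

assumed_private = fnid_wnd_size[EXAMPLE_FNID] - SIZEOF_WND
print(assumed_private, system_class.cb_wnd_extra)
```

After the second registration the table claims zero private bytes, while windows created from the system class still carry their original extra bytes.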

To abuse this disparity, we need to do the following:

  1. Find an interesting system class with private data (pclsSystem->cbWndExtra >= gpsi->cbFnidWndSize[fnidSystem] - sizeof(WND) > 0).
  2. Register a class using fnidSystem and pclsUser->cbWndExtra == 0. Now we get that pclsSystem->cbWndExtra > gpsi->cbFnidWndSize[fnidSystem] - sizeof(WND) (which is 0).
  3. Create a window using the system class. The Ruxcon presentation is not explicit about which window class to create (the malformed or the original), but we will later see why we must use the actual system class.
  4. Trash the window’s private data using SetWindowLong with an offset of 0 (or whatever is applicable).
  5. Make use of it.

Note that every once in a while (presumably whenever a GUI thread is created), the global gpsi->cbFnidWndSize[fnidSystem] will be overwritten to revert to its original state. Why someone would want that is beyond me, but in this case it doesn’t really matter to us, because the periods between such resets are very long compared to the time it takes to complete steps 2 to 4.

The Ruxcon presentation suggests overwriting the private data of a menu object. To see why, let’s see some properties of menus.

When win32k.sys is loaded into memory, its DriverEntry function calls Win32UserInitialize, which in turn calls SetupClassAtoms. In there we can find:

The values are taken from here. But what if we want to validate the docs? How do we know that 0x8000 is indeed the menu class atom? Let’s take a look at xxxTrackPopupMenuEx, which actually creates a menu window:

So that’s how we know that. Speaking of creating a menu window, in the menu’s window procedure, xxxMenuWindowProc, we can find something like this:

So the fnid for a menu is 0x29c. Recall from the background that gpsi + 2 * fnid – 0x48c held the size of the menu window object (the one we would like to overwrite). Since FNID_MENU is 0x29c, we get that gpsi + 0xac holds the size of the menu window object. The function InitFunctionTables (also called from Win32UserInitialize) shows us this:
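The offset arithmetic in the paragraph above (separate from the InitFunctionTables listing) can be double-checked with a short Python sketch. The helper name `fnid_size_offset` is made up; the constants come from the text.

```python
FNID_MENU = 0x29C    # fnid of a menu window, per the text
TABLE_BIAS = 0x48C   # constant from the expression gpsi + 2 * fnid - 0x48c


def fnid_size_offset(fnid):
    """Offset from gpsi of the size entry for the given fnid."""
    # The factor of 2 suggests 2-byte entries; that is an inference, not a
    # statement from the original post.
    return 2 * fnid - TABLE_BIAS


print(hex(fnid_size_offset(FNID_MENU)))
```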

So the size of a menu window object is 0xa8. This is sizeof(WND) + 4, which means that a menu window has 4 bytes of private data we can overwrite. Let’s see what this data is:

Where:

Upon creating the new menu window, a WM_NCCREATE message will be sent to the window and handled by xxxMenuWindowProc like so:

So the popup menu itself is allocated upon creation of the window, if it doesn’t yet exist. Upon destruction of the window, the WM_FINALDESTROY message is processed by xxxMenuWindowProc, which calls xxxMNDestroyHandler, passing it a pointer to the popup menu that was created upon receiving WM_NCCREATE and associated with the menu window.

Since our course of action is to overwrite the pointer to the popup menu, we need to look closely at exactly what xxxMNDestroyHandler does. On the one hand, we may find spots that will help us overwrite other values in memory. On the other hand, we must be careful and make sure the values we supply don’t cause the kernel to collapse.

Here is the important information from xxxMNDestroyHandler:

  • If ppopupmenu->spwndNextPopup != NULL it sends an MN_CLOSEHIERARCHY message to either ppopupmenu->spwndPopupMenu or ppopupmenu->spwndNextPopup.
  • If ppopupmenu->spmenu != NULL and an item is selected, it accesses and manipulates values referenced by ppopupmenu->spmenu.
  • If the flag 0x2000 is set in ppopupmenu->dwFlags, it calls _KillTimer(ppopupmenu->spwndPopupMenu, 0xFFFEu), which manipulates values related to ppopupmenu->spwndPopupMenu.
  • If the flag 0x4000 is set, almost the same thing happens, except the argument to _KillTimer is 0xFFFF.
  • If the flag 0x200000 is set and ppopupmenu->spwndNotify != NULL, it sends a WM_UNINITMENUPOPUP message to ppopupmenu->spwndNotify.
  • It sets the flag 0x8000, indicating that the popup menu is destroyed. This is a write operation into arbitrary memory that we can use to our advantage if we want to.
  • If the flag 0x800000 is not set, and ppopupmenu->spwndPopupMenu != NULL, it nulls ppopupmenu->spwndPopupMenu->ppopupmenu. This is a write operation through 2 dereferences. The destination can be controlled by us or not, depending on how we choose to exploit the target.
  • If the flag 0x10000 is not set, it calls MNFreePopup(ppopupmenu).
  • If the flag 0x10000 is set and ppopupmenu->ppopupmenuRoot != NULL, it manipulates values in ppopupmenu->ppopupmenuRoot.
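The flag-driven branches in the list above can be summarized as a small decision model in Python. This is only a reading aid under simplifying assumptions: spwndNextPopup and spmenu are taken to be NULL (so the first two bullets are skipped), the constant names are invented here, and the returned strings merely label the side effects rather than perform them.

```python
# Flag values as described in the text; the constant names are made up.
FLAG_TIMER_A = 0x2000
FLAG_TIMER_B = 0x4000
FLAG_DESTROYED = 0x8000
FLAG_DELAYED_FREE = 0x10000
FLAG_SEND_UNINIT = 0x200000
FLAG_DESKTOP_MENU = 0x800000


def destroy_handler_effects(dw_flags, has_popup_wnd=True,
                            has_notify_wnd=False, has_root=False):
    """List, in order, the side effects the described handler would perform."""
    effects = []
    if dw_flags & FLAG_TIMER_A:
        effects.append("KillTimer(spwndPopupMenu, 0xFFFE)")
    if dw_flags & FLAG_TIMER_B:
        effects.append("KillTimer(spwndPopupMenu, 0xFFFF)")
    if dw_flags & FLAG_SEND_UNINIT and has_notify_wnd:
        effects.append("send WM_UNINITMENUPOPUP to spwndNotify")
    # Unconditional: mark the popup menu as destroyed.
    effects.append("dwFlags |= 0x8000")
    if not dw_flags & FLAG_DESKTOP_MENU and has_popup_wnd:
        effects.append("spwndPopupMenu->ppopupmenu = NULL")
    if not dw_flags & FLAG_DELAYED_FREE:
        effects.append("MNFreePopup(ppopupmenu)")
    elif has_root:
        effects.append("manipulate ppopupmenuRoot")
    return effects


for effect in destroy_handler_effects(0):
    print(effect)
```

Laying the branches out this way makes it easier to see which combinations of flags and pointers lead to which writes.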

So right now we have a mandatory write (OR operation) that is going to happen, plus a NULL write that might happen (through double-dereference), and a call to MNFreePopup that might also happen. Let us see what MNFreePopup does:

  • If ppopupmenu == ppopupmenu->ppopupmenuRoot, it calls MNFlushDestroyedPopups.
  • It unlocks all the windows and menus pointed at by ppopupmenu, if they are not NULL. This includes decrementing cLockObj in head by 1 for every window/menu, and then nulling the appropriate pointer.
  • If the flag 0x800000 is set in ppopupmenu->dwFlags, it nulls ppopupmenu->ppopupmenuRoot.
  • If the flag 0x800000 is not set, it performs another check that is going to fail and then lead you to HeavyFreePool(ppopupmenu). Assuming you have overwritten that pointer with your value, if you get here you are guaranteed a BSOD.

Armed with all this knowledge, we can now go and write our exploit.

Exploitation Details

The images in the Ruxcon presentation make it seem as if an arbitrary overwrite using just one dereference (ppopupmenu) is viable. Unfortunately, we have seen all the hurdles that await us if we try to do it that way. We might access memory regions that we’re not allowed to access, and change kernel memory in ways that we’ll regret afterwards. It is best if we stick to the double-dereference nulling of a pointer through ppopupmenu->spwndPopupMenu. We will direct ppopupmenu to memory completely under our control, and then control the single value that will be nulled. Note that if we get to MNFreePopup, this will also mean that spwndPopupMenu will get “unlocked”, thereby decrementing a value through the spwndPopupMenu pointer. We will use this side effect to our advantage later.

One other thing we want to do is to avoid calling MNFreePopup, the main reason being that in order to reach the double-dereference nulling, we need flag 0x800000 (desktop menu) to not be set. However, if we reach MNFreePopup with that flag not set, it calls HeavyFreePool and we get a BSOD. To avoid this we must set flag 0x10000 (delayed free). So, in principle, we want the entire POPUPMENU structure to be 0, except dwFlags, which should be 0x10000, and spwndPopupMenu, which should point to the area we want to overwrite (spwndPopupMenu->ppopupmenu is going to be nulled).

In this example, we’re going to use the well-known technique of overwriting HalDispatchTable’s entry for NtQueryIntervalProfile. That’s the second entry, meaning 4 bytes from the start of HalDispatchTable. We know that the nulling command is ppopupmenu->spwndPopupMenu->ppopupmenu = NULL. We also know that ppopupmenu in MENUWND comes right after the WND structure, meaning starting at byte 0xa4. Thus, we need spwndPopupMenu to point 0xa4 bytes before the address we want to overwrite. Here is the setup:

The code we will put in page 0 is a simple jump to our C function. The function itself can do a very straightforward procedure like changing the process’s token to have SYSTEM privileges. As our exploit is agnostic to what code we’re going to run, we can just use any privilege elevation code that runs from the kernel (perhaps wrapping it with a stack-preserving function):

And then:

The next thing we want to do is to create the disparity between the size of the fnid window saved in gpsi, and the size reported by the actual menu window class:

After performing these commands, the fnid table is going to show size 0xa4 (== sizeof(WND)) for menu windows, while the menu window class still (correctly) shows cbWndExtra == 4. This will allow us to change one DWORD of data using SetWindowLong at offset 0 of the extra bytes (the ppopupmenu pointer).
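The effect on the private-bytes check can be illustrated with the menu values from the text. The Python below is a toy model of that single comparison, not kernel code, and the helper name `index_passes_private_check` is invented for this example.

```python
SIZEOF_WND = 0xA4  # per the text


def index_passes_private_check(n_index, table_size):
    """True if nIndex lies at or beyond the private bytes implied by the fnid table."""
    return n_index >= table_size - SIZEOF_WND


before = index_passes_private_check(0, 0xA8)  # original entry: 4 private bytes
after = index_passes_private_check(0, 0xA4)   # entry once the table shows sizeof(WND)
print(before, after)
```

With the original entry, offset 0 falls inside the private area and is refused; with the shrunken entry the same offset is treated as ordinary extra bytes.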

Now we need to create a menu window. Calling CreatePopupMenu from user mode doesn’t help much in creating something interesting, as the kernel simply allocates memory for a menu object (see InternalCreateMenu). We need to find a way to actually create a window for a menu. As mentioned earlier, we cannot simply use CreateWindow/CreateWindowEx. If we simply try it with the menu window class (0x8000) we are going to fail.

If we try creating a window out of our malformed class that has an fnid of 0x29c (FNID_MENU), we are not going to get the xxxMenuWindowProc to run at all. First of all, when we register our class we cannot give the address of xxxMenuWindowProc as our window procedure (even if we know the address, RegisterClassExWOWW will just fail). So we must use an address in user space. xxxCreateWindowEx calls MapClientNeuterToClientPfn to determine the window procedure to use. The first argument is a pointer to our class object, and the second should be a default window procedure to use. In our case, xxxCreateWindowEx passes NULL. The function itself is:

So we’re stuck with our own window procedure in user mode, which cannot do anything interesting, let alone manipulate a kernel object directly. We must find a way to create a window from a menu class, with xxxMenuWindowProc as its window procedure, and get a handle to it. Fortunately, as we’ve seen before, xxxTrackPopupMenuEx creates a menu window for us, and it can be called from user mode. Now we just need to get a handle to that window. In order to do that, we use the FindWindow function, supplying it with 0x8000 – the class atom of a menu window. The trick is to know when to do that. We do that exactly when our own window procedure receives the WM_INITMENUPOPUP message.

And our window procedure is:

Where Exploit does this:

And that’s it – we’ve nulled a pointer in the HalDispatchTable. All that’s left is to call the function that will use that pointer to run our code (starting from the trampoline and moving on to the token-changing function). After we do that, we can simply spawn a shell:

We can see the result of running the exploit here (note the user for process 3500):


Some More Creativeness

Microsoft patched the ability to allocate page 0 in April of this year, but on Windows 7, for example, this is enforced by default only on 64-bit systems. But even if you cannot allocate page 0, you don’t really have to use the NULL page to abuse this vulnerability. Here are some constructs you can use:

  1. You can set a pointer to NULL, as we’ve already seen.
  2. You can decrement a number by 1, as demonstrated with the cLockObj field.
  3. You can OR a number with 0x8000, when the popup menu is marked as “destroyed”.
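To see what these three constructs do to a 32-bit value, here is a small Python simulation on a plain bytearray. It touches no real memory; the helper names and the sample values are arbitrary and chosen only for illustration.

```python
import struct


def read_u32(mem, offset):
    """Read a little-endian 32-bit value from the simulated memory."""
    return struct.unpack_from("<I", mem, offset)[0]


def write_u32(mem, offset, value):
    """Write a little-endian 32-bit value, wrapping to 32 bits."""
    struct.pack_into("<I", mem, offset, value & 0xFFFFFFFF)


def null_dword(mem, offset):
    write_u32(mem, offset, 0)


def decrement_dword(mem, offset):
    write_u32(mem, offset, read_u32(mem, offset) - 1)


def or_destroyed_flag(mem, offset):
    write_u32(mem, offset, read_u32(mem, offset) | 0x8000)


mem = bytearray(12)
write_u32(mem, 0, 0x12345678)
write_u32(mem, 4, 0x00000010)
write_u32(mem, 8, 0x00000001)

null_dword(mem, 0)         # construct 1: set a value to NULL
decrement_dword(mem, 4)    # construct 2: decrement by 1
or_destroyed_flag(mem, 8)  # construct 3: OR with 0x8000

print([hex(read_u32(mem, off)) for off in (0, 4, 8)])
```

Each construct changes a value in a small, predictable way, which is why combining them with care can steer a value somewhere other than zero.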

Of course, this means you may be corrupting more pieces of kernel memory (with the obvious implications), but it is surely doable:


Connected to Windows XP 2600 x86 compatible target at (Fri Sep 20 10:45:34.361 2013 (UTC + 3:00)), ptr64 FALSE
Kernel Debugger connection established.
Symbol search path is: SRV*D:\Symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows XP Kernel Version 2600 UP Free x86 compatible
Built by: 2600.xpsp_sp3_gdr.111025-1629
Machine Name:
Kernel base = 0x804d7000 PsLoadedModuleList = 0x805540c0
System Uptime: not available
Single step exception - code 80000004 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
001b:007268c2 8d55b8 lea edx,[ebp-48h]
kd> ba e 1 0x00ffffff
kd> g
Breakpoint 0 hit
00ffffff e9cc1740ff jmp 004017d0
kd> kv
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
b10b6cfc 8063d59d 00000001 0000000c b10b6d14 0xffffff
b10b6d20 8060eb13 00000002 b10b6d64 0012ff60 nt!KeQueryIntervalProfile+0x37 (FPO: [Non-Fpo])
b10b6d54 8053d6d8 00000002 0012ff74 0012ff7c nt!NtQueryIntervalProfile+0x61 (FPO: [Non-Fpo])
b10b6d54 7c90e514 00000002 0012ff74 0012ff7c nt!KiFastCallEntry+0xf8 (FPO: [0,0] TrapFrame @ b10b6d64)
0012ff4c 7c90d84a 0040177c 00000002 0012ff74 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0012ffd0 80544cfd 0012ffc8 81e019e8 ffffffff ntdll!NtQueryIntervalProfile+0xc (FPO: [2,0,0])
00130010 00000000 00000020 00000000 00000014 nt!ExFreePoolWithTag+0x417 (FPO: [Non-Fpo])

Disallowing allocations of page 0 will indeed thwart some attacks, but as we can see, in this case we can easily use higher addresses for our exploit. To effectively close this class of attacks and force attackers to be much more creative, a hardware-assisted solution like SMEP (Supervisor Mode Execution Protection) is necessary.
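The rule SMEP enforces can be stated in a few lines of Python. This is a conceptual model only, with invented parameter names: when SMEP is enabled, an instruction fetch in supervisor mode (CPL below 3) from a page marked as user-accessible raises a fault, regardless of where in user space that page lives.

```python
def instruction_fetch_faults(cpl, page_is_user, smep_enabled):
    """True if SMEP would block this instruction fetch.

    cpl          -- current privilege level (0 = kernel, 3 = user)
    page_is_user -- whether the page holding the code is user-accessible
    smep_enabled -- whether the SMEP control bit is turned on
    """
    return smep_enabled and cpl < 3 and page_is_user


# Kernel jumping to code in a user page, as in the debugger trace above.
print(instruction_fetch_faults(cpl=0, page_is_user=True, smep_enabled=False))
print(instruction_fetch_faults(cpl=0, page_is_user=True, smep_enabled=True))
```

Unlike the page 0 restriction, the check does not depend on the address chosen, only on the page being a user page.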

Cyvera TRAPS obstructs such exploitation techniques and provides advanced exploit-mitigation mechanisms (including hardware-assisted ones) even for operating systems that do not support those mitigations natively.