Shellcode: Resolving API addresses in memory


A basic but core function of all Position Independent Code (PIC) for Windows is resolving the addresses of API functions at runtime. It’s an important task with a number of options available. Here, we’ll examine two popular methods, using the Import Address Table (IAT) and the Export Address Table (EAT), which are by far the most stable for this kind of code.

Since the release of Windows Vista in 2007, Address Space Layout Randomization (ASLR) has been enabled for executables and dynamic link libraries that are specifically linked to be ASLR-enabled, which helps mitigate exploitation of vulnerabilities.

But even long before ASLR arrived, virus writers over 20 years ago faced a similar problem with the unintentional “randomization” of the base address for kernel32.dll.

The first Windows virus, Bizatch, was written by Quantum/VLAD on a beta copy of Windows 95. The virus used hardcoded API addresses and as a result simply crashed on versions of Windows that had a different base address for kernel32.dll.

Mr. Sandman, Jacky Qwerty and GriYo discussed “the kernel32 problem” and “the GetModuleHandle solution” in PE infection under Win32. They weren’t aware at the time of the Process Environment Block (PEB) under NT, which was discussed later by Ratter in Gaining important datas from PEB under NT boxes.

Jacky Qwerty published A GetProcAddress-alike utility, which initially became a “standard” method of resolving API addresses in viruses.

At some point after this, authors started resolving the API by CRC32 checksum, presumably to hide strings of API in their code and also to reduce space.

In 1999, LethalMind showed a way to resolve API using his own checksum in Retrieving API Addresses. Then in 2002 the LSD group proposed their own ARX-based algorithm in WIN32 Assembly components (shellcodes), which was the basis for many win32 shellcodes that followed.

That’s just a brief (potentially inaccurate) historical context of where most of the basic ideas for resolving API came from. Today of course, there are many more advanced challenges to overcome when exploiting vulnerabilities but they are largely related to protection mechanisms and not what I’ll discuss here.

All the structures displayed here can be found in WinNT.h from the Microsoft SDK which should be included with MSVC if you have it installed.

You can find a detailed description of the PE/PE+ format in pecoff.docx.

Image DOS Header

At the start of every PE file we find an MS-DOS executable or a “stub” that makes any PE file a valid MS-DOS executable.

The only field we need here is e_lfanew which, when added to the current base address of the module, gives us a pointer to IMAGE_NT_HEADERS.

// DOS .EXE header
typedef struct _IMAGE_DOS_HEADER {      
    WORD   e_magic;     // Magic number
    WORD   e_cblp;      // Bytes on last page of file
    WORD   e_cp;        // Pages in file
    WORD   e_crlc;      // Relocations
    WORD   e_cparhdr;   // Size of header in paragraphs
    WORD   e_minalloc;  // Minimum extra paragraphs needed
    WORD   e_maxalloc;  // Maximum extra paragraphs needed
    WORD   e_ss;        // Initial (relative) SS value
    WORD   e_sp;        // Initial SP value
    WORD   e_csum;      // Checksum
    WORD   e_ip;        // Initial IP value
    WORD   e_cs;        // Initial (relative) CS value
    WORD   e_lfarlc;    // File address of relocation table
    WORD   e_ovno;      // Overlay number
    WORD   e_res[4];    // Reserved words
    WORD   e_oemid;     // OEM identifier (for e_oeminfo)
    WORD   e_oeminfo;   // OEM information; e_oemid specific
    WORD   e_res2[10];  // Reserved words
    LONG   e_lfanew;    // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

Image NT Headers

Because the base address of a PE image mapped in memory can be “random”, only the Relative Virtual Addresses (RVA) of important structures are saved in the PE file.

To convert an RVA to a Virtual Address (VA) we can use the following macro.

#define RVA2VA(type, base, rva) (type)((ULONG_PTR)(base) + (rva))

Once we add e_lfanew to the base address, we then have a pointer to IMAGE_NT_HEADERS.
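As a rough illustration of that step, here is a minimal portable sketch that works on a plain byte buffer instead of the WinNT.h types. The helper names (rd16, rd32, nt_headers) and the size parameter are my own assumptions for the example and are not part of getapi.c. It only relies on e_lfanew living at offset 0x3C and on the "MZ" and "PE\0\0" signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DOS_MAGIC       0x5A4Du      /* "MZ" */
#define NT_SIGNATURE    0x00004550u  /* "PE\0\0" */
#define E_LFANEW_OFFSET 0x3Cu        /* offset of e_lfanew in IMAGE_DOS_HEADER */

/* Read little-endian values without assuming alignment. */
uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns a pointer to the NT headers inside a
   mapped image of `size` bytes, or NULL if the signatures don't match. */
const uint8_t *nt_headers(const uint8_t *base, size_t size)
{
    uint32_t lfanew;

    assert(base != NULL);
    if (size < E_LFANEW_OFFSET + 4) return NULL;
    if (rd16(base) != DOS_MAGIC) return NULL;

    lfanew = rd32(base + E_LFANEW_OFFSET);   /* RVA of IMAGE_NT_HEADERS */
    if (lfanew > size - 4) return NULL;      /* keep the read in bounds */
    if (rd32(base + lfanew) != NT_SIGNATURE) return NULL;

    return base + lfanew;                    /* same as RVA2VA(base, e_lfanew) */
}
```

Shellcode normally skips these checks to save space. They are shown here only because a buffer of known size makes them cheap to add.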

The following two structures are defined in WinNT.h, but only one is used, depending on the architecture the C code is compiled for.

We’re interested in the OptionalHeader field which contains among other things information about import and export directories.

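typedef struct _IMAGE_NT_HEADERS64 {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;

typedef struct _IMAGE_NT_HEADERS {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;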

Image Optional Header

At the end of Optional Header is an array of IMAGE_DATA_DIRECTORY structures.

// Directory Entries

#define IMAGE_DIRECTORY_ENTRY_EXPORT 0   // Export Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT 1   // Import Directory
// Optional header format.

typedef struct _IMAGE_OPTIONAL_HEADER {
  // Standard fields.

  WORD    Magic;
  BYTE    MajorLinkerVersion;
  BYTE    MinorLinkerVersion;
  DWORD   SizeOfCode;
  DWORD   SizeOfInitializedData;
  DWORD   SizeOfUninitializedData;
  DWORD   AddressOfEntryPoint;
  DWORD   BaseOfCode;
  DWORD   BaseOfData;

  // NT additional fields.

  DWORD   ImageBase;
  DWORD   SectionAlignment;
  DWORD   FileAlignment;
  WORD    MajorOperatingSystemVersion;
  WORD    MinorOperatingSystemVersion;
  WORD    MajorImageVersion;
  WORD    MinorImageVersion;
  WORD    MajorSubsystemVersion;
  WORD    MinorSubsystemVersion;
  DWORD   Win32VersionValue;
  DWORD   SizeOfImage;
  DWORD   SizeOfHeaders;
  DWORD   CheckSum;
  WORD    Subsystem;
  WORD    DllCharacteristics;
  DWORD   SizeOfStackReserve;
  DWORD   SizeOfStackCommit;
  DWORD   SizeOfHeapReserve;
  DWORD   SizeOfHeapCommit;
  DWORD   LoaderFlags;
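  DWORD   NumberOfRvaAndSizes;
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER32, *PIMAGE_OPTIONAL_HEADER32;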

Image Data Directory

Each directory holds an RVA and the size of the directory. To access the export or import directory, simply add the VirtualAddress to the base using the RVA2VA macro.

// Directory format.

typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD   VirtualAddress;
    DWORD   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

  • VirtualAddress
  • RVA of the data structure. For example, if this structure is for import symbols, this field contains the RVA of the IMAGE_IMPORT_DESCRIPTOR array.

  • Size
  • Contains the size in bytes of the data structure referred to by VirtualAddress.
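To make the layout concrete, the following sketch looks up a directory RVA by index from raw bytes. The names data_dir_rva, rd16 and rd32 are hypothetical. The offsets follow the PE format: the optional header starts 0x18 bytes into the NT headers, and the directory array starts 0x60 bytes into a PE32 optional header or 0x70 bytes into a PE32+ one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OPT_HDR_OFFSET  0x18u   /* Signature (4) + IMAGE_FILE_HEADER (20) */
#define MAGIC_PE32      0x10Bu
#define MAGIC_PE32_PLUS 0x20Bu

uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the RVA stored in data directory `index`,
   or 0 if the header type is unknown or the entry doesn't exist. */
uint32_t data_dir_rva(const uint8_t *nt, uint32_t index)
{
    const uint8_t *opt = nt + OPT_HDR_OFFSET;
    uint16_t magic = rd16(opt);
    uint32_t dirs, count;

    if (magic == MAGIC_PE32)           dirs = 0x60;
    else if (magic == MAGIC_PE32_PLUS) dirs = 0x70;
    else return 0;

    count = rd32(opt + dirs - 4);      /* NumberOfRvaAndSizes */
    if (index >= count) return 0;

    /* each IMAGE_DATA_DIRECTORY is 8 bytes: VirtualAddress, Size */
    return rd32(opt + dirs + index * 8);
}
```

For a 32-bit image the export entry therefore sits at NT headers + 0x18 + 0x60 = 0x78, which is the constant that appears in the x86 assembly in this post.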

Image Export Directory

Since exports are first in the list of directories, let’s examine this method of retrieval.

// Export Format

typedef struct _IMAGE_EXPORT_DIRECTORY {
    DWORD   Characteristics;
    DWORD   TimeDateStamp;
    WORD    MajorVersion;
    WORD    MinorVersion;
    DWORD   Name;
    DWORD   Base;
    DWORD   NumberOfFunctions;
    DWORD   NumberOfNames;
    DWORD   AddressOfFunctions;     // RVA from base of image
    DWORD   AddressOfNames;         // RVA from base of image
    DWORD   AddressOfNameOrdinals;  // RVA from base of image
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

We’re interested in 5 fields.

  • Name
  • RVA of a string for DLL name.

  • NumberOfNames
  • The number of exported API by name.

  • AddressOfFunctions
  • RVA to array of RVAs. When each RVA is added to the base address of the module, it gives us the address of an exported API.

  • AddressOfNames
  • RVA to array of RVAs. When each RVA is added to base address of module, it will give us the address of a null terminated string representing an exported API.

  • AddressOfNameOrdinals
  • RVA to array of ordinals. Each ordinal represents an index in AddressOfFunctions array.
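The relationship between the three arrays is easier to see in isolation. The sketch below uses a hypothetical export_view structure that holds the arrays already converted to pointers. It is not a WinNT.h type, and the function names are invented for the example. A lookup by name goes names[i] to ordinals[i] to functions[...], while a lookup by ordinal subtracts Base and indexes functions directly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of an export directory. */
typedef struct {
    uint32_t           base;                /* Base (first ordinal number) */
    uint32_t           number_of_functions; /* NumberOfFunctions */
    uint32_t           number_of_names;     /* NumberOfNames */
    const uint32_t    *functions;           /* AddressOfFunctions */
    const char *const *names;               /* AddressOfNames, as strings */
    const uint16_t    *ordinals;            /* AddressOfNameOrdinals */
} export_view;

/* Returns the function RVA for `name`, or 0 if it isn't exported. */
uint32_t rva_by_name(const export_view *e, const char *name)
{
    uint32_t i;

    for (i = 0; i < e->number_of_names; i++) {
        if (strcmp(e->names[i], name) == 0) {
            uint16_t idx = e->ordinals[i];  /* index into functions[] */
            return idx < e->number_of_functions ? e->functions[idx] : 0;
        }
    }
    return 0;
}

/* Returns the function RVA for an ordinal, or 0 if out of range. */
uint32_t rva_by_ordinal(const export_view *e, uint32_t ordinal)
{
    if (ordinal < e->base) return 0;
    if (ordinal - e->base >= e->number_of_functions) return 0;
    return e->functions[ordinal - e->base];
}
```

With functions = {0x1000, 0x2000, 0x3000}, names = {"Alpha", "Gamma"}, ordinals = {0, 2} and Base = 5, looking up "Gamma" yields 0x3000 and ordinal 6 yields 0x2000. The second function has no name at all, which is why NumberOfNames can be smaller than NumberOfFunctions.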

The following function will retrieve an API address from the export table using CRC-32C of DLL and API name.

The base parameter is the base address of the DLL, and hash is the sum of two CRC-32C hashes: crc32c(DLL string) + crc32c(API string).

LPVOID search_exp(LPVOID base, DWORD hash)
{
  PIMAGE_DOS_HEADER       dos;
  PIMAGE_NT_HEADERS       nt;
  PIMAGE_DATA_DIRECTORY   dir;
  PIMAGE_EXPORT_DIRECTORY exp;
  DWORD                   cnt, rva, dll_h;
  PDWORD                  adr;
  PDWORD                  sym;
  PWORD                   ord;
  PCHAR                   api, dll;
  LPVOID                  api_adr=NULL;

  dos = (PIMAGE_DOS_HEADER)base;
  nt  = RVA2VA(PIMAGE_NT_HEADERS, base, dos->e_lfanew);
  dir = (PIMAGE_DATA_DIRECTORY)nt->OptionalHeader.DataDirectory;
  rva = dir[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;

  // if no export table, return NULL
  if (rva==0) return NULL;

  exp = RVA2VA(PIMAGE_EXPORT_DIRECTORY, base, rva);
  cnt = exp->NumberOfNames;

  // if no api, return NULL
  if (cnt==0) return NULL;

  adr = RVA2VA(PDWORD,base, exp->AddressOfFunctions);
  sym = RVA2VA(PDWORD,base, exp->AddressOfNames);
  ord = RVA2VA(PWORD, base, exp->AddressOfNameOrdinals);
  dll = RVA2VA(PCHAR, base, exp->Name);

  // calculate hash of DLL string
  dll_h = crc32c(dll);

  // walk the name table from last entry to first
  do {
    // calculate hash of api string
    api = RVA2VA(PCHAR, base, sym[cnt-1]);
    // add to DLL hash and compare
    if (crc32c(api) + dll_h == hash) {
      // return address of function
      api_adr = RVA2VA(LPVOID, base, adr[ord[cnt-1]]);
      return api_adr;
    }
  } while (--cnt && api_adr==0);
  return api_adr;
}

One important thing to mention is that this function resolves API neither by ordinal nor through forwarded exports (forwarders), which can sometimes be a problem.
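For readers who want to handle the forwarder case, a possible starting point is sketched below. It relies on the documented rule that an export is a forwarder when its function RVA falls inside the export directory itself, in which case the RVA points at a string such as "NTDLL.RtlAllocateHeap" rather than code. The function names are hypothetical, and splitting at the last dot is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if func_rva lies inside the export directory, meaning the
   entry is a forwarder string rather than code. dir_rva and dir_size
   come from DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT]. */
int is_forwarded(uint32_t func_rva, uint32_t dir_rva, uint32_t dir_size)
{
    return func_rva >= dir_rva && func_rva - dir_rva < dir_size;
}

/* Splits "MODULE.Function" into its two parts. The module name is copied
   into `module` and *function points into the original string.
   Returns 0 on success, -1 on a malformed or oversized string. */
int split_forwarder(const char *fwd, char *module, size_t module_size,
                    const char **function)
{
    const char *dot = strrchr(fwd, '.');
    size_t n;

    if (dot == NULL || dot == fwd || dot[1] == '\0') return -1;

    n = (size_t)(dot - fwd);
    if (n >= module_size) return -1;

    memcpy(module, fwd, n);
    module[n] = '\0';
    *function = dot + 1;
    return 0;
}
```

A resolver would then need to locate or load the named module and repeat the export search there, which is more than most small shellcodes want to carry.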

Here’s some assembly to perform the same thing.

; in:  ebx = base of module to search
;      ecx = hash to find
; out: eax = api address resolved in EAT
search_expx:
    pushad             ; save registers; _eax, _ecx, _edx index this frame
    ; eax = IMAGE_DOS_HEADER.e_lfanew
    mov    eax, [ebx+3ch]

    ; first directory is export
    ; ecx = IMAGE_DATA_DIRECTORY.VirtualAddress
    mov    ecx, [ebx+eax+78h]
    jecxz  exp_l2

    ; eax = crc32c(IMAGE_EXPORT_DIRECTORY.Name)
    mov    eax, [ebx+ecx+0ch]
    add    eax, ebx
    call   crc32c
    mov    [esp+_edx], eax

    ; esi = IMAGE_EXPORT_DIRECTORY.NumberOfNames
    lea    esi, [ebx+ecx+18h]
    push   4
    pop    ecx         ; load 4 RVA
exp_l0:
    lodsd              ; load RVA
    add    eax, ebx    ; eax = RVA2VA(ebx, eax)
    push   eax         ; save VA
    loop   exp_l0

    pop    edi          ; edi = AddressOfNameOrdinals
    pop    edx          ; edx = AddressOfNames
    pop    esi          ; esi = AddressOfFunctions
    pop    ecx          ; ecx = NumberOfNames

    sub    ecx, ebx     ; ecx = VA2RVA(NumberOfNames, base)
    jz     exp_l2       ; exit if no api
exp_l3:
    mov    eax, [edx+4*ecx-4] ; get RVA of API string
    add    eax, ebx           ; eax = RVA2VA(eax, ebx)
    call   crc32c             ; generate crc32 of api string
    add    eax, [esp+_edx]    ; add crc32 of DLL string

    cmp    eax, [esp+_ecx]    ; found match?
    loopne exp_l3             ; --ecx && eax != hash
    jne    exp_l2             ; exit if not found

    xchg   eax, ebx
    xchg   eax, ecx

    movzx  eax, word [edi+2*eax] ; eax = AddressOfOrdinals[eax]
    add    ecx, [esi+4*eax] ; ecx = base + AddressOfFunctions[eax]
exp_l2:
    mov    [esp+_eax], ecx  ; store result (0 if not found)
    popad
    ret

So that’s the basic method to search through exports. Now for the imports which is a little trickier.

Image Import Descriptor

The release of Enhanced Mitigation Experience Toolkit (EMET) by Microsoft in 2009 broke some existing shellcodes that searched the export directory for API.

EMET includes Export Address Table Access Filtering (EAF) and EAF+ since the release of 5.2, both of which serve to block read attempts of the export and import directories originating from modules commonly used to probe memory during the exploitation of vulnerabilities.

Typically, a shellcode using the IAT will resolve addresses for GetModuleHandle and GetProcAddress before resolving the rest by string.

If a PE file imports API from other modules, the import directory will contain an array of image import descriptors, each one representing a module.

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
  union {
    DWORD Characteristics; // 0 for terminating null import descriptor
    DWORD OriginalFirstThunk; // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
  } DUMMYUNIONNAME;
  DWORD TimeDateStamp;        // 0 if not bound,
                              // -1 if bound, and real date\time stamp
                              //  in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                              // O.W. date/time stamp of DLL bound to (Old BIND)

  DWORD ForwarderChain;       // -1 if no forwarders
  DWORD Name;
  DWORD FirstThunk;           // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;

The 3 fields we’re interested in are:

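  • OriginalFirstThunk
  • RVA of an array of IMAGE_THUNK_DATA entries that hold RVAs to the names of the imported functions.

  • Name
  • RVA of a null terminated string naming the module to import API from.

  • FirstThunk
  • RVA of an array of IMAGE_THUNK_DATA entries (the IAT) that hold the actual addresses of the functions once the image is loaded.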

Image Thunk Data

Each descriptor contains RVAs that point to arrays of Image Thunk Data structures. Each entry represents information about one imported API.

typedef struct _IMAGE_THUNK_DATA32 {
    union {
        DWORD ForwarderString;      // PBYTE 
        DWORD Function;             // PDWORD
        DWORD Ordinal;
        DWORD AddressOfData;        // PIMAGE_IMPORT_BY_NAME
    } u1;
} IMAGE_THUNK_DATA32;

In the code, I skip entries that are imported by ordinal.

The AddressOfData from OriginalFirstThunk is an RVA that points to an IMAGE_IMPORT_BY_NAME structure.

The Function field from FirstThunk holds the actual address of the API function we’re searching for.

Import By Name

Since we’re not importing by ordinal, we don’t care about the hint field, just the name which is null terminated API string.

typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD    Hint;
    BYTE    Name[1];
} IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;

  • Hint
  • Contains an index into the export table of the DLL the function resides in. This field is for use by the PE loader so it can look up the function in the DLL’s export table quickly. This value is not essential and some linkers may set the value in this field to 0.

  • Name
  • Contains the name of the import function. The name is an ASCIIZ string. Note that Name’s size is defined as byte but it’s really a variable-sized field. It’s just that there is no way to represent a variable-sized field in a structure. The structure is provided so that you can refer to the data structure with descriptive names.
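Putting the thunk and import-by-name layouts together, the sketch below turns one 32-bit thunk value into a pointer to the API name. The function name import_name32 is made up for this example. It treats a zero thunk as the end of the array, skips ordinal imports by testing the top bit, and otherwise steps over the two-byte Hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ORDINAL_FLAG32 0x80000000u  /* same value as IMAGE_ORDINAL_FLAG32 */

/* Hypothetical helper: given the image base and one entry read from the
   OriginalFirstThunk array, return the imported API name, or NULL if the
   entry is the terminator or an import by ordinal. */
const char *import_name32(const uint8_t *base, uint32_t thunk)
{
    if (thunk == 0) return NULL;               /* end of thunk array */
    if (thunk & ORDINAL_FLAG32) return NULL;   /* import by ordinal  */

    /* thunk is the RVA of IMAGE_IMPORT_BY_NAME: WORD Hint, then Name */
    return (const char *)(base + thunk + 2);
}
```

The "+ 2" is the same adjustment as the lea eax, [eax+ebx+2] instruction in the import search assembly. For PE32+ images the thunk is 64 bits wide and the ordinal flag is bit 63 instead of bit 31.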

The following code will search import address table for API address using CRC-32C hash of DLL and API strings.

LPVOID search_imp(LPVOID base, DWORD hash)
{
  DWORD                    dll_h, i, rva;
  PIMAGE_IMPORT_DESCRIPTOR imp;
  PIMAGE_THUNK_DATA        oft, ft;
  PIMAGE_IMPORT_BY_NAME    ibn;
  PIMAGE_DOS_HEADER        dos;
  PIMAGE_NT_HEADERS        nt;
  PIMAGE_DATA_DIRECTORY    dir;
  PCHAR                    dll;
  LPVOID                   api_adr=NULL;

  dos = (PIMAGE_DOS_HEADER)base;
  nt  = RVA2VA(PIMAGE_NT_HEADERS, base, dos->e_lfanew);
  dir = (PIMAGE_DATA_DIRECTORY)nt->OptionalHeader.DataDirectory;
  rva = dir[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress;
  // if no import table, return
  if (rva==0) return NULL;

  imp = RVA2VA(PIMAGE_IMPORT_DESCRIPTOR, base, rva);

  // for each descriptor, until a match or the null terminator
  for (i=0; api_adr==NULL; i++)
  {
    if (imp[i].Name == 0) return NULL;
    dll   = RVA2VA(PCHAR, base, imp[i].Name);
    dll_h = crc32c(dll);
    rva   = imp[i].OriginalFirstThunk;
    // some linkers leave this empty, so there are no names to hash
    if (rva == 0) continue;
    oft   = (PIMAGE_THUNK_DATA)RVA2VA(ULONG_PTR, base, rva);
    rva   = imp[i].FirstThunk;
    ft    = (PIMAGE_THUNK_DATA)RVA2VA(ULONG_PTR, base, rva);
    for (;; oft++, ft++)
    {
      if (oft->u1.Ordinal == 0) break;
      // skip import by ordinal
      if (IMAGE_SNAP_BY_ORDINAL(oft->u1.Ordinal)) continue;
      rva = (DWORD)oft->u1.AddressOfData;
      ibn = (PIMAGE_IMPORT_BY_NAME)RVA2VA(ULONG_PTR, base, rva);
      if ((crc32c((const char*)ibn->Name) + dll_h) == hash) {
        api_adr = (LPVOID)ft->u1.Function;
        break;
      }
    }
  }
  return api_adr;
}

The assembly follows the same algorithm as above but with some optimizations.

; in: ebx = base of module to search
;     ecx = hash to find
; out: eax = api address resolved in IAT
search_impx:
    xor    eax, eax    ; api_adr = NULL
    pushad             ; save registers; _eax, _ecx, _edx index this frame
    ; eax = IMAGE_DOS_HEADER.e_lfanew
    mov    eax, [ebx+3ch]
    add    eax, 8     ; add 8 for import directory

    ; eax = IMAGE_DATA_DIRECTORY.VirtualAddress
    mov    eax, [ebx+eax+78h]
    test   eax, eax
    jz     imp_l2

    lea    ebp, [eax+ebx]
imp_l0:
    mov    esi, ebp      ; esi = current descriptor
    lodsd                ; OriginalFirstThunk +00h
    xchg   eax, edx      ; temporarily store in edx
    lodsd                ; TimeDateStamp      +04h
    lodsd                ; ForwarderChain     +08h
    lodsd                ; Name               +0Ch
    test   eax, eax
    jz     imp_l2        ; if (Name == 0) goto imp_l2;

    add    eax, ebx
    call   crc32c
    mov    [esp+_edx], eax

    lodsd                 ; FirstThunk
    mov    ebp, esi       ; ebp = next descriptor

    lea    esi, [edx+ebx] ; esi = OriginalFirstThunk + base
    lea    edi, [eax+ebx] ; edi = FirstThunk + base
imp_l1:
    lodsd                 ; eax = oft->u1.Function, oft++;
    scasd                 ; ft++;
    test   eax, eax       ; if (oft->u1.Function == 0)
    jz     imp_l0         ; goto imp_l0
    js     imp_l1         ; oft->u1.Ordinal & IMAGE_ORDINAL_FLAG

    lea    eax, [eax+ebx+2] ; oft->Name_
    call   crc32c           ; get crc of API string

    add    eax, [esp+_edx]  ; eax = api_h + dll_h
    cmp    [esp+_ecx], eax  ; found match?
    jne    imp_l1

    mov    eax, [edi-4]     ; ft->u1.Function
imp_l2:
    mov    [esp+_eax], eax  ; store result (0 if not found)
    popad
    ret

Process Environment Block

Perhaps this part should precede everything else?

Another “advancement” arrived with the publication of Gaining important datas from PEB under NT boxes by Ratter/29A in 2002. There was a better way to obtain base address of KERNEL32.DLL simply by reading it from the PEB.

Here I’m using structures from Matt Graeber’s PIC_Bindshell.

LPVOID getapi (DWORD dwHash)
{
  PPEB                     peb;
  PMY_PEB_LDR_DATA         ldr;
  PMY_LDR_DATA_TABLE_ENTRY dte;
  LPVOID                   api_adr=NULL;

#if defined(_WIN64)
  peb = (PPEB) __readgsqword(0x60);
#else
  peb = (PPEB) __readfsdword(0x30);
#endif

  ldr = (PMY_PEB_LDR_DATA)peb->Ldr;

  // for each DLL loaded
  for (dte=(PMY_LDR_DATA_TABLE_ENTRY)ldr->InLoadOrderModuleList.Flink;
       dte->DllBase != NULL && api_adr == NULL;
       dte=(PMY_LDR_DATA_TABLE_ENTRY)dte->InLoadOrderLinks.Flink)
  {
    // search the import address table
    api_adr=search_imp(dte->DllBase, dwHash);
  }
  return api_adr;
}

The assembly is based on the same algorithm but with some minor optimizations.

; LPVOID get_apix(DWORD hash);
get_apix:
    pushad                 ; save registers (32 bytes)
    mov    ecx, [esp+32+4] ; ecx = hash
    push   30h
    pop    eax

    mov    eax, [fs:eax]  ; eax = (PPEB) __readfsdword(0x30);
    mov    eax, [eax+0ch] ; eax = (PMY_PEB_LDR_DATA)peb->Ldr
    mov    edi, [eax+0ch] ; edi = ldr->InLoadOrderModuleList.Flink
    jmp    gapi_l1
gapi_l0:
    call   search_expx
    test   eax, eax
    jnz    gapi_l2

    mov    edi, [edi]     ; edi = dte->InLoadOrderLinks.Flink
gapi_l1:
    mov    ebx, [edi+18h] ; ebx = dte->DllBase
    test   ebx, ebx
    jnz    gapi_l0
    xchg   eax, ebx       ; not found: eax = NULL
gapi_l2:
    mov    [esp+_eax], eax
    popad
    ret
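The constants 0ch, 0ch and 18h in the assembly come straight from the structure layouts. As a sanity check, the sketch below mirrors only the leading fields of the PEB, PEB_LDR_DATA and LDR_DATA_TABLE_ENTRY with fixed-width integers, so it compiles on any platform. The type names are invented for this example and the 64-bit variants are included for comparison with the __readgsqword(0x60) path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit layouts: pointers stored as uint32_t (hypothetical mirror types) */
typedef struct { uint32_t Flink, Blink; } list_entry32;

typedef struct {
    uint8_t      Flags[4];               /* InheritedAddressSpace..BitField */
    uint32_t     Mutant;
    uint32_t     ImageBaseAddress;
    uint32_t     Ldr;                    /* 0x0C */
} peb32_head;

typedef struct {
    uint32_t     Length;
    uint8_t      Initialized[4];         /* BOOLEAN + 3 bytes padding */
    uint32_t     SsHandle;
    list_entry32 InLoadOrderModuleList;  /* 0x0C */
} ldr_data32_head;

typedef struct {
    list_entry32 InLoadOrderLinks;
    list_entry32 InMemoryOrderLinks;
    list_entry32 InInitializationOrderLinks;
    uint32_t     DllBase;                /* 0x18 */
} ldr_entry32_head;

/* 64-bit layouts: pointers stored as uint64_t, padding made explicit */
typedef struct { uint64_t Flink, Blink; } list_entry64;

typedef struct {
    uint8_t      Flags[8];               /* 4 flag bytes + 4 bytes padding */
    uint64_t     Mutant;
    uint64_t     ImageBaseAddress;
    uint64_t     Ldr;                    /* 0x18 */
} peb64_head;

typedef struct {
    uint32_t     Length;
    uint8_t      Initialized[4];
    uint64_t     SsHandle;
    list_entry64 InLoadOrderModuleList;  /* 0x10 */
} ldr_data64_head;

typedef struct {
    list_entry64 InLoadOrderLinks;
    list_entry64 InMemoryOrderLinks;
    list_entry64 InInitializationOrderLinks;
    uint64_t     DllBase;                /* 0x30 */
} ldr_entry64_head;

/* Returns 1 when the mirrored layouts give the offsets used above. */
int offsets_match(void)
{
    return offsetof(peb32_head, Ldr) == 0x0C
        && offsetof(ldr_data32_head, InLoadOrderModuleList) == 0x0C
        && offsetof(ldr_entry32_head, DllBase) == 0x18
        && offsetof(peb64_head, Ldr) == 0x18
        && offsetof(ldr_data64_head, InLoadOrderModuleList) == 0x10
        && offsetof(ldr_entry64_head, DllBase) == 0x30;
}
```

These are partial mirrors only. The real structures continue past the fields shown, and anything beyond them is not needed for walking the module list.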

Hash algorithm

For both examples, I use the CRC-32C checksum. The C stands for the Castagnoli polynomial. I’ve used it simply because there were no collisions for the 80,000 API names I tested. Some existing hash algorithms provide “good enough” results, but the advantage of using CRC-32C is that it has been supported by Intel CPUs since the introduction of SSE4.2.

It should be clear however that the OR operation of bytes with 0x20 is not part of the CRC-32C specification. This is only here to convert strings to lowercase before hashing. Sometimes kernel32.dll can appear as uppercase so it should be converted to lowercase.

In the Metasploit code, the module is converted to uppercase instead.

uint32_t crc32c(const char *s)
{
  int i;
  uint32_t crc=0;

  do {
    // fold in next byte, forced to lowercase (terminator included)
    crc ^= (uint8_t)(*s++ | 0x20);
    for (i=0; i<8; i++) {
      crc = (crc >> 1) ^ (0x82F63B78 * (crc & 1));
    }
  } while (*(s - 1) != 0);
  return crc;
}
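To produce the value passed to the search functions, the two hashes are simply added. The sketch below repeats the crc32c routine so it stands alone and wraps the addition in a helper. The name api_hash is an assumption of this example and may not match what getapi.c uses.

```c
#include <assert.h>
#include <stdint.h>

/* CRC-32C variant from this post: initial value 0, no final inversion,
   each byte ORed with 0x20, null terminator included in the hash. */
uint32_t crc32c(const char *s)
{
    int i;
    uint32_t crc = 0;

    do {
        crc ^= (uint8_t)(*s++ | 0x20);
        for (i = 0; i < 8; i++) {
            crc = (crc >> 1) ^ (0x82F63B78 * (crc & 1));
        }
    } while (*(s - 1) != 0);
    return crc;
}

/* Hypothetical helper: combined hash of DLL name and API name, with
   unsigned wrap-around on overflow, as compared by the search code. */
uint32_t api_hash(const char *dll, const char *api)
{
    return crc32c(dll) + crc32c(api);
}
```

A value such as api_hash("kernel32.dll", "GetProcAddress") would typically be computed offline and embedded in the shellcode as a constant. Because the OR with 0x20 is applied to the API name as well, "GetProcAddress" and "getprocaddress" hash to the same value.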

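Here’s the code using the built-in crc32 instruction.

    xor    eax, eax          ; eax = 0
    cdq                      ; edx = 0 (crc)
crc_l0:
    lodsb                    ; al = *s++ (esi = s)
    or     al, 0x20          ; convert to lowercase
    crc32  edx, al           ; update crc with al
    cmp    al, 0x20          ; was that the null terminator?
    jne    crc_l0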

Here’s code for CPUs without support for SSE4.2.

; in: eax = s
; out: crc-32c(s)
crc32c:
    pushad                   ; save registers; _eax indexes this frame
    xchg   eax, esi          ; esi = s
    xor    eax, eax          ; eax = 0
    cdq                      ; edx = 0
crc_l0:
    lodsb                    ; al = *s++ | 0x20
    or     al, 0x20
    xor    dl, al            ; crc ^= c
    push   8
    pop    ecx    
crc_l1:
    shr    edx, 1            ; crc >>= 1
    jnc    crc_l2
    xor    edx, 0x82F63B78
crc_l2:
    loop   crc_l1
    sub    al, 0x20          ; until al==0
    jnz    crc_l0    
    mov    [esp+_eax], edx   ; return crc in eax
    popad
    ret

Of course, CRC-32C is not collision resistant. In some cases, you might need to consider using a cryptographic hash algorithm. The smallest I can think of would be CubeHash by Daniel Bernstein.

Although, you could also use a tiny block or stream cipher to encrypt the strings and truncate the ciphertext to 32 or 64-bits. Not sure how collision resistant that would be but it’s worth exploring.
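If you want to check a list of API names for collisions yourself, something along these lines would do. It is only a sketch: count_collisions and cmp_u32 are names I made up, the crc32c routine is repeated so the block is self-contained, and it hashes API names alone rather than the DLL plus API sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

uint32_t crc32c(const char *s)
{
    int i;
    uint32_t crc = 0;

    do {
        crc ^= (uint8_t)(*s++ | 0x20);
        for (i = 0; i < 8; i++) {
            crc = (crc >> 1) ^ (0x82F63B78 * (crc & 1));
        }
    } while (*(s - 1) != 0);
    return crc;
}

/* qsort comparator for uint32_t values */
int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Hashes n names, sorts the hashes and counts adjacent equal pairs.
   Returns (size_t)-1 if memory can't be allocated. */
size_t count_collisions(const char *const *names, size_t n)
{
    uint32_t *h;
    size_t i, collisions = 0;

    if (n < 2) return 0;
    h = malloc(n * sizeof *h);
    if (h == NULL) return (size_t)-1;

    for (i = 0; i < n; i++) h[i] = crc32c(names[i]);
    qsort(h, n, sizeof *h, cmp_u32);

    for (i = 1; i < n; i++) {
        if (h[i] == h[i - 1]) collisions++;
    }
    free(h);
    return collisions;
}
```

Note that names differing only in case, such as "CreateFileA" and "createfilea", count as a collision here because of the OR with 0x20, so the input list should be deduplicated case-insensitively first.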


Parsing the import and export tables isn’t a really difficult task. With all the sources and documentation available, there’s really no excuse to avoid using either in a PIC. Using hardcoded API addresses or looking up by ordinal is a recipe for disaster.

By writing your code in C first and generating assembly output with the /FAs switch of MSVC, parsing in assembly should become much easier to understand.

getapi.c contains code in C to locate API by CRC-32C hash. x86.asm and x64.asm contain the code in assembly to locate API by CRC-32C hash.
