I recently was brutally nerdsniped into developing a strange Rust library that turns prints into uwu-speak. I briefly considered writing a proc macro but that was far too memory safe. It's doing-bad-things-to-binaries time! (followed shortly by uwu time~!!)

I am going to use Linux because it's the platform I'm most comfortable doing terrible things to.

I thought of a few strategies including inserting a breakpoint on the write(2) routine in libc, but I figured I'd have to get the symbol anyway, so messing with dynamic linking is probably the best strategy.

On my machine, Rust executables handle dynamically linked symbols primarily through the .rela.dyn section. Each entry in this table stores an offset from the base of the process image to a function pointer, and calls to the function go indirectly through that pointer:

(gdb) si
0x000055555558126e      9           unsafe { libc::write(1, s.as_ptr() as *const c_void, s.to_bytes().len()) };
   0x0000555555581267 <_ZN7pwintln4main17hef045d1a4d1daed3E+23>:        48 8d 35 ea 7d 0e 00    lea    rsi,[rip+0xe7dea]        # 0x5
=> 0x000055555558126e <_ZN7pwintln4main17hef045d1a4d1daed3E+30>:        ba 14 00 00 00  mov    edx,0x14
   0x0000555555581273 <_ZN7pwintln4main17hef045d1a4d1daed3E+35>:        bf 01 00 00 00  mov    edi,0x1
   0x0000555555581278 <_ZN7pwintln4main17hef045d1a4d1daed3E+40>:        ff 15 72 5c 17 00       call   QWORD PTR [rip+0x175c72]        # 0x5555556f6ef0

This form of the call instruction, for those who are unfamiliar, dereferences the pointer [rip + 0x175c72] then calls the resulting address. So, if we want to redirect execution, we can replace the address in memory at 0x5555556f6ef0 with a pointer to our own function!
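To make the arithmetic concrete, here is a minimal sketch of how that operand resolves. The helper name rip_relative_target is made up for illustration. The displacement is relative to the address of the next instruction, so the instruction length (6 bytes for this call) has to be added in as well.

```rust
/// Address read by a RIP-relative memory operand: the displacement is added
/// to the address of the *next* instruction, not the current one.
fn rip_relative_target(insn_addr: u64, insn_len: u64, disp: i32) -> u64 {
    insn_addr
        .wrapping_add(insn_len)
        // Sign-extend the 32-bit displacement before adding it.
        .wrapping_add(disp as i64 as u64)
}
```

Feeding it the call from the listing (address 0x555555581278, length 6, displacement 0x175c72) gives 0x5555556f6ef0, the same address gdb shows in its comment.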

The way the dynamic linker knows where to put this pointer is by looking it up in the relocations table, which you can see with readelf -r. In particular, we find that 0x5555556f6ef0 = PROG_BASE + 0x1a2ef0.

dev/pwintln » readelf -r target/release/pwintln
Relocation section '.rela.dyn' at offset 0x11a0 contains 7458 entries:
   Offset          Info           Type           Sym. Value    Sym. Name + Addend
 00000017e9c0  000000000008 R_X86_64_RELATIVE                    f94e0
 00000017e9c8  000000000008 R_X86_64_RELATIVE                    2d160
 00000017e9d0  000000000008 R_X86_64_RELATIVE                    2d110
                       < ... ... ... ... ... ... >
0000001a2ea8  004400000006 R_X86_64_GLOB_DAT 0000000000000000 pthread_mutexattr_init@GLIBC_2.2.5 + 0
0000001a2ec0  004500000006 R_X86_64_GLOB_DAT 0000000000000000 pthread_key_create@GLIBC_2.2.5 + 0
0000001a2ee8  004600000006 R_X86_64_GLOB_DAT 0000000000000000 pthread_mutex_destroy@GLIBC_2.2.5 + 0
0000001a2ef0  004700000006 R_X86_64_GLOB_DAT 0000000000000000 write@GLIBC_2.2.5 + 0
0000001a2f28  004900000006 R_X86_64_GLOB_DAT 0000000000000000 sigaltstack@GLIBC_2.2.5 + 0
0000001a2f40  004a00000006 R_X86_64_GLOB_DAT 0000000000000000 pthread_mutex_unlock@GLIBC_2.2.5 + 0
0000001a2f48  004b00000006 R_X86_64_GLOB_DAT 0000000000000000 memcpy@GLIBC_2.14 + 0
0000001a2f68  004c00000006 R_X86_64_GLOB_DAT 0000000000000000 open@GLIBC_2.2.5 + 0
0000001a2f88  004d00000006 R_X86_64_GLOB_DAT 0000000000000000 mmap@GLIBC_2.2.5 + 0
0000001a2f98  004e00000006 R_X86_64_GLOB_DAT 0000000000000000 _Unwind_SetIP@GCC_3.0 + 0

 Relocation section '.rela.plt' at offset 0x2ccd0 contains 4 entries:
   Offset          Info           Type           Sym. Value    Sym. Name + Addend
 0000001a11d8  001000000007 R_X86_64_JUMP_SLO 0000000000000000 __register_atfork@GLIBC_2.3.2 + 0
 0000001a11e0  001900000007 R_X86_64_JUMP_SLO 0000000000000000 __fxstat64@GLIBC_2.2.5 + 0
 0000001a11e8  002300000007 R_X86_64_JUMP_SLO 0000000000000000 __tls_get_addr@GLIBC_2.3 + 0
 0000001a11f0  004800000007 R_X86_64_JUMP_SLO 0000000000000000 _Unwind_Resume@GCC_3.0 + 0

Here, we poke at the symbol in a different way: first, we get the program base with info proc mappings (the start address on the first line). Then, we look at the memory at PROG_BASE + 0x1a2ef0, interpret the quad-word we find there as a pointer, and look at the disassembly at its target. We find libc code for write(2) here!

dev/pwintln » gdb target/release/pwintln
(gdb) info proc map
process 26676
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
      0x555555554000     0x555555581000    0x2d000        0x0 /home/jade/dev/pwintln/target/release/pwintln
      0x555555581000     0x555555668000    0xe7000    0x2d000 /home/jade/dev/pwintln/target/release/pwintln
      0x555555668000     0x5555556d1000    0x69000   0x114000 /home/jade/dev/pwintln/target/release/pwintln
         < ... ... ... >
(gdb) x/gx 0x555555554000 + 0x1a2ef0
0x5555556f6ef0: 0x00007ffff7ec4f50
(gdb) x/10i 0x00007ffff7ec4f50
   0x7ffff7ec4f50 <write>:      endbr64
   0x7ffff7ec4f54 <write+4>:    mov    eax,DWORD PTR fs:0x18
   0x7ffff7ec4f5c <write+12>:   test   eax,eax
   0x7ffff7ec4f5e <write+14>:   jne    0x7ffff7ec4f70 <write+32>
   0x7ffff7ec4f60 <write+16>:   mov    eax,0x1
   0x7ffff7ec4f65 <write+21>:   syscall
   0x7ffff7ec4f67 <write+23>:   cmp    rax,0xfffffffffffff000
   0x7ffff7ec4f6d <write+29>:   ja     0x7ffff7ec4fc0 <write+112>
   0x7ffff7ec4f6f <write+31>:   ret
   0x7ffff7ec4f70 <write+32>:   sub    rsp,0x28

So, we know what we want to hack and how we want to hack it, but how do we find these pointers exactly? Well, we could consult StackOverflow but the answer is some fairly ugly C.

Rust will mostly save us from much of the uglier pointer code, and the goblin crate makes a lot of the ELF code much more pleasant.

Linux provides us the libc function getauxval(3), which retrieves various bits of information that the kernel's ELF loader thinks we might want to know. The most relevant one for figuring out where our program is loaded is getauxval(AT_PHDR), which gives us the runtime address of our array of Elf64_Phdr structures (the program header table). One of those headers, the PT_PHDR entry, records the table's own virtual address as an offset from the base. We can subtract that offset from the runtime address to get the base of where our executable was loaded.
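A minimal sketch of that calculation, assuming 64-bit Linux. To keep it dependency-free it declares getauxval by hand instead of using the libc crate, and the constants are copied from elf.h. The struct and function names are my own, not the crate's.

```rust
use std::os::raw::c_ulong;
use std::slice;

// Constants from <elf.h>.
const AT_PHDR: c_ulong = 3; // runtime address of the program header table
const AT_PHNUM: c_ulong = 5; // number of entries in that table
const PT_PHDR: u32 = 6; // the header that describes the table itself

unsafe extern "C" {
    // unsigned long getauxval(unsigned long type);
    fn getauxval(kind: c_ulong) -> c_ulong;
}

/// Rust mirror of Elf64_Phdr.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct Elf64Phdr {
    p_type: u32,
    p_flags: u32,
    p_offset: u64,
    p_vaddr: u64,
    p_paddr: u64,
    p_filesz: u64,
    p_memsz: u64,
    p_align: u64,
}

/// Program headers of the running executable, as mapped by the kernel.
fn program_headers() -> &'static [Elf64Phdr] {
    let addr = unsafe { getauxval(AT_PHDR) } as *const Elf64Phdr;
    let count = unsafe { getauxval(AT_PHNUM) } as usize;
    if addr.is_null() {
        return &[];
    }
    // The table stays mapped for the life of the process.
    unsafe { slice::from_raw_parts(addr, count) }
}

/// Load base = where the table actually is, minus where the PT_PHDR entry
/// says it should be relative to the base.
fn load_base() -> Option<usize> {
    let phdrs = program_headers();
    let table = phdrs.iter().find(|p| p.p_type == PT_PHDR)?;
    Some(phdrs.as_ptr() as usize - table.p_vaddr as usize)
}
```

load_base returns None when there is no PT_PHDR entry, which can happen for some statically linked binaries. For a non-PIE executable the result is simply 0.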

Side note: if you want to read preprocessor-infested C code and headers as their concrete representation, you can do something like this:

~ » cpp /usr/include/link.h | grep -B10 Elf64_Phdr
typedef struct
{
  Elf64_Word p_type;
  Elf64_Word p_flags;
  Elf64_Off p_offset;
  Elf64_Addr p_vaddr;
  Elf64_Addr p_paddr;
  Elf64_Xword p_filesz;
  Elf64_Xword p_memsz;
  Elf64_Xword p_align;
} Elf64_Phdr;

Once we have that header, we can calculate memory addresses to other structures in the ELF. I use dyn64::from_phdrs(base: usize, headers: &[ProgramHeader]) which looks for a program header with p_type == PT_DYNAMIC, then uses that address and length to make a slice of Dyn (Elf64_Dyn in C) structures.
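For a feel of what that lookup involves, here is a hand-rolled sketch. It is not goblin's implementation, and the function name dynamic_entries is hypothetical. It assumes the entry count can be taken from the header's p_memsz.

```rust
use std::mem::size_of;
use std::slice;

const PT_DYNAMIC: u32 = 2; // from <elf.h>

/// Rust mirror of Elf64_Phdr.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Elf64Phdr {
    p_type: u32,
    p_flags: u32,
    p_offset: u64,
    p_vaddr: u64,
    p_paddr: u64,
    p_filesz: u64,
    p_memsz: u64,
    p_align: u64,
}

/// Rust mirror of Elf64_Dyn; the C union d_un is flattened into one u64.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Elf64Dyn {
    d_tag: i64,
    d_val: u64,
}

/// Finds the PT_DYNAMIC header and views the memory it describes as a slice.
///
/// # Safety
/// `base + p_vaddr` of the PT_DYNAMIC header must point at `p_memsz` bytes of
/// mapped, aligned `Elf64Dyn` records that outlive the returned slice.
unsafe fn dynamic_entries<'a>(base: usize, phdrs: &[Elf64Phdr]) -> Option<&'a [Elf64Dyn]> {
    let dynamic = phdrs.iter().find(|p| p.p_type == PT_DYNAMIC)?;
    let start = (base + dynamic.p_vaddr as usize) as *const Elf64Dyn;
    let count = dynamic.p_memsz as usize / size_of::<Elf64Dyn>();
    Some(unsafe { slice::from_raw_parts(start, count) })
}
```

In a real process, base would be the load base computed from AT_PHDR and phdrs the program header table found there.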

This is a pile of tagged pointers to various bits related to dynamic linking:

dev/pwintln » readelf -d target/debug/pwintln

Dynamic section at offset 0x4bb6c8 contains 32 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x000000000000000c (INIT)               0x68000
 0x000000000000000d (FINI)               0x3a9424
 0x0000000000000019 (INIT_ARRAY)         0x490f00
 0x000000000000001b (INIT_ARRAYSZ)       16 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x490f10
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x340
 0x0000000000000005 (STRTAB)             0xad0
 0x0000000000000006 (SYMTAB)             0x368
 0x000000000000000a (STRSZ)              1331 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000003 (PLTGOT)             0x4bc908
 0x0000000000000002 (PLTRELSZ)           96 (bytes)
 0x0000000000000014 (PLTREL)             RELA

For whatever reason, when these entries are actually loaded into memory, the addresses in them show up as actual pointers rather than the offsets we see here. In any case, this is how you find the various bits you need next.

We can walk through the Rela table, an array of Elf64_Rela in C storing tuples of (offset, info, addend), to find the relocation for the symbol we're looking for. To pick out a particular entry, such as the one for the write we want to hack, we have to resolve the names of the symbols, so let's get started on that.

The info field is a packed 64 bit integer with the index into the symbol table in the upper 32 bits. The rest of the structure we can just ignore.
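Unpacking that field is a shift and a mask. A small sketch, with a hand-written mirror of Elf64_Rela and method names of my own choosing:

```rust
const R_X86_64_GLOB_DAT: u32 = 6; // from <elf.h>

/// Rust mirror of Elf64_Rela.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
struct Elf64Rela {
    r_offset: u64,
    r_info: u64,
    r_addend: i64,
}

impl Elf64Rela {
    /// Upper 32 bits: index into the dynamic symbol table.
    fn symbol_index(&self) -> usize {
        (self.r_info >> 32) as usize
    }

    /// Lower 32 bits: relocation type, e.g. R_X86_64_GLOB_DAT.
    fn reloc_type(&self) -> u32 {
        (self.r_info & 0xffff_ffff) as u32
    }
}
```

For the write entry in the readelf output above, the info value 0x004700000006 splits into symbol index 0x47 and type 6, which is R_X86_64_GLOB_DAT.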

Once we index into the symbol table (which mysteriously doesn't seem to have any terribly accessible way to get a length for?? This project was intended as a joke so I used more unsafe ✨✨), we can get a symbol record.

These symbol records (Elf64_Sym) have st_name and a bunch of other fields we don't really care about. But, since this is ELF, there's more indirection! The st_name is an offset into the strings table, which is a big packed blob of null-terminated C strings. So, we either use some C string functions or let goblin's Strtab abstraction deal with it for us, to get the actual string.
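If you would rather not pull in goblin for this step, the standard library can do the lookup too. A minimal sketch (the function name symbol_name is made up, and the string table is passed in as a plain byte slice):

```rust
use std::ffi::CStr;

/// Looks up the NUL-terminated name starting at offset `st_name` in the
/// string table. Returns None if the offset is out of range, the string is
/// unterminated, or the bytes are not valid UTF-8.
fn symbol_name(strtab: &[u8], st_name: u32) -> Option<&str> {
    let tail = strtab.get(st_name as usize..)?;
    CStr::from_bytes_until_nul(tail).ok()?.to_str().ok()
}
```

With a toy table like b"\0write\0open\0", offset 1 resolves to "write" and offset 7 to "open". Offset 0 is the empty string, which ELF uses for unnamed symbols.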

Now that we have the string, we can reject all the symbols we aren't looking for.

We have reached the home stretch: the offset of the pointer we were looking for is the offset field of the Rela from earlier, and adding it to the program base gives us the pointer's address.
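Putting the whole walk together, here is a sketch over plain slices. It is a simplification under stated assumptions: the real tables have to be conjured from raw pointers, the symbol table has no handy length, and the names here (find_slot and the struct mirrors) are mine rather than the crate's.

```rust
use std::ffi::CStr;

/// Rust mirror of Elf64_Rela.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
struct Elf64Rela {
    r_offset: u64,
    r_info: u64,
    r_addend: i64,
}

/// Rust mirror of Elf64_Sym (24 bytes, matching SYMENT above).
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
struct Elf64Sym {
    st_name: u32,
    st_info: u8,
    st_other: u8,
    st_shndx: u16,
    st_value: u64,
    st_size: u64,
}

/// Returns the runtime address of the pointer slot whose relocation refers
/// to the symbol called `wanted`.
fn find_slot(
    base: usize,
    relas: &[Elf64Rela],
    syms: &[Elf64Sym],
    strtab: &[u8],
    wanted: &str,
) -> Option<usize> {
    relas.iter().find_map(|rela| {
        // Upper 32 bits of r_info index the symbol table.
        let sym = syms.get((rela.r_info >> 32) as usize)?;
        // st_name is an offset into the string table.
        let tail = strtab.get(sym.st_name as usize..)?;
        let name = CStr::from_bytes_until_nul(tail).ok()?;
        if name.to_bytes() == wanted.as_bytes() {
            Some(base + rela.r_offset as usize)
        } else {
            None
        }
    })
}
```

With the base 0x555555554000 and a write relocation at offset 0x1a2ef0, this returns 0x5555556f6ef0, the slot the call instruction dereferences.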


Replacing the function with our own

This part is much easier. We need to write an extern "C" function in Rust that has the same signature as write(2) in libc (note that at this machine level, there is no type system; so if we mess up, it might crash horribly or do other UB. Fun!)
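The C prototype is ssize_t write(int fd, const void *buf, size_t count). A sketch of the matching Rust shape, assuming Linux where size_t and ssize_t correspond to usize and isize. The names WriteFn and fake_write are illustrative.

```rust
use std::os::raw::{c_int, c_void};

/// Function pointer type with the same ABI and signature as write(2).
type WriteFn = unsafe extern "C" fn(c_int, *const c_void, usize) -> isize;

/// A do-nothing stand-in with the right signature: it claims every byte
/// was written without touching the buffer.
unsafe extern "C" fn fake_write(_fd: c_int, _buf: *const c_void, count: usize) -> isize {
    count as isize
}
```

Assigning the function to a WriteFn variable is a cheap way to have the compiler check the signature on the Rust side. Nothing checks it against what libc's callers expect.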

Once we have this function, we can replace it by putting a pointer to it at the address we found earlier. This might require redoing the memory protection on the page with mprotect(2) to allow reading and writing to it, because "security" or some other similar good idea.
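A sketch of the mprotect dance, assuming Linux. The protection is likely RELRO making the table read-only after startup, though I have not confirmed that for this binary. The constants are copied from the Linux headers, the libc functions are declared by hand to avoid a dependency, and make_writable is a made-up name.

```rust
use std::os::raw::{c_int, c_long, c_void};

// Values from the Linux headers.
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
const SC_PAGESIZE: c_int = 30; // _SC_PAGESIZE

unsafe extern "C" {
    fn mprotect(addr: *mut c_void, len: usize, prot: c_int) -> c_int;
    fn sysconf(name: c_int) -> c_long;
}

/// mprotect only accepts page-aligned addresses, so round down.
fn page_start(addr: usize, page_size: usize) -> usize {
    addr & !(page_size - 1)
}

/// Makes the page containing `slot` readable and writable.
/// Returns true on success.
///
/// # Safety
/// `slot` must point into memory mapped in this process.
unsafe fn make_writable(slot: *mut usize) -> bool {
    let page_size = unsafe { sysconf(SC_PAGESIZE) };
    if page_size <= 0 {
        return false;
    }
    let page_size = page_size as usize;
    let start = page_start(slot as usize, page_size);
    unsafe { mprotect(start as *mut c_void, page_size, PROT_READ | PROT_WRITE) == 0 }
}
```

An aligned 8-byte slot never straddles a page boundary, so changing one page is enough here.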

Store off the address to the real write(2) into some static (bonus points for atomics), and replace the existing pointer.

We can then implement the wrapper using the function pointer we squirreled away, to do whatever nefarious things we want.
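Here is a sketch of the swap and the wrapper, run against a stand-in slot instead of the real table so it is safe to try. Everything named here (REAL_WRITE, hooked_write, install, capture_write, CAPTURED) is hypothetical, and the uwu transform is a toy that just turns r and l into w.

```rust
use std::mem;
use std::os::raw::{c_int, c_void};
use std::slice;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

type WriteFn = unsafe extern "C" fn(c_int, *const c_void, usize) -> isize;

/// Address of the original function, saved before the slot is overwritten.
static REAL_WRITE: AtomicUsize = AtomicUsize::new(0);

/// Replacement: rewrites the buffer, then forwards to the saved original.
unsafe extern "C" fn hooked_write(fd: c_int, buf: *const c_void, count: usize) -> isize {
    let real_addr = REAL_WRITE.load(Ordering::SeqCst);
    if real_addr == 0 || buf.is_null() {
        return -1;
    }
    let real: WriteFn = unsafe { mem::transmute(real_addr) };
    let input = unsafe { slice::from_raw_parts(buf as *const u8, count) };
    let output: Vec<u8> = input
        .iter()
        .map(|&b| match b {
            b'r' | b'l' => b'w',
            b'R' | b'L' => b'W',
            other => other,
        })
        .collect();
    // Same length in and out, so the return value can be passed through.
    unsafe { real(fd, output.as_ptr() as *const c_void, output.len()) }
}

/// Saves the pointer currently in `slot`, then points it at `hooked_write`.
///
/// # Safety
/// `slot` must be valid, writable, and hold a pointer to a `WriteFn`.
unsafe fn install(slot: *mut usize) {
    unsafe {
        REAL_WRITE.store(slot.read(), Ordering::SeqCst);
        slot.write(hooked_write as WriteFn as usize);
    }
}

/// Stand-in for libc's write in this demo: records what it was given.
static CAPTURED: Mutex<Vec<u8>> = Mutex::new(Vec::new());

unsafe extern "C" fn capture_write(_fd: c_int, buf: *const c_void, count: usize) -> isize {
    let bytes = unsafe { slice::from_raw_parts(buf as *const u8, count) };
    CAPTURED.lock().unwrap().extend_from_slice(bytes);
    count as isize
}
```

To try it, put capture_write's address in a local usize, call install on it, and call through the slot: "hello world" comes out the other side as "hewwo wowwd". In the real thing, the slot would be the address found through the relocation walk, made writable first.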

uwu ✨

The crate this post is about is on GitHub here. Have fun in your future binary spelunking adventures!