Debugging AWS Lambda Functions with a Reverse Shell

AWS Lambda has become a cornerstone service of the platform, and its introduction nine years ago paved the way for the widespread adoption of serverless architectures.

AWS doesn't provide any facilities for directly accessing Lambdas at runtime: you can't connect to their underlying infrastructure, you can't live-debug them, and you can't run commands in the containers they run in, other than from within the Lambdas themselves. This isolation is often an obstacle to debugging.

Reverse shells are a nifty tool for exactly this type of scenario: you want to run commands on a server that doesn't allow incoming network connections, but on which you can run code that opens outgoing network connections.

Reverse shells are so named because they invert the usual direction. With a standard SSH connection, for example, you initiate a session with a server and run commands there; with a reverse shell, the server initiates the session itself and then lets you run commands on it.

A Reverse Shell in Lambda

The idea to run a reverse shell from a Lambda came to me while I was dealing with IAM issues: once deployed, the Lambda would fail with permission errors. Debugging was a frustrating cycle of redeploying and adding logging until I was able to identify the problem. I wished I could just "connect" to my Lambda and run a few commands to pinpoint the permissions issue, which would have been a much faster debugging experience.

I knew that you can't connect into a Lambda, and I realized that what I was really wishing for was a reverse shell from the Lambda that'd allow me to run code. The idea was simple enough that I jumped straight into a POC:

The POC

The standard formula for a Python-based reverse shell is for the server to initiate an outgoing connection and redirect stdin/stdout/stderr through the socket before launching an interactive shell, like this:

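import os
import socket
import subprocess

s = socket.socket()
s.connect((HOST, PORT))  # HOST and PORT identify the listening server

# Point stdin (0), stdout (1) and stderr (2) at the socket
for fd in (0, 1, 2):
    os.dup2(s.fileno(), fd)

# bash inherits the redirected file descriptors
subprocess.call(["/bin/bash", "-i"])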

Note

Another common pattern is pty.spawn("/bin/bash") - but this doesn't work in a Lambda context, because the Lambda environment doesn't allow for allocation of pseudoterminals.

A handy resource for reverse shells can be found in the Payloads All The Things repository.

For a quick POC, I set up a t3.micro EC2 instance with a public IP that the Lambda could connect to. I gave the instance a security group that allowed incoming traffic on port 2222, and set up a netcat-based listener:

> nc -nlvp 2222
Listening on 0.0.0.0 2222
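If you'd rather script the listener than type into netcat, something along these lines might work. This is a sketch under assumptions, not part of the POC: `run_commands` is a hypothetical helper that takes an already-listening TCP socket, accepts a single reverse-shell connection, sends a fixed list of commands and collects whatever comes back until the peer disconnects or goes quiet. The `idle_timeout` value is an arbitrary choice.

```python
import socket


def run_commands(server, commands, idle_timeout=2.0):
    """Accept one connection on `server`, send `commands`, return the output.

    `server` is an already-listening TCP socket. Reading stops when the peer
    closes the connection or stays silent for `idle_timeout` seconds.
    """
    conn, _addr = server.accept()
    with conn:
        conn.settimeout(idle_timeout)
        for command in commands:
            conn.sendall(command.encode("utf-8") + b"\n")

        chunks = []
        try:
            while chunk := conn.recv(4096):
                chunks.append(chunk)
        except socket.timeout:
            pass  # the peer went quiet; assume the output is complete
    return b"".join(chunks).decode("utf-8", errors="replace")
```

On the EC2 instance, usage could look like `run_commands(socket.create_server(("0.0.0.0", 2222)), ["id", "ls /tmp", "exit"])`, called before the Lambda is invoked. Unlike netcat this is not interactive, so it suits repeatable checks better than exploration.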

And I deployed the following Lambda:

import os
import socket
import subprocess
 
 
HOST = "<my EC2 public IP>"  # placeholder: replace with the listener's address
PORT = 2222
 
 
def lambda_handler(_event, _context):
    s = socket.socket()
    s.connect((HOST, PORT))
 
    for fd in (0, 1, 2):
        os.dup2(s.fileno(), fd)
 
    subprocess.call(["/bin/bash", "-i"])

When I invoked the Lambda, my server received an incoming connection! I had some fun poking around the container. Here's an excerpt:

~> nc -nlvp 2222
Listening on 0.0.0.0 2222
Connection received on 44.203.236.23 38120
bash: no job control in this shell
 
bash-4.2$ ls
ls
lambda_function.py
 
bash-4.2$ ls /
ls /
bin
boot
dev
etc
home
lambda-entrypoint.sh
lib
lib64
media
mnt
opt
proc
root
run
sbin
srv
sys
THIRD-PARTY-LICENSES.txt
tmp
usr
var
 
bash-4.2$ touch test.txt
touch test.txt
touch: cannot touch ‘test.txt’: Read-only file system
 
bash-4.2$ touch /tmp/test.txt # /tmp has write permissions
touch /tmp/test.txt
 
bash-4.2$ echo hello > /tmp/test.txt
echo hello > /tmp/test.txt
 
bash-4.2$ cat /tmp/test.txt
cat /tmp/test.txt
hello

Note

Notice how every command I input is also displayed back to me as part of the command's output? This is an annoyance in reverse shells that stems from interactive bash assuming that it's operating inside a tty - i.e. an interactive terminal - and so it echoes the commands you type back into stdout to display them to you. However, stdout redirects back to the socket, and so we see the commands we sent into the socket come back to us.

With /bin/bash -s - which doesn't assume it's running from inside a terminal, but still reads commands from stdin - we wouldn't have this annoyance, but then the experience would be a little less ergonomic since we wouldn't get interactive behavior from bash (e.g. if we pressed Up we wouldn't be able to run the last command entered).
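A minimal sketch of the `-s` variant follows. One deliberate difference from the POC code: instead of calling `os.dup2` in the handler's own process, it hands the socket straight to `subprocess`, which is a substitution made here for illustration. The helper name `run_plain_shell` is hypothetical.

```python
import socket
import subprocess


def run_plain_shell(sock, shell="/bin/bash"):
    """Run `shell -s` with stdin/stdout/stderr attached to `sock`.

    `-s` makes the shell read commands from stdin without treating the
    session as interactive, so no prompt is printed and nothing is echoed.
    Returns the shell's exit code once the peer sends `exit` or disconnects.
    """
    # subprocess accepts any object with a fileno(), including sockets, so
    # only the child process is redirected; this process's own file
    # descriptors 0, 1 and 2 are left untouched.
    return subprocess.call([shell, "-s"], stdin=sock, stdout=sock, stderr=sock)
```

Inside `lambda_handler` this would replace the `os.dup2` loop and the `subprocess.call` line: connect the socket as before, then call `run_plain_shell(s)`.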

Introducing `pdb` Into the Mix

Now let's demonstrate a cool use case for this reverse shell functionality - live debugging of a deployed Lambda function. We've redirected stdin/stdout/stderr to our server, and in the same way we can use this redirection to run commands in bash, we can use this redirection to run commands in any interactive debugger.

We'll use Python's built-in breakpoint() to demonstrate this:

def debugging_poc():
    flag = 0
    breakpoint()
 
    if flag:
        print("Success")
    else:
        print("Failure")
 
 
def lambda_handler(_event, _context):
    # Same code as before for setting up the reverse shell, but without launching bash
 
    debugging_poc()

The idea is for us to interactively change the state of the running code in the Lambda - we'll want to place a non-zero value in flag to get the "Success" message printed.

And in our server:

~> nc -nlvp 2222
Listening on 0.0.0.0 2222
Connection received on 54.158.53.111 54386
> /var/task/lambda_function.py(13)debugging_poc()
-> if flag:
(Pdb) flag
0
(Pdb) flag = 1
(Pdb) continue
Success

Cool!!!

The Problem with Redirections

It's worth noting that redirecting stdin/stdout/stderr is a heavy hammer: even in a non-interactive Lambda context, the application may well be using those file descriptors itself.
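For the standard-library debugger specifically, one way to sidestep process-wide redirection is to give the debugger its own streams: `pdb.Pdb` accepts `stdin` and `stdout` arguments. The sketch below is an alternative to the redirection used in the POC, not the method used there, and it covers plain `pdb` only, not ipdb or IPython. `socket_pdb` is a hypothetical helper name, and `debugging_poc` is adapted to take the connected socket as a parameter and return the message rather than print it.

```python
import pdb
import socket


def socket_pdb(sock):
    """Return a Pdb instance that reads commands from, and writes to, `sock`."""
    # Separate reader and writer objects: a single "rw" text wrapper can drop
    # buffered input when it is written to.
    reader = sock.makefile("r", encoding="utf-8")
    writer = sock.makefile("w", encoding="utf-8")
    # Passing stdout makes Pdb use these streams instead of input()/print().
    # readrc=False skips any local .pdbrc file.
    return pdb.Pdb(stdin=reader, stdout=writer, readrc=False)


def debugging_poc(sock):
    flag = 0
    socket_pdb(sock).set_trace()  # stands in for breakpoint()

    if flag:
        return "Success"
    return "Failure"
```

Only the debugger's prompt and output travel over the socket here; anything the application itself prints still goes to the real stdout.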

I worked on a sub-POC of patching ipdb and IPython to directly read input and send output to a socket rather than working with stdin/stdout at all. For instance, in IPython's interactiveshell.py (which is used by ipdb), I set up a global variable:

import socket

client_socket = socket.socket()
client_socket.connect(("<HOST IP>", 2222))  # placeholder: the listener's address

And I saw that while IPython has a non-trivial asynchronous command loop by default, it supports an IPY_TEST_SIMPLE_PROMPT environment variable that switches it to a much simpler prompt function, which I patched to work directly with my socket:

def init_prompt_toolkit_cli(self):
    if self.simple_prompt:
        # Fall back to plain non-interactive output for tests.
        # This is very limited.
        def prompt():
            prompt_text = "".join(x[1] for x in self.prompts.in_prompt_tokens())
            # <Daniel> This reads input from the socket
            lines = [client_socket.recv(4096).decode("utf-8")]
            # <Daniel> The following line was commented out by me:
            # lines = [input(prompt_text)]
            prompt_continuation = "".join(x[1] for x in self.prompts.continuation_prompt_tokens())
            while self.check_complete('\n'.join(lines))[0] == 'incomplete':
                lines.append( input(prompt_continuation) )
            return '\n'.join(lines)
        self.prompt_for_code = prompt
        return
 
    ...

Notice that I replaced the call to input - which reads from stdin - with a call to client_socket.recv.

This was enough for IPython's input - but not for ipdb, which has a separate prompt. ipdb calls this function in IPython's interactiveshell.py:

@property
def debugger_cls(self):
    return Pdb if self.simple_prompt else TerminalPdb

The Pdb class is not good for POC purposes because it is a Python built-in, and so is harder to patch in Lambda than a third-party package like IPython. Therefore I patched this to always return TerminalPdb, and made sure TerminalPdb had access to the socket:

@property
def debugger_cls(self):
    TerminalPdb.client_socket = client_socket
    return TerminalPdb
    # return Pdb if self.simple_prompt else TerminalPdb

And then, in IPython's debugger.py, I patched this line of TerminalPdb's cmdloop:

line = input("ipdb> ")

to instead be:

line = self.client_socket.recv(4096).decode("utf-8")

Now, all of this worked beautifully for sending interactive commands from my server to my Lambda function, but the output would still be printed to the Lambda function's stdout, leaving me blind on my server.

Patching output to directly work with a socket rather than with stdout is harder than with input - there are many more locations that print than read, and most of the relevant ones are found in built-in Python modules that are harder to patch.

This took the scope of the work from a "fun Saturday afternoon POC" to a real project, so I decided to leave this for now - for many applications, the stdin/stdout/stderr redirection is good enough. It's also possible to "temporarily" redirect those file descriptors to a socket for a debugging session, and then restore them back to their original state.
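The "temporary" redirection could be sketched as a context manager that saves duplicates of the original file descriptors, points them at the socket, and puts them back afterwards. This is an illustration under assumptions rather than code from the POC; `redirected_fds` is a hypothetical name.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def redirected_fds(target_fd, fds=(0, 1, 2)):
    """Temporarily point `fds` at `target_fd`, then restore the originals."""
    # Flush Python-level buffers so earlier output is not sent to the new target.
    sys.stdout.flush()
    sys.stderr.flush()
    saved = {fd: os.dup(fd) for fd in fds}  # keep a copy of each original
    try:
        for fd in fds:
            os.dup2(target_fd, fd)
        yield
    finally:
        # Flush again so output written during the session reaches the target.
        sys.stdout.flush()
        sys.stderr.flush()
        for fd, backup in saved.items():
            os.dup2(backup, fd)
            os.close(backup)
```

With a connected socket `s`, a debugging session would then be wrapped as `with redirected_fds(s.fileno()): breakpoint()`, and the application gets its original stdin/stdout/stderr back once the block exits.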

Conclusions

Directly connecting into a Lambda is awesome, but it comes at a cost, quite literally: you pay for a Lambda's runtime, so the longer you stay connected and keep it running, the more you'll pay. Moreover, your sessions will be limited by Lambda's 15-minute runtime limit.

Note

In case you're curious like I was - as of writing, in us-east-1 the cheapest Lambda (128MB) runs for $0.0000000021 per ms, and the most expensive (10240MB) runs for $0.0000001667 per ms. This means that a single Lambda invocation that runs for a full 15 minutes will cost you between $0.00189 and $0.15003.
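The arithmetic behind those two figures can be checked with a few lines of Python. The prices are the ones quoted above, taken as given; the names are illustrative, and only the duration charge is covered.

```python
from decimal import Decimal

# Per-millisecond prices quoted above (us-east-1, at the time of writing).
PRICE_PER_MS = {
    128: Decimal("0.0000000021"),
    10240: Decimal("0.0000001667"),
}
MAX_RUNTIME_MS = 15 * 60 * 1000  # Lambda's 15-minute limit in milliseconds


def max_invocation_cost(memory_mb):
    """Cost in USD of one invocation that runs for the full 15 minutes."""
    return PRICE_PER_MS[memory_mb] * MAX_RUNTIME_MS


for memory_mb in PRICE_PER_MS:
    print(f"{memory_mb} MB: ${max_invocation_cost(memory_mb).normalize()}")
```

`Decimal` is used instead of floats so that the tiny per-millisecond prices multiply out exactly.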

For actually using this in debugging scenarios, you'd also need to pay for a server that is up and ready to accept reverse-shell connections from deployed Lambdas.

Lambdas are often very frustrating to debug, and the ability to reverse-shell and interactively debug them, or to interactively learn about their runtime, is a nifty tool. Thanks for reading and hope you enjoyed!
