AWS Lambda has become a cornerstone service of the platform, and its introduction nine years ago paved the way for the widespread adoption of serverless architectures.
AWS doesn't provide any facilities for directly accessing Lambdas at runtime: you can't connect to their underlying infrastructure, you can't live-debug them, and the only way to run commands in the containers they run in is from within the Lambdas themselves. Their isolated nature is often an obstacle to debugging.
Reverse shells are a nifty tool for these types of scenarios - scenarios wherein you might want to run commands on a server that doesn't allow incoming network connections, but from which you can run code that can open outgoing network connections.
Reverse shells are so named because, unlike a standard SSH connection in which you initiate a session with a server and run commands there, in a reverse shell the server initiates the session itself and then lets you run commands on it.
A Reverse Shell in Lambda
The idea to run a reverse shell from a Lambda came to me when I was dealing with IAM issues in a deployed Lambda - when deployed the Lambda would fail with permissions issues. The debugging was a frustrating experience, filled with redeployments and logging until I was able to identify the problem. I found myself frustrated that I couldn't just "connect" to my Lambda and run some commands for pinpointing the permissions issue, which would have been a much faster debugging experience.
I knew that you can't connect into a Lambda, and I realized that what I was really wishing for was a reverse shell from the Lambda that'd allow me to run code. The idea was simple enough that I jumped straight into a POC:
The POC
The standard formula for a Python-based reverse shell is for the server to initiate an outgoing connection and redirect stdin/stdout/stderr through the socket before launching an interactive shell, like this:
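import os
import socket
import subprocess
# HOST and PORT identify the listener that the server connects out to
s = socket.socket()
s.connect((HOST, PORT))
for fd in (0, 1, 2):
    os.dup2(s.fileno(), fd)
subprocess.call(["/bin/bash", "-i"])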
Note
Another common pattern is pty.spawn("/bin/bash") - but this doesn't work in a Lambda context, because the Lambda environment doesn't allow for allocation of pseudoterminals.
A handy resource for reverse shells can be found in the Payloads All The Things repository.
For a quick POC, I set up a t3.micro with a public IP that the Lambda could connect to. I gave the EC2 instance a security group that enabled incoming traffic to port 2222, and set up a netcat-based listener:
> nc -nlvp 2222
Listening on 0.0.0.0 2222
And I deployed the following Lambda:
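import os
import socket
import subprocess

HOST = "<my EC2 public IP>"  # placeholder: the listener's public IP, as a string
PORT = 2222


def lambda_handler(_event, _context):
    # Open an outgoing connection to the listener
    s = socket.socket()
    s.connect((HOST, PORT))
    # Point stdin, stdout and stderr at the socket
    for fd in (0, 1, 2):
        os.dup2(s.fileno(), fd)
    # Blocks until the shell exits or the Lambda times out
    subprocess.call(["/bin/bash", "-i"])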
When I invoked the Lambda, my server received an incoming connection!! I had some fun poking around the container. Here's an excerpt:
~> nc -nlvp 2222
Listening on 0.0.0.0 2222
Connection received on 44.203.236.23 38120
bash: no job control in this shell
bash-4.2$ ls
ls
lambda_function.py
bash-4.2$ ls /
ls /
bin
boot
dev
etc
home
lambda-entrypoint.sh
lib
lib64
media
mnt
opt
proc
root
run
sbin
srv
sys
THIRD-PARTY-LICENSES.txt
tmp
usr
var
bash-4.2$ touch test.txt
touch test.txt
touch: cannot touch ‘test.txt’: Read-only file system
bash-4.2$ touch /tmp/test.txt # /tmp has write permissions
touch /tmp/test.txt
bash-4.2$ echo hello > /tmp/test.txt
echo hello > /tmp/test.txt
bash-4.2$ cat /tmp/test.txt
cat /tmp/test.txt
hello
Note
Notice how every command I input is also displayed back to me as part of the command's output? This is an annoyance in reverse shells that stems from interactive bash assuming that it's operating inside a tty - i.e. an interactive terminal - and so it echoes the commands you type back into stdout to display them to you. However, stdout redirects back to the socket, and so we see the commands we sent into the socket come back to us.
With /bin/bash -s - which doesn't assume it's running from inside a terminal, but still reads commands from stdin - we wouldn't have this annoyance, but then the experience would be a little less ergonomic since we wouldn't get interactive behavior from bash (e.g. if we pressed Up we wouldn't be able to run the last command entered).
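As a rough illustration of the /bin/bash -s variant, the sketch below hands the socket to the child process through subprocess's stdin/stdout/stderr arguments instead of calling os.dup2 on the Lambda process's own descriptors. The function name run_shell_over_socket and its shell parameter are made up for this example; this is not the handler used in the POC above.

```python
import socket
import subprocess


def run_shell_over_socket(sock, shell="/bin/bash"):
    """Run `shell -s` with its stdin/stdout/stderr attached to `sock`.

    Only the child process sees the socket as its standard streams; the
    calling process keeps its own file descriptors 0, 1 and 2 untouched.
    Returns the shell's exit code.
    """
    # subprocess accepts any object with a fileno() method, which sockets have.
    return subprocess.call([shell, "-s"], stdin=sock, stdout=sock, stderr=sock)
```

Inside a handler this would be called with the connected socket, e.g. run_shell_over_socket(s) after s.connect((HOST, PORT)). The shell exits when the listener closes its side of the connection, since that shows up as end-of-file on the shell's stdin.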
Introducing pdb Into the Mix
Now let's demonstrate a cool use case for this reverse shell functionality - live debugging of a deployed Lambda function. We've redirected stdin/stdout/stderr to our server, and in the same way we can use this redirection to run commands in bash, we can use this redirection to run commands in any interactive debugger.
We'll use Python's built-in breakpoint to demonstrate this:
def debugging_poc():
    flag = 0
    breakpoint()
    if flag:
        print("Success")
    else:
        print("Failure")


def lambda_handler(_event, _context):
    # Same code as before for setting up the reverse shell, but without launching bash
    debugging_poc()
The idea is for us to interactively change the state of the running code in the Lambda - we'll want to place a non-zero value in flag to get the "Success" message printed.
And in our server:
~> nc -nlvp 2222
Listening on 0.0.0.0 2222
Connection received on 54.158.53.111 54386
> /var/task/lambda_function.py(13)debugging_poc()
-> if flag:
(Pdb) flag
0
(Pdb) flag = 1
(Pdb) continue
Success
Cool!!!
The Problem with Redirections
It's worth noting that redirecting stdin/stdout/stderr is a heavy hammer - it's perfectly possible, even in a non-interactive Lambda context, for those file descriptors to be in actual use by the application.
I worked on a sub-POC of patching ipdb and IPython to directly read input and send output to a socket rather than working with stdin/stdout at all. For instance, in IPython's interactiveshell.py (which is used by ipdb), I set up a global variable:
import socket

client_socket = socket.socket()
client_socket.connect(("<HOST IP>", 2222))  # placeholder for the listener's IP
And I saw that while by default IPython has a non-trivial asynchronous command loop, it supports an IPY_TEST_SIMPLE_PROMPT environment variable that switches it to a much simpler prompt function, which I patched to directly work with my socket:
def init_prompt_toolkit_cli(self):
    if self.simple_prompt:
        # Fall back to plain non-interactive output for tests.
        # This is very limited.
        def prompt():
            prompt_text = "".join(x[1] for x in self.prompts.in_prompt_tokens())
            # <Daniel> This reads input from the socket
            lines = [client_socket.recv(4096).decode("utf-8")]
            # <Daniel> The following line was commented out by me:
            # lines = [input(prompt_text)]
            prompt_continuation = "".join(x[1] for x in self.prompts.continuation_prompt_tokens())
            while self.check_complete('\n'.join(lines))[0] == 'incomplete':
                lines.append( input(prompt_continuation) )
            return '\n'.join(lines)

        self.prompt_for_code = prompt
        return
    ...
Notice that I replaced the call to input - which reads from stdin - with a call to client_socket.recv.
This was enough for IPython's input - but not for ipdb, which has a separate prompt. ipdb calls this function in IPython's interactiveshell.py:
@property
def debugger_cls(self):
    return Pdb if self.simple_prompt else TerminalPdb
The Pdb class is not good for POC purposes because it is a Python built-in, and so is harder to patch in Lambda than a third-party package like IPython. Therefore I patched this to always return TerminalPdb, and made sure TerminalPdb had access to the socket:
@property
def debugger_cls(self):
    # Expose the module-level client_socket to the debugger class
    TerminalPdb.client_socket = client_socket
    return TerminalPdb
    # return Pdb if self.simple_prompt else TerminalPdb
And then, in IPython's debugger.py, I patched this line of TerminalPdb's cmdloop:
line = input("ipdb> ")
to instead be:
line = self.client_socket.recv(4096).decode("utf-8")
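One caveat with a bare recv(4096) is that TCP is a byte stream: a single call can return part of a command or several commands at once, and unlike input it keeps the trailing newline. A minimal sketch of a drop-in replacement for input that reads exactly one line from the socket could look like this; make_socket_input is a hypothetical helper and is not part of IPython or ipdb.

```python
import socket


def make_socket_input(sock):
    """Build an input()-like function that prompts and reads over `sock`."""
    # Buffered text reader, so a line split across packets is reassembled.
    reader = sock.makefile("r", encoding="utf-8")

    def socket_input(prompt=""):
        if prompt:
            sock.sendall(prompt.encode("utf-8"))
        line = reader.readline()
        if not line:
            # An empty read means the peer closed the connection.
            raise EOFError("peer closed the connection")
        # Like input(), return the line without its trailing newline.
        return line.rstrip("\n")

    return socket_input
```

Under these assumptions, the patched line would become something like line = socket_input("ipdb> "), with socket_input created once from the connected socket.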
Now, all of this worked beautifully for allowing me to send interactive commands from my server to my Lambda function - but the output would still be printed to the Lambda function's stdout and leave me blind back in my server.
Patching output to directly work with a socket rather than with stdout is harder than with input - there are many more locations that print than read, and most of the relevant ones are found in built-in Python modules that are harder to patch.
This took the scope of the work from a "fun Saturday afternoon POC" to a real project, so I decided to leave this for now - for many applications, the stdin/stdout/stderr redirection is good enough. It's also possible to "temporarily" redirect those file descriptors to a socket for a debugging session, and then restore them back to their original state.
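The temporary redirect mentioned above could be sketched as a context manager that saves each descriptor with os.dup before overwriting it and puts it back afterwards. This is an untested-in-Lambda illustration rather than code from the POC; redirected_to_socket and its fds parameter are made-up names.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def redirected_to_socket(sock, fds=(0, 1, 2)):
    """Temporarily point the given file descriptors at `sock`, then restore them."""
    # Flush Python-level buffers so earlier output goes to the original targets.
    sys.stdout.flush()
    sys.stderr.flush()
    # Keep a duplicate of each original descriptor so it can be restored later.
    saved = {fd: os.dup(fd) for fd in fds}
    try:
        for fd in fds:
            os.dup2(sock.fileno(), fd)
        yield
    finally:
        # Flush again so output written during the session reaches the socket.
        sys.stdout.flush()
        sys.stderr.flush()
        for fd, backup in saved.items():
            os.dup2(backup, fd)  # put the original target back
            os.close(backup)  # the duplicate is no longer needed
```

A debugging session would then be wrapped as with redirected_to_socket(s): breakpoint(), and anything the application writes before or after the block goes to its usual destination.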
Conclusions
Directly connecting into a Lambda is awesome, but it comes at a cost, quite literally: you pay for a Lambda's runtime, so the longer you stay connected to it and keep it running, the more you'll pay. Moreover, your sessions will be limited by Lambda's 15-minute runtime limit.
Note
In case you're curious like I was - as of writing, in us-east-1 the cheapest Lambda (128MB) runs for $0.0000000021 per ms, and the most expensive (10240MB) runs for $0.0000001667 per ms. This means that a single Lambda invocation that runs for a full 15 minutes will cost you between $0.00189 and $0.15003.
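The arithmetic behind those two figures is just the per-millisecond price multiplied by the 900,000 ms in 15 minutes. A small sketch using the prices quoted above (which may have changed since writing); invocation_cost is a made-up helper name:

```python
from decimal import Decimal

MS_PER_15_MIN = 15 * 60 * 1000  # 900,000 ms in a full 15-minute invocation


def invocation_cost(price_per_ms, duration_ms=MS_PER_15_MIN):
    """Cost in dollars of one invocation, given a per-millisecond price string."""
    # Decimal avoids floating-point rounding noise at these tiny magnitudes.
    return Decimal(price_per_ms) * duration_ms


cheapest = invocation_cost("0.0000000021")  # 128MB price quoted above
priciest = invocation_cost("0.0000001667")  # 10240MB price quoted above
print(cheapest, priciest)
```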
For actually using this for debugging scenarios, you'd also need to pay for a server to be up and ready for accepting reverse-shell connections from deployed Lambdas.
Lambdas are often very frustrating to debug, and the ability to reverse-shell and interactively debug them, or to interactively learn about their runtime, is a nifty tool. Thanks for reading and hope you enjoyed!