@simonw
Created May 10, 2023 17:35

What I want from a run-Python-in-WASM solution

I want to be able to use WebAssembly as a sandbox to safely run Python code from untrusted sources, within my existing Python applications.

I'd like to be able to:

  • Pass in an untrusted string of Python code and have it evaluated
  • Maybe also pass in initial variables to be used in the code - though hard-coding them would work OK too
  • Run the code in a sandboxed environment, with a timeout and a memory limit
  • Block network access and disk access, so the code can only access the variables passed in
  • Get the result returned to my program

Effectively I want to be able to do something like this:

pip install python-wasm-sandbox

Then:

from python_wasm_sandbox import run_python

# This code could come from an untrusted source
code = """
result = input_string.upper()
"""

result = run_python(code, variables={
    "input_string": "Hello, World!"
}, timeout=1.0, memory_limit_in_bytes=4096)
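
The call above describes a contract (code and variables go in, a result comes out, a timeout is enforced) rather than an implementation. As a rough illustration of that contract only, here is a sketch that swaps WebAssembly for a plain child process. It is an assumption-laden stand-in, not a sandbox: the child still has full network and disk access, there is no memory limit, and the convention that the code assigns its output to a variable named result is taken from the example above. Values must be JSON-serializable.

```python
import json
import subprocess
import sys

# Program run by the child process: read {"code", "variables"} as JSON from
# stdin, execute the code with the variables as globals, then write the
# `result` variable back as JSON. Anything the code prints goes to stderr
# so it cannot corrupt the JSON written to the real stdout.
_CHILD = """
import json, sys
payload = json.load(sys.stdin)
namespace = dict(payload["variables"])
out = sys.stdout
sys.stdout = sys.stderr
exec(payload["code"], namespace)
json.dump(namespace.get("result"), out)
"""


def run_python(code, variables=None, timeout=1.0):
    """Hypothetical stand-in for the run_python() call sketched above.

    NOT a sandbox: a child process gives a timeout and a separate
    interpreter, but no network, disk, or memory restrictions.
    """
    payload = json.dumps({"code": code, "variables": variables or {}})
    completed = subprocess.run(
        [sys.executable, "-I", "-c", _CHILD],  # -I: isolated mode
        input=payload,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired and kills the child
        check=True,       # raises CalledProcessError if the code itself fails
    )
    return json.loads(completed.stdout)
```

Code and variables travel over stdin and the result comes back over stdout, so nothing is written to a temporary file. Interpreter start-up counts against the timeout here, so a real implementation would probably want to measure only the time spent in the untrusted code.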

Use-cases for this:

  • I want to build features where users can enter Python expressions which will be used to transform their data (for my Datasette project)
  • I want to let people edit code that will be run on a schedule (by a cron-like mechanism) by typing it into a textarea
  • I want my users to be able to copy-and-paste in code snippets from elsewhere - and limit the damage that can be caused if someone malicious convinces them to copy-and-paste in something harmful

Stretch goal

The ability to prepare some untrusted code once and then call it multiple times with different inputs would be neat too - something like this:

from python_wasm_sandbox import PythonSandbox

code = """
def convert(input):
    return input.upper()
"""

sandbox = PythonSandbox(code, timeout=1.0, memory_limit_in_bytes=4096)

for input in ["hello", "world"]:
    print(sandbox.execute("convert", input))

In this case, every callable defined in code becomes a thing that can be executed by the sandbox, with the result returned back to the caller.
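
One way to read "every callable defined in code" is sketched here, purely to pin down the interface. This version runs the code in-process with exec(), which provides no isolation at all, and it leaves out the timeout and memory_limit_in_bytes arguments because it has no way to enforce them. The names() helper is an addition for illustration and is not part of the proposal above.

```python
class PythonSandbox:
    """Interface sketch only: exec() in-process gives NO isolation."""

    def __init__(self, code):
        namespace = {}
        # A real implementation would evaluate this inside the sandbox.
        exec(code, namespace)
        # Keep every public callable bound at the top level of the code.
        # Note that this also picks up callables the code imported.
        self._callables = {
            name: value
            for name, value in namespace.items()
            if callable(value) and not name.startswith("_")
        }

    def names(self):
        """Return the names that execute() will accept (hypothetical helper)."""
        return sorted(self._callables)

    def execute(self, name, *args):
        """Call one of the prepared callables and return its result."""
        if name not in self._callables:
            raise KeyError(f"no callable named {name!r} was defined")
        return self._callables[name](*args)
```

The code is evaluated once in the constructor, so repeated execute() calls reuse the same prepared definitions, which is the point of the stretch goal.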

I'd be fine with a restriction that says only basic Python types - strings, floats, integers, bytes - can be passed in and out of the sandbox. I can roll my own serialization/deserialization on top of that if I need to.
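
As a sketch of what rolling your own serialization on top of basic types might look like, the following encodes values as JSON text and wraps bytes in a tagged base64 object. The function names and the "__bytes__" tag are assumptions made for this example; a dict that legitimately has "__bytes__" as its only key would be misread on the way back.

```python
import base64
import json


def to_wire(value):
    """Encode str, int, float, bytes, None, and lists/dicts of them as JSON text."""

    def encode(v):
        if isinstance(v, bytes):
            # JSON has no bytes type, so tag a base64 string instead.
            return {"__bytes__": base64.b64encode(v).decode("ascii")}
        if v is None or isinstance(v, (str, int, float)):
            return v
        if isinstance(v, list):
            return [encode(item) for item in v]
        if isinstance(v, dict):
            if not all(isinstance(key, str) for key in v):
                raise TypeError("dict keys must be strings")
            return {key: encode(item) for key, item in v.items()}
        raise TypeError(f"cannot pass {type(v).__name__} across the sandbox boundary")

    return json.dumps(encode(value))


def from_wire(text):
    """Decode text produced by to_wire(), restoring tagged bytes values."""

    def decode(obj):
        if set(obj) == {"__bytes__"}:
            return base64.b64decode(obj["__bytes__"])
        return obj

    return json.loads(text, object_hook=decode)
```

Anything outside the supported types raises TypeError instead of being silently coerced, which keeps the boundary between host and sandbox explicit.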

@simonw
Author

simonw commented May 10, 2023

This is almost the solution I want: https://til.simonwillison.net/webassembly/python-in-a-wasm-sandbox

I don't like how it relies on using a temporary filesystem though - I'd rather be able to pass code directly to the Python sandbox, execute it there and have the results returned back to me.

@dicej

dicej commented May 12, 2023

I put together a demo which meets some of your requirements, I believe: https://github.com/dicej/component-sandbox-demo. The stretch goal also works; see https://github.com/dicej/component-sandbox-demo#examples. I don't know offhand if wasmtime-py allows you to specify time and memory limits. You can certainly do that via the wasmtime Rust API, so worst case it's just a matter of plumbing it through to the Python API (by way of the C API).

Note that you can't pip install componentize-py yet, but I plan to make that possible next week.

@dicej

dicej commented May 22, 2023

Update: you can pip install componentize-py now, and I've updated the instructions in the demo README.md.

@dicej

dicej commented May 23, 2023

Update 2: I've added a timeout to the demo. Unfortunately, I don't think there's currently a way to limit guest memory usage in wasmtime-py since ResourceLimiter is not exposed as part of the Wasmtime C or Python APIs. @alexcrichton is that correct?

@alexcrichton

The full power of that trait isn't exposed, but a limited version for limiting memory is available (just added in 9.0.0).

@dicej

dicej commented May 23, 2023

Thanks, @alexcrichton. I've updated the demo to specify a memory limit. @simonw I believe the demo now satisfies all the requirements you listed, including the stretch goal. Just needs to be wrapped in a friendly API at this point.
