Skip to content

Instantly share code, notes, and snippets.

@rain-1
Last active September 11, 2023 11:12
Show Gist options
  • Star 8 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rain-1/a1ed1116c6da4d2b195e562c8d3f9482 to your computer and use it in GitHub Desktop.
Save rain-1/a1ed1116c6da4d2b195e562c8d3f9482 to your computer and use it in GitHub Desktop.
Prompt Injection and AutoGPT

Does prompt injection matter to AutoGPT?

Executive summary: If you use AutoGPT, you need to be aware of prompt injection. This is a serious problem that can cause your AutoGPT agent to perform unexpected and unwanted tasks. Unfortunately, there isn't a perfect solution to this problem available yet.

Prompt injection can derail agents

If you set up an AutoGPT agent to perform task A, a prompt injection could 'derail' it into performing task B instead. Task B could be anything. Even something unwanted like deleting your personal files or sending all your bitcoins to some crooks wallet.

Docker helps limit the file system access that agents have. Measures like this are extremely useful. It's important to note that the agent can still be derailed.

How it works

In band signalling. This can happen if the agent uses web search to find things, and it hits upon a document that includes a prompt injection. GPT has to decide by itself whether text is instructions or not, and sometimes it will misinterpret untrusted user input from the internet as part of its instructions.

Proof of Concept

I'm not providing an example PoC. It may be useful for one to be privately shared with the developers to help them experiment with and address this problem but for now there is none.

Defenses

It isn't yet known how to protect against prompt injection.

Escaping approach

SQL injection can be solved by escaping the untrusted user input. Unfortunately there is no equivalent to escaping untrusted user input.

The LLM approach

Why not add a second LLM that tries to detect prompt injection before passing the text to the LLM?

As an analogy: If you think of prompt injection like a knife and LLMs like a paper bag, the knife can stab through a paper bag. It can also stab through a second paper bag. To protect against a knife you need something different that's tough against a knife like armor.

We can't solve a vulnerability by adding more layers of vulnerable technology.

Arms race approach

Another idea is to keep blocking every prompt injection that is found. This doesn't address the root cause of the problem. And it means that people will be occasionally victim to attacks.

ChatGPT helped me redraft the "Executive summary" part.

@simonw
Copy link

simonw commented May 7, 2023

I've been writing about this problem a bunch recently. Unfortunately there still isn't a good way of addressing this, despite a lot of motivated people trying to find a good solution.

@rain-1
Copy link
Author

rain-1 commented May 12, 2023

@simonw Thank you!

@ato2
Copy link

ato2 commented May 15, 2023

Any clue ?
cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -pthread -march=native -mtune=native -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -c ggml.c -o ggml.o
ggml.c: In function ‘bytes_from_nibbles_32’:
ggml.c:534:27: warning: implicit declaration of function ‘_mm256_set_m128i’; did you mean ‘_mm256_set_epi8’? [-Wimplicit-function-declaration]
const __m256i bytes = _mm256_set_m128i(_mm_srli_epi16(tmp, 4), tmp);
^~~~~~~~~~~~~~~~
_mm256_set_epi8
ggml.c:534:27: error: incompatible types when initializing type ‘__m256i {aka const __vector(4) long long int}’ using type ‘int’

There is no definition (gcc12):
grep _mm256_set_m128i /usr/lib64/gcc/x86_64-suse-linux/*/include/immintrin.h

The _mm256_set_m128i function is an AVX2 intrinsic that is part of the Intel Intrinsics Guide.
However, it was not originally part of GCC's implementation of AVX2 intrinsics.
As of my knowledge cutoff in September 2021, it seems that _mm256_set_m128i was not directly included in GCC. Instead, the equivalent functionality could be achieved by using other AVX2 intrinsics, like _mm256_insertf128_si256 and _mm256_castsi128_si256.

@rain-1
Copy link
Author

rain-1 commented May 17, 2023

@ato2 i am not sure, sorry! try asking on the llama.cpp or ggml issue tracker

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment