LLuffman: LLM-based steganography

2023-04-28

 It is a tale
Told by an idiot, full of sound and fury,
Signifying nothing (Macbeth V.v. 26-28).
And here is the thing: if you are looking at these blog entries for their significance - you won't get any!

Text generated by large language models (LLMs) has recently become much more copious and is extremely ignorable. This makes LLM text an attractive cover medium for steganography. Here I describe my steganographic modification of llama.cpp, a C++ implementation of Meta’s LLaMA language model, which encodes an external bitstream in the generated text in a way that can be decoded later.

Tokenization

LLaMA and other LLMs operate not on individual letters or words, but on tokens. The model defines how to break text up into smaller strings, each represented as an integer. For example, “Told by an idiot,” tokenizes in this way:

 29911 -> 'T'
  1025 -> 'old'
   491 -> ' by'
   385 -> ' an'
  1178 -> ' id'
 24414 -> 'iot'
 29892 -> ','

That is, the letter T by itself corresponds to the number 29911, and the string ‘old’, with no space before it, corresponds to 1025. The model “sees” input and output text as a list of integers.
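
If you want to reproduce a dump like this yourself, llama.cpp exposes tokenization through its C API. Here is a minimal sketch against the API roughly as it stood in spring 2023; the model path is a placeholder, and the exact signatures drift between revisions:

    #include <cstdio>
    #include <vector>
    #include "llama.h"

    int main() {
        // Load a model; the path is a placeholder for whatever you have locally.
        llama_context_params params = llama_context_default_params();
        llama_context * ctx = llama_init_from_file("models/7B/ggml-model-q4_0.bin", params);
        if (ctx == nullptr) return 1;

        // Tokenize a string and print each token's id and text.
        const char * text = "Told by an idiot,";
        std::vector<llama_token> tokens(64);
        const int n = llama_tokenize(ctx, text, tokens.data(), (int)tokens.size(),
                                     /*add_bos=*/false);

        for (int i = 0; i < n; i++) {
            printf("%6d -> '%s'\n", tokens[i], llama_token_to_str(ctx, tokens[i]));
        }

        llama_free(ctx);
        return 0;
    }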

Encoding

The model normally operates by repeatedly producing a probability distribution over the next token, choosing one, and then updating its state so that it predicts the token after that. After a prompt of “It is a tale”, LLaMA’s distribution (truncated and renormalized here by the sampler settings) is:

 prob.    cumulative
 0.77315  0.77315:    310 -> ' of'
 0.15700  0.93015:    408 -> ' as'        
 0.03834  0.96850:   5429 -> ' told'
 0.03150  1.00000:    393 -> ' that'

Ordinarily an LLM samples from this distribution according to the given probabilities: 77% of the time it will choose ‘of’ as its next token, 16% of the time ‘as’, then 3.8% for ‘told’ and 3.1% for ‘that’.
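
That ordinary sampling step is just a cumulative-distribution lookup, which is what the cumulative column above is for: draw a uniform random number in [0, 1) and take the first token whose cumulative probability exceeds it. A self-contained sketch:

    #include <random>
    #include <utility>
    #include <vector>

    // Ordinary sampling: pick the first token whose cumulative probability
    // exceeds a uniform random draw in [0, 1).
    int sample(const std::vector<std::pair<int, double>> & dist, std::mt19937 & rng) {
        std::uniform_real_distribution<double> unif(0.0, 1.0);
        const double r = unif(rng);
        double cum = 0.0;
        for (const auto & [tok, p] : dist) {
            cum += p;
            if (r < cum) return tok;
        }
        return dist.back().first;  // guard against rounding at the top end
    }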

If we have an external source of bits that we would like to encode in the generated text, we can construct a Huffman code for LLaMA’s distribution and grab bits from this source instead of using a random number generator. After “It is a tale”, with the above distribution, the code looks like:

0   -  310 -> ' of'
10  -  408 -> ' as'
110 - 5429 -> ' told'
111 -  393 -> ' that'

Repeating this for every token, using bits from the source to choose a path down the Huffman code that we construct afresh at each step, we end up with text that is probabilistically similar to ordinary LLM output, but that can be decoded to recover the source bits.
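
Constructing the code is the textbook Huffman procedure: repeatedly merge the two least probable nodes until a single tree remains, then read off the path to each leaf as its code. Here is a self-contained sketch (not the patch’s actual code) that reproduces the code lengths in the table above:

    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <string>
    #include <utility>
    #include <vector>

    struct Node {
        double prob;
        int    token;        // token id for leaves, -1 for internal nodes
        int    left  = -1;   // children, as indices into the node pool
        int    right = -1;
    };

    // Build a Huffman code over (token, probability) pairs.
    std::vector<std::pair<int, std::string>>
    huffman(const std::vector<std::pair<int, double>> & dist) {
        std::vector<Node> pool;
        using Entry = std::pair<double, int>;  // (probability, pool index)
        std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;

        for (const auto & [tok, p] : dist) {
            pool.push_back({p, tok});
            heap.push({p, (int)pool.size() - 1});
        }
        while (heap.size() > 1) {
            const auto [pa, a] = heap.top(); heap.pop();
            const auto [pb, b] = heap.top(); heap.pop();
            pool.push_back({pa + pb, -1, a, b});
            heap.push({pa + pb, (int)pool.size() - 1});
        }

        // Depth-first walk, appending '0' for one branch and '1' for the other.
        std::vector<std::pair<int, std::string>> codes;
        std::vector<std::pair<int, std::string>> stack = {{(int)pool.size() - 1, ""}};
        while (!stack.empty()) {
            const auto [i, code] = stack.back();
            stack.pop_back();
            if (pool[i].token >= 0) {
                codes.push_back({pool[i].token, code});
            } else {
                stack.push_back({pool[i].left,  code + "0"});
                stack.push_back({pool[i].right, code + "1"});
            }
        }
        return codes;
    }

    int main() {
        const auto codes = huffman({{310,  0.77315}, {408, 0.15700},
                                    {5429, 0.03834}, {393, 0.03150}});
        for (const auto & [tok, code] : codes) {
            printf("%-3s - %5d\n", code.c_str(), tok);
        }
        return 0;
    }

This prints the same code lengths as the table above, though with 0s and 1s swapped: the labeling of branches (and the breaking of probability ties) is arbitrary, and only needs to be done identically at the encoder and the decoder.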

So, the encoder starts with an initial prompt and then repeats these steps until the source bitstream is exhausted (a sketch in code follows the list):

  1. Use LLaMA to produce a probability distribution for the next token.
  2. Convert this to a code.
  3. Take data from the source to look up what the next token should be.
  4. Output this token and feed it back into LLaMA.
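
In code, the encoding loop might look like the following. This is only the shape of the thing: next_distribution, build_huffman, and the other helpers are illustrative stand-ins, not the names used in the actual patch.

    // Hypothetical encoder loop: descend the Huffman tree using source bits.
    while (!source.empty()) {
        auto dist = next_distribution(ctx);   // step 1: next-token probabilities
        auto tree = build_huffman(dist);      // step 2: same construction as above

        int node = tree.root;
        while (!tree.is_leaf(node) && !source.empty()) {
            node = source.next_bit() ? tree.right(node)  // step 3: each source bit
                                     : tree.left(node);  //         picks a branch
        }
        if (!tree.is_leaf(node)) {
            break;  // source ran dry mid-descent; those bits are lost (see Caveats)
        }

        const int tok = tree.token(node);
        print_token(ctx, tok);                // step 4: emit the token...
        feed_token(ctx, tok);                 // ...and advance the model state
    }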

Decoding

A decoder can be constructed by regenerating the same Huffman code for each next token, reading which token actually came next, and turning that back into a bitstream. For example, if we know that the encoder started with a prompt of “It is a tale”, and we see “It is a tale that”, we know that the encoder consumed the bits 111, and we can write those to an output stream. Then we look at the next token the encoder sent and do the same thing. The steps to decode each token, sketched below, are:

  1. Use LLaMA to produce a probability distribution for the next token.
  2. Convert this to a code.
  3. Look up the code for the token that actually came next, and output those bits.
  4. Add the next token into LLaMA and repeat.
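
As a sketch with the same hypothetical helpers as before, the decoding loop mirrors the encoder:

    // Hypothetical decoder loop: recover the bits that selected each token.
    while (has_next_token(input)) {
        auto dist = next_distribution(ctx);     // step 1: identical model state...
        auto tree = build_huffman(dist);        // step 2: ...yields the identical code

        const int tok = read_next_token(input); // step 3: the token the encoder chose
        write_bits(output, tree.code_for(tok)); //         emit the bits behind it

        feed_token(ctx, tok);                   // step 4: stay in lockstep with the encoder
    }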

If all goes right, steps 1 and 2 produce exactly the same code at the encoder and the decoder, and when they do, the decoder recovers the original data. Both sides need the same prompt and the same sampling settings (temperature, top-k, top-p, etc.), and of course the same model.

Usage

The patch for llama.cpp that will compile to an encoder and decoder is here.

This patch is based on this specific revision of llama.cpp, and I can offer no reassurance that it would work on other revisions, or that it would rebase cleanly onto them. But it works for me.

It adds two (currently undocumented) switches to the main executable:

  -e FILE    encode the bits of FILE into the generated text
  -d FILE    decode previously generated text in FILE back into the original bits

The sender of a secret message should, for example, run

./main -e secret-message.txt -p "Larry" > encoded-message.txt

and deliver encoded-message.txt to the recipient. The recipient should run

./main -d encoded-message.txt -p "Larry" > secret-message.txt

to recover the message.

Caveats

The encoder adds no padding, and it discards any bits left over if the source runs out partway through choosing a token, so the last byte of your message may end up cut off. Try adding a padding byte at the end of your message.

The encoder reads the entire source file before encoding, so you cannot, for example, use -e /dev/random to produce random text.

The implied distribution of the Huffman codes is not the same as the actual LLaMA distribution: code lengths are whole numbers of bits, so every token’s implied probability is a power of 1/2. With the code above and uniformly random source bits, ‘of’ is emitted half the time rather than 77% of the time. A sufficiently motivated adversary would therefore be able to distinguish whether output text came from unmodified LLaMA or from LLuffman.