llama.cpp: Explicitly send cache_prompt = True

Author: oobabooga
Date: 2025-04-30 15:24:07 -07:00
Parent: 195a45c6e1
Commit: a6c3ec2299


@@ -135,6 +135,7 @@ class LlamaServer:
"prompt": token_ids,
"n_predict": max_new_tokens,
"stream": True,
"cache_prompt": True
})
if shared.args.verbose:
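
For context, `cache_prompt` is an option of the llama.cpp server's `/completion` endpoint: when enabled, the server reuses the KV cache for any prompt prefix that matches a previous request, which speeds up multi-turn generation. Sending it explicitly presumably avoids relying on whatever default the running server uses. Below is a minimal sketch of such a request made directly against a llama.cpp server, not code from this repository; the URL, port, and token ids are placeholders.

```python
# Minimal sketch (not code from this repository): post a completion request to a
# locally running llama.cpp server with cache_prompt enabled so the server can
# reuse the KV cache for a prompt prefix it has already processed.
import requests

payload = {
    "prompt": [1, 15043, 3186],   # placeholder token ids
    "n_predict": 64,              # maximum number of new tokens to generate
    "stream": False,              # no streaming, to keep the example short
    "cache_prompt": True,         # reuse the cached prompt KV state when possible
}

response = requests.post("http://127.0.0.1:8080/completion", json=payload)
print(response.json().get("content", ""))
```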