Mirror of https://github.com/oobabooga/text-generation-webui.git, synced 2026-01-07 17:20:19 +01:00
UI: Set max_updates_second to 12 by default
When the tokens/second is at ~50 and the model is a thinking model, the markdown rendering for the streaming message becomes a CPU bottleneck.
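The idea behind `max_updates_second` can be sketched as a simple time-based throttle: instead of re-rendering the streaming message on every token, re-render at most N times per second and do one final render at the end. This is a minimal illustration, not the webui's actual implementation; the function name `stream_with_update_cap` and the `render` callback are hypothetical.

```python
import time

def stream_with_update_cap(token_iterator, render, max_updates_second=12):
    # Hypothetical sketch: cap how often the (expensive) render callback
    # runs while tokens stream in. max_updates_second <= 0 means no cap.
    min_interval = 1 / max_updates_second if max_updates_second > 0 else 0
    last_update = 0.0
    text = ""
    for token in token_iterator:
        text += token
        now = time.monotonic()
        if now - last_update >= min_interval:
            render(text)  # e.g. markdown -> HTML conversion, the bottleneck
            last_update = now
    render(text)  # always render the complete final text
    return text
```

With a cap of 12 updates/second, a model producing ~50 tokens/second triggers roughly 12 renders per second instead of ~50, which is what makes the markdown rendering stop dominating CPU time.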
This commit is contained in:
parent
a4bf339724
commit
b46ca01340
@@ -47,7 +47,7 @@ settings = {
     'max_new_tokens_max': 4096,
     'prompt_lookup_num_tokens': 0,
     'max_tokens_second': 0,
-    'max_updates_second': 0,
+    'max_updates_second': 12,
     'auto_max_new_tokens': True,
     'ban_eos_token': False,
     'add_bos_token': True,