Author | Commit | Message | Date
oobabooga | b42192c2b7 | Implement settings autosaving | 2025-12-01 10:43:42 -08:00
oobabooga | 41618cf799 | Merge branch 'dev' into image_generation | 2025-12-01 09:35:22 -08:00
oobabooga | 5327bc9397 | Update modules/shared.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) | 2025-11-28 22:48:05 -03:00
oobabooga | 148a5d1e44 | Keep things more modular | 2025-11-27 15:32:01 -08:00
oobabooga | a873692234 | Image generation now functional | 2025-11-27 14:24:35 -08:00
oobabooga | aa63c612de | Progress on model loading | 2025-11-27 13:46:54 -08:00
oobabooga | 164c6fcdbf | Add the UI structure | 2025-11-27 13:44:07 -08:00
GodEmperor785 | 400bb0694b | Add slider for --ubatch-size for llama.cpp loader, change defaults for better MoE performance (#7316) | 2025-11-21 16:56:02 -03:00
oobabooga | 0d4eff284c | Add a --cpu-moe option for llama.cpp | 2025-11-19 05:23:43 -08:00
oobabooga | b5a6904c4a | Make --trust-remote-code immutable from the UI/API | 2025-10-14 20:47:01 -07:00
oobabooga | 78ff21d512 | Organize the --help message | 2025-10-10 15:21:08 -07:00
oobabooga | 13876a1ee8 | llama.cpp: Remove the --flash-attn flag (it's always on now) | 2025-08-30 20:28:26 -07:00
oobabooga | 0b4518e61c | "Text generation web UI" -> "Text Generation Web UI" | 2025-08-27 05:53:09 -07:00
oobabooga | 02ca96fa44 | Multiple fixes | 2025-08-25 22:17:22 -07:00
oobabooga | 6c165d2e55 | Fix the chat template | 2025-08-25 18:28:43 -07:00
oobabooga | dbabe67e77 | ExLlamaV3: Enable the --enable-tp option, add a --tp-backend option | 2025-08-17 13:19:11 -07:00
altoiddealer | 57f6e9af5a | Set multimodal status during Model Loading (#7199) | 2025-08-13 16:47:27 -03:00
oobabooga | d86b0ec010 | Add multimodal support (llama.cpp) (#7027) | 2025-08-10 01:27:25 -03:00
Katehuuh | 88127f46c1 | Add multimodal support (ExLlamaV3) (#7174) | 2025-08-08 23:31:16 -03:00
oobabooga | 498778b8ac | Add a new 'Reasoning effort' UI element | 2025-08-05 15:19:11 -07:00
oobabooga | 1d1b20bd77 | Remove the --torch-compile option (it doesn't do anything currently) | 2025-07-11 10:51:23 -07:00
oobabooga | 273888f218 | Revert "Use eager attention by default instead of sdpa" (reverts commit bd4881c4dc) | 2025-07-10 18:56:46 -07:00
oobabooga | e015355e4a | Update README | 2025-07-09 20:03:53 -07:00
oobabooga | bd4881c4dc | Use eager attention by default instead of sdpa | 2025-07-09 19:57:37 -07:00
oobabooga | 6c2bdda0f0 | Transformers loader: replace use_flash_attention_2/use_eager_attention with a unified attn_implementation (closes #7107) | 2025-07-09 18:39:37 -07:00
oobabooga | faae4dc1b0 | Autosave generated text in the Notebook tab (#7079) | 2025-06-16 17:36:05 -03:00
oobabooga | de24b3bb31 | Merge the Default and Notebook tabs into a single Notebook tab (#7078) | 2025-06-16 13:19:29 -03:00
oobabooga | 2dee3a66ff | Add an option to include/exclude attachments from previous messages in the chat prompt | 2025-06-12 21:37:18 -07:00
oobabooga | 004fd8316c | Minor changes | 2025-06-11 07:49:51 -07:00
oobabooga | 27140f3563 | Revert "Don't save active extensions through the UI" (reverts commit df98f4b331) | 2025-06-11 07:25:27 -07:00
oobabooga | 3f9eb3aad1 | Fix the preset dropdown when the default preset file is not present | 2025-06-10 14:22:37 -07:00
oobabooga | df98f4b331 | Don't save active extensions through the UI (prevents command-line activated extensions from becoming permanently active due to autosave) | 2025-06-09 20:28:16 -07:00
oobabooga | 84f66484c5 | Make it optional to convert long pasted content into an attachment | 2025-06-08 09:31:38 -07:00
oobabooga | 1bdf11b511 | Use the Qwen3 - Thinking preset by default | 2025-06-07 22:23:09 -07:00
oobabooga | caf9fca5f3 | Avoid some code repetition | 2025-06-07 22:11:35 -07:00
oobabooga | 6436bf1920 | More UI persistence: presets and characters (#7051) | 2025-06-08 01:58:02 -03:00
oobabooga | 35ed55d18f | UI persistence (#7050) | 2025-06-07 22:46:52 -03:00
oobabooga | bb409c926e | Update only the last message during streaming + add back dynamic UI update speed (#7038) | 2025-06-02 09:50:17 -03:00
oobabooga | 9ec46b8c44 | Remove the HQQ loader (HQQ models can be loaded through Transformers) | 2025-05-19 09:23:24 -07:00
oobabooga | 126b3a768f | Revert "Dynamic Chat Message UI Update Speed (#6952)" (for now; reverts commit 8137eb8ef4) | 2025-05-18 12:38:36 -07:00
oobabooga | 47d4758509 | Fix #6970 | 2025-05-10 17:46:00 -07:00
oobabooga | b28fa86db6 | Default --gpu-layers to 256 | 2025-05-06 17:51:55 -07:00
Downtown-Case | 5ef564a22e | Fix model config loading in shared.py for Python 3.13 (#6961) | 2025-05-06 17:03:33 -03:00
mamei16 | 8137eb8ef4 | Dynamic Chat Message UI Update Speed (#6952) | 2025-05-05 18:05:23 -03:00
oobabooga | df7bb0db1f | Rename --n-gpu-layers to --gpu-layers | 2025-05-04 20:03:55 -07:00
oobabooga | 4cea720da8 | UI: Remove the "Autoload the model" feature | 2025-05-02 16:38:28 -07:00
oobabooga | 905afced1c | Add a --portable flag to hide things in portable mode | 2025-05-02 16:34:29 -07:00
oobabooga | b46ca01340 | UI: Set max_updates_second to 12 by default (when tokens/second are at ~50 and the model is a thinking model, markdown rendering of the streaming message becomes a CPU bottleneck) | 2025-04-30 14:53:15 -07:00
oobabooga | d10bded7f8 | UI: Add an enable_thinking option to enable/disable Qwen3 thinking | 2025-04-28 22:37:01 -07:00
oobabooga | 7b80acd524 | Fix parsing --extra-flags | 2025-04-26 18:40:03 -07:00