Author | Commit | Message | Date
oobabooga | b42192c2b7 | Implement settings autosaving | 2025-12-01 10:43:42 -08:00
oobabooga | 41618cf799 | Merge branch 'dev' into image_generation | 2025-12-01 09:35:22 -08:00
oobabooga | 5327bc9397 | Update modules/shared.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) | 2025-11-28 22:48:05 -03:00
oobabooga | 148a5d1e44 | Keep things more modular | 2025-11-27 15:32:01 -08:00
oobabooga | a873692234 | Image generation now functional | 2025-11-27 14:24:35 -08:00
oobabooga | aa63c612de | Progress on model loading | 2025-11-27 13:46:54 -08:00
oobabooga | 164c6fcdbf | Add the UI structure | 2025-11-27 13:44:07 -08:00
GodEmperor785 | 400bb0694b | Add slider for --ubatch-size for llama.cpp loader, change defaults for better MoE performance (#7316) | 2025-11-21 16:56:02 -03:00
oobabooga | 0d4eff284c | Add a --cpu-moe option for llama.cpp | 2025-11-19 05:23:43 -08:00
oobabooga | b5a6904c4a | Make --trust-remote-code immutable from the UI/API | 2025-10-14 20:47:01 -07:00
oobabooga | 78ff21d512 | Organize the --help message | 2025-10-10 15:21:08 -07:00
oobabooga | 13876a1ee8 | llama.cpp: Remove the --flash-attn flag (it's always on now) | 2025-08-30 20:28:26 -07:00
oobabooga | 0b4518e61c | "Text generation web UI" -> "Text Generation Web UI" | 2025-08-27 05:53:09 -07:00
oobabooga | 02ca96fa44 | Multiple fixes | 2025-08-25 22:17:22 -07:00
oobabooga | 6c165d2e55 | Fix the chat template | 2025-08-25 18:28:43 -07:00
oobabooga | dbabe67e77 | ExLlamaV3: Enable the --enable-tp option, add a --tp-backend option | 2025-08-17 13:19:11 -07:00
altoiddealer | 57f6e9af5a | Set multimodal status during Model Loading (#7199) | 2025-08-13 16:47:27 -03:00
oobabooga | d86b0ec010 | Add multimodal support (llama.cpp) (#7027) | 2025-08-10 01:27:25 -03:00
Katehuuh | 88127f46c1 | Add multimodal support (ExLlamaV3) (#7174) | 2025-08-08 23:31:16 -03:00
oobabooga | 498778b8ac | Add a new 'Reasoning effort' UI element | 2025-08-05 15:19:11 -07:00
oobabooga | 1d1b20bd77 | Remove the --torch-compile option (it doesn't do anything currently) | 2025-07-11 10:51:23 -07:00
oobabooga | 273888f218 | Revert "Use eager attention by default instead of sdpa" (reverts commit bd4881c4dc) | 2025-07-10 18:56:46 -07:00
oobabooga | e015355e4a | Update README | 2025-07-09 20:03:53 -07:00
oobabooga | bd4881c4dc | Use eager attention by default instead of sdpa | 2025-07-09 19:57:37 -07:00
oobabooga | 6c2bdda0f0 | Transformers loader: replace use_flash_attention_2/use_eager_attention with a unified attn_implementation (closes #7107) | 2025-07-09 18:39:37 -07:00
oobabooga | faae4dc1b0 | Autosave generated text in the Notebook tab (#7079) | 2025-06-16 17:36:05 -03:00
oobabooga | de24b3bb31 | Merge the Default and Notebook tabs into a single Notebook tab (#7078) | 2025-06-16 13:19:29 -03:00
oobabooga | 2dee3a66ff | Add an option to include/exclude attachments from previous messages in the chat prompt | 2025-06-12 21:37:18 -07:00
oobabooga | 004fd8316c | Minor changes | 2025-06-11 07:49:51 -07:00
oobabooga | 27140f3563 | Revert "Don't save active extensions through the UI" (reverts commit df98f4b331) | 2025-06-11 07:25:27 -07:00
oobabooga | 3f9eb3aad1 | Fix the preset dropdown when the default preset file is not present | 2025-06-10 14:22:37 -07:00
oobabooga | df98f4b331 | Don't save active extensions through the UI (prevents command-line activated extensions from becoming permanently active due to autosave) | 2025-06-09 20:28:16 -07:00
oobabooga | 84f66484c5 | Make it optional to convert long pasted content into an attachment | 2025-06-08 09:31:38 -07:00
oobabooga | 1bdf11b511 | Use the Qwen3 - Thinking preset by default | 2025-06-07 22:23:09 -07:00
oobabooga | caf9fca5f3 | Avoid some code repetition | 2025-06-07 22:11:35 -07:00
oobabooga | 6436bf1920 | More UI persistence: presets and characters (#7051) | 2025-06-08 01:58:02 -03:00
oobabooga | 35ed55d18f | UI persistence (#7050) | 2025-06-07 22:46:52 -03:00
oobabooga | bb409c926e | Update only the last message during streaming + add back dynamic UI update speed (#7038) | 2025-06-02 09:50:17 -03:00
oobabooga | 9ec46b8c44 | Remove the HQQ loader (HQQ models can be loaded through Transformers) | 2025-05-19 09:23:24 -07:00
oobabooga | 126b3a768f | Revert "Dynamic Chat Message UI Update Speed (#6952)" (for now; reverts commit 8137eb8ef4) | 2025-05-18 12:38:36 -07:00
oobabooga | 47d4758509 | Fix #6970 | 2025-05-10 17:46:00 -07:00
oobabooga | b28fa86db6 | Default --gpu-layers to 256 | 2025-05-06 17:51:55 -07:00
Downtown-Case | 5ef564a22e | Fix model config loading in shared.py for Python 3.13 (#6961) | 2025-05-06 17:03:33 -03:00
mamei16 | 8137eb8ef4 | Dynamic Chat Message UI Update Speed (#6952) | 2025-05-05 18:05:23 -03:00
oobabooga | df7bb0db1f | Rename --n-gpu-layers to --gpu-layers | 2025-05-04 20:03:55 -07:00
oobabooga | 4cea720da8 | UI: Remove the "Autoload the model" feature | 2025-05-02 16:38:28 -07:00
oobabooga | 905afced1c | Add a --portable flag to hide things in portable mode | 2025-05-02 16:34:29 -07:00
oobabooga | b46ca01340 | UI: Set max_updates_second to 12 by default (when tokens/second are at ~50 and the model is a thinking model, markdown rendering of the streaming message becomes a CPU bottleneck) | 2025-04-30 14:53:15 -07:00
oobabooga | d10bded7f8 | UI: Add an enable_thinking option to enable/disable Qwen3 thinking | 2025-04-28 22:37:01 -07:00
oobabooga | 7b80acd524 | Fix parsing --extra-flags | 2025-04-26 18:40:03 -07:00