Commit graph

2043 commits

Author SHA1 Message Date
oobabooga 8a3d866401 Fix temperature_last having no effect in llama.cpp server sampler order 2026-03-04 06:10:51 -08:00
oobabooga b3fd0d16e0 Use a new gr.Headless component for efficient chat streaming 2026-03-03 18:12:03 -08:00
oobabooga 2260e530c9 Remove gradio monkey-patches (moved to gradio fork) 2026-03-03 17:17:36 -08:00
oobabooga c54e8a2b3d Try to spawn llama.cpp on port 5001 instead of random port 2026-01-28 08:23:55 -08:00
oobabooga dc2bbf1861 Refactor thinking block detection and add Solar Open support 2026-01-28 08:21:34 -08:00
q5sys (JT) 7493fe7841 feat: Add a dropdown to save/load user personas (#7367) 2026-01-14 20:35:08 -03:00
Sergey 'Jin' Bostandzhyan 6e2c4e9c23 Fix loading models which have their eos token disabled (#7363) 2026-01-06 11:31:10 -03:00
oobabooga e7c8b51fec Revert "Use flash_attention_2 by default for Transformers models" (This reverts commit 85f2df92e9.) 2025-12-07 18:48:41 -08:00
oobabooga b758059e95 Revert "Clear the torch cache between sequential image generations" (This reverts commit 1ec9f708e5.) 2025-12-07 12:23:19 -08:00
oobabooga 1ec9f708e5 Clear the torch cache between sequential image generations 2025-12-07 11:49:22 -08:00
oobabooga 85f2df92e9 Use flash_attention_2 by default for Transformers models 2025-12-07 06:56:58 -08:00
oobabooga 1762312fb4 Use random instead of np.random for image seeds (makes it work on Windows) 2025-12-06 20:10:32 -08:00
oobabooga 02518a96a9 Lint 2025-12-06 06:55:06 -08:00
oobabooga 455dc06db0 Serve the original PNG images in the UI instead of webp 2025-12-06 05:43:00 -08:00
oobabooga 6ca99910ba Image: Quantize the text encoder for lower VRAM 2025-12-05 13:08:46 -08:00
oobabooga 11937de517 Use flash attention for image generation by default 2025-12-05 12:13:24 -08:00
oobabooga c11c14590a Image: Better LLM variation default prompt 2025-12-05 08:08:11 -08:00
oobabooga 0dd468245c Image: Add back the gallery cache (for performance) 2025-12-05 07:11:38 -08:00
oobabooga b63d57158d Image: Add TGW as a prefix to output images 2025-12-05 05:59:54 -08:00
oobabooga afa29b9554 Image: Several fixes 2025-12-05 05:58:57 -08:00
oobabooga 8eac99599a Image: Better LLM variation default prompt 2025-12-04 19:58:06 -08:00
oobabooga b4f06a50b0 fix: Pass bos_token and eos_token from metadata to jinja2 (Fixes loading Seed-Instruct-36B) 2025-12-04 19:11:31 -08:00
oobabooga 56f2a9512f Revert "Image: Add the LLM-generated prompt to the API result" (This reverts commit c7ad28a4cd.) 2025-12-04 17:34:27 -08:00
oobabooga c7ad28a4cd Image: Add the LLM-generated prompt to the API result 2025-12-04 17:22:08 -08:00
oobabooga b451bac082 Image: Improve a log message 2025-12-04 16:33:46 -08:00
oobabooga 47a0fcd614 Image: PNG metadata improvements 2025-12-04 16:25:48 -08:00
oobabooga ac31a7c008 Image: Organize the UI 2025-12-04 15:45:04 -08:00
oobabooga a90739f498 Image: Better LLM variation default prompt 2025-12-04 10:50:40 -08:00
oobabooga ffef3c7b1d Image: Make the LLM Variations prompt configurable 2025-12-04 10:44:35 -08:00
oobabooga 5763947c37 Image: Simplify the API code, add the llm_variations option 2025-12-04 10:23:00 -08:00
oobabooga 2793153717 Image: Add LLM-generated prompt variations 2025-12-04 08:10:24 -08:00
oobabooga 7fb9f19bd8 Progress bar style improvements 2025-12-04 06:20:45 -08:00
oobabooga a838223d18 Image: Add a progress bar during generation 2025-12-04 05:49:57 -08:00
oobabooga 14dbc3488e Image: Clear the torch cache after generation, not before 2025-12-04 05:32:58 -08:00
oobabooga c357eed4c7 Image: Remove the flash_attention_3 option (no idea how to get it working) 2025-12-03 18:40:34 -08:00
oobabooga fbca54957e Image generation: Yield partial results for batch count > 1 2025-12-03 16:13:07 -08:00
oobabooga 49c60882bf Image generation: Safer image uploading 2025-12-03 16:07:51 -08:00
oobabooga 59285d501d Image generation: Small UI improvements 2025-12-03 16:03:31 -08:00
oobabooga 373baa5c9c UI: Minor image gallery improvements 2025-12-03 14:45:02 -08:00
oobabooga 9448bf1caa Image generation: add torchao quantization (supports torch.compile) 2025-12-02 14:22:51 -08:00
oobabooga 97281ff831 UI: Fix an index error in the new image gallery 2025-12-02 11:20:52 -08:00
oobabooga 9d07d3a229 Make portable builds functional again after b3666e140d 2025-12-02 10:06:57 -08:00
oobabooga 6291e72129 Remove quanto for now (requires messy compilation) 2025-12-02 09:57:18 -08:00
oobabooga b3666e140d Add image generation support (#7328) 2025-12-02 14:55:38 -03:00
oobabooga 5327bc9397 Update modules/shared.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-28 22:48:05 -03:00
GodEmperor785 400bb0694b Add slider for --ubatch-size for llama.cpp loader, change defaults for better MoE performance (#7316) 2025-11-21 16:56:02 -03:00
oobabooga 8f0048663d More modular HTML generator 2025-11-21 07:09:16 -08:00
oobabooga 0d4eff284c Add a --cpu-moe model for llama.cpp 2025-11-19 05:23:43 -08:00
Trenten Miller 6871484398 fix: Rename 'evaluation_strategy' to 'eval_strategy' in training 2025-10-28 16:48:04 -03:00
oobabooga a156ebbf76 Lint 2025-10-15 13:15:01 -07:00
oobabooga c871d9cdbd Revert "Same as 7f06aec3a1 but for exllamav3_hf" (This reverts commit deb37b821b.) 2025-10-15 13:05:41 -07:00
oobabooga b5a6904c4a Make --trust-remote-code immutable from the UI/API 2025-10-14 20:47:01 -07:00
mamei16 308e726e11 log error when llama-server request exceeds context size (#7263) 2025-10-12 23:00:11 -03:00
oobabooga 655c3e86e3 Fix "continue" missing an initial space in chat-instruct/chat modes 2025-10-11 17:00:25 -07:00
oobabooga c7dd920dc8 Fix metadata leaking into branched chats 2025-10-11 14:12:05 -07:00
oobabooga 78ff21d512 Organize the --help message 2025-10-10 15:21:08 -07:00
oobabooga 0d03813e98 Update modules/chat.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-10-09 21:01:13 -03:00
oobabooga deb37b821b Same as 7f06aec3a1 but for exllamav3_hf 2025-10-09 13:02:38 -07:00
oobabooga 7f06aec3a1 exllamav3: Implement the logits function for /v1/internal/logits 2025-10-09 11:24:25 -07:00
oobabooga 218dc01b51 Add fallbacks after 93aa7b3ed3 2025-10-09 10:59:34 -07:00
oobabooga 282aa19189 Safer profile picture uploading 2025-10-09 09:26:35 -07:00
oobabooga 93aa7b3ed3 Better handle multigpu setups with transformers + bitsandbytes 2025-10-09 08:49:44 -07:00
Remowylliams 38a7fd685d chat.py fixes Instruct mode History 2025-10-05 11:34:47 -03:00
oobabooga 1e863a7113 Fix exllamav3 ignoring the stop button 2025-09-19 16:12:50 -07:00
stevenxdavis dd6d2223a5 Changing transformers_loader.py to Match User Expectations for --bf16 and Flash Attention 2 (#7217) 2025-09-17 16:39:04 -03:00
oobabooga 9e9ab39892 Make exllamav3_hf and exllamav2_hf functional again 2025-09-17 12:29:22 -07:00
oobabooga f3829b268a llama.cpp: Always pass --flash-attn on 2025-09-02 12:12:17 -07:00
oobabooga c6ea67bbdb Lint 2025-09-02 10:22:03 -07:00
oobabooga 00ed878b05 Slightly more robust model loading 2025-09-02 10:16:26 -07:00
oobabooga 387e249dec Change an info message 2025-08-31 16:27:10 -07:00
oobabooga 8028d88541 Lint 2025-08-30 21:29:20 -07:00
oobabooga 13876a1ee8 llama.cpp: Remove the --flash-attn flag (it's always on now) 2025-08-30 20:28:26 -07:00
oobabooga 3a3e247f3c Even better way to handle continue for thinking blocks 2025-08-30 12:36:35 -07:00
oobabooga cf1aad2a68 Fix "continue" for Byte-OSS for partial thinking blocks 2025-08-30 12:16:45 -07:00
oobabooga 96136ea760 Fix LaTeX rendering for equations with asterisks 2025-08-30 10:13:32 -07:00
oobabooga a3eb67e466 Fix the UI failing to launch if the Notebook prompt is too long 2025-08-30 08:42:26 -07:00
oobabooga a2b37adb26 UI: Preload the correct fonts for chat mode 2025-08-29 09:25:44 -07:00
oobabooga cb8780a4ce Safer check for is_multimodal when loading models (Avoids unrelated multimodal error when a model fails to load due to lack of memory.) 2025-08-28 11:13:19 -07:00
oobabooga cfc83745ec UI: Improve right sidebar borders in light mode 2025-08-28 08:34:48 -07:00
oobabooga ba6041251d UI: Minor change 2025-08-28 06:20:00 -07:00
oobabooga a92758a144 llama.cpp: Fix obtaining the maximum sequence length for GPT-OSS 2025-08-27 16:15:40 -07:00
oobabooga 030ba7bfeb UI: Mention that Seed-OSS uses enable_thinking 2025-08-27 07:44:35 -07:00
oobabooga 0b4518e61c "Text generation web UI" -> "Text Generation Web UI" 2025-08-27 05:53:09 -07:00
oobabooga 02ca96fa44 Multiple fixes 2025-08-25 22:17:22 -07:00
oobabooga 6a7166fffa Add support for the Seed-OSS template 2025-08-25 19:46:48 -07:00
oobabooga 8fcb4b3102 Make bot_prefix extensions functional again 2025-08-25 19:10:46 -07:00
oobabooga 8f660aefe3 Fix chat-instruct replies leaking the bot name sometimes 2025-08-25 18:50:16 -07:00
oobabooga a531328f7e Fix the GPT-OSS stopping string 2025-08-25 18:41:58 -07:00
oobabooga 6c165d2e55 Fix the chat template 2025-08-25 18:28:43 -07:00
oobabooga b657be7381 Obtain stopping strings in chat mode 2025-08-25 18:22:08 -07:00
oobabooga ded6c41cf8 Fix impersonate for chat-instruct 2025-08-25 18:16:17 -07:00
oobabooga c1aa4590ea Code simplifications, fix impersonate 2025-08-25 18:05:40 -07:00
oobabooga b330ec3517 Simplifications 2025-08-25 17:54:15 -07:00
oobabooga 3ad5970374 Make the llama.cpp --verbose output less verbose 2025-08-25 17:43:21 -07:00
oobabooga adeca8a658 Remove changes to the jinja2 templates 2025-08-25 17:36:01 -07:00
oobabooga aad0104c1b Remove a function 2025-08-25 17:33:13 -07:00
oobabooga f919cdf881 chat.py code simplifications 2025-08-25 17:20:51 -07:00
oobabooga d08800c359 chat.py improvements 2025-08-25 17:03:37 -07:00
oobabooga 3bc48014a5 chat.py code simplifications 2025-08-25 16:48:21 -07:00
oobabooga 2478294c06 UI: Preload the instruct and chat fonts 2025-08-24 12:37:41 -07:00