text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2026-01-14 20:50:08 +01:00

Author	SHA1	Message	Date
oobabooga	0d03813e98	Update modules/chat.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-09 21:01:13 -03:00
oobabooga	deb37b821b	Same as `7f06aec3a1` but for exllamav3_hf	2025-10-09 13:02:38 -07:00
oobabooga	7f06aec3a1	exllamav3: Implement the logits function for /v1/internal/logits	2025-10-09 11:24:25 -07:00
oobabooga	218dc01b51	Add fallbacks after `93aa7b3ed3`	2025-10-09 10:59:34 -07:00
oobabooga	282aa19189	Safer profile picture uploading	2025-10-09 09:26:35 -07:00
oobabooga	93aa7b3ed3	Better handle multigpu setups with transformers + bitsandbytes	2025-10-09 08:49:44 -07:00
Remowylliams	38a7fd685d	chat.py fixes Instruct mode History	2025-10-05 11:34:47 -03:00
oobabooga	1e863a7113	Fix exllamav3 ignoring the stop button	2025-09-19 16:12:50 -07:00
stevenxdavis	dd6d2223a5	Changing transformers_loader.py to Match User Expectations for --bf16 and Flash Attention 2 (#7217)	2025-09-17 16:39:04 -03:00
oobabooga	9e9ab39892	Make exllamav3_hf and exllamav2_hf functional again	2025-09-17 12:29:22 -07:00
oobabooga	f3829b268a	llama.cpp: Always pass --flash-attn on	2025-09-02 12:12:17 -07:00
oobabooga	c6ea67bbdb	Lint	2025-09-02 10:22:03 -07:00
oobabooga	00ed878b05	Slightly more robust model loading	2025-09-02 10:16:26 -07:00
oobabooga	387e249dec	Change an info message	2025-08-31 16:27:10 -07:00
oobabooga	8028d88541	Lint	2025-08-30 21:29:20 -07:00
oobabooga	13876a1ee8	llama.cpp: Remove the --flash-attn flag (it's always on now)	2025-08-30 20:28:26 -07:00
oobabooga	3a3e247f3c	Even better way to handle continue for thinking blocks	2025-08-30 12:36:35 -07:00
oobabooga	cf1aad2a68	Fix "continue" for Byte-OSS for partial thinking blocks	2025-08-30 12:16:45 -07:00
oobabooga	96136ea760	Fix LaTeX rendering for equations with asterisks	2025-08-30 10:13:32 -07:00
oobabooga	a3eb67e466	Fix the UI failing to launch if the Notebook prompt is too long	2025-08-30 08:42:26 -07:00
oobabooga	a2b37adb26	UI: Preload the correct fonts for chat mode	2025-08-29 09:25:44 -07:00
oobabooga	cb8780a4ce	Safer check for is_multimodal when loading models. Avoids unrelated multimodal error when a model fails to load due to lack of memory.	2025-08-28 11:13:19 -07:00
oobabooga	cfc83745ec	UI: Improve right sidebar borders in light mode	2025-08-28 08:34:48 -07:00
oobabooga	ba6041251d	UI: Minor change	2025-08-28 06:20:00 -07:00
oobabooga	a92758a144	llama.cpp: Fix obtaining the maximum sequence length for GPT-OSS	2025-08-27 16:15:40 -07:00
oobabooga	030ba7bfeb	UI: Mention that Seed-OSS uses enable_thinking	2025-08-27 07:44:35 -07:00
oobabooga	0b4518e61c	"Text generation web UI" -> "Text Generation Web UI"	2025-08-27 05:53:09 -07:00
oobabooga	02ca96fa44	Multiple fixes	2025-08-25 22:17:22 -07:00
oobabooga	6a7166fffa	Add support for the Seed-OSS template	2025-08-25 19:46:48 -07:00
oobabooga	8fcb4b3102	Make bot_prefix extensions functional again	2025-08-25 19:10:46 -07:00
oobabooga	8f660aefe3	Fix chat-instruct replies leaking the bot name sometimes	2025-08-25 18:50:16 -07:00
oobabooga	a531328f7e	Fix the GPT-OSS stopping string	2025-08-25 18:41:58 -07:00
oobabooga	6c165d2e55	Fix the chat template	2025-08-25 18:28:43 -07:00
oobabooga	b657be7381	Obtain stopping strings in chat mode	2025-08-25 18:22:08 -07:00
oobabooga	ded6c41cf8	Fix impersonate for chat-instruct	2025-08-25 18:16:17 -07:00
oobabooga	c1aa4590ea	Code simplifications, fix impersonate	2025-08-25 18:05:40 -07:00
oobabooga	b330ec3517	Simplifications	2025-08-25 17:54:15 -07:00
oobabooga	3ad5970374	Make the llama.cpp --verbose output less verbose	2025-08-25 17:43:21 -07:00
oobabooga	adeca8a658	Remove changes to the jinja2 templates	2025-08-25 17:36:01 -07:00
oobabooga	aad0104c1b	Remove a function	2025-08-25 17:33:13 -07:00
oobabooga	f919cdf881	chat.py code simplifications	2025-08-25 17:20:51 -07:00
oobabooga	d08800c359	chat.py improvements	2025-08-25 17:03:37 -07:00
oobabooga	3bc48014a5	chat.py code simplifications	2025-08-25 16:48:21 -07:00
oobabooga	2478294c06	UI: Preload the instruct and chat fonts	2025-08-24 12:37:41 -07:00
oobabooga	8be798e15f	llama.cpp: Fix stderr deadlock while loading some multimodal models	2025-08-24 12:20:05 -07:00
oobabooga	7fe8da8944	Minor simplification after `f247c2ae62`	2025-08-22 14:42:56 -07:00
oobabooga	f247c2ae62	Make --model work with absolute paths, eg --model /tmp/gemma-3-270m-it-IQ4_NL.gguf	2025-08-22 11:47:33 -07:00
oobabooga	9e7b326e34	Lint	2025-08-19 06:50:40 -07:00
oobabooga	1972479610	Add the TP option to exllamav3_HF	2025-08-19 06:48:22 -07:00
oobabooga	e0f5905a97	Code formatting	2025-08-19 06:34:05 -07:00
oobabooga	5b06284a8a	UI: Keep ExLlamav3_HF selected if already selected for EXL3 models	2025-08-19 06:23:21 -07:00
oobabooga	cbba58bef9	UI: Fix code blocks having an extra empty line	2025-08-18 15:50:09 -07:00
oobabooga	7d23a55901	Fix model unloading when switching loaders (closes #7203)	2025-08-18 09:05:47 -07:00
oobabooga	64eba9576c	mtmd: Fix a bug when "include past attachments" is unchecked	2025-08-17 14:08:40 -07:00
oobabooga	dbabe67e77	ExLlamaV3: Enable the --enable-tp option, add a --tp-backend option	2025-08-17 13:19:11 -07:00
oobabooga	d771ca4a13	Fix web search (attempt)	2025-08-14 12:05:14 -07:00
altoiddealer	57f6e9af5a	Set multimodal status during Model Loading (#7199)	2025-08-13 16:47:27 -03:00
oobabooga	41b95e9ec3	Lint	2025-08-12 13:37:37 -07:00
oobabooga	7301452b41	UI: Minor info message change	2025-08-12 13:23:24 -07:00
oobabooga	8d7b88106a	Revert "mtmd: Fail early if images are provided but the model doesn't support them (llama.cpp)" This reverts commit `d8fcc71616`.	2025-08-12 13:20:16 -07:00
oobabooga	2238302b49	ExLlamaV3: Add speculative decoding	2025-08-12 08:50:45 -07:00
oobabooga	d8fcc71616	mtmd: Fail early if images are provided but the model doesn't support them (llama.cpp)	2025-08-11 18:02:33 -07:00
oobabooga	e6447cd24a	mtmd: Update the llama-server request	2025-08-11 17:42:35 -07:00
oobabooga	0e3def449a	llama.cpp: --swa-full to llama-server when streaming-llm is checked	2025-08-11 15:17:25 -07:00
oobabooga	0e88a621fd	UI: Better organize the right sidebar	2025-08-11 15:16:03 -07:00
oobabooga	a78ca6ffcd	Remove a comment	2025-08-11 12:33:38 -07:00
oobabooga	999471256c	Lint	2025-08-11 12:32:17 -07:00
oobabooga	b62c8845f3	mtmd: Fix /chat/completions for llama.cpp	2025-08-11 12:01:59 -07:00
oobabooga	38c0b4a1ad	Default ctx-size to 8192 when not found in the metadata	2025-08-11 07:39:53 -07:00
oobabooga	52d1cbbbe9	Fix an import	2025-08-11 07:38:39 -07:00
oobabooga	4809ddfeb8	Exllamav3: small sampler fixes	2025-08-11 07:35:22 -07:00
oobabooga	4d8dbbab64	API: Fix sampler_priority usage for ExLlamaV3	2025-08-11 07:26:11 -07:00
oobabooga	0ea62d88f6	mtmd: Fix "continue" when an image is present	2025-08-09 21:47:02 -07:00
oobabooga	2f90ac9880	Move the new image_utils.py file to modules/	2025-08-09 21:41:38 -07:00
oobabooga	c6b4d1e87f	Fix the exllamav2 loader ignoring add_bos	2025-08-09 21:34:35 -07:00
oobabooga	d86b0ec010	Add multimodal support (llama.cpp) (#7027)	2025-08-10 01:27:25 -03:00
oobabooga	a289a92b94	Fix exllamav3 token count	2025-08-09 17:10:58 -07:00
oobabooga	d489eb589a	Attempt at fixing new exllamav3 loader undefined behavior when switching conversations	2025-08-09 14:11:31 -07:00
oobabooga	a6d6bee88c	Change a comment	2025-08-09 07:51:03 -07:00
oobabooga	2fe79a93cc	mtmd: Handle another case after `3f5ec9644f`	2025-08-09 07:50:24 -07:00
oobabooga	59c6138e98	Remove a log message	2025-08-09 07:32:15 -07:00
oobabooga	f396b82a4f	mtmd: Better way to detect if an EXL3 model is multimodal	2025-08-09 07:31:36 -07:00
oobabooga	fa9be444fa	Use ExLlamav3 instead of ExLlamav3_HF by default for EXL3 models	2025-08-09 07:26:59 -07:00
oobabooga	3f5ec9644f	mtmd: Place the image <__media__> at the top of the prompt	2025-08-09 07:06:07 -07:00
oobabooga	1168004067	Minor change	2025-08-09 07:01:55 -07:00
oobabooga	9e260332cc	Remove some unnecessary code	2025-08-08 21:22:47 -07:00
oobabooga	544c3a7c9f	Polish the new exllamav3 loader	2025-08-08 21:15:53 -07:00
oobabooga	8fcadff8d3	mtmd: Use the base64 attachment for the UI preview instead of the file	2025-08-08 20:13:54 -07:00
oobabooga	6e9de75727	Support loading chat templates from chat_template.json files	2025-08-08 19:35:09 -07:00
Katehuuh	88127f46c1	Add multimodal support (ExLlamaV3) (#7174)	2025-08-08 23:31:16 -03:00
oobabooga	b391ac8eb1	Fix getting the ctx-size for EXL3/EXL2/Transformers models	2025-08-08 18:11:45 -07:00
oobabooga	3e24f455c8	Fix continue for GPT-OSS (hopefully the final fix)	2025-08-06 10:18:42 -07:00
oobabooga	0c1403f2c7	Handle GPT-OSS as a special case when continuing	2025-08-06 08:05:37 -07:00
oobabooga	6ce4b353c4	Fix the GPT-OSS template	2025-08-06 07:12:39 -07:00
oobabooga	7c82d65a9d	Handle GPT-OSS as a special template case	2025-08-05 18:05:09 -07:00
oobabooga	fbea21a1f1	Only use enable_thinking if the template supports it	2025-08-05 17:33:27 -07:00
oobabooga	bfbbfc2361	Ignore add_generation_prompt in GPT-OSS	2025-08-05 17:33:01 -07:00
oobabooga	20adc3c967	Start over new template handling (to avoid overcomplicating)	2025-08-05 16:58:45 -07:00
oobabooga	80f6abb07e	Begin fixing 'Continue' with GPT-OSS	2025-08-05 16:01:19 -07:00
oobabooga	e5b8d4d072	Fix a typo	2025-08-05 15:52:56 -07:00

1987 commits