text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2026-04-20 22:13:43 +00:00

Author	SHA1	Message	Date
oobabooga	085c4ef5d7	Revert "Update Miniforge from 25.3.0 to 26.1.0". This reverts commit `9576c5a5f4`.	2026-03-07 05:09:49 -08:00
oobabooga	aa634c77c0	Update llama.cpp	2026-03-06 21:00:36 -08:00
oobabooga	abc699db9b	Minor UI change	2026-03-06 19:03:38 -08:00
oobabooga	f2fe001cc4	Fix message copy buttons not working over HTTP	2026-03-06 19:01:38 -08:00
oobabooga	7ea5513263	Handle Qwen 3.5 thinking blocks	2026-03-06 19:01:28 -08:00
oobabooga	5fa709a3f4	llama.cpp server: use port+5 offset and suppress "No parser definition detected" logs	2026-03-06 18:52:34 -08:00
oobabooga	e8e0d02406	Remove outdated ROCm environment variable overrides from one_click.py	2026-03-06 18:15:05 -08:00
oobabooga	1eead661c3	Portable mode: always use ../user_data if it exists	2026-03-06 18:04:48 -08:00
oobabooga	d48b53422f	Training: Optimize _peek_json_keys to avoid loading entire file into memory	2026-03-06 15:39:08 -08:00
oobabooga	2beaa4b971	Update llama.cpp	2026-03-06 14:39:35 -08:00
oobabooga	5f6754c267	Fix stop button being ignored when token throttling is off	2026-03-06 17:12:34 -03:00
oobabooga	b8b4471ab5	Security: restrict file writes to user_data_dir, block extra_flags from API	2026-03-06 16:58:11 -03:00
oobabooga	d03923924a	Several small fixes: stop llama-server subprocess on model unload instead of relying on GC; fix tool_calls[].index being string instead of int in API responses; omit tool_calls key from API response when empty per OpenAI spec; prevent division by zero when micro_batch_size > batch_size in training; copy sampler_priority list before mutating in ExLlamaV3; normalize presence/frequency_penalty names for ExLlamaV3 sampler sorting; restore original chat_template after training instead of leaving it mutated.	2026-03-06 16:52:13 -03:00
oobabooga	044566d42d	API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models	2026-03-06 15:06:56 -03:00
oobabooga	f5acf55207	Add --chat-template-file flag to override the default instruction template for API requests. Matches llama.cpp's flag name. Supports .jinja, .jinja2, and .yaml files. Priority: per-request params > --chat-template-file > model's built-in template.	2026-03-06 14:04:16 -03:00
oobabooga	3531069824	API: Support Llama 4 tool calling and fix tool calling edge cases	2026-03-06 13:12:14 -03:00
oobabooga	160f7ad6b4	Handle SIGTERM to stop llama-server on pkill	2026-03-06 12:56:33 -03:00
oobabooga	8e24a20873	Installer: Fix libstdcxx-ng version pin causing conda solver to hang on Python 3.13	2026-03-06 07:39:50 -08:00
oobabooga	3bab7fbfd4	Update Colab notebook: new default model, direct GGUF URL support	2026-03-06 06:52:49 -08:00
oobabooga	e7e0df0101	Fix hover menu shifting down when chat input grows	2026-03-06 11:52:16 -03:00
oobabooga	3323dedd08	Update llama.cpp	2026-03-06 06:30:01 -08:00
oobabooga	36dbc4ccce	Remove unused colorama and psutil requirements	2026-03-06 06:28:35 -08:00
oobabooga	86d59b4404	Installer: Fix edge case in wheel re-download caching	2026-03-06 06:16:57 -08:00
oobabooga	0e0e3ceb97	Update the custom gradio wheels	2026-03-06 05:46:08 -08:00
oobabooga	6d7018069c	Installer: Use absolute Python path in Windows batch scripts	2026-03-05 21:56:01 -08:00
oobabooga	f9ed8820de	API: Make tool function description and parameters optional	2026-03-05 21:43:33 -08:00
oobabooga	3880c1a406	API: Accept content:null and complex tool definitions in tool calling requests	2026-03-06 02:41:38 -03:00
oobabooga	93ebfa2b7e	Fix llama-server output filter for new log format	2026-03-06 02:38:13 -03:00
oobabooga	d0ac58ad31	API: Fix tool_calls placement and other response compatibility issues	2026-03-05 21:25:03 -08:00
oobabooga	f06583b2b9	API: Use \n instead of \r\n as the SSE separator to match OpenAI	2026-03-05 21:16:37 -08:00
oobabooga	8be444a559	Update the custom gradio wheels	2026-03-05 21:05:15 -08:00
oobabooga	1729fb07b9	Update llama.cpp	2026-03-05 21:04:24 -08:00
oobabooga	eba262d47a	Security: prevent path traversal in character/user/file save and delete	2026-03-06 02:00:10 -03:00
oobabooga	521ddbb722	Security: restrict API model loading args to UI-exposed parameters. The /v1/internal/model/load endpoint previously allowed setting any shared.args attribute, including security-sensitive flags like trust_remote_code. Now only keys from list_model_elements() are accepted.	2026-03-06 01:57:02 -03:00
oobabooga	66fb79fe15	llama.cpp: Add --fit-target param	2026-03-06 01:55:48 -03:00
oobabooga	e81a47f708	Improve the API generation defaults --help message	2026-03-05 20:41:45 -08:00
oobabooga	27bcc45c18	API: Add command-line flags to override default generation parameters	2026-03-06 01:36:45 -03:00
oobabooga	8a9afcbec6	Allow extensions to skip output post-processing	2026-03-06 01:19:46 -03:00
oobabooga	2e7e966ef2	Docs: Better Tool/Function calling examples	2026-03-05 20:06:34 -08:00
oobabooga	ddcad3cc51	Follow-up to `e2548f69`: add missing paths module, fix gallery extension	2026-03-06 00:58:03 -03:00
oobabooga	8d43123f73	API: Fix function calling for Qwen, Mistral, GPT-OSS, and other models. The tool call response parser only handled JSON-based formats, causing tool_calls to always be empty for models that use non-JSON formats. Add parsers for three additional tool call formats: (1) Qwen3.5: <tool_call><function=name><parameter=key>value</parameter>; (2) Mistral/Devstral: functionName{"arg": "value"}; (3) GPT-OSS: <|channel|>commentary to=functions.name<|message|>{...}. Also fix multi-turn tool conversations crashing with Jinja2 UndefinedError on tool_call_id by preserving tool_calls and tool_call_id metadata through the chat history conversion.	2026-03-06 00:55:33 -03:00
oobabooga	e2548f69a9	Make user_data configurable: add --user-data-dir flag, auto-detect ../user_data. If --user-data-dir is not set, auto-detect: use ../user_data when ./user_data doesn't exist, making it easy to share user data across portable builds by placing it one folder up.	2026-03-05 19:31:10 -08:00
oobabooga	4c406e024f	API: Speed up chat completions by ~85ms per request	2026-03-05 18:36:07 -08:00
oobabooga	249bd6eea2	UI: Update the parallel info message	2026-03-05 18:11:55 -08:00
oobabooga	f52d9336e5	TensorRT-LLM: Migrate from ModelRunner to LLM API, add concurrent API request support	2026-03-05 18:09:45 -08:00
oobabooga	9824c82cb6	API: Add parallel request support for llama.cpp and ExLlamaV3	2026-03-05 16:49:58 -08:00
oobabooga	2f08dce7b0	Remove ExLlamaV2 backend. Archived upstream: `7dc12af3a8`; replaced by ExLlamaV3, which has much better quantization accuracy.	2026-03-05 14:02:13 -08:00
oobabooga	134ac8fc29	Update README	2026-03-05 12:30:28 -08:00
oobabooga	409db3df1e	Training: Docs improvements	2026-03-05 11:30:57 -08:00
oobabooga	86d8291e58	Training: UI cleanup and better defaults	2026-03-05 11:20:55 -08:00
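
Commit f5acf55207 states a priority order for choosing the instruction template on API requests: per-request params, then --chat-template-file, then the model's built-in template. A minimal sketch of that fallback order follows. It is illustrative only and is not the project's code: the function name `resolve_chat_template` and its three parameters are hypothetical, and treating an empty string as "not provided" is an assumption.

```python
from typing import Optional


def resolve_chat_template(
    request_template: Optional[str],
    cli_template: Optional[str],
    model_template: Optional[str],
) -> Optional[str]:
    """Return the first template that is set, in priority order.

    Hypothetical helper: the order mirrors the commit message
    (per-request params > --chat-template-file > model's built-in template).
    """
    for candidate in (request_template, cli_template, model_template):
        # Assumption: None and "" both mean "not provided".
        if candidate:
            return candidate
    return None


if __name__ == "__main__":
    # The per-request value wins when present.
    print(resolve_chat_template("req", "cli", "model"))
    # Otherwise the value loaded from --chat-template-file is used.
    print(resolve_chat_template(None, "cli", "model"))
    # With neither, the model's built-in template is the fallback.
    print(resolve_chat_template(None, None, "model"))
```

Keeping the order in a single tuple makes the precedence easy to audit and to extend if another source is added later.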


5282 commits