oobabooga
fb1b3b6ddf
API: Rewrite logprobs for OpenAI spec compliance across all backends
- Rewrite logprobs output format to match the OpenAI specification for
both chat completions and completions endpoints
- Fix top_logprobs count being ignored for llama.cpp and ExLlamav3
backends in chat completions (always returned 1 instead of requested N)
- Fix non-streaming responses only returning logprobs for the last token
instead of all generated tokens (affects all HF-based loaders)
- Fix logprobs returning null for non-streaming chat requests on HF loaders
- Fix an off-by-one that returned one extra top alternative on HF loaders
2026-03-12 14:17:32 -03:00
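The logprobs rewrite above targets the structure the OpenAI chat completions spec defines: one entry per generated token with its text, logprob, UTF-8 bytes, and exactly the requested number of top alternatives. A minimal sketch of assembling that structure (the helper name and its inputs are hypothetical, not the project's actual code):

```python
def build_logprobs_content(tokens, token_logprobs, top_alternatives, top_n):
    """Assemble an OpenAI-style `choices[i].logprobs.content` list.

    tokens:           generated token strings
    token_logprobs:   log-probability of each generated token
    top_alternatives: per position, a {token: logprob} dict of candidates
    top_n:            requested top_logprobs count, honored exactly
                      (avoiding the off-by-one extra alternative)
    """
    content = []
    for tok, lp, alts in zip(tokens, token_logprobs, top_alternatives):
        ranked = sorted(alts.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
        content.append({
            "token": tok,
            "logprob": lp,
            "bytes": list(tok.encode("utf-8")),
            "top_logprobs": [
                {"token": t, "logprob": l, "bytes": list(t.encode("utf-8"))}
                for t, l in ranked
            ],
        })
    return content
```

Per the spec, a non-streaming response carries one such entry per generated token, not just the last one.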
oobabooga
5a017aa338
API: Several OpenAI spec compliance fixes
- Return proper OpenAI error format ({"error": {...}}) instead of HTTP 500 for validation errors
- Send data: [DONE] at the end of SSE streams
- Fix finish_reason so "tool_calls" takes priority over "length"
- Stop including usage in streaming chunks when include_usage is not set
- Handle "developer" role in messages (treated same as "system")
- Add logprobs and top_logprobs parameters for chat completions
- Fix chat completions logprobs not working with llama.cpp and ExLlamav3 backends
- Add max_completion_tokens as an alias for max_tokens in chat completions
2026-03-12 13:30:38 -03:00
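Two of the fixes above are pure wire-format details: validation failures must come back as an OpenAI-style error envelope with a 4xx status rather than a bare HTTP 500, and SSE streams must end with the literal data: [DONE] sentinel. A hedged sketch of both, with hypothetical helper names:

```python
import json

def openai_error(message, err_type="invalid_request_error", param=None, code=None):
    # OpenAI-style error envelope; the server returns this JSON body
    # with an appropriate 4xx status code instead of HTTP 500.
    return {"error": {"message": message, "type": err_type,
                      "param": param, "code": code}}

def sse_stream(chunks):
    # Serialize each streaming chunk as a Server-Sent Event, then emit
    # the spec-mandated terminal sentinel.
    for chunk in chunks:
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```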
oobabooga
4b6c9db1c9
UI: Fix stale tool_sequence after edit and chat-instruct tool rendering
2026-03-12 13:12:18 -03:00
oobabooga
09723c9988
API: Include /v1 in the printed API URL for easier integration
2026-03-12 12:43:15 -03:00
oobabooga
2549f7c33b
API: Add tool_choice support and fix tool_calls spec compliance
2026-03-12 10:29:23 -03:00
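For reference, tool_choice in the OpenAI spec takes either a string ("none", "auto", "required") or an object forcing a specific function. A sketch of how a server might normalize it (hypothetical helper, not the project's actual code):

```python
def resolve_tool_choice(tool_choice, tools):
    """Normalize the OpenAI `tool_choice` parameter.

    Returns None (tools disabled), a mode string ("auto"/"required"),
    or the name of a forced tool.
    """
    if not tools or tool_choice == "none":
        return None
    if tool_choice in ("auto", "required", None):
        return tool_choice or "auto"
    if isinstance(tool_choice, dict):
        # {"type": "function", "function": {"name": ...}} forces one tool
        name = tool_choice.get("function", {}).get("name")
        available = {t["function"]["name"] for t in tools}
        if name not in available:
            raise ValueError(f"unknown tool: {name}")
        return name
    raise ValueError(f"invalid tool_choice: {tool_choice!r}")
```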
oobabooga
b5cac2e3b2
Fix swipes and editing for tool calling in the UI
2026-03-12 01:53:37 -03:00
oobabooga
0d62038710
Add tools refresh button and _tool_turn comment
2026-03-12 01:36:07 -03:00
oobabooga
cf9ad8eafe
Initial tool-calling support in the UI
2026-03-12 01:16:19 -03:00
oobabooga
980a9d1657
UI: Minor defensive changes to autosave
2026-03-11 15:50:16 -07:00
oobabooga
bb00d96dc3
Use a new gr.DragDrop element for Sampler priority + update gradio
2026-03-11 19:35:12 -03:00
oobabooga
66c976e995
Update README with ROCm 7.2 torch install URL
2026-03-11 19:35:12 -03:00
oobabooga
24977846fb
Update AMD ROCm from 6.4 to 7.2
2026-03-11 13:14:26 -07:00
oobabooga
7a63a56043
Update llama.cpp
2026-03-11 12:53:19 -07:00
oobabooga
f1cfeae372
API: Improve OpenAI spec compliance in streaming and non-streaming responses
2026-03-10 20:55:49 -07:00
oobabooga
3304b57bdf
Add native logit_bias and logprobs support for ExLlamav3
2026-03-10 11:03:25 -03:00
oobabooga
8aeaa76365
Forward logit_bias, logprobs, and n to llama.cpp backend
- Forward logit_bias and logprobs natively to llama.cpp
- Support n>1 completions with seed increment for diversity
- Fix logprobs returning empty dict when not requested
2026-03-10 10:41:45 -03:00
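The n>1 fix above rests on a simple idea: on a deterministic backend, every completion sampled with the same seed is identical, so each of the n sequences needs its own seed. A sketch of the seed-increment scheme (hypothetical helper name):

```python
def completion_seeds(base_seed, n):
    # One seed per requested completion: the request as a whole stays
    # reproducible, but the n sampling streams differ from one another.
    return [base_seed + i for i in range(n)]
```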
oobabooga
6ec4ca8b10
Add missing custom_token_bans to llama.cpp and reasoning_effort to ExLlamav3
2026-03-10 09:58:00 -03:00
oobabooga
307c085d1b
Minor warning change
2026-03-09 21:44:53 -07:00
oobabooga
c604ca66de
Update the --multi-user warning
2026-03-09 21:36:04 -07:00
oobabooga
15792c3cb8
Update ExLlamaV3 to 0.0.24
2026-03-09 20:31:05 -07:00
oobabooga
3b71932658
Update README
2026-03-09 20:18:09 -07:00
oobabooga
83b7e47d77
Update README
2026-03-09 20:12:54 -07:00
oobabooga
7f485274eb
Fix ExLlamaV3 EOS handling, load order, and perplexity evaluation
- Use config.eos_token_id_list for all EOS tokens as stop conditions
(fixes models like Llama-3 that define multiple EOS token IDs)
- Load vision/draft models before main model so autosplit accounts
for their VRAM usage
- Fix loss computation in ExLlamav3_HF: use cache across chunks so
sequences longer than 2048 tokens get correct perplexity values
2026-03-09 23:56:38 -03:00
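The EOS fix above matters because newer models declare several end tokens (Llama-3 defines both <|end_of_text|> and <|eot_id|>); stopping on only the first one lets generation run past the intended end of turn. A sketch of normalizing the config to a deduplicated stop list (hypothetical helper, not the project's actual code):

```python
def collect_stop_token_ids(eos_token_id_list=None, eos_token_id=None):
    """Normalize EOS configuration to a list of stop-token IDs.

    Prefers the full `eos_token_id_list` when present, falls back to the
    single `eos_token_id`, and deduplicates while preserving order.
    """
    ids = eos_token_id_list
    if ids is None:
        ids = eos_token_id
    if ids is None:
        return []
    if isinstance(ids, int):
        ids = [ids]
    return list(dict.fromkeys(ids))
```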
oobabooga
39e6c997cc
Refactor to not import gradio in --nowebui mode
2026-03-09 19:29:24 -07:00
oobabooga
970055ca00
Update Intel GPU support to use native PyTorch XPU wheels
PyTorch 2.9+ includes native XPU support, making
intel-extension-for-pytorch and the separate oneAPI conda
install unnecessary.
Closes #7308
2026-03-09 17:08:59 -03:00
oobabooga
d6643bb4bc
One-click installer: Optimize wheel downloads to only re-download changed wheels
2026-03-09 12:30:43 -07:00
oobabooga
9753b2342b
Fix crash on non-UTF-8 Windows locales (e.g. Chinese GBK)
Closes #7416
2026-03-09 16:22:37 -03:00
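The locale crash happens because open() without an explicit encoding uses the locale-dependent ANSI code page on Windows (cp936/GBK on Chinese systems), so UTF-8 files containing non-GBK byte sequences raise UnicodeDecodeError. The defensive pattern, sketched with hypothetical helper names:

```python
def read_text(path):
    # Always decode project text files as UTF-8 rather than relying on
    # the locale default, which differs per Windows installation.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

def write_text(path, text):
    # Mirror helper: always encode as UTF-8 on write as well.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
```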
oobabooga
eb4a20137a
Update README
2026-03-08 20:38:50 -07:00
oobabooga
634609acca
Fix pip installing to system Miniconda on Windows, revert 0132966d
2026-03-08 20:35:41 -07:00
oobabooga
40f1837b42
README: Minor updates
2026-03-08 08:38:29 -07:00
oobabooga
f6ffecfff2
Add guard against training with llama.cpp loader
2026-03-08 10:47:59 -03:00
oobabooga
5a91b8462f
Remove ctx_size_draft from ExLlamav3 loader
2026-03-08 09:53:48 -03:00
oobabooga
7a8ca9f2b0
Fix passing adaptive-p to llama-server
2026-03-08 04:09:40 -07:00
oobabooga
0132966d09
Add PyPI fallback for PyTorch install commands
2026-03-07 23:06:15 -03:00
oobabooga
baf4e13ff1
ExLlamav3: fix draft cache size to match main cache
2026-03-07 22:34:48 -03:00
oobabooga
6ff111d18e
ExLlamav3: handle exceptions in ConcurrentGenerator iterate loop
2026-03-07 22:05:31 -03:00
oobabooga
0cecc0a041
Use tar.gz for Linux/macOS portable builds to preserve symlinks
2026-03-07 06:59:48 -08:00
oobabooga
e1bf0b866f
Update the macos workflow
2026-03-07 06:46:46 -08:00
oobabooga
b686193fe2
Reapply "Update Miniforge from 25.3.0 to 26.1.0"
This reverts commit 085c4ef5d7.
2026-03-07 06:10:05 -08:00
oobabooga
328215b0c7
API: Stop generation on client disconnect for non-streaming requests
2026-03-07 06:06:13 -08:00
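The disconnect fix above amounts to polling the client connection while generation runs, so a request whose client has gone away stops consuming the backend. A hedged asyncio sketch (the helper and its wiring are hypothetical; an ASGI server would pass something like Starlette's request.is_disconnected as the check):

```python
import asyncio

async def run_until_disconnect(generate, is_disconnected, poll=0.01):
    """Drive a blocking generate() in a worker thread while polling an
    async disconnect check; returns the result, or None if the client
    went away first.  Sketch only, not the project's implementation.
    """
    task = asyncio.create_task(asyncio.to_thread(generate))
    while not task.done():
        if await is_disconnected():
            task.cancel()
            # Best effort: wait for the worker to unwind.  Real code must
            # also signal the backend to stop generating tokens.
            await asyncio.gather(task, return_exceptions=True)
            return None
        await asyncio.sleep(poll)
    return task.result()
```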
oobabooga
304510eb3d
ExLlamav3: route all generation through ConcurrentGenerator
2026-03-07 05:54:14 -08:00
oobabooga
085c4ef5d7
Revert "Update Miniforge from 25.3.0 to 26.1.0"
This reverts commit 9576c5a5f4.
2026-03-07 05:09:49 -08:00
oobabooga
aa634c77c0
Update llama.cpp
2026-03-06 21:00:36 -08:00
oobabooga
abc699db9b
Minor UI change
2026-03-06 19:03:38 -08:00
oobabooga
f2fe001cc4
Fix message copy buttons not working over HTTP
2026-03-06 19:01:38 -08:00
oobabooga
7ea5513263
Handle Qwen 3.5 thinking blocks
2026-03-06 19:01:28 -08:00
oobabooga
5fa709a3f4
llama.cpp server: use port+5 offset and suppress "No parser definition detected" logs
2026-03-06 18:52:34 -08:00
oobabooga
e8e0d02406
Remove outdated ROCm environment variable overrides from one_click.py
2026-03-06 18:15:05 -08:00
oobabooga
1eead661c3
Portable mode: always use ../user_data if it exists
2026-03-06 18:04:48 -08:00
oobabooga
d48b53422f
Training: Optimize _peek_json_keys to avoid loading entire file into memory
2026-03-06 15:39:08 -08:00
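The optimization above follows from an observation: for a dataset file shaped like a JSON array of uniform records, the keys of the first object are enough to learn the schema, so there is no need to load a multi-gigabyte file to inspect it. A sketch of the idea (hypothetical code, not the project's actual _peek_json_keys):

```python
import json

def peek_json_keys(path, max_bytes=65536):
    """Return the keys of the first JSON object in a possibly huge file,
    reading at most `max_bytes` instead of the whole file.

    Limitation: returns [] if the first object does not fit inside the
    prefix, or if the file contains no object at all.
    """
    with open(path, "r", encoding="utf-8") as f:
        prefix = f.read(max_bytes)
    start = prefix.find("{")
    if start == -1:
        return []
    try:
        # raw_decode parses the first complete value and ignores the rest
        obj, _ = json.JSONDecoder().raw_decode(prefix[start:])
        return list(obj)
    except json.JSONDecodeError:
        return []
```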