text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2026-03-18 11:24:39 +01:00

Author	SHA1	Message	Date
oobabooga	a09f21b9de	UI: Fix tool calling for GPT-OSS and Continue	2026-03-12 22:17:20 -03:00
oobabooga	9a7428b627	UI: Add collapsible accordions for tool calling steps	2026-03-12 14:16:04 -07:00
oobabooga	2d0cc7726e	API: Add reasoning_content field to non-streaming chat completions. Extract thinking/reasoning blocks (e.g. <think>...</think>) into a separate reasoning_content field on the assistant message, matching the convention used by DeepSeek, llama.cpp, and SGLang.	2026-03-12 16:30:46 -03:00
oobabooga	d45c9b3c59	API: Minor logprobs fixes	2026-03-12 16:09:49 -03:00
oobabooga	a916fb0e5c	API: Preserve mid-conversation system message positions	2026-03-12 14:27:24 -03:00
oobabooga	fb1b3b6ddf	API: Rewrite logprobs for OpenAI spec compliance across all backends. Rewrite logprobs output format to match the OpenAI specification for both chat completions and completions endpoints. Fix top_logprobs count being ignored for llama.cpp and ExLlamav3 backends in chat completions (always returned 1 instead of requested N). Fix non-streaming responses only returning logprobs for the last token instead of all generated tokens (affects all HF-based loaders). Fix logprobs returning null for non-streaming chat requests on HF loaders. Fix off-by-one returning one extra top alternative on HF loaders.	2026-03-12 14:17:32 -03:00
oobabooga	5a017aa338	API: Several OpenAI spec compliance fixes. Return proper OpenAI error format ({"error": {...}}) instead of HTTP 500 for validation errors. Send data: [DONE] at the end of SSE streams. Fix finish_reason so "tool_calls" takes priority over "length". Stop including usage in streaming chunks when include_usage is not set. Handle "developer" role in messages (treated same as "system"). Add logprobs and top_logprobs parameters for chat completions. Fix chat completions logprobs not working with llama.cpp and ExLlamav3 backends. Add max_completion_tokens as an alias for max_tokens in chat completions.	2026-03-12 13:30:38 -03:00
oobabooga	09723c9988	API: Include /v1 in the printed API URL for easier integration	2026-03-12 12:43:15 -03:00
oobabooga	2549f7c33b	API: Add tool_choice support and fix tool_calls spec compliance	2026-03-12 10:29:23 -03:00
oobabooga	f1cfeae372	API: Improve OpenAI spec compliance in streaming and non-streaming responses	2026-03-10 20:55:49 -07:00
oobabooga	3304b57bdf	Add native logit_bias and logprobs support for ExLlamav3	2026-03-10 11:03:25 -03:00
oobabooga	8aeaa76365	Forward logit_bias, logprobs, and n to llama.cpp backend. Forward logit_bias and logprobs natively to llama.cpp. Support n>1 completions with seed increment for diversity. Fix logprobs returning empty dict when not requested.	2026-03-10 10:41:45 -03:00
oobabooga	39e6c997cc	Refactor to not import gradio in `--nowebui` mode	2026-03-09 19:29:24 -07:00
oobabooga	328215b0c7	API: Stop generation on client disconnect for non-streaming requests	2026-03-07 06:06:13 -08:00
oobabooga	b8b4471ab5	Security: restrict file writes to user_data_dir, block extra_flags from API	2026-03-06 16:58:11 -03:00
oobabooga	d03923924a	Several small fixes. Stop llama-server subprocess on model unload instead of relying on GC. Fix tool_calls[].index being string instead of int in API responses. Omit tool_calls key from API response when empty per OpenAI spec. Prevent division by zero when micro_batch_size > batch_size in training. Copy sampler_priority list before mutating in ExLlamaV3. Normalize presence/frequency_penalty names for ExLlamaV3 sampler sorting. Restore original chat_template after training instead of leaving it mutated.	2026-03-06 16:52:13 -03:00
oobabooga	044566d42d	API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models	2026-03-06 15:06:56 -03:00
oobabooga	f5acf55207	Add --chat-template-file flag to override the default instruction template for API requests. Matches llama.cpp's flag name. Supports .jinja, .jinja2, and .yaml files. Priority: per-request params > --chat-template-file > model's built-in template.	2026-03-06 14:04:16 -03:00
oobabooga	3531069824	API: Support Llama 4 tool calling and fix tool calling edge cases	2026-03-06 13:12:14 -03:00
oobabooga	f9ed8820de	API: Make tool function description and parameters optional	2026-03-05 21:43:33 -08:00
oobabooga	3880c1a406	API: Accept content:null and complex tool definitions in tool calling requests	2026-03-06 02:41:38 -03:00
oobabooga	d0ac58ad31	API: Fix tool_calls placement and other response compatibility issues	2026-03-05 21:25:03 -08:00
oobabooga	f06583b2b9	API: Use \n instead of \r\n as the SSE separator to match OpenAI	2026-03-05 21:16:37 -08:00
oobabooga	521ddbb722	Security: restrict API model loading args to UI-exposed parameters. The /v1/internal/model/load endpoint previously allowed setting any shared.args attribute, including security-sensitive flags like trust_remote_code. Now only keys from list_model_elements() are accepted.	2026-03-06 01:57:02 -03:00
oobabooga	27bcc45c18	API: Add command-line flags to override default generation parameters	2026-03-06 01:36:45 -03:00
oobabooga	8d43123f73	API: Fix function calling for Qwen, Mistral, GPT-OSS, and other models. The tool call response parser only handled JSON-based formats, causing tool_calls to always be empty for models that use non-JSON formats. Add parsers for three additional tool call formats: Qwen3.5: <tool_call><function=name><parameter=key>value</parameter>; Mistral/Devstral: functionName{"arg": "value"}; GPT-OSS: <|channel|>commentary to=functions.name<|message|>{...}. Also fix multi-turn tool conversations crashing with Jinja2 UndefinedError on tool_call_id by preserving tool_calls and tool_call_id metadata through the chat history conversion.	2026-03-06 00:55:33 -03:00
oobabooga	4c406e024f	API: Speed up chat completions by ~85ms per request	2026-03-05 18:36:07 -08:00
oobabooga	9824c82cb6	API: Add parallel request support for llama.cpp and ExLlamaV3	2026-03-05 16:49:58 -08:00
Sense_wang	7bf15ad933	fix: replace bare except clauses with except Exception (#7400)	2026-03-04 18:06:17 -03:00
oobabooga	65de4c30c8	Add adaptive-p sampler and n-gram speculative decoding support	2026-03-04 09:41:29 -08:00
oobabooga	c026dbaf64	Fix API requests always returning the same 'created' time	2025-12-06 08:23:21 -08:00
oobabooga	afa29b9554	Image: Several fixes	2025-12-05 05:58:57 -08:00
oobabooga	15c6e43597	Image: Add a revised_prompt field to API results for OpenAI compatibility	2025-12-04 17:41:09 -08:00
oobabooga	56f2a9512f	Revert "Image: Add the LLM-generated prompt to the API result". This reverts commit `c7ad28a4cd`.	2025-12-04 17:34:27 -08:00
oobabooga	3ef428efaa	Image: Remove llm_variations from the API	2025-12-04 17:34:17 -08:00
oobabooga	c7ad28a4cd	Image: Add the LLM-generated prompt to the API result	2025-12-04 17:22:08 -08:00
oobabooga	ffef3c7b1d	Image: Make the LLM Variations prompt configurable	2025-12-04 10:44:35 -08:00
oobabooga	5763947c37	Image: Simplify the API code, add the llm_variations option	2025-12-04 10:23:00 -08:00
oobabooga	4468c49439	Add semaphore to image generation API endpoint	2025-12-03 12:02:47 -08:00
oobabooga	5433ef3333	Add an API endpoint for generating images	2025-12-03 11:50:56 -08:00
oobabooga	765af1ba17	API: Improve a validation	2025-08-11 12:39:48 -07:00
oobabooga	b62c8845f3	mtmd: Fix /chat/completions for llama.cpp	2025-08-11 12:01:59 -07:00
oobabooga	6fbf162d71	Default max_tokens to 512 in the API instead of 16	2025-08-10 07:21:55 -07:00
oobabooga	1fb5807859	mtmd: Fix API text completion when no images are sent	2025-08-10 06:54:44 -07:00
oobabooga	2f90ac9880	Move the new image_utils.py file to modules/	2025-08-09 21:41:38 -07:00
oobabooga	d86b0ec010	Add multimodal support (llama.cpp) (#7027)	2025-08-10 01:27:25 -03:00
oobabooga	d9db8f63a7	mtmd: Simplifications	2025-08-09 07:25:42 -07:00
Katehuuh	88127f46c1	Add multimodal support (ExLlamaV3) (#7174)	2025-08-08 23:31:16 -03:00
oobabooga	498778b8ac	Add a new 'Reasoning effort' UI element	2025-08-05 15:19:11 -07:00
oobabooga	84617abdeb	Properly fix the /v1/models endpoint	2025-06-19 10:25:55 -07:00

222 commits