Commit graph

5377 commits

Author SHA1 Message Date
oobabooga 24e7e77b55 Clean up 2026-03-13 12:37:10 -07:00
oobabooga cabb95f0d6 UI: Increase the instruct width to 768px 2026-03-13 12:24:48 -07:00
oobabooga 5362bbb413 Make web_search not download the page contents, use fetch_webpage instead 2026-03-13 12:09:08 -07:00
oobabooga d4c22ced83 UI: Optimize syntax highlighting and autoscroll by moving from MutationObserver to morphdom updates 2026-03-13 15:47:14 -03:00
oobabooga aab2596d29 UI: Fix multiple thinking blocks rendering as raw text in HTML generator 2026-03-13 15:47:11 -03:00
oobabooga e0a38da9f3 Improve tool call parsing for Devstral/GPT-OSS and preserve thinking across tool turns 2026-03-13 11:04:06 -03:00
oobabooga e50b823eee Update llama.cpp 2026-03-13 06:22:28 -07:00
oobabooga b7670cc762 Add a tool calling tutorial 2026-03-13 04:35:42 -07:00
oobabooga d0b72c73c0 Update diffusers to 0.37 2026-03-13 03:43:02 -07:00
oobabooga c39c187f47 UI: Improve the style of table scrollbars 2026-03-13 03:21:47 -07:00
oobabooga 4628825651 Better solution to fef95b9e56 2026-03-13 03:17:36 -07:00
oobabooga fef95b9e56 UI: Fix an autoscroll race condition during chat streaming 2026-03-13 03:05:09 -07:00
oobabooga 5833d94d7f UI: Prevent word breaks in tables 2026-03-13 02:56:49 -07:00
oobabooga a4bef860b6 UI: Optimize chat streaming by batching morphdom to one update per animation frame
The monitor physically cannot paint faster than its refresh rate, so
intermediate morphdom calls between frames do redundant parsing, diffing,
and patching work that is never displayed.
2026-03-13 06:45:47 -03:00
oobabooga 5ddc1002d2 Update ExLlamaV3 to 0.0.25 2026-03-13 02:40:17 -07:00
oobabooga c094bc943c UI: Skip output extensions on intermediate tool-calling turns 2026-03-12 21:45:38 -07:00
oobabooga 85ec85e569 UI: Fix Continue while in a tool-calling loop, remove the upper limit on number of tool calls 2026-03-12 20:22:35 -07:00
oobabooga 04213dff14 Address copilot feedback 2026-03-12 19:55:20 -07:00
oobabooga 24fdcc52b3 Merge branch 'main' into dev 2026-03-12 19:33:03 -07:00
oobabooga 58f26a4cc7 UI: Skip redundant work in chat loop when no tools are selected 2026-03-12 19:18:55 -07:00
oobabooga 0e35421593 API: Always extract reasoning_content, even with tool calls 2026-03-12 18:52:41 -07:00
oobabooga 1ed56aee85 Add a calculate tool 2026-03-12 18:45:19 -07:00
oobabooga 286ae475f6 UI: Clean up tool calling code 2026-03-12 22:39:38 -03:00
oobabooga 4c7a56c18d Add num_pages and max_tokens kwargs to web search tools 2026-03-12 22:17:23 -03:00
oobabooga a09f21b9de UI: Fix tool calling for GPT-OSS and Continue 2026-03-12 22:17:20 -03:00
oobabooga 1b7e6c5705 Add the fetch_webpage tool source 2026-03-12 17:11:05 -07:00
oobabooga f8936ec47c Truncate web_search and fetch_webpage tools to 8192 tokens 2026-03-12 17:10:41 -07:00
oobabooga 5c02b7f603 Allow the fetch_webpage tool to return links 2026-03-12 17:08:30 -07:00
oobabooga 09d5e049d6 UI: Improve the Tools checkbox list style 2026-03-12 16:53:49 -07:00
oobabooga fdd8e5b1fd Make repeated Ctrl+C force a shutdown 2026-03-12 15:48:50 -07:00
oobabooga 4f82b71ef3 UI: Bump the ctx-size max from 131072 to 262144 (256K) 2026-03-12 14:56:35 -07:00
oobabooga bbd43d9463 UI: Correctly propagate truncation_length when ctx_size is auto 2026-03-12 14:54:05 -07:00
oobabooga 3e6bd1a310 UI: Prepend thinking tag when template appends it to prompt
Makes Qwen models have a thinking block straight away during streaming.
2026-03-12 14:30:51 -07:00
oobabooga 9a7428b627 UI: Add collapsible accordions for tool calling steps 2026-03-12 14:16:04 -07:00
oobabooga 2d0cc7726e API: Add reasoning_content field to non-streaming chat completions
Extract thinking/reasoning blocks (e.g. <think>...</think>) into a
separate reasoning_content field on the assistant message, matching
the convention used by DeepSeek, llama.cpp, and SGLang.
2026-03-12 16:30:46 -03:00
oobabooga d45c9b3c59 API: Minor logprobs fixes 2026-03-12 16:09:49 -03:00
oobabooga 2466305f76 Add tool examples 2026-03-12 16:03:57 -03:00
oobabooga a916fb0e5c API: Preserve mid-conversation system message positions 2026-03-12 14:27:24 -03:00
oobabooga fb1b3b6ddf API: Rewrite logprobs for OpenAI spec compliance across all backends
- Rewrite logprobs output format to match the OpenAI specification for
  both chat completions and completions endpoints
- Fix top_logprobs count being ignored for llama.cpp and ExLlamav3
  backends in chat completions (always returned 1 instead of requested N)
- Fix non-streaming responses only returning logprobs for the last token
  instead of all generated tokens (affects all HF-based loaders)
- Fix logprobs returning null for non-streaming chat requests on HF loaders
- Fix off-by-one returning one extra top alternative on HF loaders
2026-03-12 14:17:32 -03:00
oobabooga 5a017aa338 API: Several OpenAI spec compliance fixes
- Return proper OpenAI error format ({"error": {...}}) instead of HTTP 500 for validation errors
- Send data: [DONE] at the end of SSE streams
- Fix finish_reason so "tool_calls" takes priority over "length"
- Stop including usage in streaming chunks when include_usage is not set
- Handle "developer" role in messages (treated same as "system")
- Add logprobs and top_logprobs parameters for chat completions
- Fix chat completions logprobs not working with llama.cpp and ExLlamav3 backends
- Add max_completion_tokens as an alias for max_tokens in chat completions
2026-03-12 13:30:38 -03:00
oobabooga 4b6c9db1c9 UI: Fix stale tool_sequence after edit and chat-instruct tool rendering 2026-03-12 13:12:18 -03:00
oobabooga 09723c9988 API: Include /v1 in the printed API URL for easier integration 2026-03-12 12:43:15 -03:00
oobabooga 2549f7c33b API: Add tool_choice support and fix tool_calls spec compliance 2026-03-12 10:29:23 -03:00
oobabooga b5cac2e3b2 Fix swipes and edit for tool calling in the UI 2026-03-12 01:53:37 -03:00
oobabooga 0d62038710 Add tools refresh button and _tool_turn comment 2026-03-12 01:36:07 -03:00
oobabooga cf9ad8eafe Initial tool-calling support in the UI 2026-03-12 01:16:19 -03:00
oobabooga 980a9d1657 UI: Minor defensive changes to autosave 2026-03-11 15:50:16 -07:00
oobabooga bb00d96dc3 Use a new gr.DragDrop element for Sampler priority + update gradio 2026-03-11 19:35:12 -03:00
oobabooga 66c976e995 Update README with ROCm 7.2 torch install URL 2026-03-11 19:35:12 -03:00
oobabooga 24977846fb Update AMD ROCm from 6.4 to 7.2 2026-03-11 13:14:26 -07:00