oobabooga
cabb95f0d6
UI: Increase the instruct width to 768px
2026-03-13 12:24:48 -07:00
oobabooga
5362bbb413
Make web_search not download the page contents, use fetch_webpage instead
2026-03-13 12:09:08 -07:00
oobabooga
d4c22ced83
UI: Optimize syntax highlighting and autoscroll by moving from MutationObserver to morphdom updates
2026-03-13 15:47:14 -03:00
oobabooga
aab2596d29
UI: Fix multiple thinking blocks rendering as raw text in HTML generator
2026-03-13 15:47:11 -03:00
oobabooga
e0a38da9f3
Improve tool call parsing for Devstral/GPT-OSS and preserve thinking across tool turns
2026-03-13 11:04:06 -03:00
oobabooga
e50b823eee
Update llama.cpp
2026-03-13 06:22:28 -07:00
oobabooga
b7670cc762
Add a tool calling tutorial
2026-03-13 04:35:42 -07:00
oobabooga
d0b72c73c0
Update diffusers to 0.37
2026-03-13 03:43:02 -07:00
oobabooga
c39c187f47
UI: Improve the style of table scrollbars
2026-03-13 03:21:47 -07:00
oobabooga
4628825651
Better solution to fef95b9e56
2026-03-13 03:17:36 -07:00
oobabooga
fef95b9e56
UI: Fix an autoscroll race condition during chat streaming
2026-03-13 03:05:09 -07:00
oobabooga
5833d94d7f
UI: Prevent word breaks in tables
2026-03-13 02:56:49 -07:00
oobabooga
a4bef860b6
UI: Optimize chat streaming by batching morphdom to one update per animation frame
The monitor physically cannot paint faster than its refresh rate, so
intermediate morphdom calls between frames do redundant parsing, diffing,
and patching work that is never displayed.
2026-03-13 06:45:47 -03:00
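The actual change lives in the UI's JavaScript (morphdom driven by requestAnimationFrame); as a language-neutral illustration, this hypothetical Python sketch shows the same coalescing idea, where any number of streamed updates between frames collapse into a single applied patch:

```python
# Hypothetical sketch of per-frame update coalescing (not the project's
# actual JS code): schedule() may fire on every streamed token, but
# on_frame() applies at most one patch per display refresh.
class FrameBatcher:
    def __init__(self):
        self.pending = None   # newest state awaiting the next frame
        self.applied = []     # patches actually applied, one per frame

    def schedule(self, html: str):
        # Called on every streamed token: only the latest state matters,
        # since intermediate states would never be painted anyway.
        self.pending = html

    def on_frame(self):
        # Called once per refresh: apply the pending patch, if any.
        if self.pending is not None:
            self.applied.append(self.pending)
            self.pending = None
```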
oobabooga
5ddc1002d2
Update ExLlamaV3 to 0.0.25
2026-03-13 02:40:17 -07:00
oobabooga
c094bc943c
UI: Skip output extensions on intermediate tool-calling turns
2026-03-12 21:45:38 -07:00
oobabooga
85ec85e569
UI: Fix Continue while in a tool-calling loop, remove the upper limit on number of tool calls
2026-03-12 20:22:35 -07:00
oobabooga
04213dff14
Address copilot feedback
2026-03-12 19:55:20 -07:00
oobabooga
24fdcc52b3
Merge branch 'main' into dev
2026-03-12 19:33:03 -07:00
oobabooga
58f26a4cc7
UI: Skip redundant work in chat loop when no tools are selected
2026-03-12 19:18:55 -07:00
oobabooga
0e35421593
API: Always extract reasoning_content, even with tool calls
2026-03-12 18:52:41 -07:00
oobabooga
1ed56aee85
Add a calculate tool
2026-03-12 18:45:19 -07:00
oobabooga
286ae475f6
UI: Clean up tool calling code
2026-03-12 22:39:38 -03:00
oobabooga
4c7a56c18d
Add num_pages and max_tokens kwargs to web search tools
2026-03-12 22:17:23 -03:00
oobabooga
a09f21b9de
UI: Fix tool calling for GPT-OSS and Continue
2026-03-12 22:17:20 -03:00
oobabooga
1b7e6c5705
Add the fetch_webpage tool source
2026-03-12 17:11:05 -07:00
oobabooga
f8936ec47c
Truncate web_search and fetch_webpage tools to 8192 tokens
2026-03-12 17:10:41 -07:00
oobabooga
5c02b7f603
Allow the fetch_webpage tool to return links
2026-03-12 17:08:30 -07:00
oobabooga
09d5e049d6
UI: Improve the Tools checkbox list style
2026-03-12 16:53:49 -07:00
oobabooga
fdd8e5b1fd
Make repeated Ctrl+C force a shutdown
2026-03-12 15:48:50 -07:00
oobabooga
4f82b71ef3
UI: Bump the ctx-size max from 131072 to 262144 (256K)
2026-03-12 14:56:35 -07:00
oobabooga
bbd43d9463
UI: Correctly propagate truncation_length when ctx_size is auto
2026-03-12 14:54:05 -07:00
oobabooga
3e6bd1a310
UI: Prepend thinking tag when template appends it to prompt
Makes Qwen models show a thinking block immediately during streaming.
2026-03-12 14:30:51 -07:00
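A minimal sketch of the idea, assuming a hypothetical helper name: when the chat template itself closes the prompt with an opening thinking tag (as Qwen templates do), the model's streamed reply never contains that tag, so it must be prepended before rendering for the thinking block to appear right away:

```python
# Hypothetical sketch (names are illustrative, not the project's API):
# prepend the thinking tag to the streamed reply only when the template
# already appended it to the prompt and the reply doesn't repeat it.
def visible_reply(prompt: str, streamed: str, tag: str = "<think>") -> str:
    if prompt.rstrip().endswith(tag) and not streamed.lstrip().startswith(tag):
        return tag + streamed
    return streamed
```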
oobabooga
9a7428b627
UI: Add collapsible accordions for tool calling steps
2026-03-12 14:16:04 -07:00
oobabooga
2d0cc7726e
API: Add reasoning_content field to non-streaming chat completions
Extract thinking/reasoning blocks (e.g. <think>...</think>) into a
separate reasoning_content field on the assistant message, matching
the convention used by DeepSeek, llama.cpp, and SGLang.
2026-03-12 16:30:46 -03:00
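The extraction described above can be sketched as follows; this is a simplified illustration (the helper name and exact regex are assumptions, not the project's code), splitting a raw reply into visible content plus a separate reasoning_content field:

```python
import re

# Hypothetical sketch: pull <think>...</think> blocks out of a raw
# assistant reply into a separate reasoning_content field, following
# the convention used by DeepSeek, llama.cpp, and SGLang.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw: str) -> dict:
    match = THINK_RE.search(raw)
    reasoning = match.group(1).strip() if match else None
    content = THINK_RE.sub("", raw).strip()
    return {
        "role": "assistant",
        "content": content,
        "reasoning_content": reasoning,
    }
```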
oobabooga
d45c9b3c59
API: Minor logprobs fixes
2026-03-12 16:09:49 -03:00
oobabooga
2466305f76
Add tool examples
2026-03-12 16:03:57 -03:00
oobabooga
a916fb0e5c
API: Preserve mid-conversation system message positions
2026-03-12 14:27:24 -03:00
oobabooga
fb1b3b6ddf
API: Rewrite logprobs for OpenAI spec compliance across all backends
- Rewrite logprobs output format to match the OpenAI specification for
both chat completions and completions endpoints
- Fix top_logprobs count being ignored for llama.cpp and ExLlamav3
backends in chat completions (always returned 1 instead of requested N)
- Fix non-streaming responses only returning logprobs for the last token
instead of all generated tokens (affects all HF-based loaders)
- Fix logprobs returning null for non-streaming chat requests on HF loaders
- Fix off-by-one returning one extra top alternative on HF loaders
2026-03-12 14:17:32 -03:00
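For orientation, a hypothetical sketch of the target shape (the builder function is illustrative, not the project's code): in the OpenAI specification, choices[i].logprobs.content holds one entry per generated token, each with exactly the requested number of top alternatives rather than a fixed 1:

```python
# Hypothetical sketch of the OpenAI-spec logprobs payload: one entry
# per generated token, each with the chosen token's logprob and the
# requested number N of top alternatives.
def build_logprobs(tokens, alternatives, top_n):
    content = []
    for token, alts in zip(tokens, alternatives):
        ranked = sorted(alts.items(), key=lambda kv: -kv[1])[:top_n]
        content.append({
            "token": token,
            "logprob": alts[token],
            "top_logprobs": [{"token": t, "logprob": lp} for t, lp in ranked],
        })
    return {"content": content}
```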
oobabooga
5a017aa338
API: Several OpenAI spec compliance fixes
- Return proper OpenAI error format ({"error": {...}}) instead of HTTP 500 for validation errors
- Send data: [DONE] at the end of SSE streams
- Fix finish_reason so "tool_calls" takes priority over "length"
- Stop including usage in streaming chunks when include_usage is not set
- Handle "developer" role in messages (treated same as "system")
- Add logprobs and top_logprobs parameters for chat completions
- Fix chat completions logprobs not working with llama.cpp and ExLlamav3 backends
- Add max_completion_tokens as an alias for max_tokens in chat completions
2026-03-12 13:30:38 -03:00
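Two of the fixes above can be sketched with hypothetical helpers (names are illustrative, not the project's code): the OpenAI error envelope returned instead of a bare HTTP 500, and an SSE stream terminated with data: [DONE]:

```python
import json

# Hypothetical sketches of the OpenAI error envelope and SSE framing.
def error_body(message: str, err_type: str = "invalid_request_error") -> str:
    # OpenAI-spec error format: a top-level {"error": {...}} object.
    return json.dumps({
        "error": {"message": message, "type": err_type, "param": None, "code": None}
    })

def sse_stream(chunks):
    # Each chunk is one SSE event; the stream ends with data: [DONE].
    for chunk in chunks:
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"
```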
oobabooga
4b6c9db1c9
UI: Fix stale tool_sequence after edit and chat-instruct tool rendering
2026-03-12 13:12:18 -03:00
oobabooga
09723c9988
API: Include /v1 in the printed API URL for easier integration
2026-03-12 12:43:15 -03:00
oobabooga
2549f7c33b
API: Add tool_choice support and fix tool_calls spec compliance
2026-03-12 10:29:23 -03:00
oobabooga
b5cac2e3b2
Fix swipes and edit for tool calling in the UI
2026-03-12 01:53:37 -03:00
oobabooga
0d62038710
Add tools refresh button and _tool_turn comment
2026-03-12 01:36:07 -03:00
oobabooga
cf9ad8eafe
Initial tool-calling support in the UI
2026-03-12 01:16:19 -03:00
oobabooga
980a9d1657
UI: Minor defensive changes to autosave
2026-03-11 15:50:16 -07:00
oobabooga
bb00d96dc3
Use a new gr.DragDrop element for Sampler priority + update gradio
2026-03-11 19:35:12 -03:00
oobabooga
66c976e995
Update README with ROCm 7.2 torch install URL
2026-03-11 19:35:12 -03:00
oobabooga
24977846fb
Update AMD ROCm from 6.4 to 7.2
2026-03-11 13:14:26 -07:00
oobabooga
7a63a56043
Update llama.cpp
2026-03-11 12:53:19 -07:00