Commit graph

5400 commits

Author SHA1 Message Date
oobabooga
d1aba08561 UI: Set chat widths to 724px 2026-03-14 18:35:44 -07:00
oobabooga
c126530061 UI: Minor color change 2026-03-14 18:22:41 -07:00
oobabooga
b9bdbd638e Fix after 4ae2bd86e2 2026-03-14 18:18:33 -07:00
oobabooga
9eacd4a207 UI: Minor morphdom optimizations 2026-03-14 16:07:16 -07:00
oobabooga
e11425d5f8 Fix relative redirect handling in web page fetcher 2026-03-14 15:46:21 -07:00
oobabooga
4ae2bd86e2 Change the default ctx-size to 0 (auto) for llama.cpp 2026-03-14 15:30:01 -07:00
oobabooga
9f657d3976 UI: Fix a minor glitch 2026-03-14 14:19:12 -07:00
oobabooga
c09a367c64 UI: Fix dark theme using light theme syntax highlighting 2026-03-14 14:08:03 -07:00
oobabooga
beab346f48 UI: Fix a minor glitch 2026-03-14 12:45:37 -07:00
oobabooga
573617157a Optimize tool call detection
Avoids templates that don't contain a given necessary keyword
2026-03-14 12:09:41 -07:00
oobabooga
d0a4993cf4 UI: Increase ctx-size slider maximum to 1M and step to 1024 2026-03-14 09:53:12 -07:00
oobabooga
c7953fb923 Add ROCm version to portable package filenames 2026-03-14 09:44:37 -07:00
oobabooga
c908ac00d7 Replace html2text with trafilatura for better web content extraction
After this change a lot of boilerplate is removed from web pages, saving tokens on agentic loops.
2026-03-14 09:29:17 -07:00
oobabooga
8bff331893 UI: Fix tool call markup flashing before accordion appears during streaming 2026-03-14 09:26:20 -07:00
oobabooga
cb08ba63dc Fix GPT-OSS channel markup leaking into UI when model skips analysis block 2026-03-14 09:08:05 -07:00
oobabooga
09a6549816 API: Stream reasoning_content separately from content in OpenAI-compatible responses 2026-03-14 06:52:40 -07:00
oobabooga
accb2ef661 UI/API: Prevent tool call markup from leaking into streamed UI output (closes #7427) 2026-03-14 06:26:47 -07:00
oobabooga
998b9bfb2a UI: Make all chat styles better match instruct style 2026-03-13 21:07:40 -07:00
oobabooga
5f1707af35 UI: Increase the width of non-instruct chat styles 2026-03-13 20:38:40 -07:00
oobabooga
16636c04b8 UI: Minor fix/optimization 2026-03-13 19:06:04 -07:00
oobabooga
e8d1c66303 Clean up tool calling code 2026-03-13 18:27:01 -07:00
oobabooga
cb88066d15 Update llama.cpp 2026-03-13 13:17:41 -07:00
oobabooga
0cd245bcbb UI: Make autoscroll more robust after the optimizations 2026-03-13 12:58:56 -07:00
oobabooga
24e7e77b55 Clean up 2026-03-13 12:37:10 -07:00
oobabooga
cabb95f0d6 UI: Increase the instruct width to 768px 2026-03-13 12:24:48 -07:00
oobabooga
5362bbb413 Make web_search not download the page contents, use fetch_webpage instead 2026-03-13 12:09:08 -07:00
oobabooga
d4c22ced83 UI: Optimize syntax highlighting and autoscroll by moving from MutationObserver to morphdom updates 2026-03-13 15:47:14 -03:00
oobabooga
aab2596d29 UI: Fix multiple thinking blocks rendering as raw text in HTML generator 2026-03-13 15:47:11 -03:00
oobabooga
e0a38da9f3 Improve tool call parsing for Devstral/GPT-OSS and preserve thinking across tool turns 2026-03-13 11:04:06 -03:00
oobabooga
e50b823eee Update llama.cpp 2026-03-13 06:22:28 -07:00
oobabooga
b7670cc762 Add a tool calling tutorial 2026-03-13 04:35:42 -07:00
oobabooga
d0b72c73c0 Update diffusers to 0.37 2026-03-13 03:43:02 -07:00
oobabooga
c39c187f47 UI: Improve the style of table scrollbars 2026-03-13 03:21:47 -07:00
oobabooga
4628825651 Better solution to fef95b9e56 2026-03-13 03:17:36 -07:00
oobabooga
fef95b9e56 UI: Fix an autoscroll race condition during chat streaming 2026-03-13 03:05:09 -07:00
oobabooga
5833d94d7f UI: Prevent word breaks in tables 2026-03-13 02:56:49 -07:00
oobabooga
a4bef860b6 UI: Optimize chat streaming by batching morphdom to one update per animation frame
The monitor physically cannot paint faster than its refresh rate, so
intermediate morphdom calls between frames do redundant parsing, diffing,
and patching work that is never displayed.
2026-03-13 06:45:47 -03:00
oobabooga
5ddc1002d2 Update ExLlamaV3 to 0.0.25 2026-03-13 02:40:17 -07:00
oobabooga
c094bc943c UI: Skip output extensions on intermediate tool-calling turns 2026-03-12 21:45:38 -07:00
oobabooga
85ec85e569 UI: Fix Continue while in a tool-calling loop, remove the upper limit on number of tool calls 2026-03-12 20:22:35 -07:00
oobabooga
04213dff14 Address copilot feedback 2026-03-12 19:55:20 -07:00
oobabooga
24fdcc52b3 Merge branch 'main' into dev 2026-03-12 19:33:03 -07:00
oobabooga
58f26a4cc7 UI: Skip redundant work in chat loop when no tools are selected 2026-03-12 19:18:55 -07:00
oobabooga
0e35421593 API: Always extract reasoning_content, even with tool calls 2026-03-12 18:52:41 -07:00
oobabooga
1ed56aee85 Add a calculate tool 2026-03-12 18:45:19 -07:00
oobabooga
286ae475f6 UI: Clean up tool calling code 2026-03-12 22:39:38 -03:00
oobabooga
4c7a56c18d Add num_pages and max_tokens kwargs to web search tools 2026-03-12 22:17:23 -03:00
oobabooga
a09f21b9de UI: Fix tool calling for GPT-OSS and Continue 2026-03-12 22:17:20 -03:00
oobabooga
1b7e6c5705 Add the fetch_webpage tool source 2026-03-12 17:11:05 -07:00
oobabooga
f8936ec47c Truncate web_search and fetch_webpage tools to 8192 tokens 2026-03-12 17:10:41 -07:00
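
Note on 573617157a (Optimize tool call detection): the commit body says the change "avoids templates that don't contain a given necessary keyword". The repository's actual code is not shown in this listing, so the following is only a minimal sketch of that kind of prefilter, under the assumption that each tool call format has a marker string that must appear in the chat template before its parser is worth running. The parser functions, the marker strings, and the PARSERS table are all hypothetical names made up for illustration.

```python
import json
import re
from typing import Callable, List, Optional, Tuple

Parser = Callable[[str], Optional[dict]]


def parse_tagged_json(text: str) -> Optional[dict]:
    """Parse <tool_call>{...}</tool_call> (hypothetical format)."""
    match = re.search(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def parse_bracket_marker(text: str) -> Optional[dict]:
    """Parse [TOOL_CALLS]{...} (hypothetical format)."""
    marker = "[TOOL_CALLS]"
    start = text.find(marker)
    if start == -1:
        return None
    try:
        return json.loads(text[start + len(marker):].strip())
    except json.JSONDecodeError:
        return None


# Hypothetical table: (keyword the chat template must contain, parser).
PARSERS: List[Tuple[str, Parser]] = [
    ("<tool_call>", parse_tagged_json),
    ("[TOOL_CALLS]", parse_bracket_marker),
]


def candidate_parsers(template: str) -> List[Parser]:
    """Keep only parsers whose keyword appears in the template."""
    return [parser for keyword, parser in PARSERS if keyword in template]


def detect_tool_call(output: str, template: str) -> Optional[dict]:
    """Try the surviving parsers in order; return the first parsed call."""
    for parser in candidate_parsers(template):
        call = parser(output)
        if call is not None:
            return call
    return None
```

With a template that only mentions `<tool_call>`, the bracket-marker parser is never invoked, so the cost of detection depends on the formats the loaded template can actually produce and not on the total number of supported formats.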