9d02d3a13b  2026-03-16 16:10:17 -07:00  oobabooga  docs: Minor change to tool calling tutorial
238cbd5656  2026-03-16 16:05:43 -07:00  oobabooga  training: Remove arbitrary higher_rank_limit parameter
22ff5044b0  2026-03-16 16:01:40 -07:00  oobabooga  training: Organize the UI
1c89376370  2026-03-16 15:23:24 -07:00  oobabooga  training: Add gradient_checkpointing for lower VRAM by default
44810751de  2026-03-16 06:21:14 -07:00  oobabooga  Update llama.cpp
6c05a964a7  2026-03-16 06:00:16 -07:00  oobabooga  docs: Mention supported tool-calling models
737ded6959  2026-03-16 05:37:46 -07:00  oobabooga  Web search: Fix SSRF validation to block all non-global IPs
50685c93f2  2026-03-16 05:29:27 -07:00  oobabooga  Update README
9d9f5d9860  2026-03-15 20:27:44 -07:00  oobabooga  Update README
5cfe9fe295  2026-03-15 20:12:22 -07:00  oobabooga  Update README
b76a289e04  2026-03-15 18:04:34 -07:00  oobabooga  API: Respect --listen-host for the OpenAI API server
    Closes #7429
c0de1d176c  2026-03-15 17:57:31 -07:00  oobabooga  UI: Add an incognito chat option
4f80b20859  2026-03-15 16:38:54 -07:00  oobabooga  UI: Follow-up to beab346f (fix scroll deadlock on chat-parent)
f8ff7cf99e  2026-03-15 14:12:59 -07:00  oobabooga  Update the custom gradio wheels
92d376e420  2026-03-15 13:14:53 -07:00  oobabooga  web_search: Return all results and improve URL extraction
f6a749a151  2026-03-15 10:17:31 -07:00  oobabooga  API: Fix /v1/models to only list the currently loaded model
1a2b840938  2026-03-15 09:52:31 -07:00  oobabooga  UI: Fix scroll jump when toggling thinking blocks during streaming
bfea49b197  2026-03-15 09:34:17 -07:00  oobabooga  Move top_p and top_k higher up in the UI and CLI help
80d0c03bab  2026-03-15 09:29:25 -07:00  oobabooga  llama.cpp: Change the default --fit-target from 1024 to 512
9119ce0680  2026-03-15 09:24:14 -07:00  oobabooga  llama.cpp: Use --fit-ctx 8192 when --fit on is used
    This sets the minimum acceptable context length, which by default is 4096.
5763cab3c4  2026-03-15 07:13:26 -07:00  oobabooga  Fix a crash loading the MiniMax-M2.5 jinja template
f0c16813ef  2026-03-14 19:43:25 -07:00  oobabooga  Remove the rope scaling parameters
    Now models have 131k+ context length. The parameters can still be
    passed to llama.cpp through --extra-flags.
2d3a3794c9  2026-03-14 19:22:12 -07:00  oobabooga  Add a Top-P preset, make it the new default, clean up the built-in presets
9955e54a1f  2026-03-14 18:51:12 -07:00  oobabooga  UI: Fix autoscroll not engaging when regenerating short chats
d1aba08561  2026-03-14 18:35:44 -07:00  oobabooga  UI: Set chat widths to 724px
c126530061  2026-03-14 18:22:41 -07:00  oobabooga  UI: Minor color change
b9bdbd638e  2026-03-14 18:18:33 -07:00  oobabooga  Fix after 4ae2bd86e2
9eacd4a207  2026-03-14 16:07:16 -07:00  oobabooga  UI: Minor morphdom optimizations
e11425d5f8  2026-03-14 15:46:21 -07:00  oobabooga  Fix relative redirect handling in web page fetcher
4ae2bd86e2  2026-03-14 15:30:01 -07:00  oobabooga  Change the default ctx-size to 0 (auto) for llama.cpp
9f657d3976  2026-03-14 14:19:12 -07:00  oobabooga  UI: Fix a minor glitch
c09a367c64  2026-03-14 14:08:03 -07:00  oobabooga  UI: Fix dark theme using light theme syntax highlighting
beab346f48  2026-03-14 12:45:37 -07:00  oobabooga  UI: Fix a minor glitch
573617157a  2026-03-14 12:09:41 -07:00  oobabooga  Optimize tool call detection
    Avoids templates that don't contain a given necessary keyword
d0a4993cf4  2026-03-14 09:53:12 -07:00  oobabooga  UI: Increase ctx-size slider maximum to 1M and step to 1024
c7953fb923  2026-03-14 09:44:37 -07:00  oobabooga  Add ROCm version to portable package filenames
c908ac00d7  2026-03-14 09:29:17 -07:00  oobabooga  Replace html2text with trafilatura for better web content extraction
    After this change a lot of boilerplate is removed from web pages, saving tokens on agentic loops.
8bff331893  2026-03-14 09:26:20 -07:00  oobabooga  UI: Fix tool call markup flashing before accordion appears during streaming
cb08ba63dc  2026-03-14 09:08:05 -07:00  oobabooga  Fix GPT-OSS channel markup leaking into UI when model skips analysis block
09a6549816  2026-03-14 06:52:40 -07:00  oobabooga  API: Stream reasoning_content separately from content in OpenAI-compatible responses
accb2ef661  2026-03-14 06:26:47 -07:00  oobabooga  UI/API: Prevent tool call markup from leaking into streamed UI output (closes #7427)
998b9bfb2a  2026-03-13 21:07:40 -07:00  oobabooga  UI: Make all chat styles better match instruct style
5f1707af35  2026-03-13 20:38:40 -07:00  oobabooga  UI: Increase the width of non-instruct chat styles
16636c04b8  2026-03-13 19:06:04 -07:00  oobabooga  UI: Minor fix/optimization
e8d1c66303  2026-03-13 18:27:01 -07:00  oobabooga  Clean up tool calling code
cb88066d15  2026-03-13 13:17:41 -07:00  oobabooga  Update llama.cpp
0cd245bcbb  2026-03-13 12:58:56 -07:00  oobabooga  UI: Make autoscroll more robust after the optimizations
24e7e77b55  2026-03-13 12:37:10 -07:00  oobabooga  Clean up
cabb95f0d6  2026-03-13 12:24:48 -07:00  oobabooga  UI: Increase the instruct width to 768px
5362bbb413  2026-03-13 12:09:08 -07:00  oobabooga  Make web_search not download the page contents, use fetch_webpage instead