oobabooga
d6643bb4bc
One-click installer: Optimize wheel downloads to only re-download changed wheels
2026-03-09 12:30:43 -07:00
oobabooga
9753b2342b
Fix crash on non-UTF-8 Windows locales (e.g. Chinese GBK)
Closes #7416
2026-03-09 16:22:37 -03:00
oobabooga
eb4a20137a
Update README
2026-03-08 20:38:50 -07:00
oobabooga
634609acca
Fix pip installing to system Miniconda on Windows, revert 0132966d
2026-03-08 20:35:41 -07:00
oobabooga
40f1837b42
README: Minor updates
2026-03-08 08:38:29 -07:00
oobabooga
f6ffecfff2
Add guard against training with llama.cpp loader
2026-03-08 10:47:59 -03:00
oobabooga
5a91b8462f
Remove ctx_size_draft from ExLlamav3 loader
2026-03-08 09:53:48 -03:00
oobabooga
7a8ca9f2b0
Fix passing adaptive-p to llama-server
2026-03-08 04:09:40 -07:00
oobabooga
0132966d09
Add PyPI fallback for PyTorch install commands
2026-03-07 23:06:15 -03:00
oobabooga
baf4e13ff1
ExLlamav3: fix draft cache size to match main cache
2026-03-07 22:34:48 -03:00
oobabooga
6ff111d18e
ExLlamav3: handle exceptions in ConcurrentGenerator iterate loop
2026-03-07 22:05:31 -03:00
oobabooga
0cecc0a041
Use tar.gz for Linux/macOS portable builds to preserve symlinks
2026-03-07 06:59:48 -08:00
oobabooga
e1bf0b866f
Update the macos workflow
2026-03-07 06:46:46 -08:00
oobabooga
b686193fe2
Reapply "Update Miniforge from 25.3.0 to 26.1.0"
This reverts commit 085c4ef5d7.
2026-03-07 06:10:05 -08:00
oobabooga
328215b0c7
API: Stop generation on client disconnect for non-streaming requests
2026-03-07 06:06:13 -08:00
oobabooga
304510eb3d
ExLlamav3: route all generation through ConcurrentGenerator
2026-03-07 05:54:14 -08:00
oobabooga
085c4ef5d7
Revert "Update Miniforge from 25.3.0 to 26.1.0"
This reverts commit 9576c5a5f4.
2026-03-07 05:09:49 -08:00
oobabooga
aa634c77c0
Update llama.cpp
2026-03-06 21:00:36 -08:00
oobabooga
abc699db9b
Minor UI change
2026-03-06 19:03:38 -08:00
oobabooga
f2fe001cc4
Fix message copy buttons not working over HTTP
2026-03-06 19:01:38 -08:00
oobabooga
7ea5513263
Handle Qwen 3.5 thinking blocks
2026-03-06 19:01:28 -08:00
oobabooga
5fa709a3f4
llama.cpp server: use port+5 offset and suppress "No parser definition detected" logs
2026-03-06 18:52:34 -08:00
oobabooga
e8e0d02406
Remove outdated ROCm environment variable overrides from one_click.py
2026-03-06 18:15:05 -08:00
oobabooga
1eead661c3
Portable mode: always use ../user_data if it exists
2026-03-06 18:04:48 -08:00
oobabooga
d48b53422f
Training: Optimize _peek_json_keys to avoid loading entire file into memory
2026-03-06 15:39:08 -08:00
oobabooga
2beaa4b971
Update llama.cpp
2026-03-06 14:39:35 -08:00
oobabooga
5f6754c267
Fix stop button being ignored when token throttling is off
2026-03-06 17:12:34 -03:00
oobabooga
b8b4471ab5
Security: restrict file writes to user_data_dir, block extra_flags from API
2026-03-06 16:58:11 -03:00
oobabooga
d03923924a
Several small fixes
- Stop llama-server subprocess on model unload instead of relying on GC
- Fix tool_calls[].index being string instead of int in API responses
- Omit tool_calls key from API response when empty per OpenAI spec
- Prevent division by zero when micro_batch_size > batch_size in training
- Copy sampler_priority list before mutating in ExLlamaV3
- Normalize presence/frequency_penalty names for ExLlamaV3 sampler sorting
- Restore original chat_template after training instead of leaving it mutated
2026-03-06 16:52:13 -03:00
oobabooga
044566d42d
API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models
2026-03-06 15:06:56 -03:00
oobabooga
f5acf55207
Add --chat-template-file flag to override the default instruction template for API requests
Matches llama.cpp's flag name. Supports .jinja, .jinja2, and .yaml files.
Priority: per-request params > --chat-template-file > model's built-in template.
2026-03-06 14:04:16 -03:00
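The priority chain stated in the commit body can be sketched as follows. This is a hypothetical illustration, not the project's actual implementation; the function name and parameters are invented, and in the real flag the `.jinja`/`.jinja2`/`.yaml` file contents would be loaded from disk.

```python
def resolve_chat_template(request_template, cli_template_file, model_template):
    """Pick the chat template per the stated priority:
    per-request params > --chat-template-file > model's built-in template.
    (Hypothetical helper for illustration only.)"""
    if request_template is not None:
        return request_template
    if cli_template_file is not None:
        # The real flag would read and parse the template file here.
        return cli_template_file
    return model_template
```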
oobabooga
3531069824
API: Support Llama 4 tool calling and fix tool calling edge cases
2026-03-06 13:12:14 -03:00
oobabooga
160f7ad6b4
Handle SIGTERM to stop llama-server on pkill
2026-03-06 12:56:33 -03:00
oobabooga
8e24a20873
Installer: Fix libstdcxx-ng version pin causing conda solver to hang on Python 3.13
2026-03-06 07:39:50 -08:00
oobabooga
3bab7fbfd4
Update Colab notebook: new default model, direct GGUF URL support
2026-03-06 06:52:49 -08:00
oobabooga
e7e0df0101
Fix hover menu shifting down when chat input grows
2026-03-06 11:52:16 -03:00
oobabooga
3323dedd08
Update llama.cpp
2026-03-06 06:30:01 -08:00
oobabooga
36dbc4ccce
Remove unused colorama and psutil requirements
2026-03-06 06:28:35 -08:00
oobabooga
86d59b4404
Installer: Fix edge case in wheel re-download caching
2026-03-06 06:16:57 -08:00
oobabooga
0e0e3ceb97
Update the custom gradio wheels
2026-03-06 05:46:08 -08:00
oobabooga
6d7018069c
Installer: Use absolute Python path in Windows batch scripts
2026-03-05 21:56:01 -08:00
oobabooga
f9ed8820de
API: Make tool function description and parameters optional
2026-03-05 21:43:33 -08:00
oobabooga
3880c1a406
API: Accept content:null and complex tool definitions in tool calling requests
2026-03-06 02:41:38 -03:00
oobabooga
93ebfa2b7e
Fix llama-server output filter for new log format
2026-03-06 02:38:13 -03:00
oobabooga
d0ac58ad31
API: Fix tool_calls placement and other response compatibility issues
2026-03-05 21:25:03 -08:00
oobabooga
f06583b2b9
API: Use \n instead of \r\n as the SSE separator to match OpenAI
2026-03-05 21:16:37 -08:00
oobabooga
8be444a559
Update the custom gradio wheels
2026-03-05 21:05:15 -08:00
oobabooga
1729fb07b9
Update llama.cpp
2026-03-05 21:04:24 -08:00
oobabooga
eba262d47a
Security: prevent path traversal in character/user/file save and delete
2026-03-06 02:00:10 -03:00
oobabooga
521ddbb722
Security: restrict API model loading args to UI-exposed parameters
The /v1/internal/model/load endpoint previously allowed setting any
shared.args attribute, including security-sensitive flags like
trust_remote_code. Now only keys from list_model_elements() are accepted.
2026-03-06 01:57:02 -03:00
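The allowlist approach described in the commit body amounts to filtering request keys against a fixed set before applying them. A minimal sketch, assuming a stand-in allowlist (the real one comes from `list_model_elements()`; the key names below are invented for illustration):

```python
def filter_load_args(payload, allowed_keys):
    """Keep only allowlisted keys from an API payload, so security-sensitive
    flags like trust_remote_code cannot be set through the endpoint.
    (Illustrative sketch, not the project's actual code.)"""
    return {k: v for k, v in payload.items() if k in allowed_keys}

# Stand-in for list_model_elements(); real key names may differ.
ALLOWED = {"model_name", "ctx_size", "gpu_layers"}
```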