oobabooga
b8b4471ab5
Security: restrict file writes to user_data_dir, block extra_flags from API
2026-03-06 16:58:11 -03:00
oobabooga
d03923924a
Several small fixes
...
- Stop llama-server subprocess on model unload instead of relying on GC
- Fix tool_calls[].index being string instead of int in API responses
- Omit tool_calls key from API response when empty per OpenAI spec
- Prevent division by zero when micro_batch_size > batch_size in training
- Copy sampler_priority list before mutating in ExLlamaV3
- Normalize presence/frequency_penalty names for ExLlamaV3 sampler sorting
- Restore original chat_template after training instead of leaving it mutated
2026-03-06 16:52:13 -03:00
oobabooga
044566d42d
API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models
2026-03-06 15:06:56 -03:00
oobabooga
f5acf55207
Add --chat-template-file flag to override the default instruction template for API requests
...
Matches llama.cpp's flag name. Supports .jinja, .jinja2, and .yaml files.
Priority: per-request params > --chat-template-file > model's built-in template.
2026-03-06 14:04:16 -03:00
oobabooga
3531069824
API: Support Llama 4 tool calling and fix tool calling edge cases
2026-03-06 13:12:14 -03:00
oobabooga
f9ed8820de
API: Make tool function description and parameters optional
2026-03-05 21:43:33 -08:00
oobabooga
3880c1a406
API: Accept content:null and complex tool definitions in tool calling requests
2026-03-06 02:41:38 -03:00
oobabooga
d0ac58ad31
API: Fix tool_calls placement and other response compatibility issues
2026-03-05 21:25:03 -08:00
oobabooga
f06583b2b9
API: Use \n instead of \r\n as the SSE separator to match OpenAI
2026-03-05 21:16:37 -08:00
oobabooga
521ddbb722
Security: restrict API model loading args to UI-exposed parameters
...
The /v1/internal/model/load endpoint previously allowed setting any
shared.args attribute, including security-sensitive flags like
trust_remote_code. Now only keys from list_model_elements() are accepted.
2026-03-06 01:57:02 -03:00
oobabooga
27bcc45c18
API: Add command-line flags to override default generation parameters
2026-03-06 01:36:45 -03:00
oobabooga
ddcad3cc51
Follow-up to e2548f69: add missing paths module, fix gallery extension
2026-03-06 00:58:03 -03:00
oobabooga
8d43123f73
API: Fix function calling for Qwen, Mistral, GPT-OSS, and other models
...
The tool call response parser only handled JSON-based formats, causing
tool_calls to always be empty for models that use non-JSON formats.
Add parsers for three additional tool call formats:
- Qwen3.5: <tool_call><function=name><parameter=key>value</parameter>
- Mistral/Devstral: functionName{"arg": "value"}
- GPT-OSS: <|channel|>commentary to=functions.name<|message|>{...}
Also fix multi-turn tool conversations crashing with Jinja2
UndefinedError on tool_call_id by preserving tool_calls and
tool_call_id metadata through the chat history conversion.
2026-03-06 00:55:33 -03:00
oobabooga
4c406e024f
API: Speed up chat completions by ~85ms per request
2026-03-05 18:36:07 -08:00
oobabooga
9824c82cb6
API: Add parallel request support for llama.cpp and ExLlamaV3
2026-03-05 16:49:58 -08:00
oobabooga
5be68cc073
Remove Training_PRO extension
...
The built-in training tab now covers its essential functionality
with a more modern and correct implementation (apply_chat_template,
dynamic padding, JSONL datasets, stride overlap).
2026-03-05 12:55:07 -03:00
thecaptain789
2ac4eb33c8
fix: correct typo 'occured' to 'occurred' ( #7389 )
2026-03-04 18:09:28 -03:00
Sense_wang
7bf15ad933
fix: replace bare except clauses with except Exception ( #7400 )
2026-03-04 18:06:17 -03:00
weiguang li
952e2c404a
Bump sentence-transformers from 2.2.2 to 3.3.1 in superbooga ( #7406 )
2026-03-04 17:08:08 -03:00
oobabooga
65de4c30c8
Add adaptive-p sampler and n-gram speculative decoding support
2026-03-04 09:41:29 -08:00
oobabooga
c026dbaf64
Fix API requests always returning the same 'created' time
2025-12-06 08:23:21 -08:00
oobabooga
afa29b9554
Image: Several fixes
2025-12-05 05:58:57 -08:00
oobabooga
15c6e43597
Image: Add a revised_prompt field to API results for OpenAI compatibility
2025-12-04 17:41:09 -08:00
oobabooga
56f2a9512f
Revert "Image: Add the LLM-generated prompt to the API result"
...
This reverts commit c7ad28a4cd .
2025-12-04 17:34:27 -08:00
oobabooga
3ef428efaa
Image: Remove llm_variations from the API
2025-12-04 17:34:17 -08:00
oobabooga
c7ad28a4cd
Image: Add the LLM-generated prompt to the API result
2025-12-04 17:22:08 -08:00
oobabooga
ffef3c7b1d
Image: Make the LLM Variations prompt configurable
2025-12-04 10:44:35 -08:00
oobabooga
5763947c37
Image: Simplify the API code, add the llm_variations option
2025-12-04 10:23:00 -08:00
oobabooga
4468c49439
Add semaphore to image generation API endpoint
2025-12-03 12:02:47 -08:00
oobabooga
5433ef3333
Add an API endpoint for generating images
2025-12-03 11:50:56 -08:00
aidevtime
661e42d2b7
fix(deps): upgrade coqui-tts to >=0.27.0 for transformers 4.55 compatibility ( #7329 )
2025-11-28 22:59:36 -03:00
oobabooga
338ae36f73
Add weights_only=True to torch.load in Training_PRO
2025-10-28 12:43:16 -07:00
oobabooga
765af1ba17
API: Improve a validation
2025-08-11 12:39:48 -07:00
oobabooga
b62c8845f3
mtmd: Fix /chat/completions for llama.cpp
2025-08-11 12:01:59 -07:00
oobabooga
6fbf162d71
Default max_tokens to 512 in the API instead of 16
2025-08-10 07:21:55 -07:00
oobabooga
1fb5807859
mtmd: Fix API text completion when no images are sent
2025-08-10 06:54:44 -07:00
oobabooga
2f90ac9880
Move the new image_utils.py file to modules/
2025-08-09 21:41:38 -07:00
oobabooga
d86b0ec010
Add multimodal support (llama.cpp) ( #7027 )
2025-08-10 01:27:25 -03:00
oobabooga
d9db8f63a7
mtmd: Simplifications
2025-08-09 07:25:42 -07:00
Katehuuh
88127f46c1
Add multimodal support (ExLlamaV3) ( #7174 )
2025-08-08 23:31:16 -03:00
oobabooga
498778b8ac
Add a new 'Reasoning effort' UI element
2025-08-05 15:19:11 -07:00
oobabooga
84617abdeb
Properly fix the /v1/models endpoint
2025-06-19 10:25:55 -07:00
oobabooga
dcdc42fa06
Fix the /v1/models output format ( closes #7089 )
2025-06-19 07:57:17 -07:00
oobabooga
6af3598cfa
API: Remove obsolete list_dummy_models function
2025-06-18 16:15:42 -07:00
NoxWorld2660
0b26650f47
Expose real model list via /v1/models endpoint ( #7088 )
2025-06-18 20:14:24 -03:00
oobabooga
87ae09ecd6
Improve the basic API examples
2025-06-17 07:46:58 -07:00
oobabooga
aa44e542cb
Revert "Safer usage of mkdir across the project"
...
This reverts commit 0d1597616f .
2025-06-17 07:11:59 -07:00
oobabooga
0d1597616f
Safer usage of mkdir across the project
2025-06-17 07:09:33 -07:00
djholtby
73bfc936a0
Close response generator when stopping API generation ( #7014 )
2025-05-26 22:39:03 -03:00
oobabooga
83bfd5c64b
Fix API issues
2025-05-18 12:45:01 -07:00