552 commits

Author SHA1 Message Date
oobabooga 328215b0c7 API: Stop generation on client disconnect for non-streaming requests 2026-03-07 06:06:13 -08:00
oobabooga b8b4471ab5 Security: restrict file writes to user_data_dir, block extra_flags from API 2026-03-06 16:58:11 -03:00
oobabooga d03923924a Several small fixes
- Stop llama-server subprocess on model unload instead of relying on GC
- Fix tool_calls[].index being string instead of int in API responses
- Omit tool_calls key from API response when empty per OpenAI spec
- Prevent division by zero when micro_batch_size > batch_size in training
- Copy sampler_priority list before mutating in ExLlamaV3
- Normalize presence/frequency_penalty names for ExLlamaV3 sampler sorting
- Restore original chat_template after training instead of leaving it mutated
2026-03-06 16:52:13 -03:00
oobabooga 044566d42d API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models 2026-03-06 15:06:56 -03:00
oobabooga f5acf55207 Add --chat-template-file flag to override the default instruction template for API requests
Matches llama.cpp's flag name. Supports .jinja, .jinja2, and .yaml files.
Priority: per-request params > --chat-template-file > model's built-in template.
2026-03-06 14:04:16 -03:00
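The priority order stated in f5acf55207 could be sketched roughly as below. This is only an illustration of the fallback chain, not the project's code: the function name `resolve_chat_template` and its three parameters are hypothetical.

```python
from typing import Optional


def resolve_chat_template(
    request_template: Optional[str],
    file_template: Optional[str],
    model_template: Optional[str],
) -> Optional[str]:
    """Return the first template that is set, in priority order.

    Order (from the commit message): per-request params, then the
    --chat-template-file value, then the model's built-in template.
    All names here are hypothetical.
    """
    for candidate in (request_template, file_template, model_template):
        if candidate:  # None and "" both count as "not provided"
            return candidate
    return None
```

With this shape, a request that carries no template falls through to the file-supplied template, and a missing or empty file value falls through to the model's own.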
oobabooga 3531069824 API: Support Llama 4 tool calling and fix tool calling edge cases 2026-03-06 13:12:14 -03:00
oobabooga f9ed8820de API: Make tool function description and parameters optional 2026-03-05 21:43:33 -08:00
oobabooga 3880c1a406 API: Accept content:null and complex tool definitions in tool calling requests 2026-03-06 02:41:38 -03:00
oobabooga d0ac58ad31 API: Fix tool_calls placement and other response compatibility issues 2026-03-05 21:25:03 -08:00
oobabooga f06583b2b9 API: Use \n instead of \r\n as the SSE separator to match OpenAI 2026-03-05 21:16:37 -08:00
oobabooga 521ddbb722 Security: restrict API model loading args to UI-exposed parameters
The /v1/internal/model/load endpoint previously allowed setting any
shared.args attribute, including security-sensitive flags like
trust_remote_code. Now only keys from list_model_elements() are accepted.
2026-03-06 01:57:02 -03:00
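The allowlist approach described in 521ddbb722 might look something like the sketch below. It assumes the allowed keys are passed in as a plain list; `filter_load_args` and the example keys `ctx_size` and `gpu_layers` are hypothetical, and the real endpoint takes its allowlist from list_model_elements().

```python
from typing import Any, Dict, Iterable, List, Tuple


def filter_load_args(
    requested: Dict[str, Any], allowed: Iterable[str]
) -> Tuple[Dict[str, Any], List[str]]:
    """Split requested args into accepted ones and rejected key names.

    Hypothetical helper: only keys present in `allowed` are kept, so
    flags outside the allowlist (such as trust_remote_code) never
    reach the loader.
    """
    allowed_set = set(allowed)
    accepted = {k: v for k, v in requested.items() if k in allowed_set}
    rejected = sorted(k for k in requested if k not in allowed_set)
    return accepted, rejected
```

Returning the rejected key names separately lets a caller decide whether to ignore them silently or report an error.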
oobabooga 27bcc45c18 API: Add command-line flags to override default generation parameters 2026-03-06 01:36:45 -03:00
oobabooga ddcad3cc51 Follow-up to e2548f69: add missing paths module, fix gallery extension 2026-03-06 00:58:03 -03:00
oobabooga 8d43123f73 API: Fix function calling for Qwen, Mistral, GPT-OSS, and other models
The tool call response parser only handled JSON-based formats, causing
tool_calls to always be empty for models that use non-JSON formats.

Add parsers for three additional tool call formats:
- Qwen3.5: <tool_call><function=name><parameter=key>value</parameter>
- Mistral/Devstral: functionName{"arg": "value"}
- GPT-OSS: <|channel|>commentary to=functions.name<|message|>{...}

Also fix multi-turn tool conversations crashing with Jinja2
UndefinedError on tool_call_id by preserving tool_calls and
tool_call_id metadata through the chat history conversion.
2026-03-06 00:55:33 -03:00
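As a rough illustration of the first non-JSON format listed in 8d43123f73, the sketch below extracts function names and parameters with regular expressions. It is not the project's parser: the function name `parse_xml_style_tool_calls`, the output shape, and the presence of closing `</function>` and `</tool_call>` tags are assumptions, and parameter values are kept as plain strings.

```python
import re
from typing import Any, Dict, List

# A function block runs until the next <function=...>, a closing
# </tool_call>, or the end of the text (assumed layout).
FUNCTION_RE = re.compile(
    r"<function=([^>]+)>(.*?)(?=<function=|</tool_call>|$)", re.DOTALL
)
PARAM_RE = re.compile(r"<parameter=([^>]+)>(.*?)</parameter>", re.DOTALL)


def parse_xml_style_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Hypothetical parser for <function=name><parameter=key>value</parameter>."""
    calls = []
    for index, match in enumerate(FUNCTION_RE.finditer(text)):
        name = match.group(1).strip()
        body = match.group(2)
        arguments = {k.strip(): v.strip() for k, v in PARAM_RE.findall(body)}
        # index is an int, matching the fix listed under d03923924a
        calls.append({"index": index, "name": name, "arguments": arguments})
    return calls
```

Text without any `<function=...>` tag yields an empty list, which a caller could use to omit the tool_calls key entirely.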
oobabooga 4c406e024f API: Speed up chat completions by ~85ms per request 2026-03-05 18:36:07 -08:00
oobabooga 9824c82cb6 API: Add parallel request support for llama.cpp and ExLlamaV3 2026-03-05 16:49:58 -08:00
oobabooga 5be68cc073 Remove Training_PRO extension
The built-in training tab now covers its essential functionality
with a more modern and correct implementation (apply_chat_template,
dynamic padding, JSONL datasets, stride overlap).
2026-03-05 12:55:07 -03:00
thecaptain789 2ac4eb33c8 fix: correct typo 'occured' to 'occurred' (#7389) 2026-03-04 18:09:28 -03:00
Sense_wang 7bf15ad933 fix: replace bare except clauses with except Exception (#7400) 2026-03-04 18:06:17 -03:00
weiguang li 952e2c404a Bump sentence-transformers from 2.2.2 to 3.3.1 in superbooga (#7406) 2026-03-04 17:08:08 -03:00
oobabooga 65de4c30c8 Add adaptive-p sampler and n-gram speculative decoding support 2026-03-04 09:41:29 -08:00
oobabooga c026dbaf64 Fix API requests always returning the same 'created' time 2025-12-06 08:23:21 -08:00
oobabooga afa29b9554 Image: Several fixes 2025-12-05 05:58:57 -08:00
oobabooga 15c6e43597 Image: Add a revised_prompt field to API results for OpenAI compatibility 2025-12-04 17:41:09 -08:00
oobabooga 56f2a9512f Revert "Image: Add the LLM-generated prompt to the API result"
This reverts commit c7ad28a4cd.
2025-12-04 17:34:27 -08:00
oobabooga 3ef428efaa Image: Remove llm_variations from the API 2025-12-04 17:34:17 -08:00
oobabooga c7ad28a4cd Image: Add the LLM-generated prompt to the API result 2025-12-04 17:22:08 -08:00
oobabooga ffef3c7b1d Image: Make the LLM Variations prompt configurable 2025-12-04 10:44:35 -08:00
oobabooga 5763947c37 Image: Simplify the API code, add the llm_variations option 2025-12-04 10:23:00 -08:00
oobabooga 4468c49439 Add semaphore to image generation API endpoint 2025-12-03 12:02:47 -08:00
oobabooga 5433ef3333 Add an API endpoint for generating images 2025-12-03 11:50:56 -08:00
aidevtime 661e42d2b7 fix(deps): upgrade coqui-tts to >=0.27.0 for transformers 4.55 compatibility (#7329) 2025-11-28 22:59:36 -03:00
oobabooga 338ae36f73 Add weights_only=True to torch.load in Training_PRO 2025-10-28 12:43:16 -07:00
oobabooga 765af1ba17 API: Improve a validation 2025-08-11 12:39:48 -07:00
oobabooga b62c8845f3 mtmd: Fix /chat/completions for llama.cpp 2025-08-11 12:01:59 -07:00
oobabooga 6fbf162d71 Default max_tokens to 512 in the API instead of 16 2025-08-10 07:21:55 -07:00
oobabooga 1fb5807859 mtmd: Fix API text completion when no images are sent 2025-08-10 06:54:44 -07:00
oobabooga 2f90ac9880 Move the new image_utils.py file to modules/ 2025-08-09 21:41:38 -07:00
oobabooga d86b0ec010 Add multimodal support (llama.cpp) (#7027) 2025-08-10 01:27:25 -03:00
oobabooga d9db8f63a7 mtmd: Simplifications 2025-08-09 07:25:42 -07:00
Katehuuh 88127f46c1 Add multimodal support (ExLlamaV3) (#7174) 2025-08-08 23:31:16 -03:00
oobabooga 498778b8ac Add a new 'Reasoning effort' UI element 2025-08-05 15:19:11 -07:00
oobabooga 84617abdeb Properly fix the /v1/models endpoint 2025-06-19 10:25:55 -07:00
oobabooga dcdc42fa06 Fix the /v1/models output format (closes #7089) 2025-06-19 07:57:17 -07:00
oobabooga 6af3598cfa API: Remove obsolete list_dummy_models function 2025-06-18 16:15:42 -07:00
NoxWorld2660 0b26650f47 Expose real model list via /v1/models endpoint (#7088) 2025-06-18 20:14:24 -03:00
oobabooga 87ae09ecd6 Improve the basic API examples 2025-06-17 07:46:58 -07:00
oobabooga aa44e542cb Revert "Safer usage of mkdir across the project"
This reverts commit 0d1597616f.
2025-06-17 07:11:59 -07:00
oobabooga 0d1597616f Safer usage of mkdir across the project 2025-06-17 07:09:33 -07:00
djholtby 73bfc936a0 Close response generator when stopping API generation (#7014) 2025-05-26 22:39:03 -03:00