Commit graph

192 commits

Author SHA1 Message Date
oobabooga c026dbaf64 Fix API requests always returning the same 'created' time 2025-12-06 08:23:21 -08:00
oobabooga afa29b9554 Image: Several fixes 2025-12-05 05:58:57 -08:00
oobabooga 15c6e43597 Image: Add a revised_prompt field to API results for OpenAI compatibility 2025-12-04 17:41:09 -08:00
oobabooga 56f2a9512f Revert "Image: Add the LLM-generated prompt to the API result" 2025-12-04 17:34:27 -08:00
  This reverts commit c7ad28a4cd.
oobabooga 3ef428efaa Image: Remove llm_variations from the API 2025-12-04 17:34:17 -08:00
oobabooga c7ad28a4cd Image: Add the LLM-generated prompt to the API result 2025-12-04 17:22:08 -08:00
oobabooga ffef3c7b1d Image: Make the LLM Variations prompt configurable 2025-12-04 10:44:35 -08:00
oobabooga 5763947c37 Image: Simplify the API code, add the llm_variations option 2025-12-04 10:23:00 -08:00
oobabooga 4468c49439 Add semaphore to image generation API endpoint 2025-12-03 12:02:47 -08:00
oobabooga 5433ef3333 Add an API endpoint for generating images 2025-12-03 11:50:56 -08:00
oobabooga 765af1ba17 API: Improve a validation 2025-08-11 12:39:48 -07:00
oobabooga b62c8845f3 mtmd: Fix /chat/completions for llama.cpp 2025-08-11 12:01:59 -07:00
oobabooga 6fbf162d71 Default max_tokens to 512 in the API instead of 16 2025-08-10 07:21:55 -07:00
oobabooga 1fb5807859 mtmd: Fix API text completion when no images are sent 2025-08-10 06:54:44 -07:00
oobabooga 2f90ac9880 Move the new image_utils.py file to modules/ 2025-08-09 21:41:38 -07:00
oobabooga d86b0ec010 Add multimodal support (llama.cpp) (#7027) 2025-08-10 01:27:25 -03:00
oobabooga d9db8f63a7 mtmd: Simplifications 2025-08-09 07:25:42 -07:00
Katehuuh 88127f46c1 Add multimodal support (ExLlamaV3) (#7174) 2025-08-08 23:31:16 -03:00
oobabooga 498778b8ac Add a new 'Reasoning effort' UI element 2025-08-05 15:19:11 -07:00
oobabooga 84617abdeb Properly fix the /v1/models endpoint 2025-06-19 10:25:55 -07:00
oobabooga dcdc42fa06 Fix the /v1/models output format (closes #7089) 2025-06-19 07:57:17 -07:00
oobabooga 6af3598cfa API: Remove obsolete list_dummy_models function 2025-06-18 16:15:42 -07:00
NoxWorld2660 0b26650f47 Expose real model list via /v1/models endpoint (#7088) 2025-06-18 20:14:24 -03:00
oobabooga 87ae09ecd6 Improve the basic API examples 2025-06-17 07:46:58 -07:00
djholtby 73bfc936a0 Close response generator when stopping API generation (#7014) 2025-05-26 22:39:03 -03:00
oobabooga 83bfd5c64b Fix API issues 2025-05-18 12:45:01 -07:00
oobabooga 076aa67963 Fix API issues 2025-05-17 22:22:18 -07:00
oobabooga 470c822f44 API: Hide the uvicorn access logs from the terminal 2025-05-16 12:54:39 -07:00
oobabooga fd61297933 Lint 2025-05-15 21:19:19 -07:00
oobabooga c375b69413 API: Fix llama.cpp generating after disconnect, improve disconnect detection, fix deadlock on simultaneous requests 2025-05-13 11:23:33 -07:00
oobabooga 0c5fa3728e Revert "Fix API failing to cancel streams (attempt), closes #6966" 2025-05-10 19:12:40 -07:00
  This reverts commit 006a866079.
oobabooga 006a866079 Fix API failing to cancel streams (attempt), closes #6966 2025-05-10 17:55:48 -07:00
Jonas fa960496d5 Tools support for OpenAI compatible API (#6827) 2025-05-08 12:30:27 -03:00
oobabooga f82667f0b4 Remove more multimodal extension references 2025-05-05 14:17:00 -07:00
oobabooga 85bf2e15b9 API: Remove obsolete multimodal extension handling 2025-05-05 14:14:48 -07:00
  Multimodal support will be added back once it's implemented in llama-server.
oobabooga d10bded7f8 UI: Add an enable_thinking option to enable/disable Qwen3 thinking 2025-04-28 22:37:01 -07:00
oobabooga bbcaec75b4 API: Find a new port if the default one is taken (closes #6918) 2025-04-27 21:13:16 -07:00
oobabooga 35717a088c API: Add an /v1/internal/health endpoint 2025-04-26 15:42:27 -07:00
oobabooga bc55feaf3e Improve host header validation in local mode 2025-04-26 15:42:17 -07:00
oobabooga d9de14d1f7 Restructure the repository (#6904) 2025-04-26 08:56:54 -03:00
oobabooga d5e1bccef9 Remove the SpeechRecognition requirement 2025-04-20 11:47:28 -07:00
oobabooga ae02ffc605 Refactor the transformers loader (#6859) 2025-04-20 13:33:47 -03:00
oobabooga ae54d8faaa New llama.cpp loader (#6846) 2025-04-18 09:59:37 -03:00
oobabooga 5bcd2d7ad0 Add the top N-sigma sampler (#6796) 2025-03-14 16:45:11 -03:00
oobabooga edbe0af647 Minor fixes after 0360f54ae8 2025-02-02 17:04:56 -08:00
oobabooga 0360f54ae8 UI: add a "Show after" parameter (to use with DeepSeek </think>) 2025-02-02 15:30:09 -08:00
oobabooga f01cc079b9 Lint 2025-01-29 14:00:59 -08:00
FP HAM 4bd260c60d Give SillyTavern a bit of leaway the way the do OpenAI (#6685) 2025-01-22 12:01:44 -03:00
oobabooga 83c426e96b Organize internals (#6646) 2025-01-10 18:04:32 -03:00
BPplays 619265b32c add ipv6 support to the API (#6559) 2025-01-09 10:23:44 -03:00