oobabooga
5a017aa338
API: Several OpenAI spec compliance fixes
...
- Return proper OpenAI error format ({"error": {...}}) instead of HTTP 500 for validation errors
- Send data: [DONE] at the end of SSE streams
- Fix finish_reason so "tool_calls" takes priority over "length"
- Stop including usage in streaming chunks when include_usage is not set
- Handle "developer" role in messages (treated same as "system")
- Add logprobs and top_logprobs parameters for chat completions
- Fix chat completions logprobs not working with llama.cpp and ExLlamav3 backends
- Add max_completion_tokens as an alias for max_tokens in chat completions
2026-03-12 13:30:38 -03:00
oobabooga
2549f7c33b
API: Add tool_choice support and fix tool_calls spec compliance
2026-03-12 10:29:23 -03:00
oobabooga
f1cfeae372
API: Improve OpenAI spec compliance in streaming and non-streaming responses
2026-03-10 20:55:49 -07:00
oobabooga
8aeaa76365
Forward logit_bias, logprobs, and n to llama.cpp backend
...
- Forward logit_bias and logprobs natively to llama.cpp
- Support n>1 completions with seed increment for diversity
- Fix logprobs returning empty dict when not requested
2026-03-10 10:41:45 -03:00
oobabooga
f9ed8820de
API: Make tool function description and parameters optional
2026-03-05 21:43:33 -08:00
oobabooga
3880c1a406
API: Accept content:null and complex tool definitions in tool calling requests
2026-03-06 02:41:38 -03:00
oobabooga
27bcc45c18
API: Add command-line flags to override default generation parameters
2026-03-06 01:36:45 -03:00
oobabooga
65de4c30c8
Add adaptive-p sampler and n-gram speculative decoding support
2026-03-04 09:41:29 -08:00
oobabooga
c026dbaf64
Fix API requests always returning the same 'created' time
2025-12-06 08:23:21 -08:00
oobabooga
3ef428efaa
Image: Remove llm_variations from the API
2025-12-04 17:34:17 -08:00
oobabooga
ffef3c7b1d
Image: Make the LLM Variations prompt configurable
2025-12-04 10:44:35 -08:00
oobabooga
5763947c37
Image: Simplify the API code, add the llm_variations option
2025-12-04 10:23:00 -08:00
oobabooga
5433ef3333
Add an API endpoint for generating images
2025-12-03 11:50:56 -08:00
oobabooga
765af1ba17
API: Improve a validation
2025-08-11 12:39:48 -07:00
oobabooga
6fbf162d71
Default max_tokens to 512 in the API instead of 16
2025-08-10 07:21:55 -07:00
Katehuuh
88127f46c1
Add multimodal support (ExLlamaV3) ( #7174 )
2025-08-08 23:31:16 -03:00
oobabooga
498778b8ac
Add a new 'Reasoning effort' UI element
2025-08-05 15:19:11 -07:00
oobabooga
87ae09ecd6
Improve the basic API examples
2025-06-17 07:46:58 -07:00
Jonas
fa960496d5
Tools support for OpenAI compatible API ( #6827 )
2025-05-08 12:30:27 -03:00
oobabooga
d10bded7f8
UI: Add an enable_thinking option to enable/disable Qwen3 thinking
2025-04-28 22:37:01 -07:00
oobabooga
d9de14d1f7
Restructure the repository ( #6904 )
2025-04-26 08:56:54 -03:00
oobabooga
5bcd2d7ad0
Add the top N-sigma sampler ( #6796 )
2025-03-14 16:45:11 -03:00
oobabooga
edbe0af647
Minor fixes after 0360f54ae8
2025-02-02 17:04:56 -08:00
oobabooga
0360f54ae8
UI: add a "Show after" parameter (to use with DeepSeek </think>)
2025-02-02 15:30:09 -08:00
oobabooga
83c426e96b
Organize internals ( #6646 )
2025-01-10 18:04:32 -03:00
oobabooga
11af199aff
Add a "Static KV cache" option for transformers
2025-01-04 17:52:57 -08:00
Philipp Emanuel Weidmann
301375834e
Exclude Top Choices (XTC): A sampler that boosts creativity, breaks writing clichés, and inhibits non-verbatim repetition ( #6335 )
2024-09-27 22:50:12 -03:00
Philipp Emanuel Weidmann
852c943769
DRY: A modern repetition penalty that reliably prevents looping ( #5677 )
2024-05-19 23:53:47 -03:00
oobabooga
f27e1ba302
Add a /v1/internal/chat-prompt endpoint ( #5879 )
2024-04-19 00:24:46 -03:00
oobabooga
c37f792afa
Better way to handle user_bio default in the API (alternative to bdcf31035f)
2024-03-29 10:54:01 -07:00
oobabooga
35da6b989d
Organize the parameters tab ( #5767 )
2024-03-28 16:45:03 -03:00
Yiximail
bdcf31035f
Set a default empty string for user_bio to fix #5717 issue ( #5722 )
2024-03-26 16:34:03 -03:00
oobabooga
28076928ac
UI: Add a new "User description" field for user personality/biography ( #5691 )
2024-03-11 23:41:57 -03:00
kalomaze
cfb25c9b3f
Cubic sampling w/ curve param ( #5551 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-03 13:22:21 -03:00
oobabooga
8c35fefb3b
Add custom sampler order support ( #5443 )
2024-02-06 11:20:10 -03:00
kalomaze
b6077b02e4
Quadratic sampling ( #5403 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
oobabooga
e055967974
Add prompt_lookup_num_tokens parameter ( #5296 )
2024-01-17 17:09:36 -03:00
Samuel Weinhardt
952a05a7c8
Correct field alias types for OpenAI extension ( #5257 )
2024-01-14 13:30:36 -03:00
oobabooga
4332e24740
API: Make user_name/bot_name the official and name1/name2 the alias
2024-01-09 19:06:11 -08:00
oobabooga
a4c51b5a05
API: add "user_name" and "bot_name" aliases for name1 and name2
2024-01-09 19:02:45 -08:00
oobabooga
29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters ( #5209 )
2024-01-08 23:28:35 -03:00
oobabooga
0d07b3a6a1
Add dynamic_temperature_low parameter ( #5198 )
2024-01-07 17:03:47 -03:00
kalomaze
48327cc5c4
Dynamic Temperature HF loader support ( #5174 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
Felipe Ferreira
11f082e417
[OpenAI Extension] Add more types to Embeddings Endpoint ( #4895 )
2023-12-15 00:26:16 -03:00
Kim Jaewon
e53f99faa0
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint ( #4916 )
2023-12-15 00:22:43 -03:00
oobabooga
39d2fe1ed9
Jinja templates for Instruct and Chat ( #4874 )
2023-12-12 17:23:14 -03:00
oobabooga
2c5a1e67f9
Parameters: change max_new_tokens & repetition_penalty_range defaults ( #4842 )
2023-12-07 20:04:52 -03:00
oobabooga
771e62e476
Add /v1/internal/lora endpoints ( #4652 )
2023-11-19 00:35:22 -03:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint ( #4650 )
2023-11-18 23:19:31 -03:00
oobabooga
a475aa7816
Improve API documentation
2023-11-15 18:39:08 -08:00