text-generation-webui/extensions/openai
oobabooga fb1b3b6ddf API: Rewrite logprobs for OpenAI spec compliance across all backends
- Rewrite logprobs output format to match the OpenAI specification for
  both chat completions and completions endpoints
- Fix the top_logprobs count being ignored for the llama.cpp and ExLlamav3
  backends in chat completions (always returned 1 alternative instead of the requested N)
- Fix non-streaming responses only returning logprobs for the last token
  instead of all generated tokens (affects all HF-based loaders)
- Fix logprobs returning null for non-streaming chat requests on HF loaders
- Fix an off-by-one error that returned one extra top alternative on HF loaders
2026-03-12 14:17:32 -03:00
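The target shape the commit describes follows the public OpenAI chat-completions spec: each generated token carries its own `token`, `logprob`, and `bytes` fields plus exactly the requested number of `top_logprobs` alternatives. A minimal sketch of that shape (function and variable names here are illustrative, not taken from the repository's code):

```python
# Hypothetical helper illustrating the OpenAI chat-completions logprobs
# shape: one entry per generated token, each with exactly the requested
# number of top alternatives.

def build_chat_logprobs(token_scores, top_logprobs_n):
    """token_scores: list of (token_str, logprob, alternatives) tuples,
    where alternatives is a list of (token_str, logprob) sorted by
    descending logprob."""
    content = []
    for token, logprob, alts in token_scores:
        content.append({
            "token": token,
            "logprob": logprob,
            "bytes": list(token.encode("utf-8")),
            # Truncate to the requested N alternatives — avoiding both
            # the "always returned 1" bug and the off-by-one extra
            # alternative described in the commit message.
            "top_logprobs": [
                {"token": t, "logprob": lp, "bytes": list(t.encode("utf-8"))}
                for t, lp in alts[:top_logprobs_n]
            ],
        })
    return {"content": content}
```

Per the spec, non-streaming responses include one such entry for every generated token (not just the last), which is the behavior the HF-loader fixes above restore.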
cache_embedding_model.py Make /v1/embeddings functional, add request/response types 2023-11-10 07:34:27 -08:00
completions.py API: Rewrite logprobs for OpenAI spec compliance across all backends 2026-03-12 14:17:32 -03:00
embeddings.py Openai embedding fix to support jina-embeddings-v2 (#4642) 2023-11-18 20:24:29 -03:00
errors.py extensions/openai: Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. (#3122) 2023-07-24 11:28:12 -03:00
images.py Image: Several fixes 2025-12-05 05:58:57 -08:00
logits.py New llama.cpp loader (#6846) 2025-04-18 09:59:37 -03:00
models.py Refactor to not import gradio in --nowebui mode 2026-03-09 19:29:24 -07:00
moderations.py Lint 2023-11-16 18:03:06 -08:00
script.py API: Several OpenAI spec compliance fixes 2026-03-12 13:30:38 -03:00
tokens.py Add types to the encode/decode/token-count endpoints 2023-11-07 19:32:14 -08:00
typing.py API: Several OpenAI spec compliance fixes 2026-03-12 13:30:38 -03:00
utils.py API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models 2026-03-06 15:06:56 -03:00