Commit graph

2163 commits

Author SHA1 Message Date
oobabooga
b108c55353 Fix portable builds not starting due to missing ik element 2026-04-02 19:14:50 -07:00
oobabooga
7aab2fdf9a API: Improve cache clearing in logprobs 2026-04-02 17:50:42 -07:00
oobabooga
091037ec20 Fix top_logprobs_ids missing for llama.cpp loader 2026-04-02 16:13:45 -03:00
oobabooga
ea1f8c71f2 API: Optimize prompt logprobs and refactor ExLlamav3 forward pass 2026-04-02 14:31:11 -03:00
oobabooga
c10c6e87ae API: Add token ids to logprobs output 2026-04-02 07:17:27 -07:00
oobabooga
a32ce254f2 Don't pass torch_dtype to transformers, autodetect from model config 2026-04-02 00:44:14 -03:00
oobabooga
4073164be0 Fix ExLlamav3 OOM on prompt logprobs and qwen3_5_moe HF compat 2026-04-01 19:44:55 -07:00
oobabooga
71c1a52afe API: Implement echo + logprobs for /v1/completions endpoint 2026-03-31 07:43:11 -07:00
oobabooga
6382fbef83 Several small code simplifications 2026-03-30 19:36:03 -07:00
oobabooga
0466b6e271 ik_llama.cpp: Auto-enable Hadamard KV cache rotation with quantized cache 2026-03-29 15:52:36 -07:00
oobabooga
4979e87e48 Add ik_llama.cpp support via ik_llama_cpp_binaries package 2026-03-28 12:09:00 -03:00
oobabooga
9dd04b86ce Suppress EOS token at logit level for ExLlamav3 when ban_eos_token is set 2026-03-28 06:17:57 -07:00
oobabooga
bda95172bd Fix stopping string detection for chromadb/context-1 2026-03-28 06:09:53 -07:00
oobabooga
4cbea02ed4 Add ik_llama.cpp support via --ik flag 2026-03-26 06:54:47 -07:00
oobabooga
e154140021 Rename "truncation length" to "context length" in logs 2026-03-25 07:21:02 -07:00
oobabooga
368f37335f Fix --idle-timeout issues with encode/decode and parallel generation 2026-03-25 06:37:45 -07:00
oobabooga
d6f1485dd1 UI: Update the enable_thinking info message 2026-03-24 21:45:11 -07:00
oobabooga
807be11832 Remove obsolete models/config.yaml and related code 2026-03-24 18:48:50 -07:00
oobabooga
750502695c Fix GPT-OSS tool-calling after 9ec20d97 2026-03-24 11:39:24 -07:00
oobabooga
a7ef430b38 Revert "llama.cpp: Don't suppress llama-server logs"
This reverts commit 9488df3e48.
2026-03-23 20:22:51 -07:00
oobabooga
286bbb685d Revert "Follow-up to previous commit"
This reverts commit 1dda5e4711.
2026-03-23 20:22:46 -07:00
oobabooga
02f18a1d65 API: Add thinking block signature field, fix error codes, clean up logging 2026-03-23 07:06:38 -07:00
oobabooga
307d0c92be UI polish 2026-03-23 06:35:14 -07:00
oobabooga
9ec20d9730 Strip thinking blocks before tool-call parsing 2026-03-22 19:19:14 -07:00
Phrosty1
bde496ea5d Fix prompt corruption when continuing with context truncation (#7439) 2026-03-22 21:48:56 -03:00
oobabooga
1dda5e4711 Follow-up to previous commit 2026-03-21 20:58:45 -07:00
oobabooga
9488df3e48 llama.cpp: Don't suppress llama-server logs 2026-03-21 20:47:26 -07:00
oobabooga
2c4f364339 Update API docs to mention Anthropic support 2026-03-21 18:38:11 -07:00
oobabooga
f2c909725e API: Use top_p=0.95 by default 2026-03-21 11:11:09 -07:00
oobabooga
0216893475 API: Add Anthropic-compatible /v1/messages endpoint 2026-03-20 20:38:55 -07:00
oobabooga
f0e3997f37 Add missing __init__.py to modules/grammar 2026-03-20 16:04:57 -03:00
oobabooga
7c79143a14 API: Fix _start_cloudflared raising after first attempt instead of exhausting retries 2026-03-20 15:03:49 -03:00
oobabooga
1a910574c3 API: Fix debug_msg truthy check for OPENEDAI_DEBUG=0 2026-03-20 14:57:01 -03:00
oobabooga
bf6fbc019d API: Move OpenAI-compatible API from extensions/openai to modules/api 2026-03-20 14:46:00 -03:00
oobabooga
2e4232e02b Minor cleanup 2026-03-20 07:20:26 -07:00
oobabooga
e0e20ab9e7 Minor cleanup across multiple modules 2026-03-19 08:02:23 -07:00
oobabooga
dde1764763 Cleanup modules/chat.py 2026-03-18 21:12:14 -07:00
oobabooga
779e7611ff Use logger.exception() instead of traceback.print_exc() for error messages 2026-03-18 20:42:20 -07:00
oobabooga
ca36bd6eb6 API: Remove leading spaces from post-reasoning content 2026-03-18 07:36:11 -07:00
oobabooga
fef2bd8630 UI: Fix the instruction template delete dialog not appearing 2026-03-17 22:52:32 -07:00
oobabooga
c8bb2129ba Security: server-side file save roots, image URL SSRF protection, extension allowlist 2026-03-17 22:29:35 -07:00
oobabooga
08ff3f0f90 Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2026-03-17 19:52:24 -07:00
oobabooga
7e54e7b7ae llama.cpp: Support literal flags in --extra-flags (e.g. --rpc, --jinja)
The old format is still accepted for backwards compatibility.
2026-03-17 19:47:55 -07:00
oobabooga
2a6b1fdcba Fix --extra-flags breaking short long-form-only flags like --rpc
Closes #7357
2026-03-17 18:29:15 -07:00
Alvin Tang
73a094a657 Fix file handle leaks and redundant re-read in get_model_metadata (#7422) 2026-03-17 22:06:05 -03:00
RoomWithOutRoof
f0014ab01c fix: mutable default argument in LogitsBiasProcessor (#7426) 2026-03-17 22:03:48 -03:00
oobabooga
27a6cdeec1 Fix multi-turn thinking block corruption for Kimi models 2026-03-17 11:31:55 -07:00
oobabooga
2d141b54c5 Fix several typos 2026-03-17 11:11:12 -07:00
oobabooga
249861b65d web search: Update the user agents 2026-03-17 05:41:05 -07:00
oobabooga
dff8903b03 UI: Modernize the Gradio theme 2026-03-16 19:33:54 -07:00