Hermann Hans Klie
d77a42a776
Merge eb6c5a171e into bd9f2de73a
2025-11-30 14:07:46 +08:00
oobabooga
5327bc9397
Update modules/shared.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-28 22:48:05 -03:00
GodEmperor785
400bb0694b
Add slider for --ubatch-size for llama.cpp loader, change defaults for better MoE performance (#7316)
2025-11-21 16:56:02 -03:00
oobabooga
8f0048663d
More modular HTML generator
2025-11-21 07:09:16 -08:00
oobabooga
0d4eff284c
Add a --cpu-moe flag for llama.cpp
2025-11-19 05:23:43 -08:00
Trenten Miller
6871484398
fix: Rename 'evaluation_strategy' to 'eval_strategy' in training
2025-10-28 16:48:04 -03:00
Hermann Hans Klie
eb6c5a171e
Update loaders.py
2025-10-24 09:06:29 +03:00
Hermann Hans Klie
779795266f
Update models.py
In def load_model(model_name, loader=None), we add ktransformers as a loader option;
before def unload_model(keep_model_name=False), we define def ktransformers_loader.
2025-10-24 08:53:23 +03:00
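The commit above describes adding a ktransformers branch to load_model and a ktransformers_loader function in models.py. A minimal, hypothetical sketch of that dispatch pattern (the stub body and return value are illustrative assumptions, not the real implementation):

```python
# Hypothetical sketch of the change described in 779795266f:
# a ktransformers entry dispatched from load_model(), with the loader
# function defined just before unload_model().

def ktransformers_loader(model_name):
    # Stub: the real loader would import ktransformers and return
    # a (model, tokenizer) pair. Shown here as a placeholder string.
    return f"<ktransformers model: {model_name}>"

def load_model(model_name, loader=None):
    # Map loader names to loader functions; the commit adds the
    # "ktransformers" entry to this dispatch.
    loaders = {
        "ktransformers": ktransformers_loader,
    }
    if loader in loaders:
        return loaders[loader](model_name)
    raise ValueError(f"Unknown loader: {loader}")

def unload_model(keep_model_name=False):
    # Placeholder for the existing unload logic.
    pass

print(load_model("DeepSeek-V2", loader="ktransformers"))
```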
oobabooga
a156ebbf76
Lint
2025-10-15 13:15:01 -07:00
oobabooga
c871d9cdbd
Revert "Same as 7f06aec3a1 but for exllamav3_hf"
This reverts commit deb37b821b.
2025-10-15 13:05:41 -07:00
oobabooga
b5a6904c4a
Make --trust-remote-code immutable from the UI/API
2025-10-14 20:47:01 -07:00
mamei16
308e726e11
log error when llama-server request exceeds context size (#7263)
2025-10-12 23:00:11 -03:00
oobabooga
655c3e86e3
Fix "continue" missing an initial space in chat-instruct/chat modes
2025-10-11 17:00:25 -07:00
oobabooga
c7dd920dc8
Fix metadata leaking into branched chats
2025-10-11 14:12:05 -07:00
oobabooga
78ff21d512
Organize the --help message
2025-10-10 15:21:08 -07:00
oobabooga
0d03813e98
Update modules/chat.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-09 21:01:13 -03:00
oobabooga
deb37b821b
Same as 7f06aec3a1 but for exllamav3_hf
2025-10-09 13:02:38 -07:00
oobabooga
7f06aec3a1
exllamav3: Implement the logits function for /v1/internal/logits
2025-10-09 11:24:25 -07:00
oobabooga
218dc01b51
Add fallbacks after 93aa7b3ed3
2025-10-09 10:59:34 -07:00
oobabooga
282aa19189
Safer profile picture uploading
2025-10-09 09:26:35 -07:00
oobabooga
93aa7b3ed3
Better handle multigpu setups with transformers + bitsandbytes
2025-10-09 08:49:44 -07:00
Remowylliams
38a7fd685d
chat.py: Fix Instruct mode history
2025-10-05 11:34:47 -03:00
oobabooga
1e863a7113
Fix exllamav3 ignoring the stop button
2025-09-19 16:12:50 -07:00
stevenxdavis
dd6d2223a5
Changing transformers_loader.py to Match User Expectations for --bf16 and Flash Attention 2 (#7217)
2025-09-17 16:39:04 -03:00
oobabooga
9e9ab39892
Make exllamav3_hf and exllamav2_hf functional again
2025-09-17 12:29:22 -07:00
oobabooga
f3829b268a
llama.cpp: Always pass --flash-attn on
2025-09-02 12:12:17 -07:00
oobabooga
c6ea67bbdb
Lint
2025-09-02 10:22:03 -07:00
oobabooga
00ed878b05
Slightly more robust model loading
2025-09-02 10:16:26 -07:00
oobabooga
387e249dec
Change an info message
2025-08-31 16:27:10 -07:00
oobabooga
8028d88541
Lint
2025-08-30 21:29:20 -07:00
oobabooga
13876a1ee8
llama.cpp: Remove the --flash-attn flag (it's always on now)
2025-08-30 20:28:26 -07:00
oobabooga
3a3e247f3c
Even better way to handle continue for thinking blocks
2025-08-30 12:36:35 -07:00
oobabooga
cf1aad2a68
Fix "continue" for GPT-OSS for partial thinking blocks
2025-08-30 12:16:45 -07:00
oobabooga
96136ea760
Fix LaTeX rendering for equations with asterisks
2025-08-30 10:13:32 -07:00
oobabooga
a3eb67e466
Fix the UI failing to launch if the Notebook prompt is too long
2025-08-30 08:42:26 -07:00
oobabooga
a2b37adb26
UI: Preload the correct fonts for chat mode
2025-08-29 09:25:44 -07:00
oobabooga
cb8780a4ce
Safer check for is_multimodal when loading models
Avoids an unrelated multimodal error when a model fails to load due to lack of memory.
2025-08-28 11:13:19 -07:00
oobabooga
cfc83745ec
UI: Improve right sidebar borders in light mode
2025-08-28 08:34:48 -07:00
oobabooga
ba6041251d
UI: Minor change
2025-08-28 06:20:00 -07:00
oobabooga
a92758a144
llama.cpp: Fix obtaining the maximum sequence length for GPT-OSS
2025-08-27 16:15:40 -07:00
oobabooga
030ba7bfeb
UI: Mention that Seed-OSS uses enable_thinking
2025-08-27 07:44:35 -07:00
oobabooga
0b4518e61c
"Text generation web UI" -> "Text Generation Web UI"
2025-08-27 05:53:09 -07:00
oobabooga
02ca96fa44
Multiple fixes
2025-08-25 22:17:22 -07:00
oobabooga
6a7166fffa
Add support for the Seed-OSS template
2025-08-25 19:46:48 -07:00
oobabooga
8fcb4b3102
Make bot_prefix extensions functional again
2025-08-25 19:10:46 -07:00
oobabooga
8f660aefe3
Fix chat-instruct replies leaking the bot name sometimes
2025-08-25 18:50:16 -07:00
oobabooga
a531328f7e
Fix the GPT-OSS stopping string
2025-08-25 18:41:58 -07:00
oobabooga
6c165d2e55
Fix the chat template
2025-08-25 18:28:43 -07:00
oobabooga
b657be7381
Obtain stopping strings in chat mode
2025-08-25 18:22:08 -07:00
oobabooga
ded6c41cf8
Fix impersonate for chat-instruct
2025-08-25 18:16:17 -07:00