Hermann Hans Klie
eb6c5a171e
Update loaders.py
2025-10-24 09:06:29 +03:00
Hermann Hans Klie
779795266f
Update models.py
In def load_model(model_name, loader=None), we fill in the ktransformers entry.
Before def unload_model(keep_model_name=False), we add def ktransformers_loader.
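A minimal sketch of the wiring this commit describes, assuming load_model() dispatches through a name-to-function map as in modules/models.py upstream; the map contents, function bodies, and usage line below are placeholders, not the actual diff:

    def load_model(model_name, loader=None):
        # Dispatch by loader name; 'ktransformers' is the entry this
        # commit adds (hypothetical wiring, other entries elided).
        load_func_map = {
            'ktransformers': ktransformers_loader,
        }
        if loader not in load_func_map:
            raise ValueError(f"Unknown loader: {loader}")
        return load_func_map[loader](model_name)

    def ktransformers_loader(model_name):
        # Placeholder body: the real commit would initialize the
        # KTransformers backend here; its API is not shown in this log.
        print(f"[ktransformers] loading {model_name}")

    def unload_model(keep_model_name=False):
        # Placeholder for the existing unload logic.
        pass

    load_model('Qwen3-4B-Instruct', loader='ktransformers')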
2025-10-24 08:53:23 +03:00
Hermann Hans Klie
02c7049227
Merge pull request #1 from hermannklie/ktransformers_in_textgenwebui
Add KTransformers loader integration
2025-10-21 20:47:20 +03:00
Hermann Hans Klie
8fdb1b1e5f
Add KTransformers loader integration
This PR adds native support for the KTransformers backend as a selectable loader in Text-Generation-WebUI.
It provides a reproducible installation and integration process compatible with the one-click installer (Conda environment).
The integration is not limited to small models: it is meant to be used with Qwen3-Next-80B-A3B-Instruct-FP8 and other larger FP8 architectures, such as the DeepSeek FP8 models, together with FlashAttention-2.
Smaller models (e.g., Qwen3-4B-Instruct) now run efficiently, confirming broad coverage from laptop to workstation setups.
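As a rough sketch of what "selectable loader" means in practice: the WebUI builds its loader dropdown from a per-loader registry in modules/loaders.py, so the backend becomes selectable once it gets an entry there. The registry name loaders_and_params matches the upstream convention, but the parameter list shown is an assumption, not taken from the PR:

    # Hypothetical registry entry; the parameters a KTransformers
    # backend would expose in the UI are assumed.
    loaders_and_params = {
        'ktransformers': [
            'ctx_size',   # context length (assumed)
            'gpu_split',  # multi-GPU placement setting (assumed)
        ],
    }

    # The UI dropdown is then populated from the registry's keys.
    print(list(loaders_and_params.keys()))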
2025-10-21 20:44:25 +03:00
oobabooga
771130532c
Merge pull request #7267 from oobabooga/dev
Merge dev branch
2025-10-15 17:15:28 -03:00
oobabooga
a156ebbf76
Lint
2025-10-15 13:15:01 -07:00
oobabooga
c871d9cdbd
Revert "Same as 7f06aec3a1 but for exllamav3_hf"
This reverts commit deb37b821b.
2025-10-15 13:05:41 -07:00
oobabooga
163d863443
Update llama.cpp
2025-10-15 11:23:10 -07:00
oobabooga
c93d567f97
Update exllamav3 to 0.0.10
2025-10-15 06:41:09 -07:00
oobabooga
b5a6904c4a
Make --trust-remote-code immutable from the UI/API
2025-10-14 20:47:01 -07:00
oobabooga
efaf2aef3d
Update exllamav3 to 0.0.9
2025-10-13 15:32:25 -07:00
oobabooga
047855c591
Update llama.cpp
2025-10-13 15:32:03 -07:00
mamei16
308e726e11
Log error when llama-server request exceeds context size (#7263)
2025-10-12 23:00:11 -03:00
oobabooga
611399e089
Update README
2025-10-11 17:22:48 -07:00
oobabooga
968c79db06
Minor README fix (closes #7251)
2025-10-11 17:20:49 -07:00
oobabooga
655c3e86e3
Fix "continue" missing an initial space in chat-instruct/chat modes
2025-10-11 17:00:25 -07:00
oobabooga
c7dd920dc8
Fix metadata leaking into branched chats
2025-10-11 14:12:05 -07:00
oobabooga
1831b3fb51
Use my custom gradio_client build (small changes to work with pydantic 2.11)
2025-10-10 18:01:21 -07:00
oobabooga
dd0b003493
Bump pydantic to 2.11.0
2025-10-10 17:52:16 -07:00
oobabooga
a74596374d
Reapply "Update exllamav3 to 0.0.8"
This reverts commit 748007f6ee.
2025-10-10 17:51:31 -07:00
oobabooga
78ff21d512
Organize the --help message
2025-10-10 15:21:08 -07:00
oobabooga
5d734cc7ca
Remove unused CSS
2025-10-10 12:54:54 -07:00
oobabooga
25360387ec
Downloader: Fix resuming downloads after HF moved to Xet
2025-10-10 08:27:40 -07:00
oobabooga
7833650aa1
Merge pull request #7260 from oobabooga/dev
Merge dev branch
2025-10-10 10:46:34 -03:00
oobabooga
bf5d85c922
Revert "Downloader: Gracefully handle '416 Range Not Satisfiable' when continuing downloads"
This reverts commit 1aa2b924d2.
2025-10-09 17:22:41 -07:00
oobabooga
0d03813e98
Update modules/chat.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-09 21:01:13 -03:00
oobabooga
748007f6ee
Revert "Update exllamav3 to 0.0.8"
This reverts commit 977ffbaa04.
2025-10-09 16:50:00 -07:00
dependabot[bot]
af3c70651c
Update bitsandbytes requirement in /requirements/full (#7255)
2025-10-09 19:53:34 -03:00
oobabooga
977ffbaa04
Update exllamav3 to 0.0.8
2025-10-09 15:53:14 -07:00
oobabooga
e0f0fae59d
Exllamav3: Add fla to requirements for qwen3-next
2025-10-09 13:03:48 -07:00
oobabooga
deb37b821b
Same as 7f06aec3a1 but for exllamav3_hf
2025-10-09 13:02:38 -07:00
oobabooga
7f06aec3a1
exllamav3: Implement the logits function for /v1/internal/logits
2025-10-09 11:24:25 -07:00
oobabooga
218dc01b51
Add fallbacks after 93aa7b3ed3
2025-10-09 10:59:34 -07:00
oobabooga
1aa2b924d2
Downloader: Gracefully handle '416 Range Not Satisfiable' when continuing downloads
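For context on the behavior this title names (a general sketch, not the project's downloader, and note the change was later reverted in bf5d85c922): when resuming a download with an HTTP Range header, a server answers 416 if the requested range starts at or beyond the end of the file, which on resume usually means the local file is already complete. A minimal handling pattern with requests, where url and path are placeholders:

    import os
    import requests

    def resume_download(url, path):
        # Resume from the current file size via a Range request.
        start = os.path.getsize(path) if os.path.exists(path) else 0
        headers = {'Range': f'bytes={start}-'} if start else {}
        r = requests.get(url, headers=headers, stream=True, timeout=60)
        if r.status_code == 416:
            # 416 Range Not Satisfiable: nothing left to fetch; treat
            # the file as already complete instead of raising.
            return path
        r.raise_for_status()
        with open(path, 'ab') as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)
        return path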
2025-10-09 10:52:31 -07:00
oobabooga
0f3793d608
Update llama.cpp
2025-10-09 09:38:22 -07:00
oobabooga
282aa19189
Safer profile picture uploading
2025-10-09 09:26:35 -07:00
oobabooga
93aa7b3ed3
Better handle multigpu setups with transformers + bitsandbytes
2025-10-09 08:49:44 -07:00
Ionoclast Laboratories
d229dfe991
Fix portable Apple Intel requirement for llama binaries (issue #7238) (#7239)
2025-10-08 12:40:53 -03:00
oobabooga
292c91abbb
Update llama.cpp
2025-10-08 08:31:34 -07:00
oobabooga
f660e0836b
Merge branch 'main' into dev
2025-10-08 05:38:33 -07:00
oobabooga
898a3ed2fe
Add sponsor (Warp) to README <3
2025-10-07 18:33:28 -03:00
oobabooga
22997c134e
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2025-10-05 20:34:49 -07:00
Remowylliams
38a7fd685d
chat.py: Fix Instruct mode history
2025-10-05 11:34:47 -03:00
oobabooga
64829071e0
Update llama.cpp
2025-10-05 07:32:41 -07:00
oobabooga
0eb8543d74
Update transformers
2025-10-05 07:30:33 -07:00
oobabooga
b7effb22e0
Update exllamav3
2025-10-05 07:29:57 -07:00
oobabooga
042b828c73
Merge pull request #7231 from oobabooga/dev
Merge dev branch
2025-09-21 01:18:56 -03:00
oobabooga
8c9df34696
Update llama.cpp
2025-09-20 20:57:15 -07:00
oobabooga
1e863a7113
Fix exllamav3 ignoring the stop button
2025-09-19 16:12:50 -07:00
oobabooga
005fcf3f98
Formatting
2025-09-17 21:58:37 -07:00