oobabooga | 8a3d866401 | Fix temperature_last having no effect in llama.cpp server sampler order | 2026-03-04 06:10:51 -08:00
oobabooga | b3fd0d16e0 | Use a new gr.Headless component for efficient chat streaming | 2026-03-03 18:12:03 -08:00
oobabooga | 2260e530c9 | Remove gradio monkey-patches (moved to gradio fork) | 2026-03-03 17:17:36 -08:00
oobabooga | c54e8a2b3d | Try to spawn llama.cpp on port 5001 instead of random port | 2026-01-28 08:23:55 -08:00
oobabooga | dc2bbf1861 | Refactor thinking block detection and add Solar Open support | 2026-01-28 08:21:34 -08:00
q5sys (JT) | 7493fe7841 | feat: Add a dropdown to save/load user personas (#7367) | 2026-01-14 20:35:08 -03:00
Sergey 'Jin' Bostandzhyan | 6e2c4e9c23 | Fix loading models which have their eos token disabled (#7363) | 2026-01-06 11:31:10 -03:00
oobabooga | e7c8b51fec | Revert "Use flash_attention_2 by default for Transformers models" | 2025-12-07 18:48:41 -08:00
    This reverts commit 85f2df92e9.
oobabooga | b758059e95 | Revert "Clear the torch cache between sequential image generations" | 2025-12-07 12:23:19 -08:00
    This reverts commit 1ec9f708e5.
oobabooga | 1ec9f708e5 | Clear the torch cache between sequential image generations | 2025-12-07 11:49:22 -08:00
oobabooga | 85f2df92e9 | Use flash_attention_2 by default for Transformers models | 2025-12-07 06:56:58 -08:00
oobabooga | 1762312fb4 | Use random instead of np.random for image seeds (makes it work on Windows) | 2025-12-06 20:10:32 -08:00
oobabooga | 02518a96a9 | Lint | 2025-12-06 06:55:06 -08:00
oobabooga | 455dc06db0 | Serve the original PNG images in the UI instead of webp | 2025-12-06 05:43:00 -08:00
oobabooga | 6ca99910ba | Image: Quantize the text encoder for lower VRAM | 2025-12-05 13:08:46 -08:00
oobabooga | 11937de517 | Use flash attention for image generation by default | 2025-12-05 12:13:24 -08:00
oobabooga | c11c14590a | Image: Better LLM variation default prompt | 2025-12-05 08:08:11 -08:00
oobabooga | 0dd468245c | Image: Add back the gallery cache (for performance) | 2025-12-05 07:11:38 -08:00
oobabooga | b63d57158d | Image: Add TGW as a prefix to output images | 2025-12-05 05:59:54 -08:00
oobabooga | afa29b9554 | Image: Several fixes | 2025-12-05 05:58:57 -08:00
oobabooga | 8eac99599a | Image: Better LLM variation default prompt | 2025-12-04 19:58:06 -08:00
oobabooga | b4f06a50b0 | fix: Pass bos_token and eos_token from metadata to jinja2 | 2025-12-04 19:11:31 -08:00
    Fixes loading Seed-Instruct-36B
oobabooga | 56f2a9512f | Revert "Image: Add the LLM-generated prompt to the API result" | 2025-12-04 17:34:27 -08:00
    This reverts commit c7ad28a4cd.
oobabooga | c7ad28a4cd | Image: Add the LLM-generated prompt to the API result | 2025-12-04 17:22:08 -08:00
oobabooga | b451bac082 | Image: Improve a log message | 2025-12-04 16:33:46 -08:00
oobabooga | 47a0fcd614 | Image: PNG metadata improvements | 2025-12-04 16:25:48 -08:00
oobabooga | ac31a7c008 | Image: Organize the UI | 2025-12-04 15:45:04 -08:00
oobabooga | a90739f498 | Image: Better LLM variation default prompt | 2025-12-04 10:50:40 -08:00
oobabooga | ffef3c7b1d | Image: Make the LLM Variations prompt configurable | 2025-12-04 10:44:35 -08:00
oobabooga | 5763947c37 | Image: Simplify the API code, add the llm_variations option | 2025-12-04 10:23:00 -08:00
oobabooga | 2793153717 | Image: Add LLM-generated prompt variations | 2025-12-04 08:10:24 -08:00
oobabooga | 7fb9f19bd8 | Progress bar style improvements | 2025-12-04 06:20:45 -08:00
oobabooga | a838223d18 | Image: Add a progress bar during generation | 2025-12-04 05:49:57 -08:00
oobabooga | 14dbc3488e | Image: Clear the torch cache after generation, not before | 2025-12-04 05:32:58 -08:00
oobabooga | c357eed4c7 | Image: Remove the flash_attention_3 option (no idea how to get it working) | 2025-12-03 18:40:34 -08:00
oobabooga | fbca54957e | Image generation: Yield partial results for batch count > 1 | 2025-12-03 16:13:07 -08:00
oobabooga | 49c60882bf | Image generation: Safer image uploading | 2025-12-03 16:07:51 -08:00
oobabooga | 59285d501d | Image generation: Small UI improvements | 2025-12-03 16:03:31 -08:00
oobabooga | 373baa5c9c | UI: Minor image gallery improvements | 2025-12-03 14:45:02 -08:00
oobabooga | 9448bf1caa | Image generation: add torchao quantization (supports torch.compile) | 2025-12-02 14:22:51 -08:00
oobabooga | 97281ff831 | UI: Fix an index error in the new image gallery | 2025-12-02 11:20:52 -08:00
oobabooga | 9d07d3a229 | Make portable builds functional again after b3666e140d | 2025-12-02 10:06:57 -08:00
oobabooga | 6291e72129 | Remove quanto for now (requires messy compilation) | 2025-12-02 09:57:18 -08:00
oobabooga | b3666e140d | Add image generation support (#7328) | 2025-12-02 14:55:38 -03:00
oobabooga | 5327bc9397 | Update modules/shared.py | 2025-11-28 22:48:05 -03:00
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
GodEmperor785 | 400bb0694b | Add slider for --ubatch-size for llama.cpp loader, change defaults for better MoE performance (#7316) | 2025-11-21 16:56:02 -03:00
oobabooga | 8f0048663d | More modular HTML generator | 2025-11-21 07:09:16 -08:00
oobabooga | 0d4eff284c | Add a --cpu-moe model for llama.cpp | 2025-11-19 05:23:43 -08:00
Trenten Miller | 6871484398 | fix: Rename 'evaluation_strategy' to 'eval_strategy' in training | 2025-10-28 16:48:04 -03:00
oobabooga | a156ebbf76 | Lint | 2025-10-15 13:15:01 -07:00
oobabooga | c871d9cdbd | Revert "Same as 7f06aec3a1 but for exllamav3_hf" | 2025-10-15 13:05:41 -07:00
    This reverts commit deb37b821b.
oobabooga | b5a6904c4a | Make --trust-remote-code immutable from the UI/API | 2025-10-14 20:47:01 -07:00
mamei16 | 308e726e11 | log error when llama-server request exceeds context size (#7263) | 2025-10-12 23:00:11 -03:00
oobabooga | 655c3e86e3 | Fix "continue" missing an initial space in chat-instruct/chat modes | 2025-10-11 17:00:25 -07:00
oobabooga | c7dd920dc8 | Fix metadata leaking into branched chats | 2025-10-11 14:12:05 -07:00
oobabooga | 78ff21d512 | Organize the --help message | 2025-10-10 15:21:08 -07:00
oobabooga | 0d03813e98 | Update modules/chat.py | 2025-10-09 21:01:13 -03:00
    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
oobabooga | deb37b821b | Same as 7f06aec3a1 but for exllamav3_hf | 2025-10-09 13:02:38 -07:00
oobabooga | 7f06aec3a1 | exllamav3: Implement the logits function for /v1/internal/logits | 2025-10-09 11:24:25 -07:00
oobabooga | 218dc01b51 | Add fallbacks after 93aa7b3ed3 | 2025-10-09 10:59:34 -07:00
oobabooga | 282aa19189 | Safer profile picture uploading | 2025-10-09 09:26:35 -07:00
oobabooga | 93aa7b3ed3 | Better handle multigpu setups with transformers + bitsandbytes | 2025-10-09 08:49:44 -07:00
Remowylliams | 38a7fd685d | chat.py fixes Instruct mode History | 2025-10-05 11:34:47 -03:00
oobabooga | 1e863a7113 | Fix exllamav3 ignoring the stop button | 2025-09-19 16:12:50 -07:00
stevenxdavis | dd6d2223a5 | Changing transformers_loader.py to Match User Expectations for --bf16 and Flash Attention 2 (#7217) | 2025-09-17 16:39:04 -03:00
oobabooga | 9e9ab39892 | Make exllamav3_hf and exllamav2_hf functional again | 2025-09-17 12:29:22 -07:00
oobabooga | f3829b268a | llama.cpp: Always pass --flash-attn on | 2025-09-02 12:12:17 -07:00
oobabooga | c6ea67bbdb | Lint | 2025-09-02 10:22:03 -07:00
oobabooga | 00ed878b05 | Slightly more robust model loading | 2025-09-02 10:16:26 -07:00
oobabooga | 387e249dec | Change an info message | 2025-08-31 16:27:10 -07:00
oobabooga | 8028d88541 | Lint | 2025-08-30 21:29:20 -07:00
oobabooga | 13876a1ee8 | llama.cpp: Remove the --flash-attn flag (it's always on now) | 2025-08-30 20:28:26 -07:00
oobabooga | 3a3e247f3c | Even better way to handle continue for thinking blocks | 2025-08-30 12:36:35 -07:00
oobabooga | cf1aad2a68 | Fix "continue" for Byte-OSS for partial thinking blocks | 2025-08-30 12:16:45 -07:00
oobabooga | 96136ea760 | Fix LaTeX rendering for equations with asterisks | 2025-08-30 10:13:32 -07:00
oobabooga | a3eb67e466 | Fix the UI failing to launch if the Notebook prompt is too long | 2025-08-30 08:42:26 -07:00
oobabooga | a2b37adb26 | UI: Preload the correct fonts for chat mode | 2025-08-29 09:25:44 -07:00
oobabooga | cb8780a4ce | Safer check for is_multimodal when loading models | 2025-08-28 11:13:19 -07:00
    Avoids unrelated multimodal error when a model fails to load due to lack of memory.
oobabooga | cfc83745ec | UI: Improve right sidebar borders in light mode | 2025-08-28 08:34:48 -07:00
oobabooga | ba6041251d | UI: Minor change | 2025-08-28 06:20:00 -07:00
oobabooga | a92758a144 | llama.cpp: Fix obtaining the maximum sequence length for GPT-OSS | 2025-08-27 16:15:40 -07:00
oobabooga | 030ba7bfeb | UI: Mention that Seed-OSS uses enable_thinking | 2025-08-27 07:44:35 -07:00
oobabooga | 0b4518e61c | "Text generation web UI" -> "Text Generation Web UI" | 2025-08-27 05:53:09 -07:00
oobabooga | 02ca96fa44 | Multiple fixes | 2025-08-25 22:17:22 -07:00
oobabooga | 6a7166fffa | Add support for the Seed-OSS template | 2025-08-25 19:46:48 -07:00
oobabooga | 8fcb4b3102 | Make bot_prefix extensions functional again | 2025-08-25 19:10:46 -07:00
oobabooga | 8f660aefe3 | Fix chat-instruct replies leaking the bot name sometimes | 2025-08-25 18:50:16 -07:00
oobabooga | a531328f7e | Fix the GPT-OSS stopping string | 2025-08-25 18:41:58 -07:00
oobabooga | 6c165d2e55 | Fix the chat template | 2025-08-25 18:28:43 -07:00
oobabooga | b657be7381 | Obtain stopping strings in chat mode | 2025-08-25 18:22:08 -07:00
oobabooga | ded6c41cf8 | Fix impersonate for chat-instruct | 2025-08-25 18:16:17 -07:00
oobabooga | c1aa4590ea | Code simplifications, fix impersonate | 2025-08-25 18:05:40 -07:00
oobabooga | b330ec3517 | Simplifications | 2025-08-25 17:54:15 -07:00
oobabooga | 3ad5970374 | Make the llama.cpp --verbose output less verbose | 2025-08-25 17:43:21 -07:00
oobabooga | adeca8a658 | Remove changes to the jinja2 templates | 2025-08-25 17:36:01 -07:00
oobabooga | aad0104c1b | Remove a function | 2025-08-25 17:33:13 -07:00
oobabooga | f919cdf881 | chat.py code simplifications | 2025-08-25 17:20:51 -07:00
oobabooga | d08800c359 | chat.py improvements | 2025-08-25 17:03:37 -07:00
oobabooga | 3bc48014a5 | chat.py code simplifications | 2025-08-25 16:48:21 -07:00
oobabooga | 2478294c06 | UI: Preload the instruct and chat fonts | 2025-08-24 12:37:41 -07:00