| Name | Last commit message | Last commit date |
| --- | --- | --- |
| grammar | Let grammar escape backslashes (#5865) | 2024-05-19 20:26:09 -03:00 |
| block_requests.py | Fix the Google Colab notebook | 2025-01-16 05:21:18 -08:00 |
| callbacks.py | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00 |
| chat.py | UI: Add a collapsible thinking block to messages with `<think>` steps (#6902) | 2025-04-25 18:02:02 -03:00 |
| deepspeed_parameters.py | Fix typo in deepspeed_parameters.py (#3222) | 2023-07-24 11:17:28 -03:00 |
| evaluate.py | Fix an import | 2025-04-20 17:51:28 -07:00 |
| exllamav2.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| exllamav2_hf.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| exllamav3_hf.py | ExLlamaV3: Add kv cache quantization (#6903) | 2025-04-25 21:32:00 -03:00 |
| extensions.py | Move update_wizard_windows.sh to update_wizard_windows.bat (oops) | 2024-03-04 19:26:24 -08:00 |
| github.py | Fix several typos in the codebase (#6151) | 2024-06-22 21:40:25 -03:00 |
| gradio_hijack.py | Bump gradio to 4.23 (#5758) | 2024-03-26 16:32:20 -03:00 |
| html_generator.py | UI: Add a collapsible thinking block to messages with `<think>` steps (#6902) | 2025-04-25 18:02:02 -03:00 |
| llama_cpp_server.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| loaders.py | ExLlamaV3: Add kv cache quantization (#6903) | 2025-04-25 21:32:00 -03:00 |
| logging_colors.py | Lint | 2023-12-19 21:36:57 -08:00 |
| logits.py | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00 |
| LoRA.py | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00 |
| metadata_gguf.py | llama.cpp: read instruction template from GGUF metadata (#4975) | 2023-12-18 01:51:58 -03:00 |
| models.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| models_settings.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| one_click_installer_check.py | Lint | 2023-11-16 18:03:06 -08:00 |
| presets.py | Add the top N-sigma sampler (#6796) | 2025-03-14 16:45:11 -03:00 |
| prompts.py | Fix "send instruction template to..." buttons (closes #4625) | 2023-11-16 18:16:42 -08:00 |
| relative_imports.py | Add ExLlama+LoRA support (#2756) | 2023-06-19 12:31:24 -03:00 |
| sampler_hijack.py | Fix the exllamav2_HF and exllamav3_HF loaders | 2025-04-21 18:32:23 -07:00 |
| sane_markdown_lists.py | Sane handling of markdown lists (#6626) | 2025-01-04 15:41:31 -03:00 |
| shared.py | ExLlamaV3: Add kv cache quantization (#6903) | 2025-04-25 21:32:00 -03:00 |
| tensorrt_llm.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| text_generation.py | EXL2: add another torch.cuda.synchronize() call to prevent errors | 2025-04-24 09:03:49 -07:00 |
| torch_utils.py | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00 |
| training.py | llama.cpp: Add speculative decoding (#6891) | 2025-04-23 20:10:16 -03:00 |
| transformers_loader.py | Fix ExLlamaV2_HF and ExLlamaV3_HF after ae02ffc605 | 2025-04-20 11:32:48 -07:00 |
| ui.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| ui_chat.py | Make 'instruct' the default chat mode | 2025-04-24 07:08:49 -07:00 |
| ui_default.py | Lint | 2024-12-17 20:13:32 -08:00 |
| ui_file_saving.py | Fix the "save preset" event | 2024-10-01 11:20:48 -07:00 |
| ui_model_menu.py | ExLlamaV3: Add kv cache quantization (#6903) | 2025-04-25 21:32:00 -03:00 |
| ui_notebook.py | Lint | 2024-12-17 20:13:32 -08:00 |
| ui_parameters.py | Use --ctx-size to specify the context size for all loaders | 2025-04-25 16:59:03 -07:00 |
| ui_session.py | Fix a bug after c6901aba9f | 2025-04-18 06:51:28 -07:00 |
| utils.py | UI: show only part 00001 of multipart GGUF models in the model menu | 2025-04-22 19:56:42 -07:00 |