text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2026-03-08 06:33:51 +01:00

Author	SHA1	Message	Date
oobabooga	66fb79fe15	llama.cpp: Add --fit-target param	2026-03-06 01:55:48 -03:00
oobabooga	e81a47f708	Improve the API generation defaults --help message	2026-03-05 20:41:45 -08:00
oobabooga	27bcc45c18	API: Add command-line flags to override default generation parameters	2026-03-06 01:36:45 -03:00
oobabooga	8a9afcbec6	Allow extensions to skip output post-processing	2026-03-06 01:19:46 -03:00
oobabooga	2e7e966ef2	Docs: Better Tool/Function calling examples	2026-03-05 20:06:34 -08:00
oobabooga	ddcad3cc51	Follow-up to `e2548f69`: add missing paths module, fix gallery extension	2026-03-06 00:58:03 -03:00
oobabooga	8d43123f73	API: Fix function calling for Qwen, Mistral, GPT-OSS, and other models. The tool call response parser only handled JSON-based formats, causing tool_calls to always be empty for models that use non-JSON formats. Add parsers for three additional tool call formats: - Qwen3.5: <tool_call><function=name><parameter=key>value</parameter> - Mistral/Devstral: functionName{"arg": "value"} - GPT-OSS: <|channel|>commentary to=functions.name<|message|>{...} Also fix multi-turn tool conversations crashing with Jinja2 UndefinedError on tool_call_id by preserving tool_calls and tool_call_id metadata through the chat history conversion.	2026-03-06 00:55:33 -03:00
oobabooga	e2548f69a9	Make user_data configurable: add --user-data-dir flag, auto-detect ../user_data. If --user-data-dir is not set, auto-detect: use ../user_data when ./user_data doesn't exist, making it easy to share user data across portable builds by placing it one folder up.	2026-03-05 19:31:10 -08:00
oobabooga	4c406e024f	API: Speed up chat completions by ~85ms per request	2026-03-05 18:36:07 -08:00
oobabooga	249bd6eea2	UI: Update the parallel info message	2026-03-05 18:11:55 -08:00
oobabooga	f52d9336e5	TensorRT-LLM: Migrate from ModelRunner to LLM API, add concurrent API request support	2026-03-05 18:09:45 -08:00
oobabooga	9824c82cb6	API: Add parallel request support for llama.cpp and ExLlamaV3	2026-03-05 16:49:58 -08:00
oobabooga	2f08dce7b0	Remove ExLlamaV2 backend - archived upstream: `7dc12af3a8` - replaced by ExLlamaV3, which has much better quantization accuracy	2026-03-05 14:02:13 -08:00
oobabooga	134ac8fc29	Update README	2026-03-05 12:30:28 -08:00
oobabooga	409db3df1e	Training: Docs improvements	2026-03-05 11:30:57 -08:00
oobabooga	86d8291e58	Training: UI cleanup and better defaults	2026-03-05 11:20:55 -08:00
oobabooga	33ff3773a0	Clean up LoRA loading parameter handling	2026-03-05 16:00:13 -03:00
oobabooga	7a1fa8c9ea	Training: fix checkpoint resume and surface training errors to UI	2026-03-05 15:50:39 -03:00
oobabooga	275810c843	Training: wire up HF Trainer checkpoint resumption for full state recovery	2026-03-05 15:32:49 -03:00
oobabooga	438e59498e	Update ExLlamaV3 to v0.0.23	2026-03-05 10:24:31 -08:00
oobabooga	63f28cb4a2	Training: align defaults with peft/axolotl (rank 8, alpha 16, dropout 0, cutoff 512, eos on)	2026-03-05 15:12:32 -03:00
oobabooga	33a38d7ece	Training: drop conversations exceeding cutoff length instead of truncating	2026-03-05 14:56:27 -03:00
oobabooga	c2e494963f	Training: fix silent error on model reload failure, minor cleanups	2026-03-05 14:41:44 -03:00
oobabooga	5b18be8582	Training: unify instruction training through apply_chat_template(). Instead of two separate paths (format files vs Chat Template), all instruction training now uses apply_chat_template() with assistant-only label masking. Users pick a Jinja2 template from the dropdown or use the model's built-in chat template; both work identically.	2026-03-05 14:39:37 -03:00
oobabooga	d337ba0390	Training: fix apply_chat_template returning BatchEncoding instead of list	2026-03-05 13:45:28 -03:00
oobabooga	5be68cc073	Remove Training_PRO extension. The built-in training tab now covers its essential functionality with a more modern and correct implementation (apply_chat_template, dynamic padding, JSONL datasets, stride overlap).	2026-03-05 12:55:07 -03:00
oobabooga	1ffe540c97	Full documentation update to match current codebase	2026-03-05 12:46:54 -03:00
oobabooga	1c2548fd89	Training: use dynamic padding (pad to batch max instead of cutoff_len) - Remove pre-padding from tokenize() and tokenize_conversation() - Collate function now right-pads each batch to the longest sequence - Set tokenizer padding_side to "right" (standard for training) - Remove dead natural_keys import - Reduces wasted compute on batches with short sequences - Aligns with axolotl/unsloth approach	2026-03-05 12:45:32 -03:00
oobabooga	da2d4f1a6a	Training: replace raw text file with JSONL text dataset, re-add stride overlap - Replace "Raw text file" tab with "Text Dataset" tab using JSONL format with "text" key per row - Re-add stride overlap for chunking (configurable Stride Length slider, 0-2048 tokens) - Pad remainder chunks instead of dropping them - Remove hard_cut_string, min_chars, raw_text_file parameters - Remove .txt file and directory loading support	2026-03-05 12:33:12 -03:00
oobabooga	d278bb46a2	Add apply_chat_template() support for LoRA training - Support multi-turn conversations (OpenAI messages + ShareGPT formats) - Automatic assistant-only label masking via incremental tokenization - Use tokenizer.apply_chat_template() for proper special token handling - Add "Chat Template" option to the Data Format dropdown - Also accept instruction/output datasets (auto-converted to messages) - Validate chat template availability and dataset format upfront - Fix after_tokens[-1] IndexError when train_only_after is at end of prompt - Update docs	2026-03-05 11:47:25 -03:00
oobabooga	b16a1a874a	Update TensorRT-LLM Dockerfile for v1.1.0	2026-03-05 06:23:56 -08:00
oobabooga	45188eccef	Overhaul LoRA training tab - Use peft's "all-linear" for target modules instead of the old model_to_lora_modules mapping (only knew ~39 model types) - Add "Target all linear layers" checkbox, on by default - Fix labels in tokenize() — were [1]s instead of actual token IDs - Replace DataCollatorForLanguageModeling with custom collate_fn - Raw text: concatenate-and-split instead of overlapping chunks - Adapter backup/loading: check safetensors before bin - Fix report_to=None crash on transformers 5.x - Fix no_cuda deprecation for transformers 5.x (use use_cpu) - Move torch.compile before Trainer init - Add remove_unused_columns=False (torch.compile breaks column detection) - Guard against no target modules selected - Set tracked.did_save so we don't always save twice - pad_token_id: fall back to eos_token_id instead of hardcoding 0 - Drop MODEL_CLASSES, split_chunks, cut_chunk_for_newline - Update docs	2026-03-05 10:52:59 -03:00
oobabooga	268cc3f100	Update TensorRT-LLM to v1.1.0	2026-03-05 09:32:28 -03:00
oobabooga	69fa4dd0b1	llama.cpp: allow ctx_size=0 for auto context via --fit	2026-03-04 19:33:20 -08:00
oobabooga	fbfcd59fe0	llama.cpp: Use -1 instead of 0 for auto gpu_layers	2026-03-04 19:21:45 -08:00
oobabooga	d45aa6606a	Fix blank prompt dropdown in Notebook/Default tabs on first startup	2026-03-04 19:07:55 -08:00
oobabooga	0804296f4d	Revert "UI: Remove unnecessary server round-trips from button click chains" This reverts commit `ff48956cb0`.	2026-03-04 18:41:30 -08:00
oobabooga	6a08e79fa5	Update the custom gradio wheels	2026-03-04 18:22:50 -08:00
oobabooga	ff48956cb0	UI: Remove unnecessary server round-trips from button click chains	2026-03-04 18:19:56 -08:00
oobabooga	5a22970ba8	Docker: fix and clean up configs, update docs	2026-03-04 23:13:47 -03:00
oobabooga	387cf9d8df	Remove obsolete DeepSpeed inference code (2023 relic)	2026-03-04 17:20:34 -08:00
oobabooga	942ff8fcb4	Remove obsolete stuff after custom gradio updates	2026-03-04 16:43:32 -08:00
oobabooga	da3010c3ed	tiny improvements to llama_cpp_server.py	2026-03-04 15:54:37 -08:00
oobabooga	83cc207ef7	Update the custom gradio wheels	2026-03-04 14:31:18 -08:00
thecaptain789	2ac4eb33c8	fix: correct typo 'occured' to 'occurred' (#7389)	2026-03-04 18:09:28 -03:00
Sense_wang	7bf15ad933	fix: replace bare except clauses with except Exception (#7400)	2026-03-04 18:06:17 -03:00
mamei16	1d1f4dfc88	Disable uncommonly used indented codeblocks (#7401)	2026-03-04 17:51:00 -03:00
mamei16	abb7cc02e9	Re-introduce inline LaTeX rendering with more robust exception handling (#7402)	2026-03-04 17:44:19 -03:00
mamei16	68109bc5da	Improve `process_markdown_content` (#7403)	2026-03-04 17:26:13 -03:00
weiguang li	952e2c404a	Bump sentence-transformers from 2.2.2 to 3.3.1 in superbooga (#7406)	2026-03-04 17:08:08 -03:00
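
Commit `e2548f69a9` above describes a fallback rule for locating the user data folder. The sketch below shows one way that rule could look. It is not the repository's code: the function name `resolve_user_data_dir` and its parameters are hypothetical, and the check that `../user_data` actually exists is an added assumption that the commit message does not state.

```python
from pathlib import Path
from typing import Optional


def resolve_user_data_dir(cli_value: Optional[str], base: Path) -> Path:
    """Pick a user data folder (hypothetical helper, not the project's API).

    cli_value: value of --user-data-dir, or None when the flag is not set.
    base: the folder the application runs from.
    """
    # An explicit flag value always wins.
    if cli_value:
        return Path(cli_value)

    local = base / "user_data"
    shared = base.parent / "user_data"

    # Fall back to ../user_data only when ./user_data is missing.
    # Requiring the shared folder to exist is an assumption of this sketch.
    if not local.exists() and shared.exists():
        return shared
    return local
```

With this rule, several portable builds unpacked side by side would all resolve to the one `user_data` folder placed next to them, as long as none of them has its own.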

5248 commits
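
Commit `1c2548fd89` in the log above mentions a collate function that right-pads each batch to its longest sequence instead of to `cutoff_len`. The following is a minimal, framework-free sketch of that idea, not the project's implementation. The name `collate_right_pad`, the dictionary keys, and the use of `-100` as the padding label (the value PyTorch's cross-entropy loss ignores by default) are assumptions made for illustration.

```python
from typing import Dict, List

# Assumed padding label: ignored by PyTorch's cross-entropy loss by default.
IGNORE_INDEX = -100


def collate_right_pad(
    batch: List[Dict[str, List[int]]], pad_token_id: int
) -> Dict[str, List[List[int]]]:
    """Right-pad every example to the longest sequence in this batch."""
    max_len = max(len(example["input_ids"]) for example in batch)
    out: Dict[str, List[List[int]]] = {
        "input_ids": [],
        "attention_mask": [],
        "labels": [],
    }
    for example in batch:
        length = len(example["input_ids"])
        n_pad = max_len - length
        # Padding goes on the right, after the real tokens.
        out["input_ids"].append(example["input_ids"] + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding.
        out["attention_mask"].append([1] * length + [0] * n_pad)
        # Padded positions get the ignored label so they add no loss.
        out["labels"].append(example["labels"] + [IGNORE_INDEX] * n_pad)
    return out
```

A batch holding sequences of length 3 and 1 is padded to length 3 here, whatever the configured cutoff, which is where the compute saving described in the commit message would come from.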