Commit graph

110 commits

Author SHA1 Message Date
oobabooga 86d8291e58 Training: UI cleanup and better defaults 2026-03-05 11:20:55 -08:00
oobabooga 7a1fa8c9ea Training: fix checkpoint resume and surface training errors to UI 2026-03-05 15:50:39 -03:00
oobabooga 275810c843 Training: wire up HF Trainer checkpoint resumption for full state recovery 2026-03-05 15:32:49 -03:00
oobabooga 63f28cb4a2 Training: align defaults with peft/axolotl (rank 8, alpha 16, dropout 0, cutoff 512, eos on) 2026-03-05 15:12:32 -03:00
oobabooga 33a38d7ece Training: drop conversations exceeding cutoff length instead of truncating 2026-03-05 14:56:27 -03:00
oobabooga c2e494963f Training: fix silent error on model reload failure, minor cleanups 2026-03-05 14:41:44 -03:00
oobabooga 5b18be8582 Training: unify instruction training through apply_chat_template()
Instead of two separate paths (format files vs Chat Template), all
instruction training now uses apply_chat_template() with assistant-only
label masking. Users pick a Jinja2 template from the dropdown or use the
model's built-in chat template — both work identically.
2026-03-05 14:39:37 -03:00
oobabooga d337ba0390 Training: fix apply_chat_template returning BatchEncoding instead of list 2026-03-05 13:45:28 -03:00
oobabooga 1c2548fd89 Training: use dynamic padding (pad to batch max instead of cutoff_len)
- Remove pre-padding from tokenize() and tokenize_conversation()
- Collate function now right-pads each batch to the longest sequence
- Set tokenizer padding_side to "right" (standard for training)
- Remove dead natural_keys import
- Reduces wasted compute on batches with short sequences
- Aligns with axolotl/unsloth approach
2026-03-05 12:45:32 -03:00
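The dynamic-padding scheme described above can be sketched in plain Python (hypothetical function and field names; the project's actual collate function builds torch tensors):

```python
def collate_batch(batch, pad_token_id):
    """Right-pad every sequence in a batch to the batch's longest
    sequence, rather than to a fixed cutoff_len.

    batch: list of dicts with "input_ids" and "labels" token-id lists.
    Labels are padded with -100 so the loss ignores padding positions.
    """
    max_len = max(len(item["input_ids"]) for item in batch)
    input_ids, labels, attention_mask = [], [], []
    for item in batch:
        ids = item["input_ids"]
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * pad)          # right padding
        labels.append(item["labels"] + [-100] * pad)          # masked in loss
        attention_mask.append([1] * len(ids) + [0] * pad)
    return {
        "input_ids": input_ids,
        "labels": labels,
        "attention_mask": attention_mask,
    }
```

Because each batch only pays for its own longest sequence, batches of short examples waste far less compute than padding everything to the cutoff length.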
oobabooga da2d4f1a6a Training: replace raw text file with JSONL text dataset, re-add stride overlap
- Replace "Raw text file" tab with "Text Dataset" tab using JSONL format with "text" key per row
- Re-add stride overlap for chunking (configurable Stride Length slider, 0-2048 tokens)
- Pad remainder chunks instead of dropping them
- Remove hard_cut_string, min_chars, raw_text_file parameters
- Remove .txt file and directory loading support
2026-03-05 12:33:12 -03:00
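The stride-overlap chunking with padded remainders can be sketched as follows (a minimal illustration with hypothetical names, not the repository's implementation):

```python
def chunk_with_stride(tokens, chunk_len, stride, pad_token_id):
    """Split a flat token list into fixed-length training chunks.

    Consecutive chunks overlap by `stride` tokens (step = chunk_len - stride),
    and the final remainder chunk is right-padded instead of being dropped.
    Requires 0 <= stride < chunk_len.
    """
    step = chunk_len - stride
    chunks = []
    for start in range(0, len(tokens), step):
        chunk = tokens[start:start + chunk_len]
        if len(chunk) < chunk_len:
            # Pad the remainder rather than discarding those tokens.
            chunk = chunk + [pad_token_id] * (chunk_len - len(chunk))
        chunks.append(chunk)
        if start + chunk_len >= len(tokens):
            break  # the remaining tokens are already covered
    return chunks
```

With stride 0 this degenerates to plain concatenate-and-split; a positive stride lets each chunk see some trailing context from the previous one.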
oobabooga d278bb46a2 Add apply_chat_template() support for LoRA training
- Support multi-turn conversations (OpenAI messages + ShareGPT formats)
- Automatic assistant-only label masking via incremental tokenization
- Use tokenizer.apply_chat_template() for proper special token handling
- Add "Chat Template" option to the Data Format dropdown
- Also accept instruction/output datasets (auto-converted to messages)
- Validate chat template availability and dataset format upfront
- Fix after_tokens[-1] IndexError when train_only_after is at end of prompt
- Update docs
2026-03-05 11:47:25 -03:00
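Assistant-only label masking via incremental tokenization works by tokenizing each conversation prefix and unmasking only the token span that each assistant turn adds. A minimal sketch, with a pluggable `apply_template` callable standing in for `tokenizer.apply_chat_template` (all names hypothetical):

```python
def mask_non_assistant(apply_template, messages):
    """Assistant-only label masking via incremental tokenization.

    apply_template(msgs) must return the token-id list for the rendered
    conversation prefix. For each assistant turn, the tokens it adds on
    top of the previous prefix keep their ids as labels; every other
    position is masked with -100 so the loss ignores it.

    Assumes the chat template renders any prefix of `messages` as a
    prefix of the full rendering (true for typical templates).
    """
    input_ids = apply_template(messages)
    labels = [-100] * len(input_ids)
    for i, msg in enumerate(messages):
        if msg["role"] != "assistant":
            continue
        prev_len = len(apply_template(messages[:i]))
        cur_len = len(apply_template(messages[:i + 1]))
        labels[prev_len:cur_len] = input_ids[prev_len:cur_len]
    return input_ids, labels
```

This is why the commit re-tokenizes incrementally: the boundaries between user and assistant tokens fall out of the prefix lengths, with no string-offset bookkeeping.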
oobabooga 45188eccef Overhaul LoRA training tab
- Use peft's "all-linear" for target modules instead of the old
  model_to_lora_modules mapping (only knew ~39 model types)
- Add "Target all linear layers" checkbox, on by default
- Fix labels in tokenize() — were [1]s instead of actual token IDs
- Replace DataCollatorForLanguageModeling with custom collate_fn
- Raw text: concatenate-and-split instead of overlapping chunks
- Adapter backup/loading: check safetensors before bin
- Fix report_to=None crash on transformers 5.x
- Fix no_cuda deprecation for transformers 5.x (use use_cpu)
- Move torch.compile before Trainer init
- Add remove_unused_columns=False (torch.compile breaks column detection)
- Guard against no target modules selected
- Set tracked.did_save so we don't always save twice
- pad_token_id: fall back to eos_token_id instead of hardcoding 0
- Drop MODEL_CLASSES, split_chunks, cut_chunk_for_newline
- Update docs
2026-03-05 10:52:59 -03:00
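The move to peft's "all-linear" target-module selection removes the need for a per-architecture module map. A hedged config fragment (assuming peft >= 0.8, where `LoraConfig` accepts `target_modules="all-linear"` and expands it to every linear layer except the output head; the rank/alpha/dropout values mirror the defaults adopted above):

```python
from peft import LoraConfig

config = LoraConfig(
    r=8,                           # rank
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules="all-linear",   # no hand-maintained per-model mapping
    task_type="CAUSAL_LM",
)
```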
Sense_wang 7bf15ad933 fix: replace bare except clauses with except Exception (#7400) 2026-03-04 18:06:17 -03:00
Trenten Miller 6871484398 fix: Rename 'evaluation_strategy' to 'eval_strategy' in training 2025-10-28 16:48:04 -03:00
oobabooga aa44e542cb Revert "Safer usage of mkdir across the project"
This reverts commit 0d1597616f.
2025-06-17 07:11:59 -07:00
oobabooga 0d1597616f Safer usage of mkdir across the project 2025-06-17 07:09:33 -07:00
oobabooga d9de14d1f7 Restructure the repository (#6904) 2025-04-26 08:56:54 -03:00
oobabooga e99c20bcb0 llama.cpp: Add speculative decoding (#6891) 2025-04-23 20:10:16 -03:00
oobabooga ae02ffc605 Refactor the transformers loader (#6859) 2025-04-20 13:33:47 -03:00
oobabooga bba5b36d33 Don't import PEFT unless necessary 2024-09-03 19:40:53 -07:00
oobabooga 5223c009fe Minor change after previous commit 2024-07-27 23:13:34 -07:00
oobabooga 7050bb880e UI: make n_ctx/max_seq_len/truncation_length numbers rather than sliders 2024-07-27 23:11:53 -07:00
oobabooga bd7cc4234d Backend cleanup (#6025) 2024-05-21 13:32:02 -03:00
oobabooga faf3bf2503 Perplexity evaluation: make UI events more robust (attempt) 2024-02-22 07:13:22 -08:00
ilya sheprut 4d14eb8b82 LoRA: Fix error "Attempting to unscale FP16 gradients" when training (#5268) 2024-01-17 17:11:49 -03:00
AstrisCantCode b80e6365d0 Fix various bugs for LoRA training (#5161) 2024-01-03 20:42:20 -03:00
oobabooga 9992f7d8c0 Improve several log messages 2023-12-19 20:54:32 -08:00
oobabooga 131a5212ce UI: update context upper limit to 200000 2023-12-04 15:48:34 -08:00
Julien Chaumond a56ef2a942 make torch.load a bit safer (#4448) 2023-11-02 14:07:08 -03:00
oobabooga 262f8ae5bb Use default gr.Dataframe for evaluation table 2023-10-27 06:49:14 -07:00
Abhilash Majumder 778a010df8 Intel Gpu support initialization (#4340) 2023-10-26 23:39:51 -03:00
adrianfiedler 4bc411332f Fix broken links (#4367)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-23 14:09:57 -03:00
omo 4405513ca5 Option to select/target additional linear modules/layers in LORA training (#4178) 2023-10-22 15:57:19 -03:00
oobabooga f17f7a6913 Increase the evaluation table height 2023-10-16 12:55:35 -07:00
oobabooga 188d20e9e5 Reduce the evaluation table height 2023-10-16 10:53:42 -07:00
oobabooga 71cac7a1b2 Increase the height of the evaluation table 2023-10-15 21:56:40 -07:00
oobabooga fae8062d39 Bump to latest gradio (3.47) (#4258) 2023-10-10 22:20:49 -03:00
oobabooga abe99cddeb Extend evaluation slider bounds 2023-09-29 13:06:26 -07:00
oobabooga 1ca54faaf0 Improve --multi-user mode 2023-09-26 06:42:33 -07:00
John Smith cc7b7ba153 fix lora training with alpaca_lora_4bit (#3853) 2023-09-11 01:22:20 -03:00
oobabooga 8545052c9d Add the option to use samplers in the logit viewer 2023-08-22 20:18:16 -07:00
oobabooga 25e5eaa6a6 Remove outdated training warning 2023-08-22 13:16:44 -07:00
oobabooga 335c49cc7e Bump peft and transformers 2023-08-22 13:14:59 -07:00
oobabooga b96fd22a81 Refactor the training tab (#3619) 2023-08-18 16:58:38 -03:00
oobabooga 65aa11890f Refactor everything (#3481) 2023-08-06 21:49:27 -03:00
oobabooga 3e70bce576 Properly format exceptions in the UI 2023-08-03 06:57:21 -07:00
Foxtr0t1337 85b3a26e25 Ignore values which are not string in training.py (#3287) 2023-07-25 19:00:25 -03:00
FartyPants 9b55d3a9f9 More robust and error prone training (#3058) 2023-07-12 15:29:43 -03:00
oobabooga 30f37530d5 Add back .replace('\r', '') 2023-07-12 09:52:20 -07:00
Fernando Tarin Morales 987d0fe023 Fix: Fixed the tokenization process of a raw dataset and improved its efficiency (#3035) 2023-07-12 12:05:37 -03:00