Commit graph

5222 commits

Author SHA1 Message Date
oobabooga 1ffe540c97 Full documentation update to match current codebase 2026-03-05 12:46:54 -03:00
oobabooga 1c2548fd89 Training: use dynamic padding (pad to batch max instead of cutoff_len)
- Remove pre-padding from tokenize() and tokenize_conversation()
- Collate function now right-pads each batch to the longest sequence
- Set tokenizer padding_side to "right" (standard for training)
- Remove dead natural_keys import
- Reduces wasted compute on batches with short sequences
- Aligns with axolotl/unsloth approach
2026-03-05 12:45:32 -03:00
oobabooga da2d4f1a6a Training: replace raw text file with JSONL text dataset, re-add stride overlap
- Replace "Raw text file" tab with "Text Dataset" tab using JSONL format with "text" key per row
- Re-add stride overlap for chunking (configurable Stride Length slider, 0-2048 tokens)
- Pad remainder chunks instead of dropping them
- Remove hard_cut_string, min_chars, raw_text_file parameters
- Remove .txt file and directory loading support
2026-03-05 12:33:12 -03:00
oobabooga d278bb46a2 Add apply_chat_template() support for LoRA training
- Support multi-turn conversations (OpenAI messages + ShareGPT formats)
- Automatic assistant-only label masking via incremental tokenization
- Use tokenizer.apply_chat_template() for proper special token handling
- Add "Chat Template" option to the Data Format dropdown
- Also accept instruction/output datasets (auto-converted to messages)
- Validate chat template availability and dataset format upfront
- Fix after_tokens[-1] IndexError when train_only_after is at end of prompt
- Update docs
2026-03-05 11:47:25 -03:00
oobabooga b16a1a874a Update TensorRT-LLM Dockerfile for v1.1.0 2026-03-05 06:23:56 -08:00
oobabooga 45188eccef Overhaul LoRA training tab
- Use peft's "all-linear" for target modules instead of the old
  model_to_lora_modules mapping (only knew ~39 model types)
- Add "Target all linear layers" checkbox, on by default
- Fix labels in tokenize() — were [1]s instead of actual token IDs
- Replace DataCollatorForLanguageModeling with custom collate_fn
- Raw text: concatenate-and-split instead of overlapping chunks
- Adapter backup/loading: check safetensors before bin
- Fix report_to=None crash on transformers 5.x
- Fix no_cuda deprecation for transformers 5.x (use use_cpu)
- Move torch.compile before Trainer init
- Add remove_unused_columns=False (torch.compile breaks column detection)
- Guard against no target modules selected
- Set tracked.did_save so we don't always save twice
- pad_token_id: fall back to eos_token_id instead of hardcoding 0
- Drop MODEL_CLASSES, split_chunks, cut_chunk_for_newline
- Update docs
2026-03-05 10:52:59 -03:00
oobabooga 268cc3f100 Update TensorRT-LLM to v1.1.0 2026-03-05 09:32:28 -03:00
oobabooga 69fa4dd0b1 llama.cpp: allow ctx_size=0 for auto context via --fit 2026-03-04 19:33:20 -08:00
oobabooga fbfcd59fe0 llama.cpp: Use -1 instead of 0 for auto gpu_layers 2026-03-04 19:21:45 -08:00
oobabooga d45aa6606a Fix blank prompt dropdown in Notebook/Default tabs on first startup 2026-03-04 19:07:55 -08:00
oobabooga 0804296f4d Revert "UI: Remove unnecessary server round-trips from button click chains"
This reverts commit ff48956cb0.
2026-03-04 18:41:30 -08:00
oobabooga 6a08e79fa5 Update the custom gradio wheels 2026-03-04 18:22:50 -08:00
oobabooga ff48956cb0 UI: Remove unnecessary server round-trips from button click chains 2026-03-04 18:19:56 -08:00
oobabooga 5a22970ba8 Docker: fix and clean up configs, update docs 2026-03-04 23:13:47 -03:00
oobabooga 387cf9d8df Remove obsolete DeepSpeed inference code (2023 relic) 2026-03-04 17:20:34 -08:00
oobabooga 942ff8fcb4 Remove obsolete stuff after custom gradio updates 2026-03-04 16:43:32 -08:00
oobabooga da3010c3ed tiny improvements to llama_cpp_server.py 2026-03-04 15:54:37 -08:00
oobabooga 83cc207ef7 Update the custom gradio wheels 2026-03-04 14:31:18 -08:00
thecaptain789 2ac4eb33c8
fix: correct typo 'occured' to 'occurred' (#7389) 2026-03-04 18:09:28 -03:00
Sense_wang 7bf15ad933
fix: replace bare except clauses with except Exception (#7400) 2026-03-04 18:06:17 -03:00
mamei16 1d1f4dfc88
Disable uncommonly used indented codeblocks (#7401) 2026-03-04 17:51:00 -03:00
mamei16 abb7cc02e9
Re-introduce inline LaTeX rendering with more robust exception handling (#7402) 2026-03-04 17:44:19 -03:00
mamei16 68109bc5da
Improve process_markdown_content (#7403) 2026-03-04 17:26:13 -03:00
weiguang li 952e2c404a
Bump sentence-transformers from 2.2.2 to 3.3.1 in superbooga (#7406) 2026-03-04 17:08:08 -03:00
oobabooga cdf0e392e6 llama.cpp: Reorganize speculative decoding UI and use recommended ngram-mod defaults 2026-03-04 12:05:08 -08:00
oobabooga eb90daf098 ExLlamaV2: Don't expose unused seed parameter 2026-03-04 11:14:50 -08:00
oobabooga 0ffb75de7c Update Transformers to 5.3.0 2026-03-04 11:12:54 -08:00
oobabooga d8af0505a8 ExLlamav3_HF: Optimize prefill and fix CFG cache initialization 2026-03-04 11:09:58 -08:00
oobabooga 9b916f02cd ExLlamaV3: Attach AdaptiveP, fix speculative decoding parameter, add seed 2026-03-04 10:51:15 -08:00
oobabooga 5d93f4e800 Fix requires_grad warning in logits API 2026-03-04 10:43:23 -08:00
oobabooga 64eb77e782 Fix the logits API endpoint with transformers 2026-03-04 10:41:47 -08:00
oobabooga 22141679e3 Update the custom gradio wheels 2026-03-04 10:01:31 -08:00
oobabooga 65de4c30c8 Add adaptive-p sampler and n-gram speculative decoding support 2026-03-04 09:41:29 -08:00
oobabooga f010aa1612 Replace PyPDF2 with pymupdf for PDF text extraction
pymupdf produces cleaner text (e.g. no concatenated words in headers),
handles encrypted and malformed PDFs that PyPDF2 failed on, and
supports non-Latin scripts.
2026-03-04 06:43:37 -08:00
oobabooga f4d787ab8d Delegate GPU layer allocation to llama.cpp's --fit 2026-03-04 06:37:50 -08:00
oobabooga 8a3d866401 Fix temperature_last having no effect in llama.cpp server sampler order 2026-03-04 06:10:51 -08:00
oobabooga 11dc6fdfce Update the custom gradio wheels 2026-03-04 06:04:33 -08:00
oobabooga 7d42b6900e Update the custom gradio wheels 2026-03-04 05:47:59 -08:00
oobabooga 8cbb7661a8 Remove no longer needed dark theme localstorage code 2026-03-03 18:51:24 -08:00
oobabooga 866c48e55b Simplify dark theme handling using gradio fork's new dark_theme parameter 2026-03-03 18:41:47 -08:00
oobabooga b3fd0d16e0 Use a new gr.Headless component for efficient chat streaming 2026-03-03 18:12:03 -08:00
oobabooga d584ede72e Avoid a circular import 2026-03-03 17:59:47 -08:00
oobabooga c0bff831e3 Update custom gradio wheels 2026-03-03 17:21:18 -08:00
oobabooga 2260e530c9 Remove gradio monkey-patches (moved to gradio fork) 2026-03-03 17:17:36 -08:00
oobabooga e9f22813e4 Replace gradio with my gradio 4.37.2 fork 2026-03-03 16:51:27 -08:00
dependabot[bot] 3519890c8e
Bump flask-cloudflared from 0.0.14 to 0.0.15 in /requirements/full (#7380) 2026-03-03 21:41:51 -03:00
dependabot[bot] 9c604628a0
Bump flask-cloudflared from 0.0.14 to 0.0.15 in /requirements/portable (#7382) 2026-03-03 21:41:46 -03:00
oobabooga fbd2acfa19 Remove triton-windows from non-CUDA requirements 2026-03-03 16:16:55 -08:00
oobabooga 5fd79b23d1 Add CUDA 13.1 portable builds 2026-03-03 15:36:41 -08:00
oobabooga b8fcc8ea32 Update llama.cpp, remove noavx2 builds, add ROCm Windows portable builds 2026-03-03 15:27:19 -08:00