Commit graph

5302 commits

Author SHA1 Message Date
oobabooga 83b7e47d77 Update README 2026-03-09 20:12:54 -07:00
oobabooga 7f485274eb Fix ExLlamaV3 EOS handling, load order, and perplexity evaluation
- Use config.eos_token_id_list for all EOS tokens as stop conditions
  (fixes models like Llama-3 that define multiple EOS token IDs)
- Load vision/draft models before main model so autosplit accounts
  for their VRAM usage
- Fix loss computation in ExLlamav3_HF: use cache across chunks so
  sequences longer than 2048 tokens get correct perplexity values
2026-03-09 23:56:38 -03:00
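The perplexity fix above (reuse the cache across chunks so logits for each chunk are conditioned on everything before it) can be sketched generically. This is a minimal illustration of the chunking loop, not the project's ExLlamav3_HF code; the `new_cache`/`forward_nll` model interface is hypothetical.

```python
import math

def chunked_perplexity(model, token_ids, chunk_size=2048):
    """Compute perplexity over a long sequence in fixed-size chunks,
    carrying the model's cache between chunks so chunk N is scored
    conditioned on chunks 0..N-1 (the bug being fixed was scoring
    each chunk independently, which inflates loss past 2048 tokens).

    `model` is a hypothetical interface: new_cache() returns a fresh
    cache, forward_nll(chunk, cache) returns per-token negative
    log-likelihoods for `chunk` given the cache, updating the cache."""
    cache = model.new_cache()
    total_nll, total_tokens = 0.0, 0
    for start in range(0, len(token_ids), chunk_size):
        chunk = token_ids[start:start + chunk_size]
        nlls = model.forward_nll(chunk, cache)
        total_nll += sum(nlls)
        total_tokens += len(nlls)
    return math.exp(total_nll / total_tokens)
```

With a shared cache, the sum of per-chunk NLLs equals the NLL of the full sequence, so perplexity no longer depends on where the chunk boundaries fall.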
oobabooga 39e6c997cc Refactor to not import gradio in --nowebui mode 2026-03-09 19:29:24 -07:00
oobabooga 970055ca00 Update Intel GPU support to use native PyTorch XPU wheels
PyTorch 2.9+ includes native XPU support, making
intel-extension-for-pytorch and the separate oneAPI conda
install unnecessary.

Closes #7308
2026-03-09 17:08:59 -03:00
oobabooga d6643bb4bc One-click installer: Optimize wheel downloads to only re-download changed wheels 2026-03-09 12:30:43 -07:00
oobabooga 9753b2342b Fix crash on non-UTF-8 Windows locales (e.g. Chinese GBK)
Closes #7416
2026-03-09 16:22:37 -03:00
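The class of crash fixed above: Python's `open()` defaults to the locale encoding (`locale.getpreferredencoding()`), which is GBK/cp936 on Chinese Windows, so reading UTF-8 files raises `UnicodeDecodeError` or yields mojibake. A minimal sketch of the fix pattern (helper names are illustrative, not the project's):

```python
def read_text(path):
    # Pass the encoding explicitly instead of relying on the locale
    # default, which is not UTF-8 on e.g. Chinese or Japanese Windows.
    with open(path, encoding="utf-8") as f:
        return f.read()

def write_text(path, text):
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

Applying `encoding="utf-8"` at every text-mode `open()` makes file I/O behave identically across locales.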
oobabooga eb4a20137a Update README 2026-03-08 20:38:50 -07:00
oobabooga 634609acca Fix pip installing to system Miniconda on Windows, revert 0132966d 2026-03-08 20:35:41 -07:00
oobabooga 40f1837b42 README: Minor updates 2026-03-08 08:38:29 -07:00
oobabooga f6ffecfff2 Add guard against training with llama.cpp loader 2026-03-08 10:47:59 -03:00
oobabooga 5a91b8462f Remove ctx_size_draft from ExLlamav3 loader 2026-03-08 09:53:48 -03:00
oobabooga 7a8ca9f2b0 Fix passing adaptive-p to llama-server 2026-03-08 04:09:40 -07:00
oobabooga 0132966d09 Add PyPI fallback for PyTorch install commands 2026-03-07 23:06:15 -03:00
oobabooga baf4e13ff1 ExLlamav3: fix draft cache size to match main cache 2026-03-07 22:34:48 -03:00
oobabooga 6ff111d18e ExLlamav3: handle exceptions in ConcurrentGenerator iterate loop 2026-03-07 22:05:31 -03:00
oobabooga 0cecc0a041 Use tar.gz for Linux/macOS portable builds to preserve symlinks 2026-03-07 06:59:48 -08:00
oobabooga e1bf0b866f Update the macos workflow 2026-03-07 06:46:46 -08:00
oobabooga b686193fe2 Reapply "Update Miniforge from 25.3.0 to 26.1.0"
This reverts commit 085c4ef5d7.
2026-03-07 06:10:05 -08:00
oobabooga 328215b0c7 API: Stop generation on client disconnect for non-streaming requests 2026-03-07 06:06:13 -08:00
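The disconnect handling above can be sketched as a poll between tokens: a non-streaming request still iterates the generator internally, so checking for a vanished client each step lets the server abort instead of generating to completion for nobody. This is a generic sketch; `is_disconnected` stands in for something like Starlette's `await request.is_disconnected()` in the real server.

```python
def generate_with_abort(token_iter, is_disconnected):
    """Collect tokens from a non-streaming generation, polling a
    disconnect check between tokens. If the client has gone away,
    close the generator (running its cleanup) and return early."""
    out = []
    for tok in token_iter:
        if is_disconnected():
            token_iter.close()  # lets the generator release resources
            break
        out.append(tok)
    return out
```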
oobabooga 304510eb3d ExLlamav3: route all generation through ConcurrentGenerator 2026-03-07 05:54:14 -08:00
oobabooga 085c4ef5d7 Revert "Update Miniforge from 25.3.0 to 26.1.0"
This reverts commit 9576c5a5f4.
2026-03-07 05:09:49 -08:00
oobabooga aa634c77c0 Update llama.cpp 2026-03-06 21:00:36 -08:00
oobabooga abc699db9b Minor UI change 2026-03-06 19:03:38 -08:00
oobabooga f2fe001cc4 Fix message copy buttons not working over HTTP 2026-03-06 19:01:38 -08:00
oobabooga 7ea5513263 Handle Qwen 3.5 thinking blocks 2026-03-06 19:01:28 -08:00
oobabooga 5fa709a3f4 llama.cpp server: use port+5 offset and suppress "No parser definition detected" logs 2026-03-06 18:52:34 -08:00
oobabooga e8e0d02406 Remove outdated ROCm environment variable overrides from one_click.py 2026-03-06 18:15:05 -08:00
oobabooga 1eead661c3 Portable mode: always use ../user_data if it exists 2026-03-06 18:04:48 -08:00
oobabooga d48b53422f Training: Optimize _peek_json_keys to avoid loading entire file into memory 2026-03-06 15:39:08 -08:00
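The `_peek_json_keys` optimization above boils down to: read a bounded prefix of the dataset file and extract the keys of the first record, instead of `json.load()`-ing a potentially multi-gigabyte file just to inspect its schema. A sketch of the idea (not the project's exact code; it assumes a JSON list of flat objects whose string values contain no braces):

```python
import json

def peek_json_keys(path, max_bytes=65536):
    """Return the keys of the first object in a JSON array file,
    reading at most max_bytes from disk."""
    with open(path, encoding="utf-8") as f:
        prefix = f.read(max_bytes)
    start = prefix.find("{")
    if start == -1:
        return []
    depth = 0
    for i, ch in enumerate(prefix[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # First complete object found; parse only that slice.
                return list(json.loads(prefix[start:i + 1]).keys())
    return []
```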
oobabooga 2beaa4b971 Update llama.cpp 2026-03-06 14:39:35 -08:00
oobabooga 5f6754c267 Fix stop button being ignored when token throttling is off 2026-03-06 17:12:34 -03:00
oobabooga b8b4471ab5 Security: restrict file writes to user_data_dir, block extra_flags from API 2026-03-06 16:58:11 -03:00
oobabooga d03923924a Several small fixes
- Stop llama-server subprocess on model unload instead of relying on GC
- Fix tool_calls[].index being string instead of int in API responses
- Omit tool_calls key from API response when empty per OpenAI spec
- Prevent division by zero when micro_batch_size > batch_size in training
- Copy sampler_priority list before mutating in ExLlamaV3
- Normalize presence/frequency_penalty names for ExLlamaV3 sampler sorting
- Restore original chat_template after training instead of leaving it mutated
2026-03-06 16:52:13 -03:00
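The division-by-zero guard among the fixes above is worth spelling out: gradient accumulation steps are usually computed as `batch_size // micro_batch_size`, which integer-divides to 0 when the micro batch exceeds the batch, and that 0 later divides the loss. A generic sketch (function name hypothetical):

```python
def gradient_accumulation_steps(batch_size, micro_batch_size):
    """Clamp to at least 1 so a micro_batch_size larger than
    batch_size degrades to no accumulation instead of dividing by 0."""
    return max(batch_size // micro_batch_size, 1)
```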
oobabooga 044566d42d API: Add tool call parsing for DeepSeek, GLM, MiniMax, and Kimi models 2026-03-06 15:06:56 -03:00
oobabooga f5acf55207 Add --chat-template-file flag to override the default instruction template for API requests
Matches llama.cpp's flag name. Supports .jinja, .jinja2, and .yaml files.
Priority: per-request params > --chat-template-file > model's built-in template.
2026-03-06 14:04:16 -03:00
oobabooga 3531069824 API: Support Llama 4 tool calling and fix tool calling edge cases 2026-03-06 13:12:14 -03:00
oobabooga 160f7ad6b4 Handle SIGTERM to stop llama-server on pkill 2026-03-06 12:56:33 -03:00
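`pkill` sends SIGTERM by default, which Python does not handle unless a handler is installed, so the llama-server child would be orphaned. A minimal sketch of the pattern (the `stop_server` callable is a stand-in for whatever terminates the subprocess):

```python
import signal
import sys

def install_sigterm_handler(stop_server):
    """Register a SIGTERM handler that stops the llama-server child
    process before exiting, instead of leaving it running when the
    parent is killed."""
    def handler(signum, frame):
        stop_server()
        sys.exit(0)
    signal.signal(signal.SIGTERM, handler)
```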
oobabooga 8e24a20873 Installer: Fix libstdcxx-ng version pin causing conda solver to hang on Python 3.13 2026-03-06 07:39:50 -08:00
oobabooga 3bab7fbfd4 Update Colab notebook: new default model, direct GGUF URL support 2026-03-06 06:52:49 -08:00
oobabooga e7e0df0101 Fix hover menu shifting down when chat input grows 2026-03-06 11:52:16 -03:00
oobabooga 3323dedd08 Update llama.cpp 2026-03-06 06:30:01 -08:00
oobabooga 36dbc4ccce Remove unused colorama and psutil requirements 2026-03-06 06:28:35 -08:00
oobabooga 86d59b4404 Installer: Fix edge case in wheel re-download caching 2026-03-06 06:16:57 -08:00
oobabooga 0e0e3ceb97 Update the custom gradio wheels 2026-03-06 05:46:08 -08:00
oobabooga 6d7018069c Installer: Use absolute Python path in Windows batch scripts 2026-03-05 21:56:01 -08:00
oobabooga f9ed8820de API: Make tool function description and parameters optional 2026-03-05 21:43:33 -08:00
oobabooga 3880c1a406 API: Accept content:null and complex tool definitions in tool calling requests 2026-03-06 02:41:38 -03:00
oobabooga 93ebfa2b7e Fix llama-server output filter for new log format 2026-03-06 02:38:13 -03:00
oobabooga d0ac58ad31 API: Fix tool_calls placement and other response compatibility issues 2026-03-05 21:25:03 -08:00
oobabooga f06583b2b9 API: Use \n instead of \r\n as the SSE separator to match OpenAI 2026-03-05 21:16:37 -08:00
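The SSE change above is about event framing: the SSE spec permits CR, LF, or CRLF line endings, but OpenAI's API emits bare `\n`, and some client SDKs split strictly on `"\n\n"`, so `\r\n\r\n` framing breaks them. A minimal sketch of OpenAI-compatible event serialization (helper names illustrative):

```python
import json

def sse_event(payload):
    """One SSE message: a `data:` line terminated by a blank line,
    using bare \n as the separator to match OpenAI's stream format."""
    return f"data: {json.dumps(payload)}\n\n"

def sse_done():
    # OpenAI streams end with a literal [DONE] sentinel event.
    return "data: [DONE]\n\n"
```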