Commit graph (4495 commits)

Author SHA1 Message Date
oobabooga 2db36da979 UI: Make scrollbars more discreet in dark mode 2025-05-27 21:00:11 -07:00
Underscore 5028480eba UI: Add footer buttons for editing messages (#7019) (Co-authored-by: oobabooga) 2025-05-28 00:55:27 -03:00
Underscore 355b5f6c8b UI: Add message version navigation (#6947) (Co-authored-by: oobabooga) 2025-05-27 22:54:18 -03:00
dependabot[bot] cc9b7253c1 Update transformers requirement in /requirements/full (#7017) 2025-05-26 23:13:10 -03:00
Underscore 8531100109 Fix textbox text usage in methods (#7009) 2025-05-26 22:40:09 -03:00
djholtby 73bfc936a0 Close response generator when stopping API generation (#7014) 2025-05-26 22:39:03 -03:00
oobabooga bae1aa34aa Fix loading Llama-3_3-Nemotron-Super-49B-v1 and similar models (closes #7012) 2025-05-25 17:19:26 -07:00
oobabooga 7f6579ab20 Minor style change 2025-05-20 21:49:44 -07:00
oobabooga 0d3f854778 Improve the style of thinking blocks 2025-05-20 21:40:42 -07:00
oobabooga 8620d6ffe7 Make it possible to upload multiple text files/pdfs at once 2025-05-20 21:34:07 -07:00
oobabooga cc8a4fdcb1 Minor improvement to attachments prompt format 2025-05-20 21:31:18 -07:00
oobabooga 409a48d6bd Add attachments support (text files, PDF documents) (#7005) 2025-05-21 00:36:20 -03:00
oobabooga 5d00574a56 Minor UI fixes 2025-05-20 16:20:49 -07:00
oobabooga 51c50b265d Update llama.cpp to b7a17463ec 2025-05-20 11:16:12 -07:00
oobabooga 616ea6966d Store previous reply versions on regenerate (#7004) 2025-05-20 12:51:28 -03:00
Daniel Dengler c25a381540 Add a "Branch here" footer button to chat messages (#6967) 2025-05-20 11:07:40 -03:00
oobabooga 8e10f9894a Add a metadata field to the chat history & add date/time to chat messages (#7003) 2025-05-20 10:48:46 -03:00
oobabooga 9ec46b8c44 Remove the HQQ loader (HQQ models can be loaded through Transformers) 2025-05-19 09:23:24 -07:00
oobabooga 0c7237e4b7 Update README 2025-05-18 20:01:29 -07:00
oobabooga bad1da99db Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2025-05-18 14:09:08 -07:00
oobabooga 0c1bc6d1d0 Bump llama.cpp 2025-05-18 14:08:54 -07:00
Tiago Silva 9cd6ea6c0b Fix Dockerfile in AMD and Intel (#6995) 2025-05-18 18:07:16 -03:00
oobabooga 83bfd5c64b Fix API issues 2025-05-18 12:45:01 -07:00
oobabooga 126b3a768f Revert "Dynamic Chat Message UI Update Speed (#6952)" (for now; reverts commit 8137eb8ef4) 2025-05-18 12:38:36 -07:00
oobabooga 9d7a36356d Remove unnecessary js that was causing scrolling issues 2025-05-18 10:56:16 -07:00
oobabooga 2faaf18f1f Add back the "Common values" to the ctx-size slider 2025-05-18 09:06:20 -07:00
oobabooga f1ec6c8662 Minor label changes 2025-05-18 09:04:51 -07:00
oobabooga bd13a8f255 UI: Light theme improvement 2025-05-17 22:31:55 -07:00
oobabooga 076aa67963 Fix API issues 2025-05-17 22:22:18 -07:00
oobabooga 366de4b561 UI: Fix the chat area height when "Show controls" is unchecked 2025-05-17 17:11:38 -07:00
oobabooga 61276f6a37 Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2025-05-17 07:22:51 -07:00
oobabooga 4800d1d522 More robust VRAM calculation 2025-05-17 07:20:38 -07:00
mamei16 052c82b664 Fix KeyError: 'gpu_layers' when loading existing model settings (#6991) 2025-05-17 11:19:13 -03:00
oobabooga 0f77ff9670 UI: Use total VRAM (not free) for layers calculation when a model is loaded 2025-05-16 19:19:22 -07:00
oobabooga 4bf763e1d9 Multiple small CSS fixes 2025-05-16 18:22:43 -07:00
oobabooga c0e295dd1d Remove the 'None' option from the model menu 2025-05-16 17:53:20 -07:00
oobabooga e3bba510d4 UI: Only add a blank space to streaming messages in instruct mode 2025-05-16 17:49:17 -07:00
oobabooga 71fa046c17 Minor changes after 1c549d176b 2025-05-16 17:38:08 -07:00
oobabooga d99fb0a22a Add backward compatibility with saved n_gpu_layers values 2025-05-16 17:29:18 -07:00
oobabooga 1c549d176b Fix GPU layers slider: honor saved settings and show true maximum 2025-05-16 17:26:13 -07:00
oobabooga e4d3f4449d API: Fix a regression 2025-05-16 13:02:27 -07:00
oobabooga 470c822f44 API: Hide the uvicorn access logs from the terminal 2025-05-16 12:54:39 -07:00
oobabooga adb975a380 Prevent fractional gpu-layers in the UI 2025-05-16 12:52:43 -07:00
oobabooga fc483650b5 Set the maximum gpu_layers value automatically when the model is loaded with --model 2025-05-16 11:58:17 -07:00
oobabooga 38c50087fe Prevent a crash on systems without an NVIDIA GPU 2025-05-16 11:55:30 -07:00
oobabooga 253e85a519 Only compute VRAM/GPU layers for llama.cpp models 2025-05-16 10:02:30 -07:00
oobabooga 9ec9b1bf83 Auto-adjust GPU layers after model unload to utilize freed VRAM 2025-05-16 09:56:23 -07:00
oobabooga ee7b3028ac Always cache GGUF metadata calls 2025-05-16 09:12:36 -07:00
oobabooga 4925c307cf Auto-adjust GPU layers on context size and cache type changes + many fixes 2025-05-16 09:07:38 -07:00
oobabooga 93e1850a2c Only show the VRAM info for llama.cpp 2025-05-15 21:42:15 -07:00