Commit graph

1675 commits

Author SHA1 Message Date
oobabooga
0ef1b8f8b4 Use ExLlamaV2 (instead of the HF one) for EXL2 models for now
It doesn't seem to have the "OverflowError" bug
2025-04-17 05:47:40 -07:00
oobabooga
682c78ea42 Add back detection of GPTQ models (closes #6841) 2025-04-11 21:00:42 -07:00
oobabooga
4ed0da74a8 Remove the obsolete 'multimodal' extension 2025-04-09 20:09:48 -07:00
oobabooga
598568b1ed Revert "UI: remove the streaming cursor"
This reverts commit 6ea0206207.
2025-04-09 16:03:14 -07:00
oobabooga
297a406e05 UI: smoother chat streaming
This removes the throttling associated to gr.Textbox that made words appears in chunks rather than one at a time
2025-04-09 16:02:37 -07:00
oobabooga
6ea0206207 UI: remove the streaming cursor 2025-04-09 14:59:34 -07:00
oobabooga
8b8d39ec4e Add ExLlamaV3 support (#6832) 2025-04-09 00:07:08 -03:00
oobabooga
bf48ec8c44 Remove an unnecessary UI message 2025-04-07 17:43:41 -07:00
oobabooga
a5855c345c Set context lengths to at most 8192 by default (to prevent out of memory errors) (#6835) 2025-04-07 21:42:33 -03:00
oobabooga
109de34e3b Remove the old --model-menu flag 2025-03-31 09:24:03 -07:00
oobabooga
758c3f15a5 Lint 2025-03-14 20:04:43 -07:00
oobabooga
5bcd2d7ad0 Add the top N-sigma sampler (#6796) 2025-03-14 16:45:11 -03:00
oobabooga
26317a4c7e Fix jinja2 error while loading c4ai-command-a-03-2025 2025-03-14 10:59:05 -07:00
Kelvie Wong
16fa9215c4 Fix OpenAI API with new param (show_after), closes #6747 (#6749)
Co-authored-by: oobabooga <oobabooga4@gmail.com>
2025-02-18 12:01:30 -03:00
oobabooga
dba17c40fc Make transformers 4.49 functional 2025-02-17 17:31:11 -08:00
SamAcctX
f28f39792d update deprecated deepspeed import for transformers 4.46+ (#6725) 2025-02-02 20:41:36 -03:00
oobabooga
c6f2c2fd7e UI: style improvements 2025-02-02 15:34:03 -08:00
oobabooga
0360f54ae8 UI: add a "Show after" parameter (to use with DeepSeek </think>) 2025-02-02 15:30:09 -08:00
oobabooga
f01cc079b9 Lint 2025-01-29 14:00:59 -08:00
oobabooga
75ff3f3815 UI: Mention common context length values 2025-01-25 08:22:23 -08:00
FP HAM
71a551a622 Add strftime_now to JINJA to sattisfy LLAMA 3.1 and 3.2 (and granite) (#6692) 2025-01-24 11:37:20 -03:00
oobabooga
0485ff20e8 Workaround for convert_to_markdown bug 2025-01-23 06:21:40 -08:00
oobabooga
39799adc47 Add a helpful error message when llama.cpp fails to load the model 2025-01-21 12:49:12 -08:00
oobabooga
5e99dded4e UI: add "Continue" and "Remove" buttons below the last chat message 2025-01-21 09:05:44 -08:00
oobabooga
0258a6f877 Fix the Google Colab notebook 2025-01-16 05:21:18 -08:00
oobabooga
1ef748fb20 Lint 2025-01-14 16:44:15 -08:00
oobabooga
f843cb475b UI: update a help message 2025-01-14 08:12:51 -08:00
oobabooga
c832953ff7 UI: Activate auto_max_new_tokens by default 2025-01-14 05:59:55 -08:00
Underscore
53b838d6c5 HTML: Fix quote pair RegEx matching for all quote types (#6661) 2025-01-13 18:01:50 -03:00
oobabooga
c85e5e58d0 UI: move the new morphdom code to a .js file 2025-01-13 06:20:42 -08:00
oobabooga
facb4155d4 Fix morphdom leaving ghost elements behind 2025-01-11 20:57:28 -08:00
oobabooga
a0492ce325 Optimize syntax highlighting during chat streaming (#6655) 2025-01-11 21:14:10 -03:00
mamei16
f1797f4323 Unescape backslashes in html_output (#6648) 2025-01-11 18:39:44 -03:00
oobabooga
1b9121e5b8 Add a "refresh" button below the last message, add a missing file 2025-01-11 12:42:25 -08:00
oobabooga
a5d64b586d Add a "copy" button below each message (#6654) 2025-01-11 16:59:21 -03:00
oobabooga
3a722a36c8 Use morphdom to make chat streaming 1902381098231% faster (#6653) 2025-01-11 12:55:19 -03:00
oobabooga
d2f6c0f65f Update README 2025-01-10 13:25:40 -08:00
oobabooga
c393f7650d Update settings-template.yaml, organize modules/shared.py 2025-01-10 13:22:18 -08:00
oobabooga
83c426e96b Organize internals (#6646) 2025-01-10 18:04:32 -03:00
oobabooga
7fe46764fb Improve the --help message about --tensorcores as well 2025-01-10 07:07:41 -08:00
oobabooga
da6d868f58 Remove old deprecated flags (~6 months or more) 2025-01-09 16:11:46 -08:00
oobabooga
f3c0f964a2 Lint 2025-01-09 13:18:23 -08:00
oobabooga
3020f2e5ec UI: improve the info message about --tensorcores 2025-01-09 12:44:03 -08:00
oobabooga
c08d87b78d Make the huggingface loader more readable 2025-01-09 12:23:38 -08:00
BPplays
619265b32c add ipv6 support to the API (#6559) 2025-01-09 10:23:44 -03:00
oobabooga
5c89068168 UI: add an info message for the new Static KV cache option 2025-01-08 17:36:30 -08:00
nclok1405
b9e2ded6d4 Added UnicodeDecodeError workaround for modules/llamacpp_model.py (#6040)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2025-01-08 21:17:31 -03:00
oobabooga
91a8a87887 Remove obsolete code 2025-01-08 15:07:21 -08:00
oobabooga
7157257c3f Remove the AutoGPTQ loader (#6641) 2025-01-08 19:28:56 -03:00
oobabooga
c0f600c887 Add a --torch-compile flag for transformers 2025-01-05 05:47:00 -08:00