oobabooga
c393f7650d
Update settings-template.yaml, organize modules/shared.py
2025-01-10 13:22:18 -08:00
oobabooga
83c426e96b
Organize internals ( #6646 )
2025-01-10 18:04:32 -03:00
oobabooga
7fe46764fb
Improve the --help message about --tensorcores as well
2025-01-10 07:07:41 -08:00
oobabooga
da6d868f58
Remove old deprecated flags (~6 months or more)
2025-01-09 16:11:46 -08:00
oobabooga
f3c0f964a2
Lint
2025-01-09 13:18:23 -08:00
oobabooga
3020f2e5ec
UI: improve the info message about --tensorcores
2025-01-09 12:44:03 -08:00
oobabooga
c08d87b78d
Make the huggingface loader more readable
2025-01-09 12:23:38 -08:00
BPplays
619265b32c
add ipv6 support to the API ( #6559 )
2025-01-09 10:23:44 -03:00
oobabooga
5c89068168
UI: add an info message for the new Static KV cache option
2025-01-08 17:36:30 -08:00
nclok1405
b9e2ded6d4
Added UnicodeDecodeError workaround for modules/llamacpp_model.py ( #6040 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2025-01-08 21:17:31 -03:00
oobabooga
91a8a87887
Remove obsolete code
2025-01-08 15:07:21 -08:00
oobabooga
7157257c3f
Remove the AutoGPTQ loader ( #6641 )
2025-01-08 19:28:56 -03:00
oobabooga
c0f600c887
Add a --torch-compile flag for transformers
2025-01-05 05:47:00 -08:00
oobabooga
11af199aff
Add a "Static KV cache" option for transformers
2025-01-04 17:52:57 -08:00
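The two commits above expose standard Hugging Face transformers features. A minimal sketch, assuming the flags map onto `torch.compile` and `generate(cache_implementation="static")`; the model id is a placeholder and the webui's actual wiring may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"  # placeholder model, not the webui default
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# What a --torch-compile flag would plausibly toggle:
model.forward = torch.compile(model.forward, mode="reduce-overhead")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=False,
    cache_implementation="static",  # pre-allocated ("static") KV cache
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```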
oobabooga
3967520e71
Connect XTC, DRY, smoothing_factor, and dynatemp to ExLlamaV2 loader (non-HF)
2025-01-04 16:25:06 -08:00
oobabooga
049297fa66
UI: reduce the size of CSS sent to the UI during streaming
2025-01-04 14:09:36 -08:00
oobabooga
0e673a7a42
UI: reduce the size of HTML sent to the UI during streaming
2025-01-04 11:40:24 -08:00
mamei16
9f24885bd2
Sane handling of markdown lists ( #6626 )
2025-01-04 15:41:31 -03:00
oobabooga
4b3e1b3757
UI: add a "Search chats" input field
2025-01-02 18:46:40 -08:00
oobabooga
b8fc9010fa
UI: fix orjson.JSONDecodeError error on page reload
2025-01-02 16:57:04 -08:00
oobabooga
75f1b5ccde
UI: add a "Branch chat" button
2025-01-02 16:24:18 -08:00
Petr Korolev
13c033c745
Fix CUDA error on MPS backend during API request ( #6572 )
...
---------
Co-authored-by: oobabooga <oobabooga4@gmail.com>
2025-01-02 00:06:11 -03:00
oobabooga
725639118a
UI: Use a tab length of 2 for lists (rather than 4)
2025-01-01 13:53:50 -08:00
oobabooga
7b88724711
Make responses start faster by removing unnecessary cleanup calls ( #6625 )
2025-01-01 18:33:38 -03:00
oobabooga
64853f8509
Reapply a necessary change that I removed from #6599 (thanks @mamei16!)
2024-12-31 14:43:22 -08:00
mamei16
e953af85cd
Fix newlines in the markdown renderer ( #6599 )
...
---------
Co-authored-by: oobabooga <oobabooga4@gmail.com>
2024-12-31 01:04:02 -03:00
oobabooga
39a5c9a49c
UI organization ( #6618 )
2024-12-29 11:16:17 -03:00
oobabooga
0490ee620a
UI: increase the threshold for a <li> to be considered long (some more)
2024-12-19 16:51:34 -08:00
oobabooga
89888bef56
UI: increase the threshold for a <li> to be considered long
2024-12-19 14:38:36 -08:00
oobabooga
2acec386fc
UI: improve the streaming cursor
2024-12-19 14:08:56 -08:00
oobabooga
e2fb86e5df
UI: further improve the style of lists and headings
2024-12-19 13:59:24 -08:00
oobabooga
c48e4622e8
UI: update a link
2024-12-18 06:28:14 -08:00
oobabooga
b27f6f8915
Lint
2024-12-17 20:13:32 -08:00
oobabooga
b051e2c161
UI: improve a margin for readability
2024-12-17 19:58:21 -08:00
oobabooga
60c93e0c66
UI: Set cache_type to fp16 by default
2024-12-17 19:44:20 -08:00
oobabooga
ddccc0d657
UI: minor change to log messages
2024-12-17 19:39:00 -08:00
oobabooga
3030c79e8c
UI: show progress while loading a model
2024-12-17 19:37:43 -08:00
Diner Burger
addad3c63e
Allow more granular KV cache settings ( #6561 )
2024-12-17 17:43:48 -03:00
oobabooga
c43ee5db11
UI: very minor color change
2024-12-17 07:59:55 -08:00
oobabooga
d769618591
Improved UI ( #6575 )
2024-12-17 00:47:41 -03:00
oobabooga
350758f81c
UI: Fix the history upload event
2024-11-19 20:34:53 -08:00
oobabooga
d01293861b
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2024-11-18 10:15:36 -08:00
oobabooga
3d19746a5d
UI: improve HTML rendering for lists with sub-lists
2024-11-18 10:14:09 -08:00
mefich
1c937dad72
Filter whitespaces in downloader fields in model tab ( #6518 )
2024-11-18 12:01:40 -03:00
PIRI
e1061ba7e3
Make token bans work again on HF loaders ( #6488 )
2024-10-24 15:24:02 -03:00
oobabooga
2468cfd8bb
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2024-10-14 13:25:27 -07:00
oobabooga
bb62e796eb
Fix locally compiled llama-cpp-python failing to import
2024-10-14 13:24:13 -07:00
oobabooga
c9a9f63d1b
Fix llama.cpp loader not being random (thanks @reydeljuego12345)
2024-10-14 13:07:07 -07:00
PIRI
03a2e70054
Fix temperature_last when temperature not in sampler priority ( #6439 )
2024-10-09 11:25:14 -03:00
oobabooga
49dfa0adaf
Fix the "save preset" event
2024-10-01 11:20:48 -07:00
oobabooga
93c250b9b6
Add a UI element for enable_tp
2024-10-01 11:16:15 -07:00
oobabooga
cca9d6e22d
Lint
2024-10-01 10:21:06 -07:00
oobabooga
4d9ce586d3
Update llama_cpp_python_hijack.py, fix llamacpp_hf
2024-09-30 14:49:21 -07:00
oobabooga
bbdeed3cf4
Make sampler priority high if unspecified
2024-09-29 20:45:27 -07:00
Manuel Schmid
0f90a1b50f
Do not set value for histories in chat when --multi-user is used ( #6317 )
2024-09-29 01:08:55 -03:00
oobabooga
c61b29b9ce
Simplify the warning when flash-attn fails to import
2024-09-28 20:33:17 -07:00
oobabooga
b92d7fd43e
Add warnings for when AutoGPTQ, TensorRT-LLM, or HQQ are missing
2024-09-28 20:30:24 -07:00
oobabooga
7276dca933
Fix a typo
2024-09-27 20:28:17 -07:00
RandoInternetPreson
46996f6519
ExllamaV2 tensor parallelism to increase multi gpu inference speeds ( #6356 )
2024-09-28 00:26:03 -03:00
Philipp Emanuel Weidmann
301375834e
Exclude Top Choices (XTC): A sampler that boosts creativity, breaks writing clichés, and inhibits non-verbatim repetition ( #6335 )
2024-09-27 22:50:12 -03:00
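The XTC commit above only names the sampler, so here is a simplified, hedged sketch of the "Exclude Top Choices" rule: with some probability, drop every token above a probability threshold except the least likely of those candidates. Parameter names are illustrative, not necessarily the webui's:

```python
import torch

def xtc_filter(logits: torch.Tensor, threshold: float = 0.1,
               probability: float = 0.5) -> torch.Tensor:
    """logits: 1D tensor of next-token logits."""
    if torch.rand(1).item() >= probability:
        return logits  # the sampler only triggers part of the time

    probs = torch.softmax(logits, dim=-1)
    above = probs >= threshold
    if above.sum() < 2:
        return logits  # need at least two "top choices" before excluding any

    # Keep the least probable token among those above the threshold,
    # mask out all the other top choices.
    candidate_probs = probs.clone()
    candidate_probs[~above] = float("inf")
    keep_index = torch.argmin(candidate_probs)
    mask = above.clone()
    mask[keep_index] = False
    filtered = logits.clone()
    filtered[mask] = float("-inf")
    return filtered

# Example: a peaked distribution loses its most likely token.
example = torch.tensor([4.0, 3.5, 1.0, 0.5, -1.0])
print(xtc_filter(example, threshold=0.2, probability=1.0))
```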
oobabooga
5c918c5b2d
Make it possible to sort DRY
2024-09-27 15:40:48 -07:00
oobabooga
7424f789bf
Fix the sampling monkey patch (and add more options to sampler_priority) ( #6411 )
2024-09-27 19:03:25 -03:00
oobabooga
bba5b36d33
Don't import PEFT unless necessary
2024-09-03 19:40:53 -07:00
oobabooga
c5b40eb555
llama.cpp: prevent prompt evaluation progress bar with just 1 step
2024-09-03 17:37:06 -07:00
GralchemOz
4c74c7a116
Fix UnicodeDecodeError for BPE-based Models (especially GLM-4) ( #6357 )
2024-09-02 23:00:59 -03:00
oobabooga
fd9cb26619
UI: update the DRY parameters descriptions/order
2024-08-19 19:40:17 -07:00
oobabooga
e926c03b3d
Add a --tokenizer-dir command-line flag for llamacpp_HF
2024-08-06 19:41:18 -07:00
oobabooga
30b4d8c8b2
Fix Llama 3.1 template including lengthy "tools" headers
2024-07-29 11:52:17 -07:00
oobabooga
9dcff21da9
Remove unnecessary shared.previous_model_name variable
2024-07-28 18:35:11 -07:00
oobabooga
514fb2e451
Fix UI error caused by --idle-timeout
2024-07-28 18:30:06 -07:00
oobabooga
5223c009fe
Minor change after previous commit
2024-07-27 23:13:34 -07:00
oobabooga
7050bb880e
UI: make n_ctx/max_seq_len/truncation_length numbers rather than sliders
2024-07-27 23:11:53 -07:00
Harry
078e8c8969
Make compress_pos_emb float ( #6276 )
2024-07-28 03:03:19 -03:00
oobabooga
ffc713f72b
UI: fix multiline LaTeX equations
2024-07-27 15:36:10 -07:00
oobabooga
493f8c3242
UI: remove animation after clicking on "Stop" in the Chat tab
2024-07-27 15:22:34 -07:00
oobabooga
e4d411b841
UI: fix rendering LaTeX enclosed between \[ and \]
2024-07-27 15:21:44 -07:00
oobabooga
f32d26240d
UI: Fix the chat "stop" event
2024-07-26 23:03:05 -07:00
oobabooga
b80d5906c2
UI: fix saving characters
2024-07-25 15:09:31 -07:00
oobabooga
42e80108f5
UI: clear the markdown LRU cache when using the default/notebook tabs
2024-07-25 08:01:42 -07:00
oobabooga
7e2851e505
UI: fix "Command for chat-instruct mode" not appearing by default
2024-07-24 15:04:12 -07:00
oobabooga
947016d010
UI: make the markdown LRU cache infinite (for really long conversations)
2024-07-24 11:54:26 -07:00
oobabooga
e637b702ff
UI: make text between quotes colored in chat mode
2024-07-23 21:30:32 -07:00
oobabooga
1815877061
UI: fix the default character not loading correctly on startup
2024-07-23 18:48:10 -07:00
oobabooga
e6181e834a
Remove AutoAWQ as a standalone loader
...
(it works better through transformers)
2024-07-23 15:31:17 -07:00
oobabooga
f18c947a86
Update the tensorcores description
2024-07-22 18:06:41 -07:00
oobabooga
aa809e420e
Bump llama-cpp-python to 0.2.83, add back tensorcore wheels
...
Also add back the progress bar patch
2024-07-22 18:05:11 -07:00
oobabooga
11bbf71aa5
Bump back llama-cpp-python ( #6257 )
2024-07-22 16:19:41 -03:00
oobabooga
0f53a736c1
Revert the llama-cpp-python update
2024-07-22 12:02:25 -07:00
oobabooga
a687f950ba
Remove the tensorcores llama.cpp wheels
...
They are not faster than the default wheels anymore and they use a lot of space.
2024-07-22 11:54:35 -07:00
oobabooga
017d2332ea
Remove no longer necessary llama-cpp-python patch
2024-07-22 11:50:36 -07:00
oobabooga
f2d802e707
UI: make Default/Notebook contents persist on page reload
2024-07-22 11:07:10 -07:00
oobabooga
8768b69a2d
Lint
2024-07-21 22:08:14 -07:00
oobabooga
79e8dbe45f
UI: minor optimization
2024-07-21 22:06:49 -07:00
oobabooga
7ef2414357
UI: Make the file saving dialogs more robust
2024-07-21 15:38:20 -07:00
oobabooga
423372d6e7
Organize ui_file_saving.py
2024-07-21 13:23:18 -07:00
oobabooga
17df2d7bdf
UI: don't export the instruction template on "Save UI defaults to settings.yaml"
2024-07-21 10:45:01 -07:00
oobabooga
d05846eae5
UI: refresh the pfp cache on handle_your_picture_change
2024-07-21 10:17:22 -07:00
oobabooga
e9d4bff7d0
Update the --tensor_split description
2024-07-20 22:04:48 -07:00
oobabooga
916d1d8283
UI: improve the style of code blocks in light theme
2024-07-20 20:32:57 -07:00
oobabooga
564d8c8c0d
Make alpha_value a float number
2024-07-20 20:02:54 -07:00
oobabooga
79c4d3da3d
Optimize the UI ( #6251 )
2024-07-21 00:01:42 -03:00
Alberto Cano
a14c510afb
Customize the subpath for gradio, use with reverse proxy ( #5106 )
2024-07-20 19:10:39 -03:00
Vhallo
a9a6d72d8c
Use gr.Number for RoPE scaling parameters ( #6233 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-07-20 18:57:09 -03:00
oobabooga
aa7c14a463
Use chat-instruct mode by default
2024-07-19 21:43:52 -07:00
InvectorGator
4148a9201f
Fix for MacOS users encountering model load errors ( #6227 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: Invectorgator <Kudzu12gaming@outlook.com>
2024-07-13 00:04:19 -03:00
oobabooga
e436d69e2b
Add --no_xformers and --no_sdpa flags for ExllamaV2
2024-07-11 15:47:37 -07:00
oobabooga
512b311137
Improve the llama-cpp-python exception messages
2024-07-11 13:00:29 -07:00
oobabooga
f957b17d18
UI: update an obsolete message
2024-07-10 06:01:36 -07:00
oobabooga
c176244327
UI: Move cache_8bit/cache_4bit further up
2024-07-05 12:16:21 -07:00
oobabooga
aa653e3b5a
Prevent llama.cpp from being monkey patched more than once ( closes #6201 )
2024-07-05 03:34:15 -07:00
oobabooga
a210e61df1
UI: Fix broken chat histories not showing ( closes #6196 )
2024-07-04 20:31:25 -07:00
oobabooga
e79e7b90dc
UI: Move the cache_8bit and cache_4bit elements up
2024-07-04 20:21:28 -07:00
oobabooga
8b44d7b12a
Lint
2024-07-04 20:16:44 -07:00
oobabooga
a47de06088
Force only 1 llama-cpp-python version at a time for now
2024-07-04 19:43:34 -07:00
oobabooga
f243b4ca9c
Make llama-cpp-python not crash immediately
2024-07-04 19:16:00 -07:00
oobabooga
907137a13d
Automatically set bf16 & use_eager_attention for Gemma-2
2024-07-01 21:46:35 -07:00
GralchemOz
8a39f579d8
transformers: Add eager attention option to make Gemma-2 work properly ( #6188 )
2024-07-01 12:08:08 -03:00
oobabooga
ed01322763
Obtain the EOT token from the jinja template (attempt)
...
To use as a stopping string.
2024-06-30 15:09:22 -07:00
oobabooga
4ea260098f
llama.cpp: add 4-bit/8-bit kv cache options
2024-06-29 09:10:33 -07:00
oobabooga
220c1797fc
UI: do not show the "save character" button in the Chat tab
2024-06-28 22:11:31 -07:00
oobabooga
8803ae1845
UI: decrease the number of lines for "Command for chat-instruct mode"
2024-06-28 21:41:30 -07:00
oobabooga
5c6b9c610d
UI: allow the character dropdown to coexist in the Chat tab and the Parameters tab ( #6177 )
2024-06-29 01:20:27 -03:00
oobabooga
de69a62004
Revert "UI: move "Character" dropdown to the main Chat tab"
...
This reverts commit 83534798b2 .
2024-06-28 15:38:11 -07:00
oobabooga
38d58764db
UI: remove unused gr.State variable from the Default tab
2024-06-28 15:17:44 -07:00
oobabooga
da196707cf
UI: improve the light theme a bit
2024-06-27 21:05:38 -07:00
oobabooga
9dbcb1aeea
Small fix to make transformers 4.42 functional
2024-06-27 17:05:29 -07:00
oobabooga
8ec8bc0b85
UI: handle another edge case while streaming lists
2024-06-26 18:40:43 -07:00
oobabooga
0e138e4be1
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2024-06-26 18:30:08 -07:00
mefich
a85749dcbe
Update models_settings.py: add default alpha_value, add proper compress_pos_emb for newer GGUFs ( #6111 )
2024-06-26 22:17:56 -03:00
oobabooga
5fe532a5ce
UI: remove DRY info text
...
It was visible for loaders without DRY.
2024-06-26 15:33:11 -07:00
oobabooga
b1187fc9a5
UI: prevent flickering while streaming lists / bullet points
2024-06-25 19:19:45 -07:00
oobabooga
3691451d00
Add back the "Rename chat" feature ( #6161 )
2024-06-25 22:28:58 -03:00
oobabooga
ac3f92d36a
UI: store chat history in the browser
2024-06-25 14:18:07 -07:00
oobabooga
46ca15cb79
Minor bug fixes after e7e1f5901e
2024-06-25 11:49:33 -07:00
oobabooga
83534798b2
UI: move "Character" dropdown to the main Chat tab
2024-06-25 11:25:57 -07:00
oobabooga
279cba607f
UI: don't show an animation when updating the "past chats" menu
2024-06-25 11:10:17 -07:00
oobabooga
3290edfad9
Bug fix: force chat history to be loaded on launch
2024-06-25 11:06:05 -07:00
oobabooga
e7e1f5901e
Prompts in the "past chats" menu ( #6160 )
2024-06-25 15:01:43 -03:00
oobabooga
a43c210617
Improved past chats menu ( #6158 )
2024-06-25 00:07:22 -03:00
oobabooga
96ba53d916
Handle another fix after 57119c1b30
2024-06-24 15:51:12 -07:00
oobabooga
577a8cd3ee
Add TensorRT-LLM support ( #5715 )
2024-06-24 02:30:03 -03:00
oobabooga
536f8d58d4
Do not expose alpha_value to llama.cpp & rope_freq_base to transformers
...
To avoid confusion
2024-06-23 22:09:24 -07:00
oobabooga
b48ab482f8
Remove obsolete "gptq_for_llama_info" message
2024-06-23 22:05:19 -07:00
oobabooga
5e8dc56f8a
Fix after previous commit
2024-06-23 21:58:28 -07:00
Louis Del Valle
57119c1b30
Update block_requests.py to resolve unexpected type error (500 error) ( #5976 )
2024-06-24 01:56:51 -03:00
CharlesCNorton
5993904acf
Fix several typos in the codebase ( #6151 )
2024-06-22 21:40:25 -03:00
GodEmperor785
2c5a9eb597
Change limits of RoPE scaling sliders in UI ( #6142 )
2024-06-19 21:42:17 -03:00
Guanghua Lu
229d89ccfb
Make logs more readable, no more \u7f16\u7801 ( #6127 )
2024-06-15 23:00:13 -03:00
Forkoz
1576227f16
Fix GGUFs with no BOS token present, mainly qwen2 models. ( #6119 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-06-14 13:51:01 -03:00
oobabooga
10601850d9
Fix after previous commit
2024-06-13 19:54:12 -07:00
oobabooga
0f3a423de1
Alternative solution to "get next logits" deadlock ( #6106 )
2024-06-13 19:34:16 -07:00
oobabooga
386500aa37
Avoid unnecessary UI -> backend calls to make it faster
2024-06-12 20:52:42 -07:00
Forkoz
1d79aa67cf
Fix flash-attn UI parameter to actually store true. ( #6076 )
2024-06-13 00:34:54 -03:00
Belladore
3abafee696
DRY sampler improvements ( #6053 )
2024-06-12 23:39:11 -03:00
oobabooga
a36fa73071
Lint
2024-06-12 19:00:21 -07:00
oobabooga
2d196ed2fe
Remove obsolete pre_layer parameter
2024-06-12 18:56:44 -07:00
Belladore
46174a2d33
Fix error when bos_token_id is None. ( #6061 )
2024-06-12 22:52:27 -03:00
Belladore
a363cdfca1
Fix missing bos token for some models (including Llama-3) ( #6050 )
2024-05-27 09:21:30 -03:00
oobabooga
8df68b05e9
Remove MinPLogitsWarper (it's now a transformers built-in)
2024-05-27 05:03:30 -07:00
oobabooga
4f1e96b9e3
Downloader: Add --model-dir argument, respect --model-dir in the UI
2024-05-23 20:42:46 -07:00
oobabooga
ad54d524f7
Revert "Fix stopping strings for llama-3 and phi ( #6043 )"
...
This reverts commit 5499bc9bc8 .
2024-05-22 17:18:08 -07:00
oobabooga
5499bc9bc8
Fix stopping strings for llama-3 and phi ( #6043 )
2024-05-22 13:53:59 -03:00
oobabooga
9e189947d1
Minor fix after bd7cc4234d (thanks @belladoreai)
2024-05-21 10:37:30 -07:00
oobabooga
ae86292159
Fix getting Phi-3-small-128k-instruct logits
2024-05-21 10:35:00 -07:00
oobabooga
bd7cc4234d
Backend cleanup ( #6025 )
2024-05-21 13:32:02 -03:00
Philipp Emanuel Weidmann
852c943769
DRY: A modern repetition penalty that reliably prevents looping ( #5677 )
2024-05-19 23:53:47 -03:00
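A simplified sketch of the DRY idea named above: any token that would extend a sequence already present earlier in the context is penalized, with a penalty that grows exponentially in the length of the repeated suffix. The multiplier/base/allowed_length names follow the PR description; the real sampler also handles sequence breakers and is vectorized:

```python
def dry_penalties(token_ids: list[int], vocab_size: int,
                  multiplier: float = 0.8, base: float = 1.75,
                  allowed_length: int = 2) -> list[float]:
    """Return per-token penalties to subtract from the next-token logits."""
    penalties = [0.0] * vocab_size
    n = len(token_ids)
    # For every earlier position i, measure how long the tokens before i
    # agree with the suffix that currently ends the context.
    for i in range(1, n):
        match_len = 0
        while (match_len < i
               and token_ids[i - 1 - match_len] == token_ids[n - 1 - match_len]):
            match_len += 1
        if match_len >= allowed_length:
            candidate = token_ids[i]  # token that followed the earlier occurrence
            penalty = multiplier * base ** (match_len - allowed_length)
            penalties[candidate] = max(penalties[candidate], penalty)
    return penalties

# Example: the context ends with "5 6 7", which already occurred followed by 8,
# so token 8 is penalized and looping on "5 6 7 8" becomes less likely.
context = [1, 2, 5, 6, 7, 8, 3, 4, 5, 6, 7]
print(dry_penalties(context, vocab_size=10)[8])
```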
oobabooga
9f77ed1b98
--idle-timeout flag to unload the model if unused for N minutes ( #6026 )
2024-05-19 23:29:39 -03:00
altoiddealer
818b4e0354
Let grammar escape backslashes ( #5865 )
2024-05-19 20:26:09 -03:00
Tisjwlf
907702c204
Fix gguf multipart file loading ( #5857 )
2024-05-19 20:22:09 -03:00
A0nameless0man
5cb59707f3
fix: grammar not supporting UTF-8 ( #5900 )
2024-05-19 20:10:39 -03:00
Samuel Wein
b63dc4e325
UI: Warn user if they are trying to load a model from no path ( #6006 )
2024-05-19 20:05:17 -03:00
chr
6b546a2c8b
llama.cpp: increase the max threads from 32 to 256 ( #5889 )
2024-05-19 20:02:19 -03:00
oobabooga
a38a37b3b3
llama.cpp: default n_gpu_layers to the maximum value for the model automatically
2024-05-19 10:57:42 -07:00
oobabooga
a4611232b7
Make --verbose output less spammy
2024-05-18 09:57:00 -07:00
oobabooga
e9c9483171
Improve the logging messages while loading models
2024-05-03 08:10:44 -07:00
oobabooga
e61055253c
Bump llama-cpp-python to 0.2.69, add --flash-attn option
2024-05-03 04:31:22 -07:00
oobabooga
51fb766bea
Add back my llama-cpp-python wheels, bump to 0.2.65 ( #5964 )
2024-04-30 09:11:31 -03:00
oobabooga
dfdb6fee22
Set llm_int8_enable_fp32_cpu_offload=True for --load-in-4bit
...
To allow for 32-bit CPU offloading (it's very slow).
2024-04-26 09:39:27 -07:00
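A hedged sketch of the transformers-side configuration the commit above refers to: a BitsAndBytesConfig that lets modules which do not fit on the GPU stay on the CPU in 32-bit. The model id and device map are placeholders; the webui builds its own config internally:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    llm_int8_enable_fp32_cpu_offload=True,  # allow (slow) fp32 CPU offloading
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",          # placeholder model id
    quantization_config=quant_config,
    device_map="auto",            # lets accelerate spill layers to the CPU
)
```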
oobabooga
70845c76fb
Add back the max_updates_second parameter ( #5937 )
2024-04-26 10:14:51 -03:00
oobabooga
6761b5e7c6
Improved instruct style (with syntax highlighting & LaTeX rendering) ( #5936 )
2024-04-26 10:13:11 -03:00
oobabooga
4094813f8d
Lint
2024-04-24 09:53:41 -07:00
oobabooga
64e2a9a0a7
Fix the Phi-3 template when used in the UI
2024-04-24 01:34:11 -07:00
oobabooga
f0538efb99
Remove obsolete --tensorcores references
2024-04-24 00:31:28 -07:00
Colin
f3c9103e04
Revert walrus operator for params['max_memory'] ( #5878 )
2024-04-24 01:09:14 -03:00
oobabooga
9b623b8a78
Bump llama-cpp-python to 0.2.64, use official wheels ( #5921 )
2024-04-23 23:17:05 -03:00
oobabooga
f27e1ba302
Add a /v1/internal/chat-prompt endpoint ( #5879 )
2024-04-19 00:24:46 -03:00
oobabooga
e158299fb4
Fix loading sharded GGUF models through llamacpp_HF
2024-04-11 14:50:05 -07:00
wangshuai09
fd4e46bce2
Add Ascend NPU support (basic) ( #5541 )
2024-04-11 18:42:20 -03:00
Ashley Kleynhans
70c637bf90
Fix saving of UI defaults to settings.yaml - Fixes #5592 ( #5794 )
2024-04-11 18:19:16 -03:00
oobabooga
3e3a7c4250
Bump llama-cpp-python to 0.2.61 & fix the crash
2024-04-11 14:15:34 -07:00
Victorivus
c423d51a83
Fix issue #5783 for character images with transparency ( #5827 )
2024-04-11 02:23:43 -03:00
Alex O'Connell
b94cd6754e
UI: Respect model and lora directory settings when downloading files ( #5842 )
2024-04-11 01:55:02 -03:00
oobabooga
17c4319e2d
Fix loading command-r context length metadata
2024-04-10 21:39:59 -07:00
oobabooga
cbd65ba767
Add a simple min_p preset, make it the default ( #5836 )
2024-04-09 12:50:16 -03:00
oobabooga
d02744282b
Minor logging change
2024-04-06 18:56:58 -07:00
oobabooga
dd6e4ac55f
Prevent double <BOS_TOKEN> with Command R+
2024-04-06 13:14:32 -07:00
oobabooga
1bdceea2d4
UI: Focus on the chat input after starting a new chat
2024-04-06 12:57:57 -07:00
oobabooga
168a0f4f67
UI: do not load the "gallery" extension by default
2024-04-06 12:43:21 -07:00
oobabooga
64a76856bd
Metadata: Fix loading Command R+ template with multiple options
2024-04-06 07:32:17 -07:00
oobabooga
1b87844928
Minor fix
2024-04-05 18:43:43 -07:00
oobabooga
6b7f7555fc
Logging message to make transformers loader a bit more transparent
2024-04-05 18:40:02 -07:00
oobabooga
0f536dd97d
UI: Fix the "Show controls" action
2024-04-05 12:18:33 -07:00
oobabooga
308452b783
Bitsandbytes: load preconverted 4bit models without additional flags
2024-04-04 18:10:24 -07:00
oobabooga
d423021a48
Remove CTransformers support ( #5807 )
2024-04-04 20:23:58 -03:00
oobabooga
13fe38eb27
Remove specialized code for gpt-4chan
2024-04-04 16:11:47 -07:00
oobabooga
9ab7365b56
Read rope_theta for DBRX model (thanks turboderp)
2024-04-01 20:25:31 -07:00
oobabooga
db5f6cd1d8
Fix ExLlamaV2 loaders using unnecessary "bits" metadata
2024-03-30 21:51:39 -07:00
oobabooga
624faa1438
Fix ExLlamaV2 context length setting ( closes #5750 )
2024-03-30 21:33:16 -07:00
oobabooga
9653a9176c
Minor improvements to Parameters tab
2024-03-29 10:41:24 -07:00
oobabooga
35da6b989d
Organize the parameters tab ( #5767 )
2024-03-28 16:45:03 -03:00
Yiximail
8c9aca239a
Fix prompt incorrectly set to empty when suffix is empty string ( #5757 )
2024-03-26 16:33:09 -03:00
oobabooga
2a92a842ce
Bump gradio to 4.23 ( #5758 )
2024-03-26 16:32:20 -03:00
oobabooga
49b111e2dd
Lint
2024-03-17 08:33:23 -07:00
oobabooga
d890c99b53
Fix StreamingLLM when content is removed from the beginning of the prompt
2024-03-14 09:18:54 -07:00
oobabooga
d828844a6f
Small fix: don't save truncation_length to settings.yaml
...
It should derive from model metadata or from a command-line flag.
2024-03-14 08:56:28 -07:00
oobabooga
2ef5490a36
UI: make light theme less blinding
2024-03-13 08:23:16 -07:00
oobabooga
40a60e0297
Convert attention_sink_size to int ( closes #5696 )
2024-03-13 08:15:49 -07:00
oobabooga
edec3bf3b0
UI: avoid caching convert_to_markdown calls during streaming
2024-03-13 08:14:34 -07:00
oobabooga
8152152dd6
Small fix after 28076928ac
2024-03-11 19:56:35 -07:00
oobabooga
28076928ac
UI: Add a new "User description" field for user personality/biography ( #5691 )
2024-03-11 23:41:57 -03:00
oobabooga
63701f59cf
UI: mention that n_gpu_layers > 0 is necessary for the GPU to be used
2024-03-11 18:54:15 -07:00
oobabooga
46031407b5
Increase the cache size of convert_to_markdown to 4096
2024-03-11 18:43:04 -07:00
oobabooga
9eca197409
Minor logging change
2024-03-11 16:31:13 -07:00
oobabooga
afadc787d7
Optimize the UI by caching convert_to_markdown calls
2024-03-10 20:10:07 -07:00
oobabooga
056717923f
Document StreamingLLM
2024-03-10 19:15:23 -07:00
oobabooga
15d90d9bd5
Minor logging change
2024-03-10 18:20:50 -07:00
oobabooga
cf0697936a
Optimize StreamingLLM by over 10x
2024-03-08 21:48:28 -08:00
oobabooga
afb51bd5d6
Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) ( #5669 )
2024-03-09 00:25:33 -03:00
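A minimal sketch of the StreamingLLM-style trimming referenced in the two commits above: when the context overflows, keep the first `attention_sink_size` tokens ("attention sinks") plus the most recent tokens and drop the middle, so the prompt never has to be re-evaluated from scratch. This operates on token ids only; the real loader also shifts the llama.cpp KV cache accordingly:

```python
def streamingllm_trim(token_ids: list[int], max_len: int,
                      attention_sink_size: int = 4) -> list[int]:
    if len(token_ids) <= max_len:
        return token_ids
    sinks = token_ids[:attention_sink_size]
    tail = token_ids[-(max_len - attention_sink_size):]
    return sinks + tail

# Example: a 12-token context trimmed to 8 keeps tokens 0-3 and the last 4.
print(streamingllm_trim(list(range(12)), max_len=8))
# -> [0, 1, 2, 3, 8, 9, 10, 11]
```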
oobabooga
549bb88975
Increase height of "Custom stopping strings" UI field
2024-03-08 12:54:30 -08:00
oobabooga
238f69accc
Move "Command for chat-instruct mode" to the main chat tab ( closes #5634 )
2024-03-08 12:52:52 -08:00
oobabooga
bae14c8f13
Right-truncate long chat completion prompts instead of left-truncating
...
Instructions are usually at the beginning of the prompt.
2024-03-07 08:50:24 -08:00
Bartowski
104573f7d4
Update cache_4bit documentation ( #5649 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-07 13:08:21 -03:00
oobabooga
2ec1d96c91
Add cache_4bit option for ExLlamaV2 ( #5645 )
2024-03-06 23:02:25 -03:00
oobabooga
2174958362
Revert gradio to 3.50.2 ( #5640 )
2024-03-06 11:52:46 -03:00
oobabooga
d61e31e182
Save the extensions after Gradio 4 ( #5632 )
2024-03-05 07:54:34 -03:00
oobabooga
63a1d4afc8
Bump gradio to 4.19 ( #5522 )
2024-03-05 07:32:28 -03:00
oobabooga
f697cb4609
Move update_wizard_windows.sh to update_wizard_windows.bat (oops)
2024-03-04 19:26:24 -08:00
kalomaze
cfb25c9b3f
Cubic sampling w/ curve param ( #5551 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-03 13:22:21 -03:00
oobabooga
09b13acfb2
Perplexity evaluation: print to terminal after calculation is finished
2024-02-28 19:58:21 -08:00
oobabooga
4164e29416
Block the "To create a public link, set share=True" gradio message
2024-02-25 15:06:08 -08:00
oobabooga
d34126255d
Fix loading extensions with "-" in the name ( closes #5557 )
2024-02-25 09:24:52 -08:00
oobabooga
10aedc329f
Logging: more readable messages when renaming chat histories
2024-02-22 07:57:06 -08:00
oobabooga
faf3bf2503
Perplexity evaluation: make UI events more robust (attempt)
2024-02-22 07:13:22 -08:00
oobabooga
ac5a7a26ea
Perplexity evaluation: add some informative error messages
2024-02-21 20:20:52 -08:00
oobabooga
59032140b5
Fix CFG with llamacpp_HF (2nd attempt)
2024-02-19 18:35:42 -08:00
oobabooga
c203c57c18
Fix CFG with llamacpp_HF
2024-02-19 18:09:49 -08:00
oobabooga
ae05d9830f
Replace {{char}}, {{user}} in the chat template itself
2024-02-18 19:57:54 -08:00
oobabooga
1f27bef71b
Move chat UI elements to the right on desktop ( #5538 )
2024-02-18 14:32:05 -03:00
oobabooga
d6bd71db7f
ExLlamaV2: fix loading when autosplit is not set
2024-02-17 12:54:37 -08:00
oobabooga
af0bbf5b13
Lint
2024-02-17 09:01:04 -08:00
oobabooga
a6730f88f7
Add --autosplit flag for ExLlamaV2 ( #5524 )
2024-02-16 15:26:10 -03:00
oobabooga
4039999be5
Autodetect llamacpp_HF loader when tokenizer exists
2024-02-16 09:29:26 -08:00
oobabooga
76d28eaa9e
Add a menu for customizing the instruction template for the model ( #5521 )
2024-02-16 14:21:17 -03:00
oobabooga
0e1d8d5601
Instruction template: make "Send to default/notebook" work without a tokenizer
2024-02-16 08:01:07 -08:00
oobabooga
44018c2f69
Add a "llamacpp_HF creator" menu ( #5519 )
2024-02-16 12:43:24 -03:00
oobabooga
b2b74c83a6
Fix Qwen1.5 in llamacpp_HF
2024-02-15 19:04:19 -08:00
oobabooga
080f7132c0
Revert gradio to 3.50.2 ( #5513 )
2024-02-15 20:40:23 -03:00
oobabooga
7123ac3f77
Remove "Maximum UI updates/second" parameter ( #5507 )
2024-02-14 23:34:30 -03:00
DominikKowalczyk
33c4ce0720
Bump gradio to 4.19 ( #5419 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-14 23:28:26 -03:00
oobabooga
b16958575f
Minor bug fix
2024-02-13 19:48:32 -08:00
oobabooga
d47182d9d1
llamacpp_HF: do not use oobabooga/llama-tokenizer ( #5499 )
2024-02-14 00:28:51 -03:00
oobabooga
069ed7c6ef
Lint
2024-02-13 16:05:41 -08:00
oobabooga
86c320ab5a
llama.cpp: add a progress bar for prompt evaluation
2024-02-07 21:56:10 -08:00
oobabooga
c55b8ce932
Improved random preset generation
2024-02-06 08:51:52 -08:00
oobabooga
4e34ae0587
Minor logging improvements
2024-02-06 08:22:08 -08:00
oobabooga
3add2376cd
Better warpers logging
2024-02-06 07:09:21 -08:00
oobabooga
494cc3c5b0
Handle empty sampler priority field, use default values
2024-02-06 07:05:32 -08:00
oobabooga
775902c1f2
Sampler priority: better logging, always save to presets
2024-02-06 06:49:22 -08:00
oobabooga
acfbe6b3b3
Minor doc changes
2024-02-06 06:35:01 -08:00
oobabooga
8ee3cea7cb
Improve some log messages
2024-02-06 06:31:27 -08:00
oobabooga
8a6d9abb41
Small fixes
2024-02-06 06:26:27 -08:00
oobabooga
2a1063eff5
Revert "Remove non-HF ExLlamaV2 loader ( #5431 )"
...
This reverts commit cde000d478 .
2024-02-06 06:21:36 -08:00
oobabooga
8c35fefb3b
Add custom sampler order support ( #5443 )
2024-02-06 11:20:10 -03:00
oobabooga
7301c7618f
Minor change to Models tab
2024-02-04 21:49:58 -08:00
oobabooga
f234fbe83f
Improve a log message after previous commit
2024-02-04 21:44:53 -08:00
oobabooga
7073665a10
Truncate long chat completions inputs ( #5439 )
2024-02-05 02:31:24 -03:00
oobabooga
9033fa5eee
Organize the Model tab
2024-02-04 19:30:22 -08:00
Forkoz
2a45620c85
Split by rows instead of layers for llama.cpp multi-gpu ( #5435 )
2024-02-04 23:36:40 -03:00
Badis Ghoubali
3df7e151f7
fix the n_batch slider ( #5436 )
2024-02-04 18:15:30 -03:00
oobabooga
4e188eeb80
Lint
2024-02-03 20:40:10 -08:00
oobabooga
cde000d478
Remove non-HF ExLlamaV2 loader ( #5431 )
2024-02-04 01:15:51 -03:00
kalomaze
b6077b02e4
Quadratic sampling ( #5403 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
Badis Ghoubali
40c7977f9b
Add roleplay.gbnf grammar ( #5368 )
2024-01-28 21:41:28 -03:00
sam-ngu
c0bdcee646
added trust_remote_code to deepspeed init loaderClass ( #5237 )
2024-01-26 11:10:57 -03:00
oobabooga
87dc421ee8
Bump exllamav2 to 0.0.12 ( #5352 )
2024-01-22 22:40:12 -03:00
oobabooga
aad73667af
Lint
2024-01-22 03:25:55 -08:00
lmg-anon
db1da9f98d
Fix logprobs tokens in OpenAI API ( #5339 )
2024-01-22 08:07:42 -03:00
Forkoz
5c5ef4cef7
UI: change n_gpu_layers maximum to 256 for larger models. ( #5262 )
2024-01-17 17:13:16 -03:00
ilya sheprut
4d14eb8b82
LoRA: Fix error "Attempting to unscale FP16 gradients" when training ( #5268 )
2024-01-17 17:11:49 -03:00
oobabooga
e055967974
Add prompt_lookup_num_tokens parameter ( #5296 )
2024-01-17 17:09:36 -03:00
oobabooga
b3fc2cd887
UI: Do not save unchanged extension settings to settings.yaml
2024-01-10 03:48:30 -08:00
oobabooga
53dc1d8197
UI: Do not save unchanged settings to settings.yaml
2024-01-09 18:59:04 -08:00
oobabooga
89e7e107fc
Lint
2024-01-09 16:27:50 -08:00
mamei16
bec4e0a1ce
Fix update event in refresh buttons ( #5197 )
2024-01-09 14:49:37 -03:00
oobabooga
4333d82b9d
Minor bug fix
2024-01-09 06:55:18 -08:00
oobabooga
953343cced
Improve the file saving/deletion menus
2024-01-09 06:33:47 -08:00
oobabooga
123f27a3c5
Load the nearest character after deleting a character
...
Instead of the first.
2024-01-09 06:24:27 -08:00
oobabooga
b908ed318d
Revert "Rename past chats -> chat history"
...
This reverts commit aac93a1fd6 .
2024-01-09 05:26:07 -08:00
oobabooga
4ca82a4df9
Save light/dark theme on "Save UI defaults to settings.yaml"
2024-01-09 04:20:10 -08:00
oobabooga
7af50ede94
Reorder some buttons
2024-01-09 04:11:50 -08:00
oobabooga
a9f49a7574
Confirm the chat history rename with enter
2024-01-09 04:00:53 -08:00
oobabooga
7bdd2118a2
Change some log messages when deleting files
2024-01-09 03:32:01 -08:00
oobabooga
aac93a1fd6
Rename past chats -> chat history
2024-01-09 03:14:30 -08:00
oobabooga
615fa11af8
Move new chat button, improve history deletion handling
2024-01-08 21:22:37 -08:00
oobabooga
4f7e1eeafd
Past chat histories in a side bar on desktop ( #5098 )
...
Lots of room for improvement, but that's a start.
2024-01-09 01:57:29 -03:00
oobabooga
372ef5e2d8
Fix dynatemp parameters always visible
2024-01-08 19:42:31 -08:00
oobabooga
29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters ( #5209 )
2024-01-08 23:28:35 -03:00
oobabooga
c4e005efec
Fix dropdown menus sometimes failing to refresh
2024-01-08 17:49:54 -08:00
oobabooga
9cd2106303
Revert "Add dynamic temperature to the random preset button"
...
This reverts commit 4365fb890f .
2024-01-08 16:46:24 -08:00
oobabooga
4365fb890f
Add dynamic temperature to the random preset button
2024-01-07 13:08:15 -08:00
oobabooga
0d07b3a6a1
Add dynamic_temperature_low parameter ( #5198 )
2024-01-07 17:03:47 -03:00
oobabooga
b8a0b3f925
Don't print torch tensors with --verbose
2024-01-07 10:35:55 -08:00
oobabooga
cf820c69c5
Print generation parameters with --verbose (HF only)
2024-01-07 10:06:23 -08:00
oobabooga
c4c7fc4ab3
Lint
2024-01-07 09:36:56 -08:00
kalomaze
48327cc5c4
Dynamic Temperature HF loader support ( #5174 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
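A hedged illustration of the dynamic-temperature idea behind the PR above, using the dynatemp_low/dynatemp_high/dynatemp_exponent names that appear elsewhere in this log: the temperature is interpolated from the (normalized) entropy of the next-token distribution, so confident distributions get a low temperature and flat ones a high one. The exact formula in the PR may differ:

```python
import math
import torch

def dynamic_temperature(logits: torch.Tensor, dynatemp_low: float = 0.5,
                        dynatemp_high: float = 1.5,
                        dynatemp_exponent: float = 1.0) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-10))).sum()
    max_entropy = math.log(logits.numel())   # entropy of a uniform distribution
    normalized = (entropy / max_entropy).item() ** dynatemp_exponent
    temperature = dynatemp_low + (dynatemp_high - dynatemp_low) * normalized
    return logits / max(temperature, 1e-5)

# A peaked distribution is sharpened (temperature near dynatemp_low);
# a flat one is flattened further (temperature near dynatemp_high).
print(dynamic_temperature(torch.tensor([8.0, 1.0, 0.5, 0.1])))
```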
oobabooga
248742df1c
Save extension fields to settings.yaml on "Save UI defaults"
2024-01-04 20:33:42 -08:00
oobabooga
c9d814592e
Increase maximum temperature value to 5
2024-01-04 17:28:15 -08:00
oobabooga
e4d724eb3f
Fix cache_folder bug introduced in 37eff915d6
2024-01-04 07:49:40 -08:00
Alberto Cano
37eff915d6
Use --disk-cache-dir for all caches
2024-01-04 00:27:26 -03:00
Lounger
7965f6045e
Fix loading latest history for file names with dots ( #5162 )
2024-01-03 22:39:41 -03:00
AstrisCantCode
b80e6365d0
Fix various bugs for LoRA training ( #5161 )
2024-01-03 20:42:20 -03:00
oobabooga
7cce88c403
Remove an unnecessary exception
2024-01-02 07:20:59 -08:00
oobabooga
94afa0f9cf
Minor style changes
2024-01-01 16:00:22 -08:00
oobabooga
cbf6f9e695
Update some UI messages
2023-12-30 21:31:17 -08:00
oobabooga
2aad91f3c9
Remove deprecated command-line flags ( #5131 )
2023-12-31 02:07:48 -03:00
oobabooga
2734ce3e4c
Remove RWKV loader ( #5130 )
2023-12-31 02:01:40 -03:00
oobabooga
0e54a09bcb
Remove exllamav1 loaders ( #5128 )
2023-12-31 01:57:06 -03:00
oobabooga
8e397915c9
Remove --sdp-attention, --xformers flags ( #5126 )
2023-12-31 01:36:51 -03:00
B611
b7dd1f9542
Specify utf-8 encoding for model metadata file open ( #5125 )
2023-12-31 01:34:32 -03:00
oobabooga
c06f630bcc
Increase max_updates_second maximum value
2023-12-24 13:29:47 -08:00
oobabooga
8c60495878
UI: add "Maximum UI updates/second" parameter
2023-12-24 09:17:40 -08:00
zhangningboo
1b8b61b928
Fix output_ids decoding for Qwen/Qwen-7B-Chat ( #5045 )
2023-12-22 23:11:02 -03:00
Yiximail
afc91edcb2
Reset the model_name after unloading the model ( #5051 )
2023-12-22 22:18:24 -03:00
oobabooga
2706149c65
Organize the CMD arguments by group ( #5027 )
2023-12-21 00:33:55 -03:00
oobabooga
c727a70572
Remove redundancy from modules/loaders.py
2023-12-20 19:18:07 -08:00
luna
6efbe3009f
let exllama v1 models load safetensor loras ( #4854 )
2023-12-20 13:29:19 -03:00
oobabooga
bcba200790
Fix EOS being ignored in ExLlamav2 after previous commit
2023-12-20 07:54:06 -08:00
oobabooga
f0f6d9bdf9
Add HQQ back & update version
...
This reverts commit 2289e9031e .
2023-12-20 07:46:09 -08:00
oobabooga
b15f510154
Optimize ExLlamav2 (non-HF) loader
2023-12-20 07:31:42 -08:00
oobabooga
fadb295d4d
Lint
2023-12-19 21:36:57 -08:00
oobabooga
fb8ee9f7ff
Add a specific error if HQQ is missing
2023-12-19 21:32:58 -08:00
oobabooga
9992f7d8c0
Improve several log messages
2023-12-19 20:54:32 -08:00
oobabooga
23818dc098
Better logger
...
Credits: vladmandic/automatic
2023-12-19 20:38:33 -08:00
oobabooga
95600073bc
Add an informative error when extension requirements are missing
2023-12-19 20:20:45 -08:00
oobabooga
d8279dc710
Replace character name placeholders in chat context ( closes #5007 )
2023-12-19 17:31:46 -08:00
oobabooga
e83e6cedbe
Organize the model menu
2023-12-19 13:18:26 -08:00
oobabooga
f4ae0075e8
Fix conversion from old template format to jinja2
2023-12-19 13:16:52 -08:00
oobabooga
de138b8ba6
Add llama-cpp-python wheels with tensor cores support ( #5003 )
2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959
Bump llama-cpp-python to 0.2.24 ( #5001 )
2023-12-19 15:22:21 -03:00
oobabooga
83cf1a6b67
Fix Yi space issue ( closes #4996 )
2023-12-19 07:54:19 -08:00
oobabooga
9847809a7a
Add a warning about ppl evaluation without --no_use_fast
2023-12-18 18:09:24 -08:00
oobabooga
f6d701624c
UI: mention that QuIP# does not work on Windows
2023-12-18 18:05:02 -08:00
oobabooga
a23a004434
Update the example template
2023-12-18 17:47:35 -08:00
Water
674be9a09a
Add HQQ quant loader ( #4888 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
1f9e25e76a
UI: update "Saved instruction templates" dropdown after loading template
2023-12-17 21:19:06 -08:00
oobabooga
da1c8d77ea
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2023-12-17 21:05:10 -08:00
oobabooga
cac89df97b
Instruction templates: better handle unwanted bos tokens
2023-12-17 21:04:30 -08:00
oobabooga
f0d6ead877
llama.cpp: read instruction template from GGUF metadata ( #4975 )
2023-12-18 01:51:58 -03:00
oobabooga
f1f2c4c3f4
Add --num_experts_per_token parameter (ExLlamav2) ( #4955 )
2023-12-17 12:08:33 -03:00
oobabooga
12690d3ffc
Better HF grammar implementation ( #4953 )
2023-12-17 02:01:23 -03:00
oobabooga
f8079d067d
UI: save the sent chat message on "no model is loaded" error
2023-12-16 10:52:41 -08:00
oobabooga
3bbf6c601d
AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this)
2023-12-15 06:46:13 -08:00
oobabooga
2cb5b68ad9
Bug fix: when generation fails, save the sent message ( #4915 )
2023-12-15 01:01:45 -03:00
Kim Jaewon
e53f99faa0
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint ( #4916 )
2023-12-15 00:22:43 -03:00
Lounger
5754f0c357
Fix deleting chat logs ( #4914 )
2023-12-13 21:54:43 -03:00
Bartowski
f51156705d
Allow symlinked folder within root directory ( #4863 )
2023-12-13 18:08:21 -03:00
Ixion
3f3960dbfb
Fixed invalid Jinja2 syntax in instruction templates ( #4911 )
2023-12-13 15:46:23 -03:00
oobabooga
fcf5512364
Jinja templates: fix a potential small bug
2023-12-13 10:19:39 -08:00
oobabooga
7f1a6a70e3
Update the llamacpp_HF comment
2023-12-12 21:04:20 -08:00
oobabooga
1c531a3713
Minor cleanup
2023-12-12 13:25:21 -08:00
oobabooga
8513028968
Fix lag in the chat tab during streaming
2023-12-12 13:01:25 -08:00
oobabooga
39d2fe1ed9
Jinja templates for Instruct and Chat ( #4874 )
2023-12-12 17:23:14 -03:00
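A small illustration of the Jinja2 template mechanism introduced above: the prompt is produced by rendering a template over a list of messages rather than by string concatenation. The template text here is a generic example, not the webui's bundled one:

```python
from jinja2 import Template

instruction_template = Template(
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}### Instruction:\n{{ message['content'] }}\n\n"
    "{% elif message['role'] == 'assistant' %}### Response:\n{{ message['content'] }}\n\n"
    "{% endif %}{% endfor %}"
    "### Response:\n"
)

messages = [
    {"role": "user", "content": "Write a haiku about autumn."},
    {"role": "assistant", "content": "Leaves drift on cold wind..."},
    {"role": "user", "content": "Now one about winter."},
]
print(instruction_template.render(messages=messages))
```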
oobabooga
aab0dd962d
Revert "Update callbacks.py to show tracebacks on ValueError ( #4892 )"
...
This reverts commit 993ca51a65 .
2023-12-12 11:47:11 -08:00
Nehereus
993ca51a65
Update callbacks.py to show tracebacks on ValueError ( #4892 )
2023-12-12 02:29:27 -03:00
Morgan Schweers
602b8c6210
Make new browser reloads recognize current model. ( #4865 )
2023-12-11 02:51:01 -03:00
oobabooga
8c8825b777
Add QuIP# to README
2023-12-08 08:40:42 -08:00
oobabooga
2a335b8aa7
Cleanup: set shared.model_name only once
2023-12-08 06:35:23 -08:00
oobabooga
62d59a516f
Add trust_remote_code to all HF loaders
2023-12-08 06:29:26 -08:00
oobabooga
181743fd97
Fix missing spaces tokenizer issue ( closes #4834 )
2023-12-08 05:16:46 -08:00
Yiximail
1c74b3ab45
Fix partial unicode characters issue ( #4837 )
2023-12-08 09:50:53 -03:00
oobabooga
2c5a1e67f9
Parameters: change max_new_tokens & repetition_penalty_range defaults ( #4842 )
2023-12-07 20:04:52 -03:00
oobabooga
98361af4d5
Add QuIP# support ( #4803 )
...
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga
6430acadde
Minor bug fix after https://github.com/oobabooga/text-generation-webui/pull/4814
2023-12-05 10:08:11 -08:00
oobabooga
0f828ea441
Do not limit API updates/second
2023-12-04 20:45:43 -08:00
oobabooga
9edb193def
Optimize HF text generation ( #4814 )
2023-12-05 00:00:40 -03:00
俞航
ac9f154bcc
Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change ( #4782 )
2023-12-04 21:15:05 -03:00
oobabooga
131a5212ce
UI: update context upper limit to 200000
2023-12-04 15:48:34 -08:00
oobabooga
be88b072e9
Update --loader flag description
2023-12-04 15:41:25 -08:00
oobabooga
7fc9033b2e
Recommend ExLlama_HF and ExLlamav2_HF
2023-12-04 15:28:46 -08:00
Lounger
7c0a17962d
Gallery improvements ( #4789 )
2023-12-03 22:45:50 -03:00
oobabooga
77d6ccf12b
Add a LOADER debug message while loading models
2023-11-30 12:00:32 -08:00
oobabooga
092a2c3516
Fix a bug in llama.cpp get_logits() function
2023-11-30 11:21:40 -08:00
oobabooga
2698d7c9fd
Fix llama.cpp model unloading
2023-11-29 15:19:48 -08:00
oobabooga
9940ed9c77
Sort the loaders
2023-11-29 15:13:03 -08:00
oobabooga
a7670c31ca
Sort
2023-11-28 18:43:33 -08:00
oobabooga
6e51bae2e0
Sort the loaders menu
2023-11-28 18:41:11 -08:00
oobabooga
68059d7c23
llama.cpp: minor log change & lint
2023-11-27 10:44:55 -08:00
tsukanov-as
9f7ae6bb2e
fix detection of stopping strings when HTML escaping is used ( #4728 )
2023-11-27 15:42:08 -03:00
oobabooga
0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader ( #4701 )
2023-11-21 20:59:39 -03:00
oobabooga
2769a1fa25
Hide deprecated args from Session tab
2023-11-21 15:15:16 -08:00
oobabooga
a2e6d00128
Use convert_ids_to_tokens instead of decode in logits endpoint
...
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga
9da7bb203d
Minor LoRA bug fix
2023-11-19 07:59:29 -08:00
oobabooga
a6f1e1bcc5
Fix PEFT LoRA unloading
2023-11-19 07:55:25 -08:00
oobabooga
ab94f0d9bf
Minor style change
2023-11-18 21:11:04 -08:00
oobabooga
5fcee696ea
New feature: enlarge character pictures on click ( #4654 )
2023-11-19 02:05:17 -03:00
oobabooga
ef6feedeb2
Add --nowebui flag for pure API mode ( #4651 )
2023-11-18 23:38:39 -03:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint ( #4650 )
2023-11-18 23:19:31 -03:00
oobabooga
8f4f4daf8b
Add --admin-key flag for API ( #4649 )
2023-11-18 22:33:27 -03:00
Jordan Tucker
baab894759
fix: use system message in chat-instruct mode ( #4648 )
2023-11-18 20:20:13 -03:00
oobabooga
47d9e2618b
Refresh the Preset menu after saving a preset
2023-11-18 14:03:42 -08:00
oobabooga
83b64e7fc1
New feature: "random preset" button ( #4647 )
2023-11-18 18:31:41 -03:00
oobabooga
e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) ( #4637 )
...
* Update requirements*.txt
* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74
Revert "Bump llama-cpp-python to 0.2.18 ( #4611 )"
...
This reverts commit 923c8e25fb .
2023-11-17 05:14:25 -08:00
oobabooga
13dc3b61da
Update README
2023-11-16 19:57:55 -08:00
oobabooga
8b66d83aa9
Set use_fast=True by default, create --no_use_fast flag
...
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga
6525707a7f
Fix "send instruction template to..." buttons ( closes #4625 )
2023-11-16 18:16:42 -08:00
oobabooga
510a01ef46
Lint
2023-11-16 18:03:06 -08:00
oobabooga
923c8e25fb
Bump llama-cpp-python to 0.2.18 ( #4611 )
2023-11-16 22:55:14 -03:00
oobabooga
58c6001be9
Add missing exllamav2 samplers
2023-11-16 07:09:40 -08:00
oobabooga
cd41f8912b
Warn users about n_ctx / max_seq_len
2023-11-15 18:56:42 -08:00
oobabooga
9be48e83a9
Start API when "api" checkbox is checked
2023-11-15 16:35:47 -08:00
oobabooga
a85ce5f055
Add more info messages for truncation / instruction template
2023-11-15 16:20:31 -08:00
oobabooga
883701bc40
Alternative solution to 025da386a0
...
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga
8ac942813c
Revert "Fix CPU memory limit error (issue #3763 ) ( #4597 )"
...
This reverts commit 025da386a0 .
2023-11-15 16:01:54 -08:00
oobabooga
e6f44d6d19
Print context length / instruction template to terminal when loading models
2023-11-15 16:00:51 -08:00
oobabooga
e05d8fd441
Style changes
2023-11-15 15:51:37 -08:00
Andy Bao
025da386a0
Fix CPU memory limit error (issue #3763 ) ( #4597 )
...
get_max_memory_dict() was not properly formatting shared.args.cpu_memory
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
oobabooga
4aabff3728
Remove old API, launch OpenAI API with --api
2023-11-10 06:39:08 -08:00
oobabooga
2af7e382b1
Revert "Bump llama-cpp-python to 0.2.14"
...
This reverts commit 5c3eb22ce6 .
The new version has issues:
https://github.com/oobabooga/text-generation-webui/issues/4540
https://github.com/abetlen/llama-cpp-python/issues/893
2023-11-09 10:02:13 -08:00
oobabooga
21ed9a260e
Document the new "Custom system message" field
2023-11-08 17:54:10 -08:00
oobabooga
2358706453
Add /v1/internal/model/load endpoint (tentative)
2023-11-07 20:58:06 -08:00
oobabooga
43c53a7820
Refactor the /v1/models endpoint
2023-11-07 19:59:27 -08:00
oobabooga
1b69694fe9
Add types to the encode/decode/token-count endpoints
2023-11-07 19:32:14 -08:00
oobabooga
6e2e0317af
Separate context and system message in instruction formats ( #4499 )
2023-11-07 20:02:58 -03:00
oobabooga
5c0559da69
Training: fix .txt files not showing in dropdowns
2023-11-07 14:41:11 -08:00
oobabooga
af3d25a503
Disable logits_all in llamacpp_HF (makes processing 3x faster)
2023-11-07 14:35:48 -08:00
oobabooga
5c3eb22ce6
Bump llama-cpp-python to 0.2.14
2023-11-07 14:20:43 -08:00
oobabooga
ec17a5d2b7
Make OpenAI API the default API ( #4430 )
2023-11-06 02:38:29 -03:00
feng lui
4766a57352
transformers: add use_flash_attention_2 option ( #4373 )
2023-11-04 13:59:33 -03:00
wouter van der plas
add359379e
fixed two links in the ui ( #4452 )
2023-11-04 13:41:42 -03:00
oobabooga
aa5d671579
Add temperature_last parameter ( #4472 )
2023-11-04 13:09:07 -03:00
oobabooga
1ab8700d94
Change frequency/presence penalty ranges
2023-11-03 17:38:19 -07:00
oobabooga
45fcb60e7a
Make truncation_length_max apply to max_seq_len/n_ctx
2023-11-03 11:29:31 -07:00
oobabooga
7f9c1cbb30
Change min_p default to 0.0
2023-11-03 08:25:22 -07:00
oobabooga
4537853e2c
Change min_p default to 1.0
2023-11-03 08:13:50 -07:00
kalomaze
367e5e6e43
Implement Min P as a sampler option in HF loaders ( #4449 )
2023-11-02 16:32:51 -03:00
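A short sketch of the Min P rule added above: every token whose probability falls below min_p times the probability of the most likely token is removed from the candidate pool. Shapes and names are illustrative; the actual sampler operates on batched logits:

```python
import torch

def min_p_filter(logits: torch.Tensor, min_p: float = 0.05) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    cutoff = min_p * probs.max()
    filtered = logits.clone()
    filtered[probs < cutoff] = float("-inf")
    return filtered

# With min_p=0.1, only tokens at least 10% as likely as the top token survive.
print(min_p_filter(torch.tensor([5.0, 4.0, 1.0, -2.0]), min_p=0.1))
```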
oobabooga
fcb7017b7a
Remove a checkbox
2023-11-02 12:24:09 -07:00
Julien Chaumond
fdcaa955e3
transformers: Add a flag to force load from safetensors ( #4450 )
2023-11-02 16:20:54 -03:00
oobabooga
c0655475ae
Add cache_8bit option
2023-11-02 11:23:04 -07:00
oobabooga
42f816312d
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2023-11-02 11:09:26 -07:00
oobabooga
77abd9b69b
Add no_flash_attn option
2023-11-02 11:08:53 -07:00
Julien Chaumond
a56ef2a942
make torch.load a bit safer ( #4448 )
2023-11-02 14:07:08 -03:00
Mehran Ziadloo
aaf726dbfb
Updating the shared settings object when loading a model ( #4425 )
2023-11-01 01:29:57 -03:00
oobabooga
9bd0724d85
Change frequency/presence penalty ranges
2023-10-31 20:57:56 -07:00
Meheret
0707ed7677
updated wiki link ( #4415 )
2023-10-31 19:09:05 -03:00
oobabooga
262f8ae5bb
Use default gr.Dataframe for evaluation table
2023-10-27 06:49:14 -07:00
oobabooga
839a87bac8
Fix is_ccl_available & is_xpu_available imports
2023-10-26 20:27:04 -07:00
Abhilash Majumder
778a010df8
Intel GPU support initialization ( #4340 )
2023-10-26 23:39:51 -03:00
oobabooga
92b2f57095
Minor metadata bug fix (second attempt)
2023-10-26 18:57:32 -07:00
tdrussell
72f6fc6923
Rename additive_repetition_penalty to presence_penalty, add frequency_penalty ( #4376 )
2023-10-25 12:10:28 -03:00
oobabooga
ef1489cd4d
Remove unused parameter in AutoAWQ
2023-10-23 20:45:43 -07:00
oobabooga
1edf321362
Lint
2023-10-23 13:09:03 -07:00
oobabooga
280ae720d7
Organize
2023-10-23 13:07:17 -07:00
oobabooga
49e5eecce4
Merge remote-tracking branch 'refs/remotes/origin/main'
2023-10-23 12:54:05 -07:00
oobabooga
306d764ff6
Minor metadata bug fix
2023-10-23 12:46:24 -07:00
adrianfiedler
4bc411332f
Fix broken links ( #4367 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-23 14:09:57 -03:00
oobabooga
92691ee626
Disable trust_remote_code by default
2023-10-23 09:57:44 -07:00
tdrussell
4440f87722
Add additive_repetition_penalty sampler setting. ( #3627 )
2023-10-23 02:28:07 -03:00
oobabooga
df90d03e0b
Replace --mul_mat_q with --no_mul_mat_q
2023-10-22 12:23:03 -07:00
Googulator
d0c3b407b3
transformers loader: multi-LoRAs support ( #3120 )
2023-10-22 16:06:22 -03:00
omo
4405513ca5
Option to select/target additional linear modules/layers in LORA training ( #4178 )
2023-10-22 15:57:19 -03:00
oobabooga
2d1b3332e4
Ignore warnings on Colab
2023-10-21 21:45:25 -07:00
oobabooga
09f807af83
Use ExLlama_HF for GPTQ models by default
2023-10-21 20:45:38 -07:00
oobabooga
506d05aede
Organize command-line arguments
2023-10-21 18:52:59 -07:00
oobabooga
fbac6d21ca
Add missing exception
2023-10-20 23:53:24 -07:00
Brian Dashore
3345da2ea4
Add flash-attention 2 for windows ( #4235 )
2023-10-21 03:46:23 -03:00
Johan
1d5a015ce7
Enable special token support for exllamav2 ( #4314 )
2023-10-21 01:54:06 -03:00
turboderp
ae8cd449ae
ExLlamav2_HF: Convert logits to FP32 ( #4310 )
2023-10-18 23:16:05 -03:00
oobabooga
f17f7a6913
Increase the evaluation table height
2023-10-16 12:55:35 -07:00
oobabooga
8ea554bc19
Check for torch.xpu.is_available()
2023-10-16 12:53:40 -07:00
oobabooga
188d20e9e5
Reduce the evaluation table height
2023-10-16 10:53:42 -07:00
oobabooga
2d44adbb76
Clear the torch cache while evaluating
2023-10-16 10:52:50 -07:00
oobabooga
71cac7a1b2
Increase the height of the evaluation table
2023-10-15 21:56:40 -07:00
oobabooga
e14bde4946
Minor improvements to evaluation logs
2023-10-15 20:51:43 -07:00
oobabooga
b88b2b74a6
Experimental Intel Arc transformers support (untested)
2023-10-15 20:51:11 -07:00
Forkoz
8cce1f1126
Exllamav2 lora support ( #4229 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-14 16:12:41 -03:00
oobabooga
773c17faec
Fix a warning
2023-10-10 20:53:38 -07:00
oobabooga
f63361568c
Fix safetensors kwarg usage in AutoAWQ
2023-10-10 19:03:09 -07:00
oobabooga
39f16ff83d
Fix default/notebook tabs css
2023-10-10 18:45:12 -07:00
oobabooga
fae8062d39
Bump to latest gradio (3.47) ( #4258 )
2023-10-10 22:20:49 -03:00
oobabooga
9fab9a1ca6
Minor fix
2023-10-10 14:08:11 -07:00
oobabooga
a49cc69a4a
Ignore rope_freq_base if value is 10000
2023-10-10 13:57:40 -07:00
oobabooga
3a9d90c3a1
Download models with 4 threads by default
2023-10-10 13:52:10 -07:00
Forkoz
35695e18c7
Remove import. ( #4247 )
...
For real this time.
2023-10-09 18:06:11 -03:00
Forkoz
2e471071af
Update llama_attn_hijack.py ( #4231 )
2023-10-08 15:16:48 -03:00
Brian Dashore
98fa73a974
Text Generation: stop if EOS token is reached ( #4213 )
2023-10-07 19:46:42 -03:00
Brian Dashore
7743b5e9de
Llamacpp_HF: Fix CFG cache init ( #4219 )
...
Documentation says that model.context_params should be sent when
a new context is created. The current code uses model.params which
doesn't exist.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-10-07 19:38:29 -03:00
turboderp
8a98646a21
Bump ExLlamaV2 to 0.0.5 ( #4186 )
2023-10-05 19:12:22 -03:00
oobabooga
7ffb424c7b
Add AutoAWQ to README
2023-10-05 09:22:37 -07:00
cal066
cc632c3f33
AutoAWQ: initial support ( #3999 )
2023-10-05 13:19:18 -03:00
tdrussell
cb26163a20
Fix off-by-one error in exllama_hf caching logic ( #4145 )
2023-10-05 12:20:56 -03:00
oobabooga
ae4ba3007f
Add grammar to transformers and _HF loaders ( #4091 )
2023-10-05 10:01:36 -03:00
oobabooga
b6fe6acf88
Add threads_batch parameter
2023-10-01 21:28:00 -07:00
jllllll
41a2de96e5
Bump llama-cpp-python to 0.2.11
2023-10-01 18:08:10 -05:00
oobabooga
f2d82f731a
Add recommended NTKv1 alpha values
2023-09-29 13:48:38 -07:00
oobabooga
abe99cddeb
Extend evaluation slider bounds
2023-09-29 13:06:26 -07:00
oobabooga
96da2e1c0d
Read more metadata (config.json & quantize_config.json)
2023-09-29 06:14:16 -07:00
oobabooga
56b5a4af74
exllamav2 typical_p
2023-09-28 20:10:12 -07:00
oobabooga
f8e9733412
Minor syntax change
2023-09-28 19:32:35 -07:00
oobabooga
f931184b53
Increase truncation limits to 32768
2023-09-28 19:28:22 -07:00
oobabooga
1dd13e4643
Read Transformers config.json metadata
2023-09-28 19:19:47 -07:00
StoyanStAtanasov
7e6ff8d1f0
Enable NUMA feature for llama_cpp_python ( #4040 )
2023-09-26 22:05:00 -03:00
oobabooga
87ea2d96fd
Add a note about RWKV loader
2023-09-26 17:43:39 -07:00
oobabooga
0c89180966
Another minor fix
2023-09-26 06:54:21 -07:00
oobabooga
365335e1ae
Minor fix
2023-09-26 06:47:19 -07:00
oobabooga
1ca54faaf0
Improve --multi-user mode
2023-09-26 06:42:33 -07:00
oobabooga
019371c0b6
Lint
2023-09-25 20:31:11 -07:00
oobabooga
814520fed1
Extension install improvements
2023-09-25 20:27:06 -07:00
oobabooga
7f1460af29
Change a warning
2023-09-25 20:22:27 -07:00
oobabooga
862b45b1c7
Extension install improvements
2023-09-25 19:48:30 -07:00
oobabooga
c8952cce55
Move documentation from UI to docs/
2023-09-25 12:28:28 -07:00
oobabooga
d0d221df49
Add --use_fast option ( closes #3741 )
2023-09-25 12:19:43 -07:00
oobabooga
b973b91d73
Automatically filter by loader ( closes #4072 )
2023-09-25 10:28:35 -07:00
oobabooga
63de9eb24f
Clean up the transformers loader
2023-09-24 20:26:26 -07:00
oobabooga
36c38d7561
Add disable_exllama to Transformers loader (for GPTQ LoRA training)
2023-09-24 20:03:11 -07:00
oobabooga
55a685d999
Minor fixes
2023-09-24 14:15:10 -07:00
oobabooga
08cf150c0c
Add a grammar editor to the UI ( #4061 )
2023-09-24 18:05:24 -03:00
oobabooga
eb0b7c1053
Fix a minor UI bug
2023-09-24 07:17:33 -07:00
oobabooga
3edac43426
Remove print statement
2023-09-24 07:13:00 -07:00
oobabooga
b227e65d86
Add grammar to llama.cpp loader ( closes #4019 )
2023-09-24 07:10:45 -07:00
oobabooga
2e7b6b0014
Create alternative requirements.txt with AMD and Metal wheels ( #4052 )
2023-09-24 09:58:29 -03:00
oobabooga
7a3ca2c68f
Better detect EXL2 models
2023-09-23 13:05:55 -07:00
oobabooga
b1467bd064
Move one-click-installers into the repository ( #4028 from oobabooga/one-click)
2023-09-22 17:43:07 -03:00
oobabooga
c075969875
Add instructions
2023-09-22 13:10:03 -07:00
oobabooga
8ab3eca9ec
Add a warning for outdated installations
2023-09-22 09:35:19 -07:00
oobabooga
95976a9d4f
Fix a bug while deleting characters
2023-09-22 06:02:34 -07:00
oobabooga
d5330406fa
Add a rename menu for chat histories
2023-09-21 19:16:51 -07:00
oobabooga
00ab450c13
Multiple histories for each character ( #4022 )
2023-09-21 17:19:32 -03:00
oobabooga
029da9563f
Avoid redundant function call in llamacpp_hf
2023-09-19 14:14:40 -07:00
oobabooga
869f47fff9
Lint
2023-09-19 13:51:57 -07:00
oobabooga
13ac55fa18
Reorder some functions
2023-09-19 13:51:57 -07:00
oobabooga
03dc69edc5
ExLlama_HF (v1 and v2) prefix matching
2023-09-19 13:12:19 -07:00
oobabooga
5075087461
Fix command-line arguments being ignored
2023-09-19 13:11:46 -07:00
oobabooga
ff5d3d2d09
Add missing import
2023-09-18 16:26:54 -07:00
oobabooga
605ec3c9f2
Add a warning about ExLlamaV2 without flash-attn
2023-09-18 12:26:35 -07:00
oobabooga
f0ef971edb
Remove obsolete warning
2023-09-18 12:25:10 -07:00
oobabooga
745807dc03
Faster llamacpp_HF prefix matching
2023-09-18 11:02:45 -07:00
BadisG
893a72a1c5
Stop generation immediately when using "Maximum tokens/second" ( #3952 )
...
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-18 14:27:06 -03:00
Cebtenzzre
8466cf229a
llama.cpp: fix ban_eos_token ( #3987 )
2023-09-18 12:15:02 -03:00
oobabooga
0ede2965d5
Remove an error message
2023-09-17 18:46:08 -07:00