Commit graph

185 commits

Author SHA1 Message Date
oobabooga 2a92a842ce Bump gradio to 4.23 (#5758) 2024-03-26 16:32:20 -03:00
oobabooga d828844a6f Small fix: don't save truncation_length to settings.yaml 2024-03-14 08:56:28 -07:00
    It should derive from model metadata or from a command-line flag.
oobabooga 2ef5490a36 UI: make light theme less blinding 2024-03-13 08:23:16 -07:00
oobabooga 28076928ac UI: Add a new "User description" field for user personality/biography (#5691) 2024-03-11 23:41:57 -03:00
oobabooga afb51bd5d6 Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
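
StreamingLLM lets generation continue past the context limit by pinning a few initial "attention sink" tokens and evicting from the middle of the KV cache rather than the front. A minimal sketch of that eviction rule, assuming the standard sink-plus-sliding-window policy from the StreamingLLM paper rather than this repo's actual code:

```python
# Sketch of a sink + sliding-window KV eviction policy (assumed behavior,
# not the repository's implementation).
def evict_kv_positions(positions, n_sink=4, window=2044):
    """Keep the first n_sink positions plus the most recent `window` ones."""
    if len(positions) <= n_sink + window:
        return positions
    return positions[:n_sink] + positions[-window:]

print(len(evict_kv_positions(list(range(3000)))))  # 2048 entries kept
```
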
oobabooga 2ec1d96c91 Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga 2174958362 Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga 63a1d4afc8 Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
kalomaze cfb25c9b3f Cubic sampling w/ curve param (#5551) 2024-03-03 13:22:21 -03:00
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
oobabooga a6730f88f7 Add --autosplit flag for ExLlamaV2 (#5524) 2024-02-16 15:26:10 -03:00
oobabooga 080f7132c0 Revert gradio to 3.50.2 (#5513) 2024-02-15 20:40:23 -03:00
oobabooga 7123ac3f77 Remove "Maximum UI updates/second" parameter (#5507) 2024-02-14 23:34:30 -03:00
oobabooga 8c35fefb3b Add custom sampler order support (#5443) 2024-02-06 11:20:10 -03:00
Forkoz 2a45620c85 Split by rows instead of layers for llama.cpp multi-gpu (#5435) 2024-02-04 23:36:40 -03:00
kalomaze b6077b02e4 Quadratic sampling (#5403) 2024-02-04 00:20:02 -03:00
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
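
Quadratic sampling reshapes the raw logits with a single "smoothing factor" instead of a flat temperature. A sketch of the commonly described transform, assuming the usual formulation (an inverted parabola centred on the top logit); the PR's exact code may differ:

```python
import torch

def quadratic_smoothing(logits: torch.Tensor, smoothing_factor: float) -> torch.Tensor:
    # Push logits down quadratically with their distance from the best one;
    # larger factors concentrate probability mass on the top tokens.
    max_logit = logits.max()
    return -smoothing_factor * (logits - max_logit) ** 2 + max_logit

probs = torch.softmax(quadratic_smoothing(torch.randn(32000), 0.3), dim=-1)
```
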
oobabooga e055967974 Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
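
prompt_lookup_num_tokens enables prompt lookup decoding, a draft-model-free form of speculative decoding that proposes continuations by matching n-grams already present in the prompt. A hedged usage sketch against the Hugging Face transformers generate() API (model choice is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("Summarize: the quick brown fox jumps over the lazy dog.", return_tensors="pt")
# Draft tokens are copied from n-gram matches inside the prompt itself,
# which helps most on input-grounded tasks like summarization.
out = model.generate(**inputs, max_new_tokens=50, prompt_lookup_num_tokens=10)
print(tok.decode(out[0]))
```
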
oobabooga b3fc2cd887 UI: Do not save unchanged extension settings to settings.yaml 2024-01-10 03:48:30 -08:00
oobabooga 53dc1d8197 UI: Do not save unchanged settings to settings.yaml 2024-01-09 18:59:04 -08:00
mamei16 bec4e0a1ce Fix update event in refresh buttons (#5197) 2024-01-09 14:49:37 -03:00
oobabooga 4ca82a4df9 Save light/dark theme on "Save UI defaults to settings.yaml" 2024-01-09 04:20:10 -08:00
oobabooga 29c2693ea0 dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
oobabooga c4e005efec Fix dropdown menus sometimes failing to refresh 2024-01-08 17:49:54 -08:00
oobabooga 0d07b3a6a1 Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
kalomaze 48327cc5c4 Dynamic Temperature HF loader support (#5174) 2024-01-07 10:36:26 -03:00
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
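
Dynamic temperature slides the effective temperature between a low and a high bound according to how uncertain the model is at each step. A sketch assuming the popular entropy-based formulation (the loader's exact code may differ):

```python
import torch

def dynatemp(logits, t_low=0.5, t_high=1.5, exponent=1.0):
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-10).log()).sum()
    max_entropy = torch.log(torch.tensor(float(logits.numel())))
    # Confident distributions (low entropy) sample colder, flat ones hotter.
    t = t_low + (t_high - t_low) * (entropy / max_entropy) ** exponent
    return logits / t
```
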
oobabooga 248742df1c Save extension fields to settings.yaml on "Save UI defaults" 2024-01-04 20:33:42 -08:00
oobabooga 8c60495878 UI: add "Maximum UI updates/second" parameter 2023-12-24 09:17:40 -08:00
oobabooga de138b8ba6 Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga 0a299d5959 Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
Water 674be9a09a Add HQQ quant loader (#4888) 2023-12-18 21:23:16 -03:00
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
oobabooga f1f2c4c3f4 Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga 3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga 39d2fe1ed9 Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga 5fcee696ea New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
oobabooga e0ca49ed9c Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637) 2023-11-18 00:31:27 -03:00
    * Update requirements*.txt
    * Add back seed
oobabooga 9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)" 2023-11-17 05:14:25 -08:00
    This reverts commit 923c8e25fb.
oobabooga 8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag 2023-11-16 19:55:28 -08:00
    This increases tokens/second for HF loaders.
oobabooga 923c8e25fb Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga 6e2e0317af Separate context and system message in instruction formats (#4499) 2023-11-07 20:02:58 -03:00
oobabooga af3d25a503 Disable logits_all in llamacpp_HF (makes processing 3x faster) 2023-11-07 14:35:48 -08:00
feng lui 4766a57352 transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
oobabooga aa5d671579 Add temperature_last parameter (#4472) 2023-11-04 13:09:07 -03:00
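
temperature_last changes where temperature sits in the sampler stack: the truncation samplers pick the candidate set first, and only then is the distribution re-shaped. Schematic orderings (sampler names illustrative, not the repo's internal identifiers):

```python
# Default: temperature first, then truncation.
default_order = ["temperature", "top_k", "top_p"]
# With temperature_last: truncate at temperature 1, then apply temperature.
temperature_last_order = ["top_k", "top_p", "temperature"]
```
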
kalomaze 367e5e6e43 Implement Min P as a sampler option in HF loaders (#4449) 2023-11-02 16:32:51 -03:00
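
Min P discards every token whose probability falls below a fraction min_p of the top token's probability, so the cutoff tightens automatically when the model is confident. A minimal sketch:

```python
import torch

def min_p_filter(logits: torch.Tensor, min_p: float = 0.05) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    threshold = min_p * probs.max()  # scales with the model's confidence
    return logits.masked_fill(probs < threshold, float("-inf"))
```
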
oobabooga c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
Abhilash Majumder 778a010df8 Intel Gpu support initialization (#4340) 2023-10-26 23:39:51 -03:00
tdrussell 72f6fc6923 Rename additive_repetition_penalty to presence_penalty, add frequency_penalty (#4376) 2023-10-25 12:10:28 -03:00
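
The rename aligns these samplers with the OpenAI-style definitions: presence penalty subtracts a flat amount from any token that has already appeared, while frequency penalty grows with each repetition. A sketch of those semantics:

```python
from collections import Counter
import torch

def apply_penalties(logits: torch.Tensor, generated_ids,
                    presence_penalty=0.0, frequency_penalty=0.0):
    for token_id, n in Counter(generated_ids).items():
        # Flat hit for having appeared at all, plus a per-occurrence term.
        logits[token_id] -= presence_penalty + frequency_penalty * n
    return logits
```
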
tdrussell 4440f87722 Add additive_repetition_penalty sampler setting. (#3627) 2023-10-23 02:28:07 -03:00
oobabooga df90d03e0b Replace --mul_mat_q with --no_mul_mat_q 2023-10-22 12:23:03 -07:00
oobabooga fae8062d39 Bump to latest gradio (3.47) (#4258) 2023-10-10 22:20:49 -03:00
oobabooga b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
jllllll 41a2de96e5 Bump llama-cpp-python to 0.2.11 2023-10-01 18:08:10 -05:00
StoyanStAtanasov 7e6ff8d1f0 Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga 1ca54faaf0 Improve --multi-user mode 2023-09-26 06:42:33 -07:00
oobabooga d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga b973b91d73 Automatically filter by loader (closes #4072) 2023-09-25 10:28:35 -07:00
oobabooga 08cf150c0c Add a grammar editor to the UI (#4061) 2023-09-24 18:05:24 -03:00
oobabooga b227e65d86 Add grammar to llama.cpp loader (closes #4019) 2023-09-24 07:10:45 -07:00
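
llama.cpp grammars are written in GBNF; tokens that cannot extend a match against the grammar are masked out during sampling. A minimal illustrative grammar of the kind the new editor accepts (a hypothetical example, not from the repository), constraining output to a single yes/no answer:

```
# root is the start rule; output must be exactly "yes" or "no".
root ::= "yes" | "no"
```
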
saltacc f01b9aa71f Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
oobabooga 1ce3c93600 Allow "Your name" field to be saved 2023-09-14 03:44:35 -07:00
oobabooga 9f199c7a4c Use Noto Sans font 2023-09-13 13:48:05 -07:00
    Copied from 6c8bd06308/public/webfonts/NotoSans
oobabooga ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00
oobabooga cec8db52e5 Add max_tokens_second param (#3533) 2023-08-29 17:44:31 -03:00
oobabooga 52ab2a6b9e Add rope_freq_base parameter for CodeLlama 2023-08-25 06:55:15 -07:00
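
CodeLlama was trained with a RoPE frequency base of 1,000,000 rather than the usual 10,000, so it needs this parameter to behave at long contexts. An illustrative invocation (model filename hypothetical):

```
python server.py --model codellama-34b.Q4_K_M.gguf --rope_freq_base 1000000
```
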
oobabooga d6934bc7bc Implement CFG for ExLlama_HF (#3666) 2023-08-24 16:27:36 -03:00
oobabooga 7cba000421 Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) 2023-08-18 12:03:34 -03:00
oobabooga 73d9befb65 Make "Show controls" customizable through settings.yaml 2023-08-16 07:04:18 -07:00
oobabooga 2a29208224 Add a "Show controls" button to chat UI (#3590) 2023-08-16 02:39:58 -03:00
oobabooga ccfc02a28d Add the --disable_exllama option for AutoGPTQ (#3545 from clefever/disable-exllama) 2023-08-14 15:15:55 -03:00
oobabooga 619cb4e78b Add "save defaults to settings.yaml" button (#3574) 2023-08-14 11:46:07 -03:00
oobabooga 4a05aa92cb Add "send to" buttons for instruction templates 2023-08-13 18:35:45 -07:00
    - Remove instruction templates from prompt dropdowns (default/notebook)
    - Add 3 buttons to Parameters > Instruction template as a replacement
    - Increase the number of lines of the 'negative prompt' field to 3, and add a scrollbar
    - When uploading a character, switch to the Character tab
    - When uploading chat history, switch to the Chat tab
oobabooga a1a9ec895d Unify the 3 interface modes (#3554) 2023-08-13 01:12:15 -03:00
Chris Lefever 0230fa4e9c Add the --disable_exllama option for AutoGPTQ 2023-08-12 02:26:58 -04:00
oobabooga 65aa11890f Refactor everything (#3481) 2023-08-06 21:49:27 -03:00
oobabooga 0af10ab49b Add Classifier Free Guidance (CFG) for Transformers/ExLlama (#3325) 2023-08-06 17:22:48 -03:00
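
Classifier-Free Guidance runs a conditional pass and an unconditional (negative-prompt) pass, then extrapolates between them at each step. A sketch of the standard combination rule (cfg_scale = 1 leaves the logits unchanged):

```python
import torch

def cfg_combine(cond_logits: torch.Tensor, uncond_logits: torch.Tensor,
                cfg_scale: float = 1.5) -> torch.Tensor:
    # Extrapolate past the unconditional logits toward the conditional ones.
    return uncond_logits + cfg_scale * (cond_logits - uncond_logits)
```
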
missionfloyd 2336b75d92 Remove unnecessary chat.js (#3445) 2023-08-04 01:58:37 -03:00
oobabooga 0e8f9354b5 Add direct download for session/chat history JSONs 2023-08-02 19:43:39 -07:00
oobabooga e931844fe2 Add auto_max_new_tokens parameter (#3419) 2023-08-02 14:52:20 -03:00
oobabooga b17893a58f Revert "Add tensor split support for llama.cpp (#3171)" 2023-07-26 07:06:01 -07:00
    This reverts commit 031fe7225e.
oobabooga c2e0d46616 Add credits 2023-07-25 15:49:04 -07:00
Shouyi 031fe7225e Add tensor split support for llama.cpp (#3171) 2023-07-25 18:59:26 -03:00
oobabooga a07d070b6c Add llama-2-70b GGML support (#3285) 2023-07-24 16:37:03 -03:00
Gabriel Pena eedb3bf023 Add low vram mode on llama cpp (#3076) 2023-07-12 11:05:13 -03:00
Ricardo Pinto 3e9da5a27c Changed FormComponent to IOComponent (#3017) 2023-07-11 18:52:16 -03:00
    Co-authored-by: Ricardo Pinto <1-ricardo.pinto@users.noreply.gitlab.cognitage.com>
oobabooga c21b73ff37 Minor change to ui.py 2023-07-07 09:09:14 -07:00
oobabooga 333075e726 Fix #3003 2023-07-04 11:38:35 -03:00
Panchovix 10c8c197bf Add Support for Static NTK RoPE scaling for exllama/exllama_hf (#2955) 2023-07-04 01:13:16 -03:00
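
Static NTK RoPE scaling stretches the usable context by raising the RoPE base according to an alpha parameter. A sketch assuming the standard NTK-aware rule (exllama's exact code may differ):

```python
def ntk_rope_base(base: float = 10000.0, alpha: float = 2.0, head_dim: int = 128) -> float:
    # The base grows as alpha^(d / (d - 2)), where d is the head dimension.
    return base * alpha ** (head_dim / (head_dim - 2))

print(ntk_rope_base())  # ~20221: alpha=2 roughly doubles the base
```
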
oobabooga 4b1804a438 Implement sessions + add basic multi-user support (#2991) 2023-07-04 00:03:30 -03:00
oobabooga 3443219cbc Add repetition penalty range parameter to transformers (#2916) 2023-06-29 13:40:13 -03:00
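
Repetition penalty range limits the standard repetition penalty to the most recent tokens instead of the whole context. A sketch of that variant, using the Hugging Face divide-positive/multiply-negative convention:

```python
import torch

def ranged_repetition_penalty(logits: torch.Tensor, generated_ids,
                              penalty=1.15, penalty_range=1024):
    # Only tokens seen within the last `penalty_range` positions are penalized.
    for token_id in set(generated_ids[-penalty_range:]):
        score = logits[token_id]
        logits[token_id] = score / penalty if score > 0 else score * penalty
    return logits
```
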
oobabooga c52290de50 ExLlama with long context (#2875) 2023-06-25 22:49:26 -03:00
oobabooga 3ae9af01aa Add --no_use_cuda_fp16 param for AutoGPTQ 2023-06-23 12:22:56 -03:00
oobabooga 5f392122fd Add gpu_split param to ExLlama 2023-06-16 20:49:36 -03:00
    Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
oobabooga 7ef6a50e84 Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
Tom Jobbins 646b0c889f AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648) 2023-06-15 23:59:54 -03:00
oobabooga ac122832f7 Make dropdown menus more similar to automatic1111 2023-06-11 14:20:16 -03:00
oobabooga f276d88546 Use AutoGPTQ by default for GPTQ models 2023-06-05 15:41:48 -03:00
oobabooga 2f6631195a Add desc_act checkbox to the UI 2023-06-02 01:45:46 -03:00
Luis Lopez 9e7204bef4 Add tail-free and top-a sampling (#2357) 2023-05-29 21:40:01 -03:00
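
Top-a sets its cutoff from the square of the top token's probability, so it prunes aggressively when one token dominates and barely at all when the distribution is flat. A sketch of the common definition (tail-free sampling, added in the same PR, instead cuts where the curvature of the sorted distribution tails off):

```python
import torch

def top_a_filter(logits: torch.Tensor, top_a: float = 0.2) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    threshold = top_a * probs.max() ** 2  # quadratic in the top probability
    return logits.masked_fill(probs < threshold, float("-inf"))
```
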
oobabooga 1394f44e14 Add triton checkbox for AutoGPTQ 2023-05-29 15:32:45 -03:00
Honkware 204731952a Falcon support (trust-remote-code and autogptq checkboxes) (#2367) 2023-05-29 10:20:18 -03:00
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
DGdev91 cf088566f8 Make llama.cpp read prompt size and seed from settings (#2299) 2023-05-25 10:29:31 -03:00
oobabooga 361451ba60 Add --load-in-4bit parameter (#2320) 2023-05-25 01:14:13 -03:00