Commit graph

225 commits

Author SHA1 Message Date
oobabooga ae54d8faaa New llama.cpp loader (#6846) 2025-04-18 09:59:37 -03:00
oobabooga 5bcd2d7ad0 Add the top N-sigma sampler (#6796) 2025-03-14 16:45:11 -03:00
oobabooga 0360f54ae8 UI: add a "Show after" parameter (to use with DeepSeek </think>) 2025-02-02 15:30:09 -08:00
oobabooga a5d64b586d Add a "copy" button below each message (#6654) 2025-01-11 16:59:21 -03:00
oobabooga 83c426e96b Organize internals (#6646) 2025-01-10 18:04:32 -03:00
oobabooga 7157257c3f Remove the AutoGPTQ loader (#6641) 2025-01-08 19:28:56 -03:00
oobabooga c0f600c887 Add a --torch-compile flag for transformers 2025-01-05 05:47:00 -08:00
oobabooga 11af199aff Add a "Static KV cache" option for transformers 2025-01-04 17:52:57 -08:00
oobabooga 4b3e1b3757 UI: add a "Search chats" input field 2025-01-02 18:46:40 -08:00
oobabooga b051e2c161 UI: improve a margin for readability 2024-12-17 19:58:21 -08:00
Diner Burger addad3c63e Allow more granular KV cache settings (#6561) 2024-12-17 17:43:48 -03:00
oobabooga c43ee5db11 UI: very minor color change 2024-12-17 07:59:55 -08:00
oobabooga d769618591 Improved UI (#6575) 2024-12-17 00:47:41 -03:00
oobabooga 93c250b9b6 Add a UI element for enable_tp 2024-10-01 11:16:15 -07:00
Philipp Emanuel Weidmann 301375834e Exclude Top Choices (XTC): A sampler that boosts creativity, breaks writing clichés, and inhibits non-verbatim repetition (#6335) 2024-09-27 22:50:12 -03:00
oobabooga e6181e834a Remove AutoAWQ as a standalone loader
(it works better through transformers)
2024-07-23 15:31:17 -07:00
oobabooga aa809e420e Bump llama-cpp-python to 0.2.83, add back tensorcore wheels
Also add back the progress bar patch
2024-07-22 18:05:11 -07:00
oobabooga 11bbf71aa5 Bump back llama-cpp-python (#6257) 2024-07-22 16:19:41 -03:00
oobabooga 0f53a736c1 Revert the llama-cpp-python update 2024-07-22 12:02:25 -07:00
oobabooga a687f950ba Remove the tensorcores llama.cpp wheels
They are not faster than the default wheels anymore and they use a lot of space.
2024-07-22 11:54:35 -07:00
oobabooga f2d802e707 UI: make Default/Notebook contents persist on page reload 2024-07-22 11:07:10 -07:00
oobabooga 79e8dbe45f UI: minor optimization 2024-07-21 22:06:49 -07:00
oobabooga 17df2d7bdf UI: don't export the instruction template on "Save UI defaults to settings.yaml" 2024-07-21 10:45:01 -07:00
oobabooga 916d1d8283 UI: improve the style of code blocks in light theme 2024-07-20 20:32:57 -07:00
oobabooga 79c4d3da3d Optimize the UI (#6251) 2024-07-21 00:01:42 -03:00
oobabooga e436d69e2b Add --no_xformers and --no_sdpa flags for ExllamaV2 2024-07-11 15:47:37 -07:00
GralchemOz 8a39f579d8 transformers: Add eager attention option to make Gemma-2 work properly (#6188) 2024-07-01 12:08:08 -03:00
oobabooga da196707cf UI: improve the light theme a bit 2024-06-27 21:05:38 -07:00
oobabooga 577a8cd3ee Add TensorRT-LLM support (#5715) 2024-06-24 02:30:03 -03:00
Forkoz 1d79aa67cf Fix flash-attn UI parameter to actually store true (#6076) 2024-06-13 00:34:54 -03:00
oobabooga 2d196ed2fe Remove obsolete pre_layer parameter 2024-06-12 18:56:44 -07:00
oobabooga 9e189947d1 Minor fix after bd7cc4234d (thanks @belladoreai) 2024-05-21 10:37:30 -07:00
Philipp Emanuel Weidmann 852c943769 DRY: A modern repetition penalty that reliably prevents looping (#5677) 2024-05-19 23:53:47 -03:00
oobabooga e61055253c Bump llama-cpp-python to 0.2.69, add --flash-attn option 2024-05-03 04:31:22 -07:00
oobabooga 51fb766bea Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 09:11:31 -03:00
oobabooga 70845c76fb Add back the max_updates_second parameter (#5937) 2024-04-26 10:14:51 -03:00
oobabooga 6761b5e7c6 Improved instruct style (with syntax highlighting & LaTeX rendering) (#5936) 2024-04-26 10:13:11 -03:00
oobabooga f0538efb99 Remove obsolete --tensorcores references 2024-04-24 00:31:28 -07:00
Ashley Kleynhans 70c637bf90 Fix saving of UI defaults to settings.yaml - Fixes #5592 (#5794) 2024-04-11 18:19:16 -03:00
oobabooga 35da6b989d Organize the parameters tab (#5767) 2024-03-28 16:45:03 -03:00
oobabooga 2a92a842ce Bump gradio to 4.23 (#5758) 2024-03-26 16:32:20 -03:00
oobabooga d828844a6f Small fix: don't save truncation_length to settings.yaml
It should derive from model metadata or from a command-line flag.
2024-03-14 08:56:28 -07:00
oobabooga 2ef5490a36 UI: make light theme less blinding 2024-03-13 08:23:16 -07:00
oobabooga 28076928ac UI: Add a new "User description" field for user personality/biography (#5691) 2024-03-11 23:41:57 -03:00
oobabooga afb51bd5d6 Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
oobabooga 2ec1d96c91 Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga 2174958362 Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga 63a1d4afc8 Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
kalomaze cfb25c9b3f Cubic sampling w/ curve param (#5551)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-03 13:22:21 -03:00
oobabooga a6730f88f7 Add --autosplit flag for ExLlamaV2 (#5524) 2024-02-16 15:26:10 -03:00
oobabooga 080f7132c0 Revert gradio to 3.50.2 (#5513) 2024-02-15 20:40:23 -03:00
oobabooga 7123ac3f77 Remove "Maximum UI updates/second" parameter (#5507) 2024-02-14 23:34:30 -03:00
oobabooga 8c35fefb3b Add custom sampler order support (#5443) 2024-02-06 11:20:10 -03:00
Forkoz 2a45620c85 Split by rows instead of layers for llama.cpp multi-gpu (#5435) 2024-02-04 23:36:40 -03:00
kalomaze b6077b02e4 Quadratic sampling (#5403)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
oobabooga e055967974 Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
oobabooga b3fc2cd887 UI: Do not save unchanged extension settings to settings.yaml 2024-01-10 03:48:30 -08:00
oobabooga 53dc1d8197 UI: Do not save unchanged settings to settings.yaml 2024-01-09 18:59:04 -08:00
mamei16 bec4e0a1ce Fix update event in refresh buttons (#5197) 2024-01-09 14:49:37 -03:00
oobabooga 4ca82a4df9 Save light/dark theme on "Save UI defaults to settings.yaml" 2024-01-09 04:20:10 -08:00
oobabooga 29c2693ea0 dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
oobabooga c4e005efec Fix dropdown menus sometimes failing to refresh 2024-01-08 17:49:54 -08:00
oobabooga 0d07b3a6a1 Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
kalomaze 48327cc5c4 Dynamic Temperature HF loader support (#5174)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga 248742df1c Save extension fields to settings.yaml on "Save UI defaults" 2024-01-04 20:33:42 -08:00
oobabooga 8c60495878 UI: add "Maximum UI updates/second" parameter 2023-12-24 09:17:40 -08:00
oobabooga de138b8ba6 Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga 0a299d5959 Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
Water 674be9a09a Add HQQ quant loader (#4888)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga f1f2c4c3f4 Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga 3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga 39d2fe1ed9 Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga 5fcee696ea New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
oobabooga e0ca49ed9c Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt
* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga 9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga 8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga 923c8e25fb Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga 6e2e0317af Separate context and system message in instruction formats (#4499) 2023-11-07 20:02:58 -03:00
oobabooga af3d25a503 Disable logits_all in llamacpp_HF (makes processing 3x faster) 2023-11-07 14:35:48 -08:00
feng lui 4766a57352 transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
oobabooga aa5d671579 Add temperature_last parameter (#4472) 2023-11-04 13:09:07 -03:00
kalomaze 367e5e6e43 Implement Min P as a sampler option in HF loaders (#4449) 2023-11-02 16:32:51 -03:00
oobabooga c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
Abhilash Majumder 778a010df8 Intel GPU support initialization (#4340) 2023-10-26 23:39:51 -03:00
tdrussell 72f6fc6923 Rename additive_repetition_penalty to presence_penalty, add frequency_penalty (#4376) 2023-10-25 12:10:28 -03:00
tdrussell 4440f87722 Add additive_repetition_penalty sampler setting (#3627) 2023-10-23 02:28:07 -03:00
oobabooga df90d03e0b Replace --mul_mat_q with --no_mul_mat_q 2023-10-22 12:23:03 -07:00
oobabooga fae8062d39 Bump to latest gradio (3.47) (#4258) 2023-10-10 22:20:49 -03:00
oobabooga b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
jllllll 41a2de96e5 Bump llama-cpp-python to 0.2.11 2023-10-01 18:08:10 -05:00
StoyanStAtanasov 7e6ff8d1f0 Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga 1ca54faaf0 Improve --multi-user mode 2023-09-26 06:42:33 -07:00
oobabooga d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga b973b91d73 Automatically filter by loader (closes #4072) 2023-09-25 10:28:35 -07:00
oobabooga 08cf150c0c Add a grammar editor to the UI (#4061) 2023-09-24 18:05:24 -03:00
oobabooga b227e65d86 Add grammar to llama.cpp loader (closes #4019) 2023-09-24 07:10:45 -07:00
saltacc f01b9aa71f Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
oobabooga 1ce3c93600 Allow "Your name" field to be saved 2023-09-14 03:44:35 -07:00
oobabooga 9f199c7a4c Use Noto Sans font
Copied from 6c8bd06308/public/webfonts/NotoSans
2023-09-13 13:48:05 -07:00
oobabooga ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00