Commit graph

1379 commits

Author SHA1 Message Date
oobabooga 35da6b989d
Organize the parameters tab (#5767) 2024-03-28 16:45:03 -03:00
Yiximail 8c9aca239a
Fix prompt incorrectly set to empty when suffix is empty string (#5757) 2024-03-26 16:33:09 -03:00
oobabooga 2a92a842ce
Bump gradio to 4.23 (#5758) 2024-03-26 16:32:20 -03:00
oobabooga 49b111e2dd Lint 2024-03-17 08:33:23 -07:00
oobabooga d890c99b53 Fix StreamingLLM when content is removed from the beginning of the prompt 2024-03-14 09:18:54 -07:00
oobabooga d828844a6f Small fix: don't save truncation_length to settings.yaml
It should derive from model metadata or from a command-line flag.
2024-03-14 08:56:28 -07:00
oobabooga 2ef5490a36 UI: make light theme less blinding 2024-03-13 08:23:16 -07:00
oobabooga 40a60e0297 Convert attention_sink_size to int (closes #5696) 2024-03-13 08:15:49 -07:00
oobabooga edec3bf3b0 UI: avoid caching convert_to_markdown calls during streaming 2024-03-13 08:14:34 -07:00
oobabooga 8152152dd6 Small fix after 28076928ac 2024-03-11 19:56:35 -07:00
oobabooga 28076928ac
UI: Add a new "User description" field for user personality/biography (#5691) 2024-03-11 23:41:57 -03:00
oobabooga 63701f59cf UI: mention that n_gpu_layers > 0 is necessary for the GPU to be used 2024-03-11 18:54:15 -07:00
oobabooga 46031407b5 Increase the cache size of convert_to_markdown to 4096 2024-03-11 18:43:04 -07:00
oobabooga 9eca197409 Minor logging change 2024-03-11 16:31:13 -07:00
oobabooga afadc787d7 Optimize the UI by caching convert_to_markdown calls 2024-03-10 20:10:07 -07:00
oobabooga 056717923f Document StreamingLLM 2024-03-10 19:15:23 -07:00
oobabooga 15d90d9bd5 Minor logging change 2024-03-10 18:20:50 -07:00
oobabooga cf0697936a Optimize StreamingLLM by over 10x 2024-03-08 21:48:28 -08:00
oobabooga afb51bd5d6
Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
oobabooga 549bb88975 Increase height of "Custom stopping strings" UI field 2024-03-08 12:54:30 -08:00
oobabooga 238f69accc Move "Command for chat-instruct mode" to the main chat tab (closes #5634) 2024-03-08 12:52:52 -08:00
oobabooga bae14c8f13 Right-truncate long chat completion prompts instead of left-truncating
Instructions are usually at the beginning of the prompt.
2024-03-07 08:50:24 -08:00
Bartowski 104573f7d4
Update cache_4bit documentation (#5649)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-07 13:08:21 -03:00
oobabooga 2ec1d96c91
Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga 2174958362
Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga d61e31e182
Save the extensions after Gradio 4 (#5632) 2024-03-05 07:54:34 -03:00
oobabooga 63a1d4afc8
Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
oobabooga f697cb4609 Move update_wizard_windows.sh to update_wizard_windows.bat (oops) 2024-03-04 19:26:24 -08:00
kalomaze cfb25c9b3f
Cubic sampling w/ curve param (#5551)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-03 13:22:21 -03:00
oobabooga 09b13acfb2 Perplexity evaluation: print to terminal after calculation is finished 2024-02-28 19:58:21 -08:00
oobabooga 4164e29416 Block the "To create a public link, set share=True" gradio message 2024-02-25 15:06:08 -08:00
oobabooga d34126255d Fix loading extensions with "-" in the name (closes #5557) 2024-02-25 09:24:52 -08:00
oobabooga 10aedc329f Logging: more readable messages when renaming chat histories 2024-02-22 07:57:06 -08:00
oobabooga faf3bf2503 Perplexity evaluation: make UI events more robust (attempt) 2024-02-22 07:13:22 -08:00
oobabooga ac5a7a26ea Perplexity evaluation: add some informative error messages 2024-02-21 20:20:52 -08:00
oobabooga 59032140b5 Fix CFG with llamacpp_HF (2nd attempt) 2024-02-19 18:35:42 -08:00
oobabooga c203c57c18 Fix CFG with llamacpp_HF 2024-02-19 18:09:49 -08:00
oobabooga ae05d9830f Replace {{char}}, {{user}} in the chat template itself 2024-02-18 19:57:54 -08:00
oobabooga 1f27bef71b
Move chat UI elements to the right on desktop (#5538) 2024-02-18 14:32:05 -03:00
oobabooga d6bd71db7f ExLlamaV2: fix loading when autosplit is not set 2024-02-17 12:54:37 -08:00
oobabooga af0bbf5b13 Lint 2024-02-17 09:01:04 -08:00
oobabooga a6730f88f7
Add --autosplit flag for ExLlamaV2 (#5524) 2024-02-16 15:26:10 -03:00
oobabooga 4039999be5 Autodetect llamacpp_HF loader when tokenizer exists 2024-02-16 09:29:26 -08:00
oobabooga 76d28eaa9e
Add a menu for customizing the instruction template for the model (#5521) 2024-02-16 14:21:17 -03:00
oobabooga 0e1d8d5601 Instruction template: make "Send to default/notebook" work without a tokenizer 2024-02-16 08:01:07 -08:00
oobabooga 44018c2f69
Add a "llamacpp_HF creator" menu (#5519) 2024-02-16 12:43:24 -03:00
oobabooga b2b74c83a6 Fix Qwen1.5 in llamacpp_HF 2024-02-15 19:04:19 -08:00
oobabooga 080f7132c0
Revert gradio to 3.50.2 (#5513) 2024-02-15 20:40:23 -03:00
oobabooga 7123ac3f77
Remove "Maximum UI updates/second" parameter (#5507) 2024-02-14 23:34:30 -03:00
DominikKowalczyk 33c4ce0720
Bump gradio to 4.19 (#5419)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-14 23:28:26 -03:00
oobabooga b16958575f Minor bug fix 2024-02-13 19:48:32 -08:00
oobabooga d47182d9d1
llamacpp_HF: do not use oobabooga/llama-tokenizer (#5499) 2024-02-14 00:28:51 -03:00
oobabooga 069ed7c6ef Lint 2024-02-13 16:05:41 -08:00
oobabooga 86c320ab5a llama.cpp: add a progress bar for prompt evaluation 2024-02-07 21:56:10 -08:00
oobabooga c55b8ce932 Improved random preset generation 2024-02-06 08:51:52 -08:00
oobabooga 4e34ae0587 Minor logging improvements 2024-02-06 08:22:08 -08:00
oobabooga 3add2376cd Better warpers logging 2024-02-06 07:09:21 -08:00
oobabooga 494cc3c5b0 Handle empty sampler priority field, use default values 2024-02-06 07:05:32 -08:00
oobabooga 775902c1f2 Sampler priority: better logging, always save to presets 2024-02-06 06:49:22 -08:00
oobabooga acfbe6b3b3 Minor doc changes 2024-02-06 06:35:01 -08:00
oobabooga 8ee3cea7cb Improve some log messages 2024-02-06 06:31:27 -08:00
oobabooga 8a6d9abb41 Small fixes 2024-02-06 06:26:27 -08:00
oobabooga 2a1063eff5 Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478.
2024-02-06 06:21:36 -08:00
oobabooga 8c35fefb3b
Add custom sampler order support (#5443) 2024-02-06 11:20:10 -03:00
oobabooga 7301c7618f Minor change to Models tab 2024-02-04 21:49:58 -08:00
oobabooga f234fbe83f Improve a log message after previous commit 2024-02-04 21:44:53 -08:00
oobabooga 7073665a10
Truncate long chat completions inputs (#5439) 2024-02-05 02:31:24 -03:00
oobabooga 9033fa5eee Organize the Model tab 2024-02-04 19:30:22 -08:00
Forkoz 2a45620c85
Split by rows instead of layers for llama.cpp multi-gpu (#5435) 2024-02-04 23:36:40 -03:00
Badis Ghoubali 3df7e151f7
fix the n_batch slider (#5436) 2024-02-04 18:15:30 -03:00
oobabooga 4e188eeb80 Lint 2024-02-03 20:40:10 -08:00
oobabooga cde000d478
Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
kalomaze b6077b02e4
Quadratic sampling (#5403)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
Badis Ghoubali 40c7977f9b
Add roleplay.gbnf grammar (#5368) 2024-01-28 21:41:28 -03:00
sam-ngu c0bdcee646
added trust_remote_code to deepspeed init loaderClass (#5237) 2024-01-26 11:10:57 -03:00
oobabooga 87dc421ee8
Bump exllamav2 to 0.0.12 (#5352) 2024-01-22 22:40:12 -03:00
oobabooga aad73667af Lint 2024-01-22 03:25:55 -08:00
lmg-anon db1da9f98d
Fix logprobs tokens in OpenAI API (#5339) 2024-01-22 08:07:42 -03:00
Forkoz 5c5ef4cef7
UI: change n_gpu_layers maximum to 256 for larger models. (#5262) 2024-01-17 17:13:16 -03:00
ilya sheprut 4d14eb8b82
LoRA: Fix error "Attempting to unscale FP16 gradients" when training (#5268) 2024-01-17 17:11:49 -03:00
oobabooga e055967974
Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
oobabooga b3fc2cd887 UI: Do not save unchanged extension settings to settings.yaml 2024-01-10 03:48:30 -08:00
oobabooga 53dc1d8197 UI: Do not save unchanged settings to settings.yaml 2024-01-09 18:59:04 -08:00
oobabooga 89e7e107fc Lint 2024-01-09 16:27:50 -08:00
mamei16 bec4e0a1ce
Fix update event in refresh buttons (#5197) 2024-01-09 14:49:37 -03:00
oobabooga 4333d82b9d Minor bug fix 2024-01-09 06:55:18 -08:00
oobabooga 953343cced Improve the file saving/deletion menus 2024-01-09 06:33:47 -08:00
oobabooga 123f27a3c5 Load the nearest character after deleting a character
Instead of the first.
2024-01-09 06:24:27 -08:00
oobabooga b908ed318d Revert "Rename past chats -> chat history"
This reverts commit aac93a1fd6.
2024-01-09 05:26:07 -08:00
oobabooga 4ca82a4df9 Save light/dark theme on "Save UI defaults to settings.yaml" 2024-01-09 04:20:10 -08:00
oobabooga 7af50ede94 Reorder some buttons 2024-01-09 04:11:50 -08:00
oobabooga a9f49a7574 Confirm the chat history rename with enter 2024-01-09 04:00:53 -08:00
oobabooga 7bdd2118a2 Change some log messages when deleting files 2024-01-09 03:32:01 -08:00
oobabooga aac93a1fd6 Rename past chats -> chat history 2024-01-09 03:14:30 -08:00
oobabooga 615fa11af8 Move new chat button, improve history deletion handling 2024-01-08 21:22:37 -08:00
oobabooga 4f7e1eeafd
Past chat histories in a side bar on desktop (#5098)
Lots of room for improvement, but that's a start.
2024-01-09 01:57:29 -03:00
oobabooga 372ef5e2d8 Fix dynatemp parameters always visible 2024-01-08 19:42:31 -08:00
oobabooga 29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
oobabooga c4e005efec Fix dropdown menus sometimes failing to refresh 2024-01-08 17:49:54 -08:00
oobabooga 9cd2106303 Revert "Add dynamic temperature to the random preset button"
This reverts commit 4365fb890f.
2024-01-08 16:46:24 -08:00
oobabooga 4365fb890f Add dynamic temperature to the random preset button 2024-01-07 13:08:15 -08:00
oobabooga 0d07b3a6a1
Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
oobabooga b8a0b3f925 Don't print torch tensors with --verbose 2024-01-07 10:35:55 -08:00
oobabooga cf820c69c5 Print generation parameters with --verbose (HF only) 2024-01-07 10:06:23 -08:00
oobabooga c4c7fc4ab3 Lint 2024-01-07 09:36:56 -08:00
kalomaze 48327cc5c4
Dynamic Temperature HF loader support (#5174)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga 248742df1c Save extension fields to settings.yaml on "Save UI defaults" 2024-01-04 20:33:42 -08:00
oobabooga c9d814592e Increase maximum temperature value to 5 2024-01-04 17:28:15 -08:00
oobabooga e4d724eb3f Fix cache_folder bug introduced in 37eff915d6 2024-01-04 07:49:40 -08:00
Alberto Cano 37eff915d6
Use --disk-cache-dir for all caches 2024-01-04 00:27:26 -03:00
Lounger 7965f6045e
Fix loading latest history for file names with dots (#5162) 2024-01-03 22:39:41 -03:00
AstrisCantCode b80e6365d0
Fix various bugs for LoRA training (#5161) 2024-01-03 20:42:20 -03:00
oobabooga 7cce88c403 Remove an unnecessary exception 2024-01-02 07:20:59 -08:00
oobabooga 94afa0f9cf Minor style changes 2024-01-01 16:00:22 -08:00
oobabooga cbf6f9e695 Update some UI messages 2023-12-30 21:31:17 -08:00
oobabooga 2aad91f3c9
Remove deprecated command-line flags (#5131) 2023-12-31 02:07:48 -03:00
oobabooga 2734ce3e4c
Remove RWKV loader (#5130) 2023-12-31 02:01:40 -03:00
oobabooga 0e54a09bcb
Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga 8e397915c9
Remove --sdp-attention, --xformers flags (#5126) 2023-12-31 01:36:51 -03:00
B611 b7dd1f9542
Specify utf-8 encoding for model metadata file open (#5125) 2023-12-31 01:34:32 -03:00
oobabooga c06f630bcc Increase max_updates_second maximum value 2023-12-24 13:29:47 -08:00
oobabooga 8c60495878 UI: add "Maximum UI updates/second" parameter 2023-12-24 09:17:40 -08:00
zhangningboo 1b8b61b928
Fix output_ids decoding for Qwen/Qwen-7B-Chat (#5045) 2023-12-22 23:11:02 -03:00
Yiximail afc91edcb2
Reset the model_name after unloading the model (#5051) 2023-12-22 22:18:24 -03:00
oobabooga 2706149c65
Organize the CMD arguments by group (#5027) 2023-12-21 00:33:55 -03:00
oobabooga c727a70572 Remove redundancy from modules/loaders.py 2023-12-20 19:18:07 -08:00
luna 6efbe3009f
let exllama v1 models load safetensor loras (#4854) 2023-12-20 13:29:19 -03:00
oobabooga bcba200790 Fix EOS being ignored in ExLlamav2 after previous commit 2023-12-20 07:54:06 -08:00
oobabooga f0f6d9bdf9 Add HQQ back & update version
This reverts commit 2289e9031e.
2023-12-20 07:46:09 -08:00
oobabooga b15f510154 Optimize ExLlamav2 (non-HF) loader 2023-12-20 07:31:42 -08:00
oobabooga fadb295d4d Lint 2023-12-19 21:36:57 -08:00
oobabooga fb8ee9f7ff Add a specific error if HQQ is missing 2023-12-19 21:32:58 -08:00
oobabooga 9992f7d8c0 Improve several log messages 2023-12-19 20:54:32 -08:00
oobabooga 23818dc098 Better logger
Credits: vladmandic/automatic
2023-12-19 20:38:33 -08:00
oobabooga 95600073bc Add an informative error when extension requirements are missing 2023-12-19 20:20:45 -08:00
oobabooga d8279dc710 Replace character name placeholders in chat context (closes #5007) 2023-12-19 17:31:46 -08:00
oobabooga e83e6cedbe Organize the model menu 2023-12-19 13:18:26 -08:00
oobabooga f4ae0075e8 Fix conversion from old template format to jinja2 2023-12-19 13:16:52 -08:00
oobabooga de138b8ba6
Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga 0a299d5959
Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
oobabooga 83cf1a6b67 Fix Yi space issue (closes #4996) 2023-12-19 07:54:19 -08:00
oobabooga 9847809a7a Add a warning about ppl evaluation without --no_use_fast 2023-12-18 18:09:24 -08:00
oobabooga f6d701624c UI: mention that QuIP# does not work on Windows 2023-12-18 18:05:02 -08:00
oobabooga a23a004434 Update the example template 2023-12-18 17:47:35 -08:00
Water 674be9a09a
Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga 1f9e25e76a UI: update "Saved instruction templates" dropdown after loading template 2023-12-17 21:19:06 -08:00
oobabooga da1c8d77ea Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-12-17 21:05:10 -08:00
oobabooga cac89df97b Instruction templates: better handle unwanted bos tokens 2023-12-17 21:04:30 -08:00
oobabooga f0d6ead877
llama.cpp: read instruction template from GGUF metadata (#4975) 2023-12-18 01:51:58 -03:00
oobabooga f1f2c4c3f4
Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00