Commit graph

1988 commits

Author SHA1 Message Date
oobabooga 0f3a423de1 Alternative solution to "get next logits" deadlock (#6106) 2024-06-13 19:34:16 -07:00
oobabooga 386500aa37 Avoid unnecessary UI -> backend calls, to make the UI faster 2024-06-12 20:52:42 -07:00
Forkoz 1d79aa67cf
Fix flash-attn UI parameter to actually store true. (#6076) 2024-06-13 00:34:54 -03:00
Belladore 3abafee696
DRY sampler improvements (#6053) 2024-06-12 23:39:11 -03:00
oobabooga a36fa73071 Lint 2024-06-12 19:00:21 -07:00
oobabooga 2d196ed2fe Remove obsolete pre_layer parameter 2024-06-12 18:56:44 -07:00
Belladore 46174a2d33
Fix error when bos_token_id is None. (#6061) 2024-06-12 22:52:27 -03:00
Belladore a363cdfca1
Fix missing bos token for some models (including Llama-3) (#6050) 2024-05-27 09:21:30 -03:00
oobabooga 8df68b05e9 Remove MinPLogitsWarper (it's now a transformers built-in) 2024-05-27 05:03:30 -07:00
oobabooga 4f1e96b9e3 Downloader: Add --model-dir argument, respect --model-dir in the UI 2024-05-23 20:42:46 -07:00
oobabooga ad54d524f7 Revert "Fix stopping strings for llama-3 and phi (#6043)"
This reverts commit 5499bc9bc8.
2024-05-22 17:18:08 -07:00
oobabooga 5499bc9bc8
Fix stopping strings for llama-3 and phi (#6043) 2024-05-22 13:53:59 -03:00
oobabooga 9e189947d1 Minor fix after bd7cc4234d (thanks @belladoreai) 2024-05-21 10:37:30 -07:00
oobabooga ae86292159 Fix getting Phi-3-small-128k-instruct logits 2024-05-21 10:35:00 -07:00
oobabooga bd7cc4234d
Backend cleanup (#6025) 2024-05-21 13:32:02 -03:00
Philipp Emanuel Weidmann 852c943769
DRY: A modern repetition penalty that reliably prevents looping (#5677) 2024-05-19 23:53:47 -03:00
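A hedged sketch of the DRY penalty as described in the pull request (not the repository's exact code): a token that would extend a sequence already repeated in the context is penalized exponentially in the length of that repetition, and the penalty is subtracted from the token's logit.

def dry_penalty(match_len: int, multiplier: float = 0.8,
                base: float = 1.75, allowed_length: int = 2) -> float:
    # Repetitions shorter than allowed_length pass unpenalized.
    if match_len < allowed_length:
        return 0.0
    # The penalty grows exponentially with the repeated sequence's length.
    return multiplier * base ** (match_len - allowed_length)
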
oobabooga 9f77ed1b98
--idle-timeout flag to unload the model if unused for N minutes (#6026) 2024-05-19 23:29:39 -03:00
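A hypothetical sketch of the idle-timeout mechanism (names and structure assumed, not taken from the repository): a background thread compares the time of last use against the timeout and unloads the model when it expires.

import threading
import time

last_used = time.monotonic()  # the generation code would refresh this on each request

def idle_watchdog(timeout_minutes: float, unload_model) -> None:
    while True:
        time.sleep(60)  # poll once a minute
        if time.monotonic() - last_used > timeout_minutes * 60:
            unload_model()

threading.Thread(target=idle_watchdog, args=(10, lambda: None), daemon=True).start()
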
altoiddealer 818b4e0354
Let grammar escape backslashes (#5865) 2024-05-19 20:26:09 -03:00
Tisjwlf 907702c204
Fix gguf multipart file loading (#5857) 2024-05-19 20:22:09 -03:00
A0nameless0man 5cb59707f3
fix: grammar not support utf-8 (#5900) 2024-05-19 20:10:39 -03:00
Samuel Wein b63dc4e325
UI: Warn user if they are trying to load a model from no path (#6006) 2024-05-19 20:05:17 -03:00
chr 6b546a2c8b
llama.cpp: increase the max threads from 32 to 256 (#5889) 2024-05-19 20:02:19 -03:00
oobabooga a38a37b3b3 llama.cpp: default n_gpu_layers to the maximum value for the model automatically 2024-05-19 10:57:42 -07:00
oobabooga a4611232b7 Make --verbose output less spammy 2024-05-18 09:57:00 -07:00
oobabooga e9c9483171 Improve the logging messages while loading models 2024-05-03 08:10:44 -07:00
oobabooga e61055253c Bump llama-cpp-python to 0.2.69, add --flash-attn option 2024-05-03 04:31:22 -07:00
oobabooga 51fb766bea
Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964) 2024-04-30 09:11:31 -03:00
oobabooga dfdb6fee22 Set llm_int8_enable_fp32_cpu_offload=True for --load-in-4bit
To allow for 32-bit CPU offloading (it's very slow).
2024-04-26 09:39:27 -07:00
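A minimal sketch of what this flag maps to in the public transformers API (model id hypothetical): with fp32 CPU offload enabled, layers that do not fit on the GPU stay on the CPU in 32-bit.

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    llm_int8_enable_fp32_cpu_offload=True,  # allows CPU offloading; very slow
)
model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-model",  # hypothetical model id
    quantization_config=quant_config,
    device_map="auto",
)
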
oobabooga 70845c76fb
Add back the max_updates_second parameter (#5937) 2024-04-26 10:14:51 -03:00
oobabooga 6761b5e7c6
Improved instruct style (with syntax highlighting & LaTeX rendering) (#5936) 2024-04-26 10:13:11 -03:00
oobabooga 4094813f8d Lint 2024-04-24 09:53:41 -07:00
oobabooga 64e2a9a0a7 Fix the Phi-3 template when used in the UI 2024-04-24 01:34:11 -07:00
oobabooga f0538efb99 Remove obsolete --tensorcores references 2024-04-24 00:31:28 -07:00
Colin f3c9103e04
Revert walrus operator for params['max_memory'] (#5878) 2024-04-24 01:09:14 -03:00
oobabooga 9b623b8a78
Bump llama-cpp-python to 0.2.64, use official wheels (#5921) 2024-04-23 23:17:05 -03:00
oobabooga f27e1ba302
Add a /v1/internal/chat-prompt endpoint (#5879) 2024-04-19 00:24:46 -03:00
oobabooga e158299fb4 Fix loading sharded GGUF models through llamacpp_HF 2024-04-11 14:50:05 -07:00
wangshuai09 fd4e46bce2
Add Ascend NPU support (basic) (#5541) 2024-04-11 18:42:20 -03:00
Ashley Kleynhans 70c637bf90
Fix saving of UI defaults to settings.yaml - Fixes #5592 (#5794) 2024-04-11 18:19:16 -03:00
oobabooga 3e3a7c4250 Bump llama-cpp-python to 0.2.61 & fix the crash 2024-04-11 14:15:34 -07:00
Victorivus c423d51a83
Fix issue #5783 for character images with transparency (#5827) 2024-04-11 02:23:43 -03:00
Alex O'Connell b94cd6754e
UI: Respect model and lora directory settings when downloading files (#5842) 2024-04-11 01:55:02 -03:00
oobabooga 17c4319e2d Fix loading command-r context length metadata 2024-04-10 21:39:59 -07:00
oobabooga cbd65ba767
Add a simple min_p preset, make it the default (#5836) 2024-04-09 12:50:16 -03:00
oobabooga d02744282b Minor logging change 2024-04-06 18:56:58 -07:00
oobabooga dd6e4ac55f Prevent double <BOS_TOKEN> with Command R+ 2024-04-06 13:14:32 -07:00
oobabooga 1bdceea2d4 UI: Focus on the chat input after starting a new chat 2024-04-06 12:57:57 -07:00
oobabooga 168a0f4f67 UI: do not load the "gallery" extension by default 2024-04-06 12:43:21 -07:00
oobabooga 64a76856bd Metadata: Fix loading Command R+ template with multiple options 2024-04-06 07:32:17 -07:00
oobabooga 1b87844928 Minor fix 2024-04-05 18:43:43 -07:00
oobabooga 6b7f7555fc Logging message to make transformers loader a bit more transparent 2024-04-05 18:40:02 -07:00
oobabooga 0f536dd97d UI: Fix the "Show controls" action 2024-04-05 12:18:33 -07:00
oobabooga 308452b783 Bitsandbytes: load preconverted 4bit models without additional flags 2024-04-04 18:10:24 -07:00
oobabooga d423021a48
Remove CTransformers support (#5807) 2024-04-04 20:23:58 -03:00
oobabooga 13fe38eb27 Remove specialized code for gpt-4chan 2024-04-04 16:11:47 -07:00
oobabooga 9ab7365b56 Read rope_theta for DBRX model (thanks turboderp) 2024-04-01 20:25:31 -07:00
oobabooga db5f6cd1d8 Fix ExLlamaV2 loaders using unnecessary "bits" metadata 2024-03-30 21:51:39 -07:00
oobabooga 624faa1438 Fix ExLlamaV2 context length setting (closes #5750) 2024-03-30 21:33:16 -07:00
oobabooga 9653a9176c Minor improvements to Parameters tab 2024-03-29 10:41:24 -07:00
oobabooga 35da6b989d
Organize the parameters tab (#5767) 2024-03-28 16:45:03 -03:00
Yiximail 8c9aca239a
Fix prompt incorrectly set to empty when suffix is empty string (#5757) 2024-03-26 16:33:09 -03:00
oobabooga 2a92a842ce
Bump gradio to 4.23 (#5758) 2024-03-26 16:32:20 -03:00
oobabooga 49b111e2dd Lint 2024-03-17 08:33:23 -07:00
oobabooga d890c99b53 Fix StreamingLLM when content is removed from the beginning of the prompt 2024-03-14 09:18:54 -07:00
oobabooga d828844a6f Small fix: don't save truncation_length to settings.yaml
It should derive from model metadata or from a command-line flag.
2024-03-14 08:56:28 -07:00
oobabooga 2ef5490a36 UI: make light theme less blinding 2024-03-13 08:23:16 -07:00
oobabooga 40a60e0297 Convert attention_sink_size to int (closes #5696) 2024-03-13 08:15:49 -07:00
oobabooga edec3bf3b0 UI: avoid caching convert_to_markdown calls during streaming 2024-03-13 08:14:34 -07:00
oobabooga 8152152dd6 Small fix after 28076928ac 2024-03-11 19:56:35 -07:00
oobabooga 28076928ac
UI: Add a new "User description" field for user personality/biography (#5691) 2024-03-11 23:41:57 -03:00
oobabooga 63701f59cf UI: mention that n_gpu_layers > 0 is necessary for the GPU to be used 2024-03-11 18:54:15 -07:00
oobabooga 46031407b5 Increase the cache size of convert_to_markdown to 4096 2024-03-11 18:43:04 -07:00
oobabooga 9eca197409 Minor logging change 2024-03-11 16:31:13 -07:00
oobabooga afadc787d7 Optimize the UI by caching convert_to_markdown calls 2024-03-10 20:10:07 -07:00
oobabooga 056717923f Document StreamingLLM 2024-03-10 19:15:23 -07:00
oobabooga 15d90d9bd5 Minor logging change 2024-03-10 18:20:50 -07:00
oobabooga cf0697936a Optimize StreamingLLM by over 10x 2024-03-08 21:48:28 -08:00
oobabooga afb51bd5d6
Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
oobabooga 549bb88975 Increase height of "Custom stopping strings" UI field 2024-03-08 12:54:30 -08:00
oobabooga 238f69accc Move "Command for chat-instruct mode" to the main chat tab (closes #5634) 2024-03-08 12:52:52 -08:00
oobabooga bae14c8f13 Right-truncate long chat completion prompts instead of left-truncating
Instructions are usually at the beginning of the prompt.
2024-03-07 08:50:24 -08:00
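A generic illustration of the change (not the repository's code): right-truncation keeps the beginning of the prompt, where the instructions usually live, instead of keeping the end.

def truncate_right(tokens: list, max_length: int) -> list:
    return tokens[:max_length]   # keep the beginning of the prompt

def truncate_left(tokens: list, max_length: int) -> list:
    return tokens[-max_length:]  # keep the end of the prompt
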
Bartowski 104573f7d4
Update cache_4bit documentation (#5649)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-07 13:08:21 -03:00
oobabooga 2ec1d96c91
Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga 2174958362
Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga d61e31e182
Save the extensions after Gradio 4 (#5632) 2024-03-05 07:54:34 -03:00
oobabooga 63a1d4afc8
Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
oobabooga f697cb4609 Move update_wizard_windows.sh to update_wizard_windows.bat (oops) 2024-03-04 19:26:24 -08:00
kalomaze cfb25c9b3f
Cubic sampling w/ curve param (#5551)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-03 13:22:21 -03:00
oobabooga 09b13acfb2 Perplexity evaluation: print to terminal after calculation is finished 2024-02-28 19:58:21 -08:00
oobabooga 4164e29416 Block the "To create a public link, set share=True" gradio message 2024-02-25 15:06:08 -08:00
oobabooga d34126255d Fix loading extensions with "-" in the name (closes #5557) 2024-02-25 09:24:52 -08:00
oobabooga 10aedc329f Logging: more readable messages when renaming chat histories 2024-02-22 07:57:06 -08:00
oobabooga faf3bf2503 Perplexity evaluation: make UI events more robust (attempt) 2024-02-22 07:13:22 -08:00
oobabooga ac5a7a26ea Perplexity evaluation: add some informative error messages 2024-02-21 20:20:52 -08:00
oobabooga 59032140b5 Fix CFG with llamacpp_HF (2nd attempt) 2024-02-19 18:35:42 -08:00
oobabooga c203c57c18 Fix CFG with llamacpp_HF 2024-02-19 18:09:49 -08:00
oobabooga ae05d9830f Replace {{char}}, {{user}} in the chat template itself 2024-02-18 19:57:54 -08:00
oobabooga 1f27bef71b
Move chat UI elements to the right on desktop (#5538) 2024-02-18 14:32:05 -03:00
oobabooga d6bd71db7f ExLlamaV2: fix loading when autosplit is not set 2024-02-17 12:54:37 -08:00
oobabooga af0bbf5b13 Lint 2024-02-17 09:01:04 -08:00
oobabooga a6730f88f7
Add --autosplit flag for ExLlamaV2 (#5524) 2024-02-16 15:26:10 -03:00
oobabooga 4039999be5 Autodetect llamacpp_HF loader when tokenizer exists 2024-02-16 09:29:26 -08:00
oobabooga 76d28eaa9e
Add a menu for customizing the instruction template for the model (#5521) 2024-02-16 14:21:17 -03:00
oobabooga 0e1d8d5601 Instruction template: make "Send to default/notebook" work without a tokenizer 2024-02-16 08:01:07 -08:00
oobabooga 44018c2f69
Add a "llamacpp_HF creator" menu (#5519) 2024-02-16 12:43:24 -03:00
oobabooga b2b74c83a6 Fix Qwen1.5 in llamacpp_HF 2024-02-15 19:04:19 -08:00
oobabooga 080f7132c0
Revert gradio to 3.50.2 (#5513) 2024-02-15 20:40:23 -03:00
oobabooga 7123ac3f77
Remove "Maximum UI updates/second" parameter (#5507) 2024-02-14 23:34:30 -03:00
DominikKowalczyk 33c4ce0720
Bump gradio to 4.19 (#5419)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-14 23:28:26 -03:00
oobabooga b16958575f Minor bug fix 2024-02-13 19:48:32 -08:00
oobabooga d47182d9d1
llamacpp_HF: do not use oobabooga/llama-tokenizer (#5499) 2024-02-14 00:28:51 -03:00
oobabooga 069ed7c6ef Lint 2024-02-13 16:05:41 -08:00
oobabooga 86c320ab5a llama.cpp: add a progress bar for prompt evaluation 2024-02-07 21:56:10 -08:00
oobabooga c55b8ce932 Improved random preset generation 2024-02-06 08:51:52 -08:00
oobabooga 4e34ae0587 Minor logging improvements 2024-02-06 08:22:08 -08:00
oobabooga 3add2376cd Better warpers logging 2024-02-06 07:09:21 -08:00
oobabooga 494cc3c5b0 Handle empty sampler priority field, use default values 2024-02-06 07:05:32 -08:00
oobabooga 775902c1f2 Sampler priority: better logging, always save to presets 2024-02-06 06:49:22 -08:00
oobabooga acfbe6b3b3 Minor doc changes 2024-02-06 06:35:01 -08:00
oobabooga 8ee3cea7cb Improve some log messages 2024-02-06 06:31:27 -08:00
oobabooga 8a6d9abb41 Small fixes 2024-02-06 06:26:27 -08:00
oobabooga 2a1063eff5 Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478.
2024-02-06 06:21:36 -08:00
oobabooga 8c35fefb3b
Add custom sampler order support (#5443) 2024-02-06 11:20:10 -03:00
oobabooga 7301c7618f Minor change to Models tab 2024-02-04 21:49:58 -08:00
oobabooga f234fbe83f Improve a log message after previous commit 2024-02-04 21:44:53 -08:00
oobabooga 7073665a10
Truncate long chat completions inputs (#5439) 2024-02-05 02:31:24 -03:00
oobabooga 9033fa5eee Organize the Model tab 2024-02-04 19:30:22 -08:00
Forkoz 2a45620c85
Split by rows instead of layers for llama.cpp multi-gpu (#5435) 2024-02-04 23:36:40 -03:00
Badis Ghoubali 3df7e151f7
fix the n_batch slider (#5436) 2024-02-04 18:15:30 -03:00
oobabooga 4e188eeb80 Lint 2024-02-03 20:40:10 -08:00
oobabooga cde000d478
Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
kalomaze b6077b02e4
Quadratic sampling (#5403)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
Badis Ghoubali 40c7977f9b
Add roleplay.gbnf grammar (#5368) 2024-01-28 21:41:28 -03:00
sam-ngu c0bdcee646
added trust_remote_code to deepspeed init loaderClass (#5237) 2024-01-26 11:10:57 -03:00
oobabooga 87dc421ee8
Bump exllamav2 to 0.0.12 (#5352) 2024-01-22 22:40:12 -03:00
oobabooga aad73667af Lint 2024-01-22 03:25:55 -08:00
lmg-anon db1da9f98d
Fix logprobs tokens in OpenAI API (#5339) 2024-01-22 08:07:42 -03:00
Forkoz 5c5ef4cef7
UI: change n_gpu_layers maximum to 256 for larger models. (#5262) 2024-01-17 17:13:16 -03:00
ilya sheprut 4d14eb8b82
LoRA: Fix error "Attempting to unscale FP16 gradients" when training (#5268) 2024-01-17 17:11:49 -03:00
oobabooga e055967974
Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
oobabooga b3fc2cd887 UI: Do not save unchanged extension settings to settings.yaml 2024-01-10 03:48:30 -08:00
oobabooga 53dc1d8197 UI: Do not save unchanged settings to settings.yaml 2024-01-09 18:59:04 -08:00
oobabooga 89e7e107fc Lint 2024-01-09 16:27:50 -08:00
mamei16 bec4e0a1ce
Fix update event in refresh buttons (#5197) 2024-01-09 14:49:37 -03:00
oobabooga 4333d82b9d Minor bug fix 2024-01-09 06:55:18 -08:00
oobabooga 953343cced Improve the file saving/deletion menus 2024-01-09 06:33:47 -08:00
oobabooga 123f27a3c5 Load the nearest character after deleting a character
Instead of the first.
2024-01-09 06:24:27 -08:00
oobabooga b908ed318d Revert "Rename past chats -> chat history"
This reverts commit aac93a1fd6.
2024-01-09 05:26:07 -08:00
oobabooga 4ca82a4df9 Save light/dark theme on "Save UI defaults to settings.yaml" 2024-01-09 04:20:10 -08:00
oobabooga 7af50ede94 Reorder some buttons 2024-01-09 04:11:50 -08:00
oobabooga a9f49a7574 Confirm the chat history rename with enter 2024-01-09 04:00:53 -08:00
oobabooga 7bdd2118a2 Change some log messages when deleting files 2024-01-09 03:32:01 -08:00
oobabooga aac93a1fd6 Rename past chats -> chat history 2024-01-09 03:14:30 -08:00
oobabooga 615fa11af8 Move new chat button, improve history deletion handling 2024-01-08 21:22:37 -08:00
oobabooga 4f7e1eeafd
Past chat histories in a side bar on desktop (#5098)
Lots of room for improvement, but that's a start.
2024-01-09 01:57:29 -03:00
oobabooga 372ef5e2d8 Fix dynatemp parameters always visible 2024-01-08 19:42:31 -08:00
oobabooga 29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
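A hedged sketch of dynamic temperature as commonly described (the repository's implementation may differ): the effective temperature is interpolated between dynatemp_low and dynatemp_high according to the normalized entropy of the distribution, shaped by dynatemp_exponent. Assumes a 1-D logits vector.

import math

import torch

def dynamic_temperature(logits: torch.Tensor, low: float, high: float,
                        exponent: float) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
    norm_entropy = entropy / math.log(probs.numel())  # 0 = certain, 1 = uniform
    temperature = low + (high - low) * norm_entropy ** exponent
    return logits / temperature
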
oobabooga c4e005efec Fix dropdown menus sometimes failing to refresh 2024-01-08 17:49:54 -08:00
oobabooga 9cd2106303 Revert "Add dynamic temperature to the random preset button"
This reverts commit 4365fb890f.
2024-01-08 16:46:24 -08:00
oobabooga 4365fb890f Add dynamic temperature to the random preset button 2024-01-07 13:08:15 -08:00
oobabooga 0d07b3a6a1
Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
oobabooga b8a0b3f925 Don't print torch tensors with --verbose 2024-01-07 10:35:55 -08:00
oobabooga cf820c69c5 Print generation parameters with --verbose (HF only) 2024-01-07 10:06:23 -08:00
oobabooga c4c7fc4ab3 Lint 2024-01-07 09:36:56 -08:00
kalomaze 48327cc5c4
Dynamic Temperature HF loader support (#5174)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga 248742df1c Save extension fields to settings.yaml on "Save UI defaults" 2024-01-04 20:33:42 -08:00
oobabooga c9d814592e Increase maximum temperature value to 5 2024-01-04 17:28:15 -08:00
oobabooga e4d724eb3f Fix cache_folder bug introduced in 37eff915d6 2024-01-04 07:49:40 -08:00
Alberto Cano 37eff915d6
Use --disk-cache-dir for all caches 2024-01-04 00:27:26 -03:00
Lounger 7965f6045e
Fix loading latest history for file names with dots (#5162) 2024-01-03 22:39:41 -03:00
AstrisCantCode b80e6365d0
Fix various bugs for LoRA training (#5161) 2024-01-03 20:42:20 -03:00
oobabooga 7cce88c403 Remove an unnecessary exception 2024-01-02 07:20:59 -08:00
oobabooga 94afa0f9cf Minor style changes 2024-01-01 16:00:22 -08:00
oobabooga cbf6f9e695 Update some UI messages 2023-12-30 21:31:17 -08:00
oobabooga 2aad91f3c9
Remove deprecated command-line flags (#5131) 2023-12-31 02:07:48 -03:00
oobabooga 2734ce3e4c
Remove RWKV loader (#5130) 2023-12-31 02:01:40 -03:00
oobabooga 0e54a09bcb
Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga 8e397915c9
Remove --sdp-attention, --xformers flags (#5126) 2023-12-31 01:36:51 -03:00
B611 b7dd1f9542
Specify utf-8 encoding for model metadata file open (#5125) 2023-12-31 01:34:32 -03:00
oobabooga c06f630bcc Increase max_updates_second maximum value 2023-12-24 13:29:47 -08:00
oobabooga 8c60495878 UI: add "Maximum UI updates/second" parameter 2023-12-24 09:17:40 -08:00
zhangningboo 1b8b61b928
Fix output_ids decoding for Qwen/Qwen-7B-Chat (#5045) 2023-12-22 23:11:02 -03:00
Yiximail afc91edcb2
Reset the model_name after unloading the model (#5051) 2023-12-22 22:18:24 -03:00
oobabooga 2706149c65
Organize the CMD arguments by group (#5027) 2023-12-21 00:33:55 -03:00
oobabooga c727a70572 Remove redundancy from modules/loaders.py 2023-12-20 19:18:07 -08:00
luna 6efbe3009f
let exllama v1 models load safetensor loras (#4854) 2023-12-20 13:29:19 -03:00
oobabooga bcba200790 Fix EOS being ignored in ExLlamav2 after previous commit 2023-12-20 07:54:06 -08:00
oobabooga f0f6d9bdf9 Add HQQ back & update version
This reverts commit 2289e9031e.
2023-12-20 07:46:09 -08:00
oobabooga b15f510154 Optimize ExLlamav2 (non-HF) loader 2023-12-20 07:31:42 -08:00
oobabooga fadb295d4d Lint 2023-12-19 21:36:57 -08:00
oobabooga fb8ee9f7ff Add a specific error if HQQ is missing 2023-12-19 21:32:58 -08:00
oobabooga 9992f7d8c0 Improve several log messages 2023-12-19 20:54:32 -08:00
oobabooga 23818dc098 Better logger
Credits: vladmandic/automatic
2023-12-19 20:38:33 -08:00
oobabooga 95600073bc Add an informative error when extension requirements are missing 2023-12-19 20:20:45 -08:00
oobabooga d8279dc710 Replace character name placeholders in chat context (closes #5007) 2023-12-19 17:31:46 -08:00
oobabooga e83e6cedbe Organize the model menu 2023-12-19 13:18:26 -08:00
oobabooga f4ae0075e8 Fix conversion from old template format to jinja2 2023-12-19 13:16:52 -08:00
oobabooga de138b8ba6
Add llama-cpp-python wheels with tensor cores support (#5003) 2023-12-19 17:30:53 -03:00
oobabooga 0a299d5959
Bump llama-cpp-python to 0.2.24 (#5001) 2023-12-19 15:22:21 -03:00
oobabooga 83cf1a6b67 Fix Yi space issue (closes #4996) 2023-12-19 07:54:19 -08:00
oobabooga 9847809a7a Add a warning about ppl evaluation without --no_use_fast 2023-12-18 18:09:24 -08:00
oobabooga f6d701624c UI: mention that QuIP# does not work on Windows 2023-12-18 18:05:02 -08:00
oobabooga a23a004434 Update the example template 2023-12-18 17:47:35 -08:00
Water 674be9a09a
Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga 1f9e25e76a UI: update "Saved instruction templates" dropdown after loading template 2023-12-17 21:19:06 -08:00
oobabooga da1c8d77ea Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-12-17 21:05:10 -08:00
oobabooga cac89df97b Instruction templates: better handle unwanted bos tokens 2023-12-17 21:04:30 -08:00
oobabooga f0d6ead877
llama.cpp: read instruction template from GGUF metadata (#4975) 2023-12-18 01:51:58 -03:00
oobabooga f1f2c4c3f4
Add --num_experts_per_token parameter (ExLlamav2) (#4955) 2023-12-17 12:08:33 -03:00
oobabooga 12690d3ffc
Better HF grammar implementation (#4953) 2023-12-17 02:01:23 -03:00
oobabooga f8079d067d UI: save the sent chat message on "no model is loaded" error 2023-12-16 10:52:41 -08:00
oobabooga 3bbf6c601d AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this) 2023-12-15 06:46:13 -08:00
oobabooga 2cb5b68ad9
Bug fix: when generation fails, save the sent message (#4915) 2023-12-15 01:01:45 -03:00
Kim Jaewon e53f99faa0
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint (#4916) 2023-12-15 00:22:43 -03:00
Lounger 5754f0c357
Fix deleting chat logs (#4914) 2023-12-13 21:54:43 -03:00
Bartowski f51156705d
Allow symlinked folder within root directory (#4863) 2023-12-13 18:08:21 -03:00
Ixion 3f3960dbfb
Fixed invalid Jinja2 syntax in instruction templates (#4911) 2023-12-13 15:46:23 -03:00
oobabooga fcf5512364 Jinja templates: fix a potential small bug 2023-12-13 10:19:39 -08:00
oobabooga 7f1a6a70e3 Update the llamacpp_HF comment 2023-12-12 21:04:20 -08:00
oobabooga 1c531a3713 Minor cleanup 2023-12-12 13:25:21 -08:00
oobabooga 8513028968 Fix lag in the chat tab during streaming 2023-12-12 13:01:25 -08:00
oobabooga 39d2fe1ed9
Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga aab0dd962d Revert "Update callbacks.py to show tracebacks on ValueError (#4892)"
This reverts commit 993ca51a65.
2023-12-12 11:47:11 -08:00
Nehereus 993ca51a65
Update callbacks.py to show tracebacks on ValueError (#4892) 2023-12-12 02:29:27 -03:00
Morgan Schweers 602b8c6210
Make new browser reloads recognize current model. (#4865) 2023-12-11 02:51:01 -03:00
oobabooga 8c8825b777 Add QuIP# to README 2023-12-08 08:40:42 -08:00
oobabooga 2a335b8aa7 Cleanup: set shared.model_name only once 2023-12-08 06:35:23 -08:00
oobabooga 62d59a516f Add trust_remote_code to all HF loaders 2023-12-08 06:29:26 -08:00
oobabooga 181743fd97 Fix missing spaces tokenizer issue (closes #4834) 2023-12-08 05:16:46 -08:00
Yiximail 1c74b3ab45
Fix partial unicode characters issue (#4837) 2023-12-08 09:50:53 -03:00
oobabooga 2c5a1e67f9
Parameters: change max_new_tokens & repetition_penalty_range defaults (#4842) 2023-12-07 20:04:52 -03:00
oobabooga 98361af4d5
Add QuIP# support (#4803)
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga 6430acadde Minor bug fix after https://github.com/oobabooga/text-generation-webui/pull/4814 2023-12-05 10:08:11 -08:00
oobabooga 0f828ea441 Do not limit API updates/second 2023-12-04 20:45:43 -08:00
oobabooga 9edb193def
Optimize HF text generation (#4814) 2023-12-05 00:00:40 -03:00
俞航 ac9f154bcc
Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change (#4782) 2023-12-04 21:15:05 -03:00
oobabooga 131a5212ce UI: update context upper limit to 200000 2023-12-04 15:48:34 -08:00
oobabooga be88b072e9 Update --loader flag description 2023-12-04 15:41:25 -08:00
oobabooga 7fc9033b2e Recommend ExLlama_HF and ExLlamav2_HF 2023-12-04 15:28:46 -08:00
Lounger 7c0a17962d
Gallery improvements (#4789) 2023-12-03 22:45:50 -03:00
oobabooga 77d6ccf12b Add a LOADER debug message while loading models 2023-11-30 12:00:32 -08:00
oobabooga 092a2c3516 Fix a bug in llama.cpp get_logits() function 2023-11-30 11:21:40 -08:00
oobabooga 2698d7c9fd Fix llama.cpp model unloading 2023-11-29 15:19:48 -08:00
oobabooga 9940ed9c77 Sort the loaders 2023-11-29 15:13:03 -08:00
oobabooga a7670c31ca Sort 2023-11-28 18:43:33 -08:00
oobabooga 6e51bae2e0 Sort the loaders menu 2023-11-28 18:41:11 -08:00
oobabooga 68059d7c23 llama.cpp: minor log change & lint 2023-11-27 10:44:55 -08:00
tsukanov-as 9f7ae6bb2e
fix detection of stopping strings when HTML escaping is used (#4728) 2023-11-27 15:42:08 -03:00
oobabooga 0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader (#4701) 2023-11-21 20:59:39 -03:00
oobabooga 2769a1fa25 Hide deprecated args from Session tab 2023-11-21 15:15:16 -08:00
oobabooga a2e6d00128 Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
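A generic illustration of the difference (model id hypothetical): decode() merges pieces into plain text, while convert_ids_to_tokens() keeps the SentencePiece "▁" word-boundary markers that a llama tokenizer uses to encode leading spaces.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("some-org/some-llama-model")  # hypothetical
ids = tokenizer.encode("Hello world", add_special_tokens=False)
print(tokenizer.convert_ids_to_tokens(ids))   # e.g. ['▁Hello', '▁world']
print([tokenizer.decode([i]) for i in ids])   # per-token decode loses the spaces
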
oobabooga 9da7bb203d Minor LoRA bug fix 2023-11-19 07:59:29 -08:00
oobabooga a6f1e1bcc5 Fix PEFT LoRA unloading 2023-11-19 07:55:25 -08:00
oobabooga ab94f0d9bf Minor style change 2023-11-18 21:11:04 -08:00
oobabooga 5fcee696ea
New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
oobabooga ef6feedeb2
Add --nowebui flag for pure API mode (#4651) 2023-11-18 23:38:39 -03:00
oobabooga 0fa1af296c
Add /v1/internal/logits endpoint (#4650) 2023-11-18 23:19:31 -03:00
oobabooga 8f4f4daf8b
Add --admin-key flag for API (#4649) 2023-11-18 22:33:27 -03:00
Jordan Tucker baab894759
fix: use system message in chat-instruct mode (#4648) 2023-11-18 20:20:13 -03:00
oobabooga 47d9e2618b Refresh the Preset menu after saving a preset 2023-11-18 14:03:42 -08:00
oobabooga 83b64e7fc1
New feature: "random preset" button (#4647) 2023-11-18 18:31:41 -03:00
oobabooga e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga 9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga 13dc3b61da Update README 2023-11-16 19:57:55 -08:00
oobabooga 8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
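A sketch of what the flag controls in the transformers API (model id hypothetical): use_fast=True selects the Rust-backed "fast" tokenizer, and --no_use_fast would map to use_fast=False for models whose fast tokenizer misbehaves.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("some-org/some-model", use_fast=True)
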
oobabooga 6525707a7f Fix "send instruction template to..." buttons (closes #4625) 2023-11-16 18:16:42 -08:00
oobabooga 510a01ef46 Lint 2023-11-16 18:03:06 -08:00
oobabooga 923c8e25fb
Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
oobabooga 58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga cd41f8912b Warn users about n_ctx / max_seq_len 2023-11-15 18:56:42 -08:00
oobabooga 9be48e83a9 Start API when "api" checkbox is checked 2023-11-15 16:35:47 -08:00
oobabooga a85ce5f055 Add more info messages for truncation / instruction template 2023-11-15 16:20:31 -08:00
oobabooga 883701bc40 Alternative solution to 025da386a0
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga 8ac942813c Revert "Fix CPU memory limit error (issue #3763) (#4597)"
This reverts commit 025da386a0.
2023-11-15 16:01:54 -08:00
oobabooga e6f44d6d19 Print context length / instruction template to terminal when loading models 2023-11-15 16:00:51 -08:00
oobabooga e05d8fd441 Style changes 2023-11-15 15:51:37 -08:00
Andy Bao 025da386a0
Fix CPU memory limit error (issue #3763) (#4597)
get_max_memory_dict() was not properly formatting shared.args.cpu_memory

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
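A hypothetical illustration of the fix: the max_memory mapping passed to transformers/accelerate expects strings with units, one entry per GPU index plus a "cpu" key, and the bug was that shared.args.cpu_memory was not formatted that way.

def get_max_memory_dict(gpu_memory: int, cpu_memory: int) -> dict:
    # Values must carry units, e.g. "8GiB", not bare numbers.
    return {0: f"{gpu_memory}GiB", "cpu": f"{cpu_memory}GiB"}
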
oobabooga 4aabff3728 Remove old API, launch OpenAI API with --api 2023-11-10 06:39:08 -08:00
oobabooga 2af7e382b1 Revert "Bump llama-cpp-python to 0.2.14"
This reverts commit 5c3eb22ce6.

The new version has issues:

https://github.com/oobabooga/text-generation-webui/issues/4540
https://github.com/abetlen/llama-cpp-python/issues/893
2023-11-09 10:02:13 -08:00
oobabooga 21ed9a260e Document the new "Custom system message" field 2023-11-08 17:54:10 -08:00
oobabooga 2358706453 Add /v1/internal/model/load endpoint (tentative) 2023-11-07 20:58:06 -08:00
oobabooga 43c53a7820 Refactor the /v1/models endpoint 2023-11-07 19:59:27 -08:00
oobabooga 1b69694fe9 Add types to the encode/decode/token-count endpoints 2023-11-07 19:32:14 -08:00
oobabooga 6e2e0317af
Separate context and system message in instruction formats (#4499) 2023-11-07 20:02:58 -03:00
oobabooga 5c0559da69 Training: fix .txt files not showing in dropdowns 2023-11-07 14:41:11 -08:00
oobabooga af3d25a503 Disable logits_all in llamacpp_HF (makes processing 3x faster) 2023-11-07 14:35:48 -08:00
oobabooga 5c3eb22ce6 Bump llama-cpp-python to 0.2.14 2023-11-07 14:20:43 -08:00
oobabooga ec17a5d2b7
Make OpenAI API the default API (#4430) 2023-11-06 02:38:29 -03:00
feng lui 4766a57352
transformers: add use_flash_attention_2 option (#4373) 2023-11-04 13:59:33 -03:00
wouter van der plas add359379e
fixed two links in the ui (#4452) 2023-11-04 13:41:42 -03:00
oobabooga aa5d671579
Add temperature_last parameter (#4472) 2023-11-04 13:09:07 -03:00
oobabooga 1ab8700d94 Change frequency/presence penalty ranges 2023-11-03 17:38:19 -07:00
oobabooga 45fcb60e7a Make truncation_length_max apply to max_seq_len/n_ctx 2023-11-03 11:29:31 -07:00
oobabooga 7f9c1cbb30 Change min_p default to 0.0 2023-11-03 08:25:22 -07:00
oobabooga 4537853e2c Change min_p default to 1.0 2023-11-03 08:13:50 -07:00
kalomaze 367e5e6e43
Implement Min P as a sampler option in HF loaders (#4449) 2023-11-02 16:32:51 -03:00
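A sketch of the Min-P rule as published (not necessarily this repository's exact code): keep only tokens whose probability is at least min_p times the probability of the single most likely token.

import torch

def min_p_filter(logits: torch.Tensor, min_p: float) -> torch.Tensor:
    probs = torch.softmax(logits, dim=-1)
    threshold = probs.max(dim=-1, keepdim=True).values * min_p
    # Tokens below the scaled threshold are masked out before sampling.
    return logits.masked_fill(probs < threshold, float("-inf"))
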
oobabooga fcb7017b7a Remove a checkbox 2023-11-02 12:24:09 -07:00
Julien Chaumond fdcaa955e3
transformers: Add a flag to force load from safetensors (#4450) 2023-11-02 16:20:54 -03:00
oobabooga c0655475ae Add cache_8bit option 2023-11-02 11:23:04 -07:00
oobabooga 42f816312d Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-11-02 11:09:26 -07:00
oobabooga 77abd9b69b Add no_flash_attn option 2023-11-02 11:08:53 -07:00
Julien Chaumond a56ef2a942
make torch.load a bit safer (#4448) 2023-11-02 14:07:08 -03:00
Mehran Ziadloo aaf726dbfb
Updating the shared settings object when loading a model (#4425) 2023-11-01 01:29:57 -03:00
oobabooga 9bd0724d85 Change frequency/presence penalty ranges 2023-10-31 20:57:56 -07:00
Meheret 0707ed7677
updated wiki link (#4415) 2023-10-31 19:09:05 -03:00
oobabooga 262f8ae5bb Use default gr.Dataframe for evaluation table 2023-10-27 06:49:14 -07:00
oobabooga 839a87bac8 Fix is_ccl_available & is_xpu_available imports 2023-10-26 20:27:04 -07:00
Abhilash Majumder 778a010df8
Intel Gpu support initialization (#4340) 2023-10-26 23:39:51 -03:00
oobabooga 92b2f57095 Minor metadata bug fix (second attempt) 2023-10-26 18:57:32 -07:00
tdrussell 72f6fc6923
Rename additive_repetition_penalty to presence_penalty, add frequency_penalty (#4376) 2023-10-25 12:10:28 -03:00
oobabooga ef1489cd4d Remove unused parameter in AutoAWQ 2023-10-23 20:45:43 -07:00
oobabooga 1edf321362 Lint 2023-10-23 13:09:03 -07:00
oobabooga 280ae720d7 Organize 2023-10-23 13:07:17 -07:00
oobabooga 49e5eecce4 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-10-23 12:54:05 -07:00
oobabooga 306d764ff6 Minor metadata bug fix 2023-10-23 12:46:24 -07:00
adrianfiedler 4bc411332f
Fix broken links (#4367)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-23 14:09:57 -03:00
oobabooga 92691ee626 Disable trust_remote_code by default 2023-10-23 09:57:44 -07:00
tdrussell 4440f87722
Add additive_repetition_penalty sampler setting. (#3627) 2023-10-23 02:28:07 -03:00
oobabooga df90d03e0b Replace --mul_mat_q with --no_mul_mat_q 2023-10-22 12:23:03 -07:00
Googulator d0c3b407b3
transformers loader: multi-LoRAs support (#3120) 2023-10-22 16:06:22 -03:00
omo 4405513ca5
Option to select/target additional linear modules/layers in LORA training (#4178) 2023-10-22 15:57:19 -03:00
oobabooga 2d1b3332e4 Ignore warnings on Colab 2023-10-21 21:45:25 -07:00
oobabooga 09f807af83 Use ExLlama_HF for GPTQ models by default 2023-10-21 20:45:38 -07:00
oobabooga 506d05aede Organize command-line arguments 2023-10-21 18:52:59 -07:00
oobabooga fbac6d21ca Add missing exception 2023-10-20 23:53:24 -07:00
Brian Dashore 3345da2ea4
Add flash-attention 2 for windows (#4235) 2023-10-21 03:46:23 -03:00
Johan 1d5a015ce7
Enable special token support for exllamav2 (#4314) 2023-10-21 01:54:06 -03:00
turboderp ae8cd449ae
ExLlamav2_HF: Convert logits to FP32 (#4310) 2023-10-18 23:16:05 -03:00
oobabooga f17f7a6913 Increase the evaluation table height 2023-10-16 12:55:35 -07:00
oobabooga 8ea554bc19 Check for torch.xpu.is_available() 2023-10-16 12:53:40 -07:00
oobabooga 188d20e9e5 Reduce the evaluation table height 2023-10-16 10:53:42 -07:00
oobabooga 2d44adbb76 Clear the torch cache while evaluating 2023-10-16 10:52:50 -07:00
oobabooga 71cac7a1b2 Increase the height of the evaluation table 2023-10-15 21:56:40 -07:00
oobabooga e14bde4946 Minor improvements to evaluation logs 2023-10-15 20:51:43 -07:00
oobabooga b88b2b74a6 Experimental Intel Arc transformers support (untested) 2023-10-15 20:51:11 -07:00
Forkoz 8cce1f1126
Exllamav2 lora support (#4229)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-14 16:12:41 -03:00
oobabooga 773c17faec Fix a warning 2023-10-10 20:53:38 -07:00
oobabooga f63361568c Fix safetensors kwarg usage in AutoAWQ 2023-10-10 19:03:09 -07:00
oobabooga 39f16ff83d Fix default/notebook tabs css 2023-10-10 18:45:12 -07:00
oobabooga fae8062d39
Bump to latest gradio (3.47) (#4258) 2023-10-10 22:20:49 -03:00
oobabooga 9fab9a1ca6 Minor fix 2023-10-10 14:08:11 -07:00
oobabooga a49cc69a4a Ignore rope_freq_base if value is 10000 2023-10-10 13:57:40 -07:00
oobabooga 3a9d90c3a1 Download models with 4 threads by default 2023-10-10 13:52:10 -07:00
Forkoz 35695e18c7
Remove import. (#4247)
For real this time.
2023-10-09 18:06:11 -03:00
Forkoz 2e471071af
Update llama_attn_hijack.py (#4231) 2023-10-08 15:16:48 -03:00
Brian Dashore 98fa73a974
Text Generation: stop if EOS token is reached (#4213) 2023-10-07 19:46:42 -03:00
Brian Dashore 7743b5e9de
Llamacpp_HF: Fix CFG cache init (#4219)
Documentation says that model.context_params should be sent when
a new context is created. The current code uses model.params which
doesn't exist.

Signed-off-by: kingbri <bdashore3@proton.me>
2023-10-07 19:38:29 -03:00
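A sketch of the fix described in this commit, using only the attribute names from its message (llama-cpp-python internals vary by version, so treat this as illustrative rather than exact): the second context used for CFG must be created from the original context's parameters.

import llama_cpp

def init_cfg_context(model: "llama_cpp.Llama"):
    # Buggy: llama_cpp.llama_new_context_with_model(model.model, model.params)
    # model.params does not exist; context_params is what created the main context.
    return llama_cpp.llama_new_context_with_model(model.model, model.context_params)
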
turboderp 8a98646a21
Bump ExLlamaV2 to 0.0.5 (#4186) 2023-10-05 19:12:22 -03:00
oobabooga 7ffb424c7b Add AutoAWQ to README 2023-10-05 09:22:37 -07:00
cal066 cc632c3f33
AutoAWQ: initial support (#3999) 2023-10-05 13:19:18 -03:00
tdrussell cb26163a20
Fix off-by-one error in exllama_hf caching logic (#4145) 2023-10-05 12:20:56 -03:00
oobabooga ae4ba3007f
Add grammar to transformers and _HF loaders (#4091) 2023-10-05 10:01:36 -03:00
oobabooga b6fe6acf88 Add threads_batch parameter 2023-10-01 21:28:00 -07:00
jllllll 41a2de96e5
Bump llama-cpp-python to 0.2.11 2023-10-01 18:08:10 -05:00
oobabooga f2d82f731a Add recommended NTKv1 alpha values 2023-09-29 13:48:38 -07:00
oobabooga abe99cddeb Extend evaluation slider bounds 2023-09-29 13:06:26 -07:00
oobabooga 96da2e1c0d Read more metadata (config.json & quantize_config.json) 2023-09-29 06:14:16 -07:00
oobabooga 56b5a4af74 exllamav2 typical_p 2023-09-28 20:10:12 -07:00
oobabooga f8e9733412 Minor syntax change 2023-09-28 19:32:35 -07:00
oobabooga f931184b53 Increase truncation limits to 32768 2023-09-28 19:28:22 -07:00
oobabooga 1dd13e4643 Read Transformers config.json metadata 2023-09-28 19:19:47 -07:00
StoyanStAtanasov 7e6ff8d1f0
Enable NUMA feature for llama_cpp_python (#4040) 2023-09-26 22:05:00 -03:00
oobabooga 87ea2d96fd Add a note about RWKV loader 2023-09-26 17:43:39 -07:00
oobabooga 0c89180966 Another minor fix 2023-09-26 06:54:21 -07:00
oobabooga 365335e1ae Minor fix 2023-09-26 06:47:19 -07:00
oobabooga 1ca54faaf0 Improve --multi-user mode 2023-09-26 06:42:33 -07:00
oobabooga 019371c0b6 Lint 2023-09-25 20:31:11 -07:00
oobabooga 814520fed1 Extension install improvements 2023-09-25 20:27:06 -07:00
oobabooga 7f1460af29 Change a warning 2023-09-25 20:22:27 -07:00
oobabooga 862b45b1c7 Extension install improvements 2023-09-25 19:48:30 -07:00
oobabooga c8952cce55 Move documentation from UI to docs/ 2023-09-25 12:28:28 -07:00
oobabooga d0d221df49 Add --use_fast option (closes #3741) 2023-09-25 12:19:43 -07:00
oobabooga b973b91d73 Automatically filter by loader (closes #4072) 2023-09-25 10:28:35 -07:00
oobabooga 63de9eb24f Clean up the transformers loader 2023-09-24 20:26:26 -07:00
oobabooga 36c38d7561 Add disable_exllama to Transformers loader (for GPTQ LoRA training) 2023-09-24 20:03:11 -07:00
oobabooga 55a685d999 Minor fixes 2023-09-24 14:15:10 -07:00
oobabooga 08cf150c0c
Add a grammar editor to the UI (#4061) 2023-09-24 18:05:24 -03:00
oobabooga eb0b7c1053 Fix a minor UI bug 2023-09-24 07:17:33 -07:00
oobabooga 3edac43426 Remove print statement 2023-09-24 07:13:00 -07:00
oobabooga b227e65d86 Add grammar to llama.cpp loader (closes #4019) 2023-09-24 07:10:45 -07:00
oobabooga 2e7b6b0014
Create alternative requirements.txt with AMD and Metal wheels (#4052) 2023-09-24 09:58:29 -03:00
oobabooga 7a3ca2c68f Better detect EXL2 models 2023-09-23 13:05:55 -07:00
oobabooga b1467bd064
Move one-click-installers into the repository (#4028 from oobabooga/one-click) 2023-09-22 17:43:07 -03:00
oobabooga c075969875 Add instructions 2023-09-22 13:10:03 -07:00
oobabooga 8ab3eca9ec Add a warning for outdated installations 2023-09-22 09:35:19 -07:00
oobabooga 95976a9d4f Fix a bug while deleting characters 2023-09-22 06:02:34 -07:00
oobabooga d5330406fa Add a rename menu for chat histories 2023-09-21 19:16:51 -07:00
oobabooga 00ab450c13
Multiple histories for each character (#4022) 2023-09-21 17:19:32 -03:00
oobabooga 029da9563f Avoid redundant function call in llamacpp_hf 2023-09-19 14:14:40 -07:00
oobabooga 869f47fff9 Lint 2023-09-19 13:51:57 -07:00
oobabooga 13ac55fa18 Reorder some functions 2023-09-19 13:51:57 -07:00
oobabooga 03dc69edc5 ExLlama_HF (v1 and v2) prefix matching 2023-09-19 13:12:19 -07:00
oobabooga 5075087461 Fix command-line arguments being ignored 2023-09-19 13:11:46 -07:00
oobabooga ff5d3d2d09 Add missing import 2023-09-18 16:26:54 -07:00
oobabooga 605ec3c9f2 Add a warning about ExLlamaV2 without flash-attn 2023-09-18 12:26:35 -07:00
oobabooga f0ef971edb Remove obsolete warning 2023-09-18 12:25:10 -07:00
oobabooga 745807dc03 Faster llamacpp_HF prefix matching 2023-09-18 11:02:45 -07:00
BadisG 893a72a1c5
Stop generation immediately when using "Maximum tokens/second" (#3952)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-18 14:27:06 -03:00
Cebtenzzre 8466cf229a
llama.cpp: fix ban_eos_token (#3987) 2023-09-18 12:15:02 -03:00
oobabooga 0ede2965d5 Remove an error message 2023-09-17 18:46:08 -07:00
missionfloyd cc8eda298a
Move hover menu shortcuts to right side (#3951) 2023-09-17 22:33:00 -03:00
oobabooga 280cca9f66 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-09-17 18:01:27 -07:00
oobabooga b062d50c45 Remove exllama import that causes problems 2023-09-17 18:00:32 -07:00
James Braza fee38e0601
Simplified ExLlama cloning instructions and failure message (#3972) 2023-09-17 19:26:05 -03:00
Lu Guanghua 9858acee7b
Fix unexpected extensions load after gradio restart (#3965) 2023-09-17 17:35:43 -03:00
oobabooga d9b0f2c9c3 Fix llama.cpp double decoding 2023-09-17 13:07:48 -07:00
oobabooga d71465708c llamacpp_HF prefix matching 2023-09-17 11:51:01 -07:00
oobabooga 37e2980e05 Recommend mul_mat_q for llama.cpp 2023-09-17 08:27:11 -07:00
oobabooga a069f3904c Undo part of ad8ac545a5 2023-09-17 08:12:23 -07:00
oobabooga ad8ac545a5 Tokenization improvements 2023-09-17 07:02:00 -07:00
saltacc cd08eb0753
token probs for non HF loaders (#3957) 2023-09-17 10:42:32 -03:00
kalomaze 7c9664ed35
Allow full model URL to be used for download (#3919)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-16 10:06:13 -03:00
saltacc ed6b6411fb
Fix exllama tokenizers (#3954)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-09-16 09:42:38 -03:00
missionfloyd 2ad6ca8874
Add back chat buttons with --chat-buttons (#3947) 2023-09-16 00:39:37 -03:00
oobabooga ef04138bc0 Improve the UI tokenizer 2023-09-15 19:30:44 -07:00
oobabooga c3e4c9fdc2 Add a simple tokenizer to the UI 2023-09-15 19:09:03 -07:00
saltacc f01b9aa71f
Add customizable ban tokens (#3899) 2023-09-15 18:27:27 -03:00
oobabooga 5b117590ad Add some scrollbars to Parameters tab 2023-09-15 09:17:37 -07:00
Johan fdcee0c215
Allow custom tokenizer for llamacpp_HF loader (#3941) 2023-09-15 12:38:38 -03:00
oobabooga fd7257c7f8 Prevent code blocks from flickering while streaming 2023-09-15 07:46:26 -07:00
oobabooga a3ecf3bb65 Add cai-chat-square chat style 2023-09-14 16:15:08 -07:00
oobabooga 3d1c0f173d User config precedence over GGUF metadata 2023-09-14 12:15:52 -07:00
oobabooga 94dc64f870 Add a border 2023-09-14 07:20:36 -07:00
oobabooga 70aafa34dc Fix blockquote markdown rendering 2023-09-14 05:57:04 -07:00
oobabooga 644a9b8765 Change the chat generate button 2023-09-14 05:16:44 -07:00
oobabooga ecc90f9f62 Continue on Alt + Enter 2023-09-14 03:59:12 -07:00
oobabooga 1ce3c93600 Allow "Your name" field to be saved 2023-09-14 03:44:35 -07:00
oobabooga 27dbcc59f5
Make the chat input expand upwards (#3920) 2023-09-14 07:06:42 -03:00
oobabooga 6b6af74e14 Keyboard shortcuts without conflicts (hopefully) 2023-09-14 02:33:52 -07:00
oobabooga fc11d1eff0 Add chat keyboard shortcuts 2023-09-13 19:22:40 -07:00
oobabooga 9f199c7a4c Use Noto Sans font
Copied from 6c8bd06308/public/webfonts/NotoSans
2023-09-13 13:48:05 -07:00
oobabooga 8ce94b735c Show progress on impersonate 2023-09-13 11:22:53 -07:00
oobabooga 7cd437e05c Properly close the hover menu on mobile 2023-09-13 11:10:46 -07:00
oobabooga 1b47b5c676 Change the Generate/Stop buttons 2023-09-13 09:25:26 -07:00
oobabooga 8ea28cbfe0 Reorder chat buttons 2023-09-13 08:49:11 -07:00
oobabooga 5e3d2f7d44
Reorganize chat buttons (#3892) 2023-09-13 02:36:12 -03:00
Panchovix 34dc7306b8
Fix NTK (alpha) and RoPE scaling for exllamav2 and exllamav2_HF (#3897) 2023-09-13 02:35:09 -03:00
oobabooga b7adf290fc Fix ExLlama-v2 path issue 2023-09-12 17:42:22 -07:00
oobabooga b190676893 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-09-12 15:06:33 -07:00
oobabooga 2f935547c8 Minor changes 2023-09-12 15:05:21 -07:00
oobabooga 18e6b275f3 Add alpha_value/compress_pos_emb to ExLlama-v2 2023-09-12 15:02:47 -07:00
Gennadij 460c40d8ab
Read more GGUF metadata (scale_linear and freq_base) (#3877) 2023-09-12 17:02:42 -03:00
oobabooga 16e1696071 Minor qol change 2023-09-12 10:44:26 -07:00
oobabooga c2a309f56e
Add ExLlamaV2 and ExLlamav2_HF loaders (#3881) 2023-09-12 14:33:07 -03:00
oobabooga df123a20fc Prevent extra keys from being saved to settings.yaml 2023-09-11 20:13:10 -07:00
oobabooga dae428a967 Revamp cai-chat theme, make it default 2023-09-11 19:30:40 -07:00
oobabooga 78811dd89a Fix GGUF metadata reading for falcon 2023-09-11 15:49:50 -07:00
oobabooga 9331ab4798
Read GGUF metadata (#3873) 2023-09-11 18:49:30 -03:00
oobabooga df52dab67b Lint 2023-09-11 07:57:38 -07:00
oobabooga ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00
John Smith cc7b7ba153
fix lora training with alpaca_lora_4bit (#3853) 2023-09-11 01:22:20 -03:00
Forkoz 15e9b8c915
Exllama new rope settings (#3852) 2023-09-11 01:14:36 -03:00
oobabooga 4affa08821 Do not impose instruct mode while loading models 2023-09-02 11:31:33 -07:00
oobabooga 47e490c7b4 Set use_cache=True by default for all models 2023-08-30 13:26:27 -07:00
missionfloyd 787219267c
Allow downloading single file from UI (#3737) 2023-08-29 23:32:36 -03:00
oobabooga cec8db52e5
Add max_tokens_second param (#3533) 2023-08-29 17:44:31 -03:00
oobabooga 2b58a89f6a Clear instruction template before loading new one 2023-08-29 13:11:32 -07:00
oobabooga 36864cb3e8 Use Alpaca as the default instruction template 2023-08-29 13:06:25 -07:00
oobabooga 9a202f7fb2 Prevent <ul> lists from flickering during streaming 2023-08-28 20:45:07 -07:00
oobabooga 439dd0faab Fix stopping strings in the chat API 2023-08-28 19:40:11 -07:00
oobabooga c75f98a6d6 Autoscroll Notebook/Default textareas during streaming 2023-08-28 18:22:03 -07:00
oobabooga 558e918fd6 Add a typing dots (...) animation to chat tab 2023-08-28 13:50:36 -07:00
oobabooga 57e9ded00c
Make it possible to scroll during streaming (#3721) 2023-08-28 16:03:20 -03:00
Cebtenzzre 2f5d769a8d
accept floating-point alpha value on the command line (#3712) 2023-08-27 18:54:43 -03:00
oobabooga b2296dcda0 Ctrl+S to show/hide chat controls 2023-08-27 13:14:33 -07:00
Ravindra Marella e4c3e1bdd2
Fix ctransformers model unload (#3711)
Add missing comma in model types list

Fixes marella/ctransformers#111
2023-08-27 10:53:48 -03:00
oobabooga 0c9e818bb8 Update truncation length based on max_seq_len/n_ctx 2023-08-26 23:10:45 -07:00
oobabooga 3361728da1 Change some comments 2023-08-26 22:24:44 -07:00
oobabooga 8aeae3b3f4 Fix llamacpp_HF loading 2023-08-26 22:15:06 -07:00
oobabooga 7f5370a272 Minor fixes/cosmetics 2023-08-26 22:11:07 -07:00
jllllll 4d61a7d9da
Account for deprecated GGML parameters 2023-08-26 14:07:46 -05:00
jllllll 4a999e3bcd
Use separate llama-cpp-python packages for GGML support 2023-08-26 10:40:08 -05:00
oobabooga 83640d6f43 Replace ggml occurences with gguf 2023-08-26 01:06:59 -07:00
jllllll db42b365c9
Fix ctransformers threads auto-detection (#3688) 2023-08-25 14:37:02 -03:00
cal066 960980247f
ctransformers: gguf support (#3685) 2023-08-25 11:33:04 -03:00
oobabooga 21058c37f7 Add missing file 2023-08-25 07:10:26 -07:00
oobabooga f4f04c8c32 Fix a typo 2023-08-25 07:08:38 -07:00
oobabooga 5c7d8bfdfd Detect CodeLlama settings 2023-08-25 07:06:57 -07:00
oobabooga 52ab2a6b9e Add rope_freq_base parameter for CodeLlama 2023-08-25 06:55:15 -07:00
oobabooga feecd8190f Unescape inline code blocks 2023-08-24 21:01:09 -07:00
oobabooga 3320accfdc
Add CFG to llamacpp_HF (second attempt) (#3678) 2023-08-24 20:32:21 -03:00
oobabooga d6934bc7bc
Implement CFG for ExLlama_HF (#3666) 2023-08-24 16:27:36 -03:00
oobabooga 87442c6d18 Fix Notebook Logits tab 2023-08-22 21:00:12 -07:00
oobabooga c0b119c3a3 Improve logit viewer format 2023-08-22 20:35:12 -07:00
oobabooga 8545052c9d Add the option to use samplers in the logit viewer 2023-08-22 20:18:16 -07:00
oobabooga 25e5eaa6a6 Remove outdated training warning 2023-08-22 13:16:44 -07:00
oobabooga 335c49cc7e Bump peft and transformers 2023-08-22 13:14:59 -07:00
cal066 e042bf8624
ctransformers: add mlock and no-mmap options (#3649) 2023-08-22 16:51:34 -03:00
oobabooga 6cca8b8028 Only update notebook token counter on input
For performance during streaming
2023-08-21 05:39:55 -07:00
oobabooga 2cb07065ec Fix an escaping bug 2023-08-20 21:50:42 -07:00
oobabooga a74dd9003f Fix HTML escaping for perplexity_colors extension 2023-08-20 21:40:22 -07:00
oobabooga 57036abc76 Add "send to default/notebook" buttons to chat tab 2023-08-20 19:54:59 -07:00
oobabooga 429cacd715 Add a token counter similar to automatic1111
It can now be found in the Default and Notebook tabs
2023-08-20 19:37:33 -07:00
oobabooga 120fb86c6a
Add a simple logit viewer (#3636) 2023-08-20 20:49:21 -03:00
oobabooga ef17da70af Fix ExLlama truncation 2023-08-20 08:53:26 -07:00
oobabooga ee964bcce9 Update a comment about RoPE scaling 2023-08-20 07:01:43 -07:00
missionfloyd 1cae784761
Unescape last message (#3623) 2023-08-19 09:29:08 -03:00
Cebtenzzre 942ad6067d
llama.cpp: make Stop button work with streaming disabled (#3620) 2023-08-19 00:17:27 -03:00
oobabooga f6724a1a01 Return the visible history with "Copy last reply" 2023-08-18 13:04:45 -07:00
oobabooga b96fd22a81
Refactor the training tab (#3619) 2023-08-18 16:58:38 -03:00
oobabooga c4733000d7 Return the visible history with "Remove last" 2023-08-18 09:25:51 -07:00
oobabooga 7cba000421
Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) 2023-08-18 12:03:34 -03:00
oobabooga bdb6eb5734 Restyle the chat input box + several CSS improvements
- Remove extra spacing below the last chat message
- Change the background color of code blocks in dark mode
- Remove border radius from selected header bar elements
- Make the chat scrollbar more discrete
2023-08-17 11:10:38 -07:00
oobabooga cebe07f29c Unescape HTML inside code blocks 2023-08-16 21:08:26 -07:00
oobabooga a4e903e932 Escape HTML in chat messages 2023-08-16 09:25:52 -07:00
oobabooga 73d9befb65 Make "Show controls" customizable through settings.yaml 2023-08-16 07:04:18 -07:00
oobabooga 2a29208224
Add a "Show controls" button to chat UI (#3590) 2023-08-16 02:39:58 -03:00
cal066 991bb57e43
ctransformers: Fix up model_type name consistency (#3567) 2023-08-14 15:17:24 -03:00
oobabooga ccfc02a28d
Add the --disable_exllama option for AutoGPTQ (#3545 from clefever/disable-exllama) 2023-08-14 15:15:55 -03:00
oobabooga 7e57b35b5e Clean up old code 2023-08-14 10:10:39 -07:00
oobabooga 4d067e9b52 Add back a variable to keep old extensions working 2023-08-14 09:39:06 -07:00
oobabooga d8a82d34ed Improve a warning 2023-08-14 08:46:05 -07:00
oobabooga 3e0a9f9cdb Refresh the character dropdown when saving/deleting a character 2023-08-14 08:23:41 -07:00
oobabooga 890b4abdad Fix session saving 2023-08-14 07:55:52 -07:00
oobabooga 619cb4e78b
Add "save defaults to settings.yaml" button (#3574) 2023-08-14 11:46:07 -03:00
oobabooga a95e6f02cb Add a placeholder for custom stopping strings 2023-08-13 21:17:20 -07:00
oobabooga ff9b5861c8 Fix impersonate when some text is present (closes #3564) 2023-08-13 21:10:47 -07:00
oobabooga cc7e6ef645 Fix a CSS conflict 2023-08-13 19:24:09 -07:00
Eve 66c04c304d
Various ctransformers fixes (#3556)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
2023-08-13 23:09:03 -03:00
oobabooga 4a05aa92cb Add "send to" buttons for instruction templates
- Remove instruction templates from prompt dropdowns (default/notebook)
- Add 3 buttons to Parameters > Instruction template as a replacement
- Increase the number of lines of 'negative prompt' field to 3, and add a scrollbar
- When uploading a character, switch to the Character tab
- When uploading chat history, switch to the Chat tab
2023-08-13 18:35:45 -07:00
oobabooga f6db2c78d1 Fix ctransformers seed 2023-08-13 05:48:53 -07:00
oobabooga a1a9ec895d
Unify the 3 interface modes (#3554) 2023-08-13 01:12:15 -03:00
cal066 bf70c19603
ctransformers: move thread and seed parameters (#3543) 2023-08-13 00:04:03 -03:00
Chris Lefever 0230fa4e9c Add the --disable_exllama option for AutoGPTQ 2023-08-12 02:26:58 -04:00
oobabooga 0e05818266 Style changes 2023-08-11 16:35:57 -07:00
oobabooga 2f918ccf7c Remove unused parameter 2023-08-11 11:15:22 -07:00
oobabooga 28c8df337b Add repetition_penalty_range to ctransformers 2023-08-11 11:04:19 -07:00
cal066 7a4fcee069
Add ctransformers support (#3313)
---------

Co-authored-by: cal066 <cal066@users.noreply.github.com>
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>
2023-08-11 14:41:33 -03:00
oobabooga 8dbaa20ca8 Don't replace last reply with an empty message 2023-08-10 13:14:48 -07:00
oobabooga 0789554f65 Allow --lora to use an absolute path 2023-08-10 10:03:12 -07:00
oobabooga 3929971b66 Don't show oobabooga_llama-tokenizer in the model dropdown 2023-08-10 10:02:48 -07:00
oobabooga c7f52bbdc1 Revert "Remove GPTQ-for-LLaMa monkey patch support"
This reverts commit e3d3565b2a.
2023-08-10 08:39:41 -07:00
jllllll d6765bebc4
Update installation documentation 2023-08-10 00:53:48 -05:00
jllllll d7ee4c2386
Remove unused import 2023-08-10 00:10:14 -05:00
jllllll e3d3565b2a
Remove GPTQ-for-LLaMa monkey patch support
AutoGPTQ will be the preferred GPTQ LoRa loader in the future.
2023-08-09 23:59:04 -05:00
jllllll bee73cedbd
Streamline GPTQ-for-LLaMa support 2023-08-09 23:42:34 -05:00
oobabooga 6c6a52aaad Change the filenames for caches and histories 2023-08-09 07:47:19 -07:00
oobabooga d8fb506aff Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
2023-08-08 21:25:48 -07:00
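A minimal sketch using the rope_scaling config field added by the linked transformers PR (model id hypothetical): "dynamic" selects dynamic NTK scaling and "linear" selects position interpolation, with factor as the context-length multiplier.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-llama-model",  # hypothetical
    rope_scaling={"type": "dynamic", "factor": 2.0},
)
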
Friedemann Lipphardt 901b028d55
Add option for named cloudflare tunnels (#3364) 2023-08-08 22:20:27 -03:00
oobabooga bf08b16b32 Fix disappearing profile picture bug 2023-08-08 14:09:01 -07:00
Gennadij 0e78f3b4d4
Fixed a typo in "rms_norm_eps", incorrectly set as n_gqa (#3494) 2023-08-08 00:31:11 -03:00
oobabooga 37fb719452
Increase the Context/Greeting boxes sizes 2023-08-08 00:09:00 -03:00
oobabooga 584dd33424
Fix missing example_dialogue when uploading characters 2023-08-07 23:44:59 -03:00
oobabooga 412f6ff9d3 Change alpha_value maximum and step 2023-08-07 06:08:51 -07:00
oobabooga a373c96d59 Fix a bug in modules/shared.py 2023-08-06 20:36:35 -07:00
oobabooga 3d48933f27 Remove ancient deprecation warnings 2023-08-06 18:58:59 -07:00
oobabooga c237ce607e Move characters/instruction-following to instruction-templates 2023-08-06 17:50:32 -07:00
oobabooga 65aa11890f
Refactor everything (#3481) 2023-08-06 21:49:27 -03:00
oobabooga d4b851bdc8 Credit turboderp 2023-08-06 13:43:15 -07:00
oobabooga 0af10ab49b
Add Classifier Free Guidance (CFG) for Transformers/ExLlama (#3325) 2023-08-06 17:22:48 -03:00
missionfloyd 5134878344
Fix chat message order (#3461) 2023-08-05 13:53:54 -03:00
jllllll 44f31731af
Create logs dir if missing when saving history (#3462) 2023-08-05 13:47:16 -03:00
Forkoz 9dcb37e8d4
Fix: Mirostat fails on models split across multiple GPUs 2023-08-05 13:45:47 -03:00
oobabooga 8df3cdfd51
Add SSL certificate support (#3453) 2023-08-04 13:57:31 -03:00
missionfloyd 2336b75d92
Remove unnecessary chat.js (#3445) 2023-08-04 01:58:37 -03:00
oobabooga 4b3384e353 Handle unfinished lists during markdown streaming 2023-08-03 17:15:18 -07:00
Pete f4005164f4
Fix llama.cpp truncation (#3400)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-08-03 20:01:15 -03:00
oobabooga 87dab03dc0
Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) 2023-08-03 11:00:36 -03:00
oobabooga 3e70bce576 Properly format exceptions in the UI 2023-08-03 06:57:21 -07:00
oobabooga 32c564509e Fix loading session in chat mode 2023-08-02 21:13:16 -07:00
oobabooga 0e8f9354b5 Add direct download for session/chat history JSONs 2023-08-02 19:43:39 -07:00
oobabooga 32a2bbee4a Implement auto_max_new_tokens for ExLlama 2023-08-02 11:03:56 -07:00
oobabooga e931844fe2
Add auto_max_new_tokens parameter (#3419) 2023-08-02 14:52:20 -03:00
Pete 6afc1a193b
Add a scrollbar to notebook/default, improve chat scrollbar style (#3403)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-08-02 12:02:36 -03:00
oobabooga b53ed70a70 Make llamacpp_HF 6x faster 2023-08-01 13:18:20 -07:00
oobabooga 8d46a8c50a Change the default chat style and the default preset 2023-08-01 09:35:17 -07:00
oobabooga 959feba602 When saving model settings, only save the settings for the current loader 2023-08-01 06:10:09 -07:00
oobabooga f094330df0 When saving a preset, only save params that differ from the defaults 2023-07-31 19:13:29 -07:00
oobabooga 84297d05c4 Add a "Filter by loader" menu to the Parameters tab 2023-07-31 19:09:02 -07:00
oobabooga 7de7b3d495 Fix newlines in exported character yamls 2023-07-31 10:46:02 -07:00
oobabooga 5ca37765d3 Only replace {{user}} and {{char}} at generation time 2023-07-30 11:42:30 -07:00
oobabooga 6e16af34fd Save uploaded characters as yaml
Also allow yaml characters to be uploaded directly
2023-07-30 11:25:38 -07:00
oobabooga b31321c779 Define visible_text before applying chat_input extensions 2023-07-26 07:27:14 -07:00
oobabooga b17893a58f Revert "Add tensor split support for llama.cpp (#3171)"
This reverts commit 031fe7225e.
2023-07-26 07:06:01 -07:00
oobabooga 28779cd959 Use dark theme by default 2023-07-25 20:11:57 -07:00
oobabooga c2e0d46616 Add credits 2023-07-25 15:49:04 -07:00
oobabooga 77d2e9f060 Remove flexgen 2 2023-07-25 15:18:25 -07:00
oobabooga 75c2dd38cf Remove flexgen support 2023-07-25 15:15:29 -07:00
Foxtr0t1337 85b3a26e25
Ignore values which are not strings in training.py (#3287) 2023-07-25 19:00:25 -03:00
Shouyi 031fe7225e
Add tensor split support for llama.cpp (#3171) 2023-07-25 18:59:26 -03:00
Eve f653546484
README updates and improvements (#3198) 2023-07-25 18:58:13 -03:00
oobabooga ef8637e32d
Add extension example, replace input_hijack with chat_input_modifier (#3307) 2023-07-25 18:49:56 -03:00
oobabooga a07d070b6c
Add llama-2-70b GGML support (#3285) 2023-07-24 16:37:03 -03:00
jllllll 1141987a0d
Add checks for ROCm and unsupported architectures to llama_cpp_cuda loading (#3225) 2023-07-24 11:25:36 -03:00
Ikko Eltociear Ashimine b2d5433409
Fix typo in deepspeed_parameters.py (#3222)
configration -> configuration
2023-07-24 11:17:28 -03:00
oobabooga 4b19b74e6c Add CUDA wheels for llama-cpp-python by jllllll 2023-07-19 19:33:43 -07:00
oobabooga 913e060348 Change the default preset to Divine Intellect
It seems to reduce hallucination while using instruction-tuned models.
2023-07-19 08:24:37 -07:00
randoentity a69955377a
[GGML] Support for customizable RoPE (#3083)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-07-17 22:32:37 -03:00
appe233 89e0d15cf5
Use 'torch.backends.mps.is_available' to check if mps is supported (#3164) 2023-07-17 21:27:18 -03:00
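A minimal sketch of the check this commit adopts; torch.backends.mps.is_available() is the standard PyTorch API, while the surrounding fallback logic is illustrative:

```python
import torch

# Prefer Apple's Metal Performance Shaders (MPS) backend when it is
# available, falling back to CPU otherwise.
if torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")
```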
oobabooga 8c1c2e0fae Increase max_new_tokens upper limit 2023-07-17 17:08:22 -07:00
oobabooga b1a6ea68dd Disable "autoload the model" by default 2023-07-17 07:40:56 -07:00
oobabooga a199f21799 Optimize llamacpp_hf a bit 2023-07-16 20:49:48 -07:00
oobabooga 6a3edb0542 Clean up llamacpp_hf.py 2023-07-15 22:40:55 -07:00
oobabooga 27a84b4e04 Make AutoGPTQ the default again
Purely for compatibility with more models.
You should still use ExLlama_HF for LLaMA models.
2023-07-15 22:29:23 -07:00
oobabooga 5e3f7e00a9
Create llamacpp_HF loader (#3062) 2023-07-16 02:21:13 -03:00
oobabooga 94dfcec237
Make it possible to evaluate exllama perplexity (#3138) 2023-07-16 01:52:55 -03:00
oobabooga b284f2407d Make ExLlama_HF the new default for GPTQ 2023-07-14 14:03:56 -07:00
Morgan Schweers 6d1e911577
Add support for logits processors in extensions (#3029) 2023-07-13 17:22:41 -03:00
oobabooga e202190c4f lint 2023-07-12 11:33:25 -07:00
FartyPants 9b55d3a9f9
More robust, less error-prone training (#3058) 2023-07-12 15:29:43 -03:00
oobabooga 30f37530d5 Add back .replace('\r', '') 2023-07-12 09:52:20 -07:00
Fernando Tarin Morales 987d0fe023
Fix the tokenization process of raw datasets and improve its efficiency (#3035) 2023-07-12 12:05:37 -03:00
kabachuha 3f19e94c93
Add Tensorboard/Weights and biases integration for training (#2624) 2023-07-12 11:53:31 -03:00
kizinfo 5d513eea22
Add ability to load all text files from a subdirectory for training (#1997)
* Update utils.py

Returns individual txt files and subdirectories to getdatasets, allowing training from a directory of text files

* Update training.py

Minor tweak to training on raw datasets: detect if a directory is selected and, if so, load all the txt files in that directory for training

* Update put-trainer-datasets-here.txt

document

* Minor change

* Use pathlib, sort by natural keys

* Space

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-07-12 11:44:30 -03:00
practicaldreamer 73a0def4af
Add Feature to Log Sample of Training Dataset for Inspection (#1711) 2023-07-12 11:26:45 -03:00
oobabooga b6ba68eda9 Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2023-07-12 07:19:34 -07:00
oobabooga a17b78d334 Disable wandb during training 2023-07-12 07:19:12 -07:00
Gabriel Pena eedb3bf023
Add low vram mode on llama cpp (#3076) 2023-07-12 11:05:13 -03:00
Axiom Wolf d986c17c52
Chat history download creates more detailed file names (#3051) 2023-07-12 00:10:36 -03:00
Salvador E. Tropea 324e45b848
[Fixed] wbits and groupsize values from model not shown (#2977) 2023-07-11 23:27:38 -03:00
oobabooga e3810dff40 Style changes 2023-07-11 18:49:06 -07:00
Ricardo Pinto 3e9da5a27c
Changed FormComponent to IOComponent (#3017)
Co-authored-by: Ricardo Pinto <1-ricardo.pinto@users.noreply.gitlab.cognitage.com>
2023-07-11 18:52:16 -03:00
Forkoz 74ea7522a0
Lora fixes for AutoGPTQ (#2818) 2023-07-09 01:03:43 -03:00
oobabooga 5ac4e4da8b Make --model work with argument like models/folder_name 2023-07-08 10:22:54 -07:00
oobabooga b6643e5039 Add decode functions to llama.cpp/exllama 2023-07-07 09:11:30 -07:00
oobabooga 1ba2e88551 Add truncation to exllama 2023-07-07 09:09:23 -07:00
oobabooga c21b73ff37 Minor change to ui.py 2023-07-07 09:09:14 -07:00
oobabooga de994331a4 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-07-06 22:25:43 -07:00
oobabooga 9aee1064a3 Block a Cloudflare request 2023-07-06 22:24:52 -07:00
Fernando Tarin Morales d7e14e1f78
Fixed the param name when loading a LoRA using a model loaded in 4 or 8 bits (#3036) 2023-07-07 02:24:07 -03:00
Xiaojian "JJ" Deng ff45317032
Update models.py (#3020)
Hopefully fixed error with "ValueError: Tokenizer class GPTNeoXTokenizer does not exist or is not currently imported."
2023-07-05 21:40:43 -03:00
oobabooga 8705eba830 Remove universal llama tokenizer support
Instead replace it with a warning if the tokenizer files look off
2023-07-04 19:43:19 -07:00
oobabooga 333075e726
Fix #3003 2023-07-04 11:38:35 -03:00
oobabooga 463ddfffd0 Fix start_with 2023-07-03 23:32:02 -07:00
oobabooga 373555c4fb Fix loading some histories (thanks kaiokendev) 2023-07-03 22:19:28 -07:00
Panchovix 10c8c197bf
Add Support for Static NTK RoPE scaling for exllama/exllama_hf (#2955) 2023-07-04 01:13:16 -03:00
oobabooga 7e8340b14d Make greetings appear in --multi-user mode 2023-07-03 20:08:14 -07:00
oobabooga 4b1804a438
Implement sessions + add basic multi-user support (#2991) 2023-07-04 00:03:30 -03:00
FartyPants 1f8cae14f9
Update training.py - correct use of lora_names (#2988) 2023-07-03 17:41:18 -03:00
FartyPants c23c88ee4c
Update LoRA.py - avoid potential error (#2953) 2023-07-03 17:40:22 -03:00
FartyPants 33f56fd41d
Update models.py to clear LORA names after unload (#2951) 2023-07-03 17:39:06 -03:00
FartyPants 48b11f9c5b
Training: added trainable parameters info (#2944) 2023-07-03 17:38:36 -03:00
Turamarth14 847f70b694
Update html_generator.py (#2954)
With version 10.0.0 of Pillow, the constant Image.ANTIALIAS has been removed; Image.LANCZOS should be used instead.
2023-07-02 01:43:58 -03:00
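A minimal sketch of the migration described above (the file path and target size are illustrative; Image.LANCZOS is Pillow's documented replacement for the removed Image.ANTIALIAS):

```python
from PIL import Image

img = Image.open("character.png")  # illustrative input path
# Image.ANTIALIAS was removed in Pillow 10.0.0; LANCZOS is the
# equivalent high-quality resampling filter.
thumbnail = img.resize((128, 128), Image.LANCZOS)
```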
ardfork 3c076c3c80
Disable half2 for ExLlama when using HIP (#2912) 2023-06-29 15:03:16 -03:00
missionfloyd ac0f96e785
Some more character import tweaks. (#2921) 2023-06-29 14:56:25 -03:00
oobabooga 79db629665 Minor bug fix 2023-06-29 13:53:06 -03:00
oobabooga 3443219cbc
Add repetition penalty range parameter to transformers (#2916) 2023-06-29 13:40:13 -03:00
oobabooga 20740ab16e Revert "Fix exllama_hf gibberish above 2048 context; works at >5000 context. (#2913)"
This reverts commit 37a16d23a7.
2023-06-28 18:10:34 -03:00
Panchovix 37a16d23a7
Fix exllama_hf gibberish above 2048 context; works at >5000 context. (#2913) 2023-06-28 12:36:07 -03:00
FartyPants ab1998146b
Training update - backup the existing adapter before training on top of it (#2902) 2023-06-27 18:24:04 -03:00
oobabooga 22d455b072 Add LoRA support to ExLlama_HF 2023-06-26 00:10:33 -03:00
oobabooga c52290de50
ExLlama with long context (#2875) 2023-06-25 22:49:26 -03:00
oobabooga 9290c6236f Keep ExLlama_HF if already selected 2023-06-25 19:06:28 -03:00
oobabooga 75fd763f99 Fix chat saving issue (closes #2863) 2023-06-25 18:14:57 -03:00
FartyPants 21c189112c
Several Training Enhancements (#2868) 2023-06-25 15:34:46 -03:00
oobabooga 95212edf1f
Update training.py 2023-06-25 12:13:15 -03:00
oobabooga f31281a8de Fix loading instruction templates containing literal '\n' 2023-06-25 02:13:26 -03:00
oobabooga f0fcd1f697 Sort some imports 2023-06-25 01:44:36 -03:00
oobabooga 365b672531 Minor change to prevent future bugs 2023-06-25 01:38:54 -03:00
jllllll bef67af23c
Use pre-compiled python module for ExLlama (#2770) 2023-06-24 20:24:17 -03:00
oobabooga cec5fb0ef6 Failed attempt at evaluating exllama_hf perplexity 2023-06-24 12:02:25 -03:00
快乐的我531 e356f69b36
Make stop_everything work with non-streamed generation (#2848) 2023-06-24 11:19:16 -03:00
oobabooga ec482f3dae Apply input extensions after yielding *Is typing...* 2023-06-24 11:07:11 -03:00
oobabooga 3e80f2aceb Apply the output extensions only once
Relevant for google translate, silero
2023-06-24 10:59:07 -03:00
missionfloyd 51a388fa34
Organize chat history/character import menu (#2845)
* Organize character import menu

* Move Chat history upload/download labels
2023-06-24 09:55:02 -03:00
oobabooga 8bb3bb39b3
Implement stopping string search in string space (#2847) 2023-06-24 09:43:00 -03:00
oobabooga 3ae9af01aa Add --no_use_cuda_fp16 param for AutoGPTQ 2023-06-23 12:22:56 -03:00
Panchovix 5646690769
Fix some models not loading on exllama_hf (#2835) 2023-06-23 11:31:02 -03:00
oobabooga 383c50f05b
Replace old presets with the results of Preset Arena (#2830) 2023-06-23 01:48:29 -03:00
Panchovix b4a38c24b7
Fix Multi-GPU not working on exllama_hf (#2803) 2023-06-22 16:05:25 -03:00
LarryVRH 580c1ee748
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777) 2023-06-21 15:31:42 -03:00
EugeoSynthesisThirtyTwo 7625c6de89
fix usage of self in classmethod (#2781) 2023-06-20 16:18:42 -03:00
MikoAL c40932eb39
Added Falcon LoRA training support (#2684)
I am 50% sure this will work
2023-06-20 01:03:44 -03:00
FartyPants ce86f726e9
Added saving of training logs to training_log.json (#2769) 2023-06-20 00:47:36 -03:00
Cebtenzzre 59e7ecb198
llama.cpp: implement ban_eos_token via logits_processor (#2765) 2023-06-19 21:31:19 -03:00
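A minimal sketch of the general technique named here, not the project's exact code: a logits processor that bans the EOS token by pushing its logit to negative infinity before sampling (the function name and simplified signature are assumptions):

```python
import numpy as np

def make_ban_eos_processor(eos_token_id: int):
    # Build a processor that makes the EOS token unsampleable by
    # setting its logit to -inf, so generation cannot end early.
    def processor(input_ids: np.ndarray, scores: np.ndarray) -> np.ndarray:
        scores[eos_token_id] = -np.inf
        return scores
    return processor
```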
oobabooga eb30f4441f
Add ExLlama+LoRA support (#2756) 2023-06-19 12:31:24 -03:00
oobabooga 5f418f6171 Fix a memory leak (credits for the fix: Ph0rk0z) 2023-06-19 01:19:28 -03:00
ThisIsPIRI def3b69002
Fix loading condition for universal llama tokenizer (#2753) 2023-06-18 18:14:06 -03:00
oobabooga 09c781b16f Add modules/block_requests.py
This has become unnecessary, but it could be useful in the future
for other libraries.
2023-06-18 16:31:14 -03:00
Forkoz 3cae1221d4
Update exllama.py - Respect model dir parameter (#2744) 2023-06-18 13:26:30 -03:00
oobabooga c5641b65d3 Handle leading spaces properly in ExLlama 2023-06-17 19:35:12 -03:00
oobabooga 05a743d6ad Make llama.cpp use tfs parameter 2023-06-17 19:08:25 -03:00
oobabooga e19cbea719 Add a variable to modules/shared.py 2023-06-17 19:02:29 -03:00
oobabooga cbd63eeeff Fix repeated tokens with exllama 2023-06-17 19:02:08 -03:00
oobabooga 766c760cd7 Use gen_begin_reuse in exllama 2023-06-17 18:00:10 -03:00
oobabooga b27f83c0e9 Make exllama stoppable 2023-06-16 22:03:23 -03:00
oobabooga 7f06d551a3 Fix streaming callback 2023-06-16 21:44:56 -03:00
oobabooga 5f392122fd Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
2023-06-16 20:49:36 -03:00
oobabooga 9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga dea43685b0 Add some clarifications 2023-06-16 19:10:53 -03:00
oobabooga 7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00
Tom Jobbins 646b0c889f
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648) 2023-06-15 23:59:54 -03:00
oobabooga 2b9a6b9259 Merge remote-tracking branch 'refs/remotes/origin/main' 2023-06-14 18:45:24 -03:00
oobabooga 4d508cbe58 Add some checks to AutoGPTQ loader 2023-06-14 18:44:43 -03:00
FartyPants 56c19e623c
Add LORA name instead of "default" in PeftModel (#2689) 2023-06-14 18:29:42 -03:00
oobabooga 474dc7355a Allow API requests to use parameter presets 2023-06-14 11:32:20 -03:00
oobabooga e471919e6d Make llava/minigpt-4 work with AutoGPTQ 2023-06-11 17:56:01 -03:00
oobabooga f4defde752 Add a menu for installing extensions 2023-06-11 17:11:06 -03:00
oobabooga ac122832f7 Make dropdown menus more similar to automatic1111 2023-06-11 14:20:16 -03:00
oobabooga 6133675e0f
Add menus for saving presets/characters/instruction templates/prompts (#2621) 2023-06-11 12:19:18 -03:00
brandonj60 b04e18d10c
Add Mirostat v2 sampling to transformer models (#2571) 2023-06-09 21:26:31 -03:00
oobabooga 6015616338 Style changes 2023-06-06 13:06:05 -03:00
oobabooga f040073ef1 Handle the case of an older autogptq install 2023-06-06 13:05:05 -03:00
oobabooga bc58dc40bd Fix a minor bug 2023-06-06 12:57:13 -03:00
oobabooga 00b94847da Remove softprompt support 2023-06-06 07:42:23 -03:00
oobabooga 0aebc838a0 Don't save the history for 'None' character 2023-06-06 07:21:07 -03:00
oobabooga 9f215523e2 Remove some unused imports 2023-06-06 07:05:46 -03:00
oobabooga 0f0108ce34 Never load the history for default character 2023-06-06 07:00:11 -03:00
oobabooga 11f38b5c2b Add AutoGPTQ LoRA support 2023-06-05 23:32:57 -03:00
oobabooga 3a5cfe96f0 Increase chat_prompt_size_max 2023-06-05 17:37:37 -03:00