Commit graph

3586 commits

Author SHA1 Message Date
Bartowski 9ad116a6e2 Add config for hyperion and hercules models to use chatml (#5742) 2024-03-26 16:35:29 -03:00
wldhx 7cbafc0540 docker: Remove obsolete CLI_ARGS variable (#5726) 2024-03-26 16:34:53 -03:00
Yiximail bdcf31035f Set a default empty string for user_bio to fix #5717 issue (#5722) 2024-03-26 16:34:03 -03:00
Yiximail 8c9aca239a Fix prompt incorrectly set to empty when suffix is empty string (#5757) 2024-03-26 16:33:09 -03:00
oobabooga 2a92a842ce Bump gradio to 4.23 (#5758) 2024-03-26 16:32:20 -03:00
oobabooga 49b111e2dd Lint 2024-03-17 08:33:23 -07:00
oobabooga d890c99b53 Fix StreamingLLM when content is removed from the beginning of the prompt 2024-03-14 09:18:54 -07:00
oobabooga d828844a6f Small fix: don't save truncation_length to settings.yaml
It should derive from model metadata or from a command-line flag.
2024-03-14 08:56:28 -07:00
oobabooga 2ef5490a36 UI: make light theme less blinding 2024-03-13 08:23:16 -07:00
oobabooga 40a60e0297 Convert attention_sink_size to int (closes #5696) 2024-03-13 08:15:49 -07:00
oobabooga edec3bf3b0 UI: avoid caching convert_to_markdown calls during streaming 2024-03-13 08:14:34 -07:00
oobabooga 8152152dd6 Small fix after 28076928ac 2024-03-11 19:56:35 -07:00
oobabooga 28076928ac UI: Add a new "User description" field for user personality/biography (#5691) 2024-03-11 23:41:57 -03:00
oobabooga 63701f59cf UI: mention that n_gpu_layers > 0 is necessary for the GPU to be used 2024-03-11 18:54:15 -07:00
oobabooga 46031407b5 Increase the cache size of convert_to_markdown to 4096 2024-03-11 18:43:04 -07:00
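The two commits above (caching convert_to_markdown calls and raising the cache size to 4096) can be sketched with a standard memoization decorator. This is a hypothetical illustration, not the project's actual implementation: the function body here is a stand-in for the real markdown-to-HTML conversion.

```python
from functools import lru_cache


# Hypothetical sketch: memoize an expensive text-conversion step so that
# repeated calls with the same input (common during token streaming, where
# earlier messages do not change) are served from the cache.
@lru_cache(maxsize=4096)
def convert_to_markdown(text: str) -> str:
    # Placeholder for the real conversion logic.
    return text.replace("**", "")


convert_to_markdown("hello **world**")
convert_to_markdown("hello **world**")  # second call hits the cache
print(convert_to_markdown.cache_info().hits)
```

A larger `maxsize` trades memory for fewer recomputations across long chat histories; `lru_cache` requires hashable arguments, so the cached function must take plain strings rather than mutable state.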
oobabooga 9eca197409 Minor logging change 2024-03-11 16:31:13 -07:00
oobabooga afadc787d7 Optimize the UI by caching convert_to_markdown calls 2024-03-10 20:10:07 -07:00
oobabooga 056717923f Document StreamingLLM 2024-03-10 19:15:23 -07:00
oobabooga 15d90d9bd5 Minor logging change 2024-03-10 18:20:50 -07:00
oobabooga abcdd0ad5b API: don't use settings.yaml for default values 2024-03-10 16:15:52 -07:00
oobabooga a102c704f5 Add numba to requirements.txt 2024-03-10 16:13:29 -07:00
oobabooga b3ade5832b Keep AQLM only for Linux (fails to install on Windows) 2024-03-10 09:41:17 -07:00
oobabooga 67b24b0b88 Bump llama-cpp-python to 0.2.56 2024-03-10 09:07:27 -07:00
oobabooga 763f9beb7e Bump bitsandbytes to 0.43, add official Windows wheel 2024-03-10 08:30:53 -07:00
oobabooga 52a34921ef Installer: validate the checksum for the miniconda installer on Windows 2024-03-09 16:33:12 -08:00
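Checksum validation like the installer change above can be sketched as follows. This is a generic illustration, not the installer's actual code; the file path and expected digest are placeholders supplied by the caller.

```python
import hashlib


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Return True if the file at `path` has the expected SHA-256 digest.

    Reading in chunks keeps memory use constant for large installers.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

Aborting the install when the check fails protects against truncated or tampered downloads before the installer is executed.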
oobabooga cf0697936a Optimize StreamingLLM by over 10x 2024-03-08 21:48:28 -08:00
oobabooga afb51bd5d6 Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
oobabooga 9271e80914 Add back AutoAWQ for Windows
https://github.com/casper-hansen/AutoAWQ/issues/377#issuecomment-1986440695
2024-03-08 14:54:56 -08:00
oobabooga 549bb88975 Increase height of "Custom stopping strings" UI field 2024-03-08 12:54:30 -08:00
oobabooga 238f69accc Move "Command for chat-instruct mode" to the main chat tab (closes #5634) 2024-03-08 12:52:52 -08:00
oobabooga d0663bae31 Bump AutoAWQ to 0.2.3 (Linux only) (#5658) 2024-03-08 17:36:28 -03:00
oobabooga 0e6eb7c27a Add AQLM support (transformers loader) (#5466) 2024-03-08 17:30:36 -03:00
oobabooga 2681f6f640 Make superbooga & superboogav2 functional again (#5656) 2024-03-07 15:03:18 -03:00
oobabooga bae14c8f13 Right-truncate long chat completion prompts instead of left-truncating
Instructions are usually at the beginning of the prompt.
2024-03-07 08:50:24 -08:00
Bartowski 104573f7d4 Update cache_4bit documentation (#5649)
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-03-07 13:08:21 -03:00
oobabooga bef08129bc Small fix for cuda 11.8 in the one-click installer 2024-03-06 21:43:36 -08:00
oobabooga 303433001f Fix a check in the installer 2024-03-06 21:13:54 -08:00
oobabooga bde7f00cae Change the exllamav2 version number 2024-03-06 21:08:29 -08:00
oobabooga 2ec1d96c91 Add cache_4bit option for ExLlamaV2 (#5645) 2024-03-06 23:02:25 -03:00
oobabooga fa0e68cefd Installer: add back INSTALL_EXTENSIONS environment variable (for docker) 2024-03-06 11:31:06 -08:00
oobabooga fcc92caa30 Installer: add option to install requirements for just one extension 2024-03-06 07:36:23 -08:00
oobabooga 2174958362 Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga 7eee9e9470 Add -k to curl command to download miniconda on windows (closes #5628) 2024-03-06 06:46:50 -08:00
oobabooga 03f03af535 Revert "Update peft requirement from ==0.8.* to ==0.9.* (#5626)"
This reverts commit 72a498ddd4.
2024-03-05 02:56:37 -08:00
oobabooga d61e31e182 Save the extensions after Gradio 4 (#5632) 2024-03-05 07:54:34 -03:00
oobabooga ae12d045ea Merge remote-tracking branch 'refs/remotes/origin/dev' into dev 2024-03-05 02:35:04 -08:00
dependabot[bot] 72a498ddd4 Update peft requirement from ==0.8.* to ==0.9.* (#5626) 2024-03-05 07:34:32 -03:00
oobabooga 1437f757a1 Bump HQQ to 0.1.5 2024-03-05 02:33:51 -08:00
oobabooga 63a1d4afc8 Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
oobabooga 164ff2440d Use the correct PyTorch in the Colab notebook 2024-03-05 01:05:19 -08:00