Commit graph

1031 commits

Author SHA1 Message Date
Φφ 1a1e420e65 Silero_tts streaming fix
Temporarily suppress the streaming during the audio response as it would interfere with the audio (making it stutter and play anew)
2023-03-25 21:33:30 +03:00
oobabooga 8c8e8b4450
Fix the early stopping callback #559 2023-03-25 12:35:52 -03:00
oobabooga a1f12d607f
Merge pull request #538 from Ph0rk0z/display-input-context
Add display of context when input was generated
2023-03-25 11:56:18 -03:00
oobabooga 70f9565f37
Update README.md 2023-03-25 02:35:30 -03:00
oobabooga 25be9698c7
Fix LoRA on mps 2023-03-25 01:18:32 -03:00
oobabooga 3da633a497
Merge pull request #529 from EyeDeck/main
Allow loading of .safetensors through GPTQ-for-LLaMa
2023-03-24 23:51:01 -03:00
oobabooga 9fa47c0eed
Revert GPTQ_loader.py (accident) 2023-03-24 19:57:12 -03:00
oobabooga a6bf54739c
Revert models.py (accident) 2023-03-24 19:56:45 -03:00
oobabooga 0a16224451
Update GPTQ_loader.py 2023-03-24 19:54:36 -03:00
oobabooga a80aa65986
Update models.py 2023-03-24 19:53:20 -03:00
oobabooga 507db0929d
Do not use empty user messages in chat mode
This allows the bot to send messages by clicking on Generate with empty inputs.
2023-03-24 17:22:22 -03:00
oobabooga 6e1b16c2aa
Update html_generator.py 2023-03-24 17:18:27 -03:00
oobabooga ffb0187e83
Update chat.py 2023-03-24 17:17:29 -03:00
oobabooga c14e598f14
Merge pull request #433 from mayaeary/fix/api-reload
Fix api extension duplicating
2023-03-24 16:56:10 -03:00
oobabooga bfe960731f
Merge branch 'main' into fix/api-reload 2023-03-24 16:54:41 -03:00
oobabooga 4a724ed22f
Reorder imports 2023-03-24 16:53:56 -03:00
oobabooga 8fad84abc2
Update extensions.py 2023-03-24 16:51:27 -03:00
oobabooga d8e950d6bd
Don't load the model twice when using --lora 2023-03-24 16:30:32 -03:00
oobabooga fd99995b01
Make the Stop button more consistent in chat mode 2023-03-24 15:59:27 -03:00
Forkoz b740c5b284
Add display of context when input was generated
Not sure if I did this right, but it does move with the conversation and seems to match the value.
2023-03-24 08:56:07 -05:00
oobabooga 4f5c2ce785
Fix chat_generation_attempts 2023-03-24 02:03:30 -03:00
oobabooga 04417b658b
Update README.md 2023-03-24 01:40:43 -03:00
oobabooga bb4cb22453
Download .pt files using download-model.py (for 4-bit models) 2023-03-24 00:49:04 -03:00
oobabooga 143b5b5edf
Mention one-click-bandaid in the README 2023-03-23 23:28:50 -03:00
EyeDeck dcfd866402 Allow loading of .safetensors through GPTQ-for-LLaMa 2023-03-23 21:31:34 -04:00
oobabooga 8747c74339
Another missing import 2023-03-23 22:19:01 -03:00
oobabooga 7078d168c3
Missing import 2023-03-23 22:16:08 -03:00
oobabooga d1327f99f9
Fix broken callbacks.py 2023-03-23 22:12:24 -03:00
oobabooga 9bdb3c784d
Minor fix 2023-03-23 22:02:40 -03:00
oobabooga b0abb327d8
Update LoRA.py 2023-03-23 22:02:09 -03:00
oobabooga bf22d16ebc
Clear cache while switching LoRAs 2023-03-23 21:56:26 -03:00
oobabooga 4578e88ffd
Stop the bot from talking for you in chat mode 2023-03-23 21:38:20 -03:00
oobabooga 9bf6ecf9e2
Fix LoRA device map (attempt) 2023-03-23 16:49:41 -03:00
oobabooga c5ebcc5f7e
Change the default names (#518)
* Update shared.py

* Update settings-template.json
2023-03-23 13:36:00 -03:00
oobabooga 29bd41d453
Fix LoRA in CPU mode 2023-03-23 01:05:13 -03:00
oobabooga eac27f4f55
Make LoRAs work in 16-bit mode 2023-03-23 00:55:33 -03:00
oobabooga bfa81e105e
Fix FlexGen streaming 2023-03-23 00:22:14 -03:00
oobabooga 7b6f85d327
Fix markdown headers in light mode 2023-03-23 00:13:34 -03:00
oobabooga de6a09dc7f
Properly separate the original prompt from the reply 2023-03-23 00:12:40 -03:00
oobabooga d5fc1bead7
Merge pull request #489 from Brawlence/ext-fixes
Extensions performance & memory optimisations
2023-03-22 16:10:59 -03:00
oobabooga bfb1be2820
Minor fix 2023-03-22 16:09:48 -03:00
oobabooga 0abff499e2
Use image.thumbnail 2023-03-22 16:03:05 -03:00
oobabooga 104212529f
Minor changes 2023-03-22 15:55:03 -03:00
wywywywy 61346b88ea
Add "seed" menu in the Parameters tab 2023-03-22 15:40:20 -03:00
Φφ 5389fce8e1 Extensions performance & memory optimisations
Reworked remove_surrounded_chars() to use a regular expression ( https://regexr.com/7alb5 ) instead of repeated string concatenations for elevenlab_tts, silero_tts, and sd_api_pictures. This should be both faster and more robust in handling asterisks.

Reduced the memory footprint of send_pictures and sd_api_pictures by scaling the images in the chat to at most 300 pixels on the longer side. (The user already has the original of any picture they sent, and there is an option to save the SD generation.)
This should stop the history from growing annoyingly large when multiple pictures are present.
2023-03-22 11:51:00 +03:00
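
For illustration, the regular-expression approach described in commit 5389fce8e1 could look roughly like the sketch below. The pattern and the function body are assumptions, not code copied from the commit. Only the function name remove_surrounded_chars() comes from the message above. The sketch drops any text between a pair of asterisks, and also drops an unclosed asterisk together with everything after it.

```python
import re

# Assumed pattern: an asterisk, any run of non-asterisk characters,
# then either a closing asterisk or the end of the string.
_SURROUNDED = re.compile(r"\*[^*]*(?:\*|$)")


def remove_surrounded_chars(text: str) -> str:
    """Strip *emphasised* spans (and an unclosed trailing span) in one pass."""
    return _SURROUNDED.sub("", text)


if __name__ == "__main__":
    print(repr(remove_surrounded_chars("Hello *waves* there")))
```

A single substitution scans the string once, whereas building the result by repeated concatenation copies the partial result on every step.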
oobabooga 45b7e53565
Only catch proper Exceptions in the text generation function 2023-03-20 20:36:02 -03:00
oobabooga 6872ffd976
Update README.md 2023-03-20 16:53:14 -03:00
oobabooga db4219a340
Update comments 2023-03-20 16:40:08 -03:00
oobabooga 7618f3fe8c
Add -gptq-preload for 4-bit offloading (#460)
This works on a 4GB card now:

```
python server.py --model llama-7b-hf --gptq-bits 4 --gptq-pre-layer 20
```
2023-03-20 16:30:56 -03:00
Vladimir Belitskiy e96687b1d6 Do not send empty user input as part of the prompt.
However, if extensions modify the empty prompt to be non-empty,
it'll still work as before.
2023-03-20 14:27:39 -04:00