Commit graph

249 commits

Author SHA1 Message Date
oobabooga 27f3a78834 Better detect when no model is loaded 2023-04-16 17:35:54 -03:00
oobabooga b937c9d8c2 Add skip_special_tokens checkbox for Dolly model (#1218) 2023-04-16 14:24:49 -03:00
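The commit above wires a UI checkbox to the `skip_special_tokens` decoding flag. A minimal sketch of the underlying Hugging Face call, using GPT-2's tokenizer purely for illustration (the Dolly model itself is not needed to show the flag's effect):

```python
from transformers import AutoTokenizer

# Sketch only: the checkbox flips the same flag when decoding the real model's output ids.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer.encode("### Response: Hi there!") + [tokenizer.eos_token_id]

print(tokenizer.decode(ids, skip_special_tokens=False))  # keeps markers such as <|endoftext|>
print(tokenizer.decode(ids, skip_special_tokens=True))   # strips them from the visible reply
```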
kernyan ac19d5101f revert incorrect eos_token_id change from #814 (#1261)
- fixes #1054
2023-04-16 01:47:01 -03:00
oobabooga a2127239de Fix a bug 2023-04-16 01:41:37 -03:00
oobabooga 9d3c6d2dc3 Fix a bug 2023-04-16 01:40:47 -03:00
Mikel Bober-Irizar 16a3a5b039 Merge pull request from GHSA-hv5m-3rp9-xcpf
* Remove eval of API input

* Remove unnecessary eval/exec for security

* Use ast.literal_eval

* Use ast.literal_eval

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 01:36:50 -03:00
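This security fix replaces `eval` on untrusted API input with `ast.literal_eval`. A minimal sketch of the difference (the payload shape is an assumption, not the project's actual API schema):

```python
import ast

untrusted = "{'max_new_tokens': 200, 'do_sample': True}"  # string received over the API

# eval(untrusted) would execute arbitrary Python, e.g. "__import__('os').system('...')".
# ast.literal_eval only accepts Python literals (dicts, lists, numbers, strings, ...)
# and raises ValueError/SyntaxError for anything else, so code injection is not possible.
params = ast.literal_eval(untrusted)
print(params["max_new_tokens"])  # 200
```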
oobabooga 8e31f2bad4 Automatically set wbits/groupsize/instruct based on model name (#1167) 2023-04-14 11:07:28 -03:00
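A hedged sketch of the idea of inferring quantization settings from a model folder's name; the regexes, defaults, and instruct-model names below are assumptions for illustration, not the project's actual rules:

```python
import re

def infer_settings(model_name: str) -> dict:
    # Hypothetical heuristics keyed off common naming patterns like "llama-13b-4bit-128g".
    settings = {"wbits": 0, "groupsize": -1, "instruct": False}
    name = model_name.lower()
    if match := re.search(r"(\d+)bit", name):
        settings["wbits"] = int(match.group(1))
    if match := re.search(r"(\d+)g\b", name):
        settings["groupsize"] = int(match.group(1))
    if "vicuna" in name or "alpaca" in name:  # assumed instruct-tuned model names
        settings["instruct"] = True
    return settings

print(infer_settings("llama-13b-4bit-128g"))  # {'wbits': 4, 'groupsize': 128, 'instruct': False}
```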
oobabooga 04866dc4fc Add a warning for when no model is loaded 2023-04-13 10:35:08 -03:00
oobabooga cacbcda208 Two new options: truncation length and ban eos token 2023-04-11 18:46:06 -03:00
catalpaaa 78bbc66fc4 allow custom stopping strings in all modes (#903) 2023-04-11 12:30:06 -03:00
oobabooga 0f212093a3 Refactor the UI
A single dictionary called 'interface_state' is now passed as input to all functions. The values are updated only when necessary.

The goal is to make it easier to add new elements to the UI.
2023-04-11 11:46:30 -03:00
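A rough sketch of the idea behind this refactor, using a Gradio-style handler; the field names and handler are illustrative assumptions rather than the project's actual code:

```python
# Hypothetical illustration: one dict gathers every UI value, and each handler
# receives that dict instead of a long list of individual components.
interface_state = {
    "max_new_tokens": 200,
    "temperature": 0.7,
    "mode": "chat",
}

def generate_reply(prompt, state):
    # Handlers read whatever settings they need from the shared state dict,
    # so adding a new UI element only means adding a new key.
    return f"(generating {state['max_new_tokens']} tokens at T={state['temperature']} for: {prompt})"

print(generate_reply("Hello", interface_state))
```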
Alex "mcmonkey" Goodwin 0caf718a21
add on-page documentation to parameters (#1008) 2023-04-10 17:19:12 -03:00
oobabooga bd04ff27ad Make the bos token optional 2023-04-10 16:44:22 -03:00
oobabooga 769aa900ea Print the used seed 2023-04-10 10:53:31 -03:00
Alex "mcmonkey" Goodwin 30befe492a fix random seeds to actually randomize
Without this fix, manual seeds get locked in.
2023-04-10 06:29:10 -07:00
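A hedged sketch of the pattern this fix restores: a sentinel seed value should draw a fresh random seed on every generation instead of reusing the last manual one. The -1 convention and the exact code are assumptions for illustration:

```python
import random
import torch

def set_manual_seed(seed: int) -> int:
    # Assumed convention: -1 means "pick a new random seed for this generation".
    # Without re-rolling here, a previously chosen seed would stay locked in.
    if seed == -1:
        seed = random.randint(1, 2**31)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    return seed  # returned so it can be printed for the user

print("Used seed:", set_manual_seed(-1))
```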
oobabooga cb169d0834 Minor formatting changes 2023-04-08 17:34:07 -03:00
Φφ ffd102e5c0 SD Api Pics extension, v.1.1 (#596) 2023-04-07 21:36:04 -03:00
oobabooga 6762e62a40 Simplifications 2023-04-07 11:14:32 -03:00
oobabooga ea6e77df72 Make the code more like PEP8 for readability (#862) 2023-04-07 00:15:45 -03:00
oobabooga 113f94b61e Bump transformers (16-bit llama must be reconverted/redownloaded) 2023-04-06 16:04:03 -03:00
oobabooga 3f3e42e26c Refactor several function calls and the API 2023-04-06 01:22:15 -03:00
oobabooga b0890a7925 Add shared.is_chat() function 2023-04-01 20:15:00 -03:00
oobabooga eeafd60713 Fix streaming 2023-03-31 19:05:38 -03:00
oobabooga 52065ae4cd Add repetition_penalty 2023-03-31 19:01:34 -03:00
oobabooga 0aee7341d8 Properly count tokens/s for llama.cpp in chat mode 2023-03-31 17:04:32 -03:00
oobabooga 09b0a3aafb Add repetition_penalty 2023-03-31 14:45:17 -03:00
oobabooga 9d1dcf880a General improvements 2023-03-31 14:27:01 -03:00
Thomas Antony a5f5736e74 Add to text_generation.py 2023-03-30 11:22:38 +01:00
oobabooga 1cb9246160 Adapt to the new model names 2023-03-29 21:47:36 -03:00
oobabooga 48a6c9513e Merge pull request #572 from clusterfudge/issues/571
Potential fix for issues/571
2023-03-27 14:06:38 -03:00
oobabooga af65c12900 Change Stop button behavior 2023-03-27 13:23:59 -03:00
Sean Fitzgerald 0bac80d9eb Potential fix for issues/571 2023-03-25 13:08:45 -07:00
Forkoz b740c5b284 Add display of context when input was generated
Not sure if I did this right, but it does move with the conversation and seems to match the value.
2023-03-24 08:56:07 -05:00
oobabooga 4578e88ffd Stop the bot from talking for you in chat mode 2023-03-23 21:38:20 -03:00
oobabooga bfa81e105e Fix FlexGen streaming 2023-03-23 00:22:14 -03:00
oobabooga de6a09dc7f Properly separate the original prompt from the reply 2023-03-23 00:12:40 -03:00
wywywywy 61346b88ea Add "seed" menu in the Parameters tab 2023-03-22 15:40:20 -03:00
oobabooga 45b7e53565 Only catch proper Exceptions in the text generation function 2023-03-20 20:36:02 -03:00
oobabooga 75a7a84ef2 Exception handling (#454)
* Update text_generation.py
* Update extensions.py
2023-03-20 13:36:52 -03:00
oobabooga ddb62470e9 --no-cache and --gpu-memory in MiB for fine VRAM control 2023-03-19 19:21:41 -03:00
oobabooga e26763a510 Minor changes 2023-03-17 22:56:46 -03:00
Wojtek Kowaluk 30939e2aee add mps support on apple silicon 2023-03-18 00:56:23 +01:00
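A minimal sketch of Apple Silicon (MPS) device selection in PyTorch; the fallback order shown is an assumption, not necessarily how this commit wires it up:

```python
import torch

# Prefer CUDA, then Apple's Metal Performance Shaders backend, then CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

x = torch.ones(2, 2, device=device)
print(device, x.sum().item())
```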
oobabooga a577fb1077 Keep GALACTICA special tokens (#300) 2023-03-16 00:46:59 -03:00
oobabooga cf2da86352 Prevent *Is typing* from disappearing instantly while streaming 2023-03-15 12:51:13 -03:00
oobabooga 9d6a625bd6 Add 'hallucinations' filter #326
This breaks the API since a new parameter has been added.
It should be a one-line fix. See api-example.py.
2023-03-15 11:10:35 -03:00
oobabooga afc5339510 Remove "eval" statements from text generation functions 2023-03-14 16:04:17 -03:00
oobabooga 0c224cf4f4 Fix GALACTICA (#285) 2023-03-13 10:32:28 -03:00
oobabooga b9e0712b92 Fix Open Assistant 2023-03-12 23:58:25 -03:00
oobabooga 1ddcd4d0ba Clean up silero_tts
This should only be used with --no-stream.

The shared.still_streaming implementation was faulty by design:
output_modifier should never be called when streaming is already over.
2023-03-12 23:42:49 -03:00
oobabooga c7aa51faa6 Use a list of eos_tokens instead of just a number
This might be the cause of LLaMA ramblings that some people have experienced.
2023-03-12 14:54:58 -03:00
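The commit above replaces a single end-of-sequence id with a list of them. The actual change may have gone through stopping criteria, but newer `transformers` releases also accept a list for `eos_token_id` directly in `generate`; a sketch with placeholder model and stop token:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in for the real model; the newline stop token mirrors chat-mode usage.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
newline_id = tokenizer.encode("\n")[0]

# Passing a list lets generation stop on whichever of these ids appears first.
output = model.generate(
    **inputs,
    max_new_tokens=50,
    eos_token_id=[tokenizer.eos_token_id, newline_id],
)
print(tokenizer.decode(output[0]))
```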
Xan b3e10e47c0 Fix merge conflict in text_generation
- Need to update `shared.still_streaming = False` before the final `yield formatted_outputs`, shifted the position of some yields.
2023-03-12 18:56:35 +11:00
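The note above is about flipping a flag before the final yield of a streaming generator. A toy sketch of why the ordering matters; the names mirror the commit message, the rest is illustrative:

```python
class Shared:
    still_streaming = True

shared = Shared()

def generate_stream(tokens):
    for i, _ in enumerate(tokens):
        shared.still_streaming = True
        yield tokens[: i + 1]
    # Flip the flag *before* the final yield so that consumers of the last
    # chunk (e.g. extensions) already see streaming marked as finished.
    shared.still_streaming = False
    yield tokens

for chunk in generate_stream(["Hel", "lo", "!"]):
    print(chunk, "still_streaming =", shared.still_streaming)
```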
oobabooga 341e135036 Various fixes in chat mode 2023-03-12 02:53:08 -03:00
oobabooga b0e8cb8c88 Various fixes in chat mode 2023-03-12 02:31:45 -03:00
oobabooga 0bd5430988 Use 'with' statement to better handle streaming memory 2023-03-12 02:04:28 -03:00
oobabooga 37f0166b2d Fix memory leak in new streaming (second attempt) 2023-03-11 23:14:49 -03:00
oobabooga 59b5f7a4b7 Improve usage of stopping_criteria 2023-03-08 12:13:40 -03:00
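For context, `transformers` exposes custom stop conditions through `StoppingCriteria`; a minimal sketch (the length-based condition is an invented example, not this commit's actual criterion):

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class MaxLengthReached(StoppingCriteria):
    """Example criterion: stop once the sequence grows past a given length."""

    def __init__(self, max_length: int):
        self.max_length = max_length

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        return input_ids.shape[1] >= self.max_length

stopping_criteria = StoppingCriteriaList([MaxLengthReached(max_length=64)])
# Then passed along as model.generate(..., stopping_criteria=stopping_criteria)
```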
oobabooga add9330e5e Bug fixes 2023-03-08 11:26:29 -03:00
Xan 5648a41a27 Merge branch 'main' of https://github.com/xanthousm/text-generation-webui 2023-03-08 22:08:54 +11:00
Xan ad6b699503 Better TTS with autoplay
- Adds "still_streaming" to shared module for extensions to know if generation is complete
- Changed TTS extension with new options:
   - Show text under the audio widget
   - Automatically play the audio once text generation finishes
   - Manage the generated wav files (only keep files for finished generations, optional max file limit)
   - [WIP] Ability to change voice pitch and speed
- Added 'tensorboard' to requirements, since Python raised "tensorboard not found" errors after a fresh installation.
2023-03-08 22:02:17 +11:00
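A rough, self-contained sketch of how an extension might use the new `still_streaming` flag inside its `output_modifier` hook to defer audio until generation finishes; the hook name comes from the log above, everything else (the stand-in shared object and TTS helper) is hypothetical:

```python
class shared:  # stand-in for the webui's shared module
    still_streaming = True

def synthesize_to_wav(text: str) -> str:
    # Placeholder for the real TTS call; just pretends to write a file.
    return "/tmp/reply.wav"

def output_modifier(text: str) -> str:
    # While tokens are still streaming, leave the text alone; once streaming
    # ends, attach the audio widget and autoplay the finished reply.
    if shared.still_streaming:
        return text
    return f'<audio src="{synthesize_to_wav(text)}" controls autoplay></audio>\n{text}'

shared.still_streaming = False
print(output_modifier("Hello there!"))
```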
oobabooga 33fb6aed74 Minor bug fix 2023-03-08 03:08:16 -03:00
oobabooga ad2970374a Readability improvements 2023-03-08 03:00:06 -03:00
oobabooga 72d539dbff Better separate the FlexGen case 2023-03-08 02:54:47 -03:00
oobabooga ab50f80542 New text streaming method (much faster) 2023-03-08 02:46:35 -03:00
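The commit message does not describe the mechanism, but one common way to turn a blocking, callback-based `model.generate` call into a token iterator is a background thread feeding a queue; a generic sketch under that assumption, not necessarily this project's implementation:

```python
import queue
import threading

def stream(blocking_generate):
    """Run a callback-based, blocking generation function in a thread and
    yield each token as it is produced."""
    q: "queue.Queue[str | None]" = queue.Queue()

    def worker():
        blocking_generate(callback=q.put)  # push every new token into the queue
        q.put(None)                        # sentinel: generation finished

    threading.Thread(target=worker, daemon=True).start()
    while (token := q.get()) is not None:
        yield token

# Toy stand-in for a generate function that reports tokens via a callback:
def fake_generate(callback):
    for tok in ["Hel", "lo", ", ", "world", "!"]:
        callback(tok)

print("".join(stream(fake_generate)))
```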
oobabooga 8e89bc596b Fix encode() for RWKV 2023-03-07 23:15:46 -03:00
oobabooga 19a34941ed Add proper streaming to RWKV 2023-03-07 18:17:56 -03:00
oobabooga 8660227e1b Add top_k to RWKV 2023-03-07 17:24:28 -03:00
oobabooga 20bd645f6a Fix bug in multigpu setups (attempt 3) 2023-03-06 15:58:18 -03:00
oobabooga 09a7c36e1b Minor improvement while running custom models 2023-03-06 15:36:35 -03:00
oobabooga 24c4c20391 Fix bug in multigpu setups (attempt #2) 2023-03-06 15:23:29 -03:00
oobabooga d88b7836c6 Fix bug in multigpu setups 2023-03-06 14:58:30 -03:00
oobabooga e91f4bc25a Add RWKV tokenizer 2023-03-06 08:45:49 -03:00
oobabooga a54b91af77 Improve readability 2023-03-05 10:21:15 -03:00
oobabooga 8e706df20e Fix a memory leak when text streaming is on 2023-03-05 10:12:43 -03:00
oobabooga c33715ad5b Move towards HF LLaMA implementation 2023-03-05 01:20:31 -03:00
oobabooga c93f1fa99b Count the tokens more conservatively 2023-03-04 03:10:21 -03:00
oobabooga 05e703b4a4 Print the performance information more reliably 2023-03-03 21:24:32 -03:00
oobabooga a345a2acd2 Add a tokenizer placeholder 2023-03-03 15:16:55 -03:00
oobabooga 5b354817f6 Make chat minimally work with LLaMA 2023-03-03 15:04:41 -03:00
oobabooga ea5c5eb3da Add LLaMA support 2023-03-03 14:39:14 -03:00
oobabooga 7bbe32f618 Don't return a value in an iterator function 2023-03-02 00:48:46 -03:00
oobabooga ff9f649c0c Remove some unused imports 2023-03-02 00:36:20 -03:00
oobabooga 955cf431e8 Minor consistency fix 2023-03-01 19:11:26 -03:00
oobabooga 831ac7ed3f Add top_p 2023-03-01 16:45:48 -03:00
oobabooga 7c4d5ca8cc Improve the text generation call a bit 2023-03-01 16:40:25 -03:00
oobabooga 0f6708c471 Sort the imports 2023-03-01 12:18:17 -03:00
oobabooga e735806c51 Add a generate() function for RWKV 2023-03-01 12:16:11 -03:00
oobabooga f871971de1 Trying to get the chat to work 2023-02-28 00:25:30 -03:00
oobabooga ebd698905c Add streaming to RWKV 2023-02-28 00:04:04 -03:00
oobabooga 70e522732c Move RWKV loader into a separate file 2023-02-27 23:50:16 -03:00
oobabooga ebc64a408c RWKV support prototype 2023-02-27 23:03:35 -03:00
oobabooga 6e843a11d6 Fix FlexGen in chat mode 2023-02-26 00:36:04 -03:00
oobabooga fa58fd5559 Proper way to free the cuda cache 2023-02-25 15:50:29 -03:00
oobabooga 700311ce40 Empty the cuda cache at model.generate() 2023-02-25 14:39:13 -03:00
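These two commits concern releasing GPU memory around generation; the standard PyTorch pattern is shown below as a sketch, not necessarily the project's exact code:

```python
import gc
import torch

def clear_torch_cache():
    # Drop Python-side references first, then ask PyTorch to release
    # cached CUDA blocks back to the driver.
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

clear_torch_cache()
```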
oobabooga 78ad55641b Remove duplicate max_new_tokens parameter 2023-02-24 17:19:42 -03:00
oobabooga 65326b545a Move all gradio elements to shared (so that extensions can use them) 2023-02-24 16:46:50 -03:00
oobabooga 9ae063e42b Fix softprompts when deepspeed is active (#112) 2023-02-23 20:22:47 -03:00
oobabooga 7224343a70 Improve the imports 2023-02-23 14:41:42 -03:00
oobabooga 1dacd34165 Further refactor 2023-02-23 13:28:30 -03:00
oobabooga ce7feb3641 Further refactor 2023-02-23 13:03:52 -03:00
Renamed from modules/prompt.py