Commit graph

3090 commits

Author SHA1 Message Date
oobabooga 4b84e45116 Use +cpuavx2 instead of +cpuavx 2023-11-20 11:46:38 -08:00
oobabooga d7f1bc102b Fix "Illegal instruction" bug in llama.cpp CPU only version (#4677) 2023-11-20 16:36:38 -03:00
drew9781 5e70263e25 docker: install xformers with specific cuda version, matching the docker image. (#4670) 2023-11-19 21:43:15 -03:00
oobabooga f0d66cf817 Add missing file 2023-11-19 10:12:13 -08:00
oobabooga a2e6d00128 Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga 8cf05c1b31 Fix disappearing character gallery 2023-11-19 08:31:01 -08:00
oobabooga 9da7bb203d Minor LoRA bug fix 2023-11-19 07:59:29 -08:00
oobabooga 78af3b0a00 Update docs/What Works.md 2023-11-19 07:57:16 -08:00
oobabooga a6f1e1bcc5 Fix PEFT LoRA unloading 2023-11-19 07:55:25 -08:00
oobabooga a290d17386 Add hover cursor to bot pfp 2023-11-19 06:56:42 -08:00
oobabooga ab94f0d9bf Minor style change 2023-11-18 21:11:04 -08:00
oobabooga 5fcee696ea New feature: enlarge character pictures on click (#4654) 2023-11-19 02:05:17 -03:00
Jordan Tucker cb836dd49c fix: use shared chat-instruct_command with api (#4653) 2023-11-19 01:19:10 -03:00
oobabooga 771e62e476 Add /v1/internal/lora endpoints (#4652) 2023-11-19 00:35:22 -03:00
oobabooga ef6feedeb2 Add --nowebui flag for pure API mode (#4651) 2023-11-18 23:38:39 -03:00
oobabooga 0fa1af296c Add /v1/internal/logits endpoint (#4650) 2023-11-18 23:19:31 -03:00
oobabooga 8f4f4daf8b Add --admin-key flag for API (#4649) 2023-11-18 22:33:27 -03:00
wizd af76fbedb8 Openai embedding fix to support jina-embeddings-v2 (#4642) 2023-11-18 20:24:29 -03:00
Jordan Tucker baab894759 fix: use system message in chat-instruct mode (#4648) 2023-11-18 20:20:13 -03:00
oobabooga 47d9e2618b Refresh the Preset menu after saving a preset 2023-11-18 14:03:42 -08:00
oobabooga 83b64e7fc1 New feature: "random preset" button (#4647) 2023-11-18 18:31:41 -03:00
oobabooga d1a58da52f Update ancient Docker instructions 2023-11-17 19:52:53 -08:00
oobabooga e0ca49ed9c Bump llama-cpp-python to 0.2.18 (2nd attempt) (#4637)
* Update requirements*.txt

* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga 9d6f79db74 Revert "Bump llama-cpp-python to 0.2.18 (#4611)"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga e0a7cc5e0f Simplify CORS code 2023-11-16 20:11:55 -08:00
oobabooga 13dc3b61da Update README 2023-11-16 19:57:55 -08:00
oobabooga 8b66d83aa9 Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga b2ce8dc7ee Update a message 2023-11-16 18:46:26 -08:00
oobabooga 780b00e1cf Minor bug fix 2023-11-16 18:39:39 -08:00
oobabooga c0233bb9d3 Minor message change 2023-11-16 18:36:57 -08:00
oobabooga 94b7177174 Update docs/07 - Extensions 2023-11-16 18:24:46 -08:00
oobabooga 6525707a7f Fix "send instruction template to..." buttons (closes #4625) 2023-11-16 18:16:42 -08:00
oobabooga 510a01ef46 Lint 2023-11-16 18:03:06 -08:00
oobabooga 923c8e25fb Bump llama-cpp-python to 0.2.18 (#4611) 2023-11-16 22:55:14 -03:00
Casper 61f429563e Bump AutoAWQ to 0.1.7 (#4620) 2023-11-16 17:08:08 -03:00
oobabooga e7d460d932 Make sure that API requirements are installed 2023-11-16 10:08:41 -08:00
oobabooga cbf2b47476 Strip trailing "\" characters in CMD_FLAGS.txt 2023-11-16 09:33:36 -08:00
oobabooga 58c6001be9 Add missing exllamav2 samplers 2023-11-16 07:09:40 -08:00
oobabooga cd41f8912b Warn users about n_ctx / max_seq_len 2023-11-15 18:56:42 -08:00
oobabooga a475aa7816 Improve API documentation 2023-11-15 18:39:08 -08:00
oobabooga 9be48e83a9 Start API when "api" checkbox is checked 2023-11-15 16:35:47 -08:00
oobabooga a85ce5f055 Add more info messages for truncation / instruction template 2023-11-15 16:20:31 -08:00
oobabooga 883701bc40 Alternative solution to 025da386a0
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga 8ac942813c Revert "Fix CPU memory limit error (issue #3763) (#4597)"
This reverts commit 025da386a0.
2023-11-15 16:01:54 -08:00
oobabooga e6f44d6d19 Print context length / instruction template to terminal when loading models 2023-11-15 16:00:51 -08:00
oobabooga e05d8fd441 Style changes 2023-11-15 15:51:37 -08:00
oobabooga be125e2708 Add /v1/internal/model/unload endpoint 2023-11-15 15:48:33 -08:00
David Nielson 564d0cde82 Use standard hyphens in filenames (#4576) 2023-11-15 20:29:00 -03:00
Andy Bao 025da386a0 Fix CPU memory limit error (issue #3763) (#4597)
get_max_memory_dict() was not properly formatting shared.args.cpu_memory

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
Anton Rogozin 8a9d5a0cea update AutoGPTQ to higher version for lora applying error fixing (#4604) 2023-11-15 20:23:22 -03:00