Commit graph

30 commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| oobabooga | 65aa11890f | Refactor everything (#3481) | 2023-08-06 21:49:27 -03:00 |
| Pete | f4005164f4 | Fix llama.cpp truncation (#3400) (Co-authored-by: oobabooga) | 2023-08-03 20:01:15 -03:00 |
| oobabooga | 87dab03dc0 | Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) | 2023-08-03 11:00:36 -03:00 |
| oobabooga | b17893a58f | Revert "Add tensor split support for llama.cpp (#3171)" — reverts commit 031fe7225e | 2023-07-26 07:06:01 -07:00 |
| Shouyi | 031fe7225e | Add tensor split support for llama.cpp (#3171) | 2023-07-25 18:59:26 -03:00 |
| oobabooga | a07d070b6c | Add llama-2-70b GGML support (#3285) | 2023-07-24 16:37:03 -03:00 |
| jllllll | 1141987a0d | Add checks for ROCm and unsupported architectures to llama_cpp_cuda loading (#3225) | 2023-07-24 11:25:36 -03:00 |
| oobabooga | 4b19b74e6c | Add CUDA wheels for llama-cpp-python by jllllll | 2023-07-19 19:33:43 -07:00 |
| randoentity | a69955377a | [GGML] Support for customizable RoPE (#3083) (Co-authored-by: oobabooga) | 2023-07-17 22:32:37 -03:00 |
| Gabriel Pena | eedb3bf023 | Add low vram mode on llama cpp (#3076) | 2023-07-12 11:05:13 -03:00 |
| oobabooga | b6643e5039 | Add decode functions to llama.cpp/exllama | 2023-07-07 09:11:30 -07:00 |
| EugeoSynthesisThirtyTwo | 7625c6de89 | fix usage of self in classmethod (#2781) | 2023-06-20 16:18:42 -03:00 |
| Cebtenzzre | 59e7ecb198 | llama.cpp: implement ban_eos_token via logits_processor (#2765) | 2023-06-19 21:31:19 -03:00 |
| oobabooga | 05a743d6ad | Make llama.cpp use tfs parameter | 2023-06-17 19:08:25 -03:00 |
| oobabooga | 9f40032d32 | Add ExLlama support (#2444) | 2023-06-16 20:35:38 -03:00 |
| oobabooga | 6015616338 | Style changes | 2023-06-06 13:06:05 -03:00 |
| DGdev91 | cf088566f8 | Make llama.cpp read prompt size and seed from settings (#2299) | 2023-05-25 10:29:31 -03:00 |
| oobabooga | c0fd7f3257 | Add mirostat parameters for llama.cpp (#2287) | 2023-05-22 19:37:24 -03:00 |
| oobabooga | e116d31180 | Prevent unwanted log messages from modules | 2023-05-21 22:42:34 -03:00 |
| Andrei | e657dd342d | Add in-memory cache support for llama.cpp (#1936) | 2023-05-15 20:19:55 -03:00 |
| Jakub Strnad | 0227e738ed | Add settings UI for llama.cpp and fixed reloading of llama.cpp models (#2087) | 2023-05-15 19:51:23 -03:00 |
| AlphaAtlas | 071f0776ad | Add llama.cpp GPU offload option (#2060) | 2023-05-14 22:58:11 -03:00 |
| Ahmed Said | fbcd32988e | added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649) (Co-authored-by: oobabooga) | 2023-05-02 18:25:28 -03:00 |
| oobabooga | ea6e77df72 | Make the code more like PEP8 for readability (#862) | 2023-04-07 00:15:45 -03:00 |
| oobabooga | 2c52310642 | Add --threads flag for llama.cpp | 2023-03-31 21:18:05 -03:00 |
| oobabooga | 52065ae4cd | Add repetition_penalty | 2023-03-31 19:01:34 -03:00 |
| oobabooga | 2259143fec | Fix llama.cpp with --no-stream | 2023-03-31 18:43:45 -03:00 |
| oobabooga | 9d1dcf880a | General improvements | 2023-03-31 14:27:01 -03:00 |
| Thomas Antony | 7fa5d96c22 | Update to use new llamacpp API | 2023-03-30 11:23:05 +01:00 |
| Thomas Antony | 7a562481fa | Initial version of llamacpp_model.py | 2023-03-30 11:22:07 +01:00 |