Commit graph

44 commits

Author SHA1 Message Date
oobabooga d423021a48
Remove CTransformers support (#5807) 2024-04-04 20:23:58 -03:00
oobabooga 9ab7365b56 Read rope_theta for DBRX model (thanks turboderp) 2024-04-01 20:25:31 -07:00
oobabooga db5f6cd1d8 Fix ExLlamaV2 loaders using unnecessary "bits" metadata 2024-03-30 21:51:39 -07:00
oobabooga 624faa1438 Fix ExLlamaV2 context length setting (closes #5750) 2024-03-30 21:33:16 -07:00
oobabooga 4039999be5 Autodetect llamacpp_HF loader when tokenizer exists 2024-02-16 09:29:26 -08:00
oobabooga 76d28eaa9e
Add a menu for customizing the instruction template for the model (#5521) 2024-02-16 14:21:17 -03:00
oobabooga 2a1063eff5 Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478.
2024-02-06 06:21:36 -08:00
oobabooga cde000d478
Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
oobabooga 2734ce3e4c
Remove RWKV loader (#5130) 2023-12-31 02:01:40 -03:00
oobabooga 0e54a09bcb
Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
B611 b7dd1f9542
Specify utf-8 encoding for model metadata file open (#5125) 2023-12-31 01:34:32 -03:00
Water 674be9a09a
Add HQQ quant loader (#4888)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga f0d6ead877
llama.cpp: read instruction template from GGUF metadata (#4975) 2023-12-18 01:51:58 -03:00
oobabooga 39d2fe1ed9
Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga 98361af4d5
Add QuIP# support (#4803)
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga e05d8fd441 Style changes 2023-11-15 15:51:37 -08:00
oobabooga 09f807af83 Use ExLlama_HF for GPTQ models by default 2023-10-21 20:45:38 -07:00
oobabooga e14bde4946 Minor improvements to evaluation logs 2023-10-15 20:51:43 -07:00
oobabooga 9fab9a1ca6 Minor fix 2023-10-10 14:08:11 -07:00
oobabooga a49cc69a4a Ignore rope_freq_base if value is 10000 2023-10-10 13:57:40 -07:00
oobabooga 7ffb424c7b Add AutoAWQ to README 2023-10-05 09:22:37 -07:00
cal066 cc632c3f33
AutoAWQ: initial support (#3999) 2023-10-05 13:19:18 -03:00
oobabooga 96da2e1c0d Read more metadata (config.json & quantize_config.json) 2023-09-29 06:14:16 -07:00
oobabooga f8e9733412 Minor syntax change 2023-09-28 19:32:35 -07:00
oobabooga 1dd13e4643 Read Transformers config.json metadata 2023-09-28 19:19:47 -07:00
oobabooga 7a3ca2c68f Better detect EXL2 models 2023-09-23 13:05:55 -07:00
oobabooga 5075087461 Fix command-line arguments being ignored 2023-09-19 13:11:46 -07:00
oobabooga 3d1c0f173d User config precedence over GGUF metadata 2023-09-14 12:15:52 -07:00
Gennadij 460c40d8ab
Read more GGUF metadata (scale_linear and freq_base) (#3877) 2023-09-12 17:02:42 -03:00
oobabooga 16e1696071 Minor qol change 2023-09-12 10:44:26 -07:00
oobabooga 9331ab4798
Read GGUF metadata (#3873) 2023-09-11 18:49:30 -03:00
oobabooga ed86878f02 Remove GGML support 2023-09-11 07:44:00 -07:00
jllllll 4a999e3bcd
Use separate llama-cpp-python packages for GGML support 2023-08-26 10:40:08 -05:00
oobabooga 83640d6f43 Replace ggml occurences with gguf 2023-08-26 01:06:59 -07:00
oobabooga d6934bc7bc
Implement CFG for ExLlama_HF (#3666) 2023-08-24 16:27:36 -03:00
oobabooga 65aa11890f
Refactor everything (#3481) 2023-08-06 21:49:27 -03:00
oobabooga 959feba602 When saving model settings, only save the settings for the current loader 2023-08-01 06:10:09 -07:00
oobabooga 75c2dd38cf Remove flexgen support 2023-07-25 15:15:29 -07:00
oobabooga 27a84b4e04 Make AutoGPTQ the default again
Purely for compatibility with more models.
You should still use ExLlama_HF for LLaMA models.
2023-07-15 22:29:23 -07:00
oobabooga b284f2407d Make ExLlama_HF the new default for GPTQ 2023-07-14 14:03:56 -07:00
Salvador E. Tropea 324e45b848
[Fixed] wbits and groupsize values from model not shown (#2977) 2023-07-11 23:27:38 -03:00
oobabooga 9290c6236f Keep ExLlama_HF if already selected 2023-06-25 19:06:28 -03:00
oobabooga 9f40032d32
Add ExLlama support (#2444) 2023-06-16 20:35:38 -03:00
oobabooga 7ef6a50e84
Reorganize model loading UI completely (#2720) 2023-06-16 19:00:37 -03:00