Commit graph

13 commits

Author SHA1 Message Date
oobabooga a156ebbf76 Lint 2025-10-15 13:15:01 -07:00
oobabooga c871d9cdbd Revert "Same as 7f06aec3a1 but for exllamav3_hf"
This reverts commit deb37b821b.
2025-10-15 13:05:41 -07:00
oobabooga deb37b821b Same as 7f06aec3a1 but for exllamav3_hf 2025-10-09 13:02:38 -07:00
oobabooga 9e9ab39892 Make exllamav3_hf and exllamav2_hf functional again 2025-09-17 12:29:22 -07:00
oobabooga 1972479610 Add the TP option to exllamav3_HF 2025-08-19 06:48:22 -07:00
oobabooga 219f0a7731 Fix exllamav3_hf models failing to unload (closes #7031) 2025-05-30 12:05:49 -07:00
oobabooga f3da45f65d ExLlamaV3_HF: Change max_chunk_size to 256 2025-05-04 20:37:15 -07:00
oobabooga ee0592473c Fix ExLlamaV3_HF leaking memory (attempt) 2025-04-27 21:04:02 -07:00
oobabooga d4017fbb6d
ExLlamaV3: Add kv cache quantization (#6903) 2025-04-25 21:32:00 -03:00
oobabooga d4b1e31c49 Use --ctx-size to specify the context size for all loaders
Old flags are still recognized as alternatives.
2025-04-25 16:59:03 -07:00
oobabooga b3bf7a885d Fix ExLlamaV2_HF and ExLlamaV3_HF after ae02ffc605 2025-04-20 11:32:48 -07:00
oobabooga 1c4a2c9a71 Make exllamav3 safer as well 2025-04-18 06:17:58 -07:00
oobabooga 8b8d39ec4e
Add ExLlamaV3 support (#6832) 2025-04-09 00:07:08 -03:00