Commit graph

1890 commits

Author SHA1 Message Date
oobabooga
9e7b326e34 Lint 2025-08-19 06:50:40 -07:00
oobabooga
1972479610 Add the TP option to exllamav3_HF 2025-08-19 06:48:22 -07:00
oobabooga
e0f5905a97 Code formatting 2025-08-19 06:34:05 -07:00
oobabooga
5b06284a8a UI: Keep ExLlamav3_HF selected if already selected for EXL3 models 2025-08-19 06:23:21 -07:00
oobabooga
cbba58bef9 UI: Fix code blocks having an extra empty line 2025-08-18 15:50:09 -07:00
oobabooga
7d23a55901 Fix model unloading when switching loaders (closes #7203) 2025-08-18 09:05:47 -07:00
oobabooga
64eba9576c mtmd: Fix a bug when "include past attachments" is unchecked 2025-08-17 14:08:40 -07:00
oobabooga
dbabe67e77 ExLlamaV3: Enable the --enable-tp option, add a --tp-backend option 2025-08-17 13:19:11 -07:00
oobabooga
d771ca4a13 Fix web search (attempt) 2025-08-14 12:05:14 -07:00
altoiddealer
57f6e9af5a Set multimodal status during Model Loading (#7199) 2025-08-13 16:47:27 -03:00
oobabooga
41b95e9ec3 Lint 2025-08-12 13:37:37 -07:00
oobabooga
7301452b41 UI: Minor info message change 2025-08-12 13:23:24 -07:00
oobabooga
8d7b88106a Revert "mtmd: Fail early if images are provided but the model doesn't support them (llama.cpp)" This reverts commit d8fcc71616. 2025-08-12 13:20:16 -07:00
oobabooga
2238302b49 ExLlamaV3: Add speculative decoding 2025-08-12 08:50:45 -07:00
oobabooga
d8fcc71616 mtmd: Fail early if images are provided but the model doesn't support them (llama.cpp) 2025-08-11 18:02:33 -07:00
oobabooga
e6447cd24a mtmd: Update the llama-server request 2025-08-11 17:42:35 -07:00
oobabooga
0e3def449a llama.cpp: --swa-full to llama-server when streaming-llm is checked 2025-08-11 15:17:25 -07:00
oobabooga
0e88a621fd UI: Better organize the right sidebar 2025-08-11 15:16:03 -07:00
oobabooga
a78ca6ffcd Remove a comment 2025-08-11 12:33:38 -07:00
oobabooga
999471256c Lint 2025-08-11 12:32:17 -07:00
oobabooga
b62c8845f3 mtmd: Fix /chat/completions for llama.cpp 2025-08-11 12:01:59 -07:00
oobabooga
38c0b4a1ad Default ctx-size to 8192 when not found in the metadata 2025-08-11 07:39:53 -07:00
oobabooga
52d1cbbbe9 Fix an import 2025-08-11 07:38:39 -07:00
oobabooga
4809ddfeb8 Exllamav3: small sampler fixes 2025-08-11 07:35:22 -07:00
oobabooga
4d8dbbab64 API: Fix sampler_priority usage for ExLlamaV3 2025-08-11 07:26:11 -07:00
oobabooga
0ea62d88f6 mtmd: Fix "continue" when an image is present 2025-08-09 21:47:02 -07:00
oobabooga
2f90ac9880 Move the new image_utils.py file to modules/ 2025-08-09 21:41:38 -07:00
oobabooga
c6b4d1e87f Fix the exllamav2 loader ignoring add_bos 2025-08-09 21:34:35 -07:00
oobabooga
d86b0ec010 Add multimodal support (llama.cpp) (#7027) 2025-08-10 01:27:25 -03:00
oobabooga
a289a92b94 Fix exllamav3 token count 2025-08-09 17:10:58 -07:00
oobabooga
d489eb589a Attempt at fixing new exllamav3 loader undefined behavior when switching conversations 2025-08-09 14:11:31 -07:00
oobabooga
a6d6bee88c Change a comment 2025-08-09 07:51:03 -07:00
oobabooga
2fe79a93cc mtmd: Handle another case after 3f5ec9644f 2025-08-09 07:50:24 -07:00
oobabooga
59c6138e98 Remove a log message 2025-08-09 07:32:15 -07:00
oobabooga
f396b82a4f mtmd: Better way to detect if an EXL3 model is multimodal 2025-08-09 07:31:36 -07:00
oobabooga
fa9be444fa Use ExLlamav3 instead of ExLlamav3_HF by default for EXL3 models 2025-08-09 07:26:59 -07:00
oobabooga
3f5ec9644f mtmd: Place the image <__media__> at the top of the prompt 2025-08-09 07:06:07 -07:00
oobabooga
1168004067 Minor change 2025-08-09 07:01:55 -07:00
oobabooga
9e260332cc Remove some unnecessary code 2025-08-08 21:22:47 -07:00
oobabooga
544c3a7c9f Polish the new exllamav3 loader 2025-08-08 21:15:53 -07:00
oobabooga
8fcadff8d3 mtmd: Use the base64 attachment for the UI preview instead of the file 2025-08-08 20:13:54 -07:00
oobabooga
6e9de75727 Support loading chat templates from chat_template.json files 2025-08-08 19:35:09 -07:00
Katehuuh
88127f46c1 Add multimodal support (ExLlamaV3) (#7174) 2025-08-08 23:31:16 -03:00
oobabooga
b391ac8eb1 Fix getting the ctx-size for EXL3/EXL2/Transformers models 2025-08-08 18:11:45 -07:00
oobabooga
3e24f455c8 Fix continue for GPT-OSS (hopefully the final fix) 2025-08-06 10:18:42 -07:00
oobabooga
0c1403f2c7 Handle GPT-OSS as a special case when continuing 2025-08-06 08:05:37 -07:00
oobabooga
6ce4b353c4 Fix the GPT-OSS template 2025-08-06 07:12:39 -07:00
oobabooga
7c82d65a9d Handle GPT-OSS as a special template case 2025-08-05 18:05:09 -07:00
oobabooga
fbea21a1f1 Only use enable_thinking if the template supports it 2025-08-05 17:33:27 -07:00
oobabooga
bfbbfc2361 Ignore add_generation_prompt in GPT-OSS 2025-08-05 17:33:01 -07:00