text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2026-04-06 23:23:43 +00:00

Author	SHA1	Message	Date
oobabooga	d86b0ec010	Add multimodal support (llama.cpp) (#7027 )	2025-08-10 01:27:25 -03:00
oobabooga	a289a92b94	Fix exllamav3 token count	2025-08-09 17:10:58 -07:00
oobabooga	d489eb589a	Attempt at fixing new exllamav3 loader undefined behavior when switching conversations	2025-08-09 14:11:31 -07:00
oobabooga	a6d6bee88c	Change a comment	2025-08-09 07:51:03 -07:00
oobabooga	2fe79a93cc	mtmd: Handle another case after `3f5ec9644f`	2025-08-09 07:50:24 -07:00
oobabooga	59c6138e98	Remove a log message	2025-08-09 07:32:15 -07:00
oobabooga	f396b82a4f	mtmd: Better way to detect if an EXL3 model is multimodal	2025-08-09 07:31:36 -07:00
oobabooga	fa9be444fa	Use ExLlamav3 instead of ExLlamav3_HF by default for EXL3 models	2025-08-09 07:26:59 -07:00
oobabooga	3f5ec9644f	mtmd: Place the image <__media__> at the top of the prompt	2025-08-09 07:06:07 -07:00
oobabooga	1168004067	Minor change	2025-08-09 07:01:55 -07:00
oobabooga	9e260332cc	Remove some unnecessary code	2025-08-08 21:22:47 -07:00
oobabooga	544c3a7c9f	Polish the new exllamav3 loader	2025-08-08 21:15:53 -07:00
oobabooga	8fcadff8d3	mtmd: Use the base64 attachment for the UI preview instead of the file	2025-08-08 20:13:54 -07:00
oobabooga	6e9de75727	Support loading chat templates from chat_template.json files	2025-08-08 19:35:09 -07:00
Katehuuh	88127f46c1	Add multimodal support (ExLlamaV3) (#7174 )	2025-08-08 23:31:16 -03:00
oobabooga	b391ac8eb1	Fix getting the ctx-size for EXL3/EXL2/Transformers models	2025-08-08 18:11:45 -07:00
oobabooga	3e24f455c8	Fix continue for GPT-OSS (hopefully the final fix)	2025-08-06 10:18:42 -07:00
oobabooga	0c1403f2c7	Handle GPT-OSS as a special case when continuing	2025-08-06 08:05:37 -07:00
oobabooga	6ce4b353c4	Fix the GPT-OSS template	2025-08-06 07:12:39 -07:00
oobabooga	7c82d65a9d	Handle GPT-OSS as a special template case	2025-08-05 18:05:09 -07:00
oobabooga	fbea21a1f1	Only use enable_thinking if the template supports it	2025-08-05 17:33:27 -07:00
oobabooga	bfbbfc2361	Ignore add_generation_prompt in GPT-OSS	2025-08-05 17:33:01 -07:00
oobabooga	20adc3c967	Start over new template handling (to avoid overcomplicating)	2025-08-05 16:58:45 -07:00
oobabooga	80f6abb07e	Begin fixing 'Continue' with GPT-OSS	2025-08-05 16:01:19 -07:00
oobabooga	e5b8d4d072	Fix a typo	2025-08-05 15:52:56 -07:00
oobabooga	701048cf33	Try to avoid breaking jinja2 parsing for older models	2025-08-05 15:51:24 -07:00
oobabooga	7d98ca6195	Make web search functional with thinking models	2025-08-05 15:44:33 -07:00
oobabooga	0e42575c57	Fix thinking block parsing for GPT-OSS under llama.cpp	2025-08-05 15:36:20 -07:00
oobabooga	498778b8ac	Add a new 'Reasoning effort' UI element	2025-08-05 15:19:11 -07:00
oobabooga	6bb8212731	Fix thinking block rendering for GPT-OSS	2025-08-05 15:06:22 -07:00
oobabooga	5c5a4dfc14	Fix impersonate	2025-08-05 13:04:10 -07:00
oobabooga	ecd16d6bf9	Automatically set skip_special_tokens to False for channel-based templates	2025-08-05 12:57:49 -07:00
oobabooga	178c3e75cc	Handle templates with channels separately	2025-08-05 12:52:17 -07:00
oobabooga	9f28f53cfc	Better parsing of the gpt-oss template	2025-08-05 11:56:00 -07:00
oobabooga	3b28dc1821	Don't pass torch_dtype to transformers loader, let it be autodetected	2025-08-05 11:35:53 -07:00
oobabooga	3039aeffeb	Fix parsing the gpt-oss-20b template	2025-08-05 11:35:17 -07:00
oobabooga	5989043537	Transformers: Support standalone .jinja chat templates (for GPT-OSS)	2025-08-05 11:22:18 -07:00
oobabooga	f08bb9a201	Handle edge case in chat history loading (closes #7155 )	2025-07-24 10:34:59 -07:00
oobabooga	d746484521	Handle both int and str types in grammar char processing	2025-07-23 11:52:51 -07:00
oobabooga	0c667de7a7	UI: Add a None option for the speculative decoding model (closes #7145 )	2025-07-19 12:14:41 -07:00
oobabooga	845432b9b4	Remove the obsolete modules/relative_imports.py file	2025-07-14 21:03:18 -07:00
oobabooga	1d1b20bd77	Remove the --torch-compile option (it doesn't do anything currently)	2025-07-11 10:51:23 -07:00
oobabooga	273888f218	Revert "Use eager attention by default instead of sdpa" This reverts commit `bd4881c4dc`.	2025-07-10 18:56:46 -07:00
oobabooga	635e6efd18	Ignore add_bos_token in instruct prompts, let the jinja2 template decide	2025-07-10 07:14:01 -07:00
oobabooga	e015355e4a	Update README	2025-07-09 20:03:53 -07:00
oobabooga	bd4881c4dc	Use eager attention by default instead of sdpa	2025-07-09 19:57:37 -07:00
oobabooga	b69f435311	Fix latest transformers being super slow	2025-07-09 19:56:50 -07:00
oobabooga	6c2bdda0f0	Transformers loader: replace `use_flash_attention_2`/`use_eager_attention` with a unified `attn_implementation` Closes #7107	2025-07-09 18:39:37 -07:00
oobabooga	07e6f004c5	Rename a button in the Session tab for clarity	2025-07-07 11:28:47 -07:00
Alidr79	e5767d4fc5	Update ui_model_menu.py blocking the --multi-user access in backend (#7098 )	2025-07-06 21:48:53 -03:00

1 2 3 4 5 ...

1912 commits