Author | Commit | Message | Date
oobabooga | bc55feaf3e | Improve host header validation in local mode | 2025-04-26 15:42:17 -07:00
oobabooga | 3a207e7a57 | Improve the --help formatting a bit | 2025-04-26 07:31:04 -07:00
oobabooga | 6acb0e1bee | Change a UI description | 2025-04-26 05:13:08 -07:00
oobabooga | cbd4d967cc | Update a --help message | 2025-04-26 05:09:52 -07:00
oobabooga | 763a7011c0 | Remove an ancient/obsolete migration check | 2025-04-26 04:59:05 -07:00
oobabooga | d9de14d1f7 | Restructure the repository (#6904) | 2025-04-26 08:56:54 -03:00
oobabooga | d4017fbb6d | ExLlamaV3: Add kv cache quantization (#6903) | 2025-04-25 21:32:00 -03:00
oobabooga | d4b1e31c49 | Use --ctx-size to specify the context size for all loaders (old flags are still recognized as alternatives) | 2025-04-25 16:59:03 -07:00
oobabooga | faababc4ea | llama.cpp: Add a prompt processing progress bar | 2025-04-25 16:42:30 -07:00
oobabooga | 877cf44c08 | llama.cpp: Add StreamingLLM (--streaming-llm) | 2025-04-25 16:21:41 -07:00
oobabooga | d35818f4e1 | UI: Add a collapsible thinking block to messages with <think> steps (#6902) | 2025-04-25 18:02:02 -03:00
oobabooga | 98f4c694b9 | llama.cpp: Add --extra-flags parameter for passing additional flags to llama-server | 2025-04-25 07:32:51 -07:00
oobabooga | 5861013e68 | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2025-04-24 20:36:20 -07:00
oobabooga | a90df27ff5 | UI: Add a greeting when the chat history is empty | 2025-04-24 20:33:40 -07:00
oobabooga | ae1fe87365 | ExLlamaV2: Add speculative decoding (#6899) | 2025-04-25 00:11:04 -03:00
Matthew Jenkins | 8f2493cc60 | Prevent llamacpp defaults from locking up consumer hardware (#6870) | 2025-04-24 23:38:57 -03:00
oobabooga | 93fd4ad25d | llama.cpp: Document the --device-draft syntax | 2025-04-24 09:20:11 -07:00
oobabooga | f1b64df8dd | EXL2: add another torch.cuda.synchronize() call to prevent errors | 2025-04-24 09:03:49 -07:00
oobabooga | c71a2af5ab | Handle CMD_FLAGS.txt in the main code (closes #6896) | 2025-04-24 08:21:06 -07:00
oobabooga | bfbde73409 | Make 'instruct' the default chat mode | 2025-04-24 07:08:49 -07:00
oobabooga | e99c20bcb0 | llama.cpp: Add speculative decoding (#6891) | 2025-04-23 20:10:16 -03:00
oobabooga | 9424ba17c8 | UI: show only part 00001 of multipart GGUF models in the model menu | 2025-04-22 19:56:42 -07:00
oobabooga | 25cf3600aa | Lint | 2025-04-22 08:04:02 -07:00
oobabooga | 39cbb5fee0 | Lint | 2025-04-22 08:03:25 -07:00
oobabooga | 008c6dd682 | Lint | 2025-04-22 08:02:37 -07:00
oobabooga | 78aeabca89 | Fix the transformers loader | 2025-04-21 18:33:14 -07:00
oobabooga | 8320190184 | Fix the exllamav2_HF and exllamav3_HF loaders | 2025-04-21 18:32:23 -07:00
oobabooga | 15989c2ed8 | Make llama.cpp the default loader | 2025-04-21 16:36:35 -07:00
oobabooga | 86c3ed3218 | Small change to the unload_model() function | 2025-04-20 20:00:56 -07:00
oobabooga | fe8e80e04a | Merge remote-tracking branch 'refs/remotes/origin/dev' into dev | 2025-04-20 19:09:27 -07:00
oobabooga | ff1c00bdd9 | llama.cpp: set the random seed manually | 2025-04-20 19:08:44 -07:00
Matthew Jenkins | d3e7c655e5 | Add support for llama-cpp builds from https://github.com/ggml-org/llama.cpp (#6862) | 2025-04-20 23:06:24 -03:00
oobabooga | e243424ba1 | Fix an import | 2025-04-20 17:51:28 -07:00
oobabooga | 8cfd7f976b | Revert "Remove the old --model-menu flag" (reverts commit 109de34e3b) | 2025-04-20 13:35:42 -07:00
oobabooga | b3bf7a885d | Fix ExLlamaV2_HF and ExLlamaV3_HF after ae02ffc605 | 2025-04-20 11:32:48 -07:00
oobabooga | ae02ffc605 | Refactor the transformers loader (#6859) | 2025-04-20 13:33:47 -03:00
oobabooga | 6ba0164c70 | Lint | 2025-04-19 17:45:21 -07:00
oobabooga | 5ab069786b | llama.cpp: add back the two encode calls (they are harmless now) | 2025-04-19 17:38:36 -07:00
oobabooga | b9da5c7e3a | Use 127.0.0.1 instead of localhost for faster llama.cpp on Windows | 2025-04-19 17:36:04 -07:00
oobabooga | 9c9df2063f | llama.cpp: fix unicode decoding (closes #6856) | 2025-04-19 16:38:15 -07:00
oobabooga | ba976d1390 | llama.cpp: avoid two 'encode' calls | 2025-04-19 16:35:01 -07:00
oobabooga | ed42154c78 | Revert "llama.cpp: close the connection immediately on 'Stop'" (reverts commit 5fdebc554b) | 2025-04-19 05:32:36 -07:00
oobabooga | 5fdebc554b | llama.cpp: close the connection immediately on 'Stop' | 2025-04-19 04:59:24 -07:00
oobabooga | 6589ebeca8 | Revert "llama.cpp: new optimization attempt" (reverts commit e2e73ed22f) | 2025-04-18 21:16:21 -07:00
oobabooga | e2e73ed22f | llama.cpp: new optimization attempt | 2025-04-18 21:05:08 -07:00
oobabooga | e2e90af6cd | llama.cpp: don't include --rope-freq-base in the launch command if null | 2025-04-18 20:51:18 -07:00
oobabooga | 9f07a1f5d7 | llama.cpp: new attempt at optimizing the llama-server connection | 2025-04-18 19:30:53 -07:00
oobabooga | f727b4a2cc | llama.cpp: close the connection properly when generation is cancelled | 2025-04-18 19:01:39 -07:00
oobabooga | b3342b8dd8 | llama.cpp: optimize the llama-server connection | 2025-04-18 18:46:36 -07:00
oobabooga | 2002590536 | Revert "Attempt at making the llama-server streaming more efficient." (reverts commit 5ad080ff25) | 2025-04-18 18:13:54 -07:00