diff --git a/.github/ISSUE_TEMPLATE/bug_report_template.yml b/.github/ISSUE_TEMPLATE/bug_report_template.yml
index bd30a0c9..ad22b656 100644
--- a/.github/ISSUE_TEMPLATE/bug_report_template.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report_template.yml
@@ -46,7 +46,7 @@ body:
id: system-info
attributes:
label: System Info
- description: "Please share your system info with us: operating system, GPU brand, and GPU model. If you are using a Google Colab notebook, mention that instead."
+ description: "Please share your operating system and GPU type (NVIDIA/AMD/Intel/Apple). If you are using a Google Colab notebook, mention that instead."
render: shell
placeholder:
validations:
diff --git a/README.md b/README.md
index 45ab48eb..6e7c05b1 100644
--- a/README.md
+++ b/README.md
@@ -24,20 +24,24 @@ Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github.
- Multiple sampling parameters and generation options for sophisticated text generation control.
- Switch between different models in the UI without restarting.
- Automatic GPU layers for GGUF models (on NVIDIA GPUs).
-- Free-form text generation in the Default/Notebook tabs without being limited to chat turns.
+- Free-form text generation in the Notebook tab without being limited to chat turns.
- OpenAI-compatible API with Chat and Completions endpoints, including tool-calling support – see [examples](https://github.com/oobabooga/text-generation-webui/wiki/12-%E2%80%90-OpenAI-API#examples).
- Extension support, with numerous built-in and user-contributed extensions available. See the [wiki](https://github.com/oobabooga/text-generation-webui/wiki/07-%E2%80%90-Extensions) and [extensions directory](https://github.com/oobabooga/text-generation-webui-extensions) for details.
## How to install
-#### Option 1: Portable builds (start here)
+#### Option 1: Portable builds (get started in 1 minute)
-No installation needed – just unzip and run. Compatible with GGUF (llama.cpp) models on Windows, Linux, and macOS.
+No installation needed – just download, unzip and run. All dependencies included.
-Download from: https://github.com/oobabooga/text-generation-webui/releases
+Compatible with GGUF (llama.cpp) models on Windows, Linux, and macOS.
+
+Download from here: https://github.com/oobabooga/text-generation-webui/releases
#### Option 2: One-click installer
+For users who need additional backends (ExLlamaV3, Transformers) or extensions (TTS, voice input, translation, etc.). Requires ~10GB of disk space and downloads PyTorch.
+
1. Clone the repository, or [download its source code](https://github.com/oobabooga/text-generation-webui/archive/refs/heads/main.zip) and extract it.
2. Run the startup script for your OS: `start_windows.bat`, `start_linux.sh`, or `start_macos.sh`.
3. When prompted, select your GPU vendor.
@@ -150,21 +154,21 @@ The `requirements*.txt` above contain various wheels precompiled through GitHub
```
For NVIDIA GPU:
ln -s docker/{nvidia/Dockerfile,nvidia/docker-compose.yml,.dockerignore} .
-For AMD GPU:
+For AMD GPU:
ln -s docker/{amd/Dockerfile,amd/docker-compose.yml,.dockerignore} .
For Intel GPU:
ln -s docker/{intel/Dockerfile,amd/docker-compose.yml,.dockerignore} .
For CPU only
ln -s docker/{cpu/Dockerfile,cpu/docker-compose.yml,.dockerignore} .
cp docker/.env.example .env
-#Create logs/cache dir :
+# Create logs/cache dirs:
mkdir -p user_data/logs user_data/cache
-# Edit .env and set:
+# Edit .env and set:
# TORCH_CUDA_ARCH_LIST based on your GPU model
# APP_RUNTIME_GID your host user's group id (run `id -g` in a terminal)
# BUILD_EXTENIONS optionally add comma separated list of extensions to build
# Edit user_data/CMD_FLAGS.txt and add in it the options you want to execute (like --listen --cpu)
-#
+#
docker compose up --build
```
@@ -188,7 +192,7 @@ List of command-line flags
```txt
-usage: server.py [-h] [--multi-user] [--character CHARACTER] [--model MODEL] [--lora LORA [LORA ...]] [--model-dir MODEL_DIR] [--lora-dir LORA_DIR] [--model-menu] [--settings SETTINGS]
+usage: server.py [-h] [--multi-user] [--model MODEL] [--lora LORA [LORA ...]] [--model-dir MODEL_DIR] [--lora-dir LORA_DIR] [--model-menu] [--settings SETTINGS]
[--extensions EXTENSIONS [EXTENSIONS ...]] [--verbose] [--idle-timeout IDLE_TIMEOUT] [--loader LOADER] [--cpu] [--cpu-memory CPU_MEMORY] [--disk] [--disk-cache-dir DISK_CACHE_DIR]
[--load-in-8bit] [--bf16] [--no-cache] [--trust-remote-code] [--force-safetensors] [--no_use_fast] [--use_flash_attention_2] [--use_eager_attention] [--torch-compile] [--load-in-4bit]
[--use_double_quant] [--compute_dtype COMPUTE_DTYPE] [--quant_type QUANT_TYPE] [--flash-attn] [--threads THREADS] [--threads-batch THREADS_BATCH] [--batch-size BATCH_SIZE] [--no-mmap]
@@ -207,7 +211,6 @@ options:
Basic settings:
--multi-user Multi-user mode. Chat histories are not saved or automatically loaded. Warning: this is likely not safe for sharing publicly.
- --character CHARACTER The name of the character to load in chat mode by default.
--model MODEL Name of the model to load by default.
--lora LORA [LORA ...] The list of LoRAs to load. If you want to load more than one LoRA, write the names separated by spaces.
--model-dir MODEL_DIR Path to directory with all the models.
diff --git a/css/chat_style-wpp.css b/css/chat_style-wpp.css
index 353201c2..b2ac4d2a 100644
--- a/css/chat_style-wpp.css
+++ b/css/chat_style-wpp.css
@@ -1,57 +1,105 @@
.message {
- padding-bottom: 22px;
- padding-top: 3px;
+ display: block;
+ padding-top: 0;
+ padding-bottom: 21px;
font-size: 15px;
font-family: 'Noto Sans', Helvetica, Arial, sans-serif;
line-height: 1.428571429;
+ grid-template-columns: none;
}
-.text-you {
+.circle-you, .circle-bot {
+ display: none;
+}
+
+.text {
+ max-width: 65%;
+ border-radius: 18px;
+ padding: 12px 16px;
+ margin-bottom: 8px;
+ clear: both;
+ box-shadow: 0 1px 2px rgb(0 0 0 / 10%);
+}
+
+.username {
+ font-weight: 600;
+ margin-bottom: 8px;
+ opacity: 0.65;
+ padding-left: 0;
+}
+
+/* User messages - right aligned, WhatsApp green */
+.circle-you + .text {
background-color: #d9fdd3;
- border-radius: 15px;
- padding: 10px;
- padding-top: 5px;
float: right;
+ margin-left: auto;
+ margin-right: 8px;
}
-.text-bot {
- background-color: #f2f2f2;
- border-radius: 15px;
- padding: 10px;
- padding-top: 5px;
+.circle-you + .text .username {
+ display: none;
}
-.dark .text-you {
- background-color: #005c4b;
- color: #111b21;
+/* Bot messages - left aligned, white */
+.circle-bot + .text {
+ background-color: #fff;
+ float: left;
+ margin-right: auto;
+ margin-left: 8px;
+ border: 1px solid #e5e5e5;
}
-.dark .text-bot {
- background-color: #1f2937;
- color: #111b21;
+.circle-bot + .text .message-actions {
+ bottom: -25px !important;
}
-.text-bot p, .text-you p {
- margin-top: 5px;
+/* Dark theme colors */
+.dark .circle-you + .text {
+ background-color: #144d37;
+ color: #e4e6ea;
+ box-shadow: 0 1px 2px rgb(0 0 0 / 30%);
+}
+
+.dark .circle-bot + .text {
+ background-color: #202c33;
+ color: #e4e6ea;
+ border: 1px solid #3c4043;
+ box-shadow: 0 1px 2px rgb(0 0 0 / 30%);
+}
+
+.dark .username {
+ opacity: 0.7;
}
.message-body img {
max-width: 300px;
max-height: 300px;
- border-radius: 20px;
+ border-radius: 12px;
}
.message-body p {
- margin-bottom: 0 !important;
font-size: 15px !important;
- line-height: 1.428571429 !important;
- font-weight: 500;
+ line-height: 1.4 !important;
+ font-weight: 400;
+}
+
+.message-body p:first-child {
+ margin-top: 0 !important;
}
.dark .message-body p em {
- color: rgb(138 138 138) !important;
+ color: rgb(170 170 170) !important;
}
.message-body p em {
- color: rgb(110 110 110) !important;
+ color: rgb(100 100 100) !important;
+}
+
+/* Message actions positioning */
+.message-actions {
+ margin-top: 8px;
+}
+
+.message-body p, .chat .message-body ul, .chat .message-body ol {
+ margin-bottom: 10px !important;
}
diff --git a/css/main.css b/css/main.css
index a22fdd95..bc59f833 100644
--- a/css/main.css
+++ b/css/main.css
@@ -97,11 +97,11 @@ ol li p, ul li p {
display: inline-block;
}
-#chat-tab, #default-tab, #notebook-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab {
+#notebook-parent-tab, #chat-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab, #character-tab {
border: 0;
}
-#default-tab, #notebook-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab {
+#notebook-parent-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab, #character-tab {
padding: 1rem;
}
@@ -167,15 +167,15 @@ gradio-app > :first-child {
}
.textbox_default textarea {
- height: calc(100dvh - 201px);
+ height: calc(100dvh - 202px);
}
.textbox_default_output textarea {
- height: calc(100dvh - 117px);
+ height: calc(100dvh - 118px);
}
.textbox textarea {
- height: calc(100dvh - 172px);
+ height: calc(100dvh - 145px);
}
.textbox_logits textarea {
@@ -307,7 +307,7 @@ audio {
}
#notebook-token-counter {
- top: calc(100dvh - 171px) !important;
+ top: calc(100dvh - 172px) !important;
}
#default-token-counter span, #notebook-token-counter span {
@@ -421,6 +421,7 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
text-align: start;
padding-left: 1rem;
padding-right: 1rem;
+ contain: layout;
}
.chat .message .timestamp {
@@ -905,6 +906,10 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
flex-shrink: 1;
}
+#search_chat {
+ padding-right: 0.5rem;
+}
+
#search_chat > :nth-child(2) > :first-child {
display: none;
}
@@ -925,7 +930,7 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
position: fixed;
bottom: 0;
left: 0;
- width: calc(100vw / 2 - 600px);
+ width: calc(0.5 * (100vw - min(100vw, 48rem) - (120px - var(--header-width))));
z-index: 10000;
}
@@ -1020,12 +1025,14 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
width: 100%;
justify-content: center;
gap: 9px;
+ padding-right: 0.5rem;
}
#past-chats-row,
#chat-controls {
width: 260px;
padding: 0.5rem;
+ padding-right: 0;
height: calc(100dvh - 16px);
flex-shrink: 0;
box-sizing: content-box;
@@ -1289,6 +1296,20 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* {
opacity: 1;
}
+/* Disable message action hover effects during generation */
+._generating .message:hover .message-actions,
+._generating .user-message:hover .message-actions,
+._generating .assistant-message:hover .message-actions {
+ opacity: 0 !important;
+}
+
+/* Disable message action hover effects during scrolling */
+.scrolling .message:hover .message-actions,
+.scrolling .user-message:hover .message-actions,
+.scrolling .assistant-message:hover .message-actions {
+ opacity: 0 !important;
+}
+
.footer-button svg {
stroke: rgb(156 163 175);
transition: stroke 0.2s;
@@ -1625,7 +1646,27 @@ button:focus {
display: none;
}
-/* Disable hover effects while scrolling */
-.chat-parent.scrolling * {
- pointer-events: none !important;
+#character-context textarea {
+ height: calc((100vh - 350px) * 2/3) !important;
+ min-height: 90px !important;
+}
+
+#character-greeting textarea {
+ height: calc((100vh - 350px) * 1/3) !important;
+ min-height: 90px !important;
+}
+
+#user-description textarea {
+ height: calc(100vh - 231px) !important;
+ min-height: 90px !important;
+}
+
+#instruction-template-str textarea,
+#chat-template-str textarea {
+ height: calc(100vh - 300px) !important;
+ min-height: 90px !important;
+}
+
+#textbox-notebook span {
+ display: none;
}
diff --git a/docs/12 - OpenAI API.md b/docs/12 - OpenAI API.md
index db9befed..ec999397 100644
--- a/docs/12 - OpenAI API.md
+++ b/docs/12 - OpenAI API.md
@@ -1,6 +1,6 @@
## OpenAI compatible API
-The main API for this project is meant to be a drop-in replacement to the OpenAI API, including Chat and Completions endpoints.
+The main API for this project is meant to be a drop-in replacement to the OpenAI API, including Chat and Completions endpoints.
* It is 100% offline and private.
* It doesn't create any logs.
@@ -30,10 +30,10 @@ curl http://127.0.0.1:5000/v1/completions \
-H "Content-Type: application/json" \
-d '{
"prompt": "This is a cake recipe:\n\n1.",
- "max_tokens": 200,
- "temperature": 1,
- "top_p": 0.9,
- "seed": 10
+ "max_tokens": 512,
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20
}'
```
@@ -51,7 +51,9 @@ curl http://127.0.0.1:5000/v1/chat/completions \
"content": "Hello!"
}
],
- "mode": "instruct"
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20
}'
```
@@ -67,8 +69,11 @@ curl http://127.0.0.1:5000/v1/chat/completions \
"content": "Hello! Who are you?"
}
],
- "mode": "chat",
- "character": "Example"
+ "mode": "chat-instruct",
+ "character": "Example",
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20
}'
```
@@ -84,7 +89,9 @@ curl http://127.0.0.1:5000/v1/chat/completions \
"content": "Hello!"
}
],
- "mode": "instruct",
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20,
"stream": true
}'
```
@@ -125,10 +132,11 @@ curl -k http://127.0.0.1:5000/v1/internal/model/list \
curl -k http://127.0.0.1:5000/v1/internal/model/load \
-H "Content-Type: application/json" \
-d '{
- "model_name": "model_name",
+ "model_name": "Qwen_Qwen3-0.6B-Q4_K_M.gguf",
"args": {
- "load_in_4bit": true,
- "n_gpu_layers": 12
+ "ctx_size": 32768,
+ "flash_attn": true,
+ "cache_type": "q8_0"
}
}'
```
@@ -150,9 +158,10 @@ while True:
user_message = input("> ")
history.append({"role": "user", "content": user_message})
data = {
- "mode": "chat",
- "character": "Example",
- "messages": history
+ "messages": history,
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20
}
response = requests.post(url, headers=headers, json=data, verify=False)
@@ -182,9 +191,11 @@ while True:
user_message = input("> ")
history.append({"role": "user", "content": user_message})
data = {
- "mode": "instruct",
"stream": True,
- "messages": history
+ "messages": history,
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20
}
stream_response = requests.post(url, headers=headers, json=data, verify=False, stream=True)
@@ -218,10 +229,10 @@ headers = {
data = {
"prompt": "This is a cake recipe:\n\n1.",
- "max_tokens": 200,
- "temperature": 1,
- "top_p": 0.9,
- "seed": 10,
+ "max_tokens": 512,
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "top_k": 20,
"stream": True,
}
diff --git a/extensions/openai/models.py b/extensions/openai/models.py
index a7e67df6..f8d9a1e8 100644
--- a/extensions/openai/models.py
+++ b/extensions/openai/models.py
@@ -18,19 +18,6 @@ def list_models():
return {'model_names': get_available_models()[1:]}
-def list_dummy_models():
- result = {
- "object": "list",
- "data": []
- }
-
- # these are expected by so much, so include some here as a dummy
- for model in ['gpt-3.5-turbo', 'text-embedding-ada-002']:
- result["data"].append(model_info_dict(model))
-
- return result
-
-
def model_info_dict(model_name: str) -> dict:
return {
"id": model_name,
diff --git a/extensions/openai/script.py b/extensions/openai/script.py
index 24bcd69d..3d8d5f73 100644
--- a/extensions/openai/script.py
+++ b/extensions/openai/script.py
@@ -180,7 +180,7 @@ async def handle_models(request: Request):
is_list = request.url.path.split('?')[0].split('#')[0] == '/v1/models'
if is_list:
- response = OAImodels.list_dummy_models()
+ response = OAImodels.list_models()
else:
model_name = path[len('/v1/models/'):]
response = OAImodels.model_info_dict(model_name)
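With `list_dummy_models()` removed, `GET /v1/models` now reports the models actually available locally via `list_models()` instead of the hard-coded `gpt-3.5-turbo`/`text-embedding-ada-002` placeholders. A quick way to inspect the new response (the exact shape depends on what `list_models()` returns):

```python
import requests

# Prints whatever the local server now reports for /v1/models.
response = requests.get("http://127.0.0.1:5000/v1/models")
print(response.json())
```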
diff --git a/extensions/openai/typing.py b/extensions/openai/typing.py
index b28ebb4e..6643ed16 100644
--- a/extensions/openai/typing.py
+++ b/extensions/openai/typing.py
@@ -158,7 +158,7 @@ class ChatCompletionRequestParams(BaseModel):
user_bio: str | None = Field(default=None, description="The user description/personality.")
chat_template_str: str | None = Field(default=None, description="Jinja2 template for chat.")
- chat_instruct_command: str | None = None
+ chat_instruct_command: str | None = "Continue the chat dialogue below. Write a single reply for the character \"<|character|>\".\n\n<|prompt|>"
continue_: bool = Field(default=False, description="Makes the last bot message in the history be continued instead of starting a new message.")
diff --git a/js/main.js b/js/main.js
index e970884d..3ff4bf06 100644
--- a/js/main.js
+++ b/js/main.js
@@ -170,6 +170,13 @@ targetElement.addEventListener("scroll", function() {
// Create a MutationObserver instance
const observer = new MutationObserver(function(mutations) {
+ // Check if this is just the scrolling class being toggled
+ const isScrollingClassOnly = mutations.every(mutation =>
+ mutation.type === "attributes" &&
+ mutation.attributeName === "class" &&
+ mutation.target === targetElement
+ );
+
if (targetElement.classList.contains("_generating")) {
typing.parentNode.classList.add("visible-dots");
document.getElementById("stop").style.display = "flex";
@@ -182,7 +189,7 @@ const observer = new MutationObserver(function(mutations) {
doSyntaxHighlighting();
- if (!window.isScrolled && targetElement.scrollTop !== targetElement.scrollHeight) {
+ if (!window.isScrolled && !isScrollingClassOnly && targetElement.scrollTop !== targetElement.scrollHeight) {
targetElement.scrollTop = targetElement.scrollHeight;
}
@@ -231,8 +238,15 @@ function doSyntaxHighlighting() {
if (messageBodies.length > 0) {
observer.disconnect();
- messageBodies.forEach((messageBody) => {
+ let hasSeenVisible = false;
+
+ // Go from last message to first
+ for (let i = messageBodies.length - 1; i >= 0; i--) {
+ const messageBody = messageBodies[i];
+
if (isElementVisibleOnScreen(messageBody)) {
+ hasSeenVisible = true;
+
// Handle both code and math in a single pass through each message
const codeBlocks = messageBody.querySelectorAll("pre code:not([data-highlighted])");
codeBlocks.forEach((codeBlock) => {
@@ -249,8 +263,12 @@ function doSyntaxHighlighting() {
{ left: "\\[", right: "\\]", display: true },
],
});
+ } else if (hasSeenVisible) {
+ // We've seen visible messages but this one is not visible
+ // Since we're going from last to first, we can break
+ break;
}
- });
+ }
observer.observe(targetElement, config);
}
@@ -777,11 +795,43 @@ initializeSidebars();
// Add click event listeners to toggle buttons
pastChatsToggle.addEventListener("click", () => {
+ const isCurrentlyOpen = !pastChatsRow.classList.contains("sidebar-hidden");
toggleSidebar(pastChatsRow, pastChatsToggle);
+
+ // On desktop, open/close both sidebars at the same time
+ if (!isMobile()) {
+ if (isCurrentlyOpen) {
+ // If we just closed the left sidebar, also close the right sidebar
+ if (!chatControlsRow.classList.contains("sidebar-hidden")) {
+ toggleSidebar(chatControlsRow, chatControlsToggle, true);
+ }
+ } else {
+ // If we just opened the left sidebar, also open the right sidebar
+ if (chatControlsRow.classList.contains("sidebar-hidden")) {
+ toggleSidebar(chatControlsRow, chatControlsToggle, false);
+ }
+ }
+ }
});
chatControlsToggle.addEventListener("click", () => {
+ const isCurrentlyOpen = !chatControlsRow.classList.contains("sidebar-hidden");
toggleSidebar(chatControlsRow, chatControlsToggle);
+
+ // On desktop, open/close both sidebars at the same time
+ if (!isMobile()) {
+ if (isCurrentlyOpen) {
+ // If we just closed the right sidebar, also close the left sidebar
+ if (!pastChatsRow.classList.contains("sidebar-hidden")) {
+ toggleSidebar(pastChatsRow, pastChatsToggle, true);
+ }
+ } else {
+ // If we just opened the right sidebar, also open the left sidebar
+ if (pastChatsRow.classList.contains("sidebar-hidden")) {
+ toggleSidebar(pastChatsRow, pastChatsToggle, false);
+ }
+ }
+ }
});
navigationToggle.addEventListener("click", () => {
diff --git a/js/show_controls.js b/js/show_controls.js
index 1a87b52d..f974d412 100644
--- a/js/show_controls.js
+++ b/js/show_controls.js
@@ -1,14 +1,26 @@
-const belowChatInput = document.querySelectorAll(
- "#chat-tab > div > :nth-child(1), #chat-tab > div > :nth-child(3), #chat-tab > div > :nth-child(4), #extensions"
-);
const chatParent = document.querySelector(".chat-parent");
function toggle_controls(value) {
- if (value) {
- belowChatInput.forEach(element => {
- element.style.display = "inherit";
- });
+ const extensions = document.querySelector("#extensions");
+ if (value) {
+ // SHOW MODE: Click toggles to show hidden sidebars
+ const navToggle = document.getElementById("navigation-toggle");
+ const pastChatsToggle = document.getElementById("past-chats-toggle");
+
+ if (navToggle && document.querySelector(".header_bar")?.classList.contains("sidebar-hidden")) {
+ navToggle.click();
+ }
+ if (pastChatsToggle && document.getElementById("past-chats-row")?.classList.contains("sidebar-hidden")) {
+ pastChatsToggle.click();
+ }
+
+ // Show extensions only
+ if (extensions) {
+ extensions.style.display = "inherit";
+ }
+
+ // Remove bigchat classes
chatParent.classList.remove("bigchat");
document.getElementById("chat-input-row").classList.remove("bigchat");
document.getElementById("chat-col").classList.remove("bigchat");
@@ -20,10 +32,23 @@ function toggle_controls(value) {
}
} else {
- belowChatInput.forEach(element => {
- element.style.display = "none";
- });
+ // HIDE MODE: Click toggles to hide visible sidebars
+ const navToggle = document.getElementById("navigation-toggle");
+ const pastChatsToggle = document.getElementById("past-chats-toggle");
+ if (navToggle && !document.querySelector(".header_bar")?.classList.contains("sidebar-hidden")) {
+ navToggle.click();
+ }
+ if (pastChatsToggle && !document.getElementById("past-chats-row")?.classList.contains("sidebar-hidden")) {
+ pastChatsToggle.click();
+ }
+
+ // Hide extensions only
+ if (extensions) {
+ extensions.style.display = "none";
+ }
+
+ // Add bigchat classes
chatParent.classList.add("bigchat");
document.getElementById("chat-input-row").classList.add("bigchat");
document.getElementById("chat-col").classList.add("bigchat");
diff --git a/js/switch_tabs.js b/js/switch_tabs.js
index 0564f891..7fb78aea 100644
--- a/js/switch_tabs.js
+++ b/js/switch_tabs.js
@@ -1,24 +1,14 @@
-let chat_tab = document.getElementById("chat-tab");
-let main_parent = chat_tab.parentNode;
-
function scrollToTop() {
- window.scrollTo({
- top: 0,
- // behavior: 'smooth'
- });
+ window.scrollTo({ top: 0 });
}
function findButtonsByText(buttonText) {
const buttons = document.getElementsByTagName("button");
const matchingButtons = [];
- buttonText = buttonText.trim();
for (let i = 0; i < buttons.length; i++) {
- const button = buttons[i];
- const buttonInnerText = button.textContent.trim();
-
- if (buttonInnerText === buttonText) {
- matchingButtons.push(button);
+ if (buttons[i].textContent.trim() === buttonText) {
+ matchingButtons.push(buttons[i]);
}
}
@@ -26,34 +16,23 @@ function findButtonsByText(buttonText) {
}
function switch_to_chat() {
- let chat_tab_button = main_parent.childNodes[0].childNodes[1];
- chat_tab_button.click();
- scrollToTop();
-}
-
-function switch_to_default() {
- let default_tab_button = main_parent.childNodes[0].childNodes[5];
- default_tab_button.click();
+ document.getElementById("chat-tab-button").click();
scrollToTop();
}
function switch_to_notebook() {
- let notebook_tab_button = main_parent.childNodes[0].childNodes[9];
- notebook_tab_button.click();
+ document.getElementById("notebook-parent-tab-button").click();
findButtonsByText("Raw")[1].click();
scrollToTop();
}
function switch_to_generation_parameters() {
- let parameters_tab_button = main_parent.childNodes[0].childNodes[13];
- parameters_tab_button.click();
+ document.getElementById("parameters-button").click();
findButtonsByText("Generation")[0].click();
scrollToTop();
}
function switch_to_character() {
- let parameters_tab_button = main_parent.childNodes[0].childNodes[13];
- parameters_tab_button.click();
- findButtonsByText("Character")[0].click();
+ document.getElementById("character-tab-button").click();
scrollToTop();
}
diff --git a/modules/chat.py b/modules/chat.py
index dfc301df..9290dd62 100644
--- a/modules/chat.py
+++ b/modules/chat.py
@@ -217,8 +217,8 @@ def generate_chat_prompt(user_input, state, **kwargs):
user_key = f"user_{row_idx}"
enhanced_user_msg = user_msg
- # Add attachment content if present
- if user_key in metadata and "attachments" in metadata[user_key]:
+ # Add attachment content if present AND if past attachments are enabled
+ if (state.get('include_past_attachments', True) and user_key in metadata and "attachments" in metadata[user_key]):
attachments_text = ""
for attachment in metadata[user_key]["attachments"]:
filename = attachment.get("name", "file")
@@ -332,10 +332,10 @@ def generate_chat_prompt(user_input, state, **kwargs):
user_message = messages[-1]['content']
# Bisect the truncation point
- left, right = 0, len(user_message) - 1
+ left, right = 0, len(user_message)
- while right - left > 1:
- mid = (left + right) // 2
+ while left < right:
+ mid = (left + right + 1) // 2
messages[-1]['content'] = user_message[:mid]
prompt = make_prompt(messages)
@@ -344,7 +344,7 @@ def generate_chat_prompt(user_input, state, **kwargs):
if encoded_length <= max_length:
left = mid
else:
- right = mid
+ right = mid - 1
messages[-1]['content'] = user_message[:left]
prompt = make_prompt(messages)
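The rewritten loop above is an upper-bound binary search for the longest prefix of the user message that still fits the available context. A self-contained sketch of the same invariant, with a stand-in `fits()` predicate (hypothetical; the real code rebuilds the prompt and measures it with `get_encoded_length`):

```python
def longest_fitting_prefix(text, fits):
    """Return the largest n such that fits(text[:n]) is True.

    Mirrors the bisection above: `left` always holds a length known to fit,
    `right` the largest length still possible. Assumes fits("") is True and
    that fits() is monotonic (once it fails, every longer prefix fails too).
    """
    left, right = 0, len(text)
    while left < right:
        mid = (left + right + 1) // 2  # round up so `left = mid` always makes progress
        if fits(text[:mid]):
            left = mid
        else:
            right = mid - 1
    return left


# Stand-in budget of 10 characters (hypothetical); the real code instead
# checks the encoded prompt length against max_length.
print(longest_fitting_prefix("The quick brown fox", lambda s: len(s) <= 10))  # -> 10
```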
@@ -353,7 +353,17 @@ def generate_chat_prompt(user_input, state, **kwargs):
logger.error(f"Failed to build the chat prompt. The input is too long for the available context length.\n\nTruncation length: {state['truncation_length']}\nmax_new_tokens: {state['max_new_tokens']} (is it too high?)\nAvailable context length: {max_length}\n")
raise ValueError
else:
- logger.warning(f"The input has been truncated. Context length: {state['truncation_length']}, max_new_tokens: {state['max_new_tokens']}, available context length: {max_length}.")
+ # Calculate token counts for the log message
+ original_user_tokens = get_encoded_length(user_message)
+ truncated_user_tokens = get_encoded_length(user_message[:left])
+ total_context = max_length + state['max_new_tokens']
+
+ logger.warning(
+ f"User message truncated from {original_user_tokens} to {truncated_user_tokens} tokens. "
+ f"Context full: {max_length} input tokens ({total_context} total, {state['max_new_tokens']} for output). "
+ f"Increase ctx-size while loading the model to avoid truncation."
+ )
+
break
prompt = make_prompt(messages)
@@ -604,6 +614,7 @@ def generate_search_query(user_message, state):
search_state['max_new_tokens'] = 64
search_state['auto_max_new_tokens'] = False
search_state['enable_thinking'] = False
+ search_state['start_with'] = ""
# Generate the full prompt using existing history + augmented message
formatted_prompt = generate_chat_prompt(augmented_message, search_state)
@@ -1069,16 +1080,27 @@ def load_latest_history(state):
'''
if shared.args.multi_user:
- return start_new_chat(state)
+ return start_new_chat(state), None
histories = find_all_histories(state)
if len(histories) > 0:
- history = load_history(histories[0], state['character_menu'], state['mode'])
- else:
- history = start_new_chat(state)
+ # Try to load the last visited chat for this character/mode
+ chat_state = load_last_chat_state()
+ key = get_chat_state_key(state['character_menu'], state['mode'])
+ last_chat_id = chat_state.get("last_chats", {}).get(key)
- return history
+ # If we have a stored last chat and it still exists, use it
+ if last_chat_id and last_chat_id in histories:
+ unique_id = last_chat_id
+ else:
+ # Fall back to most recent (current behavior)
+ unique_id = histories[0]
+
+ history = load_history(unique_id, state['character_menu'], state['mode'])
+ return history, unique_id
+ else:
+ return start_new_chat(state), None
def load_history_after_deletion(state, idx):
@@ -1110,6 +1132,42 @@ def update_character_menu_after_deletion(idx):
return gr.update(choices=characters, value=characters[idx])
+def get_chat_state_key(character, mode):
+ """Generate a key for storing last chat state"""
+ if mode == 'instruct':
+ return 'instruct'
+ else:
+ return f"chat_{character}"
+
+
+def load_last_chat_state():
+ """Load the last chat state from file"""
+ state_file = Path('user_data/logs/chat_state.json')
+ if state_file.exists():
+ try:
+ with open(state_file, 'r', encoding='utf-8') as f:
+ return json.loads(f.read())
+ except Exception:
+ pass
+
+ return {"last_chats": {}}
+
+
+def save_last_chat_state(character, mode, unique_id):
+ """Save the last visited chat for a character/mode"""
+ if shared.args.multi_user:
+ return
+
+ state = load_last_chat_state()
+ key = get_chat_state_key(character, mode)
+ state["last_chats"][key] = unique_id
+
+ state_file = Path('user_data/logs/chat_state.json')
+ state_file.parent.mkdir(exist_ok=True)
+ with open(state_file, 'w', encoding='utf-8') as f:
+ f.write(json.dumps(state, indent=2))
+
+
def load_history(unique_id, character, mode):
p = get_history_file_path(unique_id, character, mode)
@@ -1543,6 +1601,9 @@ def handle_unique_id_select(state):
history = load_history(state['unique_id'], state['character_menu'], state['mode'])
html = redraw_html(history, state['name1'], state['name2'], state['mode'], state['chat_style'], state['character_menu'])
+ # Save this as the last visited chat
+ save_last_chat_state(state['character_menu'], state['mode'], state['unique_id'])
+
convert_to_markdown.cache_clear()
return [history, html]
@@ -1743,14 +1804,14 @@ def handle_character_menu_change(state):
state['greeting'] = greeting
state['context'] = context
- history = load_latest_history(state)
+ history, loaded_unique_id = load_latest_history(state)
histories = find_all_histories_with_first_prompts(state)
html = redraw_html(history, state['name1'], state['name2'], state['mode'], state['chat_style'], state['character_menu'])
convert_to_markdown.cache_clear()
if len(histories) > 0:
- past_chats_update = gr.update(choices=histories, value=histories[0][1])
+ past_chats_update = gr.update(choices=histories, value=loaded_unique_id or histories[0][1])
else:
past_chats_update = gr.update(choices=histories)
@@ -1762,7 +1823,7 @@ def handle_character_menu_change(state):
picture,
greeting,
context,
- past_chats_update,
+ past_chats_update
]
@@ -1786,14 +1847,19 @@ def handle_character_picture_change(picture):
def handle_mode_change(state):
- history = load_latest_history(state)
+ history, loaded_unique_id = load_latest_history(state)
histories = find_all_histories_with_first_prompts(state)
+
+ # Ensure character picture cache exists
+ if state['mode'] in ['chat', 'chat-instruct'] and state['character_menu'] and state['character_menu'] != 'None':
+ generate_pfp_cache(state['character_menu'])
+
html = redraw_html(history, state['name1'], state['name2'], state['mode'], state['chat_style'], state['character_menu'])
convert_to_markdown.cache_clear()
if len(histories) > 0:
- past_chats_update = gr.update(choices=histories, value=histories[0][1])
+ past_chats_update = gr.update(choices=histories, value=loaded_unique_id or histories[0][1])
else:
past_chats_update = gr.update(choices=histories)
@@ -1852,10 +1918,16 @@ def handle_send_instruction_click(state):
output = generate_chat_prompt("Input", state)
- return output
+ if state["show_two_notebook_columns"]:
+ return gr.update(), output, ""
+ else:
+ return output, gr.update(), gr.update()
def handle_send_chat_click(state):
output = generate_chat_prompt("", state, _continue=True)
- return output
+ if state["show_two_notebook_columns"]:
+ return gr.update(), output, ""
+ else:
+ return output, gr.update(), gr.update()
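The new `load_last_chat_state()`/`save_last_chat_state()` helpers persist the last-visited chat per character/mode in `user_data/logs/chat_state.json`, keyed by `get_chat_state_key()`. A small sketch of the resulting on-disk structure (the character name and chat ids are made-up examples):

```python
import json


def chat_state_key(character, mode):
    # Mirrors get_chat_state_key() above: instruct mode shares a single slot,
    # while chat/chat-instruct modes get one slot per character.
    return "instruct" if mode == "instruct" else f"chat_{character}"


# Made-up chat ids; the real values are the unique_id of a saved history.
state = {"last_chats": {}}
state["last_chats"][chat_state_key("Assistant", "chat")] = "20250102-18-30-12"
state["last_chats"][chat_state_key(None, "instruct")] = "20250101-10-00-00"
print(json.dumps(state, indent=2))  # same shape as user_data/logs/chat_state.json
```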
diff --git a/modules/html_generator.py b/modules/html_generator.py
index af64894e..11572fc6 100644
--- a/modules/html_generator.py
+++ b/modules/html_generator.py
@@ -595,64 +595,6 @@ def generate_cai_chat_html(history, name1, name2, style, character, reset_cache=
return output
-def generate_chat_html(history, name1, name2, reset_cache=False, last_message_only=False):
- if not last_message_only:
- output = f'<div class="chat" id="chat">'
- else:
- output = ""
-
- def create_message(role, content, raw_content):
- """Inner function for WPP-style messages."""
- text_class = "text-you" if role == "user" else "text-bot"
-
- # Get role-specific data
- timestamp = format_message_timestamp(history, role, i)
- attachments = format_message_attachments(history, role, i)
-
- # Create info button if timestamp exists
- info_message = ""
- if timestamp:
- tooltip_text = get_message_tooltip(history, role, i)
- info_message = info_button.replace('title="message"', f'title="{html.escape(tooltip_text)}"')
-
- return (
- f'<div class="message" data-raw="{html.escape(raw_content, quote=True)}">'
- f'<div class="text {text_class}">'
- f'<div class="message-body">{content}</div>'
- f'{attachments}'
- f'{actions_html(history, i, role, info_message)}'
- f'</div>'
- f'</div>'
- )
-
- # Determine range
- start_idx = len(history['visible']) - 1 if last_message_only else 0
- end_idx = len(history['visible'])
-
- for i in range(start_idx, end_idx):
- row_visible = history['visible'][i]
- row_internal = history['internal'][i]
-
- # Convert content
- if last_message_only:
- converted_visible = [None, convert_to_markdown_wrapped(row_visible[1], message_id=i, use_cache=i != len(history['visible']) - 1)]
- else:
- converted_visible = [convert_to_markdown_wrapped(entry, message_id=i, use_cache=i != len(history['visible']) - 1) for entry in row_visible]
-
- # Generate messages
- if not last_message_only and converted_visible[0]:
- output += create_message("user", converted_visible[0], row_internal[0])
-
- output += create_message("assistant", converted_visible[1], row_internal[1])
-
- if not last_message_only:
- output += "
"
-
- return output
-
-
def time_greeting():
current_hour = datetime.datetime.now().hour
if 5 <= current_hour < 12:
@@ -669,8 +611,6 @@ def chat_html_wrapper(history, name1, name2, mode, style, character, reset_cache
result = f'<div class="chat" id="chat">{greeting}</div>'
elif mode == 'instruct':
result = generate_instruct_html(history, last_message_only=last_message_only)
- elif style == 'wpp':
- result = generate_chat_html(history, name1, name2, last_message_only=last_message_only)
else:
result = generate_cai_chat_html(history, name1, name2, style, character, reset_cache=reset_cache, last_message_only=last_message_only)
diff --git a/modules/llama_cpp_server.py b/modules/llama_cpp_server.py
index a79e24e4..e64f1694 100644
--- a/modules/llama_cpp_server.py
+++ b/modules/llama_cpp_server.py
@@ -30,6 +30,7 @@ class LlamaServer:
self.session = requests.Session()
self.vocabulary_size = None
self.bos_token = ""
+ self.last_prompt_token_count = 0
# Start the server
self._start_server()
@@ -128,6 +129,7 @@ class LlamaServer:
payload = self.prepare_payload(state)
token_ids = self.encode(prompt, add_bos_token=state["add_bos_token"])
+ self.last_prompt_token_count = len(token_ids)
if state['auto_max_new_tokens']:
max_new_tokens = state['truncation_length'] - len(token_ids)
else:
diff --git a/modules/models_settings.py b/modules/models_settings.py
index 283a9744..37aa37cf 100644
--- a/modules/models_settings.py
+++ b/modules/models_settings.py
@@ -9,6 +9,7 @@ import gradio as gr
import yaml
from modules import chat, loaders, metadata_gguf, shared, ui
+from modules.logging_colors import logger
def get_fallback_settings():
@@ -56,7 +57,13 @@ def get_model_metadata(model):
if path.is_file():
model_file = path
else:
- model_file = list(path.glob('*.gguf'))[0]
+ gguf_files = list(path.glob('*.gguf'))
+ if not gguf_files:
+ error_msg = f"No .gguf models found in directory: {path}"
+ logger.error(error_msg)
+ raise FileNotFoundError(error_msg)
+
+ model_file = gguf_files[0]
metadata = load_gguf_metadata_with_cache(model_file)
@@ -171,6 +178,8 @@ def infer_loader(model_name, model_settings, hf_quant_method=None):
path_to_model = Path(f'{shared.args.model_dir}/{model_name}')
if not path_to_model.exists():
loader = None
+ elif shared.args.portable:
+ loader = 'llama.cpp'
elif len(list(path_to_model.glob('*.gguf'))) > 0:
loader = 'llama.cpp'
elif re.match(r'.*\.gguf', model_name.lower()):
@@ -450,26 +459,19 @@ def update_gpu_layers_and_vram(loader, model, gpu_layers, ctx_size, cache_type,
else:
return (0, gpu_layers) if auto_adjust else 0
+ # Get model settings including user preferences
+ model_settings = get_model_metadata(model)
+
current_layers = gpu_layers
- max_layers = gpu_layers
+ max_layers = model_settings.get('max_gpu_layers', 256)
if auto_adjust:
- # Get model settings including user preferences
- model_settings = get_model_metadata(model)
-
- # Get the true maximum layers
- max_layers = model_settings.get('max_gpu_layers', model_settings.get('gpu_layers', gpu_layers))
-
# Check if this is a user-saved setting
user_config = shared.user_config
model_regex = Path(model).name + '$'
has_user_setting = model_regex in user_config and 'gpu_layers' in user_config[model_regex]
- if has_user_setting:
- # For user settings, just use the current value (which already has user pref)
- # but ensure the slider maximum is correct
- current_layers = gpu_layers # Already has user setting
- else:
+ if not has_user_setting:
# No user setting, auto-adjust from the maximum
current_layers = max_layers # Start from max
diff --git a/modules/prompts.py b/modules/prompts.py
index 8f00cac2..79d9b56e 100644
--- a/modules/prompts.py
+++ b/modules/prompts.py
@@ -1,22 +1,33 @@
from pathlib import Path
+from modules import shared, utils
from modules.text_generation import get_encoded_length
def load_prompt(fname):
- if fname in ['None', '']:
- return ''
- else:
- file_path = Path(f'user_data/prompts/{fname}.txt')
- if not file_path.exists():
- return ''
+ if not fname:
+ # Create new file
+ new_name = utils.current_time()
+ prompt_path = Path("user_data/logs/notebook") / f"{new_name}.txt"
+ prompt_path.parent.mkdir(parents=True, exist_ok=True)
+ initial_content = "In this story,"
+ prompt_path.write_text(initial_content, encoding='utf-8')
+ # Update settings to point to new file
+ shared.settings['prompt-notebook'] = new_name
+
+ return initial_content
+
+ file_path = Path(f'user_data/logs/notebook/{fname}.txt')
+ if file_path.exists():
with open(file_path, 'r', encoding='utf-8') as f:
text = f.read()
- if text[-1] == '\n':
+ if len(text) > 0 and text[-1] == '\n':
text = text[:-1]
return text
+ else:
+ return ''
def count_tokens(text):
diff --git a/modules/shared.py b/modules/shared.py
index 83920df8..5333ec4f 100644
--- a/modules/shared.py
+++ b/modules/shared.py
@@ -202,8 +202,7 @@ settings = {
'chat-instruct_command': 'Continue the chat dialogue below. Write a single reply for the character "<|character|>".\n\n<|prompt|>',
'enable_web_search': False,
'web_search_pages': 3,
- 'prompt-default': 'QA',
- 'prompt-notebook': 'QA',
+ 'prompt-notebook': '',
'preset': 'Qwen3 - Thinking' if Path('user_data/presets/Qwen3 - Thinking.yaml').exists() else None,
'max_new_tokens': 512,
'max_new_tokens_min': 1,
@@ -223,7 +222,9 @@ settings = {
'custom_token_bans': '',
'negative_prompt': '',
'dark_theme': True,
+ 'show_two_notebook_columns': False,
'paste_to_attachment': False,
+ 'include_past_attachments': True,
# Generation parameters - Curve shape
'temperature': 0.6,
diff --git a/modules/text_generation.py b/modules/text_generation.py
index 55b538b0..a75141f1 100644
--- a/modules/text_generation.py
+++ b/modules/text_generation.py
@@ -498,8 +498,14 @@ def generate_reply_custom(question, original_question, state, stopping_strings=N
traceback.print_exc()
finally:
t1 = time.time()
- original_tokens = len(encode(original_question)[0])
- new_tokens = len(encode(original_question + reply)[0]) - original_tokens
+
+ if hasattr(shared.model, 'last_prompt_token_count'):
+ original_tokens = shared.model.last_prompt_token_count
+ new_tokens = len(encode(reply)[0]) if reply else 0
+ else:
+ original_tokens = len(encode(original_question)[0])
+ new_tokens = len(encode(original_question + reply)[0]) - original_tokens
+
logger.info(f'Output generated in {(t1-t0):.2f} seconds ({new_tokens/(t1-t0):.2f} tokens/s, {new_tokens} tokens, context {original_tokens}, seed {state["seed"]})')
return
diff --git a/modules/ui.py b/modules/ui.py
index 2925faa5..0e8afa8f 100644
--- a/modules/ui.py
+++ b/modules/ui.py
@@ -6,6 +6,7 @@ import gradio as gr
import yaml
import extensions
+import modules.extensions as extensions_module
from modules import shared
from modules.chat import load_history
from modules.utils import gradio
@@ -273,7 +274,9 @@ def list_interface_input_elements():
# Other elements
elements += [
- 'paste_to_attachment'
+ 'show_two_notebook_columns',
+ 'paste_to_attachment',
+ 'include_past_attachments',
]
return elements
@@ -324,8 +327,7 @@ def save_settings(state, preset, extensions_list, show_controls, theme_state, ma
output[k] = state[k]
output['preset'] = preset
- output['prompt-default'] = state['prompt_menu-default']
- output['prompt-notebook'] = state['prompt_menu-notebook']
+ output['prompt-notebook'] = state['prompt_menu-default'] if state['show_two_notebook_columns'] else state['prompt_menu-notebook']
output['character'] = state['character_menu']
output['seed'] = int(output['seed'])
output['show_controls'] = show_controls
@@ -333,35 +335,41 @@ def save_settings(state, preset, extensions_list, show_controls, theme_state, ma
output.pop('instruction_template_str')
output.pop('truncation_length')
- # Only save extensions on manual save
+ # Handle extensions and extension parameters
if manual_save:
+ # Save current extensions and their parameter values
output['default_extensions'] = extensions_list
+
+ for extension_name in extensions_list:
+ extension = getattr(extensions, extension_name, None)
+ if extension:
+ extension = extension.script
+ if hasattr(extension, 'params'):
+ params = getattr(extension, 'params')
+ for param in params:
+ _id = f"{extension_name}-{param}"
+ # Only save if different from default value
+ if param not in shared.default_settings or params[param] != shared.default_settings[param]:
+ output[_id] = params[param]
else:
- # Preserve existing extensions from settings file during autosave
+ # Preserve existing extensions and extension parameters during autosave
settings_path = Path('user_data') / 'settings.yaml'
if settings_path.exists():
try:
with open(settings_path, 'r', encoding='utf-8') as f:
existing_settings = yaml.safe_load(f.read()) or {}
+ # Preserve default_extensions
if 'default_extensions' in existing_settings:
output['default_extensions'] = existing_settings['default_extensions']
+
+ # Preserve extension parameter values
+ for key, value in existing_settings.items():
+ if any(key.startswith(f"{ext_name}-") for ext_name in extensions_module.available_extensions):
+ output[key] = value
except Exception:
pass # If we can't read the file, just don't modify extensions
- # Save extension values in the UI
- for extension_name in extensions_list:
- extension = getattr(extensions, extension_name, None)
- if extension:
- extension = extension.script
- if hasattr(extension, 'params'):
- params = getattr(extension, 'params')
- for param in params:
- _id = f"{extension_name}-{param}"
- # Only save if different from default value
- if param not in shared.default_settings or params[param] != shared.default_settings[param]:
- output[_id] = params[param]
-
# Do not save unchanged settings
for key in list(output.keys()):
if key in shared.default_settings and output[key] == shared.default_settings[key]:
@@ -497,7 +505,9 @@ def setup_auto_save():
# Session tab (ui_session.py)
'show_controls',
'theme_state',
- 'paste_to_attachment'
+ 'show_two_notebook_columns',
+ 'paste_to_attachment',
+ 'include_past_attachments'
]
for element_name in change_elements:
diff --git a/modules/ui_chat.py b/modules/ui_chat.py
index 3b841b8b..8a90608f 100644
--- a/modules/ui_chat.py
+++ b/modules/ui_chat.py
@@ -70,7 +70,6 @@ def create_ui():
shared.gradio['Impersonate'] = gr.Button('Impersonate (Ctrl + Shift + M)', elem_id='Impersonate')
shared.gradio['Send dummy message'] = gr.Button('Send dummy message')
shared.gradio['Send dummy reply'] = gr.Button('Send dummy reply')
- shared.gradio['send-chat-to-default'] = gr.Button('Send to Default')
shared.gradio['send-chat-to-notebook'] = gr.Button('Send to Notebook')
shared.gradio['show_controls'] = gr.Checkbox(value=shared.settings['show_controls'], label='Show controls (Ctrl+S)', elem_id='show-controls')
@@ -111,9 +110,9 @@ def create_ui():
shared.gradio['edit_message'] = gr.Button(elem_id="Edit-message")
-def create_chat_settings_ui():
+def create_character_settings_ui():
mu = shared.args.multi_user
- with gr.Tab('Chat'):
+ with gr.Tab('Character', elem_id="character-tab"):
with gr.Row():
with gr.Column(scale=8):
with gr.Tab("Character"):
@@ -125,12 +124,12 @@ def create_chat_settings_ui():
shared.gradio['restore_character'] = gr.Button('Restore character', elem_classes='refresh-button', interactive=True, elem_id='restore-character')
shared.gradio['name2'] = gr.Textbox(value=shared.settings['name2'], lines=1, label='Character\'s name')
- shared.gradio['context'] = gr.Textbox(value=shared.settings['context'], lines=10, label='Context', elem_classes=['add_scrollbar'])
- shared.gradio['greeting'] = gr.Textbox(value=shared.settings['greeting'], lines=5, label='Greeting', elem_classes=['add_scrollbar'])
+ shared.gradio['context'] = gr.Textbox(value=shared.settings['context'], lines=10, label='Context', elem_classes=['add_scrollbar'], elem_id="character-context")
+ shared.gradio['greeting'] = gr.Textbox(value=shared.settings['greeting'], lines=5, label='Greeting', elem_classes=['add_scrollbar'], elem_id="character-greeting")
with gr.Tab("User"):
shared.gradio['name1'] = gr.Textbox(value=shared.settings['name1'], lines=1, label='Name')
- shared.gradio['user_bio'] = gr.Textbox(value=shared.settings['user_bio'], lines=10, label='Description', info='Here you can optionally write a description of yourself.', placeholder='{{user}}\'s personality: ...', elem_classes=['add_scrollbar'])
+ shared.gradio['user_bio'] = gr.Textbox(value=shared.settings['user_bio'], lines=10, label='Description', info='Here you can optionally write a description of yourself.', placeholder='{{user}}\'s personality: ...', elem_classes=['add_scrollbar'], elem_id="user-description")
with gr.Tab('Chat history'):
with gr.Row():
@@ -163,6 +162,9 @@ def create_chat_settings_ui():
shared.gradio['character_picture'] = gr.Image(label='Character picture', type='pil', interactive=not mu)
shared.gradio['your_picture'] = gr.Image(label='Your picture', type='pil', value=Image.open(Path('user_data/cache/pfp_me.png')) if Path('user_data/cache/pfp_me.png').exists() else None, interactive=not mu)
+
+def create_chat_settings_ui():
+ mu = shared.args.multi_user
with gr.Tab('Instruction template'):
with gr.Row():
with gr.Column():
@@ -178,15 +180,12 @@ def create_chat_settings_ui():
with gr.Row():
with gr.Column():
- shared.gradio['custom_system_message'] = gr.Textbox(value=shared.settings['custom_system_message'], lines=2, label='Custom system message', info='If not empty, will be used instead of the default one.', elem_classes=['add_scrollbar'])
- shared.gradio['instruction_template_str'] = gr.Textbox(value=shared.settings['instruction_template_str'], label='Instruction template', lines=24, info='This gets autodetected; you usually don\'t need to change it. Used in instruct and chat-instruct modes.', elem_classes=['add_scrollbar', 'monospace'])
+ shared.gradio['instruction_template_str'] = gr.Textbox(value=shared.settings['instruction_template_str'], label='Instruction template', lines=24, info='This gets autodetected; you usually don\'t need to change it. Used in instruct and chat-instruct modes.', elem_classes=['add_scrollbar', 'monospace'], elem_id='instruction-template-str')
with gr.Row():
- shared.gradio['send_instruction_to_default'] = gr.Button('Send to default', elem_classes=['small-button'])
shared.gradio['send_instruction_to_notebook'] = gr.Button('Send to notebook', elem_classes=['small-button'])
- shared.gradio['send_instruction_to_negative_prompt'] = gr.Button('Send to negative prompt', elem_classes=['small-button'])
with gr.Column():
- shared.gradio['chat_template_str'] = gr.Textbox(value=shared.settings['chat_template_str'], label='Chat template', lines=22, elem_classes=['add_scrollbar', 'monospace'])
+ shared.gradio['chat_template_str'] = gr.Textbox(value=shared.settings['chat_template_str'], label='Chat template', lines=22, elem_classes=['add_scrollbar', 'monospace'], info='Defines how the chat prompt in chat/chat-instruct modes is generated.', elem_id='chat-template-str')
def create_event_handlers():
@@ -298,7 +297,7 @@ def create_event_handlers():
shared.gradio['mode'].change(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
chat.handle_mode_change, gradio('interface_state'), gradio('history', 'display', 'chat_style', 'chat-instruct_command', 'unique_id'), show_progress=False).then(
- None, gradio('mode'), None, js="(mode) => {const characterContainer = document.getElementById('character-menu').parentNode.parentNode; const isInChatTab = document.querySelector('#chat-controls').contains(characterContainer); if (isInChatTab) { characterContainer.style.display = mode === 'instruct' ? 'none' : ''; }}")
+ None, gradio('mode'), None, js="(mode) => {const characterContainer = document.getElementById('character-menu').parentNode.parentNode; const isInChatTab = document.querySelector('#chat-controls').contains(characterContainer); if (isInChatTab) { characterContainer.style.display = mode === 'instruct' ? 'none' : ''; } if (mode === 'instruct') document.querySelectorAll('.bigProfilePicture').forEach(el => el.remove());}")
shared.gradio['chat_style'].change(chat.redraw_html, gradio(reload_arr), gradio('display'), show_progress=False)
@@ -343,29 +342,14 @@ def create_event_handlers():
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
chat.handle_your_picture_change, gradio('your_picture', 'interface_state'), gradio('display'), show_progress=False)
- shared.gradio['send_instruction_to_default'].click(
- ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
- chat.handle_send_instruction_click, gradio('interface_state'), gradio('textbox-default'), show_progress=False).then(
- None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_default()}}')
-
shared.gradio['send_instruction_to_notebook'].click(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
- chat.handle_send_instruction_click, gradio('interface_state'), gradio('textbox-notebook'), show_progress=False).then(
+ chat.handle_send_instruction_click, gradio('interface_state'), gradio('textbox-notebook', 'textbox-default', 'output_textbox'), show_progress=False).then(
None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_notebook()}}')
- shared.gradio['send_instruction_to_negative_prompt'].click(
- ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
- chat.handle_send_instruction_click, gradio('interface_state'), gradio('negative_prompt'), show_progress=False).then(
- None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_generation_parameters()}}')
-
- shared.gradio['send-chat-to-default'].click(
- ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
- chat.handle_send_chat_click, gradio('interface_state'), gradio('textbox-default'), show_progress=False).then(
- None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_default()}}')
-
shared.gradio['send-chat-to-notebook'].click(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
- chat.handle_send_chat_click, gradio('interface_state'), gradio('textbox-notebook'), show_progress=False).then(
+ chat.handle_send_chat_click, gradio('interface_state'), gradio('textbox-notebook', 'textbox-default', 'output_textbox'), show_progress=False).then(
None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_notebook()}}')
shared.gradio['show_controls'].change(None, gradio('show_controls'), None, js=f'(x) => {{{ui.show_controls_js}; toggle_controls(x)}}')
diff --git a/modules/ui_default.py b/modules/ui_default.py
index 8acc4b10..44af48a3 100644
--- a/modules/ui_default.py
+++ b/modules/ui_default.py
@@ -1,3 +1,5 @@
+from pathlib import Path
+
import gradio as gr
from modules import logits, shared, ui, utils
@@ -7,6 +9,7 @@ from modules.text_generation import (
get_token_ids,
stop_everything_event
)
+from modules.ui_notebook import store_notebook_state_and_debounce
from modules.utils import gradio
inputs = ('textbox-default', 'interface_state')
@@ -15,11 +18,12 @@ outputs = ('output_textbox', 'html-default')
def create_ui():
mu = shared.args.multi_user
- with gr.Tab('Default', elem_id='default-tab'):
+ with gr.Row(visible=shared.settings['show_two_notebook_columns']) as shared.gradio['default-tab']:
with gr.Row():
with gr.Column():
with gr.Row():
- shared.gradio['textbox-default'] = gr.Textbox(value=load_prompt(shared.settings['prompt-default']), lines=27, label='Input', elem_classes=['textbox_default', 'add_scrollbar'])
+ initial_text = load_prompt(shared.settings['prompt-notebook'])
+ shared.gradio['textbox-default'] = gr.Textbox(value=initial_text, lines=27, label='Input', elem_classes=['textbox_default', 'add_scrollbar'])
shared.gradio['token-counter-default'] = gr.HTML(value="0", elem_id="default-token-counter")
with gr.Row():
@@ -28,11 +32,21 @@ def create_ui():
shared.gradio['Generate-default'] = gr.Button('Generate', variant='primary')
with gr.Row():
- shared.gradio['prompt_menu-default'] = gr.Dropdown(choices=utils.get_available_prompts(), value=shared.settings['prompt-default'], label='Prompt', elem_classes='slim-dropdown')
+ shared.gradio['prompt_menu-default'] = gr.Dropdown(choices=utils.get_available_prompts(), value=shared.settings['prompt-notebook'], label='Prompt', elem_classes='slim-dropdown')
ui.create_refresh_button(shared.gradio['prompt_menu-default'], lambda: None, lambda: {'choices': utils.get_available_prompts()}, 'refresh-button', interactive=not mu)
- shared.gradio['save_prompt-default'] = gr.Button('💾', elem_classes='refresh-button', interactive=not mu)
+ shared.gradio['new_prompt-default'] = gr.Button('New', elem_classes='refresh-button', interactive=not mu)
+ shared.gradio['rename_prompt-default'] = gr.Button('Rename', elem_classes='refresh-button', interactive=not mu)
shared.gradio['delete_prompt-default'] = gr.Button('🗑️', elem_classes='refresh-button', interactive=not mu)
+ # Rename elements (initially hidden)
+ shared.gradio['rename_prompt_to-default'] = gr.Textbox(label="New name", elem_classes=['no-background'], visible=False)
+ shared.gradio['rename_prompt-cancel-default'] = gr.Button('Cancel', elem_classes=['refresh-button'], visible=False)
+ shared.gradio['rename_prompt-confirm-default'] = gr.Button('Confirm', elem_classes=['refresh-button'], variant='primary', visible=False)
+
+ # Delete confirmation elements (initially hidden)
+ shared.gradio['delete_prompt-cancel-default'] = gr.Button('Cancel', elem_classes=['refresh-button'], visible=False)
+ shared.gradio['delete_prompt-confirm-default'] = gr.Button('Confirm', variant='stop', elem_classes=['refresh-button'], visible=False)
+
with gr.Column():
with gr.Tab('Raw'):
shared.gradio['output_textbox'] = gr.Textbox(lines=27, label='Output', elem_id='textbox-default', elem_classes=['textbox_default_output', 'add_scrollbar'])
@@ -64,7 +78,7 @@ def create_event_handlers():
shared.gradio['Generate-default'].click(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-default', 'Generate-default')).then(
- generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then(
+ generate_reply_wrapper, gradio('textbox-default', 'interface_state'), gradio(outputs), show_progress=False).then(
lambda state, left, right: state.update({'textbox-default': left, 'output_textbox': right}), gradio('interface_state', 'textbox-default', 'output_textbox'), None).then(
lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-default', 'Generate-default')).then(
None, None, None, js=f'() => {{{ui.audio_notification_js}}}')
@@ -72,7 +86,7 @@ def create_event_handlers():
shared.gradio['textbox-default'].submit(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-default', 'Generate-default')).then(
- generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then(
+ generate_reply_wrapper, gradio('textbox-default', 'interface_state'), gradio(outputs), show_progress=False).then(
lambda state, left, right: state.update({'textbox-default': left, 'output_textbox': right}), gradio('interface_state', 'textbox-default', 'output_textbox'), None).then(
lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-default', 'Generate-default')).then(
None, None, None, js=f'() => {{{ui.audio_notification_js}}}')
@@ -80,16 +94,60 @@ def create_event_handlers():
shared.gradio['Continue-default'].click(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-default', 'Generate-default')).then(
- generate_reply_wrapper, [shared.gradio['output_textbox']] + gradio(inputs)[1:], gradio(outputs), show_progress=False).then(
+ generate_reply_wrapper, gradio('output_textbox', 'interface_state'), gradio(outputs), show_progress=False).then(
lambda state, left, right: state.update({'textbox-default': left, 'output_textbox': right}), gradio('interface_state', 'textbox-default', 'output_textbox'), None).then(
lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-default', 'Generate-default')).then(
None, None, None, js=f'() => {{{ui.audio_notification_js}}}')
shared.gradio['Stop-default'].click(stop_everything_event, None, None, queue=False)
shared.gradio['markdown_render-default'].click(lambda x: x, gradio('output_textbox'), gradio('markdown-default'), queue=False)
- shared.gradio['prompt_menu-default'].change(load_prompt, gradio('prompt_menu-default'), gradio('textbox-default'), show_progress=False)
- shared.gradio['save_prompt-default'].click(handle_save_prompt, gradio('textbox-default'), gradio('save_contents', 'save_filename', 'save_root', 'file_saver'), show_progress=False)
- shared.gradio['delete_prompt-default'].click(handle_delete_prompt, gradio('prompt_menu-default'), gradio('delete_filename', 'delete_root', 'file_deleter'), show_progress=False)
+ shared.gradio['prompt_menu-default'].change(lambda x: (load_prompt(x), ""), gradio('prompt_menu-default'), gradio('textbox-default', 'output_textbox'), show_progress=False)
+ shared.gradio['new_prompt-default'].click(handle_new_prompt, None, gradio('prompt_menu-default'), show_progress=False)
+
+    # Autosave the Default tab input on change, reusing the notebook's debounced saving
+ shared.gradio['textbox-default'].change(
+ store_notebook_state_and_debounce,
+ gradio('textbox-default', 'prompt_menu-default'),
+ None,
+ show_progress=False
+ )
+
+ shared.gradio['delete_prompt-default'].click(
+ lambda: [gr.update(visible=False), gr.update(visible=True), gr.update(visible=True)],
+ None,
+ gradio('delete_prompt-default', 'delete_prompt-cancel-default', 'delete_prompt-confirm-default'),
+ show_progress=False)
+
+ shared.gradio['delete_prompt-cancel-default'].click(
+ lambda: [gr.update(visible=True), gr.update(visible=False), gr.update(visible=False)],
+ None,
+ gradio('delete_prompt-default', 'delete_prompt-cancel-default', 'delete_prompt-confirm-default'),
+ show_progress=False)
+
+ shared.gradio['delete_prompt-confirm-default'].click(
+ handle_delete_prompt_confirm_default,
+ gradio('prompt_menu-default'),
+ gradio('prompt_menu-default', 'delete_prompt-default', 'delete_prompt-cancel-default', 'delete_prompt-confirm-default'),
+ show_progress=False)
+
+ shared.gradio['rename_prompt-default'].click(
+ handle_rename_prompt_click_default,
+ gradio('prompt_menu-default'),
+ gradio('rename_prompt_to-default', 'rename_prompt-default', 'rename_prompt-cancel-default', 'rename_prompt-confirm-default'),
+ show_progress=False)
+
+ shared.gradio['rename_prompt-cancel-default'].click(
+ lambda: [gr.update(visible=False), gr.update(visible=True), gr.update(visible=False), gr.update(visible=False)],
+ None,
+ gradio('rename_prompt_to-default', 'rename_prompt-default', 'rename_prompt-cancel-default', 'rename_prompt-confirm-default'),
+ show_progress=False)
+
+ shared.gradio['rename_prompt-confirm-default'].click(
+ handle_rename_prompt_confirm_default,
+ gradio('rename_prompt_to-default', 'prompt_menu-default'),
+ gradio('prompt_menu-default', 'rename_prompt_to-default', 'rename_prompt-default', 'rename_prompt-cancel-default', 'rename_prompt-confirm-default'),
+ show_progress=False)
+
shared.gradio['textbox-default'].change(lambda x: f"{count_tokens(x)}", gradio('textbox-default'), gradio('token-counter-default'), show_progress=False)
shared.gradio['get_logits-default'].click(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
@@ -98,18 +156,61 @@ def create_event_handlers():
shared.gradio['get_tokens-default'].click(get_token_ids, gradio('textbox-default'), gradio('tokens-default'), show_progress=False)
-def handle_save_prompt(text):
+def handle_new_prompt():
+ new_name = utils.current_time()
+
+ # Create the new prompt file
+ prompt_path = Path("user_data/logs/notebook") / f"{new_name}.txt"
+ prompt_path.parent.mkdir(parents=True, exist_ok=True)
+ prompt_path.write_text("In this story,", encoding='utf-8')
+
+ return gr.update(choices=utils.get_available_prompts(), value=new_name)
+
+
+def handle_delete_prompt_confirm_default(prompt_name):
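+    # Capture the deleted prompt's position first so a neighboring prompt can be selected afterwards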
+ available_prompts = utils.get_available_prompts()
+ current_index = available_prompts.index(prompt_name) if prompt_name in available_prompts else 0
+
+ (Path("user_data/logs/notebook") / f"{prompt_name}.txt").unlink(missing_ok=True)
+ available_prompts = utils.get_available_prompts()
+
+ if available_prompts:
+ new_value = available_prompts[min(current_index, len(available_prompts) - 1)]
+ else:
+ new_value = utils.current_time()
+ Path("user_data/logs/notebook").mkdir(parents=True, exist_ok=True)
+        (Path("user_data/logs/notebook") / f"{new_value}.txt").write_text("In this story,", encoding='utf-8')
+ available_prompts = [new_value]
+
return [
- text,
- utils.current_time() + ".txt",
- "user_data/prompts/",
+ gr.update(choices=available_prompts, value=new_value),
+ gr.update(visible=True),
+ gr.update(visible=False),
+ gr.update(visible=False)
+ ]
+
+
+def handle_rename_prompt_click_default(current_name):
+ return [
+ gr.update(value=current_name, visible=True),
+ gr.update(visible=False),
+ gr.update(visible=True),
gr.update(visible=True)
]
-def handle_delete_prompt(prompt):
+def handle_rename_prompt_confirm_default(new_name, current_name):
+ old_path = Path("user_data/logs/notebook") / f"{current_name}.txt"
+ new_path = Path("user_data/logs/notebook") / f"{new_name}.txt"
+
+ if old_path.exists() and not new_path.exists():
+ old_path.rename(new_path)
+
+ available_prompts = utils.get_available_prompts()
return [
- prompt + ".txt",
- "user_data/prompts/",
- gr.update(visible=True)
+ gr.update(choices=available_prompts, value=new_name),
+ gr.update(visible=False),
+ gr.update(visible=True),
+ gr.update(visible=False),
+ gr.update(visible=False)
]
diff --git a/modules/ui_model_menu.py b/modules/ui_model_menu.py
index 9e982f0e..6b106203 100644
--- a/modules/ui_model_menu.py
+++ b/modules/ui_model_menu.py
@@ -135,7 +135,7 @@ def create_event_handlers():
# with the model defaults (if any), and then the model is loaded
shared.gradio['model_menu'].change(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
- handle_load_model_event_initial, gradio('model_menu', 'interface_state'), gradio(ui.list_interface_input_elements()) + gradio('interface_state'), show_progress=False).then(
+ handle_load_model_event_initial, gradio('model_menu', 'interface_state'), gradio(ui.list_interface_input_elements()) + gradio('interface_state') + gradio('vram_info'), show_progress=False).then(
partial(load_model_wrapper, autoload=False), gradio('model_menu', 'loader'), gradio('model_status'), show_progress=True).success(
handle_load_model_event_final, gradio('truncation_length', 'loader', 'interface_state'), gradio('truncation_length', 'filter_by_loader'), show_progress=False)
@@ -174,7 +174,12 @@ def create_event_handlers():
def load_model_wrapper(selected_model, loader, autoload=False):
- settings = get_model_metadata(selected_model)
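+    # If the model files or metadata are missing, show the traceback in the model status box instead of raising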
+ try:
+ settings = get_model_metadata(selected_model)
+ except FileNotFoundError:
+ exc = traceback.format_exc()
+ yield exc.replace('\n', '\n\n')
+ return
if not autoload:
yield "### {}\n\n- Settings updated: Click \"Load\" to load the model\n- Max sequence length: {}".format(selected_model, settings['truncation_length_info'])
@@ -374,7 +379,8 @@ def handle_load_model_event_initial(model, state):
output = ui.apply_interface_values(state)
update_model_parameters(state) # This updates the command-line flags
- return output + [state]
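+    # Keep any existing VRAM estimate from the interface state, falling back to a plain label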
+    vram_info = state.get('vram_info', "Estimated VRAM to load the model:\n")
+ return output + [state] + [vram_info]
def handle_load_model_event_final(truncation_length, loader, state):
diff --git a/modules/ui_notebook.py b/modules/ui_notebook.py
index 3f79a93c..939d81f7 100644
--- a/modules/ui_notebook.py
+++ b/modules/ui_notebook.py
@@ -1,3 +1,7 @@
+import threading
+import time
+from pathlib import Path
+
import gradio as gr
from modules import logits, shared, ui, utils
@@ -7,22 +11,27 @@ from modules.text_generation import (
get_token_ids,
stop_everything_event
)
-from modules.ui_default import handle_delete_prompt, handle_save_prompt
from modules.utils import gradio
+_notebook_file_lock = threading.Lock()
+_notebook_auto_save_timer = None
+_last_notebook_text = None
+_last_notebook_prompt = None
+
inputs = ('textbox-notebook', 'interface_state')
outputs = ('textbox-notebook', 'html-notebook')
def create_ui():
mu = shared.args.multi_user
- with gr.Tab('Notebook', elem_id='notebook-tab'):
+ with gr.Row(visible=not shared.settings['show_two_notebook_columns']) as shared.gradio['notebook-tab']:
shared.gradio['last_input-notebook'] = gr.State('')
with gr.Row():
with gr.Column(scale=4):
with gr.Tab('Raw'):
with gr.Row():
- shared.gradio['textbox-notebook'] = gr.Textbox(value=load_prompt(shared.settings['prompt-notebook']), lines=27, elem_id='textbox-notebook', elem_classes=['textbox', 'add_scrollbar'])
+ initial_text = load_prompt(shared.settings['prompt-notebook'])
+ shared.gradio['textbox-notebook'] = gr.Textbox(label="", value=initial_text, lines=27, elem_id='textbox-notebook', elem_classes=['textbox', 'add_scrollbar'])
shared.gradio['token-counter-notebook'] = gr.HTML(value="0", elem_id="notebook-token-counter")
with gr.Tab('Markdown'):
@@ -57,9 +66,19 @@ def create_ui():
gr.HTML('')
with gr.Row():
shared.gradio['prompt_menu-notebook'] = gr.Dropdown(choices=utils.get_available_prompts(), value=shared.settings['prompt-notebook'], label='Prompt', elem_classes='slim-dropdown')
- ui.create_refresh_button(shared.gradio['prompt_menu-notebook'], lambda: None, lambda: {'choices': utils.get_available_prompts()}, ['refresh-button', 'refresh-button-small'], interactive=not mu)
- shared.gradio['save_prompt-notebook'] = gr.Button('💾', elem_classes=['refresh-button', 'refresh-button-small'], interactive=not mu)
- shared.gradio['delete_prompt-notebook'] = gr.Button('🗑️', elem_classes=['refresh-button', 'refresh-button-small'], interactive=not mu)
+
+ with gr.Row():
+ ui.create_refresh_button(shared.gradio['prompt_menu-notebook'], lambda: None, lambda: {'choices': utils.get_available_prompts()}, ['refresh-button'], interactive=not mu)
+ shared.gradio['new_prompt-notebook'] = gr.Button('New', elem_classes=['refresh-button'], interactive=not mu)
+ shared.gradio['rename_prompt-notebook'] = gr.Button('Rename', elem_classes=['refresh-button'], interactive=not mu)
+ shared.gradio['delete_prompt-notebook'] = gr.Button('🗑️', elem_classes=['refresh-button'], interactive=not mu)
+ shared.gradio['delete_prompt-confirm-notebook'] = gr.Button('Confirm', variant='stop', elem_classes=['refresh-button'], visible=False)
+ shared.gradio['delete_prompt-cancel-notebook'] = gr.Button('Cancel', elem_classes=['refresh-button'], visible=False)
+
+ with gr.Row(visible=False) as shared.gradio['rename-row-notebook']:
+ shared.gradio['rename_prompt_to-notebook'] = gr.Textbox(label="New name", elem_classes=['no-background'])
+ shared.gradio['rename_prompt-cancel-notebook'] = gr.Button('Cancel', elem_classes=['refresh-button'])
+ shared.gradio['rename_prompt-confirm-notebook'] = gr.Button('Confirm', elem_classes=['refresh-button'], variant='primary')
def create_event_handlers():
@@ -67,7 +86,7 @@ def create_event_handlers():
lambda x: x, gradio('textbox-notebook'), gradio('last_input-notebook')).then(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-notebook', 'Generate-notebook')).then(
- generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then(
+ generate_and_save_wrapper_notebook, gradio('textbox-notebook', 'interface_state', 'prompt_menu-notebook'), gradio(outputs), show_progress=False).then(
lambda state, text: state.update({'textbox-notebook': text}), gradio('interface_state', 'textbox-notebook'), None).then(
lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-notebook', 'Generate-notebook')).then(
None, None, None, js=f'() => {{{ui.audio_notification_js}}}')
@@ -76,7 +95,7 @@ def create_event_handlers():
lambda x: x, gradio('textbox-notebook'), gradio('last_input-notebook')).then(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-notebook', 'Generate-notebook')).then(
- generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then(
+ generate_and_save_wrapper_notebook, gradio('textbox-notebook', 'interface_state', 'prompt_menu-notebook'), gradio(outputs), show_progress=False).then(
lambda state, text: state.update({'textbox-notebook': text}), gradio('interface_state', 'textbox-notebook'), None).then(
lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-notebook', 'Generate-notebook')).then(
None, None, None, js=f'() => {{{ui.audio_notification_js}}}')
@@ -85,7 +104,7 @@ def create_event_handlers():
lambda x: x, gradio('last_input-notebook'), gradio('textbox-notebook'), show_progress=False).then(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-notebook', 'Generate-notebook')).then(
- generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then(
+ generate_and_save_wrapper_notebook, gradio('textbox-notebook', 'interface_state', 'prompt_menu-notebook'), gradio(outputs), show_progress=False).then(
lambda state, text: state.update({'textbox-notebook': text}), gradio('interface_state', 'textbox-notebook'), None).then(
lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-notebook', 'Generate-notebook')).then(
None, None, None, js=f'() => {{{ui.audio_notification_js}}}')
@@ -97,11 +116,173 @@ def create_event_handlers():
shared.gradio['markdown_render-notebook'].click(lambda x: x, gradio('textbox-notebook'), gradio('markdown-notebook'), queue=False)
shared.gradio['Stop-notebook'].click(stop_everything_event, None, None, queue=False)
shared.gradio['prompt_menu-notebook'].change(load_prompt, gradio('prompt_menu-notebook'), gradio('textbox-notebook'), show_progress=False)
- shared.gradio['save_prompt-notebook'].click(handle_save_prompt, gradio('textbox-notebook'), gradio('save_contents', 'save_filename', 'save_root', 'file_saver'), show_progress=False)
- shared.gradio['delete_prompt-notebook'].click(handle_delete_prompt, gradio('prompt_menu-notebook'), gradio('delete_filename', 'delete_root', 'file_deleter'), show_progress=False)
+ shared.gradio['new_prompt-notebook'].click(handle_new_prompt, None, gradio('prompt_menu-notebook'), show_progress=False)
+
+ shared.gradio['delete_prompt-notebook'].click(
+ lambda: [gr.update(visible=False), gr.update(visible=True), gr.update(visible=True)],
+ None,
+ gradio('delete_prompt-notebook', 'delete_prompt-cancel-notebook', 'delete_prompt-confirm-notebook'),
+ show_progress=False)
+
+ shared.gradio['delete_prompt-cancel-notebook'].click(
+ lambda: [gr.update(visible=True), gr.update(visible=False), gr.update(visible=False)],
+ None,
+ gradio('delete_prompt-notebook', 'delete_prompt-cancel-notebook', 'delete_prompt-confirm-notebook'),
+ show_progress=False)
+
+ shared.gradio['delete_prompt-confirm-notebook'].click(
+ handle_delete_prompt_confirm_notebook,
+ gradio('prompt_menu-notebook'),
+ gradio('prompt_menu-notebook', 'delete_prompt-notebook', 'delete_prompt-cancel-notebook', 'delete_prompt-confirm-notebook'),
+ show_progress=False)
+
+ shared.gradio['rename_prompt-notebook'].click(
+ handle_rename_prompt_click_notebook,
+ gradio('prompt_menu-notebook'),
+ gradio('rename_prompt_to-notebook', 'rename_prompt-notebook', 'rename-row-notebook'),
+ show_progress=False)
+
+ shared.gradio['rename_prompt-cancel-notebook'].click(
+ lambda: [gr.update(visible=True), gr.update(visible=False)],
+ None,
+ gradio('rename_prompt-notebook', 'rename-row-notebook'),
+ show_progress=False)
+
+ shared.gradio['rename_prompt-confirm-notebook'].click(
+ handle_rename_prompt_confirm_notebook,
+ gradio('rename_prompt_to-notebook', 'prompt_menu-notebook'),
+ gradio('prompt_menu-notebook', 'rename_prompt-notebook', 'rename-row-notebook'),
+ show_progress=False)
+
shared.gradio['textbox-notebook'].input(lambda x: f"{count_tokens(x)}", gradio('textbox-notebook'), gradio('token-counter-notebook'), show_progress=False)
+ shared.gradio['textbox-notebook'].change(
+ store_notebook_state_and_debounce,
+ gradio('textbox-notebook', 'prompt_menu-notebook'),
+ None,
+ show_progress=False
+ )
+
shared.gradio['get_logits-notebook'].click(
ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then(
logits.get_next_logits, gradio('textbox-notebook', 'interface_state', 'use_samplers-notebook', 'logits-notebook'), gradio('logits-notebook', 'logits-notebook-previous'), show_progress=False)
shared.gradio['get_tokens-notebook'].click(get_token_ids, gradio('textbox-notebook'), gradio('tokens-notebook'), show_progress=False)
+
+
+def generate_and_save_wrapper_notebook(textbox_content, interface_state, prompt_name):
+ """Generate reply and automatically save the result for notebook mode with periodic saves"""
+ last_save_time = time.monotonic()
+ save_interval = 8
+ output = textbox_content
+
+ # Initial autosave
+ safe_autosave_prompt(output, prompt_name)
+
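+    # generate_reply_wrapper is a generator that streams partial outputs; re-yield them so the UI updates live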
+ for i, (output, html_output) in enumerate(generate_reply_wrapper(textbox_content, interface_state)):
+ yield output, html_output
+
+ current_time = time.monotonic()
+ # Save on first iteration or if save_interval seconds have passed
+ if i == 0 or (current_time - last_save_time) >= save_interval:
+ safe_autosave_prompt(output, prompt_name)
+ last_save_time = current_time
+
+ # Final autosave
+ safe_autosave_prompt(output, prompt_name)
+
+
+def handle_new_prompt():
+ new_name = utils.current_time()
+
+ # Create the new prompt file
+ prompt_path = Path("user_data/logs/notebook") / f"{new_name}.txt"
+ prompt_path.parent.mkdir(parents=True, exist_ok=True)
+ prompt_path.write_text("In this story,", encoding='utf-8')
+
+ return gr.update(choices=utils.get_available_prompts(), value=new_name)
+
+
+def handle_delete_prompt_confirm_notebook(prompt_name):
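+    # Capture the deleted prompt's position first so a neighboring prompt can be selected afterwards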
+ available_prompts = utils.get_available_prompts()
+ current_index = available_prompts.index(prompt_name) if prompt_name in available_prompts else 0
+
+ (Path("user_data/logs/notebook") / f"{prompt_name}.txt").unlink(missing_ok=True)
+ available_prompts = utils.get_available_prompts()
+
+ if available_prompts:
+ new_value = available_prompts[min(current_index, len(available_prompts) - 1)]
+ else:
+ new_value = utils.current_time()
+ Path("user_data/logs/notebook").mkdir(parents=True, exist_ok=True)
+        (Path("user_data/logs/notebook") / f"{new_value}.txt").write_text("In this story,", encoding='utf-8')
+ available_prompts = [new_value]
+
+ return [
+ gr.update(choices=available_prompts, value=new_value),
+ gr.update(visible=True),
+ gr.update(visible=False),
+ gr.update(visible=False)
+ ]
+
+
+def handle_rename_prompt_click_notebook(current_name):
+ return [
+ gr.update(value=current_name),
+ gr.update(visible=False),
+ gr.update(visible=True)
+ ]
+
+
+def handle_rename_prompt_confirm_notebook(new_name, current_name):
+ old_path = Path("user_data/logs/notebook") / f"{current_name}.txt"
+ new_path = Path("user_data/logs/notebook") / f"{new_name}.txt"
+
+ if old_path.exists() and not new_path.exists():
+ old_path.rename(new_path)
+
+ available_prompts = utils.get_available_prompts()
+ return [
+ gr.update(choices=available_prompts, value=new_name),
+ gr.update(visible=True),
+ gr.update(visible=False)
+ ]
+
+
+def autosave_prompt(text, prompt_name):
+ """Automatically save the text to the selected prompt file"""
+ if prompt_name and text.strip():
+ prompt_path = Path("user_data/logs/notebook") / f"{prompt_name}.txt"
+ prompt_path.parent.mkdir(parents=True, exist_ok=True)
+ prompt_path.write_text(text, encoding='utf-8')
+
+
+def safe_autosave_prompt(content, prompt_name):
+ """Thread-safe wrapper for autosave_prompt to prevent file corruption"""
+ with _notebook_file_lock:
+ autosave_prompt(content, prompt_name)
+
+
+def store_notebook_state_and_debounce(text, prompt_name):
+ """Store current notebook state and trigger debounced save"""
+ global _notebook_auto_save_timer, _last_notebook_text, _last_notebook_prompt
+
+ if shared.args.multi_user:
+ return
+
+ _last_notebook_text = text
+ _last_notebook_prompt = prompt_name
+
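+    # Debounce: cancel any pending save and restart the timer so only the latest text gets written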
+ if _notebook_auto_save_timer is not None:
+ _notebook_auto_save_timer.cancel()
+
+ _notebook_auto_save_timer = threading.Timer(1.0, _perform_notebook_debounced_save)
+ _notebook_auto_save_timer.start()
+
+
+def _perform_notebook_debounced_save():
+ """Actually perform the notebook save using the stored state"""
+ try:
+ if _last_notebook_text is not None and _last_notebook_prompt is not None:
+ safe_autosave_prompt(_last_notebook_text, _last_notebook_prompt)
+ except Exception as e:
+ print(f"Notebook auto-save failed: {e}")
diff --git a/modules/ui_parameters.py b/modules/ui_parameters.py
index e2b10554..e42e4c0c 100644
--- a/modules/ui_parameters.py
+++ b/modules/ui_parameters.py
@@ -93,7 +93,7 @@ def create_ui():
with gr.Column():
shared.gradio['truncation_length'] = gr.Number(precision=0, step=256, value=get_truncation_length(), label='Truncate the prompt up to this length', info='The leftmost tokens are removed if the prompt exceeds this length.')
shared.gradio['seed'] = gr.Number(value=shared.settings['seed'], label='Seed (-1 for random)')
-
+ shared.gradio['custom_system_message'] = gr.Textbox(value=shared.settings['custom_system_message'], lines=2, label='Custom system message', info='If not empty, will be used instead of the default one.', elem_classes=['add_scrollbar'])
shared.gradio['custom_stopping_strings'] = gr.Textbox(lines=2, value=shared.settings["custom_stopping_strings"] or None, label='Custom stopping strings', info='Written between "" and separated by commas.', placeholder='"\\n", "\\nYou:"')
shared.gradio['custom_token_bans'] = gr.Textbox(value=shared.settings['custom_token_bans'] or None, label='Token bans', info='Token IDs to ban, separated by commas. The IDs can be found in the Default or Notebook tab.')
shared.gradio['negative_prompt'] = gr.Textbox(value=shared.settings['negative_prompt'], label='Negative prompt', info='For CFG. Only used when guidance_scale is different than 1.', lines=3, elem_classes=['add_scrollbar'])
diff --git a/modules/ui_session.py b/modules/ui_session.py
index 0673828e..a69e155b 100644
--- a/modules/ui_session.py
+++ b/modules/ui_session.py
@@ -11,7 +11,9 @@ def create_ui():
with gr.Column():
gr.Markdown("## Settings")
shared.gradio['toggle_dark_mode'] = gr.Button('Toggle light/dark theme 💡', elem_classes='refresh-button')
+ shared.gradio['show_two_notebook_columns'] = gr.Checkbox(label='Show two columns in the Notebook tab', value=shared.settings['show_two_notebook_columns'])
shared.gradio['paste_to_attachment'] = gr.Checkbox(label='Turn long pasted text into attachments in the Chat tab', value=shared.settings['paste_to_attachment'], elem_id='paste_to_attachment')
+ shared.gradio['include_past_attachments'] = gr.Checkbox(label='Include attachments/search results from previous messages in the chat prompt', value=shared.settings['include_past_attachments'])
with gr.Column():
gr.Markdown("## Extensions & flags")
@@ -33,6 +35,12 @@ def create_ui():
lambda x: 'dark' if x == 'light' else 'light', gradio('theme_state'), gradio('theme_state')).then(
None, None, None, js=f'() => {{{ui.dark_theme_js}; toggleDarkMode(); localStorage.setItem("theme", document.body.classList.contains("dark") ? "dark" : "light")}}')
+ shared.gradio['show_two_notebook_columns'].change(
+ handle_default_to_notebook_change,
+ gradio('show_two_notebook_columns', 'textbox-default', 'output_textbox', 'prompt_menu-default', 'textbox-notebook', 'prompt_menu-notebook'),
+ gradio('default-tab', 'notebook-tab', 'textbox-default', 'output_textbox', 'prompt_menu-default', 'textbox-notebook', 'prompt_menu-notebook')
+ )
+
# Reset interface event
shared.gradio['reset_interface'].click(
set_interface_arguments, gradio('extensions_menu', 'bool_menu'), None).then(
@@ -49,6 +57,31 @@ def handle_save_settings(state, preset, extensions, show_controls, theme):
]
+def handle_default_to_notebook_change(show_two_columns, default_input, default_output, default_prompt, notebook_input, notebook_prompt):
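+    # Return order matches the outputs: default-tab, notebook-tab, textbox-default, output_textbox, prompt_menu-default, textbox-notebook, prompt_menu-notebook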
+ if show_two_columns:
+        # Switching to the two-column Default view: carry the notebook text and prompt over
+ return [
+ gr.update(visible=True),
+ gr.update(visible=False),
+ notebook_input,
+ "",
+ gr.update(value=notebook_prompt, choices=utils.get_available_prompts()),
+ gr.update(),
+ gr.update(),
+ ]
+ else:
+        # Switching back to the single-column Notebook view: carry the Default tab input and prompt over
+ return [
+ gr.update(visible=False),
+ gr.update(visible=True),
+ gr.update(),
+ gr.update(),
+ gr.update(),
+ default_input,
+ gr.update(value=default_prompt, choices=utils.get_available_prompts())
+ ]
+
+
def set_interface_arguments(extensions, bool_active):
shared.args.extensions = extensions
diff --git a/modules/utils.py b/modules/utils.py
index 21873541..c285d401 100644
--- a/modules/utils.py
+++ b/modules/utils.py
@@ -53,7 +53,7 @@ def delete_file(fname):
def current_time():
- return f"{datetime.now().strftime('%Y-%m-%d-%H%M%S')}"
+ return f"{datetime.now().strftime('%Y-%m-%d_%Hh%Mm%Ss')}"
def atoi(text):
@@ -159,10 +159,12 @@ def get_available_presets():
def get_available_prompts():
- prompt_files = list(Path('user_data/prompts').glob('*.txt'))
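+    # Prompts now live under user_data/logs/notebook; make sure the directory exists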
+ notebook_dir = Path('user_data/logs/notebook')
+ notebook_dir.mkdir(parents=True, exist_ok=True)
+
+ prompt_files = list(notebook_dir.glob('*.txt'))
sorted_files = sorted(prompt_files, key=lambda x: x.stat().st_mtime, reverse=True)
prompts = [file.stem for file in sorted_files]
- prompts.append('None')
return prompts
diff --git a/modules/web_search.py b/modules/web_search.py
index ffd7e483..401a42bb 100644
--- a/modules/web_search.py
+++ b/modules/web_search.py
@@ -4,6 +4,7 @@ from datetime import datetime
import requests
+from modules import shared
from modules.logging_colors import logger
@@ -28,6 +29,8 @@ def download_web_page(url, timeout=10):
# Initialize the HTML to Markdown converter
h = html2text.HTML2Text()
h.body_width = 0
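+    # Skip image and hyperlink markup when converting the page to markdown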
+ h.ignore_images = True
+ h.ignore_links = True
# Convert the HTML to Markdown
markdown_text = h.handle(response.text)
@@ -90,6 +93,22 @@ def perform_web_search(query, num_pages=3, max_workers=5):
return []
+def truncate_content_by_tokens(content, max_tokens=8192):
+ """Truncate content to fit within token limit using binary search"""
+ if len(shared.tokenizer.encode(content)) <= max_tokens:
+ return content
+
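+    # Binary search over character positions for the longest prefix that still fits within max_tokens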
+ left, right = 0, len(content)
+ while left < right:
+ mid = (left + right + 1) // 2
+ if len(shared.tokenizer.encode(content[:mid])) <= max_tokens:
+ left = mid
+ else:
+ right = mid - 1
+
+ return content[:left]
+
+
def add_web_search_attachments(history, row_idx, user_message, search_query, state):
"""Perform web search and add results as attachments"""
if not search_query:
@@ -126,7 +145,7 @@ def add_web_search_attachments(history, row_idx, user_message, search_query, sta
"name": result['title'],
"type": "text/html",
"url": result['url'],
- "content": result['content']
+ "content": truncate_content_by_tokens(result['content'])
}
history['metadata'][key]["attachments"].append(attachment)
diff --git a/requirements/full/requirements.txt b/requirements/full/requirements.txt
index a71e5240..19e5e0fe 100644
--- a/requirements/full/requirements.txt
+++ b/requirements/full/requirements.txt
@@ -34,10 +34,10 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
-https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
diff --git a/requirements/full/requirements_amd.txt b/requirements/full/requirements_amd.txt
index db1ead1a..ebef87a6 100644
--- a/requirements/full/requirements_amd.txt
+++ b/requirements/full/requirements_amd.txt
@@ -33,7 +33,7 @@ sse-starlette==1.6.5
tiktoken
# AMD wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+rocm6.2.4.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system != "Darwin" and platform_machine != "x86_64"
diff --git a/requirements/full/requirements_amd_noavx2.txt b/requirements/full/requirements_amd_noavx2.txt
index a08aa392..f1fccc93 100644
--- a/requirements/full/requirements_amd_noavx2.txt
+++ b/requirements/full/requirements_amd_noavx2.txt
@@ -33,7 +33,7 @@ sse-starlette==1.6.5
tiktoken
# AMD wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+rocm6.2.4.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system != "Darwin" and platform_machine != "x86_64"
diff --git a/requirements/full/requirements_apple_intel.txt b/requirements/full/requirements_apple_intel.txt
index fa217c3e..734f22c7 100644
--- a/requirements/full/requirements_apple_intel.txt
+++ b/requirements/full/requirements_apple_intel.txt
@@ -33,7 +33,7 @@ sse-starlette==1.6.5
tiktoken
# Mac wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11"
-https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3-py3-none-any.whl
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11"
+https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4-py3-none-any.whl
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl
diff --git a/requirements/full/requirements_apple_silicon.txt b/requirements/full/requirements_apple_silicon.txt
index 52581f1a..f837aade 100644
--- a/requirements/full/requirements_apple_silicon.txt
+++ b/requirements/full/requirements_apple_silicon.txt
@@ -33,8 +33,8 @@ sse-starlette==1.6.5
tiktoken
# Mac wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0" and python_version == "3.11"
-https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3-py3-none-any.whl
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0" and python_version == "3.11"
+https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4-py3-none-any.whl
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl
diff --git a/requirements/full/requirements_cpu_only.txt b/requirements/full/requirements_cpu_only.txt
index b72f22aa..9ec8a720 100644
--- a/requirements/full/requirements_cpu_only.txt
+++ b/requirements/full/requirements_cpu_only.txt
@@ -33,5 +33,5 @@ sse-starlette==1.6.5
tiktoken
# llama.cpp (CPU only, AVX2)
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
diff --git a/requirements/full/requirements_cpu_only_noavx2.txt b/requirements/full/requirements_cpu_only_noavx2.txt
index e8de6057..3a3fcde9 100644
--- a/requirements/full/requirements_cpu_only_noavx2.txt
+++ b/requirements/full/requirements_cpu_only_noavx2.txt
@@ -33,5 +33,5 @@ sse-starlette==1.6.5
tiktoken
# llama.cpp (CPU only, no AVX2)
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
diff --git a/requirements/full/requirements_cuda128.txt b/requirements/full/requirements_cuda128.txt
index 7851041f..84ffa327 100644
--- a/requirements/full/requirements_cuda128.txt
+++ b/requirements/full/requirements_cuda128.txt
@@ -34,10 +34,10 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
-https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
diff --git a/requirements/full/requirements_cuda128_noavx2.txt b/requirements/full/requirements_cuda128_noavx2.txt
index c8015166..da995438 100644
--- a/requirements/full/requirements_cuda128_noavx2.txt
+++ b/requirements/full/requirements_cuda128_noavx2.txt
@@ -34,10 +34,10 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
-https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
diff --git a/requirements/full/requirements_noavx2.txt b/requirements/full/requirements_noavx2.txt
index 5e81ce1f..e68e8187 100644
--- a/requirements/full/requirements_noavx2.txt
+++ b/requirements/full/requirements_noavx2.txt
@@ -34,10 +34,10 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
-https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
-https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
+https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
+https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64"
diff --git a/requirements/portable/requirements.txt b/requirements/portable/requirements.txt
index 4ddcf43f..f596675c 100644
--- a/requirements/portable/requirements.txt
+++ b/requirements/portable/requirements.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
diff --git a/requirements/portable/requirements_apple_intel.txt b/requirements/portable/requirements_apple_intel.txt
index 38a21618..e472e428 100644
--- a/requirements/portable/requirements_apple_intel.txt
+++ b/requirements/portable/requirements_apple_intel.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# Mac wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0"
diff --git a/requirements/portable/requirements_apple_silicon.txt b/requirements/portable/requirements_apple_silicon.txt
index 0b70c800..b60eccf5 100644
--- a/requirements/portable/requirements_apple_silicon.txt
+++ b/requirements/portable/requirements_apple_silicon.txt
@@ -19,6 +19,6 @@ sse-starlette==1.6.5
tiktoken
# Mac wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0"
diff --git a/requirements/portable/requirements_cpu_only.txt b/requirements/portable/requirements_cpu_only.txt
index 510a20f4..c6586848 100644
--- a/requirements/portable/requirements_cpu_only.txt
+++ b/requirements/portable/requirements_cpu_only.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# llama.cpp (CPU only, AVX2)
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows"
diff --git a/requirements/portable/requirements_cpu_only_noavx2.txt b/requirements/portable/requirements_cpu_only_noavx2.txt
index e6d9f0c5..d0f113a7 100644
--- a/requirements/portable/requirements_cpu_only_noavx2.txt
+++ b/requirements/portable/requirements_cpu_only_noavx2.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# llama.cpp (CPU only, no AVX2)
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows"
diff --git a/requirements/portable/requirements_noavx2.txt b/requirements/portable/requirements_noavx2.txt
index 48f92e0a..df1c5762 100644
--- a/requirements/portable/requirements_noavx2.txt
+++ b/requirements/portable/requirements_noavx2.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
diff --git a/requirements/portable/requirements_vulkan.txt b/requirements/portable/requirements_vulkan.txt
index 9f93424f..2da3a81a 100644
--- a/requirements/portable/requirements_vulkan.txt
+++ b/requirements/portable/requirements_vulkan.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
diff --git a/requirements/portable/requirements_vulkan_noavx2.txt b/requirements/portable/requirements_vulkan_noavx2.txt
index 9070b9a6..f53432d8 100644
--- a/requirements/portable/requirements_vulkan_noavx2.txt
+++ b/requirements/portable/requirements_vulkan_noavx2.txt
@@ -19,5 +19,5 @@ sse-starlette==1.6.5
tiktoken
# CUDA wheels
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows"
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows"
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64"
diff --git a/server.py b/server.py
index cfb21a6e..7ce3c208 100644
--- a/server.py
+++ b/server.py
@@ -33,7 +33,6 @@ import matplotlib
matplotlib.use('Agg') # This fixes LaTeX rendering on some systems
-import json
import os
import signal
import sys
@@ -144,12 +143,16 @@ def create_interface():
# Temporary clipboard for saving files
shared.gradio['temporary_text'] = gr.Textbox(visible=False)
- # Text Generation tab
+ # Chat tab
ui_chat.create_ui()
- ui_default.create_ui()
- ui_notebook.create_ui()
+
+ # Notebook tab
+ with gr.Tab("Notebook", elem_id='notebook-parent-tab'):
+ ui_default.create_ui()
+ ui_notebook.create_ui()
ui_parameters.create_ui() # Parameters tab
+ ui_chat.create_character_settings_ui() # Character tab
ui_model_menu.create_ui() # Model tab
if not shared.args.portable:
training.create_ui() # Training tab
diff --git a/user_data/prompts/Alpaca-with-Input.txt b/user_data/prompts/Alpaca-with-Input.txt
deleted file mode 100644
index 56df0e28..00000000
--- a/user_data/prompts/Alpaca-with-Input.txt
+++ /dev/null
@@ -1,10 +0,0 @@
-Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
-
-### Instruction:
-Instruction
-
-### Input:
-Input
-
-### Response:
-
diff --git a/user_data/prompts/QA.txt b/user_data/prompts/QA.txt
deleted file mode 100644
index 32b0e235..00000000
--- a/user_data/prompts/QA.txt
+++ /dev/null
@@ -1,4 +0,0 @@
-Common sense questions and answers
-
-Question:
-Factual answer: