diff --git a/.github/ISSUE_TEMPLATE/bug_report_template.yml b/.github/ISSUE_TEMPLATE/bug_report_template.yml index bd30a0c9..ad22b656 100644 --- a/.github/ISSUE_TEMPLATE/bug_report_template.yml +++ b/.github/ISSUE_TEMPLATE/bug_report_template.yml @@ -46,7 +46,7 @@ body: id: system-info attributes: label: System Info - description: "Please share your system info with us: operating system, GPU brand, and GPU model. If you are using a Google Colab notebook, mention that instead." + description: "Please share your operating system and GPU type (NVIDIA/AMD/Intel/Apple). If you are using a Google Colab notebook, mention that instead." render: shell placeholder: validations: diff --git a/README.md b/README.md index 45ab48eb..6e7c05b1 100644 --- a/README.md +++ b/README.md @@ -24,20 +24,24 @@ Its goal is to become the [AUTOMATIC1111/stable-diffusion-webui](https://github. - Multiple sampling parameters and generation options for sophisticated text generation control. - Switch between different models in the UI without restarting. - Automatic GPU layers for GGUF models (on NVIDIA GPUs). -- Free-form text generation in the Default/Notebook tabs without being limited to chat turns. +- Free-form text generation in the Notebook tab without being limited to chat turns. - OpenAI-compatible API with Chat and Completions endpoints, including tool-calling support – see [examples](https://github.com/oobabooga/text-generation-webui/wiki/12-%E2%80%90-OpenAI-API#examples). - Extension support, with numerous built-in and user-contributed extensions available. See the [wiki](https://github.com/oobabooga/text-generation-webui/wiki/07-%E2%80%90-Extensions) and [extensions directory](https://github.com/oobabooga/text-generation-webui-extensions) for details. ## How to install -#### Option 1: Portable builds (start here) +#### Option 1: Portable builds (get started in 1 minute) -No installation needed – just unzip and run. Compatible with GGUF (llama.cpp) models on Windows, Linux, and macOS. +No installation needed – just download, unzip and run. All dependencies included. -Download from: https://github.com/oobabooga/text-generation-webui/releases +Compatible with GGUF (llama.cpp) models on Windows, Linux, and macOS. + +Download from here: https://github.com/oobabooga/text-generation-webui/releases #### Option 2: One-click installer +For users who need additional backends (ExLlamaV3, Transformers) or extensions (TTS, voice input, translation, etc). Requires ~10GB disk space and downloads PyTorch. + 1. Clone the repository, or [download its source code](https://github.com/oobabooga/text-generation-webui/archive/refs/heads/main.zip) and extract it. 2. Run the startup script for your OS: `start_windows.bat`, `start_linux.sh`, or `start_macos.sh`. 3. When prompted, select your GPU vendor. @@ -150,21 +154,21 @@ The `requirements*.txt` above contain various wheels precompiled through GitHub ``` For NVIDIA GPU: ln -s docker/{nvidia/Dockerfile,nvidia/docker-compose.yml,.dockerignore} . -For AMD GPU: +For AMD GPU: ln -s docker/{amd/Dockerfile,amd/docker-compose.yml,.dockerignore} . For Intel GPU: ln -s docker/{intel/Dockerfile,amd/docker-compose.yml,.dockerignore} . For CPU only ln -s docker/{cpu/Dockerfile,cpu/docker-compose.yml,.dockerignore} . 
 cp docker/.env.example .env
-#Create logs/cache dir :
+#Create logs/cache dir :
 mkdir -p user_data/logs user_data/cache
-# Edit .env and set:
+# Edit .env and set:
 # TORCH_CUDA_ARCH_LIST based on your GPU model
 # APP_RUNTIME_GID your host user's group id (run `id -g` in a terminal)
 # BUILD_EXTENIONS optionally add comma separated list of extensions to build
 # Edit user_data/CMD_FLAGS.txt and add in it the options you want to execute (like --listen --cpu)
-#
+#
 docker compose up --build
 ```
@@ -188,7 +192,7 @@ List of command-line flags
 ```txt
-usage: server.py [-h] [--multi-user] [--character CHARACTER] [--model MODEL] [--lora LORA [LORA ...]] [--model-dir MODEL_DIR] [--lora-dir LORA_DIR] [--model-menu] [--settings SETTINGS]
+usage: server.py [-h] [--multi-user] [--model MODEL] [--lora LORA [LORA ...]] [--model-dir MODEL_DIR] [--lora-dir LORA_DIR] [--model-menu] [--settings SETTINGS]
                  [--extensions EXTENSIONS [EXTENSIONS ...]] [--verbose] [--idle-timeout IDLE_TIMEOUT] [--loader LOADER] [--cpu] [--cpu-memory CPU_MEMORY] [--disk] [--disk-cache-dir DISK_CACHE_DIR] [--load-in-8bit] [--bf16] [--no-cache] [--trust-remote-code] [--force-safetensors] [--no_use_fast] [--use_flash_attention_2] [--use_eager_attention] [--torch-compile] [--load-in-4bit] [--use_double_quant] [--compute_dtype COMPUTE_DTYPE] [--quant_type QUANT_TYPE] [--flash-attn] [--threads THREADS] [--threads-batch THREADS_BATCH] [--batch-size BATCH_SIZE] [--no-mmap]
@@ -207,7 +211,6 @@ options:
 Basic settings:
   --multi-user Multi-user mode. Chat histories are not saved or automatically loaded. Warning: this is likely not safe for sharing publicly.
-  --character CHARACTER The name of the character to load in chat mode by default.
   --model MODEL Name of the model to load by default.
   --lora LORA [LORA ...] The list of LoRAs to load. If you want to load more than one LoRA, write the names separated by spaces.
   --model-dir MODEL_DIR Path to directory with all the models.
diff --git a/css/chat_style-wpp.css b/css/chat_style-wpp.css index 353201c2..b2ac4d2a 100644 --- a/css/chat_style-wpp.css +++ b/css/chat_style-wpp.css @@ -1,57 +1,105 @@ .message { - padding-bottom: 22px; - padding-top: 3px; + display: block; + padding-top: 0; + padding-bottom: 21px; font-size: 15px; font-family: 'Noto Sans', Helvetica, Arial, sans-serif; line-height: 1.428571429; + grid-template-columns: none; } -.text-you { +.circle-you, .circle-bot { + display: none; +} + +.text { + max-width: 65%; + border-radius: 18px; + padding: 12px 16px; + margin-bottom: 8px; + clear: both; + box-shadow: 0 1px 2px rgb(0 0 0 / 10%); +} + +.username { + font-weight: 600; + margin-bottom: 8px; + opacity: 0.65; + padding-left: 0; +} + +/* User messages - right aligned, WhatsApp green */ +.circle-you + .text { background-color: #d9fdd3; - border-radius: 15px; - padding: 10px; - padding-top: 5px; float: right; + margin-left: auto; + margin-right: 8px; } -.text-bot { - background-color: #f2f2f2; - border-radius: 15px; - padding: 10px; - padding-top: 5px; +.circle-you + .text .username { + display: none; } -.dark .text-you { - background-color: #005c4b; - color: #111b21; +/* Bot messages - left aligned, white */ +.circle-bot + .text { + background-color: #fff; + float: left; + margin-right: auto; + margin-left: 8px; + border: 1px solid #e5e5e5; } -.dark .text-bot { - background-color: #1f2937; - color: #111b21; +.circle-bot + .text .message-actions { + bottom: -25px !important; } -.text-bot p, .text-you p { - margin-top: 5px; +/* Dark theme colors */ +.dark .circle-you + .text { + background-color: #144d37; + color: #e4e6ea; + box-shadow: 0 1px 2px rgb(0 0 0 / 30%); +} + +.dark .circle-bot + .text { + background-color: #202c33; + color: #e4e6ea; + border: 1px solid #3c4043; + box-shadow: 0 1px 2px rgb(0 0 0 / 30%); +} + +.dark .username { + opacity: 0.7; } .message-body img { max-width: 300px; max-height: 300px; - border-radius: 20px; + border-radius: 12px; } .message-body p { - margin-bottom: 0 !important; font-size: 15px !important; - line-height: 1.428571429 !important; - font-weight: 500; + line-height: 1.4 !important; + font-weight: 400; +} + +.message-body p:first-child { + margin-top: 0 !important; } .dark .message-body p em { - color: rgb(138 138 138) !important; + color: rgb(170 170 170) !important; } .message-body p em { - color: rgb(110 110 110) !important; + color: rgb(100 100 100) !important; +} + +/* Message actions positioning */ +.message-actions { + margin-top: 8px; +} + +.message-body p, .chat .message-body ul, .chat .message-body ol { + margin-bottom: 10px !important; } diff --git a/css/main.css b/css/main.css index a22fdd95..bc59f833 100644 --- a/css/main.css +++ b/css/main.css @@ -97,11 +97,11 @@ ol li p, ul li p { display: inline-block; } -#chat-tab, #default-tab, #notebook-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab { +#notebook-parent-tab, #chat-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab, #character-tab { border: 0; } -#default-tab, #notebook-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab { +#notebook-parent-tab, #parameters, #chat-settings, #lora, #training-tab, #model-tab, #session-tab, #character-tab { padding: 1rem; } @@ -167,15 +167,15 @@ gradio-app > :first-child { } .textbox_default textarea { - height: calc(100dvh - 201px); + height: calc(100dvh - 202px); } .textbox_default_output textarea { - height: calc(100dvh - 117px); + height: calc(100dvh - 118px); } .textbox 
textarea { - height: calc(100dvh - 172px); + height: calc(100dvh - 145px) } .textbox_logits textarea { @@ -307,7 +307,7 @@ audio { } #notebook-token-counter { - top: calc(100dvh - 171px) !important; + top: calc(100dvh - 172px) !important; } #default-token-counter span, #notebook-token-counter span { @@ -421,6 +421,7 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* { text-align: start; padding-left: 1rem; padding-right: 1rem; + contain: layout; } .chat .message .timestamp { @@ -905,6 +906,10 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* { flex-shrink: 1; } +#search_chat { + padding-right: 0.5rem; +} + #search_chat > :nth-child(2) > :first-child { display: none; } @@ -925,7 +930,7 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* { position: fixed; bottom: 0; left: 0; - width: calc(100vw / 2 - 600px); + width: calc(0.5 * (100vw - min(100vw, 48rem) - (120px - var(--header-width)))); z-index: 10000; } @@ -1020,12 +1025,14 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* { width: 100%; justify-content: center; gap: 9px; + padding-right: 0.5rem; } #past-chats-row, #chat-controls { width: 260px; padding: 0.5rem; + padding-right: 0; height: calc(100dvh - 16px); flex-shrink: 0; box-sizing: content-box; @@ -1289,6 +1296,20 @@ div.svelte-362y77>*, div.svelte-362y77>.form>* { opacity: 1; } +/* Disable message action hover effects during generation */ +._generating .message:hover .message-actions, +._generating .user-message:hover .message-actions, +._generating .assistant-message:hover .message-actions { + opacity: 0 !important; +} + +/* Disable message action hover effects during scrolling */ +.scrolling .message:hover .message-actions, +.scrolling .user-message:hover .message-actions, +.scrolling .assistant-message:hover .message-actions { + opacity: 0 !important; +} + .footer-button svg { stroke: rgb(156 163 175); transition: stroke 0.2s; @@ -1625,7 +1646,27 @@ button:focus { display: none; } -/* Disable hover effects while scrolling */ -.chat-parent.scrolling * { - pointer-events: none !important; +#character-context textarea { + height: calc((100vh - 350px) * 2/3) !important; + min-height: 90px !important; +} + +#character-greeting textarea { + height: calc((100vh - 350px) * 1/3) !important; + min-height: 90px !important; +} + +#user-description textarea { + height: calc(100vh - 231px) !important; + min-height: 90px !important; +} + +#instruction-template-str textarea, +#chat-template-str textarea { + height: calc(100vh - 300px) !important; + min-height: 90px !important; +} + +#textbox-notebook span { + display: none; } diff --git a/docs/12 - OpenAI API.md b/docs/12 - OpenAI API.md index db9befed..ec999397 100644 --- a/docs/12 - OpenAI API.md +++ b/docs/12 - OpenAI API.md @@ -1,6 +1,6 @@ ## OpenAI compatible API -The main API for this project is meant to be a drop-in replacement to the OpenAI API, including Chat and Completions endpoints. +The main API for this project is meant to be a drop-in replacement to the OpenAI API, including Chat and Completions endpoints. * It is 100% offline and private. * It doesn't create any logs. @@ -30,10 +30,10 @@ curl http://127.0.0.1:5000/v1/completions \ -H "Content-Type: application/json" \ -d '{ "prompt": "This is a cake recipe:\n\n1.", - "max_tokens": 200, - "temperature": 1, - "top_p": 0.9, - "seed": 10 + "max_tokens": 512, + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20 }' ``` @@ -51,7 +51,9 @@ curl http://127.0.0.1:5000/v1/chat/completions \ "content": "Hello!" 
} ], - "mode": "instruct" + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20 }' ``` @@ -67,8 +69,11 @@ curl http://127.0.0.1:5000/v1/chat/completions \ "content": "Hello! Who are you?" } ], - "mode": "chat", - "character": "Example" + "mode": "chat-instruct", + "character": "Example", + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20 }' ``` @@ -84,7 +89,9 @@ curl http://127.0.0.1:5000/v1/chat/completions \ "content": "Hello!" } ], - "mode": "instruct", + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20, "stream": true }' ``` @@ -125,10 +132,11 @@ curl -k http://127.0.0.1:5000/v1/internal/model/list \ curl -k http://127.0.0.1:5000/v1/internal/model/load \ -H "Content-Type: application/json" \ -d '{ - "model_name": "model_name", + "model_name": "Qwen_Qwen3-0.6B-Q4_K_M.gguf", "args": { - "load_in_4bit": true, - "n_gpu_layers": 12 + "ctx_size": 32768, + "flash_attn": true, + "cache_type": "q8_0" } }' ``` @@ -150,9 +158,10 @@ while True: user_message = input("> ") history.append({"role": "user", "content": user_message}) data = { - "mode": "chat", - "character": "Example", - "messages": history + "messages": history, + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20 } response = requests.post(url, headers=headers, json=data, verify=False) @@ -182,9 +191,11 @@ while True: user_message = input("> ") history.append({"role": "user", "content": user_message}) data = { - "mode": "instruct", "stream": True, - "messages": history + "messages": history, + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20 } stream_response = requests.post(url, headers=headers, json=data, verify=False, stream=True) @@ -218,10 +229,10 @@ headers = { data = { "prompt": "This is a cake recipe:\n\n1.", - "max_tokens": 200, - "temperature": 1, - "top_p": 0.9, - "seed": 10, + "max_tokens": 512, + "temperature": 0.6, + "top_p": 0.95, + "top_k": 20, "stream": True, } diff --git a/extensions/openai/models.py b/extensions/openai/models.py index a7e67df6..f8d9a1e8 100644 --- a/extensions/openai/models.py +++ b/extensions/openai/models.py @@ -18,19 +18,6 @@ def list_models(): return {'model_names': get_available_models()[1:]} -def list_dummy_models(): - result = { - "object": "list", - "data": [] - } - - # these are expected by so much, so include some here as a dummy - for model in ['gpt-3.5-turbo', 'text-embedding-ada-002']: - result["data"].append(model_info_dict(model)) - - return result - - def model_info_dict(model_name: str) -> dict: return { "id": model_name, diff --git a/extensions/openai/script.py b/extensions/openai/script.py index 24bcd69d..3d8d5f73 100644 --- a/extensions/openai/script.py +++ b/extensions/openai/script.py @@ -180,7 +180,7 @@ async def handle_models(request: Request): is_list = request.url.path.split('?')[0].split('#')[0] == '/v1/models' if is_list: - response = OAImodels.list_dummy_models() + response = OAImodels.list_models() else: model_name = path[len('/v1/models/'):] response = OAImodels.model_info_dict(model_name) diff --git a/extensions/openai/typing.py b/extensions/openai/typing.py index b28ebb4e..6643ed16 100644 --- a/extensions/openai/typing.py +++ b/extensions/openai/typing.py @@ -158,7 +158,7 @@ class ChatCompletionRequestParams(BaseModel): user_bio: str | None = Field(default=None, description="The user description/personality.") chat_template_str: str | None = Field(default=None, description="Jinja2 template for chat.") - chat_instruct_command: str | None = None + chat_instruct_command: str | None = "Continue the chat dialogue below. 
Write a single reply for the character \"<|character|>\".\n\n<|prompt|>" continue_: bool = Field(default=False, description="Makes the last bot message in the history be continued instead of starting a new message.") diff --git a/js/main.js b/js/main.js index e970884d..3ff4bf06 100644 --- a/js/main.js +++ b/js/main.js @@ -170,6 +170,13 @@ targetElement.addEventListener("scroll", function() { // Create a MutationObserver instance const observer = new MutationObserver(function(mutations) { + // Check if this is just the scrolling class being toggled + const isScrollingClassOnly = mutations.every(mutation => + mutation.type === "attributes" && + mutation.attributeName === "class" && + mutation.target === targetElement + ); + if (targetElement.classList.contains("_generating")) { typing.parentNode.classList.add("visible-dots"); document.getElementById("stop").style.display = "flex"; @@ -182,7 +189,7 @@ const observer = new MutationObserver(function(mutations) { doSyntaxHighlighting(); - if (!window.isScrolled && targetElement.scrollTop !== targetElement.scrollHeight) { + if (!window.isScrolled && !isScrollingClassOnly && targetElement.scrollTop !== targetElement.scrollHeight) { targetElement.scrollTop = targetElement.scrollHeight; } @@ -231,8 +238,15 @@ function doSyntaxHighlighting() { if (messageBodies.length > 0) { observer.disconnect(); - messageBodies.forEach((messageBody) => { + let hasSeenVisible = false; + + // Go from last message to first + for (let i = messageBodies.length - 1; i >= 0; i--) { + const messageBody = messageBodies[i]; + if (isElementVisibleOnScreen(messageBody)) { + hasSeenVisible = true; + // Handle both code and math in a single pass through each message const codeBlocks = messageBody.querySelectorAll("pre code:not([data-highlighted])"); codeBlocks.forEach((codeBlock) => { @@ -249,8 +263,12 @@ function doSyntaxHighlighting() { { left: "\\[", right: "\\]", display: true }, ], }); + } else if (hasSeenVisible) { + // We've seen visible messages but this one is not visible + // Since we're going from last to first, we can break + break; } - }); + } observer.observe(targetElement, config); } @@ -777,11 +795,43 @@ initializeSidebars(); // Add click event listeners to toggle buttons pastChatsToggle.addEventListener("click", () => { + const isCurrentlyOpen = !pastChatsRow.classList.contains("sidebar-hidden"); toggleSidebar(pastChatsRow, pastChatsToggle); + + // On desktop, open/close both sidebars at the same time + if (!isMobile()) { + if (isCurrentlyOpen) { + // If we just closed the left sidebar, also close the right sidebar + if (!chatControlsRow.classList.contains("sidebar-hidden")) { + toggleSidebar(chatControlsRow, chatControlsToggle, true); + } + } else { + // If we just opened the left sidebar, also open the right sidebar + if (chatControlsRow.classList.contains("sidebar-hidden")) { + toggleSidebar(chatControlsRow, chatControlsToggle, false); + } + } + } }); chatControlsToggle.addEventListener("click", () => { + const isCurrentlyOpen = !chatControlsRow.classList.contains("sidebar-hidden"); toggleSidebar(chatControlsRow, chatControlsToggle); + + // On desktop, open/close both sidebars at the same time + if (!isMobile()) { + if (isCurrentlyOpen) { + // If we just closed the right sidebar, also close the left sidebar + if (!pastChatsRow.classList.contains("sidebar-hidden")) { + toggleSidebar(pastChatsRow, pastChatsToggle, true); + } + } else { + // If we just opened the right sidebar, also open the left sidebar + if (pastChatsRow.classList.contains("sidebar-hidden")) { 
+ toggleSidebar(pastChatsRow, pastChatsToggle, false); + } + } + } }); navigationToggle.addEventListener("click", () => { diff --git a/js/show_controls.js b/js/show_controls.js index 1a87b52d..f974d412 100644 --- a/js/show_controls.js +++ b/js/show_controls.js @@ -1,14 +1,26 @@ -const belowChatInput = document.querySelectorAll( - "#chat-tab > div > :nth-child(1), #chat-tab > div > :nth-child(3), #chat-tab > div > :nth-child(4), #extensions" -); const chatParent = document.querySelector(".chat-parent"); function toggle_controls(value) { - if (value) { - belowChatInput.forEach(element => { - element.style.display = "inherit"; - }); + const extensions = document.querySelector("#extensions"); + if (value) { + // SHOW MODE: Click toggles to show hidden sidebars + const navToggle = document.getElementById("navigation-toggle"); + const pastChatsToggle = document.getElementById("past-chats-toggle"); + + if (navToggle && document.querySelector(".header_bar")?.classList.contains("sidebar-hidden")) { + navToggle.click(); + } + if (pastChatsToggle && document.getElementById("past-chats-row")?.classList.contains("sidebar-hidden")) { + pastChatsToggle.click(); + } + + // Show extensions only + if (extensions) { + extensions.style.display = "inherit"; + } + + // Remove bigchat classes chatParent.classList.remove("bigchat"); document.getElementById("chat-input-row").classList.remove("bigchat"); document.getElementById("chat-col").classList.remove("bigchat"); @@ -20,10 +32,23 @@ function toggle_controls(value) { } } else { - belowChatInput.forEach(element => { - element.style.display = "none"; - }); + // HIDE MODE: Click toggles to hide visible sidebars + const navToggle = document.getElementById("navigation-toggle"); + const pastChatsToggle = document.getElementById("past-chats-toggle"); + if (navToggle && !document.querySelector(".header_bar")?.classList.contains("sidebar-hidden")) { + navToggle.click(); + } + if (pastChatsToggle && !document.getElementById("past-chats-row")?.classList.contains("sidebar-hidden")) { + pastChatsToggle.click(); + } + + // Hide extensions only + if (extensions) { + extensions.style.display = "none"; + } + + // Add bigchat classes chatParent.classList.add("bigchat"); document.getElementById("chat-input-row").classList.add("bigchat"); document.getElementById("chat-col").classList.add("bigchat"); diff --git a/js/switch_tabs.js b/js/switch_tabs.js index 0564f891..7fb78aea 100644 --- a/js/switch_tabs.js +++ b/js/switch_tabs.js @@ -1,24 +1,14 @@ -let chat_tab = document.getElementById("chat-tab"); -let main_parent = chat_tab.parentNode; - function scrollToTop() { - window.scrollTo({ - top: 0, - // behavior: 'smooth' - }); + window.scrollTo({ top: 0 }); } function findButtonsByText(buttonText) { const buttons = document.getElementsByTagName("button"); const matchingButtons = []; - buttonText = buttonText.trim(); for (let i = 0; i < buttons.length; i++) { - const button = buttons[i]; - const buttonInnerText = button.textContent.trim(); - - if (buttonInnerText === buttonText) { - matchingButtons.push(button); + if (buttons[i].textContent.trim() === buttonText) { + matchingButtons.push(buttons[i]); } } @@ -26,34 +16,23 @@ function findButtonsByText(buttonText) { } function switch_to_chat() { - let chat_tab_button = main_parent.childNodes[0].childNodes[1]; - chat_tab_button.click(); - scrollToTop(); -} - -function switch_to_default() { - let default_tab_button = main_parent.childNodes[0].childNodes[5]; - default_tab_button.click(); + document.getElementById("chat-tab-button").click(); 
scrollToTop(); } function switch_to_notebook() { - let notebook_tab_button = main_parent.childNodes[0].childNodes[9]; - notebook_tab_button.click(); + document.getElementById("notebook-parent-tab-button").click(); findButtonsByText("Raw")[1].click(); scrollToTop(); } function switch_to_generation_parameters() { - let parameters_tab_button = main_parent.childNodes[0].childNodes[13]; - parameters_tab_button.click(); + document.getElementById("parameters-button").click(); findButtonsByText("Generation")[0].click(); scrollToTop(); } function switch_to_character() { - let parameters_tab_button = main_parent.childNodes[0].childNodes[13]; - parameters_tab_button.click(); - findButtonsByText("Character")[0].click(); + document.getElementById("character-tab-button").click(); scrollToTop(); } diff --git a/modules/chat.py b/modules/chat.py index dfc301df..9290dd62 100644 --- a/modules/chat.py +++ b/modules/chat.py @@ -217,8 +217,8 @@ def generate_chat_prompt(user_input, state, **kwargs): user_key = f"user_{row_idx}" enhanced_user_msg = user_msg - # Add attachment content if present - if user_key in metadata and "attachments" in metadata[user_key]: + # Add attachment content if present AND if past attachments are enabled + if (state.get('include_past_attachments', True) and user_key in metadata and "attachments" in metadata[user_key]): attachments_text = "" for attachment in metadata[user_key]["attachments"]: filename = attachment.get("name", "file") @@ -332,10 +332,10 @@ def generate_chat_prompt(user_input, state, **kwargs): user_message = messages[-1]['content'] # Bisect the truncation point - left, right = 0, len(user_message) - 1 + left, right = 0, len(user_message) - while right - left > 1: - mid = (left + right) // 2 + while left < right: + mid = (left + right + 1) // 2 messages[-1]['content'] = user_message[:mid] prompt = make_prompt(messages) @@ -344,7 +344,7 @@ def generate_chat_prompt(user_input, state, **kwargs): if encoded_length <= max_length: left = mid else: - right = mid + right = mid - 1 messages[-1]['content'] = user_message[:left] prompt = make_prompt(messages) @@ -353,7 +353,17 @@ def generate_chat_prompt(user_input, state, **kwargs): logger.error(f"Failed to build the chat prompt. The input is too long for the available context length.\n\nTruncation length: {state['truncation_length']}\nmax_new_tokens: {state['max_new_tokens']} (is it too high?)\nAvailable context length: {max_length}\n") raise ValueError else: - logger.warning(f"The input has been truncated. Context length: {state['truncation_length']}, max_new_tokens: {state['max_new_tokens']}, available context length: {max_length}.") + # Calculate token counts for the log message + original_user_tokens = get_encoded_length(user_message) + truncated_user_tokens = get_encoded_length(user_message[:left]) + total_context = max_length + state['max_new_tokens'] + + logger.warning( + f"User message truncated from {original_user_tokens} to {truncated_user_tokens} tokens. " + f"Context full: {max_length} input tokens ({total_context} total, {state['max_new_tokens']} for output). " + f"Increase ctx-size while loading the model to avoid truncation." 
+ ) + break prompt = make_prompt(messages) @@ -604,6 +614,7 @@ def generate_search_query(user_message, state): search_state['max_new_tokens'] = 64 search_state['auto_max_new_tokens'] = False search_state['enable_thinking'] = False + search_state['start_with'] = "" # Generate the full prompt using existing history + augmented message formatted_prompt = generate_chat_prompt(augmented_message, search_state) @@ -1069,16 +1080,27 @@ def load_latest_history(state): ''' if shared.args.multi_user: - return start_new_chat(state) + return start_new_chat(state), None histories = find_all_histories(state) if len(histories) > 0: - history = load_history(histories[0], state['character_menu'], state['mode']) - else: - history = start_new_chat(state) + # Try to load the last visited chat for this character/mode + chat_state = load_last_chat_state() + key = get_chat_state_key(state['character_menu'], state['mode']) + last_chat_id = chat_state.get("last_chats", {}).get(key) - return history + # If we have a stored last chat and it still exists, use it + if last_chat_id and last_chat_id in histories: + unique_id = last_chat_id + else: + # Fall back to most recent (current behavior) + unique_id = histories[0] + + history = load_history(unique_id, state['character_menu'], state['mode']) + return history, unique_id + else: + return start_new_chat(state), None def load_history_after_deletion(state, idx): @@ -1110,6 +1132,42 @@ def update_character_menu_after_deletion(idx): return gr.update(choices=characters, value=characters[idx]) +def get_chat_state_key(character, mode): + """Generate a key for storing last chat state""" + if mode == 'instruct': + return 'instruct' + else: + return f"chat_{character}" + + +def load_last_chat_state(): + """Load the last chat state from file""" + state_file = Path('user_data/logs/chat_state.json') + if state_file.exists(): + try: + with open(state_file, 'r', encoding='utf-8') as f: + return json.loads(f.read()) + except: + pass + + return {"last_chats": {}} + + +def save_last_chat_state(character, mode, unique_id): + """Save the last visited chat for a character/mode""" + if shared.args.multi_user: + return + + state = load_last_chat_state() + key = get_chat_state_key(character, mode) + state["last_chats"][key] = unique_id + + state_file = Path('user_data/logs/chat_state.json') + state_file.parent.mkdir(exist_ok=True) + with open(state_file, 'w', encoding='utf-8') as f: + f.write(json.dumps(state, indent=2)) + + def load_history(unique_id, character, mode): p = get_history_file_path(unique_id, character, mode) @@ -1543,6 +1601,9 @@ def handle_unique_id_select(state): history = load_history(state['unique_id'], state['character_menu'], state['mode']) html = redraw_html(history, state['name1'], state['name2'], state['mode'], state['chat_style'], state['character_menu']) + # Save this as the last visited chat + save_last_chat_state(state['character_menu'], state['mode'], state['unique_id']) + convert_to_markdown.cache_clear() return [history, html] @@ -1743,14 +1804,14 @@ def handle_character_menu_change(state): state['greeting'] = greeting state['context'] = context - history = load_latest_history(state) + history, loaded_unique_id = load_latest_history(state) histories = find_all_histories_with_first_prompts(state) html = redraw_html(history, state['name1'], state['name2'], state['mode'], state['chat_style'], state['character_menu']) convert_to_markdown.cache_clear() if len(histories) > 0: - past_chats_update = gr.update(choices=histories, value=histories[0][1]) + 
past_chats_update = gr.update(choices=histories, value=loaded_unique_id or histories[0][1]) else: past_chats_update = gr.update(choices=histories) @@ -1762,7 +1823,7 @@ def handle_character_menu_change(state): picture, greeting, context, - past_chats_update, + past_chats_update ] @@ -1786,14 +1847,19 @@ def handle_character_picture_change(picture): def handle_mode_change(state): - history = load_latest_history(state) + history, loaded_unique_id = load_latest_history(state) histories = find_all_histories_with_first_prompts(state) + + # Ensure character picture cache exists + if state['mode'] in ['chat', 'chat-instruct'] and state['character_menu'] and state['character_menu'] != 'None': + generate_pfp_cache(state['character_menu']) + html = redraw_html(history, state['name1'], state['name2'], state['mode'], state['chat_style'], state['character_menu']) convert_to_markdown.cache_clear() if len(histories) > 0: - past_chats_update = gr.update(choices=histories, value=histories[0][1]) + past_chats_update = gr.update(choices=histories, value=loaded_unique_id or histories[0][1]) else: past_chats_update = gr.update(choices=histories) @@ -1852,10 +1918,16 @@ def handle_send_instruction_click(state): output = generate_chat_prompt("Input", state) - return output + if state["show_two_notebook_columns"]: + return gr.update(), output, "" + else: + return output, gr.update(), gr.update() def handle_send_chat_click(state): output = generate_chat_prompt("", state, _continue=True) - return output + if state["show_two_notebook_columns"]: + return gr.update(), output, "" + else: + return output, gr.update(), gr.update() diff --git a/modules/html_generator.py b/modules/html_generator.py index af64894e..11572fc6 100644 --- a/modules/html_generator.py +++ b/modules/html_generator.py @@ -595,64 +595,6 @@ def generate_cai_chat_html(history, name1, name2, style, character, reset_cache= return output -def generate_chat_html(history, name1, name2, reset_cache=False, last_message_only=False): - if not last_message_only: - output = f'
' - else: - output = "" - - def create_message(role, content, raw_content): - """Inner function for WPP-style messages.""" - text_class = "text-you" if role == "user" else "text-bot" - - # Get role-specific data - timestamp = format_message_timestamp(history, role, i) - attachments = format_message_attachments(history, role, i) - - # Create info button if timestamp exists - info_message = "" - if timestamp: - tooltip_text = get_message_tooltip(history, role, i) - info_message = info_button.replace('title="message"', f'title="{html.escape(tooltip_text)}"') - - return ( - f'
' - f'
' - f'
{content}
' - f'{attachments}' - f'{actions_html(history, i, role, info_message)}' - f'
' - f'
' - ) - - # Determine range - start_idx = len(history['visible']) - 1 if last_message_only else 0 - end_idx = len(history['visible']) - - for i in range(start_idx, end_idx): - row_visible = history['visible'][i] - row_internal = history['internal'][i] - - # Convert content - if last_message_only: - converted_visible = [None, convert_to_markdown_wrapped(row_visible[1], message_id=i, use_cache=i != len(history['visible']) - 1)] - else: - converted_visible = [convert_to_markdown_wrapped(entry, message_id=i, use_cache=i != len(history['visible']) - 1) for entry in row_visible] - - # Generate messages - if not last_message_only and converted_visible[0]: - output += create_message("user", converted_visible[0], row_internal[0]) - - output += create_message("assistant", converted_visible[1], row_internal[1]) - - if not last_message_only: - output += "
" - - return output - - def time_greeting(): current_hour = datetime.datetime.now().hour if 5 <= current_hour < 12: @@ -669,8 +611,6 @@ def chat_html_wrapper(history, name1, name2, mode, style, character, reset_cache result = f'
{greeting}
' elif mode == 'instruct': result = generate_instruct_html(history, last_message_only=last_message_only) - elif style == 'wpp': - result = generate_chat_html(history, name1, name2, last_message_only=last_message_only) else: result = generate_cai_chat_html(history, name1, name2, style, character, reset_cache=reset_cache, last_message_only=last_message_only) diff --git a/modules/llama_cpp_server.py b/modules/llama_cpp_server.py index a79e24e4..e64f1694 100644 --- a/modules/llama_cpp_server.py +++ b/modules/llama_cpp_server.py @@ -30,6 +30,7 @@ class LlamaServer: self.session = requests.Session() self.vocabulary_size = None self.bos_token = "" + self.last_prompt_token_count = 0 # Start the server self._start_server() @@ -128,6 +129,7 @@ class LlamaServer: payload = self.prepare_payload(state) token_ids = self.encode(prompt, add_bos_token=state["add_bos_token"]) + self.last_prompt_token_count = len(token_ids) if state['auto_max_new_tokens']: max_new_tokens = state['truncation_length'] - len(token_ids) else: diff --git a/modules/models_settings.py b/modules/models_settings.py index 283a9744..37aa37cf 100644 --- a/modules/models_settings.py +++ b/modules/models_settings.py @@ -9,6 +9,7 @@ import gradio as gr import yaml from modules import chat, loaders, metadata_gguf, shared, ui +from modules.logging_colors import logger def get_fallback_settings(): @@ -56,7 +57,13 @@ def get_model_metadata(model): if path.is_file(): model_file = path else: - model_file = list(path.glob('*.gguf'))[0] + gguf_files = list(path.glob('*.gguf')) + if not gguf_files: + error_msg = f"No .gguf models found in directory: {path}" + logger.error(error_msg) + raise FileNotFoundError(error_msg) + + model_file = gguf_files[0] metadata = load_gguf_metadata_with_cache(model_file) @@ -171,6 +178,8 @@ def infer_loader(model_name, model_settings, hf_quant_method=None): path_to_model = Path(f'{shared.args.model_dir}/{model_name}') if not path_to_model.exists(): loader = None + elif shared.args.portable: + loader = 'llama.cpp' elif len(list(path_to_model.glob('*.gguf'))) > 0: loader = 'llama.cpp' elif re.match(r'.*\.gguf', model_name.lower()): @@ -450,26 +459,19 @@ def update_gpu_layers_and_vram(loader, model, gpu_layers, ctx_size, cache_type, else: return (0, gpu_layers) if auto_adjust else 0 + # Get model settings including user preferences + model_settings = get_model_metadata(model) + current_layers = gpu_layers - max_layers = gpu_layers + max_layers = model_settings.get('max_gpu_layers', 256) if auto_adjust: - # Get model settings including user preferences - model_settings = get_model_metadata(model) - - # Get the true maximum layers - max_layers = model_settings.get('max_gpu_layers', model_settings.get('gpu_layers', gpu_layers)) - # Check if this is a user-saved setting user_config = shared.user_config model_regex = Path(model).name + '$' has_user_setting = model_regex in user_config and 'gpu_layers' in user_config[model_regex] - if has_user_setting: - # For user settings, just use the current value (which already has user pref) - # but ensure the slider maximum is correct - current_layers = gpu_layers # Already has user setting - else: + if not has_user_setting: # No user setting, auto-adjust from the maximum current_layers = max_layers # Start from max diff --git a/modules/prompts.py b/modules/prompts.py index 8f00cac2..79d9b56e 100644 --- a/modules/prompts.py +++ b/modules/prompts.py @@ -1,22 +1,33 @@ from pathlib import Path +from modules import shared, utils from modules.text_generation import get_encoded_length def 
load_prompt(fname): - if fname in ['None', '']: - return '' - else: - file_path = Path(f'user_data/prompts/{fname}.txt') - if not file_path.exists(): - return '' + if not fname: + # Create new file + new_name = utils.current_time() + prompt_path = Path("user_data/logs/notebook") / f"{new_name}.txt" + prompt_path.parent.mkdir(parents=True, exist_ok=True) + initial_content = "In this story," + prompt_path.write_text(initial_content, encoding='utf-8') + # Update settings to point to new file + shared.settings['prompt-notebook'] = new_name + + return initial_content + + file_path = Path(f'user_data/logs/notebook/{fname}.txt') + if file_path.exists(): with open(file_path, 'r', encoding='utf-8') as f: text = f.read() - if text[-1] == '\n': + if len(text) > 0 and text[-1] == '\n': text = text[:-1] return text + else: + return '' def count_tokens(text): diff --git a/modules/shared.py b/modules/shared.py index 83920df8..5333ec4f 100644 --- a/modules/shared.py +++ b/modules/shared.py @@ -202,8 +202,7 @@ settings = { 'chat-instruct_command': 'Continue the chat dialogue below. Write a single reply for the character "<|character|>".\n\n<|prompt|>', 'enable_web_search': False, 'web_search_pages': 3, - 'prompt-default': 'QA', - 'prompt-notebook': 'QA', + 'prompt-notebook': '', 'preset': 'Qwen3 - Thinking' if Path('user_data/presets/Qwen3 - Thinking.yaml').exists() else None, 'max_new_tokens': 512, 'max_new_tokens_min': 1, @@ -223,7 +222,9 @@ settings = { 'custom_token_bans': '', 'negative_prompt': '', 'dark_theme': True, + 'show_two_notebook_columns': False, 'paste_to_attachment': False, + 'include_past_attachments': True, # Generation parameters - Curve shape 'temperature': 0.6, diff --git a/modules/text_generation.py b/modules/text_generation.py index 55b538b0..a75141f1 100644 --- a/modules/text_generation.py +++ b/modules/text_generation.py @@ -498,8 +498,14 @@ def generate_reply_custom(question, original_question, state, stopping_strings=N traceback.print_exc() finally: t1 = time.time() - original_tokens = len(encode(original_question)[0]) - new_tokens = len(encode(original_question + reply)[0]) - original_tokens + + if hasattr(shared.model, 'last_prompt_token_count'): + original_tokens = shared.model.last_prompt_token_count + new_tokens = len(encode(reply)[0]) if reply else 0 + else: + original_tokens = len(encode(original_question)[0]) + new_tokens = len(encode(original_question + reply)[0]) - original_tokens + logger.info(f'Output generated in {(t1-t0):.2f} seconds ({new_tokens/(t1-t0):.2f} tokens/s, {new_tokens} tokens, context {original_tokens}, seed {state["seed"]})') return diff --git a/modules/ui.py b/modules/ui.py index 2925faa5..0e8afa8f 100644 --- a/modules/ui.py +++ b/modules/ui.py @@ -6,6 +6,7 @@ import gradio as gr import yaml import extensions +import modules.extensions as extensions_module from modules import shared from modules.chat import load_history from modules.utils import gradio @@ -273,7 +274,9 @@ def list_interface_input_elements(): # Other elements elements += [ - 'paste_to_attachment' + 'show_two_notebook_columns', + 'paste_to_attachment', + 'include_past_attachments', ] return elements @@ -324,8 +327,7 @@ def save_settings(state, preset, extensions_list, show_controls, theme_state, ma output[k] = state[k] output['preset'] = preset - output['prompt-default'] = state['prompt_menu-default'] - output['prompt-notebook'] = state['prompt_menu-notebook'] + output['prompt-notebook'] = state['prompt_menu-default'] if state['show_two_notebook_columns'] else 
state['prompt_menu-notebook'] output['character'] = state['character_menu'] output['seed'] = int(output['seed']) output['show_controls'] = show_controls @@ -333,35 +335,41 @@ def save_settings(state, preset, extensions_list, show_controls, theme_state, ma output.pop('instruction_template_str') output.pop('truncation_length') - # Only save extensions on manual save + # Handle extensions and extension parameters if manual_save: + # Save current extensions and their parameter values output['default_extensions'] = extensions_list + + for extension_name in extensions_list: + extension = getattr(extensions, extension_name, None) + if extension: + extension = extension.script + if hasattr(extension, 'params'): + params = getattr(extension, 'params') + for param in params: + _id = f"{extension_name}-{param}" + # Only save if different from default value + if param not in shared.default_settings or params[param] != shared.default_settings[param]: + output[_id] = params[param] else: - # Preserve existing extensions from settings file during autosave + # Preserve existing extensions and extension parameters during autosave settings_path = Path('user_data') / 'settings.yaml' if settings_path.exists(): try: with open(settings_path, 'r', encoding='utf-8') as f: existing_settings = yaml.safe_load(f.read()) or {} + # Preserve default_extensions if 'default_extensions' in existing_settings: output['default_extensions'] = existing_settings['default_extensions'] + + # Preserve extension parameter values + for key, value in existing_settings.items(): + if any(key.startswith(f"{ext_name}-") for ext_name in extensions_module.available_extensions): + output[key] = value except Exception: pass # If we can't read the file, just don't modify extensions - # Save extension values in the UI - for extension_name in extensions_list: - extension = getattr(extensions, extension_name, None) - if extension: - extension = extension.script - if hasattr(extension, 'params'): - params = getattr(extension, 'params') - for param in params: - _id = f"{extension_name}-{param}" - # Only save if different from default value - if param not in shared.default_settings or params[param] != shared.default_settings[param]: - output[_id] = params[param] - # Do not save unchanged settings for key in list(output.keys()): if key in shared.default_settings and output[key] == shared.default_settings[key]: @@ -497,7 +505,9 @@ def setup_auto_save(): # Session tab (ui_session.py) 'show_controls', 'theme_state', - 'paste_to_attachment' + 'show_two_notebook_columns', + 'paste_to_attachment', + 'include_past_attachments' ] for element_name in change_elements: diff --git a/modules/ui_chat.py b/modules/ui_chat.py index 3b841b8b..8a90608f 100644 --- a/modules/ui_chat.py +++ b/modules/ui_chat.py @@ -70,7 +70,6 @@ def create_ui(): shared.gradio['Impersonate'] = gr.Button('Impersonate (Ctrl + Shift + M)', elem_id='Impersonate') shared.gradio['Send dummy message'] = gr.Button('Send dummy message') shared.gradio['Send dummy reply'] = gr.Button('Send dummy reply') - shared.gradio['send-chat-to-default'] = gr.Button('Send to Default') shared.gradio['send-chat-to-notebook'] = gr.Button('Send to Notebook') shared.gradio['show_controls'] = gr.Checkbox(value=shared.settings['show_controls'], label='Show controls (Ctrl+S)', elem_id='show-controls') @@ -111,9 +110,9 @@ def create_ui(): shared.gradio['edit_message'] = gr.Button(elem_id="Edit-message") -def create_chat_settings_ui(): +def create_character_settings_ui(): mu = shared.args.multi_user - with gr.Tab('Chat'): 
+ with gr.Tab('Character', elem_id="character-tab"): with gr.Row(): with gr.Column(scale=8): with gr.Tab("Character"): @@ -125,12 +124,12 @@ def create_chat_settings_ui(): shared.gradio['restore_character'] = gr.Button('Restore character', elem_classes='refresh-button', interactive=True, elem_id='restore-character') shared.gradio['name2'] = gr.Textbox(value=shared.settings['name2'], lines=1, label='Character\'s name') - shared.gradio['context'] = gr.Textbox(value=shared.settings['context'], lines=10, label='Context', elem_classes=['add_scrollbar']) - shared.gradio['greeting'] = gr.Textbox(value=shared.settings['greeting'], lines=5, label='Greeting', elem_classes=['add_scrollbar']) + shared.gradio['context'] = gr.Textbox(value=shared.settings['context'], lines=10, label='Context', elem_classes=['add_scrollbar'], elem_id="character-context") + shared.gradio['greeting'] = gr.Textbox(value=shared.settings['greeting'], lines=5, label='Greeting', elem_classes=['add_scrollbar'], elem_id="character-greeting") with gr.Tab("User"): shared.gradio['name1'] = gr.Textbox(value=shared.settings['name1'], lines=1, label='Name') - shared.gradio['user_bio'] = gr.Textbox(value=shared.settings['user_bio'], lines=10, label='Description', info='Here you can optionally write a description of yourself.', placeholder='{{user}}\'s personality: ...', elem_classes=['add_scrollbar']) + shared.gradio['user_bio'] = gr.Textbox(value=shared.settings['user_bio'], lines=10, label='Description', info='Here you can optionally write a description of yourself.', placeholder='{{user}}\'s personality: ...', elem_classes=['add_scrollbar'], elem_id="user-description") with gr.Tab('Chat history'): with gr.Row(): @@ -163,6 +162,9 @@ def create_chat_settings_ui(): shared.gradio['character_picture'] = gr.Image(label='Character picture', type='pil', interactive=not mu) shared.gradio['your_picture'] = gr.Image(label='Your picture', type='pil', value=Image.open(Path('user_data/cache/pfp_me.png')) if Path('user_data/cache/pfp_me.png').exists() else None, interactive=not mu) + +def create_chat_settings_ui(): + mu = shared.args.multi_user with gr.Tab('Instruction template'): with gr.Row(): with gr.Column(): @@ -178,15 +180,12 @@ def create_chat_settings_ui(): with gr.Row(): with gr.Column(): - shared.gradio['custom_system_message'] = gr.Textbox(value=shared.settings['custom_system_message'], lines=2, label='Custom system message', info='If not empty, will be used instead of the default one.', elem_classes=['add_scrollbar']) - shared.gradio['instruction_template_str'] = gr.Textbox(value=shared.settings['instruction_template_str'], label='Instruction template', lines=24, info='This gets autodetected; you usually don\'t need to change it. Used in instruct and chat-instruct modes.', elem_classes=['add_scrollbar', 'monospace']) + shared.gradio['instruction_template_str'] = gr.Textbox(value=shared.settings['instruction_template_str'], label='Instruction template', lines=24, info='This gets autodetected; you usually don\'t need to change it. 
Used in instruct and chat-instruct modes.', elem_classes=['add_scrollbar', 'monospace'], elem_id='instruction-template-str') with gr.Row(): - shared.gradio['send_instruction_to_default'] = gr.Button('Send to default', elem_classes=['small-button']) shared.gradio['send_instruction_to_notebook'] = gr.Button('Send to notebook', elem_classes=['small-button']) - shared.gradio['send_instruction_to_negative_prompt'] = gr.Button('Send to negative prompt', elem_classes=['small-button']) with gr.Column(): - shared.gradio['chat_template_str'] = gr.Textbox(value=shared.settings['chat_template_str'], label='Chat template', lines=22, elem_classes=['add_scrollbar', 'monospace']) + shared.gradio['chat_template_str'] = gr.Textbox(value=shared.settings['chat_template_str'], label='Chat template', lines=22, elem_classes=['add_scrollbar', 'monospace'], info='Defines how the chat prompt in chat/chat-instruct modes is generated.', elem_id='chat-template-str') def create_event_handlers(): @@ -298,7 +297,7 @@ def create_event_handlers(): shared.gradio['mode'].change( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( chat.handle_mode_change, gradio('interface_state'), gradio('history', 'display', 'chat_style', 'chat-instruct_command', 'unique_id'), show_progress=False).then( - None, gradio('mode'), None, js="(mode) => {const characterContainer = document.getElementById('character-menu').parentNode.parentNode; const isInChatTab = document.querySelector('#chat-controls').contains(characterContainer); if (isInChatTab) { characterContainer.style.display = mode === 'instruct' ? 'none' : ''; }}") + None, gradio('mode'), None, js="(mode) => {const characterContainer = document.getElementById('character-menu').parentNode.parentNode; const isInChatTab = document.querySelector('#chat-controls').contains(characterContainer); if (isInChatTab) { characterContainer.style.display = mode === 'instruct' ? 
'none' : ''; } if (mode === 'instruct') document.querySelectorAll('.bigProfilePicture').forEach(el => el.remove());}") shared.gradio['chat_style'].change(chat.redraw_html, gradio(reload_arr), gradio('display'), show_progress=False) @@ -343,29 +342,14 @@ def create_event_handlers(): ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( chat.handle_your_picture_change, gradio('your_picture', 'interface_state'), gradio('display'), show_progress=False) - shared.gradio['send_instruction_to_default'].click( - ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( - chat.handle_send_instruction_click, gradio('interface_state'), gradio('textbox-default'), show_progress=False).then( - None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_default()}}') - shared.gradio['send_instruction_to_notebook'].click( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( - chat.handle_send_instruction_click, gradio('interface_state'), gradio('textbox-notebook'), show_progress=False).then( + chat.handle_send_instruction_click, gradio('interface_state'), gradio('textbox-notebook', 'textbox-default', 'output_textbox'), show_progress=False).then( None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_notebook()}}') - shared.gradio['send_instruction_to_negative_prompt'].click( - ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( - chat.handle_send_instruction_click, gradio('interface_state'), gradio('negative_prompt'), show_progress=False).then( - None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_generation_parameters()}}') - - shared.gradio['send-chat-to-default'].click( - ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( - chat.handle_send_chat_click, gradio('interface_state'), gradio('textbox-default'), show_progress=False).then( - None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_default()}}') - shared.gradio['send-chat-to-notebook'].click( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( - chat.handle_send_chat_click, gradio('interface_state'), gradio('textbox-notebook'), show_progress=False).then( + chat.handle_send_chat_click, gradio('interface_state'), gradio('textbox-notebook', 'textbox-default', 'output_textbox'), show_progress=False).then( None, None, None, js=f'() => {{{ui.switch_tabs_js}; switch_to_notebook()}}') shared.gradio['show_controls'].change(None, gradio('show_controls'), None, js=f'(x) => {{{ui.show_controls_js}; toggle_controls(x)}}') diff --git a/modules/ui_default.py b/modules/ui_default.py index 8acc4b10..44af48a3 100644 --- a/modules/ui_default.py +++ b/modules/ui_default.py @@ -1,3 +1,5 @@ +from pathlib import Path + import gradio as gr from modules import logits, shared, ui, utils @@ -7,6 +9,7 @@ from modules.text_generation import ( get_token_ids, stop_everything_event ) +from modules.ui_notebook import store_notebook_state_and_debounce from modules.utils import gradio inputs = ('textbox-default', 'interface_state') @@ -15,11 +18,12 @@ outputs = ('output_textbox', 'html-default') def create_ui(): mu = shared.args.multi_user - with gr.Tab('Default', elem_id='default-tab'): + with gr.Row(visible=shared.settings['show_two_notebook_columns']) as shared.gradio['default-tab']: with gr.Row(): with gr.Column(): with gr.Row(): - shared.gradio['textbox-default'] = 
gr.Textbox(value=load_prompt(shared.settings['prompt-default']), lines=27, label='Input', elem_classes=['textbox_default', 'add_scrollbar']) + initial_text = load_prompt(shared.settings['prompt-notebook']) + shared.gradio['textbox-default'] = gr.Textbox(value=initial_text, lines=27, label='Input', elem_classes=['textbox_default', 'add_scrollbar']) shared.gradio['token-counter-default'] = gr.HTML(value="0", elem_id="default-token-counter") with gr.Row(): @@ -28,11 +32,21 @@ def create_ui(): shared.gradio['Generate-default'] = gr.Button('Generate', variant='primary') with gr.Row(): - shared.gradio['prompt_menu-default'] = gr.Dropdown(choices=utils.get_available_prompts(), value=shared.settings['prompt-default'], label='Prompt', elem_classes='slim-dropdown') + shared.gradio['prompt_menu-default'] = gr.Dropdown(choices=utils.get_available_prompts(), value=shared.settings['prompt-notebook'], label='Prompt', elem_classes='slim-dropdown') ui.create_refresh_button(shared.gradio['prompt_menu-default'], lambda: None, lambda: {'choices': utils.get_available_prompts()}, 'refresh-button', interactive=not mu) - shared.gradio['save_prompt-default'] = gr.Button('💾', elem_classes='refresh-button', interactive=not mu) + shared.gradio['new_prompt-default'] = gr.Button('New', elem_classes='refresh-button', interactive=not mu) + shared.gradio['rename_prompt-default'] = gr.Button('Rename', elem_classes='refresh-button', interactive=not mu) shared.gradio['delete_prompt-default'] = gr.Button('🗑️', elem_classes='refresh-button', interactive=not mu) + # Rename elements (initially hidden) + shared.gradio['rename_prompt_to-default'] = gr.Textbox(label="New name", elem_classes=['no-background'], visible=False) + shared.gradio['rename_prompt-cancel-default'] = gr.Button('Cancel', elem_classes=['refresh-button'], visible=False) + shared.gradio['rename_prompt-confirm-default'] = gr.Button('Confirm', elem_classes=['refresh-button'], variant='primary', visible=False) + + # Delete confirmation elements (initially hidden) + shared.gradio['delete_prompt-cancel-default'] = gr.Button('Cancel', elem_classes=['refresh-button'], visible=False) + shared.gradio['delete_prompt-confirm-default'] = gr.Button('Confirm', variant='stop', elem_classes=['refresh-button'], visible=False) + with gr.Column(): with gr.Tab('Raw'): shared.gradio['output_textbox'] = gr.Textbox(lines=27, label='Output', elem_id='textbox-default', elem_classes=['textbox_default_output', 'add_scrollbar']) @@ -64,7 +78,7 @@ def create_event_handlers(): shared.gradio['Generate-default'].click( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-default', 'Generate-default')).then( - generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then( + generate_reply_wrapper, gradio('textbox-default', 'interface_state'), gradio(outputs), show_progress=False).then( lambda state, left, right: state.update({'textbox-default': left, 'output_textbox': right}), gradio('interface_state', 'textbox-default', 'output_textbox'), None).then( lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-default', 'Generate-default')).then( None, None, None, js=f'() => {{{ui.audio_notification_js}}}') @@ -72,7 +86,7 @@ def create_event_handlers(): shared.gradio['textbox-default'].submit( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( lambda: [gr.update(visible=True), 
gr.update(visible=False)], None, gradio('Stop-default', 'Generate-default')).then( - generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then( + generate_reply_wrapper, gradio('textbox-default', 'interface_state'), gradio(outputs), show_progress=False).then( lambda state, left, right: state.update({'textbox-default': left, 'output_textbox': right}), gradio('interface_state', 'textbox-default', 'output_textbox'), None).then( lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-default', 'Generate-default')).then( None, None, None, js=f'() => {{{ui.audio_notification_js}}}') @@ -80,16 +94,60 @@ def create_event_handlers(): shared.gradio['Continue-default'].click( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-default', 'Generate-default')).then( - generate_reply_wrapper, [shared.gradio['output_textbox']] + gradio(inputs)[1:], gradio(outputs), show_progress=False).then( + generate_reply_wrapper, gradio('output_textbox', 'interface_state'), gradio(outputs), show_progress=False).then( lambda state, left, right: state.update({'textbox-default': left, 'output_textbox': right}), gradio('interface_state', 'textbox-default', 'output_textbox'), None).then( lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-default', 'Generate-default')).then( None, None, None, js=f'() => {{{ui.audio_notification_js}}}') shared.gradio['Stop-default'].click(stop_everything_event, None, None, queue=False) shared.gradio['markdown_render-default'].click(lambda x: x, gradio('output_textbox'), gradio('markdown-default'), queue=False) - shared.gradio['prompt_menu-default'].change(load_prompt, gradio('prompt_menu-default'), gradio('textbox-default'), show_progress=False) - shared.gradio['save_prompt-default'].click(handle_save_prompt, gradio('textbox-default'), gradio('save_contents', 'save_filename', 'save_root', 'file_saver'), show_progress=False) - shared.gradio['delete_prompt-default'].click(handle_delete_prompt, gradio('prompt_menu-default'), gradio('delete_filename', 'delete_root', 'file_deleter'), show_progress=False) + shared.gradio['prompt_menu-default'].change(lambda x: (load_prompt(x), ""), gradio('prompt_menu-default'), gradio('textbox-default', 'output_textbox'), show_progress=False) + shared.gradio['new_prompt-default'].click(handle_new_prompt, None, gradio('prompt_menu-default'), show_progress=False) + + # Input change handler to save input (reusing notebook's debounced saving) + shared.gradio['textbox-default'].change( + store_notebook_state_and_debounce, + gradio('textbox-default', 'prompt_menu-default'), + None, + show_progress=False + ) + + shared.gradio['delete_prompt-default'].click( + lambda: [gr.update(visible=False), gr.update(visible=True), gr.update(visible=True)], + None, + gradio('delete_prompt-default', 'delete_prompt-cancel-default', 'delete_prompt-confirm-default'), + show_progress=False) + + shared.gradio['delete_prompt-cancel-default'].click( + lambda: [gr.update(visible=True), gr.update(visible=False), gr.update(visible=False)], + None, + gradio('delete_prompt-default', 'delete_prompt-cancel-default', 'delete_prompt-confirm-default'), + show_progress=False) + + shared.gradio['delete_prompt-confirm-default'].click( + handle_delete_prompt_confirm_default, + gradio('prompt_menu-default'), + gradio('prompt_menu-default', 'delete_prompt-default', 'delete_prompt-cancel-default', 
'delete_prompt-confirm-default'), + show_progress=False) + + shared.gradio['rename_prompt-default'].click( + handle_rename_prompt_click_default, + gradio('prompt_menu-default'), + gradio('rename_prompt_to-default', 'rename_prompt-default', 'rename_prompt-cancel-default', 'rename_prompt-confirm-default'), + show_progress=False) + + shared.gradio['rename_prompt-cancel-default'].click( + lambda: [gr.update(visible=False), gr.update(visible=True), gr.update(visible=False), gr.update(visible=False)], + None, + gradio('rename_prompt_to-default', 'rename_prompt-default', 'rename_prompt-cancel-default', 'rename_prompt-confirm-default'), + show_progress=False) + + shared.gradio['rename_prompt-confirm-default'].click( + handle_rename_prompt_confirm_default, + gradio('rename_prompt_to-default', 'prompt_menu-default'), + gradio('prompt_menu-default', 'rename_prompt_to-default', 'rename_prompt-default', 'rename_prompt-cancel-default', 'rename_prompt-confirm-default'), + show_progress=False) + shared.gradio['textbox-default'].change(lambda x: f"{count_tokens(x)}", gradio('textbox-default'), gradio('token-counter-default'), show_progress=False) shared.gradio['get_logits-default'].click( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( @@ -98,18 +156,61 @@ def create_event_handlers(): shared.gradio['get_tokens-default'].click(get_token_ids, gradio('textbox-default'), gradio('tokens-default'), show_progress=False) -def handle_save_prompt(text): +def handle_new_prompt(): + new_name = utils.current_time() + + # Create the new prompt file + prompt_path = Path("user_data/logs/notebook") / f"{new_name}.txt" + prompt_path.parent.mkdir(parents=True, exist_ok=True) + prompt_path.write_text("In this story,", encoding='utf-8') + + return gr.update(choices=utils.get_available_prompts(), value=new_name) + + +def handle_delete_prompt_confirm_default(prompt_name): + available_prompts = utils.get_available_prompts() + current_index = available_prompts.index(prompt_name) if prompt_name in available_prompts else 0 + + (Path("user_data/logs/notebook") / f"{prompt_name}.txt").unlink(missing_ok=True) + available_prompts = utils.get_available_prompts() + + if available_prompts: + new_value = available_prompts[min(current_index, len(available_prompts) - 1)] + else: + new_value = utils.current_time() + Path("user_data/logs/notebook").mkdir(parents=True, exist_ok=True) + (Path("user_data/logs/notebook") / f"{new_value}.txt").write_text("In this story,") + available_prompts = [new_value] + return [ - text, - utils.current_time() + ".txt", - "user_data/prompts/", + gr.update(choices=available_prompts, value=new_value), + gr.update(visible=True), + gr.update(visible=False), + gr.update(visible=False) + ] + + +def handle_rename_prompt_click_default(current_name): + return [ + gr.update(value=current_name, visible=True), + gr.update(visible=False), + gr.update(visible=True), gr.update(visible=True) ] -def handle_delete_prompt(prompt): +def handle_rename_prompt_confirm_default(new_name, current_name): + old_path = Path("user_data/logs/notebook") / f"{current_name}.txt" + new_path = Path("user_data/logs/notebook") / f"{new_name}.txt" + + if old_path.exists() and not new_path.exists(): + old_path.rename(new_path) + + available_prompts = utils.get_available_prompts() return [ - prompt + ".txt", - "user_data/prompts/", - gr.update(visible=True) + gr.update(choices=available_prompts, value=new_name), + gr.update(visible=False), + gr.update(visible=True), + gr.update(visible=False), + 
gr.update(visible=False) ] diff --git a/modules/ui_model_menu.py b/modules/ui_model_menu.py index 9e982f0e..6b106203 100644 --- a/modules/ui_model_menu.py +++ b/modules/ui_model_menu.py @@ -135,7 +135,7 @@ def create_event_handlers(): # with the model defaults (if any), and then the model is loaded shared.gradio['model_menu'].change( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( - handle_load_model_event_initial, gradio('model_menu', 'interface_state'), gradio(ui.list_interface_input_elements()) + gradio('interface_state'), show_progress=False).then( + handle_load_model_event_initial, gradio('model_menu', 'interface_state'), gradio(ui.list_interface_input_elements()) + gradio('interface_state') + gradio('vram_info'), show_progress=False).then( partial(load_model_wrapper, autoload=False), gradio('model_menu', 'loader'), gradio('model_status'), show_progress=True).success( handle_load_model_event_final, gradio('truncation_length', 'loader', 'interface_state'), gradio('truncation_length', 'filter_by_loader'), show_progress=False) @@ -174,7 +174,12 @@ def create_event_handlers(): def load_model_wrapper(selected_model, loader, autoload=False): - settings = get_model_metadata(selected_model) + try: + settings = get_model_metadata(selected_model) + except FileNotFoundError: + exc = traceback.format_exc() + yield exc.replace('\n', '\n\n') + return if not autoload: yield "### {}\n\n- Settings updated: Click \"Load\" to load the model\n- Max sequence length: {}".format(selected_model, settings['truncation_length_info']) @@ -374,7 +379,8 @@ def handle_load_model_event_initial(model, state): output = ui.apply_interface_values(state) update_model_parameters(state) # This updates the command-line flags - return output + [state] + vram_info = state.get('vram_info', "
Estimated VRAM to load the model:
") + return output + [state] + [vram_info] def handle_load_model_event_final(truncation_length, loader, state): diff --git a/modules/ui_notebook.py b/modules/ui_notebook.py index 3f79a93c..939d81f7 100644 --- a/modules/ui_notebook.py +++ b/modules/ui_notebook.py @@ -1,3 +1,7 @@ +import threading +import time +from pathlib import Path + import gradio as gr from modules import logits, shared, ui, utils @@ -7,22 +11,27 @@ from modules.text_generation import ( get_token_ids, stop_everything_event ) -from modules.ui_default import handle_delete_prompt, handle_save_prompt from modules.utils import gradio +_notebook_file_lock = threading.Lock() +_notebook_auto_save_timer = None +_last_notebook_text = None +_last_notebook_prompt = None + inputs = ('textbox-notebook', 'interface_state') outputs = ('textbox-notebook', 'html-notebook') def create_ui(): mu = shared.args.multi_user - with gr.Tab('Notebook', elem_id='notebook-tab'): + with gr.Row(visible=not shared.settings['show_two_notebook_columns']) as shared.gradio['notebook-tab']: shared.gradio['last_input-notebook'] = gr.State('') with gr.Row(): with gr.Column(scale=4): with gr.Tab('Raw'): with gr.Row(): - shared.gradio['textbox-notebook'] = gr.Textbox(value=load_prompt(shared.settings['prompt-notebook']), lines=27, elem_id='textbox-notebook', elem_classes=['textbox', 'add_scrollbar']) + initial_text = load_prompt(shared.settings['prompt-notebook']) + shared.gradio['textbox-notebook'] = gr.Textbox(label="", value=initial_text, lines=27, elem_id='textbox-notebook', elem_classes=['textbox', 'add_scrollbar']) shared.gradio['token-counter-notebook'] = gr.HTML(value="0", elem_id="notebook-token-counter") with gr.Tab('Markdown'): @@ -57,9 +66,19 @@ def create_ui(): gr.HTML('
') with gr.Row(): shared.gradio['prompt_menu-notebook'] = gr.Dropdown(choices=utils.get_available_prompts(), value=shared.settings['prompt-notebook'], label='Prompt', elem_classes='slim-dropdown') - ui.create_refresh_button(shared.gradio['prompt_menu-notebook'], lambda: None, lambda: {'choices': utils.get_available_prompts()}, ['refresh-button', 'refresh-button-small'], interactive=not mu) - shared.gradio['save_prompt-notebook'] = gr.Button('💾', elem_classes=['refresh-button', 'refresh-button-small'], interactive=not mu) - shared.gradio['delete_prompt-notebook'] = gr.Button('🗑️', elem_classes=['refresh-button', 'refresh-button-small'], interactive=not mu) + + with gr.Row(): + ui.create_refresh_button(shared.gradio['prompt_menu-notebook'], lambda: None, lambda: {'choices': utils.get_available_prompts()}, ['refresh-button'], interactive=not mu) + shared.gradio['new_prompt-notebook'] = gr.Button('New', elem_classes=['refresh-button'], interactive=not mu) + shared.gradio['rename_prompt-notebook'] = gr.Button('Rename', elem_classes=['refresh-button'], interactive=not mu) + shared.gradio['delete_prompt-notebook'] = gr.Button('🗑️', elem_classes=['refresh-button'], interactive=not mu) + shared.gradio['delete_prompt-confirm-notebook'] = gr.Button('Confirm', variant='stop', elem_classes=['refresh-button'], visible=False) + shared.gradio['delete_prompt-cancel-notebook'] = gr.Button('Cancel', elem_classes=['refresh-button'], visible=False) + + with gr.Row(visible=False) as shared.gradio['rename-row-notebook']: + shared.gradio['rename_prompt_to-notebook'] = gr.Textbox(label="New name", elem_classes=['no-background']) + shared.gradio['rename_prompt-cancel-notebook'] = gr.Button('Cancel', elem_classes=['refresh-button']) + shared.gradio['rename_prompt-confirm-notebook'] = gr.Button('Confirm', elem_classes=['refresh-button'], variant='primary') def create_event_handlers(): @@ -67,7 +86,7 @@ def create_event_handlers(): lambda x: x, gradio('textbox-notebook'), gradio('last_input-notebook')).then( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-notebook', 'Generate-notebook')).then( - generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then( + generate_and_save_wrapper_notebook, gradio('textbox-notebook', 'interface_state', 'prompt_menu-notebook'), gradio(outputs), show_progress=False).then( lambda state, text: state.update({'textbox-notebook': text}), gradio('interface_state', 'textbox-notebook'), None).then( lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-notebook', 'Generate-notebook')).then( None, None, None, js=f'() => {{{ui.audio_notification_js}}}') @@ -76,7 +95,7 @@ def create_event_handlers(): lambda x: x, gradio('textbox-notebook'), gradio('last_input-notebook')).then( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-notebook', 'Generate-notebook')).then( - generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then( + generate_and_save_wrapper_notebook, gradio('textbox-notebook', 'interface_state', 'prompt_menu-notebook'), gradio(outputs), show_progress=False).then( lambda state, text: state.update({'textbox-notebook': text}), gradio('interface_state', 'textbox-notebook'), None).then( lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-notebook', 
'Generate-notebook')).then( None, None, None, js=f'() => {{{ui.audio_notification_js}}}') @@ -85,7 +104,7 @@ def create_event_handlers(): lambda x: x, gradio('last_input-notebook'), gradio('textbox-notebook'), show_progress=False).then( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( lambda: [gr.update(visible=True), gr.update(visible=False)], None, gradio('Stop-notebook', 'Generate-notebook')).then( - generate_reply_wrapper, gradio(inputs), gradio(outputs), show_progress=False).then( + generate_and_save_wrapper_notebook, gradio('textbox-notebook', 'interface_state', 'prompt_menu-notebook'), gradio(outputs), show_progress=False).then( lambda state, text: state.update({'textbox-notebook': text}), gradio('interface_state', 'textbox-notebook'), None).then( lambda: [gr.update(visible=False), gr.update(visible=True)], None, gradio('Stop-notebook', 'Generate-notebook')).then( None, None, None, js=f'() => {{{ui.audio_notification_js}}}') @@ -97,11 +116,173 @@ def create_event_handlers(): shared.gradio['markdown_render-notebook'].click(lambda x: x, gradio('textbox-notebook'), gradio('markdown-notebook'), queue=False) shared.gradio['Stop-notebook'].click(stop_everything_event, None, None, queue=False) shared.gradio['prompt_menu-notebook'].change(load_prompt, gradio('prompt_menu-notebook'), gradio('textbox-notebook'), show_progress=False) - shared.gradio['save_prompt-notebook'].click(handle_save_prompt, gradio('textbox-notebook'), gradio('save_contents', 'save_filename', 'save_root', 'file_saver'), show_progress=False) - shared.gradio['delete_prompt-notebook'].click(handle_delete_prompt, gradio('prompt_menu-notebook'), gradio('delete_filename', 'delete_root', 'file_deleter'), show_progress=False) + shared.gradio['new_prompt-notebook'].click(handle_new_prompt, None, gradio('prompt_menu-notebook'), show_progress=False) + + shared.gradio['delete_prompt-notebook'].click( + lambda: [gr.update(visible=False), gr.update(visible=True), gr.update(visible=True)], + None, + gradio('delete_prompt-notebook', 'delete_prompt-cancel-notebook', 'delete_prompt-confirm-notebook'), + show_progress=False) + + shared.gradio['delete_prompt-cancel-notebook'].click( + lambda: [gr.update(visible=True), gr.update(visible=False), gr.update(visible=False)], + None, + gradio('delete_prompt-notebook', 'delete_prompt-cancel-notebook', 'delete_prompt-confirm-notebook'), + show_progress=False) + + shared.gradio['delete_prompt-confirm-notebook'].click( + handle_delete_prompt_confirm_notebook, + gradio('prompt_menu-notebook'), + gradio('prompt_menu-notebook', 'delete_prompt-notebook', 'delete_prompt-cancel-notebook', 'delete_prompt-confirm-notebook'), + show_progress=False) + + shared.gradio['rename_prompt-notebook'].click( + handle_rename_prompt_click_notebook, + gradio('prompt_menu-notebook'), + gradio('rename_prompt_to-notebook', 'rename_prompt-notebook', 'rename-row-notebook'), + show_progress=False) + + shared.gradio['rename_prompt-cancel-notebook'].click( + lambda: [gr.update(visible=True), gr.update(visible=False)], + None, + gradio('rename_prompt-notebook', 'rename-row-notebook'), + show_progress=False) + + shared.gradio['rename_prompt-confirm-notebook'].click( + handle_rename_prompt_confirm_notebook, + gradio('rename_prompt_to-notebook', 'prompt_menu-notebook'), + gradio('prompt_menu-notebook', 'rename_prompt-notebook', 'rename-row-notebook'), + show_progress=False) + shared.gradio['textbox-notebook'].input(lambda x: f"{count_tokens(x)}", gradio('textbox-notebook'), 
gradio('token-counter-notebook'), show_progress=False) + shared.gradio['textbox-notebook'].change( + store_notebook_state_and_debounce, + gradio('textbox-notebook', 'prompt_menu-notebook'), + None, + show_progress=False + ) + shared.gradio['get_logits-notebook'].click( ui.gather_interface_values, gradio(shared.input_elements), gradio('interface_state')).then( logits.get_next_logits, gradio('textbox-notebook', 'interface_state', 'use_samplers-notebook', 'logits-notebook'), gradio('logits-notebook', 'logits-notebook-previous'), show_progress=False) shared.gradio['get_tokens-notebook'].click(get_token_ids, gradio('textbox-notebook'), gradio('tokens-notebook'), show_progress=False) + + +def generate_and_save_wrapper_notebook(textbox_content, interface_state, prompt_name): + """Generate reply and automatically save the result for notebook mode with periodic saves""" + last_save_time = time.monotonic() + save_interval = 8 + output = textbox_content + + # Initial autosave + safe_autosave_prompt(output, prompt_name) + + for i, (output, html_output) in enumerate(generate_reply_wrapper(textbox_content, interface_state)): + yield output, html_output + + current_time = time.monotonic() + # Save on first iteration or if save_interval seconds have passed + if i == 0 or (current_time - last_save_time) >= save_interval: + safe_autosave_prompt(output, prompt_name) + last_save_time = current_time + + # Final autosave + safe_autosave_prompt(output, prompt_name) + + +def handle_new_prompt(): + new_name = utils.current_time() + + # Create the new prompt file + prompt_path = Path("user_data/logs/notebook") / f"{new_name}.txt" + prompt_path.parent.mkdir(parents=True, exist_ok=True) + prompt_path.write_text("In this story,", encoding='utf-8') + + return gr.update(choices=utils.get_available_prompts(), value=new_name) + + +def handle_delete_prompt_confirm_notebook(prompt_name): + available_prompts = utils.get_available_prompts() + current_index = available_prompts.index(prompt_name) if prompt_name in available_prompts else 0 + + (Path("user_data/logs/notebook") / f"{prompt_name}.txt").unlink(missing_ok=True) + available_prompts = utils.get_available_prompts() + + if available_prompts: + new_value = available_prompts[min(current_index, len(available_prompts) - 1)] + else: + new_value = utils.current_time() + Path("user_data/logs/notebook").mkdir(parents=True, exist_ok=True) + (Path("user_data/logs/notebook") / f"{new_value}.txt").write_text("In this story,") + available_prompts = [new_value] + + return [ + gr.update(choices=available_prompts, value=new_value), + gr.update(visible=True), + gr.update(visible=False), + gr.update(visible=False) + ] + + +def handle_rename_prompt_click_notebook(current_name): + return [ + gr.update(value=current_name), + gr.update(visible=False), + gr.update(visible=True) + ] + + +def handle_rename_prompt_confirm_notebook(new_name, current_name): + old_path = Path("user_data/logs/notebook") / f"{current_name}.txt" + new_path = Path("user_data/logs/notebook") / f"{new_name}.txt" + + if old_path.exists() and not new_path.exists(): + old_path.rename(new_path) + + available_prompts = utils.get_available_prompts() + return [ + gr.update(choices=available_prompts, value=new_name), + gr.update(visible=True), + gr.update(visible=False) + ] + + +def autosave_prompt(text, prompt_name): + """Automatically save the text to the selected prompt file""" + if prompt_name and text.strip(): + prompt_path = Path("user_data/logs/notebook") / f"{prompt_name}.txt" + prompt_path.parent.mkdir(parents=True, 
exist_ok=True) + prompt_path.write_text(text, encoding='utf-8') + + +def safe_autosave_prompt(content, prompt_name): + """Thread-safe wrapper for autosave_prompt to prevent file corruption""" + with _notebook_file_lock: + autosave_prompt(content, prompt_name) + + +def store_notebook_state_and_debounce(text, prompt_name): + """Store current notebook state and trigger debounced save""" + global _notebook_auto_save_timer, _last_notebook_text, _last_notebook_prompt + + if shared.args.multi_user: + return + + _last_notebook_text = text + _last_notebook_prompt = prompt_name + + if _notebook_auto_save_timer is not None: + _notebook_auto_save_timer.cancel() + + _notebook_auto_save_timer = threading.Timer(1.0, _perform_notebook_debounced_save) + _notebook_auto_save_timer.start() + + +def _perform_notebook_debounced_save(): + """Actually perform the notebook save using the stored state""" + try: + if _last_notebook_text is not None and _last_notebook_prompt is not None: + safe_autosave_prompt(_last_notebook_text, _last_notebook_prompt) + except Exception as e: + print(f"Notebook auto-save failed: {e}") diff --git a/modules/ui_parameters.py b/modules/ui_parameters.py index e2b10554..e42e4c0c 100644 --- a/modules/ui_parameters.py +++ b/modules/ui_parameters.py @@ -93,7 +93,7 @@ def create_ui(): with gr.Column(): shared.gradio['truncation_length'] = gr.Number(precision=0, step=256, value=get_truncation_length(), label='Truncate the prompt up to this length', info='The leftmost tokens are removed if the prompt exceeds this length.') shared.gradio['seed'] = gr.Number(value=shared.settings['seed'], label='Seed (-1 for random)') - + shared.gradio['custom_system_message'] = gr.Textbox(value=shared.settings['custom_system_message'], lines=2, label='Custom system message', info='If not empty, will be used instead of the default one.', elem_classes=['add_scrollbar']) shared.gradio['custom_stopping_strings'] = gr.Textbox(lines=2, value=shared.settings["custom_stopping_strings"] or None, label='Custom stopping strings', info='Written between "" and separated by commas.', placeholder='"\\n", "\\nYou:"') shared.gradio['custom_token_bans'] = gr.Textbox(value=shared.settings['custom_token_bans'] or None, label='Token bans', info='Token IDs to ban, separated by commas. The IDs can be found in the Default or Notebook tab.') shared.gradio['negative_prompt'] = gr.Textbox(value=shared.settings['negative_prompt'], label='Negative prompt', info='For CFG. 
Only used when guidance_scale is different than 1.', lines=3, elem_classes=['add_scrollbar']) diff --git a/modules/ui_session.py b/modules/ui_session.py index 0673828e..a69e155b 100644 --- a/modules/ui_session.py +++ b/modules/ui_session.py @@ -11,7 +11,9 @@ def create_ui(): with gr.Column(): gr.Markdown("## Settings") shared.gradio['toggle_dark_mode'] = gr.Button('Toggle light/dark theme 💡', elem_classes='refresh-button') + shared.gradio['show_two_notebook_columns'] = gr.Checkbox(label='Show two columns in the Notebook tab', value=shared.settings['show_two_notebook_columns']) shared.gradio['paste_to_attachment'] = gr.Checkbox(label='Turn long pasted text into attachments in the Chat tab', value=shared.settings['paste_to_attachment'], elem_id='paste_to_attachment') + shared.gradio['include_past_attachments'] = gr.Checkbox(label='Include attachments/search results from previous messages in the chat prompt', value=shared.settings['include_past_attachments']) with gr.Column(): gr.Markdown("## Extensions & flags") @@ -33,6 +35,12 @@ def create_ui(): lambda x: 'dark' if x == 'light' else 'light', gradio('theme_state'), gradio('theme_state')).then( None, None, None, js=f'() => {{{ui.dark_theme_js}; toggleDarkMode(); localStorage.setItem("theme", document.body.classList.contains("dark") ? "dark" : "light")}}') + shared.gradio['show_two_notebook_columns'].change( + handle_default_to_notebook_change, + gradio('show_two_notebook_columns', 'textbox-default', 'output_textbox', 'prompt_menu-default', 'textbox-notebook', 'prompt_menu-notebook'), + gradio('default-tab', 'notebook-tab', 'textbox-default', 'output_textbox', 'prompt_menu-default', 'textbox-notebook', 'prompt_menu-notebook') + ) + # Reset interface event shared.gradio['reset_interface'].click( set_interface_arguments, gradio('extensions_menu', 'bool_menu'), None).then( @@ -49,6 +57,31 @@ def handle_save_settings(state, preset, extensions, show_controls, theme): ] +def handle_default_to_notebook_change(show_two_columns, default_input, default_output, default_prompt, notebook_input, notebook_prompt): + if show_two_columns: + # Notebook to default + return [ + gr.update(visible=True), + gr.update(visible=False), + notebook_input, + "", + gr.update(value=notebook_prompt, choices=utils.get_available_prompts()), + gr.update(), + gr.update(), + ] + else: + # Default to notebook + return [ + gr.update(visible=False), + gr.update(visible=True), + gr.update(), + gr.update(), + gr.update(), + default_input, + gr.update(value=default_prompt, choices=utils.get_available_prompts()) + ] + + def set_interface_arguments(extensions, bool_active): shared.args.extensions = extensions diff --git a/modules/utils.py b/modules/utils.py index 21873541..c285d401 100644 --- a/modules/utils.py +++ b/modules/utils.py @@ -53,7 +53,7 @@ def delete_file(fname): def current_time(): - return f"{datetime.now().strftime('%Y-%m-%d-%H%M%S')}" + return f"{datetime.now().strftime('%Y-%m-%d_%Hh%Mm%Ss')}" def atoi(text): @@ -159,10 +159,12 @@ def get_available_presets(): def get_available_prompts(): - prompt_files = list(Path('user_data/prompts').glob('*.txt')) + notebook_dir = Path('user_data/logs/notebook') + notebook_dir.mkdir(parents=True, exist_ok=True) + + prompt_files = list(notebook_dir.glob('*.txt')) sorted_files = sorted(prompt_files, key=lambda x: x.stat().st_mtime, reverse=True) prompts = [file.stem for file in sorted_files] - prompts.append('None') return prompts diff --git a/modules/web_search.py b/modules/web_search.py index ffd7e483..401a42bb 100644 --- 
a/modules/web_search.py +++ b/modules/web_search.py @@ -4,6 +4,7 @@ from datetime import datetime import requests +from modules import shared from modules.logging_colors import logger @@ -28,6 +29,8 @@ def download_web_page(url, timeout=10): # Initialize the HTML to Markdown converter h = html2text.HTML2Text() h.body_width = 0 + h.ignore_images = True + h.ignore_links = True # Convert the HTML to Markdown markdown_text = h.handle(response.text) @@ -90,6 +93,22 @@ def perform_web_search(query, num_pages=3, max_workers=5): return [] +def truncate_content_by_tokens(content, max_tokens=8192): + """Truncate content to fit within token limit using binary search""" + if len(shared.tokenizer.encode(content)) <= max_tokens: + return content + + left, right = 0, len(content) + while left < right: + mid = (left + right + 1) // 2 + if len(shared.tokenizer.encode(content[:mid])) <= max_tokens: + left = mid + else: + right = mid - 1 + + return content[:left] + + def add_web_search_attachments(history, row_idx, user_message, search_query, state): """Perform web search and add results as attachments""" if not search_query: @@ -126,7 +145,7 @@ def add_web_search_attachments(history, row_idx, user_message, search_query, sta "name": result['title'], "type": "text/html", "url": result['url'], - "content": result['content'] + "content": truncate_content_by_tokens(result['content']) } history['metadata'][key]["attachments"].append(attachment) diff --git a/requirements/full/requirements.txt b/requirements/full/requirements.txt index a71e5240..19e5e0fe 100644 --- a/requirements/full/requirements.txt +++ b/requirements/full/requirements.txt @@ -34,10 +34,10 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" -https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" 
https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64" diff --git a/requirements/full/requirements_amd.txt b/requirements/full/requirements_amd.txt index db1ead1a..ebef87a6 100644 --- a/requirements/full/requirements_amd.txt +++ b/requirements/full/requirements_amd.txt @@ -33,7 +33,7 @@ sse-starlette==1.6.5 tiktoken # AMD wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+rocm6.2.4.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system != "Darwin" and platform_machine != "x86_64" diff --git a/requirements/full/requirements_amd_noavx2.txt b/requirements/full/requirements_amd_noavx2.txt index a08aa392..f1fccc93 100644 --- a/requirements/full/requirements_amd_noavx2.txt +++ b/requirements/full/requirements_amd_noavx2.txt @@ -33,7 +33,7 @@ sse-starlette==1.6.5 tiktoken # AMD wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+rocm6.2.4.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system != "Darwin" and platform_machine != "x86_64" diff --git a/requirements/full/requirements_apple_intel.txt b/requirements/full/requirements_apple_intel.txt index fa217c3e..734f22c7 100644 --- a/requirements/full/requirements_apple_intel.txt +++ b/requirements/full/requirements_apple_intel.txt @@ -33,7 +33,7 @@ sse-starlette==1.6.5 tiktoken # Mac wheels 
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11" -https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3-py3-none-any.whl +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11" +https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4-py3-none-any.whl https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl diff --git a/requirements/full/requirements_apple_silicon.txt b/requirements/full/requirements_apple_silicon.txt index 52581f1a..f837aade 100644 --- a/requirements/full/requirements_apple_silicon.txt +++ b/requirements/full/requirements_apple_silicon.txt @@ -33,8 +33,8 @@ sse-starlette==1.6.5 tiktoken # Mac wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0" and python_version == "3.11" -https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3-py3-none-any.whl +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0" and python_version == "3.11" +https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4-py3-none-any.whl https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl diff --git a/requirements/full/requirements_cpu_only.txt 
b/requirements/full/requirements_cpu_only.txt index b72f22aa..9ec8a720 100644 --- a/requirements/full/requirements_cpu_only.txt +++ b/requirements/full/requirements_cpu_only.txt @@ -33,5 +33,5 @@ sse-starlette==1.6.5 tiktoken # llama.cpp (CPU only, AVX2) -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" diff --git a/requirements/full/requirements_cpu_only_noavx2.txt b/requirements/full/requirements_cpu_only_noavx2.txt index e8de6057..3a3fcde9 100644 --- a/requirements/full/requirements_cpu_only_noavx2.txt +++ b/requirements/full/requirements_cpu_only_noavx2.txt @@ -33,5 +33,5 @@ sse-starlette==1.6.5 tiktoken # llama.cpp (CPU only, no AVX2) -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" diff --git a/requirements/full/requirements_cuda128.txt b/requirements/full/requirements_cuda128.txt index 7851041f..84ffa327 100644 --- a/requirements/full/requirements_cuda128.txt +++ b/requirements/full/requirements_cuda128.txt @@ -34,10 +34,10 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" -https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == 
"3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64" diff --git a/requirements/full/requirements_cuda128_noavx2.txt b/requirements/full/requirements_cuda128_noavx2.txt index c8015166..da995438 100644 --- a/requirements/full/requirements_cuda128_noavx2.txt +++ b/requirements/full/requirements_cuda128_noavx2.txt @@ -34,10 +34,10 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" -https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/turboderp-org/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/turboderp-org/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu128.torch2.7.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == 
"x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64" diff --git a/requirements/full/requirements_noavx2.txt b/requirements/full/requirements_noavx2.txt index 5e81ce1f..e68e8187 100644 --- a/requirements/full/requirements_noavx2.txt +++ b/requirements/full/requirements_noavx2.txt @@ -34,10 +34,10 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" -https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" -https://github.com/oobabooga/exllamav3/releases/download/v0.0.3/exllamav3-0.0.3+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" +https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" +https://github.com/oobabooga/exllamav3/releases/download/v0.0.4/exllamav3-0.0.4+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-win_amd64.whl; platform_system == "Windows" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1+cu124.torch2.6.0-cp311-cp311-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11" https://github.com/turboderp-org/exllamav2/releases/download/v0.3.1/exllamav2-0.3.1-py3-none-any.whl; platform_system == "Linux" and platform_machine != "x86_64" diff --git a/requirements/portable/requirements.txt b/requirements/portable/requirements.txt index 4ddcf43f..f596675c 100644 --- a/requirements/portable/requirements.txt +++ b/requirements/portable/requirements.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-win_amd64.whl; platform_system == "Windows" 
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" diff --git a/requirements/portable/requirements_apple_intel.txt b/requirements/portable/requirements_apple_intel.txt index 38a21618..e472e428 100644 --- a/requirements/portable/requirements_apple_intel.txt +++ b/requirements/portable/requirements_apple_intel.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # Mac wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_x86_64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" diff --git a/requirements/portable/requirements_apple_silicon.txt b/requirements/portable/requirements_apple_silicon.txt index 0b70c800..b60eccf5 100644 --- a/requirements/portable/requirements_apple_silicon.txt +++ b/requirements/portable/requirements_apple_silicon.txt @@ -19,6 +19,6 @@ sse-starlette==1.6.5 tiktoken # Mac wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_15_0_arm64.whl; platform_system == "Darwin" and platform_release >= "24.0.0" and platform_release < "25.0.0" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_14_0_arm64.whl; platform_system == "Darwin" and platform_release >= "23.0.0" and platform_release < "24.0.0" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0-py3-none-macosx_13_0_arm64.whl; platform_system == "Darwin" and platform_release >= "22.0.0" and platform_release < "23.0.0" diff --git a/requirements/portable/requirements_cpu_only.txt b/requirements/portable/requirements_cpu_only.txt index 510a20f4..c6586848 100644 --- a/requirements/portable/requirements_cpu_only.txt +++ b/requirements/portable/requirements_cpu_only.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # llama.cpp (CPU only, AVX2) 
-https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx2-py3-none-win_amd64.whl; platform_system == "Windows" diff --git a/requirements/portable/requirements_cpu_only_noavx2.txt b/requirements/portable/requirements_cpu_only_noavx2.txt index e6d9f0c5..d0f113a7 100644 --- a/requirements/portable/requirements_cpu_only_noavx2.txt +++ b/requirements/portable/requirements_cpu_only_noavx2.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # llama.cpp (CPU only, no AVX2) -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cpuavx-py3-none-win_amd64.whl; platform_system == "Windows" diff --git a/requirements/portable/requirements_noavx2.txt b/requirements/portable/requirements_noavx2.txt index 48f92e0a..df1c5762 100644 --- a/requirements/portable/requirements_noavx2.txt +++ b/requirements/portable/requirements_noavx2.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+cu124avx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" diff --git a/requirements/portable/requirements_vulkan.txt b/requirements/portable/requirements_vulkan.txt index 9f93424f..2da3a81a 100644 --- a/requirements/portable/requirements_vulkan.txt +++ b/requirements/portable/requirements_vulkan.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" 
+https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkan-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" diff --git a/requirements/portable/requirements_vulkan_noavx2.txt b/requirements/portable/requirements_vulkan_noavx2.txt index 9070b9a6..f53432d8 100644 --- a/requirements/portable/requirements_vulkan_noavx2.txt +++ b/requirements/portable/requirements_vulkan_noavx2.txt @@ -19,5 +19,5 @@ sse-starlette==1.6.5 tiktoken # CUDA wheels -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows" -https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.18.0/llama_cpp_binaries-0.18.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-win_amd64.whl; platform_system == "Windows" +https://github.com/oobabooga/llama-cpp-binaries/releases/download/v0.20.0/llama_cpp_binaries-0.20.0+vulkanavx-py3-none-linux_x86_64.whl; platform_system == "Linux" and platform_machine == "x86_64" diff --git a/server.py b/server.py index cfb21a6e..7ce3c208 100644 --- a/server.py +++ b/server.py @@ -33,7 +33,6 @@ import matplotlib matplotlib.use('Agg') # This fixes LaTeX rendering on some systems -import json import os import signal import sys @@ -144,12 +143,16 @@ def create_interface(): # Temporary clipboard for saving files shared.gradio['temporary_text'] = gr.Textbox(visible=False) - # Text Generation tab + # Chat tab ui_chat.create_ui() - ui_default.create_ui() - ui_notebook.create_ui() + + # Notebook tab + with gr.Tab("Notebook", elem_id='notebook-parent-tab'): + ui_default.create_ui() + ui_notebook.create_ui() ui_parameters.create_ui() # Parameters tab + ui_chat.create_character_settings_ui() # Character tab ui_model_menu.create_ui() # Model tab if not shared.args.portable: training.create_ui() # Training tab diff --git a/user_data/prompts/Alpaca-with-Input.txt b/user_data/prompts/Alpaca-with-Input.txt deleted file mode 100644 index 56df0e28..00000000 --- a/user_data/prompts/Alpaca-with-Input.txt +++ /dev/null @@ -1,10 +0,0 @@ -Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request. - -### Instruction: -Instruction - -### Input: -Input - -### Response: - diff --git a/user_data/prompts/QA.txt b/user_data/prompts/QA.txt deleted file mode 100644 index 32b0e235..00000000 --- a/user_data/prompts/QA.txt +++ /dev/null @@ -1,4 +0,0 @@ -Common sense questions and answers - -Question: -Factual answer:
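
Note — the sketches below are illustrative only and are not part of the patch. The Notebook changes above wire every `textbox-notebook` (and `textbox-default`) edit to `store_notebook_state_and_debounce`, which cancels and restarts a `threading.Timer` so that a burst of keystrokes collapses into a single disk write, and routes the actual write through a lock (`safe_autosave_prompt`) so overlapping saves cannot interleave. A minimal standalone version of that pattern is sketched here; the `user_data/logs/notebook` path and the 1-second delay are taken from the diff, while the simplified names (`store_and_debounce`, `_write_prompt`, `_flush`) are placeholders, not the patch's own identifiers.

```python
import threading
from pathlib import Path

_file_lock = threading.Lock()
_save_timer = None
_pending = None  # most recent (text, prompt_name) pair


def _write_prompt(text, prompt_name, base_dir=Path("user_data/logs/notebook")):
    # Hold the lock for the whole write so concurrent saves cannot interleave.
    if not prompt_name or not text.strip():
        return
    with _file_lock:
        base_dir.mkdir(parents=True, exist_ok=True)
        (base_dir / f"{prompt_name}.txt").write_text(text, encoding="utf-8")


def _flush():
    # Timer callback: persist whatever state was stored most recently.
    if _pending is not None:
        _write_prompt(*_pending)


def store_and_debounce(text, prompt_name, delay=1.0):
    # Remember the latest edit and restart the timer; only the last edit
    # within `delay` seconds actually reaches the disk.
    global _save_timer, _pending
    _pending = (text, prompt_name)
    if _save_timer is not None:
        _save_timer.cancel()
    _save_timer = threading.Timer(delay, _flush)
    _save_timer.start()
```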
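
A second sketch, for the streaming side: `generate_and_save_wrapper_notebook` in the diff re-yields the model's streamed output while saving the partial text on the first chunk, then at most once every 8 seconds, then once more when generation ends. The version below is reduced to a single string stream and a generic `save` callback (the real wrapper yields `(text, html)` pairs and saves via `safe_autosave_prompt`); `stream_with_periodic_save` and `fake_stream` are hypothetical names, while the 8-second interval mirrors the patch.

```python
import time


def stream_with_periodic_save(stream, save, save_interval=8.0):
    # Re-yield a streaming generator, saving the partial output on the first
    # chunk, then at most once per `save_interval` seconds, then at the end.
    last_save = time.monotonic()
    output = ""
    for i, output in enumerate(stream):
        yield output
        now = time.monotonic()
        if i == 0 or now - last_save >= save_interval:
            save(output)
            last_save = now
    save(output)


if __name__ == "__main__":
    # Dummy generator standing in for streamed model output.
    def fake_stream():
        text = ""
        for word in ["In", "this", "story,"]:
            text += word + " "
            yield text

    for _ in stream_with_periodic_save(fake_stream(), print):
        pass
```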
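
Finally, the web-search change adds `truncate_content_by_tokens`, which binary-searches over the character length of a page to find the longest prefix that fits a token budget, so each `encode` call shrinks the search range by half instead of re-tokenizing every candidate length. The standalone sketch below uses the same search logic with an injected `encode` callable in place of `shared.tokenizer.encode`; the 8192 default comes from the diff, and the whitespace split in the demo is only a stand-in for a real tokenizer.

```python
def truncate_by_tokens(content, encode, max_tokens=8192):
    # Return the longest prefix of `content` whose token count fits the
    # budget, found by binary search over the character length.
    if len(encode(content)) <= max_tokens:
        return content

    left, right = 0, len(content)
    while left < right:
        mid = (left + right + 1) // 2
        if len(encode(content[:mid])) <= max_tokens:
            left = mid
        else:
            right = mid - 1

    return content[:left]


if __name__ == "__main__":
    text = "one two three four five six"
    # Prints the longest prefix that stays within 3 whitespace-separated tokens.
    print(truncate_by_tokens(text, str.split, max_tokens=3))
```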