Full documentation update to match current codebase

oobabooga 2026-03-05 12:46:21 -03:00
parent 1c2548fd89
commit 1ffe540c97
10 changed files with 388 additions and 326 deletions


@@ -39,7 +39,7 @@ curl http://127.0.0.1:5000/v1/completions \
#### Chat completions
Works best with instruction-following models. If the "instruction_template" variable is not provided, it will be guessed automatically based on the model name using the regex patterns in `user_data/models/config.yaml`.
```shell
curl http://127.0.0.1:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "mode": "instruct"
  }'
```
@@ -476,51 +476,45 @@ OPENAI_API_KEY=sk-111111111111111111111111111111111111111111111111
OPENAI_API_BASE=http://127.0.0.1:5000/v1
```
With the [official python openai client](https://github.com/openai/openai-python) (v1.x), the address can be set like this:
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-111111111111111111111111111111111111111111111111",
    base_url="http://127.0.0.1:5000/v1"
)

response = client.chat.completions.create(
    model="x",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
```
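The chat completions endpoint also supports streaming. A minimal sketch using the same client, with the loop guarded so the helper can be reused without a running server (`join_stream_text` is an illustrative helper of ours, not part of the openai library):

```python
def join_stream_text(pieces):
    # Streamed chunks may carry a None delta (e.g. the final chunk), so skip falsy values.
    return "".join(p for p in pieces if p)

if __name__ == "__main__":
    # The import lives here so the helper above works even without the openai package.
    from openai import OpenAI

    client = OpenAI(
        api_key="sk-111111111111111111111111111111111111111111111111",
        base_url="http://127.0.0.1:5000/v1",
    )
    stream = client.chat.completions.create(
        model="x",
        messages=[{"role": "user", "content": "Hello!"}],
        stream=True,
    )
    # Each chunk carries an incremental delta rather than the full message.
    pieces = [chunk.choices[0].delta.content for chunk in stream]
    print(join_stream_text(pieces))
```

Printing each delta as it arrives (instead of collecting them) gives the familiar token-by-token output.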
If using .env files to save the `OPENAI_API_BASE` and `OPENAI_API_KEY` variables, make sure the .env file is loaded before the openai module is imported:
```python
from dotenv import load_dotenv
load_dotenv() # make sure the environment variables are set before import
import openai
```
With the [official Node.js openai client](https://github.com/openai/openai-node) (v4.x):
```js
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "http://127.0.0.1:5000/v1",
});

const response = await client.chat.completions.create({
  model: "x",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
```
### Embeddings (alpha)
The embeddings endpoint requires `sentence-transformers` to be installed, but chat and completions will function without it. It currently uses the HuggingFace model `sentence-transformers/all-mpnet-base-v2`, which produces 768-dimensional embeddings. The model is small and fast. This model and embedding size may change in the future.
| model name | dimensions | input max tokens | speed | size | Avg. performance |
| ---------------------- | ---------- | ---------------- | ----- | ---- | ---------------- |
| text-embedding-ada-002 | 1536 | 8192 | - | - | - |
| text-davinci-002 | 768 | 2046 | - | - | - |
| all-mpnet-base-v2 | 768 | 384 | 2800 | 420M | 63.3 |
| all-MiniLM-L6-v2 | 384 | 256 | 14200 | 80M | 58.8 |
@@ -528,50 +522,33 @@ In short, the all-MiniLM-L6-v2 model is 5x faster, 5x smaller ram, 2x smaller st
Warning: You cannot mix embeddings from different models even if they have the same dimensions. They are not comparable.
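Embeddings are typically compared with cosine similarity, which is only meaningful between vectors produced by the same model. A minimal sketch using only the standard library:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity: dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Identical directions score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Two vectors from different models can have the same dimensions and still land anywhere on this scale, which is why mixing them produces meaningless scores.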
### Compatibility
Note: the table below may be obsolete.
| API endpoint | notes |
| ------------------------- | --------------------------------------------------------------------------- |
| /v1/chat/completions | Use with instruction-following models. Supports streaming, tool calls. |
| /v1/completions | Text completion endpoint. |
| /v1/embeddings | Using SentenceTransformer embeddings. |
| /v1/images/generations | Image generation, response_format='b64_json' only. |
| /v1/moderations | Basic support via embeddings. |
| /v1/models | Lists models. Currently loaded model first. |
| /v1/models/{id} | Returns model info. |
| /v1/audio/\* | Supported. |
| /v1/images/edits | Not yet supported. |
| /v1/images/variations | Not yet supported. |
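The endpoints above can also be called without any client library. A minimal sketch that lists the available models using only the Python standard library (the `models_url` helper is ours, and a running local server is assumed):

```python
import json
import urllib.request

def models_url(base_url):
    # Illustrative helper: join the API base with the /models endpoint path.
    return base_url.rstrip("/") + "/models"

if __name__ == "__main__":
    # Assumes the server is running locally with the API enabled.
    with urllib.request.urlopen(models_url("http://127.0.0.1:5000/v1")) as resp:
        data = json.load(resp)
    for model in data["data"]:
        print(model["id"])
```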
#### Applications
Almost everything needs the `OPENAI_API_KEY` and `OPENAI_API_BASE` environment variables set, but there are some exceptions.
Note: the table below may be obsolete.
| Compatibility | Application/Library | Website | Notes |
| ------------- | -------------------- | ------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------- |
| ✅❌ | openai-python | https://github.com/openai/openai-python | Use `OpenAI(base_url="http://127.0.0.1:5000/v1")`. Only the endpoints from above work. |
| ✅❌ | openai-node | https://github.com/openai/openai-node | Use `new OpenAI({baseURL: "http://127.0.0.1:5000/v1"})`. See example above. |
| ✅ | anse | https://github.com/anse-app/anse | API Key & URL configurable in UI, Images also work. |
| ✅ | shell_gpt | https://github.com/TheR1D/shell_gpt | OPENAI_API_HOST=http://127.0.0.1:5000 |
| ✅ | gpt-shell | https://github.com/jla/gpt-shell | OPENAI_API_BASE=http://127.0.0.1:5000/v1 |
| ✅ | gpt-discord-bot | https://github.com/openai/gpt-discord-bot | OPENAI_API_BASE=http://127.0.0.1:5000/v1 |
| ✅ | OpenAI for Notepad++ | https://github.com/Krazal/nppopenai | api_url=http://127.0.0.1:5000 in the config file, or environment variables. |
| ✅ | vscode-openai | https://marketplace.visualstudio.com/items?itemName=AndrewButson.vscode-openai | OPENAI_API_BASE=http://127.0.0.1:5000/v1 |
| ✅❌ | langchain | https://github.com/hwchase17/langchain | Use `base_url="http://127.0.0.1:5000/v1"`. Results depend on model and prompt formatting. |