Add a docker_data dir and improve caching

Rather than relying on paths outside the project dir, keep a
data dir holding all the caches, model files etc., and exclude it
from git and the docker build context with the .ignore files.

Also move the project install to the end, in its own layer, so a
change to a project file no longer forces a rebuild of all the
slow conda layers.
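The layer-ordering idea the message describes can be sketched as a minimal Dockerfile (a generic illustration, not this repo's actual file; the base image is a placeholder):

```dockerfile
# Docker caches layers top to bottom: a layer is rebuilt only when
# its inputs (or an earlier layer) change, so the slow, stable steps
# go first and the frequently-edited source goes last.
FROM python:3.9-slim

# Copy only the dependency manifest; this layer (and the install
# below) is reused until requirements.txt itself changes.
COPY requirements.txt /app/requirements.txt
WORKDIR /app
RUN pip install -r requirements.txt --no-cache-dir

# Source code last: day-to-day edits invalidate only these layers.
COPY . /app/
RUN pip install . --no-cache-dir
```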
Author: Gareth Davidson, 2023-09-13 00:26:59 +01:00
Parent: b94a396f47
Commit: 097f379c2d
9 changed files with 36 additions and 15 deletions

.dockerignore (new file)

@@ -0,0 +1,3 @@
+docker_data
+build
+*.pyc
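Excluding `docker_data` and `build` matters because `docker build .` ships the whole directory to the daemon as the build context, so a model cache in the tree would be re-sent on every build. A rough way to gauge the saving on the host (assumes GNU `du`; the excluded names mirror the entries above):

```shell
#!/bin/sh
# Compare the directory size with and without the ignored paths --
# a rough proxy for how much .dockerignore shrinks the build context.
total=$(du -sk . 2>/dev/null | cut -f1)
trimmed=$(du -sk --exclude=docker_data --exclude=build . 2>/dev/null | cut -f1)
echo "context: ${total}K total, ${trimmed}K after ignores"
```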

.gitignore (vendored)

@@ -132,4 +132,7 @@ dmypy.json
 .models/*
 .custom/*
 results/*
-debug_states/*
+debug_states/*
+
+# Ignore docker data files
+docker_data

Dockerfile

@@ -1,7 +1,5 @@
 FROM nvidia/cuda:12.2.0-base-ubuntu22.04
-COPY . /app
 RUN apt-get update && \
     apt-get install -y --allow-unauthenticated --no-install-recommends \
     wget \
@@ -14,7 +12,6 @@ ENV HOME="/root"
 ENV CONDA_DIR="${HOME}/miniconda"
 ENV PATH="$CONDA_DIR/bin":$PATH
 ENV CONDA_AUTO_UPDATE_CONDA=false
-ENV PIP_DOWNLOAD_CACHE="$HOME/.pip/cache"
 RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda3.sh \
     && bash /tmp/miniconda3.sh -b -p "${CONDA_DIR}" -f -u \
@@ -29,5 +26,16 @@ RUN conda create --name tortoise python=3.9 numba inflect \
     && conda activate tortoise \
     && conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia \
     && conda install transformers=4.29.2 \
-    && cd /app \
-    && python setup.py install
+    && conda clean --all \
+    && echo "conda activate tortoise; cd /app" >> "${HOME}/.bashrc"
+
+# Cache the pip deps after all the slow Conda deps
+COPY requirements.txt /app/requirements.txt
+WORKDIR /app
+RUN conda activate tortoise \
+    && pip install -r requirements.txt --no-cache-dir
+
+# and finally copy the source over
+COPY . /app/
+RUN conda activate tortoise \
+    && pip install . --no-cache-dir
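The split into `COPY requirements.txt` + install, then `COPY .` + install, works because Docker keys each `COPY` layer on a checksum of the copied files and reuses everything up to the first changed layer. The same decision can be mimicked in plain shell (`.reqs.sha` is a made-up marker file for illustration, not part of the repo):

```shell
#!/bin/sh
# Mimic Docker's layer cache: redo the slow install only when
# requirements.txt has changed since the last run.
workdir=$(mktemp -d)
cd "$workdir"
echo "numpy" > requirements.txt

maybe_install() {
    new_sum=$(sha256sum requirements.txt | cut -d' ' -f1)
    old_sum=$(cat .reqs.sha 2>/dev/null || true)
    if [ "$new_sum" != "$old_sum" ]; then
        echo "requirements changed: installing"
        echo "$new_sum" > .reqs.sha
    else
        echo "cache hit: skipping install"
    fi
}

maybe_install   # prints "requirements changed: installing"
maybe_install   # prints "cache hit: skipping install"
```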

README.md
@@ -99,15 +99,7 @@ An easy way to hit the ground running and a good jumping off point depending on
 git clone https://github.com/neonbjb/tortoise-tts.git
 cd tortoise-tts
-docker build . -t tts
-docker run --gpus all \
-    -e TORTOISE_MODELS_DIR=/models \
-    -v /mnt/user/data/tortoise_tts/models:/models \
-    -v /mnt/user/data/tortoise_tts/results:/results \
-    -v /mnt/user/data/.cache/huggingface:/root/.cache/huggingface \
-    -v /root:/work \
-    -it tts
+./scripts/run_docker.sh
 ```
 This gives you an interactive terminal in an environment that's ready to do some tts. Now you can explore the different interfaces that tortoise exposes for tts.

(diffs for the remaining four changed files not shown)
scripts/run_docker.sh (new executable file)

@@ -0,0 +1,15 @@
+#!/bin/bash
+set -e
+docker build . -t tts
+pwd=$(pwd)
+docker run --gpus all \
+    -e TORTOISE_MODELS_DIR=/models \
+    -v $pwd/docker_data/models:/models \
+    -v $pwd/docker_data/results:/results \
+    -v $pwd/docker_data/.cache/huggingface:/root/.cache/huggingface \
+    -v $pwd/docker_data/work:/work \
+    -it tts
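One host-side detail worth knowing (standard Docker bind-mount behaviour, not something this commit handles): if the `docker_data` subdirectories don't exist when `docker run -v` references them, the daemon creates them owned by root. A small pre-flight sketch that creates the layout the script mounts:

```shell
#!/bin/sh
# Pre-create the host directories that run_docker.sh bind-mounts,
# so they end up owned by the current user rather than root.
for d in docker_data/models \
         docker_data/results \
         docker_data/.cache/huggingface \
         docker_data/work; do
    mkdir -p "$d"
done
```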