Admin Guide
Docker Compose is used to orchestrate the frontend, API, workers, databases, and other services that make up the Discourse Analysis Tool Suite (DATS).
Installation
0. Requirements
- Machine with NVIDIA GPU
- Docker with NVIDIA Container Toolkit
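Before installing, you can verify that containers can actually reach the GPU. This is only a quick sanity check (the ubuntu image is just an example workload; the NVIDIA Container Toolkit injects nvidia-smi into the container at runtime):
nvidia-smi
docker run --rm --gpus all ubuntu nvidia-smi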
1. Clone the repository
git clone https://github.com/uhh-lt/dats.git
2. Run setup scripts
./bin/setup-envs.sh --project_name dats --port_prefix 101
./bin/setup-folders.sh
3. Start docker containers
docker compose -f compose.vllm.yml up -d
docker compose -f compose.ray.yml up -d
docker compose -f compose.docling.yml up -d
docker compose -f compose.yml -f compose.production.yml up --wait
4. Open DATS
Open https://localhost:10100/ in your browser
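If the page does not come up, the container status and logs are the first things to check. These are standard Docker Compose commands; dats-backend-api is the API service name used elsewhere in this guide:
docker compose -f compose.yml -f compose.production.yml ps
docker compose -f compose.yml -f compose.production.yml logs -f dats-backend-api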
Update DATS
First, locate the DATS directory on the machine and navigate to the docker directory. Then, get the newest code from git:
git switch main
git pull
1. Stop all containers
docker compose -f compose.yml -f compose.production.yml down
2. Update the /docker/.env file
You have to update the /docker/.env file manually. Compare it with the .env.example file to find all differences. Then, use nano to change the .env file. Most likely, you need to update the DATS_BACKEND_DOCKER_VERSION and DATS_FRONTEND_DOCKER_VERSION variables to the newest version.
git diff --no-index .env.example .env
nano .env
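For example, the relevant lines in /docker/.env might then look like this (the version tags below are placeholders; use the tags of the release you are deploying):
DATS_BACKEND_DOCKER_VERSION=1.2.3
DATS_FRONTEND_DOCKER_VERSION=1.2.3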
3. Pull the newest docker containers
docker compose -f compose.yml -f compose.production.yml pull
4. Start all containers
docker compose -f compose.yml -f compose.production.yml up --wait
Now, DATS is updated to the new version. Note that you may also need to update the Ray, vLLM, and Docling containers (see the following sections)!
Update Ray
Ray only needs to run once per machine. It should always be up-to-date!
1. Stop the ray container
docker compose -f compose.ray.yml down
2. Update the /docker/.env file
You have to manually set the DATS_RAY_DOCKER_VERSION environment variable to the newest version, for example with nano:
nano .env
3. Pull the new docker container
docker compose -f compose.ray.yml pull
4. Start Ray
docker compose -f compose.ray.yml up --wait
Now, Ray is updated to the new version. Note that Ray only needs to run once per machine!
Update vLLM
vLLM only needs to run once per machine. It should always be up-to-date! However, vLLM is not developed by the DATS team, and its version number does not match our DATS version. Sometimes, even after deploying a new DATS version, the vLLM version remains unchanged.
1. Stop the vLLM container
docker compose -f compose.vllm.yml down
2. Pull the new docker container
docker compose -f compose.vllm.yml pull
3. Start vLLM
docker compose -f compose.vllm.yml up --wait
Now, vLLM is updated to the new version. Note that vLLM only needs to run once per machine!
Update Docling
Docling only needs to run once per machine. It should always be up-to-date! However, Docling is not developed by the DATS team, and its version number does not match our DATS version. Sometimes, even after deploying a new DATS version, the Docling version remains unchanged.
1. Stop the Docling container
docker compose -f compose.docling.yml down
2. Pull the new Docling container
docker compose -f compose.docling.yml pull
3. Start Docling
docker compose -f compose.docling.yml up --wait
Now, Docling is updated to the new version. Note that Docling only needs to run once per machine!
Folder Structure
The script ./bin/setup-folders.sh creates multiple folders:
- /backend_repo - User data
- various cache directories (api, rq, ray, vllm)
- various backup directories (weaviate, elasticsearch, postgres, repo)
- data directories (elasticsearch, pg, redis, weaviate)
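A purely hypothetical sketch of what this layout can look like (the exact paths and names are defined in ./bin/setup-folders.sh, which is authoritative):
backend_repo/                                     # uploaded user data
cache/{api,rq,ray,vllm}/                          # cache directories
backups/{weaviate,elasticsearch,postgres,repo}/   # backup targets
data/{elasticsearch,pg,redis,weaviate}/           # database data directories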
Configuration
There are two main files to configure DATS in production mode:
- /docker/.env
- /backend/configs/production.yaml
The .env file overrides frequently changing variables of the production.yaml config.
It is strongly recommended to change the following settings in .env:
- SYSTEM_USER_EMAIL
- SYSTEM_USER_PASSWORD
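For example (placeholder values; choose your own credentials):
SYSTEM_USER_EMAIL=admin@example.org
SYSTEM_USER_PASSWORD=replace-me-with-a-long-random-password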
You can find some additional configuration in the following files. However, we do not expect these to be changed:
- /docker/compose.yml
- /docker/compose.production.yml
- /docker/configs/es/elasticsearch.yml - Special Elasticsearch configuration
- /docker/configs/frontend/nginx.conf - Special Frontend / NGINX configuration
- /backend/src/ray_model_worker/config_gpu.yaml - Configure ML models
Backups
We provide several scripts to automatically create backups of all databases and uploaded user data. This is the recommended backup process:
1. Stop backend and frontend
This ensures that the backup process cannot be interrupted by users.
docker compose -f compose.yml -f compose.production.yml stop dats-frontend dats-backend-api
2. Create backups
./bin/backup-postgres.sh
./bin/backup-repo.sh
./bin/backup-elasticsearch.sh
./bin/backup-weaviate.sh
3. Restart containers
docker compose -f compose.yml -f compose.production.yml up --wait
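These steps can be combined into a small wrapper script and scheduled, e.g. via cron. The following is only a sketch and not part of the repository; it assumes it is run from the same directory in which the commands above are executed:
#!/usr/bin/env bash
# nightly-backup.sh - sketch of an automated DATS backup run (not part of the repository).
set -euo pipefail

# Stop user-facing services so the backup cannot be interrupted.
docker compose -f compose.yml -f compose.production.yml stop dats-frontend dats-backend-api

# Back up all databases and the uploaded user data.
./bin/backup-postgres.sh
./bin/backup-repo.sh
./bin/backup-elasticsearch.sh
./bin/backup-weaviate.sh

# Bring everything back up.
docker compose -f compose.yml -f compose.production.yml up --wait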
Single Sign-On (SSO)
DATS supports SSO using OAuth2. We have tested it with Authentik as the identity provider.
We include a compose.authentik.yml to start an Authentik instance, but you can use any service that supports OAuth2/OpenID.
This section explains the setup using Authentik.
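If you want to use the bundled Authentik instance, it can presumably be started like the other services (this command only follows the pattern used throughout this guide; check compose.authentik.yml for any required variables):
docker compose -f compose.authentik.yml up --wait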
1. Configure Authentik
- First, a new application has to be created in Authentik.
- Use dats as the name and slug, then choose OAuth2/OpenID Provider as the Provider Type.
- Note the Client ID and Client secret. You will need them in the next step.
- It is important to leave the private key field empty, as Authlib does not currently support token decryption.
- Do not specify any groups. DATS does not support roles or groups.
Next, we need to find the OpenID configuration (metadata) URL. In Authentik, this can be found under Applications/Provider/dats.
2. Configure DATS
- Navigate to the docker directory and open the .env file
- Fill in the corresponding variables: OIDC_CLIENT_ID, OIDC_CLIENT_SECRET, and OIDC_SERVER_METADATA_URL. Also, set OIDC_ENABLED=True.
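An illustrative excerpt of the resulting /docker/.env (placeholder values; the metadata URL shown follows Authentik's usual /application/o/<slug>/ pattern, but verify it in the Authentik UI):
OIDC_ENABLED=True
OIDC_CLIENT_ID=<Client ID from Authentik>
OIDC_CLIENT_SECRET=<Client secret from Authentik>
OIDC_SERVER_METADATA_URL=https://authentik.example.org/application/o/dats/.well-known/openid-configuration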
Monitoring
DATS is a complex application that consists of various Docker containers that are managed with Docker Compose. A monitoring system that watches the Docker containers' state and health is important for running applications reliably in Docker and not relying on users to report outages. We use Uptime Kuma, a simple, self-hosted, UI-focused monitoring software. It is open-source and can be run as another Docker container. Kuma uses MariaDB to store its data.
1. Configure Uptime Kuma
Kuma is configured like every other Docker container in DATS via the /docker/.env file.
Modify the corresponding variables: KUMA_*, MARIA_* and DOCKER_GROUP_ID.
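An illustrative excerpt (only the variable names explicitly mentioned in this guide are shown; the full list of KUMA_* and MARIA_* variables is in .env.example):
DOCKER_GROUP_ID=999    # e.g. the GID reported by: getent group docker
KUMA_EXPOSED=10131     # hypothetical port on which the Kuma UI will be exposed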
2. Start Uptime Kuma
docker compose -f compose.kuma.yml up --wait
3. Configuration
Now, it is necessary to set up the monitoring manually.
- View http://localhost:<KUMA_EXPOSED>
- Set up monitoring
More info can be found in Kuma's Documentation.