This repository provides an easy, consistent way for operations engineers to spin up common software. Because most components are deployed with helm or kustomize, most of them share the same Makefile (a symbolic link to scripts/Makefile), so the make targets are the same across components.
This repository is also meant to run on k8s with ArgoCD, following the GitOps methodology, to spin up most of the software.
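To see which components actually share the common Makefile, a small check along the following lines may help. It is only a sketch: the function name list_shared_components is hypothetical, and it assumes that components live one directory below the repository root and that each shared Makefile is a symlink whose target ends in scripts/Makefile.

```shell
# Hypothetical helper (not part of the repository): print every component
# directory whose Makefile is a symlink pointing at scripts/Makefile.
# Assumption: components sit one level below the repository root.
list_shared_components() {
  root="${1:-.}"
  for mk in "$root"/*/Makefile; do
    # Skip regular files and the literal pattern left by an unmatched glob.
    [ -L "$mk" ] || continue
    target=$(readlink "$mk")
    case "$target" in
      */scripts/Makefile|scripts/Makefile)
        basename "$(dirname "$mk")"
        ;;
    esac
  done
}

# Usage (from the repository root):
#   list_shared_components .
```

Components that ship their own Makefile are skipped, so the output is a quick way to tell where the general make targets apply unchanged.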
Mac
make colima
Mac - destroy and redo
colima stop --force
colima delete --force
make colima
Linux
make k3s
Linux - destroy and redo
make k3s-redo
General make targets for all components:
- make get - lists k8s resources (deployment, pod, replicaset, services, ingresses, pvc, pv, etc.) and container images.
- make up - spins up the component as a production environment (usually uses values.yaml, or kustomization/base/kustomization.yaml).
- make local (instead of make up) - spins up the component as a local environment (usually uses values.local.yaml, or kustomization/overlays/local/).
- make down - shuts down the component.
- make img - pulls the container image up front and saves it as a file on the local hard drive (to speed up the next pull).
- make test - runs the relevant tests to verify that the deployment was successful.
- make cli - runs the command line interface (if any).
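Because the targets are the same everywhere, a typical component lifecycle can be scripted once and reused. The sketch below is an illustration rather than something shipped in the repository: run_component and the DRY_RUN variable are hypothetical names, and it assumes each component directory contains the shared Makefile.

```shell
# Hypothetical helper: run the given make targets, in order, inside one
# component directory. Stops at the first failing target.
# With DRY_RUN set to a non-empty value it only prints what it would run.
run_component() {
  dir="$1"
  shift
  for target in "$@"; do
    echo "+ make -C $dir $target"
    if [ -z "${DRY_RUN:-}" ]; then
      make -C "$dir" "$target" || return 1
    fi
  done
}

# Usage: pull the image, deploy locally, then verify the deployment.
#   run_component infra-mysql@8.2.0 img local test
# Tear it down again:
#   run_component infra-mysql@8.2.0 down
```

Running the targets one at a time, instead of passing them all to a single make invocation, keeps the printed trace aligned with the step that failed.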
Link: https://github.com/kvcache-ai/ktransformers/
Preparations:
- make sure k3s is running (e.g. Linux)
make k3s-redo
- make sure GPU operator is deployed to configure NVIDIA drivers (assumption: you have nvidia-drivers-550 already installed on the host machine, e.g. sudo apt-get install nvidia-drivers-550)
cd nvidia-gpu-operator/
make img local wait test
cd -
- make sure relevant model files are downloaded into the pv (persistent volume)
cd datascience-models/
cd models/
./DeepSeek-V3.sh # download the repo
./DeepSeek-V3-GGUF.sh # download the GGUF files
cd ..
make local test # setup pv and pvc
cd ..
- deploy the k8s components
cd ai-ktransformers/
make img local test
cd ..
- you can play with the API
./test "Tell me what is MoE in Machine Learning"
The above commands can actually be replaced by running:
./tests/e2e.ktransformers.sh
Spin up on local machine
cd infra-mysql@8.2.0/
make local
Spin up on local machine, with mysql-exporter
make mysql
Destroy
cd infra-mysql@8.2.0/
make down
Show k8s resources
cd infra-mysql@8.2.0/
make get
----------------------------------------
kubectl -n $(make -s ns) get all
NAME READY STATUS RESTARTS AGE
pod/mysql-0 1/1 Running 0 104s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mysql-headless ClusterIP None <none> 3306/TCP 104s
service/mysql ClusterIP 10.43.30.6 <none> 3306/TCP 104s
NAME READY AGE
statefulset.apps/mysql 1/1 104s
----------------------------------------
kubectl -n $(make -s ns) get ing
No resources found in infra-mysql namespace.
----------------------------------------
kubectl -n $(make -s ns) get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-mysql-0 Bound pvc-fdad5a94-824f-4cc5-a3b0-1dca064975e3 8Gi RWO local-path 104s
----------------------------------------
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-fdad5a94-824f-4cc5-a3b0-1dca064975e3 8Gi RWO Delete Bound infra-mysql/data-mysql-0 local-path 96s
----------------------------------------
if [ -f get.rc ]; then source get.rc; fi
mysql cli
cd infra-mysql@8.2.0/
make cli
Spin up on local machine
cd infra-mongodb@7.0.5/
make local
Destroy
cd infra-mongodb@7.0.5/
make down
Show k8s resources
cd infra-mongodb@7.0.5/
make get
----------------------------------------
kubectl -n $(make -s ns) get all
NAME READY STATUS RESTARTS AGE
pod/mongodb-0 1/1 Running 0 35s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mongodb ClusterIP 10.43.209.29 <none> 27017/TCP 35s
NAME READY AGE
statefulset.apps/mongodb 1/1 35s
----------------------------------------
kubectl -n $(make -s ns) get ing
No resources found in db namespace.
----------------------------------------
kubectl -n $(make -s ns) get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
datadir-mongodb-0 Bound pvc-c21e8702-999e-4943-87a3-a69bbb8b6474 8Gi RWO local-path 35s
----------------------------------------
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-c21e8702-999e-4943-87a3-a69bbb8b6474 8Gi RWO Delete Bound db/datadir-mongodb-0 local-path 33s
----------------------------------------
if [ -f get.rc ]; then source get.rc; fi
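Instead of re-running make get until the pod shows 1/1 Running, a script can block until the pod is ready. This is a minimal sketch assuming kubectl is on the PATH; the function name wait_for_pod and the default timeout are hypothetical, while the pod name mongodb-0 and the namespace lookup via make -s ns are taken from the output above.

```shell
# Hypothetical helper: block until the named pod reports the Ready condition,
# or fail when the timeout expires.
wait_for_pod() {
  ns="$1"
  pod="$2"
  timeout="${3:-300s}"   # assumed default, adjust as needed
  kubectl -n "$ns" wait --for=condition=Ready "pod/$pod" --timeout="$timeout"
}

# Usage (from inside the component directory, so that `make -s ns` resolves):
#   wait_for_pod "$(make -s ns)" mongodb-0 120s
```

The same call should work for the other components by swapping in the pod name shown by make get.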
mongodb cli
cd infra-mongodb@7.0.5/
make cli
if [ -f cli.rc ]; then source cli.rc; fi
If you don't see a command prompt, try pressing enter.
admin> show dbs;
admin 100.00 KiB
config 12.00 KiB
local 40.00 KiB
admin>
With ollama, you can add models of your own: browse https://ollama.com/search and configure them in values.yaml / values.local.yaml.
Spin up on local machine
# ollama downloads models from the internet, so the pod can take a long time to become ready
cd ai-ollama
make local
Destroy
cd ai-ollama
make down
Show k8s resources:
cd ai-ollama && make get
----------------------------------------
kubectl -n $(make -s ns) get all
NAME READY STATUS RESTARTS AGE
pod/ollama-7cd6c64695-f9bwq 1/1 Running 0 11h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ollama ClusterIP 10.43.35.67 <none> 11434/TCP 11h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ollama 1/1 1 1 11h
NAME DESIRED CURRENT READY AGE
replicaset.apps/ollama-7cd6c64695 1 1 1 11h
----------------------------------------
kubectl -n $(make -s ns) get ing
NAME CLASS HOSTS ADDRESS PORTS AGE
ollama traefik ollama-api.local 192.168.3.128 80 11h
----------------------------------------
kubectl -n $(make -s ns) get pvc
No resources found in ai-ollama namespace.
----------------------------------------
kubectl get pv
No resources found
----------------------------------------
if [ -f get.rc ]; then source get.rc; fi
To test (with 127.0.0.1 ollama-api.local in /etc/hosts):
cd ai-ollama
./test.sh
* processing: http://ollama-api.local/api/generate
* Trying 127.0.0.1:80...
* Connected to ollama-api.local (127.0.0.1) port 80
> POST /api/generate HTTP/1.1
> Host: ollama-api.local
> User-Agent: curl/8.2.1
> Accept: */*
> Content-Length: 50
> Content-Type: application/x-www-form-urlencoded
>
< HTTP/1.1 200 OK
< Content-Type: application/x-ndjson
< Date: Fri, 29 Mar 2024 03:24:08 GMT
< Transfer-Encoding: chunked
<
{"model":"llama2","created_at":"2024-03-29T03:24:08.427451961Z","response":"\n","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:08.929457502Z","response":"The","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:09.439385692Z","response":" sky","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:10.027851422Z","response":" appears","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:10.640386355Z","response":" blue","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:11.330515223Z","response":" to","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:11.921239622Z","response":" us","done":false}
{"model":"llama2","created_at":"2024-03-29T03:24:12.522344661Z","response":" because","done":false}
...
Edit values.yaml / values.local.yaml to control which models are served.
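The response shown in the test output above is newline-delimited JSON, one token per line, which is hard to read as text. The sketch below joins the "response" fields into a single line using only sed and tr. The function name join_responses is hypothetical, and it assumes each line has the "response" field immediately followed by "done", as in the sample output. It does not decode JSON escapes such as \n; if jq is installed, jq -j '.response' is the more robust choice.

```shell
# Hypothetical helper: read streamed NDJSON on stdin and print the
# concatenated "response" fields as one line of text.
# Assumption: every line looks like {...,"response":"<text>","done":...}.
join_responses() {
  sed -n 's/.*"response":"\(.*\)","done":.*/\1/p' | tr -d '\n'
  echo
}

# Usage (assuming ./test.sh writes the stream to stdout):
#   ./test.sh 2>/dev/null | join_responses
```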