A comprehensive MLOps solution for deploying, serving, and monitoring fine-tuned LLMs.
This project demonstrates an end-to-end LLM deployment with a focus on:
- Serving fine-tuned Llama 3.2 1B models with vLLM
- Backend API with LangChain for structured LLM interactions
- Frontend interface with Gradio
- Comprehensive monitoring with Prometheus and Grafana
- Log aggregation with Loki
- vLLM API: Serves the Llama 3.2 1B base model and custom LoRA adapters (see the request sketch after this list)
- Backend: FastAPI service that handles prompting, model selection, and response formatting
- Frontend: Gradio web interface for easy interaction with the models
- Monitoring: Prometheus, Grafana, and Loki for observability
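As a concrete illustration of the vLLM API component, the sketch below calls the server's OpenAI-compatible endpoint directly from Python. vLLM selects a registered LoRA adapter when its name is passed as the `model` field; the adapter name `sentiment-lora`, the API key handling, and the prompt are illustrative assumptions, not names defined by this repository.

```python
import os
from openai import OpenAI

# Point the OpenAI client at the local vLLM server's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key=os.environ.get("OPENAI_API_KEY", "EMPTY"),  # placeholder if no key is enforced
)

# A LoRA adapter registered with the server is selected by using its name as
# the model. "sentiment-lora" is an illustrative adapter name.
response = client.chat.completions.create(
    model="sentiment-lora",
    messages=[{"role": "user", "content": "Classify the sentiment: 'The product arrived late and broken.'"}],
    max_tokens=32,
    temperature=0.0,
)
print(response.choices[0].message.content)
```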
- Sentiment Analysis: Analyzes text sentiment using a fine-tuned model
- Medical QA: Answers medical multiple-choice questions with domain-specific tuning (an example request is sketched below)
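To illustrate the Medical QA use case, the sketch below posts a multiple-choice question to the backend API. The route `/medical-qa`, the payload, and the response shape are assumptions made for illustration; the actual contract is defined by the backend service.

```python
import requests

# Hypothetical route and schema; the real backend API may differ.
payload = {
    "question": "Which vitamin deficiency causes scurvy?",
    "choices": ["A. Vitamin A", "B. Vitamin B12", "C. Vitamin C", "D. Vitamin D"],
}
resp = requests.post("http://localhost:8001/medical-qa", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())  # e.g. {"answer": "C"} under the assumed schema
```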
Start the services in order, running each step from the repository root:
- Set up the network: `docker network create aio-network`
- Start the monitoring stack: `cd monitor && docker compose up -d`
- Launch the vLLM API server: `cd vllm_api && docker compose up -d`
- Start the backend API: `cd backend && docker compose up -d`
- Launch the frontend application: `cd frontend && docker compose up -d`
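Once the containers are up, a quick way to confirm the vLLM server is healthy is to list the models it serves through the OpenAI-compatible API. This is a minimal sketch; the API key is only needed if the server enforces one.

```python
import os
from openai import OpenAI

# List the models served by vLLM (the base model plus any registered LoRA adapters).
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key=os.environ.get("OPENAI_API_KEY", "EMPTY"),  # placeholder if no key is enforced
)
for model in client.models.list():
    print(model.id)
```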
Once everything is running, the services are available at:
- vLLM API: http://localhost:8000
- Backend API: http://localhost:8001
- Gradio UI: http://localhost:7861
- Open WebUI: http://localhost:8080
- Grafana: http://localhost:3000
- Prometheus: http://localhost:9090
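To check that metrics are flowing end to end, you can query Prometheus's HTTP API for one of the gauges the vLLM server exports. The metric name `vllm:num_requests_running` assumes a recent vLLM version and that Prometheus is scraping the vLLM container; adjust it to whatever your deployment exposes.

```python
import requests

# Ask Prometheus for a vLLM gauge via its HTTP query API.
resp = requests.get(
    "http://localhost:9090/api/v1/query",
    params={"query": "vllm:num_requests_running"},
    timeout=10,
)
resp.raise_for_status()
for result in resp.json()["data"]["result"]:
    print(result["metric"], result["value"])
```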
To benchmark serving performance, export the vLLM API key as an OpenAI-style key and run the make target:
- `export OPENAI_API_KEY=<your vllm api key>`
- `make bench_serving`
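For a quick sanity check before running the full benchmark, the sketch below times a handful of chat completions against the OpenAI-compatible endpoint. It is a rough, hand-rolled measurement under assumed defaults, not the project's `bench_serving` target.

```python
import os
import time
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key=os.environ.get("OPENAI_API_KEY", "EMPTY"),
)

# Use the first served model and time a few short completions; this is a
# sanity check only, not a substitute for `make bench_serving`.
model_name = client.models.list().data[0].id
latencies = []
for _ in range(5):
    start = time.perf_counter()
    client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "Say hello in one short sentence."}],
        max_tokens=16,
    )
    latencies.append(time.perf_counter() - start)

print(f"mean latency over {len(latencies)} requests: {sum(latencies) / len(latencies):.2f}s")
```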