# mini-AIOps

**Repository Path**: kevinlights/mini-aiops

## Basic Information

- **Project Name**: mini-AIOps
- **Description**: AIOps research, including ML Flow and LLM integration, etc.
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-05-24
- **Last Updated**: 2026-05-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Mini AIOPS

## Project Overview

### MLFlow 实现
这个项目通过一个循环（模拟 Airflow 的调度），演示以下企业级核心逻辑：
1. Data Ingestion (数据采集)：模拟从 Prometheus 抓取实时指标。 
2. Feature Engineering (特征工程)：计算滑动窗口特征。 
3. Model Versioning (模型版本管理)：使用类似 MLflow 的逻辑，记录每一次训练的参数、指标和模型文件。 
4. Continuous Training (持续训练)：随着新数据的进入，不断更新模型权重。 
5. Model Monitoring (模型监控)：对比新旧模型的表现，判断是否需要触发重新部署。

## LLM 集成
MLFlow 提供异常数据，让 LLM 模型分析并生成异常报告。

### 架构设计
```
ELK (模拟日志) ──┐
                  ├──→ LLM (Ollama/ministral-3) ──→ Anomaly Report
Prometheus (指标) ─┤
                  │
MLflow (异常数据) ─┘
```

### 核心流程
1. 模拟从 ELK 中获取日志信息 — MockELK 生成 K8s 服务运行日志
2. 模拟从 Prometheus 中获取指标数据 — MockPrometheus 提供 CPU 时序指标
3. 从 MLFlow 中获取异常数据 — MiniMLflow 输出模型检测到的异常点
4. 调用 LLM 模型，生成异常报告 — OllamaLLM 将多源数据汇总后让 LLM 分析

### 技术要点
- **多源数据融合**: AIOps 的核心价值在于将分散的监控数据（指标、日志、事件）关联起来
- **LLM 作为分析引擎**: LLM 擅长从非结构化文本中提取信息、识别模式、生成可读报告
- **OpenAI 兼容接口**: Ollama 提供 `/v1/chat/completions` 兼容端点，可直接使用 OpenAI SDK
- **Prompt Engineering**: 通过结构化 Prompt 引导 LLM 按运维场景输出分析结论