Tags: Gthulhu/Gthulhu
Tags
Release 1.2.0 (#111)

* feat: support per-node runtime scheduler selection (#110)
  - add schedulerName runtime config support across manager, decisionmaker, and daemon
  - launch none, Gthulhu, or bundled scx scheduler binaries from daemon runtime config
  - persist per-node desired runtime configs in MongoDB and report drift status
  - expose scx scheduler selection and desired/applied state in the web UI
  - update Helm defaults and docs for the scx-flavored scheduler image
* fix: Adaptive Classification
* feat: rm unused component (KEDA Auto-Scaling)
* fix: restart scheduler periodically
* feat: apply config to specific node

---------

Co-authored-by: gthulhu-work <gthulhu.scheduler@gmail.com>
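The desired/applied comparison behind the "drift status" item can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `RuntimeConfig` type, the `DriftStatus` values, the `DriftFor` function, and the node and scheduler names in `main` are all hypothetical. Only the idea of a per-node `schedulerName` with a desired and an applied state comes from the release notes.

```go
package main

import (
	"fmt"
	"sort"
)

// RuntimeConfig is a hypothetical stand-in for a node's runtime config.
// Only the scheduler name field is implied by the release notes.
type RuntimeConfig struct {
	SchedulerName string
}

// DriftStatus is an assumed set of states for illustration.
type DriftStatus string

const (
	StatusPending DriftStatus = "pending" // node has not reported yet
	StatusInSync  DriftStatus = "in-sync" // applied matches desired
	StatusDrifted DriftStatus = "drifted" // applied differs from desired
)

// DriftFor compares the desired config with what a node reports as applied.
// A nil applied config means the node has not reported anything yet.
func DriftFor(desired RuntimeConfig, applied *RuntimeConfig) DriftStatus {
	if applied == nil {
		return StatusPending
	}
	if *applied == desired {
		return StatusInSync
	}
	return StatusDrifted
}

func main() {
	// Illustrative data; node names and scheduler names are made up.
	desired := map[string]RuntimeConfig{
		"node-a": {SchedulerName: "gthulhu"},
		"node-b": {SchedulerName: "none"},
		"node-c": {SchedulerName: "gthulhu"},
	}
	applied := map[string]*RuntimeConfig{
		"node-a": {SchedulerName: "gthulhu"},
		"node-b": {SchedulerName: "gthulhu"},
	}

	// Sort node names so the report order is stable.
	nodes := make([]string, 0, len(desired))
	for name := range desired {
		nodes = append(nodes, name)
	}
	sort.Strings(nodes)

	for _, name := range nodes {
		fmt.Printf("%s: %s\n", name, DriftFor(desired[name], applied[name]))
	}
}
```

Using a pointer for the applied side keeps "never reported" distinct from "reported an empty config", which is one plausible way to avoid flagging freshly joined nodes as drifted.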
Issue #89 [refactor] web ui for workload orchestrator (#90)

* feat(monitor): add pod-level scheduling metrics monitor subsystem

  Introduce a complete eBPF-based pod-level scheduling metrics monitor that watches PodSchedulingMetrics CRDs, collects per-PID kernel scheduling statistics, and exposes aggregated pod-level metrics via Prometheus for KEDA-driven autoscaling.

  Architecture:
    PodSchedulingMetrics CRD → CRD Watcher → eBPF Collector (tp_btf/sched_switch, tp_btf/sched_process_exit) → Prometheus → KEDA → HPA

  New packages:
  - monitor/bpf: eBPF program (sched_monitor.bpf.c/h) hooking sched_switch and sched_process_exit tracepoints for per-PID metric collection
  - monitor/collector: BPF lifecycle management, per-PID metric aggregation, PID-to-Pod mapping via /proc cgroup parsing (v1/v2), and Prometheus collector exposing 7 metrics per pod (wait_time, run_time, switches, migrations, latency_avg, latency_p99, priority_boosts)
  - monitor/crdwatcher: Dynamic Kubernetes client watching PodSchedulingMetrics CRs, reconciling eBPF monitored_pids map with label-matched pods
  - monitor/monitor.go: Orchestrator wiring collector, CRD watcher, and Prometheus HTTP server with kubeconfig support

  CRD & Helm:
  - PodSchedulingMetrics CRD (gthulhu.io/v1alpha1) with label selectors, metric thresholds, and scaling hints
  - KEDA ScaledObject template for Prometheus-based pod autoscaling
  - Prometheus ServiceMonitor for monitor scraping
  - values.yaml additions for KEDA and prometheusAdapter configuration

  Configuration:
  - MonitorConfig with enable_crd_watcher and kubeconfig_path fields
  - IsMonitorEnabled()/IsSchedulerEnabled() for flexible mode selection
  - Monitor startup integrated into main.go with standalone monitor mode

  Testing & Build:
  - 21 unit tests across collector and crdwatcher packages (all passing)
  - Makefile: added test-monitor target, monitor lint coverage
  - Dockerfile for containerized monitor builds

  Domain:
  - PodSchedulingMetricsSpec, TaskSchedMetrics, PodSchedMetrics types in api/decisionmaker/domain for cross-component type sharing
* chore: update ci/cd trigger condition
* fix: ci linter failed
* chore: added concurrency settings for CI/CD
* fix: ci linter failed
* fix: NewModuleFromFile: open sched_monitor.bpf.o: no such file or directory
* chore: added test config
* chore: update container image release flow
* fix: ignore scheduler's image release
* chore: update ci/cd trigger condition for image release
* fix: only trigger schtest ci/cd procedure when PR is created for main/develop branch
* fix: Go / Build and Push Container Image (pull_request) ignored
* feat: FE & RBAC for configuring the metrics CRD
* fix: dm doesn't record the pod-level metrics
* feat: refactor web gui based on figma file
* chore: update README.md
* chore: update logo
* feat: adjust sidebar's layout
* chore: update README
* chore: update README
* chore: added new demo video

---------

Co-authored-by: ianchen0119 <ychen.desl@gmail.com>
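The "PID-to-Pod mapping via /proc cgroup parsing (v1/v2)" step can be illustrated with a small sketch. It is not the code in monitor/collector: the function name `PodUIDFromCgroup`, the regular expression, and the sample paths are assumptions. It relies only on the documented `/proc/<pid>/cgroup` line format (`hierarchy-ID:controller-list:cgroup-path`) and on the common kubelet convention of embedding `pod<uid>` in the cgroup path, with underscores in place of dashes under the systemd cgroup driver.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// podUIDPattern matches "pod<uid>" where <uid> is UUID-shaped and uses
// either "-" (cgroupfs driver) or "_" (systemd driver) as separator.
var podUIDPattern = regexp.MustCompile(
	`pod([0-9a-fA-F]{8}[-_][0-9a-fA-F]{4}[-_][0-9a-fA-F]{4}[-_][0-9a-fA-F]{4}[-_][0-9a-fA-F]{12})`)

// PodUIDFromCgroup (hypothetical helper) scans the contents of a
// /proc/<pid>/cgroup file and returns the first pod UID found in any
// cgroup path, normalized to use dashes.
func PodUIDFromCgroup(content string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		// Each line is "hierarchy-ID:controller-list:cgroup-path".
		// cgroup v2 lines look like "0::/path".
		parts := strings.SplitN(sc.Text(), ":", 3)
		if len(parts) != 3 {
			continue
		}
		m := podUIDPattern.FindStringSubmatch(parts[2])
		if m == nil {
			continue
		}
		return strings.ReplaceAll(m[1], "_", "-"), true
	}
	return "", false
}

func main() {
	// Made-up sample lines in cgroup v1 and v2 layouts.
	v1 := "11:memory:/kubepods/burstable/pod12345678-1234-1234-1234-123456789abc/0a1b2c\n"
	v2 := "0::/kubepods.slice/kubepods-burstable.slice/" +
		"kubepods-burstable-pod12345678_1234_1234_1234_123456789abc.slice/cri-containerd-0a1b2c.scope\n"
	host := "0::/user.slice/user-1000.slice/session-1.scope\n"

	for _, sample := range []string{v1, v2, host} {
		uid, ok := PodUIDFromCgroup(sample)
		fmt.Printf("found=%v uid=%q\n", ok, uid)
	}
}
```

In practice the content would be read from `/proc/<pid>/cgroup` for each PID seen by the eBPF collector; a process that does not belong to a pod simply yields no match.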
feature: Add config for max time watchdog toggle (#58)

* chore: Update qumun submodule

  Signed-off-by: YU-WEI,HSU <weiso131@weiso131.com>
* feature: Add config for max time watchdog toggle

  The max time watchdog can be enabled or disabled by modifying the `max_time_watchdog` setting in config.yaml. Default value is true.

  Signed-off-by: YU-WEI,HSU <weiso131@weiso131.com>
* fix: ci lint failed

---------

Signed-off-by: YU-WEI,HSU <weiso131@weiso131.com>
Co-authored-by: Ian Chen <iancodinghtml@gmail.com>
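A boolean setting that defaults to true, like `max_time_watchdog`, needs a way to tell "not set" apart from "set to false". One common Go pattern is a pointer field, sketched below. The `SchedulerConfig` type and `WatchdogEnabled` method are hypothetical and not taken from the project; only the key name and the default value come from the commit message. The `yaml` struct tag is shown for orientation and would only take effect with a YAML decoder, which this sketch does not use.

```go
package main

import "fmt"

// SchedulerConfig is a hypothetical config struct. A *bool lets an
// omitted key (nil) be distinguished from an explicit false.
type SchedulerConfig struct {
	MaxTimeWatchdog *bool `yaml:"max_time_watchdog"`
}

// WatchdogEnabled returns the configured value, falling back to the
// documented default of true when the key is absent.
func (c SchedulerConfig) WatchdogEnabled() bool {
	if c.MaxTimeWatchdog == nil {
		return true
	}
	return *c.MaxTimeWatchdog
}

func main() {
	off := false

	unset := SchedulerConfig{}                       // key missing from config.yaml
	disabled := SchedulerConfig{MaxTimeWatchdog: &off} // max_time_watchdog: false

	fmt.Println("unset:", unset.WatchdogEnabled())
	fmt.Println("disabled:", disabled.WatchdogEnabled())
}
```

With a plain `bool` field, a missing key would decode to false and silently disable the watchdog, which is why the pointer (or an explicit default applied before decoding) is needed.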