SRE

Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.

Here are 60 public repositories matching this topic...

🚨 IncidentFlow: Production-Ready Incident Management. A modern microservices-based platform for DevOps teams, built with React, Node.js, and MongoDB. Features Docker containerization, an Nginx reverse proxy, real-time updates, a dark mode UI, and comprehensive security. ✨ Docker • Nginx • SSL • Real-time • Dark Mode • Microservices 🚀 make nginx-start

  • Updated Aug 8, 2025
  • JavaScript

This project implements a self-healing infrastructure designed to automatically detect, respond to, and recover from system failures without human intervention. By combining cloud-native tooling, event-driven automation, and observability-driven intelligence, it aims to ensure high availability, scalability, and operational resilience.

  • Updated Aug 13, 2025
  • JavaScript
Followers: 145
Website: github.com/topics/sre