Accelerating AI Training and Inference from Storage Perspective (Must-read Papers on Storage for AI)
Updated Dec 17, 2025
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
This project builds an LLM-powered audio summarization pipeline that converts spoken content into concise, meaningful text summaries. It integrates speech-to-text processing with large language models to demonstrate practical applications of Generative AI for content understanding and automation.
Example distributed system for ML model inference using Kafka, including a Spring Boot REST+JPA server and a Java consumer program.
EGM is a deep learning tool that learns your image editing style from raw/edited pairs and applies it to new images. It uses a Pix2Pix (conditional GAN) architecture.
rocktop is a singing voice model training/inference system with a full test env and MCP server for devs
RapidVision is a real-time object detection tool powered by the PP-YOLOE deep learning model and the COCO object class dataset. It supports switching between multiple video sources and is built for responsive, flexible object recognition.
Resources of our survey paper "Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies"
An End-to-end AI Application classifying images as either a cat or a dog. The project leverages OpenVINO Model Server, a Node.js backend, and a React-based frontend.
A personal journey into model inference engineering — learning, building, and sharing along the way.
CLIP as a service: embed images and sentences for object recognition, visual reasoning, image classification, and reverse image search.
This vehicle identification project utilizes the YOLOv5 deep learning model for detecting and classifying vehicles from images, videos, and live streams. It supports real-time inference, saving outputs with bounding boxes, confidence scores, and class labels, making it ideal for traffic monitoring and smart surveillance systems.
An end‑to‑end TensorFlow/Keras implementation of the YOLO object detection pipeline. Load images, run fast and accurate bounding‑box inference, filter and refine predictions, and visualize results side‑by‑side, all organized into a clean, modular workflow.
Successfully developed a wildlife detection model using Faster R-CNN to identify and localize animals in natural habitats, supporting conservation efforts and ecological research.
😊📸 Real-Time Facial Emotion Recognition using Deep Learning 🤖🧠
A Cloud Run function that invokes a prediction against a machine learning model trained outside of a cloud provider.
Successfully established a multiclass text classification model by fine-tuning a pretrained DistilBERT transformer model to classify several distinct mental health statuses, such as anxiety, stress, and personality disorder, with an accuracy of 77%.
Successfully developed a multiclass text classification model by fine-tuning a pretrained DistilBERT transformer model to classify luxury apparel into categories such as pants, accessories, underwear, and shoes.
Successfully established an image classification model using PyTorch to classify images of distinct natural scenes, such as mountains, glaciers, forests, seas, streets, and buildings, with an accuracy of 86%.