Delta Research Center
- Taiwan, Taipei
(UTC +08:00) - blog.philip-huang.tech
Starred repositories
Algorithm powering the For You feed on X
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
A small command-line tool that simplifies releasing software by updating all version strings in your source code by the correct increment and optionally committing and tagging the changes.
A collection of examples for the ROCm software stack
Allow torch tensor memory to be released and resumed later
My learning notes for ML SYS.
Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools"
FlashInfer: Kernel Library for LLM Serving
Awesome Reasoning LLM Tutorial/Survey/Guide
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Minimal example for DeepSpeed Universal Checkpoint
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
WikiTableSet: the largest publicly available image-based table recognition dataset in three languages, built from Wikipedia
Automatic GPU+CPU memory profiling, re-use and memory leaks detection using jupyter/ipython experiment containers
Run PyTorch LLMs locally on servers, desktop and mobile
We collect papers about "large language models (LLM) for table-related tasks", e.g., using LLMs for the Table QA task. A curated list of "table + LLM" papers.
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
This project has implemented the RAG function on Jetson and supports TXT and PDF document formats. It uses MLC for 4-bit quantization of the Llama2-7b model, utilizes ChromaDB as the vector databas…
LLM training code for Databricks foundation models
egui: an easy-to-use immediate mode GUI in Rust that runs on both web and native
🐳 Web Interface for the Docker Registry HTTP API V2 written in Ruby on Rails.
The simplest and most complete UI for your private docker registry v2 and v3
Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?
Source code of "Reasons to Reject? Aligning Language Models with Judgments"
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.
Implementation of Nougat Neural Optical Understanding for Academic Documents