Stars
A low-latency & high-throughput serving engine for LLMs
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
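The core idea of attention sinks is to keep the first few tokens' KV entries forever and evict only from the middle, retaining a sliding window of recent tokens. A minimal sketch of that eviction policy (the helper name and signature are illustrative, not the repo's API):

```python
def evict_kv(cache_len, max_len, n_sink):
    """Return indices of KV-cache entries to keep: the first n_sink
    'attention sink' tokens plus a sliding window of the most recent
    tokens. Illustrative sketch of the StreamingLLM eviction policy."""
    if cache_len <= max_len:
        return list(range(cache_len))
    window = max_len - n_sink
    # keep sinks [0, n_sink) and the trailing window of recent tokens
    return list(range(n_sink)) + list(range(cache_len - window, cache_len))

# e.g. a 10-entry budget with 4 sinks, after 15 generated tokens:
print(evict_kv(15, 10, 4))  # [0, 1, 2, 3, 9, 10, 11, 12, 13, 14]
```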
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
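SmoothQuant's key observation is that activation outliers can be migrated into the weights by a per-input-channel scale, leaving the product unchanged: Y = XW = (X diag(s)^-1)(diag(s) W). A minimal NumPy sketch of that rescaling (not the released implementation):

```python
import numpy as np

def smooth(X, W, alpha=0.5):
    """Migrate activation outliers into weights per input channel:
    s_j = max|X[:, j]|^alpha / max|W[j, :]|^(1 - alpha), then
    Y = X @ W == (X / s) @ (s[:, None] * W), which makes X / s easier
    to quantize. Illustrative sketch of the SmoothQuant scaling."""
    s = (np.abs(X).max(axis=0) ** alpha) / (np.abs(W).max(axis=1) ** (1 - alpha))
    return X / s, W * s[:, None]

rng = np.random.default_rng(0)
X, W = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))
Xs, Ws = smooth(X, W)
print(np.allclose(X @ W, Xs @ Ws))  # the product is unchanged: True
```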
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
KAG is a knowledge-enhanced generation framework built on the OpenSPG engine, used to build rigorous decision-making and information-retrieval knowledge services
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
This includes the original implementation of Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
[EMNLP'23, ACL'24] Compresses prompts and the KV-cache to speed up LLM inference and sharpen the model's perception of key information, achieving up to 20x compression with minimal performance loss.
Huazhong University of Science and Technology System Capability Training - DBMS
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Collecting awesome papers of RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
Awesome-LLM-RAG: a curated list of advanced retrieval augmented generation (RAG) in Large Language Models
🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
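Graph-structured ANN indices of this kind (DiskANN/HNSW-style) answer queries with a greedy best-first walk over a proximity graph, expanding the closest unexplored node until no frontier candidate beats the worst of the `ef` nearest found so far. A toy sketch of that traversal (the names and adjacency format are illustrative, not the library's API):

```python
import heapq
import numpy as np

def greedy_search(graph, vectors, query, start, ef=4):
    """Best-first search over a proximity graph: pop the closest frontier
    node, relax its neighbors, and stop once the frontier cannot improve
    the current ef nearest. Toy sketch of graph-based ANN search."""
    dist = lambda i: float(np.linalg.norm(vectors[i] - query))
    visited = {start}
    cand = [(dist(start), start)]        # min-heap: exploration frontier
    best = [(-dist(start), start)]       # max-heap: current ef nearest
    while cand:
        d, u = heapq.heappop(cand)
        if d > -best[0][0]:              # frontier can no longer improve
            break
        for v in graph[u]:
            if v not in visited:
                visited.add(v)
                dv = dist(v)
                if len(best) < ef or dv < -best[0][0]:
                    heapq.heappush(cand, (dv, v))
                    heapq.heappush(best, (-dv, v))
                    if len(best) > ef:
                        heapq.heappop(best)  # drop the current worst
    return sorted((-d, v) for d, v in best)

# six points on a line, chained into a path graph:
vectors = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
result = greedy_search(graph, vectors, np.array([4.2]), start=0, ef=2)
print([v for _, v in result])  # walks from node 0 to the 2 nearest: [4, 5]
```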
Flash Attention in ~100 lines of CUDA (forward pass only)
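The trick that makes such a fused kernel possible is the online (streaming) softmax: process the score vector in blocks while tracking a running max and running sum, so no full pass over the scores is needed before normalizing. A NumPy sketch of just that recurrence, not the CUDA kernel itself (flash attention additionally fuses the value accumulation into the same loop):

```python
import numpy as np

def online_softmax(x, block=4):
    """One-pass streaming softmax: for each block, rescale the running
    sum s by exp(m - m_new) when the running max m grows, exactly as
    flash-attention kernels do per tile. Illustrative sketch."""
    m, s = -np.inf, 0.0
    for i in range(0, len(x), block):
        b = x[i:i + block]
        m_new = max(m, b.max())
        s = s * np.exp(m - m_new) + np.exp(b - m_new).sum()
        m = m_new
    return np.exp(x - m) / s  # final normalization

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 0.5])
ref = np.exp(x - x.max()) / np.exp(x - x.max()).sum()
print(np.allclose(online_softmax(x), ref))  # matches two-pass softmax: True
```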
An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transf…
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama, for local-knowledge-based LLM services
LlamaIndex is a data framework for your LLM applications
🦜🔗 Build context-aware reasoning applications
A Survey on Benchmarks of Multimodal Large Language Models
Ceph is a distributed object, block, and file storage platform
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale