llama3
Here are 16 public repositories matching this topic...
- Compress context data to optimize memory and performance in C++ large language model applications within the llm-cpp toolkit. (Updated Mar 28, 2026 - C++)
- 🚀 Build high-performance AI applications with this C++ engine for Retrieval Augmented Generation (RAG) and efficient memory management. (Updated Mar 28, 2026 - C++)
- llama.cpp 🦙 LLM inference in TypeScript (Updated Sep 26, 2024 - C++)
- Explore LLM model deployment based on AXera's AI chips. (Updated Mar 27, 2026 - C++)
- A high-performance inference system for large language models, designed for production environments. (Updated Dec 19, 2025 - C++)
- Run generative AI models on Sophgo BM1684X/BM1688 chips. (Updated Mar 26, 2026 - C++)
- A great project for campus recruitment (autumn/spring hiring) and internships: build, from scratch and hands-on, an LLM inference framework supporting LLama2/3 and Qwen2.5. (Updated Oct 28, 2025 - C++)
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference; more devices mean faster inference. (Updated Feb 10, 2026 - C++)