
llamacpp-rocm


We provide nightly builds of llama.cpp with AMD ROCm™ 7 acceleration, based on TheRock, so you always have access to the most current builds available. Our automated pipeline specifically targets seamless integration with 🍋 Lemonade and similar AI applications that require high-performance GPU inference.

Important

Contribution & Support Notice: While this project currently focuses on integrating llama.cpp+ROCm in a specific production context, our broader goal is to contribute meaningfully to the llama.cpp+ROCm ecosystem. We're not set up to provide comprehensive technical support, but we welcome collaborations, idea exchanges, or contributions that help advance this space.

🎯 Supported Devices

This build specifically targets the following GPU architectures:

  • gfx1151 (Strix Halo APU) - Ryzen AI MAX+ Pro 395
  • gfx1150 (Strix Point APU) - Ryzen AI 300
  • gfx120X (RDNA4 GPUs) - includes AMD Radeon RX 9070 XT/GRE/9070, RX 9060 XT/9060
  • gfx110X (RDNA3 GPUs) - includes AMD Radeon PRO W7900/W7800/W7700/W7600, RX 7900 XTX/XT/GRE, RX 7800 XT, RX 7700 XT/7700, RX 7600 XT/7600

All builds include ROCm™ 7 built-in - no separate ROCm™ installation required!
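
Not sure which gfx target matches your GPU? If you already have ROCm tooling installed (not required for these builds, which bundle their own runtime), a quick sketch on Ubuntu is to ask rocminfo for the first reported architecture name:

rocminfo | grep -o -m 1 'gfx[0-9a-f]*'

On Windows, or without ROCm tooling, simply match your GPU model against the list above.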

🚀 Automated Builds

Our automated GitHub Actions workflow creates nightly builds for:

  • Windows and Ubuntu operating systems
  • Multiple GPU targets: gfx1151, gfx1150, gfx110X, gfx120X
  • ROCm™ 7 built-in - complete runtime libraries included

GPU Target   Ubuntu                      Windows
gfx110X      Download Ubuntu gfx110X     Download Windows gfx110X
gfx1150      Download Ubuntu gfx1150     Download Windows gfx1150
gfx1151      Download Ubuntu gfx1151     Download Windows gfx1151
gfx120X      Download Ubuntu gfx120X     Download Windows gfx120X

⚡ Ready to Run: All releases include complete ROCm™ 7 runtime libraries - just download and go!
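
If you script your setup, the GitHub CLI can fetch the latest release asset directly. A hedged sketch - the --pattern glob below is illustrative, so match it against the actual asset names on the releases page:

gh release download --repo lemonade-sdk/llamacpp-rocm --pattern '*ubuntu*gfx110X*'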


🧪 Quick Smoketest

To verify your download is working correctly:

  1. Download the appropriate build for your GPU target from our latest releases
  2. Extract the archive to your preferred directory
  3. Test with any GGUF model from Hugging Face:
llama-server -m YOUR_GGUF_MODEL_PATH -ngl 99
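
Putting the steps together on Ubuntu - the archive and model file names below are illustrative placeholders, so substitute the asset for your GPU target and your own GGUF path:

unzip llama-ubuntu-rocm-gfx110X.zip -d llamacpp-rocm
cd llamacpp-rocm
./llama-server -m /path/to/model.gguf -ngl 99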

💡 Tip: Use -ngl 99 to offload all layers to GPU for maximum acceleration. The exact number of layers may vary by model, but 99 ensures all available layers are offloaded.
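
Once the server is running, you can verify end-to-end inference over llama-server's built-in HTTP API. By default it listens on http://127.0.0.1:8080 (configurable via --host and --port) and exposes a health endpoint plus an OpenAI-compatible chat endpoint:

curl http://127.0.0.1:8080/health
curl http://127.0.0.1:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"messages":[{"role":"user","content":"Say hello in one sentence."}]}'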

🍋 Lemonade Integration: You can also test these builds directly with Lemonade for a seamless AI application experience (coming soon!)


📦 Dependencies

This project relies on the following external software and tools:

Core Dependencies

  • llama.cpp - Efficient, cross-platform inference engine for running GGUF models locally.
  • ROCm SDK (TheRock) - AMD’s open-source platform for GPU-accelerated computing.
  • HIP - C++ API for writing portable GPU code within the ROCm ecosystem.

Build Tools & Compilers


🏗️ Code and Artifact Structure

Note

Active Development: This project is under active development. Code and artifact structure are subject to change as we continue to improve and expand functionality.

Key Components

  • docs/ - Contains build documentation and setup guides
  • utils/ - Houses utility scripts for build automation and dependency management
  • GitHub Actions Workflows - Located in .github/workflows/ (automated build pipeline)
  • Build Artifacts - Generated during CI/CD and published as releases

The build process is primarily handled through GitHub Actions, with the repository serving as the source for automated compilation and packaging of llama.cpp with ROCm™ 7 support.


📋 Manual Build Instructions

For detailed manual build instructions, please see: docs/manual_instructions.md
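
As a rough orientation, a generic HIP-enabled llama.cpp build looks like the sketch below. Flag names track upstream llama.cpp and may differ from this repository's pinned procedure, so treat docs/manual_instructions.md as authoritative; gfx1100 is used as an illustrative target:

HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j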

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.