Greetings,
Love the application and UX!
I noticed llama.cpp running on my M1 was flushing the model from memory during and after each generation, causing slower-than-expected outputs.
This can be fixed by passing the `--mlock` argument, which massively boosts Mac M1 performance by locking the model in memory.
LlamaChat currently has the same issue, and I believe it can be fixed by passing that same `--mlock` option. In fact, I suggest leaving it ON by default for a seamless beginner experience on M1 Macs.
Please also consider an advanced setting that lets users change these parameters.
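For background on why `--mlock` helps: it pins the model's pages in physical RAM so the OS can't evict them between generations. A minimal Python sketch of the underlying POSIX `mlock(2)` call — the buffer and names below are illustrative stand-ins, not llama.cpp's or LlamaChat's actual code:

```python
# Sketch of what llama.cpp's --mlock flag does under the hood: pin pages
# in physical RAM via POSIX mlock(2) so the kernel can't page them out.
import ctypes

libc = ctypes.CDLL(None, use_errno=True)  # libc symbols on Unix

def lock_buffer(buf: bytearray) -> bool:
    """Pin buf's pages in RAM; returns True on success."""
    addr = (ctypes.c_char * len(buf)).from_buffer(buf)
    if libc.mlock(addr, len(buf)) != 0:
        # Commonly fails with ENOMEM/EPERM when RLIMIT_MEMLOCK is small --
        # one reason an app might want this off by default.
        return False
    libc.munlock(addr, len(buf))
    return True

weights = bytearray(16 * 1024)  # 16 KiB stand-in for model weights
print("locked" if lock_buffer(weights) else "mlock failed (resource limit?)")
```

Note that `mlock` is subject to `RLIMIT_MEMLOCK`, which is why locking a multi-gigabyte model can fail on some configurations even when it succeeds for small buffers.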
Thanks @xISSAx. You're right, LlamaChat always sets the mlock parameter to false, since mmap loading was touted as a big performance improvement over the previous versions (which for large models I think is true).
I need to do some more investigation into this, but I was definitely thinking of adding a switch for it. Perhaps you're right that it should be enabled by default for a good FTUE, but configurable if people need it.
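On the mmap side of the tradeoff: mmap'd weights are paged in lazily and remain evictable, which is exactly what mlock counteracts. A small self-contained sketch of the two loading strategies — the temp file here is a stand-in for a model file, not the real loader:

```python
import mmap
import os
import tempfile

# Create a small stand-in "model file".
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 65536)
os.close(fd)

# Option A: full load -- the entire file is copied into process
# memory up front, so generation never waits on disk.
with open(path, "rb") as f:
    full = f.read()

# Option B: mmap -- pages are faulted in only when touched, and the
# kernel may evict them later; llama.cpp pairs this with mlock to pin
# them when the user asks for it.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first = mm[0]  # touching a byte faults its page in
    mm.close()

os.remove(path)
print(len(full), first)
```

The switch being discussed is essentially a choice between these two strategies, plus whether to pin the result.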
alexrozanski changed the title from "Mac M1 Memory Flush - Llama cpp" to "Support configuring whether to load the entire model into memory or use mmap" on Apr 17, 2023