Add Windows arm64 support to official builds #5712

Merged
merged 3 commits into from
Sep 20, 2024

dhiltgen
Collaborator

@dhiltgen dhiltgen commented Jul 15, 2024

Wire up CI and build rigging to generate a unified Windows installer with x64 and arm64 payloads. At install time, the correct binaries will be installed for the platform.
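
To illustrate the install-time selection described above, here is a minimal Go sketch that maps a CPU architecture to one of the two payload directories. The directory names come from the `dist/` listing in this description; the `payloadDir` helper is hypothetical, and the real selection happens inside the Inno Setup script rather than in Go.

```go
package main

import (
	"fmt"
	"runtime"
)

// payloadDir is a hypothetical helper that maps a Go architecture name to the
// payload directory carrying binaries for that architecture.
func payloadDir(goarch string) (string, error) {
	switch goarch {
	case "amd64":
		return "windows-amd64", nil
	case "arm64":
		return "windows-arm64", nil
	}
	return "", fmt.Errorf("unsupported architecture %q", goarch)
}

func main() {
	// runtime.GOARCH is the architecture this program was compiled for.
	dir, err := payloadDir(runtime.GOARCH)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("payload directory:", dir)
}
```

Returning an error for unknown architectures keeps the sketch from silently falling back to the wrong payload.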

I was unable to find a hand-picked combination of MSVC redist DLLs that yielded a working setup on a pristine Windows 11 install, but running the vc_redist installer works reliably, so for arm64 we run the nested installer conditionally. If the redistributable is already installed, that step is skipped.

Fixes #2589

Note: I've tested most of the CI steps in the PR, but signing isn't yet verified and might require minor fixes on the first release after this merges.

Resulting build artifacts: (Note: current OllamaSetup.exe with only x64 binaries is 273MB)

% ls -lh dist/
total 932M
-rw-r--r-- 1 daniel 197609  12K Jul 17 09:24 ollama_welcome.ps1
-rwxr-xr-x 1 daniel 197609 291M Jul 17 09:27 OllamaSetup.exe*
-rw-r--r-- 1 daniel 197609 649M Jul 17 09:27 ollama-windows-amd64.zip
-rw-r--r-- 1 daniel 197609  20M Jul 19 15:41 ollama-windows-arm64.zip
drwxr-xr-x 1 daniel 197609    0 Jul 17 09:24 windows-amd64/
-rwxr-xr-x 1 daniel 197609 5.9M Jul 17 09:24 windows-amd64-app.exe*
drwxr-xr-x 1 daniel 197609    0 Jul 16 15:53 windows-arm64/
-rwxr-xr-x 1 daniel 197609 5.5M Jul 16 16:12 windows-arm64-app.exe*
% du -sh dist/windows-a*64
2.1G    dist/windows-amd64
37M     dist/windows-arm64

On a Snapdragon X 12-core laptop:

> ollama run --verbose llama3 why is the sky blue
...
total duration:       23.6819409s
load duration:        4.738127s
prompt eval count:    16 token(s)
prompt eval duration: 430.297ms
prompt eval rate:     37.18 tokens/s
eval count:           348 token(s)
eval duration:        18.513796s
eval rate:            18.80 tokens/s

@dhiltgen
Collaborator Author

@AndreasKunar thanks for giving it a shot. I'm still tinkering to find the best combination, but you might want to try commenting out that check and try setting something like $env:CC="clang"; $env:CXX="clang++" to see if Go can pick up clang instead of gcc.

@AndreasKunar

> @AndreasKunar thanks for giving it a shot. I'm still tinkering to find the best combination, but you might want to try commenting out that check and try setting something like $env:CC="clang"; $env:CXX="clang++" to see if Go can pick up clang instead of gcc.

Sorry, does not work for me. Tested in powershell developer prompt and via set-commands in developer command prompt.

Line 189 of llm\generate\gen_windows.ps1 checks for gcc with get-command gcc and fails. The following line, get-command mingw32-make, would fail as well. mingw32 and gcc are not (yet?) available for WoA; the toolchain there is clang (LLVM).

@hmartinez82
Contributor

hmartinez82 commented Aug 15, 2024

@AndreasKunar if you install the gcc-compat MSYS2 package then you don't have to do anything (gcc becomes an alias to clang, and so on). Did you try building with these instructions? #5268 (See https://github.com/hmartinez82/ollama/blob/win_arm64_docs/docs/development.md#windows-on-arm-arm64)

I even get the installer working (the installer itself will still be non ARM64, but all the binaries inside it will be).

@AndreasKunar I'm waiting for Go to be able to detect the presence of I8MM support by the CPU at runtime (this should be in the next release of Go https://go-review.googlesource.com/c/sys/+/595678). Once that's done, I want to create a new PR here that will have two builds of the arm64 llama.cpp binary (one armv8.2 and one armv8.7) that can be chosen at runtime just like cpu, cpu_avx, and cpu_avx2 are in x64 builds.
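
A rough Go sketch of the runtime choice proposed above, assuming feature detection is already available. The `armFeatures` struct stands in for whatever detection API Go ends up exposing, and the variant names are hypothetical labels modelled on the existing cpu, cpu_avx, cpu_avx2 naming.

```go
package main

import "fmt"

// armFeatures is a hypothetical stand-in for runtime CPU feature detection.
type armFeatures struct {
	// HasInt8MatMul reports whether the CPU supports the 8-bit integer
	// matrix-multiply instructions discussed above.
	HasInt8MatMul bool
}

// pickVariant returns the most capable build the CPU can run. The variant
// names are made up for this example.
func pickVariant(f armFeatures) string {
	if f.HasInt8MatMul {
		return "cpu_armv8.7"
	}
	return "cpu_armv8.2"
}

func main() {
	fmt.Println(pickVariant(armFeatures{HasInt8MatMul: true}))
	fmt.Println(pickVariant(armFeatures{HasInt8MatMul: false}))
}
```

Falling back to the armv8.2 build by default means a CPU without the extra instructions never loads a binary it cannot execute.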

@AndreasKunar

> @AndreasKunar if you install the gcc-compat MSYS2 package then you don't have to do anything (gcc becomes an alias to clang, and so on). Did you try building with these instructions? #5268 (See https://github.com/hmartinez82/ollama/blob/win_arm64_docs/docs/development.md#windows-on-arm-arm64)
>
> I even get the installer working (the installer itself will still be non ARM64, but all the binaries inside it will be).

Yes, I build natively on Windows on ARM (WoA), and with nearly all arm64 toolsets. But no: as with the recommendations for building llama.cpp (which I understand ollama uses/compiles), I use Visual Studio 2022 with its included clang (LLVM) and CMake installation as a fully released and supported compiler for WoA (see the build recommendations for llama.cpp on WoA), not the preliminary MSYS2 arm64 support.

So my bad, sorry! I will try to build it with these instructions, but I'll see whether I can still use my llama.cpp tooling (e.g. adding aliases to my installed clang) by just adding Go.

> @AndreasKunar I'm waiting for Go to be able to detect the presence of I8MM support by the CPU at runtime (this should be in the next release of Go https://go-review.googlesource.com/c/sys/+/595678). Once that's done, I want to create a new PR here that will have two builds of the arm64 llama.cpp binary (one armv8.2 and one armv8.7) that can be chosen at runtime just like cpu, cpu_avx, and cpu_avx2 are in x64 builds.

Cool, thanks a lot!!! For Snapdragon X, llama.cpp's new Q4_0_4_4/Q4_0_4_8 quantization provides up to a 2-3x speedup (for the compute-bound prompt-processing tasks; token generation is mostly memory-bandwidth constrained).

@hmartinez82
Contributor

hmartinez82 commented Aug 15, 2024

@AndreasKunar #6379

No runtime dynamic detection yet, of course. But that's the bulk of the changes.

@dhiltgen Can you try these steps? https://github.com/hmartinez82/ollama/blob/win_arm64_docs/docs/development.md#windows-on-arm-arm64
With those steps, I can build an ARM64 installer without any changes to build_windows.ps1.

@hmartinez82
Contributor

I just realized that I repeated a bunch of things you were already doing here @dhiltgen :(

@hmartinez82
Contributor

@dhiltgen I'm going to go ahead and close #6379 since your PR is much more comprehensive (it includes the installer, etc.).

@@ -28,8 +28,8 @@ AppPublisher={#MyAppPublisher}
 AppPublisherURL={#MyAppURL}
 AppSupportURL={#MyAppURL}
 AppUpdatesURL={#MyAppURL}
-ArchitecturesAllowed=x64 arm64
-ArchitecturesInstallIn64BitMode=x64 arm64
+ArchitecturesAllowed=x64compatible arm64
Member

Does this impact the x64 install?

Collaborator Author

@dhiltgen dhiltgen Sep 19, 2024

I noticed the following warning from Inno Setup during the build, calling out this deprecated value:

Warning: Architecture identifier "x64" is deprecated. Substituting "x64os", but note that "x64compatible" is preferred in most cases. See the "Architecture Identifiers" topic in help file for more information.

https://jrsoftware.org/ishelp/index.php?topic=setup_architecturesallowed

Member

@jmorganca jmorganca left a comment

Overall looks good, some small nit comments.

This adjusts the installer payloads to be architecture aware so we can carry
both amd64 and arm64 binaries in the installer, and install only the applicable
architecture at install time.
This test seems to be a bit flaky on windows, so give it more time to converge
@dhiltgen dhiltgen merged commit d632e23 into ollama:main Sep 20, 2024
15 checks passed
@dhiltgen dhiltgen deleted the win_arm branch September 20, 2024 20:09
Development

Successfully merging this pull request may close these issues.

Windows ARM support
4 participants