Add Windows arm64 support to official builds #5712

dhiltgen · 2024-07-15T23:18:15Z

Wire up CI and build rigging to generate a unified Windows installer with x64 and arm64 payloads. At install time, the correct binaries will be installed for the platform.

I was unable to find a combination of hand-picked msvc redist DLLs manually that yielded a working setup on a pristine Windows 11 install, but running the vc_redist installer works reliably, so for arm64, we run the nested installer conditionally. If it is already installed, that step will be skipped.

Fixes #2589

Note: I've tested most of the CI steps in the PR, but signing isn't yet verified and might require minor fixes on the first release after this merges.

Resulting build artifacts: (Note: current OllamaSetup.exe with only x64 binaries is 273MB)

% ls -lh dist/
total 932M
-rw-r--r-- 1 daniel 197609  12K Jul 17 09:24 ollama_welcome.ps1
-rwxr-xr-x 1 daniel 197609 291M Jul 17 09:27 OllamaSetup.exe*
-rw-r--r-- 1 daniel 197609 649M Jul 17 09:27 ollama-windows-amd64.zip
-rw-r--r-- 1 daniel 197609  20M Jul 19 15:41 ollama-windows-arm64.zip
drwxr-xr-x 1 daniel 197609    0 Jul 17 09:24 windows-amd64/
-rwxr-xr-x 1 daniel 197609 5.9M Jul 17 09:24 windows-amd64-app.exe*
drwxr-xr-x 1 daniel 197609    0 Jul 16 15:53 windows-arm64/
-rwxr-xr-x 1 daniel 197609 5.5M Jul 16 16:12 windows-arm64-app.exe*
% du -sh dist/windows-a*64
2.1G    dist/windows-amd64
37M     dist/windows-arm64

On a Snapdragon X 12-core laptop:

> ollama run --verbose llama3 why is the sky blue
...
total duration:       23.6819409s
load duration:        4.738127s
prompt eval count:    16 token(s)
prompt eval duration: 430.297ms
prompt eval rate:     37.18 tokens/s
eval count:           348 token(s)
eval duration:        18.513796s
eval rate:            18.80 tokens/s

dhiltgen · 2024-08-02T23:24:28Z

Thanks @AndreasKunar ! I've adjusted to use the clang compiler so we can target armv8.7-a

I've switched to draft until I can test this on more permutations to make sure this new binary is generally compatible with arm64 systems the prior build worked on.

AndreasKunar · 2024-08-03T12:47:16Z

@dhiltgen I tested your current changes on Windows for ARM on a Surface 11 Pro base model, and they require MinGW, gcc, which are not available for arm. For ARM we use clang(LLVM) - best installed via Visual Studio Community Edition 2022 (note clang on Windows requires the MSVC backend), which also can install CMake, Git, ... you then only need to additionally install Go via winget install --id=GoLang.Go -e. The same toolset can also be used for Windows x64 - but there the MinGW alternative is available.

My build (in a Developer Powershell for VS 2022):

$env:CGO_ENABLED="1"
go generate ./...        -> generates error
go build .

Build- Error (sorry for the German Windows messages):
Checking for MinGW...
get-command : Die Benennung "gcc" wurde nicht als Name eines Cmdlet, einer Funktion, einer Skriptdatei oder eines
ausführbaren Programms erkannt. Überprüfen Sie die Schreibweise des Namens, oder ob der Pfad korrekt ist (sofern
enthalten), und wiederholen Sie den Vorgang.
In C:\Users\andre\Projects\ollama\llm\generate\gen_windows.ps1:189 Zeichen:9

```
    get-command gcc
```
```
    ~~~~~~~~~~~~~~~
```
- CategoryInfo : ObjectNotFound: (gcc:String) [Get-Command], CommandNotFoundException
- FullyQualifiedErrorId : CommandNotFoundException,Microsoft.PowerShell.Commands.GetCommandCommand

Also please be aware that armv8.7-a might not work correctly for Macs, running Windows on ARM in Parallels. Currently, there probably are more Mac/Parallels/Windows on ARM users than the new Copilot+PCs (hopefully changing in the future). I will try and test both, since I'm mainly using Macs, only have one Snapdragon X/Windows machine.

Note: There also is a problem with current ollama and the new, fast Q4_0_4_8 quantization acceleration of modern ARM CPUs. ollama PR #6126 tries to address this and I'm trying to test/verify that it's running correctly.

dhiltgen · 2024-08-06T20:54:42Z

Windows arm64 tools do seem to be a bit challenging, but I found it via msys2 via the mingw-w64-clang-aarch64-gcc-compat package

get-command gcc

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     gcc.exe                                            0.0.0.0    C:\msys64\clangarm64\bin\gcc.exe

I haven't decided, but I'm contemplating moving this to a cross-compilation model instead of trying to build native though.

AndreasKunar · 2024-08-07T09:49:24Z

Windows arm64 tools do seem to be a bit challenging, but I found it via msys2 via the mingw-w64-clang-aarch64-gcc-compat package
get-command gcc

CommandType     Name                                               Version    Source
-----------     ----                                               -------    ------
Application     gcc.exe                                            0.0.0.0    C:\msys64\clangarm64\bin\gcc.exe     
I haven't decided, but I'm contemplating moving this to a cross-compilation model instead of trying to build native though.

Since you need MSVC (clang on Windows normally depends on the MSVC backend), I still suggest best to directly install CMake via the Visual Studio 2022 Community Edition/... Installer. You can also directly install CMake and Git with it in just one installation step. Just search for these components (Note: Clang is 2 components, the Clang compiler and the MSBuild support vor LLVM (clang-cl) toolset). And this works for native and cross-compile, you could then use different Developer powersehell/command-prompt settings.

AndreasKunar · 2024-08-14T07:10:25Z

I copied your changed files into the current ollama version. When trying to go generate ./... with Visual Studio 2022 installed clang (LLVM and not MinGW), it still generates the following error:

Checking for MinGW...
get-command : The term 'gcc' is not recognized as the name of a cmdlet, function, script file, or operable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At C:\Users\andi\Projects\ollama\llm\generate\gen_windows.ps1:189 char:9
+         get-command gcc
+         ~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (gcc:String) [Get-Command], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException,Microsoft.PowerShell.Commands.GetCommandCommand

llm\generate\generate_windows.go:3: running "powershell": exit status 1

I'm currently on Windows 11 for ARM, running on a M2 MacBook Air in Parallels because I'm moving from my Surface Pro to a Surface Laptop and have not yet got my new machine.

dhiltgen · 2024-08-14T16:52:44Z

@AndreasKunar thanks for giving it a shot. I'm still tinkering to find the best combination, but you might want to try commenting out that check and try setting something like $env:CC="clang"; $env:CXX="clang++" to see if Go can pick up clang instead of gcc.

AndreasKunar · 2024-08-14T21:37:58Z

@AndreasKunar thanks for giving it a shot. I'm still tinkering to find the best combination, but you might want to try commenting out that check and try setting something like $env:CC="clang"; $env:CXX="clang++" to see if Go can pick up clang instead of gcc.

Sorry, does not work for me. Tested in powershell developer prompt and via set-commands in developer command prompt.

In line 189 in lllm\generate\gen_windows.ps1 it checks for gcc with get-command gcc and fails. It would also fail with the following line get-command mingw32-make. mingw32 and gcc are not (yet?) available for WoA, its clang LLVM.

hmartinez82 · 2024-08-15T18:04:41Z

@AndreasKunar if you install the gcc-compat MSYS2 package then you don't have to do anything (gcc becomes an alias to clang, and so on). Did you try building with these instructions? #5268 (See https://github.com/hmartinez82/ollama/blob/win_arm64_docs/docs/development.md#windows-on-arm-arm64)

I even get the installer working (the installer itself will still be non ARM64, but all the binaries inside it will be).

@AndreasKunar I'm waiting for Go to be able to detect the presence of IMM8 support by the CPU at runtime (this should be in the next release of Go https://go-review.googlesource.com/c/sys/+/595678). Once that's done, I want to create a new PR here that will have two builds of the arm64 llama.cpp binary (one armv8.2 and one armv8.7) that can be chosen at runtime just like cpu, cpu_avx, and cpu_avx2 are in x64 builds.

AndreasKunar · 2024-08-15T18:51:37Z

@AndreasKunar if you install the gcc-compat MSYS2 package then you don't have to do anything (gcc becomes an alias to clang, and so on). Did you try building with these instructions? #5268 (See https://github.com/hmartinez82/ollama/blob/win_arm64_docs/docs/development.md#windows-on-arm-arm64)

I even get the installer working (the installer itself will still be non ARM64, but all the binaries inside it will be).

Yes, I build natively on Windows on ARM (WoA), and with nearly all arm64 toolsets. But no, like with the recommendations for building llama.cpp (which I understand ollama uses/compiles), I use Visuals Studio 2022 and its included clang (LLVM), CMake installation as a fully released / supported compiler for WoA (see the build recommendations for llama.cpp on WoA), not the preliminary MSYS2 arm64 support.

So my bad, sorry! I will try and build it with these instructions, but see if I still can use my llama.cpp tooling (e.g. adding aliases to my installed clang) with just adding go.

@AndreasKunar I'm waiting for Go to be able to detect the presence of IMM8 support by the CPU at runtime (this should be in the next release of Go https://go-review.googlesource.com/c/sys/+/595678). Once that's done, I want to create a new PR here that will have two builds of the arm64 llama.cpp binary (one armv8.2 and one armv8.7) that can be chosen at runtime just like cpu, cpu_avx, and cpu_avx2 are in x64 builds.

Cool, thanks a lot!!! For Snapdragon X, llama.cpp's new Q4_0_4_4/Q4_0_4_8 quantization provides an up to 2-3x acceleration (for the compute-bound prompt-processing tasks, token-generation is mostly memory-bandwidth constrained).

hmartinez82 · 2024-08-15T21:40:27Z

@AndreasKunar #6379

No runtime dynamic detection yet of course. But that's the blunt of the changes.

@dhiltgen Can you try these steps ? https://github.com/hmartinez82/ollama/blob/win_arm64_docs/docs/development.md#windows-on-arm-arm64
With those steps, I can build an ARM64 installer without any changes to build_windows.ps1

hmartinez82 · 2024-08-15T21:53:06Z

I just realized that I repeated a bunch of things you were already doing here @dhiltgen :(

hmartinez82 · 2024-08-24T08:05:05Z

@dhiltgen I'm going ahead and close #6379 since your PR is much more comprehensive (it includes installer, etc).

This adjusts the installer payloads to be architecture aware so we can cary both amd64 and arm64 binaries in the installer, and install only the applicable architecture at install time.

This test seems to be a bit flaky on windows, so give it more time to converge

dhiltgen force-pushed the win_arm branch 29 times, most recently from 490b56e to 0bc2731 Compare July 17, 2024 15:47

dhiltgen had a problem deploying to release July 17, 2024 15:47 — with GitHub Actions Failure

dhiltgen force-pushed the win_arm branch from 9576006 to f528ea8 Compare August 2, 2024 23:19

dhiltgen marked this pull request as draft August 2, 2024 23:20

dhiltgen force-pushed the win_arm branch from f528ea8 to b69d39f Compare August 2, 2024 23:51

dhiltgen force-pushed the win_arm branch from b69d39f to bba2139 Compare August 13, 2024 22:43

dhiltgen force-pushed the win_arm branch from bba2139 to 64b2bae Compare August 23, 2024 22:54

hmartinez82 mentioned this pull request Aug 24, 2024

Separate ARM64 CPU builds from x64 CPU builds and use Clang instead #6379

Closed

dhiltgen force-pushed the win_arm branch 5 times, most recently from 6e04ca7 to ab0363f Compare September 3, 2024 18:43

dhiltgen force-pushed the win_arm branch from ab0363f to cee3d92 Compare September 5, 2024 19:07

dhiltgen force-pushed the win_arm branch 2 times, most recently from f3d3677 to 4c66be1 Compare September 13, 2024 22:25

dhiltgen added 3 commits September 17, 2024 11:45

Unified arm/x86 windows installer

43d2355

This adjusts the installer payloads to be architecture aware so we can cary both amd64 and arm64 binaries in the installer, and install only the applicable architecture at install time.

Include arm64 in official windows build

83e8496

Harden schedule test for slow windows timers

554d612

This test seems to be a bit flaky on windows, so give it more time to converge

dhiltgen force-pushed the win_arm branch from 4c66be1 to 554d612 Compare September 17, 2024 18:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Windows arm64 support to official builds #5712

Add Windows arm64 support to official builds #5712

dhiltgen commented Jul 15, 2024 •

edited

Loading

dhiltgen commented Aug 2, 2024

AndreasKunar commented Aug 3, 2024 •

edited

Loading

dhiltgen commented Aug 6, 2024 •

edited

Loading

AndreasKunar commented Aug 7, 2024 •

edited

Loading

AndreasKunar commented Aug 14, 2024

dhiltgen commented Aug 14, 2024

AndreasKunar commented Aug 14, 2024

hmartinez82 commented Aug 15, 2024 •

edited

Loading

AndreasKunar commented Aug 15, 2024

hmartinez82 commented Aug 15, 2024 •

edited

Loading

hmartinez82 commented Aug 15, 2024

hmartinez82 commented Aug 24, 2024

Add Windows arm64 support to official builds #5712

Are you sure you want to change the base?

Add Windows arm64 support to official builds #5712

Conversation

dhiltgen commented Jul 15, 2024 • edited Loading

dhiltgen commented Aug 2, 2024

AndreasKunar commented Aug 3, 2024 • edited Loading

dhiltgen commented Aug 6, 2024 • edited Loading

AndreasKunar commented Aug 7, 2024 • edited Loading

AndreasKunar commented Aug 14, 2024

dhiltgen commented Aug 14, 2024

AndreasKunar commented Aug 14, 2024

hmartinez82 commented Aug 15, 2024 • edited Loading

AndreasKunar commented Aug 15, 2024

hmartinez82 commented Aug 15, 2024 • edited Loading

hmartinez82 commented Aug 15, 2024

hmartinez82 commented Aug 24, 2024

dhiltgen commented Jul 15, 2024 •

edited

Loading

AndreasKunar commented Aug 3, 2024 •

edited

Loading

dhiltgen commented Aug 6, 2024 •

edited

Loading

AndreasKunar commented Aug 7, 2024 •

edited

Loading

hmartinez82 commented Aug 15, 2024 •

edited

Loading

hmartinez82 commented Aug 15, 2024 •

edited

Loading