
Tags: zalid/dotLLM

v0.1.0-preview.3

Modify .NET tool installation command

Updated installation command to include --prerelease option.

v0.1.0-preview.2

Fix Windows AOT publish: use -p: not /p: in bash shell (kkokosa#119)

The publish-aot job's `dotnet publish /p:PublishAot=true` used a
slash-prefixed MSBuild property flag. On Windows runners with
`shell: bash` (Git Bash), MSYS path conversion interprets /p: as a
filesystem path, mangling the argument so MSBuild sees two projects
and fails with MSB1008.

The single-file job's `-p:PublishSingleFile=true` (dash prefix)
already worked correctly because dashes don't trigger MSYS conversion.
Match that convention: `/p:` → `-p:`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v0.1.0-preview.1

First public release: NuGet packages, release workflow, community files (kkokosa#120)

* First public release: NuGet packages, release workflow, community files (kkokosa#119)

Wires up everything needed for the first v0.1.0-preview.1 tag to produce
a coordinated public release: GitHub Release assets, NuGet.org packages,
and the surrounding repo polish expected on a public OSS project.

NuGet packaging
- Directory.Build.props: shared package metadata — PackageProjectUrl
  (https://dotllm.dev/), README, tags, symbols (.snupkg), Copyright
- MinVer 6.0.0 drives PackageVersion from git tags (prefix "v")
- Per-project <Description> on all 10 src libraries and DotLLM.Cli
- <IsPackable>false</IsPackable> on samples/* and benchmarks/*
- DotLLM.Cli exe rename ("dotllm") and publish cleanup now fire for
  both PublishAot=true and PublishSingleFile=true

Release workflow (.github/workflows/release.yml)
- Triggers on tags matching v*
- pack-nuget: dotnet pack produces .nupkg + .snupkg for all packable
  projects, uploads as artifact
- publish-single-file: self-contained single-file dotllm for win-x64,
  linux-x64, osx-arm64 (zip/tar.gz, bundled with ptx/, README, LICENSE)
- publish-aot: experimental Native AOT for win-x64 and linux-x64
  (continue-on-error so AOT failures don't block the release)
- push-nuget: pushes to nuget.org when NUGET_API_KEY secret is set
- create-release: softprops/action-gh-release with auto-generated notes
  and pre-release flag based on tag suffix

Documentation and discoverability
- README: Website badge and nav link (dotllm.dev), Getting Started split
  into "Use a pre-built release" and "Build from source", new NuGet
  Packages table, Author section linking to kokosa.dev, News entry
- CLAUDE.md: note the companion website repo (kkokosa/dotllm-page)
- CONTRIBUTING.md codifies the issue-driven workflow
- CODE_OF_CONDUCT.md (Contributor Covenant 2.1)
- SECURITY.md pointing at GitHub Security Advisories

Closes kkokosa#119

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Refactor DotLLM.Server to a proper library (kkokosa#119)

Problem
Microsoft.NET.Sdk.Web + OutputType=Exe defaulted IsPackable=false,
so DotLLM.Server was silently skipped by `dotnet pack` and the first
release workflow would have shipped 10 packages instead of 11.
Forcing IsPackable=true works but ships a library nupkg whose DLL
carries an unused Main method, which is semantically confusing for a
package whose whole purpose is to be referenced from consumer ASP.NET
hosts (as advertised on https://dotllm.dev/).

Change
- DotLLM.Server.csproj: OutputType=Library (override Web SDK's Exe
  default). IsPackable=true still needed because the Web SDK keeps
  IsPackable=false even for Library outputs.
- Delete src/DotLLM.Server/Program.cs — it was dead code. DotLLM.Cli's
  ServeCommand already calls ServerStartup.LoadModel/BuildApp and
  app.RunAsync directly in-process, so removing the Server's own Main
  breaks nothing.
- DotLLM.Cli still references DotLLM.Server via ProjectReference and
  `dotllm serve <model>` continues to work unchanged (verified:
  `dotllm serve --help` prints full options; full solution builds
  clean, 0 warnings / 0 errors).

Result
- DotLLM.Server.nupkg has a clean lib/net10.0/DotLLM.Server.dll with
  no Main, no runtimeconfig.json. Consumers can
  `dotnet add package DotLLM.Server` and host the OpenAI-compatible
  API inside their own ASP.NET Core process.
- The chat UI assets (index.html, app.js, app.css) ship via the Web
  SDK's static web asset system and are auto-wired in consumer apps.

Docs
README "Host the OpenAI API in your ASP.NET app" section added under
NuGet Packages, showing both embedding modes:
1. Dedicated dotLLM WebApplication via ServerStartup.LoadModel +
   BuildApp + app.RunAsync (matches the dotllm.dev snippet; see the
   sketch after this list).
2. Attach dotLLM routes to your own WebApplication via
   app.MapDotLLMEndpoints() — lives alongside your own routes,
   middleware, and services.
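
A minimal sketch of Mode 1 as described in item 1, assuming the
argument shapes (only the member names LoadModel, BuildApp, RunAsync,
and ServerOptions.Model appear in this PR's messages):

    // Mode 1 (illustrative sketch): dedicated dotLLM WebApplication.
    // The LoadModel argument shape is an assumption; only the method
    // names come from this commit message.
    var state = ServerStartup.LoadModel(new ServerOptions { Model = "SmolLM-135M" });
    var app = ServerStartup.BuildApp(state);
    await app.RunAsync();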

Verification
- `dotnet build src/DotLLM.Cli -c Release` — clean
- `dotllm serve --help` — full options print, in-process path OK
- `dotnet pack` loop over all 11 src projects with
  -p:MinVerVersionOverride=0.1.0-local.1 — produces 11 .nupkg +
  11 .snupkg, including DotLLM.Server with no Main in lib/net10.0/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix README embedding example: set required ServerOptions.Model (kkokosa#119)

The Mode 2 ("attach dotLLM routes to your own WebApplication") snippet
in README omitted the required Model init-setter on ServerOptions, so
the example didn't compile. Caught while running the local test flow.
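
For reference, the corrected construction looks roughly like the
following; only the required Model init-setter is named in this
message, and the value shown is a placeholder:

    // Sketch: Model is the required init-setter named by this fix;
    // the value is a placeholder, not a verified model identifier.
    var options = new ServerOptions { Model = "SmolLM-135M" };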

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add AddDotLLM service-registration helper for Mode 2 embedding (kkokosa#119)

Problem
Mode 2 of the embedding story (attach dotLLM routes to the consumer's
own WebApplication via MapDotLLMEndpoints) was documented in README but
not actually usable, because:

1. dotLLM endpoints bind ServerState through DI parameter injection, so
   state must be registered before builder.Build(). The original Mode 2
   example loaded state after Build(), producing a runtime "Failure to
   infer one or more parameters. state: UNKNOWN" error on the first
   request.

2. The source-generated ServerJsonContext required for AOT-safe JSON
   serialization is internal. Consumers literally cannot reference it
   from their Program.cs — they had no way to wire it into
   ConfigureHttpJsonOptions themselves, so even after registering state
   they would have hit serialization errors on the OpenAI-compatible
   DTOs.

Both pieces of DI setup live inside ServerStartup.BuildApp today (which
is the Mode 1 dedicated-app path). Mode 2 had no equivalent entry point.

Change
Add src/DotLLM.Server/ServiceCollectionExtensions.cs with a single
public helper:

    public static IServiceCollection AddDotLLM(
        this IServiceCollection services, ServerState state)

It registers the loaded ServerState as a singleton and inserts
ServerJsonContext.Default into the HTTP JSON type resolver chain,
mirroring what BuildApp does internally. ServerJsonContext stays
internal — it's referenced from inside the assembly where it's visible.
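
A sketch of how such a helper could be written, inferred purely from
the description above (the actual file may differ in details):

    // Sketch inferred from this description; not the verified source of
    // src/DotLLM.Server/ServiceCollectionExtensions.cs.
    using Microsoft.Extensions.DependencyInjection;

    namespace DotLLM.Server;

    public static class ServiceCollectionExtensions
    {
        public static IServiceCollection AddDotLLM(
            this IServiceCollection services, ServerState state)
        {
            // Expose the loaded model state to endpoint DI parameter binding.
            services.AddSingleton(state);

            // Wire the internal source-generated context into minimal-API
            // JSON handling, mirroring what BuildApp does internally.
            services.ConfigureHttpJsonOptions(options =>
                options.SerializerOptions.TypeInfoResolverChain
                    .Insert(0, ServerJsonContext.Default));

            return services;
        }
    }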

Consumers now write:

    builder.Services.AddDotLLM(state);
    // ... builder.Build(); app.MapDotLLMEndpoints(); ...

Docs
README Mode 2 example rewritten: loads the model up front, calls
builder.Services.AddDotLLM(state) before builder.Build(), then mounts
MapDotLLMEndpoints alongside the consumer's own routes.
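
In outline, the rewritten Mode 2 example follows this shape (a sketch;
the LoadModel argument form is an assumption, and the routes match the
ones verified below):

    // Mode 2 (sketch): attach dotLLM to the consumer's own WebApplication.
    var builder = WebApplication.CreateBuilder(args);

    // Load the model up front; the LoadModel argument shape is an assumption.
    var state = ServerStartup.LoadModel(new ServerOptions { Model = "SmolLM-135M" });
    builder.Services.AddDotLLM(state);      // must run before builder.Build()

    var app = builder.Build();
    app.MapGet("/hello", () => "hello");    // the consumer's own route
    app.MapDotLLMEndpoints();               // adds POST /v1/chat/completions etc.
    app.Run();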

Verified end-to-end locally: consumer ASP.NET Core app on port 5050
serves both its own GET /hello and dotLLM's POST /v1/chat/completions
from the same WebApplication, with SmolLM-135M loaded via
ServerStartup.LoadModel and a valid OpenAI-shaped completion response.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add plan-issue Claude Code skill

Scaffolding for the /plan-issue slash command: reads a GitHub issue,
gathers referenced docs, and enters plan mode to produce a concrete
implementation plan following the conventions in CLAUDE.md. Mirrors
the issue-driven workflow codified in CONTRIBUTING.md so anyone
cloning the repo can use the same planning flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* README: review pass — phase status, model pull flow, usage restructure (kkokosa#119)

Four fixes from a read-through against actual command behavior.

1. Status line was stale ("Phase 5 complete"). Updated to reflect
   Phase 6 complete (warm-up, AOT, paged KV-cache, speculative
   decoding) and Phase 7 in progress (logprobs landed).

2. Model download is NOT automatic. GgufFileResolver.Resolve prints
   "[red]Not found as file or downloaded model[/]" and returns null
   when the model isn't already under ~/.dotllm/models/. Every quick
   example now shows `dotllm model pull <repo>` before `run`/`chat`/
   `serve`, and the docs text no longer claims auto-download.

3. Restructured the middle of the README:

   Getting Started   → install paths only (pre-built release / source)
   Usage (new H2)    → Manage models, Run, Chat, Serve, CLI reference
   Development (new) → Debug build, Tests, Benchmarks, llama.cpp

   Serve was missing from the usage examples entirely — now shows
   starting the server, binding host/port, hybrid GPU offload,
   speculative decoding, and a curl example against the running API.
   Also points at the "Host the OpenAI API in your ASP.NET app" NuGet
   section for the embed-in-your-app story.

4. CLI option reference was a single table missing ~half the flags
   (--response-format, --schema, --pattern, --grammar, --tools,
   --paged, --cache-type-k/v, --cache-window, --speculative-*,
   --tool-choice, --no-prompt-cache, --prompt-cache-size, --verbose,
   all of serve's --host/--port/--no-ui/--no-browser/--warmup-*),
   and conflated default values across run/chat (128 vs 512 max-tokens,
   different prompt-cache defaults). Replaced with separate tables
   (Common, Sampling & constraints, run-only, chat-only, serve-only),
   plus a note about the -p (prompt vs port) and -s (seed vs system)
   short-flag collisions.

Cross-checked every option and default against RunCommand.cs,
ChatCommand.cs, ServeCommand.cs, and GgufFileResolver.cs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* README: fix speculative decoding examples to use a vocab-compatible pair (kkokosa#119)

Speculative decoding requires target and draft to share the same
tokenizer vocabulary (enforced at load time by
SpeculativeConstants.AreVocabsCompatible in RunCommand.cs:383). The
previous examples paired Llama-3.2-3B with SmolLM-135M, which have
entirely different vocabularies — dotLLM would reject the combo with
a "Draft model vocab size ... differs from target" error before any
generation could happen.
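
Illustratively, the guard amounts to a check of this shape (a
hypothetical sketch; the real logic is
SpeculativeConstants.AreVocabsCompatible, whose actual signature is not
shown in this message):

    // Hypothetical sketch only of the load-time vocab guard described above.
    static bool VocabsMatch(int targetVocabSize, int draftVocabSize)
        => targetVocabSize == draftVocabSize;   // both 128256 for the pair below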

Replace with the first validated pair from
scripts/test_models_speculative.py: Llama-3.2-3B-Instruct Q8_0 as
target and Llama-3.2-1B-Instruct Q4_K_M as draft, both 128256 tokens.
Add the required `dotllm model pull` steps so the run example works
end-to-end on first try.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add docs/icon.png for NuGet packages (kkokosa#119)

Rasterized from dotllm-page's assets/img/favicon.svg (the ·LLM
wordmark — dark indigo #1a1a2e rounded square, period in #6366f1,
"LLM" in white) to a 128x128 PNG at docs/icon.png.

Wired into Directory.Build.props: PackageIcon=icon.png plus a
conditional None Include / Pack=true item so every packable project
automatically bundles it at the package root. Verified against a
freshly packed DotLLM.Engine nupkg — the nuspec has
<icon>icon.png</icon> and the file is present at the root.

The icon now appears on NuGet.org package pages and in IDE package
explorers alongside the dotllm.dev project URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address PR review feedback (kkokosa#119)

Applies Gemini review suggestions from PR kkokosa#120:

1. Directory.Build.props: set ContinuousIntegrationBuild=true when
   GITHUB_ACTIONS=true, for deterministic builds and normalized source
   paths in embedded PDBs. Local builds are unaffected.

2. .github/workflows/release.yml: replace the manual actions/cache@v4
   step in pack-nuget with setup-dotnet@v4's built-in cache, and
   extend the same config to publish-single-file and publish-aot
   (which had no cache before). cache-dependency-path uses both
   '**/*.csproj' and '**/Directory.Packages.props' so the cache
   invalidates on both new PackageReferences and centralized version
   bumps.

3. .github/workflows/release.yml: add an explicit
   'Install AOT dependencies (Linux)' step before 'Publish (AOT)',
   guarded by `if: runner.os == 'Linux'`, running
   `sudo apt-get update && sudo apt-get install -y clang zlib1g-dev`.
   Defensive against future ubuntu-latest image drift.

4. README.md: add '(experimental)' next to 'Native AOT' in the Phase 6
   status line to match the `continue-on-error: true` framing of the
   AOT release job and the language used in the 'Use a pre-built
   release' section.

The CleanPublishOutput target in src/DotLLM.Cli/DotLLM.Cli.csproj is
intentionally kept as-is per the reviewer's 'no changes strictly
required' note — it stays as a defensive net for local dotnet publish
invocations without -p:DebugType=none.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>