Add PaCMAP and LocalMAP dimensionality reduction algorithms#49

Open
patcon wants to merge 11 commits into saehm:master from patcon:add-pacmap-support

patcon commented Apr 14, 2026

Closes #48

Summary

  • Implements PaCMAP (Pairwise Controlled Manifold Approximation Projection) and LocalMAP as new DR algorithms, following the existing UMAP/TriMap stylistic patterns
  • PaCMAP uses three explicit pair types (nearest-neighbor, mid-near, further) with a dynamic three-phase weight schedule and Adam optimization; initialized via PCA
  • LocalMAP extends PaCMAP with embedding-space FP pair resampling in phase 3 for sharper local cluster separation
  • Both algorithms are exported from the main package, are covered by 17 new tests (9 for PaCMAP, 8 for LocalMAP) with full coverage across Node, Chromium, Firefox, and WebKit via Playwright, and include VitePress doc pages with live IRIS visualizations
  • All 1460 existing tests continue to pass
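
For readers unfamiliar with the "dynamic three-phase weight schedule" mentioned in the summary, here is a minimal sketch of how such a schedule can look. The function name, the default of 1000 for the initial mid-near weight, and the phase lengths of 100/100/250 iterations are assumptions based on my recollection of the Python reference, not code taken from this PR.

```javascript
// Hypothetical helper: returns the pair-type weights for a given iteration.
// Assumed defaults: wMNInit = 1000, phases of 100, 100 and 250 iterations.
function pacmapWeights(iter, { wMNInit = 1000, phaseIters = [100, 100, 250] } = {}) {
    const [p1, p2] = phaseIters;
    if (iter < p1) {
        // Phase 1: mid-near weight decays linearly from wMNInit towards 3.
        const t = iter / p1;
        return { wMN: (1 - t) * wMNInit + t * 3, wNN: 2, wFP: 1 };
    }
    if (iter < p1 + p2) {
        // Phase 2: mid-near weight held low, neighbor attraction raised.
        return { wMN: 3, wNN: 3, wFP: 1 };
    }
    // Phase 3: mid-near pairs switched off, local structure refined.
    return { wMN: 0, wNN: 1, wFP: 1 };
}
```

With these assumed defaults, phase 3 begins at iteration 200, which is also the point where LocalMAP changes its behaviour.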

Reference

Test plan

  • PaCMAP and LocalMAP unit tests pass across Node and all three browser environments
  • Full test suite (1460 tests) passes with no regressions
  • npm run build succeeds with no new type errors
  • Doc pages render live IRIS scatter plots via the showcase grid

🤖 Generated with Claude Code (~200 words of PR description from ~230 words of human prompts across this session)

patcon and others added 4 commits April 13, 2026 21:50
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ports the Python PaCMAP implementation to JavaScript, following the
existing UMAP/TriMap stylistic patterns. PaCMAP uses three explicit
pair types (NN, MN, FP) with a dynamic three-phase weight schedule
and Adam optimization. LocalMAP extends PaCMAP with embedding-space
FP resampling in phase 3 for sharper local cluster separation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updates all doc pages that list or reference UMAP as a representative
DR algorithm to also include PaCMAP and LocalMAP, and adds a cross-
reference from the UMAP page to its newer alternatives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
patcon commented Apr 14, 2026

Author

Ah shoot, the CLAUDE.md was accidentally committed here, but if you dislike that, I can remove it (it just helps people using these tools to save tokens)

Anyhow, I'm very happy to discuss or learn that such PRs are not welcome -- I've been using them for my own side-projects, but am new to how etiquette is developing for contributors, and this is my first go at that :)

patcon and others added 3 commits May 15, 2026 21:01
Three bugs caused the embedding to collapse at iteration 200:

1. The local adjustment was on FP gradient weights, but should be on NN
   gradients — scale by nn_scale/sqrt(d_ij) to amplify attraction for
   already-close NN pairs and dampen it for far ones.

2. FP pairs were rebuilt every iteration via BallTree KNN on the embedding,
   targeting the closest non-neighbors (mostly intra-cluster) with amplified
   repulsion. Reference uses random candidates filtered by distance threshold,
   resampled only every 10 iterations.

3. FP gradient used a distance-based weight scaling; reference uses the
   standard unscaled repulsive gradient.
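
To make fix 1 concrete, the following is a rough sketch of an NN-pair gradient with the optional local scaling. It assumes the PaCMAP convention that d_ij is one plus the squared embedding distance and that the unscaled coefficient is w_NN * 20 / (10 + d_ij)^2; both are my recollection of the reference, and `nnPairGradient` and `nnScale` are hypothetical names rather than the identifiers used in this PR.

```javascript
// Hypothetical sketch: gradient of the NN (attractive) loss with respect to yi.
// Assumption: d = 1 + squared embedding distance, as in the PaCMAP reference.
function nnPairGradient(yi, yj, wNN, nnScale = null) {
    const diff = yi.map((v, k) => v - yj[k]);
    const d = 1 + diff.reduce((s, v) => s + v * v, 0);
    // Standard PaCMAP coefficient for neighbor pairs.
    let w = (wNN * 20) / (10 + d) ** 2;
    // Local adjustment: multiply by nnScale / sqrt(d), which is larger for
    // close pairs and smaller for distant ones.
    if (nnScale !== null) w *= nnScale / Math.sqrt(d);
    // Gradient descent subtracts this vector from yi, pulling it towards yj.
    return diff.map((v) => w * v);
}
```

Passing `null` for `nnScale` gives the unscaled gradient, so the same function can serve phases 1 and 2.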

Reference: https://github.com/hanxiao/mlx-vis/blob/main/mlx_vis/_pacmap/pacmap.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The canonical implementation (YingfanWang/PaCMAP) uses a strictly-greater-than
condition to enter the local NN gradient phase, meaning the first phase 3 step
still runs the standard PaCMAP gradient. The second reference (hanxiao/mlx-vis)
switches immediately at phase 3 start, matching our current behaviour.

Added a TODO comment explaining the difference, the two conflicting references,
and the suggested fix to investigate empirically.
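
The difference between the two references amounts to a single comparison operator. A tiny sketch, with a hypothetical function name and a hypothetical `phase3Start` parameter standing in for whatever quantity the implementations actually compare against:

```javascript
// Hypothetical sketch of the two entry conditions for the local NN gradient.
// strict = true  : first phase 3 step still uses the standard gradient.
// strict = false : local gradient is used from the first phase 3 step.
function usesLocalNNGradient(iter, phase3Start, strict = false) {
    return strict ? iter > phase3Start : iter >= phase3Start;
}
```

The two variants disagree on exactly one iteration, so any empirical difference would have to come from that single step.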

Reference: https://github.com/YingfanWang/PaCMAP/blob/master/source/pacmap/pacmap.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
patcon commented May 16, 2026

Author

LocalMAP fix: The phase 3 implementation was broken — the embedding would collapse into a structureless ball at iteration 200. Three bugs were identified by comparing against the canonical reference (YingfanWang/PaCMAP) and a second reference (hanxiao/mlx-vis):

  1. Wrong gradient target — the local scaling (nn_scale / √d_ij) should be applied to the NN gradient, not the FP gradient weight.
  2. Wrong FP pair selection — FP pairs were rebuilt every iteration via BallTree KNN on the embedding, which specifically targeted intra-cluster pairs for repulsion and destroyed cluster structure. The reference uses random candidates filtered by distance threshold.
  3. Wrong resampling frequency — FP pairs should only be resampled every 10 iterations (and not at the very first phase 3 step), not every iteration.
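
Points 2 and 3 can be sketched roughly as below. All names are hypothetical, and because the thread does not spell out the exact threshold test, the filter is left as a caller-supplied `accept` predicate rather than hard-coded.

```javascript
// Hypothetical sketch: resample only every `interval` iterations of phase 3,
// and never on the very first phase 3 step.
function shouldResampleFP(iter, phase3Start, interval = 10) {
    const step = iter - phase3Start;
    return step > 0 && step % interval === 0;
}

// Hypothetical sketch: draw random candidates for point i and keep those whose
// embedding distance passes `accept`. `excluded` holds indices that must not
// be used (for example the NN partners of i). Gives up after a bounded number
// of tries, so it may return fewer than nFP indices.
function sampleFarPairs(Y, i, nFP, accept, excluded = [], rng = Math.random) {
    const skip = new Set(excluded);
    skip.add(i);
    const picked = [];
    const maxTries = 100 * nFP;
    for (let t = 0; t < maxTries && picked.length < nFP; t++) {
        const j = Math.floor(rng() * Y.length);
        if (skip.has(j)) continue;
        const dist = Math.hypot(...Y[i].map((v, k) => v - Y[j][k]));
        if (accept(dist)) {
            picked.push(j);
            skip.add(j); // avoid picking the same partner twice
        }
    }
    return picked;
}
```

Unlike a KNN query on the embedding, random sampling does not systematically select the closest non-neighbors, which is the behaviour point 2 identifies as destroying cluster structure.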

Also noted a minor discrepancy between the two references on exactly when the local NN gradient kicks in (see TODO comment in LocalMAP.js).

🤖 Generated with Claude Code (~200 words of PR comment from ~15 words of human prompt)

patcon commented May 16, 2026

Author

video of localmap algo working in an app I'm building: https://imgur.com/kA3muZv.mp4

[Image: Screenshot 2026-05-15 at 11 53 18 PM]

patcon and others added 4 commits May 20, 2026 01:48
Replaces hard-coded BallTree with a knn_backend parameter accepting
"annoy" (default, 20 trees to match reference's tree.build(20)) or
"hnsw", matching the canonical PaCMAP reference implementation.
LocalMAP inherits the change via super.init().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pass complete parameter objects to Annoy (numTrees: 20, maxPointsPerLeaf: 10)
and HNSW (all required fields) to satisfy strict TypeScript typedefs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When input has >100 dimensions and apply_pca=true (default), reduce to
100 dims via PCA before KNN search, MN pair sampling, and Y initialization
— matching the reference implementation's TruncatedSVD preprocessing path.
For D<=100, behaviour is unchanged. LocalMAP inherits this via super.init().
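
The decision rule described here is small enough to sketch directly. The helper name and return shape are hypothetical; only the threshold of 100 and the apply_pca default come from the commit message.

```javascript
// Hypothetical sketch: decide whether to reduce dimensionality before KNN
// search, MN pair sampling and Y initialization.
function planPreprocessing(D, apply_pca = true, maxDims = 100) {
    const reduce = apply_pca && D > maxDims;
    return { reduce, dims: reduce ? maxDims : D };
}
```

Inputs with exactly 100 dimensions are left untouched, matching the "D<=100" wording above.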

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Allows callers to override KNN backend defaults (e.g. numTrees for Annoy,
m/ef for HNSW) via a knn_params object that is spread into the constructor.
Rebuild dist.
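
As an illustration of how knn_backend and knn_params might combine, here is a sketch that merges caller overrides into backend defaults via object spread. The function and the defaults table are hypothetical: the Annoy defaults are the values quoted in the commits above, while the HNSW defaults are left empty because the thread does not list them. It deliberately stops short of constructing the actual index classes.

```javascript
// Hypothetical defaults table; HNSW fields are not given in this thread.
const KNN_DEFAULTS = {
    annoy: { numTrees: 20, maxPointsPerLeaf: 10 },
    hnsw: {},
};

// Hypothetical helper: validate the backend name and merge overrides so that
// caller-supplied knn_params win over the defaults.
function resolveKnnOptions(knn_backend = "annoy", knn_params = {}) {
    if (!Object.prototype.hasOwnProperty.call(KNN_DEFAULTS, knn_backend)) {
        throw new Error(`Unknown knn_backend: ${knn_backend}`);
    }
    return {
        backend: knn_backend,
        params: { ...KNN_DEFAULTS[knn_backend], ...knn_params },
    };
}
```

Because the overrides are spread last, a partial object such as `{ numTrees: 50 }` changes only that field and keeps the remaining defaults.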

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Adding support for PaCMAP/LocalMAP

1 participant