Add PaCMAP and LocalMAP dimensionality reduction algorithms #49
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ports the Python PaCMAP implementation to JavaScript, following the existing UMAP/TriMap stylistic patterns. PaCMAP uses three explicit pair types (NN, MN, FP) with a dynamic three-phase weight schedule and Adam optimization. LocalMAP extends PaCMAP with embedding-space FP resampling in phase 3 for sharper local cluster separation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
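To make the "dynamic three-phase weight schedule" concrete, here is a minimal sketch of how the three pair weights might change over iterations. The function name `pacmapWeights`, its option names, and the constants (initial MN weight of 1000, phases of 100 iterations each) are assumptions based on the defaults commonly described for the Python reference, not values taken from this PR's code.

```javascript
// Sketch of a PaCMAP-style weight schedule (assumed constants, hypothetical names).
// Phase 1: mid-near (MN) weight decays linearly from wMnInit to 3.
// Phase 2: MN and NN weights are both held at 3.
// Phase 3: MN pairs are switched off; only NN attraction and FP repulsion remain.
function pacmapWeights(iter, { wMnInit = 1000, phase1Iters = 100, phase2Iters = 100 } = {}) {
  if (iter < phase1Iters) {
    const t = iter / phase1Iters; // progress through phase 1, in [0, 1)
    return { wMN: (1 - t) * wMnInit + t * 3, wNN: 2, wFP: 1 };
  }
  if (iter < phase1Iters + phase2Iters) {
    return { wMN: 3, wNN: 3, wFP: 1 };
  }
  return { wMN: 0, wNN: 1, wFP: 1 };
}
```

Under these assumptions, phase 3 begins at iteration 200, which is where LocalMAP's FP resampling would take over.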
Updates all doc pages that list or reference UMAP as a representative DR algorithm to also include PaCMAP and LocalMAP, and adds a cross-reference from the UMAP page to its newer alternatives. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ah shoot, the CLAUDE.md was accidentally committed here. If you dislike that, I can remove it (it just helps people using these tools save tokens). Anyhow, I'm very happy to discuss, or to learn that such PRs are not welcome. I've been using these tools for my own side projects, but I'm new to how etiquette is developing for contributors, and this is my first go at that :)
Three bugs caused the embedding to collapse at iteration 200:
1. The local adjustment was on FP gradient weights, but should be on NN gradients: scale by nn_scale/sqrt(d_ij) to amplify attraction for already-close NN pairs and dampen it for far ones.
2. FP pairs were rebuilt every iteration via BallTree KNN on the embedding, targeting the closest non-neighbors (mostly intra-cluster) with amplified repulsion. The reference uses random candidates filtered by a distance threshold, resampled only every 10 iterations.
3. The FP gradient used a distance-based weight scaling; the reference uses the standard unscaled repulsive gradient.
Reference: https://github.com/hanxiao/mlx-vis/blob/main/mlx_vis/_pacmap/pacmap.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
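The corrected FP behaviour described in this commit (random candidates, a distance filter, a refresh every 10 iterations) could look roughly like the sketch below. Both helpers are hypothetical and not taken from the PR. The sketch assumes that the filter keeps candidates closer than the threshold in the embedding, that known nearest neighbours are excluded, and that the 10-iteration cadence is counted from the start of phase 3.

```javascript
// Hypothetical helper: refresh FP pairs only every `every` iterations in phase 3.
function shouldResampleFP(iter, phase3Start, every = 10) {
  return iter >= phase3Start && (iter - phase3Start) % every === 0;
}

// Hypothetical helper: draw random candidates for point i and keep those that
// are not nearest neighbours of i and lie within `threshold` (Euclidean
// distance) of i in the embedding Y. Gives up after `maxTries` draws.
function sampleNearbyFP(Y, i, nnSet, nFP, threshold, rng = Math.random, maxTries = 100) {
  const picked = new Set();
  for (let t = 0; t < maxTries && picked.size < nFP; t++) {
    const j = Math.floor(rng() * Y.length);
    if (j === i || nnSet.has(j) || picked.has(j)) continue;
    let d2 = 0; // squared embedding distance between i and j
    for (let k = 0; k < Y[i].length; k++) {
      const diff = Y[i][k] - Y[j][k];
      d2 += diff * diff;
    }
    if (d2 < threshold * threshold) picked.add(j);
  }
  return [...picked];
}
```

Random sampling with a distance filter avoids the failure mode described in item 2, where a KNN query on the embedding always returned the very closest non-neighbours.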
The canonical implementation (YingfanWang/PaCMAP) uses a strictly-greater-than condition to enter the local NN gradient phase, meaning the first phase 3 step still runs the standard PaCMAP gradient. The second reference (hanxiao/mlx-vis) switches immediately at phase 3 start, matching our current behaviour. Added a TODO comment explaining the difference, the two conflicting references, and the suggested fix to investigate empirically. Reference: https://github.com/YingfanWang/PaCMAP/blob/master/source/pacmap/pacmap.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
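The difference between the two references comes down to a single comparison operator. The sketch below contrasts the two entry conditions as described in this commit message. The function names and the `phase3Start` parameter are hypothetical, and `phase3Start` is assumed to be the index of the first phase 3 iteration.

```javascript
// Strictly-greater-than entry, as attributed above to YingfanWang/PaCMAP:
// the first phase 3 step still uses the standard PaCMAP gradient.
function usesLocalGradientStrict(iter, phase3Start) {
  return iter > phase3Start;
}

// Immediate entry, as attributed above to hanxiao/mlx-vis (current behaviour):
// the local NN gradient is used from the first phase 3 step.
function usesLocalGradientImmediate(iter, phase3Start) {
  return iter >= phase3Start;
}

// Iterations in [from, to) where the two conditions disagree.
function disagreements(from, to, phase3Start) {
  const out = [];
  for (let iter = from; iter < to; iter++) {
    if (usesLocalGradientStrict(iter, phase3Start) !== usesLocalGradientImmediate(iter, phase3Start)) {
      out.push(iter);
    }
  }
  return out;
}
```

Under these assumptions the two variants differ on exactly one optimisation step, which is why the effect would need to be checked empirically.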
LocalMAP fix: The phase 3 implementation was broken — the embedding would collapse into a structureless ball at iteration 200. Three bugs were identified by comparing against the canonical reference (YingfanWang/PaCMAP) and a second reference (hanxiao/mlx-vis):
Also noted a minor discrepancy between the two references on exactly when the local NN gradient kicks in (see TODO comment in 🤖 Generated with Claude Code (~200 words of PR comment from ~15 words of human prompt)
Video of the LocalMAP algorithm working in an app I'm building: https://imgur.com/kA3muZv.mp4
Replaces hard-coded BallTree with a knn_backend parameter accepting "annoy" (default, 20 trees to match reference's tree.build(20)) or "hnsw", matching the canonical PaCMAP reference implementation. LocalMAP inherits the change via super.init(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pass complete parameter objects to Annoy (numTrees: 20, maxPointsPerLeaf: 10) and HNSW (all required fields) to satisfy strict TypeScript typedefs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When input has >100 dimensions and apply_pca=true (default), reduce to 100 dims via PCA before KNN search, MN pair sampling, and Y initialization — matching the reference implementation's TruncatedSVD preprocessing path. For D<=100, behaviour is unchanged. LocalMAP inherits this via super.init(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
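The gating rule in this commit can be summarised in a few lines. The helper name `preprocessingPlan` and the `pcaDims` option are hypothetical; only the `apply_pca` flag and the 100-dimension cutoff come from the commit message.

```javascript
// Hypothetical helper: decide whether to reduce with PCA before KNN search,
// MN pair sampling, and Y initialization, and report the working dimensionality.
function preprocessingPlan(D, { apply_pca = true, pcaDims = 100 } = {}) {
  const reduce = apply_pca && D > pcaDims; // strictly more than 100 input dims
  return { reduce, workingDims: reduce ? pcaDims : D };
}
```

For example, a 784-dimensional input would be reduced to 100 dimensions by default, while a 100-dimensional input would pass through unchanged.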
Allows callers to override KNN backend defaults (e.g. numTrees for Annoy, m/ef for HNSW) via a knn_params object that is spread into the constructor. Rebuild dist. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
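A rough sketch of how backend defaults and caller overrides might be merged with object spread is shown below. `KNN_DEFAULTS` and `resolveKnnParams` are hypothetical names. The Annoy values (numTrees: 20, maxPointsPerLeaf: 10) come from the commits above, while the HNSW values are placeholders chosen for illustration and are not taken from the PR.

```javascript
// Hypothetical defaults table. The HNSW numbers are illustrative placeholders.
const KNN_DEFAULTS = {
  annoy: { numTrees: 20, maxPointsPerLeaf: 10 },
  hnsw: { m: 16, ef: 50 },
};

// Merge caller-supplied knn_params over the defaults for the chosen backend.
// Later spreads win, so any key in knn_params overrides the default value.
function resolveKnnParams(knn_backend = "annoy", knn_params = {}) {
  const defaults = KNN_DEFAULTS[knn_backend];
  if (!defaults) throw new Error(`Unknown knn_backend: ${knn_backend}`);
  return { ...defaults, ...knn_params };
}
```

Spreading the defaults first keeps the parameter object complete, which matters when the backend's typedefs require every field to be present.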
Closes #48
Summary
PaCMAP (Pairwise Controlled Manifold Approximation Projection) and LocalMAP as new DR algorithms, following the existing UMAP/TriMap stylistic patterns
Reference
Test plan
PaCMAP and LocalMAP unit tests pass across Node and all three browser environments
npm run build succeeds with no new type errors
🤖 Generated with Claude Code (~200 words of PR description from ~230 words of human prompts across this session)