feat: implementing token-highlighter worker, analyzer, demo #9
@asanth7 Thank you for the PR! Can you please prepare a notebook example of token-highlighter as well? A good reference is here: https://github.com/IBM/vLLM-Hook/blob/main/notebooks/demo_attntracker.ipynb (similar to your |
IRENEKO
left a comment
Thanks Arav for your PR revision! Please see the comments in the individual file for details.
| from vllm_hook_plugins import HookLLM | ||
| class HighlighterDemoLLM(HookLLM): |
This is a wrapper for highlighter-specific config/loading to avoid modifying hook_llm.py, but can be removed if we are able to edit that directly instead.
| highlighter = cfg.get("highlighter", {}) | ||
| os.environ["VLLM_HIGHLIGHTER_TARGET_PHRASE"] = highlighter.get( | ||
| "target_phrase", | ||
| "Sure! I can help with that.", |
Is this a duplicate of the config? I see you have already defined the target phrase there.
| "target_phrase", | ||
| "Sure! I can help with that.", | ||
| ) | ||
| os.environ["VLLM_HIGHLIGHTER_SCORER"] = highlighter.get( |
Can we put most (or, even better, all) of these settings into the config files, and add a description of each variable and its intended use? In the demo, we would just load everything from the config files and pass the values on as environment variables where needed.
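A minimal sketch of the config-driven approach suggested in this comment. It assumes the JSON config carries a `highlighter` block with `target_phrase` and `scorer` keys; the helper name `export_highlighter_env` and the key-to-variable mapping are hypothetical, and only the two `VLLM_HIGHLIGHTER_*` variable names come from the diff under review.

```python
import json
import os

# Hypothetical mapping from config keys to environment variables.
# Only the two variable names below appear in the diff under review.
ENV_KEYS = {
    "target_phrase": "VLLM_HIGHLIGHTER_TARGET_PHRASE",
    "scorer": "VLLM_HIGHLIGHTER_SCORER",
}


def export_highlighter_env(config_text, environ=None):
    """Copy the "highlighter" block of a JSON config into environment variables.

    No fallback defaults live in code, so the config file stays the single
    source of truth for every setting.
    """
    environ = os.environ if environ is None else environ
    highlighter = json.loads(config_text).get("highlighter", {})
    exported = {}
    for key, env_name in ENV_KEYS.items():
        if key in highlighter:
            environ[env_name] = str(highlighter[key])
            exported[env_name] = environ[env_name]
    return exported


# Usage: write into a plain dict here so the real process environment is untouched.
example = '{"highlighter": {"target_phrase": "Sure! I can help with that.", "scorer": "autograd"}}'
demo_env = {}
export_highlighter_env(example, demo_env)
```

Keeping the defaults out of the demo script avoids the duplication raised in the previous comment: a missing key is simply not exported, rather than being silently replaced by a hard-coded value.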
Should this be kept locally as well for your own reference if this is a debugging script?
| import torch | ||
| import torch.nn.functional as F | ||
| from transformers import AutoModelForCausalLM, AutoTokenizer |
Do we still need to load huggingface model in the worker?
| needs_hooks = wants_hs or wants_qk or wants_steer | ||
| # Token Highlighter writes its own artifacts inside the wrapped execute_model; | ||
| # it needs install_hooks but not the probe-style flush/get below. | ||
| needs_highlighter = bool(extra.get("highlighter_mode")) |
I am reluctant to open a flag for any specific use case... Is there a specific reason why you can't reuse needs_hooks?
I included the highlighter_mode check within needs_hooks so that we can reuse needs_hooks (I just added a wants_highlighter flag, as with the other workers) and followed a structure very similar to the steering worker. Token Highlighter handles its own hook artifacts (highlighter_activations.pt from the worker for score computation and highlighter.pt from the analyzer for mitigation), so it doesn't require the post-processing/flushing that the probe method uses.
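The reply above can be sketched roughly as follows. This is an illustration only: the `extra` keys other than `highlighter_mode` are hypothetical stand-ins, and the function name `hook_requirements` is not from the PR.

```python
def hook_requirements(extra):
    """Return (needs_hooks, needs_probe_flush) for a request's extra args.

    Only "highlighter_mode" is a key taken from the diff; the other three
    keys are hypothetical placeholders for the existing use cases.
    """
    wants_hs = bool(extra.get("wants_hs"))        # hypothetical key
    wants_qk = bool(extra.get("wants_qk"))        # hypothetical key
    wants_steer = bool(extra.get("wants_steer"))  # hypothetical key
    wants_highlighter = bool(extra.get("highlighter_mode"))

    # One shared flag decides whether install_hooks has to run at all.
    needs_hooks = wants_hs or wants_qk or wants_steer or wants_highlighter
    # Probe-style flush/get applies only to hidden-state and Q/K capture;
    # steering and the highlighter manage their own artifacts.
    needs_probe_flush = wants_hs or wants_qk
    return needs_hooks, needs_probe_flush
```

With this shape, a highlighter request installs hooks without triggering the probe post-processing, which is the behaviour the reply describes.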
Signed-off-by: aravs <aravsanthanam578@gmail.como> Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
Add forward_attr scorer alongside autograd: capture last-layer Q/K/V from real scheduler prefill, merge with teacher-forced suffix activations, and score via closed-form last-attention attribution. Shared helpers in TokenHighlighter/utils.py; worker handles capture timing, soft re-prefill, and VLLM_HIGHLIGHTER_SCORER switch. Docs and local/Colab notebooks include Spearman autograd vs forward_attr comparison and beta sweeps. Signed-off-by: aravs <aravsanthanam578@gmail.como> Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
Align examples/demo_token_highlighter.py with tok_grads_soft_hook, config-driven VLLM_HIGHLIGHTER_SCORER, and local snapshot tokenizer load. Add scorer field to Qwen2-1.5B-Instruct.json. Signed-off-by: aravs <aravsanthanam578@gmail.como> Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
- unify token highlighter capture/mitigate flow around explicit per-run IDs so artifacts and mitigation are tied to the same capture run
- harden worker/analyzer artifact lifecycle to reduce missing highlighter_activations races and improve analyzer fallback/trace handling
- strengthen forward_attr gradient approximation plumbing and add scorer validation utilities and derivation documentation
- upgrade local and Colab demos with paper-model presets, 12GB AWQ-friendly settings, GCG-focused prompt path, and clearer analysis/debug output
- consolidate Token Highlighter docs and notebook guidance so users can reproduce paper-style runs with fewer manual tweaks

Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
- Introduce support for highlighter mode in hook installation logic to accommodate token highlighter artifacts.
- Improve gradient influence computation by ensuring model configuration checks for attention heads.
- Refine handling of forward hooks in the highlighter worker to prevent orphaned references and ensure accurate gradient capture.
- Update documentation and comments for clarity on the interaction between highlighter mode and gradient calculations.

Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
…rker
- precompute the affirmation loss gradient (g_loss) in the worker at capture time so the analyzer no longer needs the full unembedding matrix; forward_attr now ships a ~MB activations artifact instead of a multi-hundred-MB W_U bundle and analyzes with no second model load
- make export_forward_attr_weights drop W_U by default (include_unembedding flag) and tolerate its absence end-to-end in grad_influence and the analyzer
- fix forward_attr query-path per-head contraction (einsum) and decoder-block input capture for vLLM's fused (positions, hidden_states, residual) layout
- remove the autograd last-block gating from the worker; the apples-to-apples last-block reference now lives entirely in examples/compare_token_highlighter_scorers.py via a standalone HF model and a retain_grad pre-hook
- add the scorer comparison harness and document forward_attr vs autograd validation (Spearman 0.93, Pearson 0.99) plus the in-pipeline efficiency rationale

Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
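To illustrate why precomputing g_loss can remove the need to ship W_U, here is a sketch under simplifying assumptions: the affirmation loss is taken to be the cross-entropy of a single target token, the logits are W_U h with no final normalization layer in between, and everything is written in plain Python lists. The function names are hypothetical and this is not the PR's implementation. Under those assumptions the gradient with respect to the final hidden state is W_U^T (softmax(W_U h) - onehot(target)), a vector of hidden size that can be stored in place of the matrix.

```python
import math


def logits_for(W_U, h):
    """Unembedding: one logit per vocabulary row of W_U."""
    return [sum(w * x for w, x in zip(row, h)) for row in W_U]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def affirmation_loss(W_U, h, target_id):
    """Cross-entropy of a single target token (simplified affirmation loss)."""
    return -math.log(softmax(logits_for(W_U, h))[target_id])


def affirmation_grad(W_U, h, target_id):
    """Closed-form dL/dh = W_U^T (softmax(W_U h) - onehot(target_id))."""
    probs = softmax(logits_for(W_U, h))
    dlogits = [p - (1.0 if v == target_id else 0.0) for v, p in enumerate(probs)]
    return [sum(W_U[v][d] * dlogits[v] for v in range(len(W_U)))
            for d in range(len(h))]


# Toy sizes: vocabulary of 3, hidden size of 2.
W_U = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
h = [0.3, -0.2]
g_loss = affirmation_grad(W_U, h, target_id=1)  # same length as h, not as W_U
```

The stored vector has one entry per hidden dimension, whereas W_U has one row per vocabulary entry, which is consistent with the size difference the commit message describes.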
Remove examples/compare_token_highlighter_scorers.py and .vscode/settings.json from the branch; both remain on disk locally and are gitignored so they are not re-committed to the PR. Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
… in worker Pass the highlighter block from JSON through HookLLM into extra_args (like steer), remove VLLM_HIGHLIGHTER_* env usage, and keep plugin install minimal: needs_hooks triggers collective_rpc install_hooks while probe post-RPCs stay HS/QK-only. Worker wraps execute_model at install, registers mitigate embedding hooks immediately, and installs forward-attr/RoPE capture hooks on first capture after config sync. Update analyzer, demos, notebooks, model JSONs, and document vLLM mixin/install_hooks/execute_model internals in TokenHighlighter.md. Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
… tooling Rewrite the Token Highlighter documentation as a formal technical report focused on the vLLM-Hook integration. Add pandoc/LaTeX config and a build script for PDF export. Extend gitignore to keep local scorer comparison, vscode settings, and generated PDF out of the PR. Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
Cite TokenHighlighter.pdf from TokenHighlighter.md, link both the markdown writeup and PDF from docs/use_cases/README.md per contributor guidelines, and remove local-only pandoc build tooling from the branch. Signed-off-by: asanth7 <aravsanthanam578@gmail.com>
Signed-off-by: Arav Santhanam <aravsanthanam578@gmail.com>
Integrate Spotlight from main while keeping Token Highlighter registrations in __init__.py and the use cases README table.
Signed-off-by: aravs <aravsanthanam578@gmail.como>
Please move this file to the respective folder following the PR template.
Not sure if we have touched on this topic, but is there a way to enable token highlighter without modifying hook_llm.py and _hook_plugin.py? I think the other use case https://github.com/IBM/vLLM-Hook/blob/main/docs/use_cases/spotlight.md faces similar challenges and was able to avoid modifying the core files. Can you please take a look?
Signed-off-by: aravs <aravsanthanam578@gmail.como>
Summary
See TokenHighlighter.md for more detailed information on structure, architecture, and contribution.

Type of contribution

Files modified
- vllm_hook_plugins/vllm_hook_plugins/workers/highlighter_worker.py (new worker)
- vllm_hook_plugins/vllm_hook_plugins/analyzers/highlighter_analyzer.py (new analyzer)
- examples/demo_token_highlighter.py (demo)
- TokenHighlighter.md (detailed description of contribution)
- vllm_hook_plugins/vllm_hook_plugins/__init__.py (registered worker/analyzer)
- model_configs/token_highlighter/Qwen2-1.5B-Instruct.json (model configs)

I have NOT modified hook_llm.py

Plugin architecture checklist
- PluginRegistry in __init__.py
- V1Worker (not HookLLM)
- hooks_on=(prefill, generate) flag is set correctly for any new worker registration

Testing
Tested with examples/demo_token_highlighter.py using a local Qwen2-1.5B-Instruct snapshot on a single GPU (tensor_parallel_size=1, max_model_len=512, NVIDIA RTX 5070, 8GB VRAM). For each prompt, the demo runs one hooked llm.generate() pass and one llm.analyze() pass, then prints applied driver tokens, optional analyzer-side driver tokens (if the analysis spec differs), and top tokens by score. I validated that scores are non-zero under the autograd scorer path, that soft-removal is applied to selected prompt tokens, and that the analyzer can be rerun with different specs without recomputing gradients.

Related issue
N/A
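The testing note above mentions soft-removal and rerunning the analyzer with different specs without recomputing gradients. A rough sketch of that idea follows, assuming per-token scores have already been captured and that soft-removal means scaling the embeddings of the highest-scoring tokens by a factor beta. The function name and the `top_k` parameter are hypothetical, not the PR's API.

```python
def soft_remove(embeddings, scores, top_k, beta):
    """Scale the embeddings of the top_k highest-scoring tokens by beta.

    Returns (driver_indices, new_embeddings). The scores are reused as-is,
    so changing top_k or beta needs no new gradient computation.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    drivers = set(ranked[:top_k])
    scaled = [
        [beta * x for x in vec] if i in drivers else list(vec)
        for i, vec in enumerate(embeddings)
    ]
    return sorted(drivers), scaled


# Usage: one set of cached scores, two different analysis specs.
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = [0.1, 0.9, 0.5]
drivers_a, emb_a = soft_remove(embeddings, scores, top_k=1, beta=0.5)
drivers_b, emb_b = soft_remove(embeddings, scores, top_k=2, beta=0.25)
```

Because only the selection and scaling step changes between the two calls, this mirrors the "different specs, same gradients" property the test description reports.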
Contribution acknowledgement
If this contribution is included in a future version of the vLLM-Hook technical report, would you like to be credited as a co-author?
If yes, please provide: