Features/refactor2 by liutongxuan · Pull Request #1 · liutongxuan/xllm

liutongxuan · 2026-04-28T13:47:08Z


Introduce LLM/VLM/DIT/REC model-input param group types plus a ModelInput wrapper and ModelInputFactory to build model-specific payload partitions from legacy ModelInputParams without modifying the legacy struct.

Add default create_model_input hooks to CausalLM, CausalVLM, and DiTModel so model-specific input partitions are created by model type, and add tests that verify these class-level creation paths.

Keep CausalLM::create_model_input focused on LLM input only, and let RecCausalLM attach rec payload explicitly to match the model abstraction split.

Switch ModelInput and ModelInputParamBundle partitions to optional value semantics to avoid shared_ptr allocations, and standardize conversion helpers around make_*_from_legacy APIs while keeping model-specific input assembly behavior unchanged. Made-with: Cursor
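A minimal sketch of the optional-value layout described above, for readers who do not have the diff open. The field names (`kv_seq_lens`, `mm_embeddings`) and the two-partition shape are assumptions for illustration only; the real structs carry many more fields and also include DiT and Rec partitions.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for the legacy struct (fields are assumed).
struct ModelInputParams {
  std::vector<int> kv_seq_lens;      // assumed LLM-side field
  std::vector<float> mm_embeddings;  // assumed VLM-side field
};

// Hypothetical typed partitions.
struct LLMModelInputParams {
  std::vector<int> kv_seq_lens;
};
struct VLMModelInputParams {
  std::vector<float> mm_embeddings;
};

// Partitions are stored by value inside std::optional: an absent partition
// needs no heap allocation, and a present one lives inline in ModelInput.
struct ModelInput {
  std::optional<LLMModelInputParams> llm;
  std::optional<VLMModelInputParams> vlm;
};

inline LLMModelInputParams make_llm_model_input_params_from_legacy(
    const ModelInputParams& legacy) {
  return LLMModelInputParams{legacy.kv_seq_lens};
}

inline ModelInput make_model_input_from_legacy(const ModelInputParams& legacy) {
  ModelInput input;
  input.llm = make_llm_model_input_params_from_legacy(legacy);
  // Assumed rule for this sketch: attach the VLM partition only when the
  // legacy struct actually carries multimodal data.
  if (!legacy.mm_embeddings.empty()) {
    input.vlm = VLMModelInputParams{legacy.mm_embeddings};
  }
  return input;
}
```

Compared with `std::shared_ptr` partitions, the optional form trades shared ownership for value semantics, so copying a `ModelInput` copies its partitions.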

Introduce rvalue overloads for model-input and param-bundle legacy conversion helpers so callers can transfer large tensors/vectors with std::move and reduce copy overhead in hot conversion paths.
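The copy and move overload pair can be sketched as below. The single `embeddings` field is a hypothetical placeholder for the large tensors and vectors mentioned above; only the overload shape is meant to match the change.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins with one large payload field each.
struct ModelInputParams {
  std::vector<float> embeddings;
};
struct LLMModelInputParams {
  std::vector<float> embeddings;
};

// Lvalue overload: the caller keeps its legacy params, so the payload is copied.
inline LLMModelInputParams make_llm_model_input_params_from_legacy(
    const ModelInputParams& legacy) {
  return LLMModelInputParams{legacy.embeddings};
}

// Rvalue overload: the caller gives up the legacy params, so the payload
// buffer is transferred instead of copied.
inline LLMModelInputParams make_llm_model_input_params_from_legacy(
    ModelInputParams&& legacy) {
  return LLMModelInputParams{std::move(legacy.embeddings)};
}
```

A caller opts into the cheap path with `make_llm_model_input_params_from_legacy(std::move(params))` and must not read `params` afterwards.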

Unify executor and graph fallback flows on typed ModelInput while preserving legacy compatibility adapters. Add ACL/MLU/CUDA regression coverage to assert typed run output matches legacy run behavior. Made-with: Cursor

Add rvalue create_model_input overloads on model base classes and route VLM executor's post-processing path through model-level typed input creation to reduce redundant legacy conversions. Extend model_input_factory tests to keep moved-params behavior aligned with existing semantics. Made-with: Cursor

Add rvalue overloads for apply_*_to_legacy and ModelInput run/forward entries across model and runtime layers, and route BaseExecutorImpl/VlmExecutorImpl to move-based typed-to-legacy adapters. Extend factory tests to cover the new move-based apply path. Made-with: Cursor

Hoist the typed-forward helper duplicated in acl/mlu/cuda graph executors into a single shared inline definition under runtime/executor_impl.h, and add an rvalue overload to enable move-based typed forward at fallback paths. Made-with: Cursor

Add has_typed_forward traits and route CausalLMImpl/CausalVLMImpl/RecCausalLMImpl typed forward overloads to the underlying model when it implements them, so models can opt into typed ModelInput forward without changing the runtime adapters. Cover the trait detection with compile-time assertions. Made-with: Cursor
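For reference, one common way to write such a trait is the C++17 detection idiom with `std::void_t`. This is a sketch under that assumption, not necessarily how `has_typed_forward` is implemented in the PR; the model types and `dispatch_forward` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical minimal stand-ins.
struct ModelInputParams { int tokens = 0; };
struct ModelInput { int tokens = 0; };

// Primary template: assume the model has no typed forward.
template <typename Model, typename = void>
struct has_typed_forward : std::false_type {};

// Selected only when `model.forward(const ModelInput&)` is well-formed.
template <typename Model>
struct has_typed_forward<
    Model,
    std::void_t<decltype(std::declval<Model&>().forward(
        std::declval<const ModelInput&>()))>> : std::true_type {};

struct LegacyOnlyModel {
  int forward(const ModelInputParams& p) { return p.tokens; }
};
struct TypedModel {
  int forward(const ModelInputParams& p) { return p.tokens; }
  int forward(const ModelInput& in) { return in.tokens + 1000; }
};

// Wrapper-side dispatch: use the typed overload when the model opts in,
// otherwise lower to the legacy struct first.
template <typename Model>
int dispatch_forward(Model& model, const ModelInput& input) {
  if constexpr (has_typed_forward<Model>::value) {
    return model.forward(input);
  } else {
    ModelInputParams legacy;
    legacy.tokens = input.tokens;
    return model.forward(legacy);
  }
}

// Compile-time coverage of the trait itself.
static_assert(!has_typed_forward<LegacyOnlyModel>::value);
static_assert(has_typed_forward<TypedModel>::value);
```

The `+ 1000` offset exists only so a test can tell which overload ran.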

Add typed ModelInput forward overloads on LlmForCausalLMImplBase (common and NPU variants). The body currently unwraps the LLM partition into a legacy ModelInputParams as a Step 3 entry point, which lets CausalLMImpl skip its base typed-to-legacy adapter and dispatch directly into the LLM base for future deeper migration. Made-with: Cursor

Mirror the LlmForCausalLMImplBase typed forward entry on RecForCausalLMImplBase, Qwen3HybridForCausalLMImplBase, and the NPU MtpForCausalLMImplBase so that Rec/Hybrid/MTP families all opt into typed forward dispatch and let CausalLMImpl/RecCausalLMImpl skip the base typed-to-legacy adapter. Made-with: Cursor

Replace the ModelInputParamBundle relay inside apply_model_input_to_legacy and make_model_input_from_legacy with direct per-partition apply/move calls. This saves one copy of all four optional partitions per typed-to-legacy or legacy-to-typed conversion on the hot path. Made-with: Cursor

Remove the forward_with_typed_input helper used by graph eager fallbacks and stop pre-converting params to typed input inside BaseExecutorImpl::run(legacy). The model chain still consumes ModelInputParams natively, so the round trip only added one allocation, one apply_to_legacy and one extra virtual dispatch per call. Typed run/forward entries remain intact for callers that explicitly construct ModelInput. Made-with: Cursor

Acl/Mlu/Cuda graph executors override run(const ModelInput&) and run(ModelInput&&) with the same body as the ExecutorImpl base default (apply_to_legacy + virtual dispatch into run(legacy)). Remove those duplicate overrides and rely on the base default; the typed entry behavior stays identical and the per-executor surface shrinks. Made-with: Cursor

ModelInputFactory was a pass-through wrapper around make_model_input_from_legacy / apply_model_input_to_legacy and the per-model create_model_input methods. Production code has migrated to the direct APIs, leaving the wrapper used only by its own tests. Remove the header/source, fold the still-relevant assertions into the renamed model_input_test.cpp, and drop the redundant Create-For-* coverage already exercised by the model-class tests. Made-with: Cursor

Add CHECK(input.llm.has_value()) to the typed forward overrides on LlmForCausalLMImplBase (common and NPU), RecForCausalLMImplBase, Qwen3HybridForCausalLMImplBase, and the NPU MtpForCausalLMImplBase. This is the first real read of a typed partition inside the model bases; misconstructed ModelInput values now fail fast at the LLM-family entry instead of producing default-valued legacy params downstream. Made-with: Cursor

Replace apply_model_input_to_legacy with explicit per-partition apply calls inside the typed forward bodies of LlmForCausalLMImplBase (common and NPU), Qwen3HybridForCausalLMImplBase, the NPU MtpForCausalLMImplBase, and RecForCausalLMImplBase. Each family now applies only the partitions it actually consumes (LLM always, VLM/Rec when relevant) and skips the DiT branch entirely. Made-with: Cursor
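The fail-fast check plus per-partition lowering could look roughly like the sketch below. The helper names `apply_llm_to_legacy`, `apply_vlm_to_legacy`, and `lower_for_llm_family`, and all field names, are hypothetical. The PR uses `CHECK`, which aborts; this sketch throws instead so the failure can be observed in a test.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins; the real structs have many more fields.
struct ModelInputParams {
  std::vector<int> kv_seq_lens;
  std::vector<float> mm_embeddings;
  int dit_steps = 0;
};
struct LLMModelInputParams { std::vector<int> kv_seq_lens; };
struct VLMModelInputParams { std::vector<float> mm_embeddings; };
struct DiTModelInputParams { int dit_steps = 0; };

struct ModelInput {
  std::optional<LLMModelInputParams> llm;
  std::optional<VLMModelInputParams> vlm;
  std::optional<DiTModelInputParams> dit;
};

inline void apply_llm_to_legacy(const LLMModelInputParams& llm,
                                ModelInputParams& legacy) {
  legacy.kv_seq_lens = llm.kv_seq_lens;
}
inline void apply_vlm_to_legacy(const VLMModelInputParams& vlm,
                                ModelInputParams& legacy) {
  legacy.mm_embeddings = vlm.mm_embeddings;
}

// Typed-forward prologue for an LLM-family base (sketch).
inline ModelInputParams lower_for_llm_family(const ModelInput& input) {
  // Stand-in for CHECK(input.llm.has_value()): reject misconstructed input
  // here instead of passing default-valued legacy params downstream.
  if (!input.llm.has_value()) {
    throw std::invalid_argument("ModelInput is missing the LLM partition");
  }
  ModelInputParams legacy;
  apply_llm_to_legacy(*input.llm, legacy);    // always consumed
  if (input.vlm.has_value()) {
    apply_vlm_to_legacy(*input.vlm, legacy);  // only when present
  }
  // The DiT partition is deliberately not applied for LLM-family models.
  return legacy;
}
```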

Add a header comment on ModelInput that documents the typed/legacy coexistence, how trait-based dispatch picks up model typed forward overrides, and the minimal opt-in step for new models. Made-with: Cursor

Promote the CausalLMImpl typed forward dispatch coverage from compile-time static_asserts to runtime tests by giving the helper holders real backing impls and verifying that const& and && typed forward overloads on the wrapper hit the model's typed forward when supported, and fall back to the legacy path otherwise. Made-with: Cursor

Add typed ModelInput forward overloads to every VL conditional generation impl (Qwen2/2.5/3 VL, Qwen3 VL Moe, Oxygen VLM, GLM-4V/4V-Moe, MiniCPMV across common and NPU paths). Pure delegate models pass typed input straight to the opted-in language model, while Qwen3 VL variants lower the typed input into a legacy ModelInputParams to preserve their get_deep_stacks pre-processing. MM embedding specializations inherit the typed forward through the CausalVLM/CausalLM chain and need no change. Made-with: Cursor

Construct ModelInput from the legacy ModelInputParams at the worker boundary and call model_executor_->forward with the typed entry. Existing pre-forward mutations (e.g. layer_synchronizer in PUSH mode for LLMWorkerImpl) still run before the typed conversion, so behavior is preserved while production now exercises the typed dispatch chain end-to-end for these worker types. Made-with: Cursor

Migrate the six runtime_.executor->forward call sites in RecWorkerImpl (single forward, encoder/decoder prefill split, decode-only path, and the multi-round loop) to construct ModelInput at the boundary and call the typed forward overload. The multi-round loop keeps a const-ref copy because prepare_round_input_* mutates mutable_input.input_params for the next iteration; other sites move from local ModelInputParams since they are single-use. Made-with: Cursor
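The copy-versus-move choice at the call sites can be illustrated as follows. Everything here is a simplified assumption: `forward` stands in for the executor call, `prepare_next_round` for the `prepare_round_input_*` helpers, and `positions` for the real payload.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins.
struct ModelInputParams { std::vector<int> positions; };
struct ModelInput { std::vector<int> positions; };

inline ModelInput make_model_input_from_legacy(const ModelInputParams& legacy) {
  return ModelInput{legacy.positions};  // copy
}
inline ModelInput make_model_input_from_legacy(ModelInputParams&& legacy) {
  return ModelInput{std::move(legacy.positions)};  // move
}

// Stand-in for the typed executor forward: returns the sum of positions.
inline int forward(const ModelInput& input) {
  int sum = 0;
  for (int p : input.positions) sum += p;
  return sum;
}

// Stand-in for prepare_round_input_*: mutates the legacy params in place.
inline void prepare_next_round(ModelInputParams& params) {
  for (int& p : params.positions) ++p;
}

// Single-use site: the local params are never read again, so move them.
inline int run_single(ModelInputParams params) {
  return forward(make_model_input_from_legacy(std::move(params)));
}

// Multi-round site: params are mutated for the next iteration, so each
// round builds its typed input from a copy.
inline std::vector<int> run_multi_round(ModelInputParams params, int rounds) {
  std::vector<int> outputs;
  for (int r = 0; r < rounds; ++r) {
    outputs.push_back(forward(make_model_input_from_legacy(params)));
    prepare_next_round(params);
  }
  return outputs;
}
```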

Make typed forward(ModelInput const&) and forward(ModelInput&&) the canonical pure-virtual entries on CausalLM/RecCausalLM/CausalVLM, while demoting the legacy ModelInputParams overload to a non-virtual compatibility wrapper. Made-with: Cursor
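A sketch of the resulting interface shape, with the typed overloads as the virtual entries and the legacy overload as a thin non-virtual wrapper. Return types and fields are simplified assumptions, and `ToyCausalLM` is a hypothetical model used only to exercise the base class.

```cpp
#include <cassert>
#include <utility>

// Hypothetical minimal stand-ins.
struct ModelInputParams { int tokens = 0; };
struct ModelInput { int tokens = 0; };

inline ModelInput make_model_input_from_legacy(const ModelInputParams& legacy) {
  ModelInput input;
  input.tokens = legacy.tokens;
  return input;
}

class CausalLM {
 public:
  virtual ~CausalLM() = default;

  // Canonical entries: every model must implement the typed overloads.
  virtual int forward(const ModelInput& input) = 0;
  virtual int forward(ModelInput&& input) = 0;

  // Compatibility wrapper: non-virtual, lowers legacy params to typed input
  // and dispatches into the rvalue typed overload.
  int forward(const ModelInputParams& legacy) {
    return forward(make_model_input_from_legacy(legacy));
  }
};

class ToyCausalLM : public CausalLM {
 public:
  // Overriding forward hides the base overload set; re-expose the wrapper.
  using CausalLM::forward;

  int forward(const ModelInput& input) override { return input.tokens; }
  int forward(ModelInput&& input) override {
    const ModelInput& ref = input;  // reuse the const& implementation
    return forward(ref);
  }
};
```

Because the wrapper is non-virtual, a derived model cannot accidentally override the legacy path and diverge from the typed one.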

Promote run(const ModelInput&) and run(ModelInput&&) to the only virtual entrypoints on ExecutorImpl and downgrade ModelInputParams overloads to non-virtual compatibility wrappers in ExecutorImpl/Executor. Made-with: Cursor

Make AttentionMetadataBuilder and dp_utils consume LLMModelInputParams as their primary interface, while keeping legacy ModelInputParams overloads as compatibility wrappers that lower through make_llm_model_input_params_from_legacy. Made-with: Cursor

Switch Qwen2DecoderLayer, Qwen3MoeDecoderLayer, and FusedMoE (common/MLU/ILU) to consume LLMModelInputParams as the primary forward-path input while keeping ModelInputParams overloads as compatibility adapters. Made-with: Cursor

… paths. Continue replacing legacy ModelInputParams access in workers, graph executors, and model forward adapters so typed ModelInput becomes the primary runtime contract while keeping compatibility boundaries localized. Made-with: Cursor

liutongxuan added 28 commits April 28, 2026 10:19

refactor: add model input param groups and factory.

0ef0b89

Introduce LLM/VLM/DIT/REC model-input param group types plus a ModelInput wrapper and ModelInputFactory to build model-specific payload partitions from legacy ModelInputParams without modifying the legacy struct.

refactor: let model classes create typed model input.

7dced75

Add default create_model_input hooks to CausalLM, CausalVLM, and DiTModel so model-specific input partitions are created by model type, and add tests that verify these class-level creation paths.

refactor: move rec model-input assembly into RecCausalLM.

4dc2130

Keep CausalLM::create_model_input focused on LLM input only, and let RecCausalLM attach rec payload explicitly to match the model abstraction split.

refactor: add move-based model-input conversion overloads.

7c25ba5

Introduce rvalue overloads for model-input and param-bundle legacy conversion helpers so callers can transfer large tensors/vectors with std::move and reduce copy overhead in hot conversion paths.

refactor: close typed model-input path in runtime executors.

e9f6e91

Unify executor and graph fallback flows on typed ModelInput while preserving legacy compatibility adapters. Add ACL/MLU/CUDA regression coverage to assert typed run output matches legacy run behavior. Made-with: Cursor

docs: explain ModelInput migration entry points.

9aeb368

Add a header comment on ModelInput that documents the typed/legacy coexistence, how trait-based dispatch picks up model typed forward overrides, and the minimal opt-in step for new models. Made-with: Cursor

liutongxuan wants to merge 28 commits into main from features/refactor2
