Summary
LstmLayer.evaluate (and evaluateSequence, which delegates to it) returns only the final hidden state regardless of the layer's OutputMode:
    [~,hiddenState,~] = lstm(x,h0,c0,obj.inputWeights,obj.recurrentWeights,obj.bias,"DataFormat","ST");
    y = extractdata(hiddenState);
The first output of lstm(...), the full hidden-state sequence Y, is discarded. For a layer parsed from lstmLayer(..., 'OutputMode','sequence'), concrete evaluation therefore returns an Hx1 result where HxT is expected, and any downstream layer in a sequence network receives an input with the wrong shape and the wrong values. As a result, NNV's concrete evaluation disagrees with MATLAB's own predict for sequence-output LSTM networks (e.g. stacked LSTMs or CNN-LSTM architectures). It is also inconsistent with the layer's own star reachability, which does handle outputMode == "sequence".
Reproduction (MATLAB R2025b, master @ 3b97fca)
    rng(1); D=2; H=3; T=4;
    L = LstmLayer('Name','m','InputSize',D,'NumHiddenUnits',H, ...
        'InputWeights',0.3*randn(4*H,D),'RecurrentWeights',0.3*randn(4*H,H), ...
        'Bias',zeros(4*H,1),'CellState',zeros(H,1),'HiddenState',zeros(H,1), ...
        'OutputMode','sequence');
    x = randn(D,T);
    y = L.evaluateSequence(x);
    size(y) % [3 1] - expected [3 4] for OutputMode='sequence'
    [Y,~,~] = lstm(dlarray(x), dlarray(zeros(H,1)), dlarray(zeros(H,1)), ...
        L.inputWeights, L.recurrentWeights, L.bias, "DataFormat","ST");
    size(extractdata(Y)) % [3 4] - what MATLAB's own lstm returns
Suggested resolution plan
In evaluate, capture the first output and select based on obj.outputMode:
    [Y,hiddenState,~] = lstm(x,h0,c0,obj.inputWeights,obj.recurrentWeights,obj.bias,"DataFormat","ST");
    if strcmp(obj.outputMode, 'sequence')
        y = extractdata(Y);
    else
        y = extractdata(hiddenState);
    end
In addition, add a unit test that compares against dlnetwork/predict outputs for both output modes.
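A minimal sketch of the shape check such a test could start from, assuming only the Deep Learning Toolbox lstm function is available. It does not call NNV or dlnetwork/predict. Instead it applies the proposed selection logic to the raw lstm outputs, using the sizes, weight scaling, and "ST" data format from the reproduction. The variable names (modes, results, ySeq, yLast) are illustrative, and 'last' stands in for the non-sequence mode, following lstmLayer's OutputMode values.

```matlab
% Sketch only: checks the output-mode selection against lstm directly.
% Sizes and weight scaling follow the reproduction; names are illustrative.
rng(1); D = 2; H = 3; T = 4;
Wi = 0.3*randn(4*H, D);      % input weights
Wr = 0.3*randn(4*H, H);      % recurrent weights
b  = zeros(4*H, 1);          % bias
x  = dlarray(randn(D, T));   % one sequence: D features, T time steps
h0 = dlarray(zeros(H, 1));   % initial hidden state
c0 = dlarray(zeros(H, 1));   % initial cell state

% Y is the full hidden-state sequence, hiddenState the state after step T.
[Y, hiddenState, ~] = lstm(x, h0, c0, Wi, Wr, b, "DataFormat", "ST");

% Apply the proposed selection for each output mode.
modes = {'sequence', 'last'};
results = cell(size(modes));
for k = 1:numel(modes)
    if strcmp(modes{k}, 'sequence')
        results{k} = extractdata(Y);
    else
        results{k} = extractdata(hiddenState);
    end
end
ySeq  = results{1};
yLast = results{2};

% Sequence mode keeps every time step; the other mode keeps only the last.
assert(isequal(size(ySeq),  [H T]));
assert(isequal(size(yLast), [H 1]));
% The last column of the sequence output is the final hidden state.
assert(max(abs(ySeq(:, end) - yLast)) < 1e-10);
```

In an actual NNV test, ySeq and yLast would presumably come from L.evaluateSequence(x) on layers constructed with each OutputMode, with the lstm (or dlnetwork/predict) outputs serving as the reference values.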
🤖 Generated with Claude Code