An Efficient Stack Machine
Martin Schöberl
Overview
JVM stack machine
Parameter passing
Stack access patterns
Common stack caches
Two-level stack cache
Results
JOP Stack Architecture 2
The Java Virtual Machine
JVM is a stack machine
All instructions access the stack
40% access local variables
Stack and local variables need caching
JOP Stack Architecture 3
An Efficient Stack Machine
JVM stack is a logical stack
Frame for return information
Local variable area
Operand stack
We could use independent stacks
Argument-passing regulates the layout
JOP Stack Architecture 4
Parameter passing
int val = foo(1, 2);
...
public int foo(int a, int b) {
int c = 1;
return a+b+c;
}
The invocation sequence:
aload_0 // Push the object reference
iconst_1 // and the parameter onto the
iconst_2 // operand stack.
invokevirtual #2 // Invoke method foo:(II)I.
istore_1 // Store the result in val.
public int foo(int,int):
iconst_1 // The constant is stored in a method
istore_3 // local variable (at position 3).
iload_1 // Arguments are accessed as locals
iload_2 // and pushed onto the operand stack.
iadd // Operation on the operand stack.
iload_3 // Push c onto the operand stack.
iadd
ireturn // Return value is on top of stack.
JOP Stack Architecture 5
Stack Layout
JOP Stack Architecture 6
Stack Content
Operand stack A=B+C*D
TOS and TOS-1 Stack JVM
Local variable area push B iload_1
Former op stack push C iload_2
At a deeper position push D iload_3
Saved context * imul
Between locals and + iadd
operand stack pop A istore_0
JOP Stack Architecture 7
Stack access
Stack operation
Read TOS and TOS-1
Execute
Write back TOS
Variable load
Read from deeper stack location
Write into TOS
Variable store
Read TOS
Write into deeper stack location
JOP Stack Architecture 8
Three Port Stack Memory
Single cycle execution
Two read ports for
TOS and TOS-1 or
Local variable
One write port for
TOS or
Local variable
JOP Stack Architecture 9
Register File Stack Cache
Register file as circular Instruction fetch
buffer - small Instruction decode
Automatic spill/fill
RF read and execute
Five access ports
RF write back
picoJava, aJile
JOP Stack Architecture 10
On-chip Memory Stack Cache
Large cache Instruction fetch
Three-port memory Instruction decode
Additional pipeline stage Memory read
Komodo, FemtoJava Execute
Memory write back
JOP Stack Architecture 11
JVM Stack Access Revised
ALU operation A is TOS
A <- A op B B is TOS-1
B <- sm[p] sm is stack array
p <- p -1 p points to TOS-2
Variable load (Push) v points to local area
A <- sm[v+n] n is the local offset
B <- A
sm[p+1] <- B
op is a two operand stack
operation
p <- p +1
Variable store (Pop)
sm[v+n] <- A
A <- B
B <- sm[p]
p <- p -1
JOP Stack Architecture 12
Do we need a 3-port memory?
Stack operation:
Dual read from TOS and TOS-1
Write to TOS
Variable load/store:
One read port
One write port
TOS and TOS-1 as register
Deeper locations as on-chip memory
JOP Stack Architecture 13
Two-Level Stack Cache
Dual read only from TOS and Instruction fetch
TOS-1 Instruction decode
Two register (A/B)
Execute, load or store
Dual-port memory
Simpler Pipeline
No forwarding logic
JOP Stack Architecture 14
Stack Caches Compared
Design Cache fmax Size
(LC) (bit) (MHz) (word)
ALU - - 237 -
16 register 707 0 110 16
RAM 111 8192 153 128
Two-level 112 4096 213 130
JOP Stack Architecture 15
Summary
The JVM is a stack machine
Stack and local variables need caching
Two-level cache
Two top levels as register
Rest as on-chip memory (two ports)
Small design
Short pipeline
JOP Stack Architecture 16
Further Information
JOP Thesis: p 78-93
Martin Schoeberl, Design and
Implementation of an Efficient Stack
Machine, In Proceedings of the 12th
IEEE Reconfigurable Architecture
Workshop, RAW 2005, Denver,
Colorado, USA, April 2005.
JOP Stack Architecture 17