
Final Tanaka

The document discusses computer memory hierarchies and types. It begins by explaining that most computer components are a type of memory that stores bits as 1s and 0s using different physical mechanisms like voltage levels. It then distinguishes between volatile memory like CPU registers and RAM that loses data when powered off, and non-volatile memory like NAND flash and HDDs that retain data without power. The hierarchy ranges from fastest but smallest volatile CPU/cache memory to slower but larger non-volatile storage. Research aims to bridge the speed gap between DRAM and NAND flash using storage class memory technologies like RRAM and STT-MRAM. FinFET transistor technology also helps enable further CPU performance and scaling improvements over planar MOSFETs.



1. Most components in a computer are, in fact, memories. CPU registers, RAM, and main storage are all memories, because they all store bits, i.e. 1s and 0s. There are many ways to store bit information: a simple mechanical device that opens and closes, an electric voltage level, and so on. It is the range of available options that drives the choice of memory system. The main trade-off is between speed and price. Motivated by this trade-off, the memory hierarchy lays out the available choices of memory for building a computer.

Overall, we can distinguish two categories of memory: volatile and non-volatile. They differ in their ability to keep stored information once the system is turned off: volatile memory loses the information, while non-volatile memory keeps it. The ability of non-volatile memory to retain information even when power is off comes at a price: it usually performs slower.

Volatile memory usually sits at the top of the hierarchy, starting with CPU registers, which are ultrafast, followed by cache memories (SRAM), which are a bit slower, then DRAM. SRAM (cache) bridges the speed gap between registers and DRAM by exploiting temporal and spatial locality. Even though DRAM is far slower than registers or SRAM, it is still much faster than the fastest non-volatile memory.

NAND flash and HDD fall into the non-volatile category. They are far slower than volatile memories, but they retain data even after power is completely off. They are also much cheaper per bit stored, and thus usually come in large capacities. For this reason, they are mainly used to store the OS and other data such as movies, music, etc. In terms of speed, NAND flash is typically on the order of 10 times faster than HDD, as the two differ completely in operating mechanism. NAND flash is also more robust, since it does not use spinning disks as an HDD does.

To summarize: in the memory hierarchy, from top to bottom, we have CPU registers, SRAM, DRAM, NAND flash, and finally HDD.
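The hierarchy above can be sketched as a small table. The DRAM and NAND flash latencies are the figures used later in this document; the register, SRAM, and HDD figures are assumed typical values added for illustration, not taken from the text.

```python
# Sketch of the memory hierarchy, top (fastest) to bottom (slowest).
# Tuples: (name, approximate access latency in seconds, volatile?)
hierarchy = [
    ("CPU registers", 0.3e-9, True),   # assumed typical value
    ("SRAM cache",    1.0e-9, True),   # assumed typical value
    ("DRAM",          50e-9,  True),   # figure used in this document
    ("NAND flash",    10e-6,  False),  # figure used in this document
    ("HDD",           5e-3,   False),  # assumed typical value
]

for name, latency, volatile in hierarchy:
    kind = "volatile" if volatile else "non-volatile"
    print(f"{name:14s} ~{latency:.1e} s  ({kind})")
```

Note how each step down the hierarchy trades speed for capacity and cost per bit, and how the volatile/non-volatile boundary falls between DRAM and NAND flash.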

2. In the previous discussion (question 1), we discussed the boundary between volatile and non-volatile memory, or more specifically between DRAM and NAND flash. Even though DRAM sits at the bottom of the volatile hierarchy while NAND flash sits at the top of the non-volatile hierarchy, their speeds differ by a factor of roughly 200, i.e. between two and three orders of magnitude, which is significant. Specifically, DRAM access takes about 50 ns while NAND flash takes around 10 µs. This causes a huge latency gap and can degrade performance.

It is due to this fact that researchers are searching for ways to bridge this latency gap by introducing SCM, storage class memory. The goal is a memory whose cost per bit is lower than DRAM's while its speed is faster than NAND flash's. Unlike DRAM, SCM is designed to be persistent in nature, retaining written data across power cycles. SCM should also be more resistant to data rewrites and should have much higher endurance.

Several candidates are available for SCM: Resistive RAM (RRAM), Spin-Transfer-Torque MRAM (STT-MRAM), and Ferroelectric RAM (FeRAM), to name a few. In principle, RRAM operates by changing the resistance across a dielectric solid-state material. It relies on defects generated in a thin oxide layer, known as oxygen vacancies. Upon set, oxygen ions drift away and separate from the oxygen vacancies, putting the cell into a low-resistance state. Upon reset, a voltage of the opposite polarity is applied, driving the oxygen ions back into the vacancies and leading to a high-resistance state. RRAM offers decent endurance, demonstrating 10-year data retention at 85 °C with 100k cycles (Wei, IEDM 2015). One of RRAM's advantages is its cell size, which is significantly smaller than DRAM's, enabling a large capacity to be packed into a small area.

STT-MRAM, on the other hand, also works by changing resistance to change state. It writes data via spin-transfer torque: a parallel spin alignment gives a low-resistance state, while an anti-parallel alignment leads to a high-resistance state. A high tunnel magnetoresistance (TMR) ratio is also needed for a robust read margin. STT-MRAM looks very promising, as it offers very fast read and write times along with very long endurance. Its cell size is on the same order as DRAM's, though slightly larger. While STT-MRAM offers durable, high-speed reads and writes, it is still very expensive in cost per bit at the moment. In my view, RRAM looks more promising for SCM for the time being.

3. Scaling down can degrade transistor performance. Up to this point, planar MOSFET technology is thought to have reached its scaling limit at around the 14 nm node. Further scaling seems impossible, or at least very prone to undesirable effects, including leakage and wide variability.

To make this clear: in a planar MOSFET, a single gate controls the source-drain channel. Intuitively, a single planar gate does not have good electrostatic control over the channel, leading to leakage between source and drain even when the gate is turned off. To alleviate this, a new technology called FinFET was proposed. In principle, the FinFET replaces the planar source-drain channel with a vertical fin that penetrates the gate, so that the gate wraps around the channel. This enables better control of the electric field, leading to a more robust gate that is far less prone to leakage. This ability is the main reason FinFET can alleviate the scaling problem of the planar MOSFET.

FinFET has other advantages over the planar MOSFET. These include higher integration density due to its natural 3D shape. FinFET also offers smaller variability, mainly the variability due to random dopant fluctuation. It is thanks to all of these advantages that FinFET enables further scaling, even down to 7 nm.

4. The average instruction cycle count defines how many clock cycles are required to perform one instruction. Intuitively, it comprises the register cycle plus the memory access cycles. For memory access itself, thanks to cache memory, the processor often finds the required data in the cache instead of main memory, enabling a faster instruction cycle.
For case 1, since only a level-1 cache is applied, we have:

Case 1:
Tm (DRAM) = 60 ns
Tc (cache) = 0.8 ns
Clock frequency = 2.5 GHz
h = 95%

T = 1 + 0.3 × h × Tc + 0.3 × (1 − h) × Tm, with Tc and Tm expressed in clock cycles:
Tc = 0.8 × 10^(-9) × 2.5 × 10^9 = 2 cycles
Tm = 60 × 10^(-9) × 2.5 × 10^9 = 150 cycles
T = 1 + 0.3 × 0.95 × 2 + 0.3 × 0.05 × 150
T = 1 + 0.57 + 2.25
T = 3.82 cycles
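The case 1 arithmetic can be checked with a short script (a direct transcription of the formula and figures above):

```python
# Average instruction cycle count, case 1: single-level cache.
freq = 2.5e9       # clock frequency, Hz
t_cache = 0.8e-9   # cache access time, s
t_mem = 60e-9      # DRAM access time, s
h = 0.95           # cache hit rate

tc = t_cache * freq  # cache access in clock cycles -> 2.0
tm = t_mem * freq    # DRAM access in clock cycles -> 150.0

T = 1 + 0.3 * h * tc + 0.3 * (1 - h) * tm
print(round(T, 4))  # 3.82
```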

For case 2, however, the process is a bit longer. Upon a level-1 cache miss, the processor tries to fetch from the level-2 cache, and only if that also misses does it go to main memory. The formula is thus adapted to:

Case 2:
Tm (DRAM) = 60 ns
Tc1 (L1 cache) = 0.4 ns
Tc2 (L2 cache) = 6 ns
Clock frequency = 2.5 GHz
h1 = 95%
h2 = 97%

T = 1 + 0.3 × h1 × Tc1 + 0.3 × (1 − h1) × (h2 × Tc2 + (1 − h2) × Tm), with access times in clock cycles:
Tc1 = 0.4 × 10^(-9) × 2.5 × 10^9 = 1 cycle
Tc2 = 6 × 10^(-9) × 2.5 × 10^9 = 15 cycles
Tm = 60 × 10^(-9) × 2.5 × 10^9 = 150 cycles
T = 1 + 0.3 × 0.95 × 1 + 0.3 × 0.05 × (0.97 × 15 + 0.03 × 150)
T = 1 + 0.285 + 0.28575
T = 1.57075 cycles
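The case 2 arithmetic can be checked the same way, extending the formula with the level-2 term:

```python
# Average instruction cycle count, case 2: two-level cache.
freq = 2.5e9
tc1 = 0.4e-9 * freq  # L1 access in clock cycles -> 1.0
tc2 = 6e-9 * freq    # L2 access in clock cycles -> 15.0
tm = 60e-9 * freq    # DRAM access in clock cycles -> 150.0
h1, h2 = 0.95, 0.97  # L1 and L2 hit rates

T = 1 + 0.3 * h1 * tc1 + 0.3 * (1 - h1) * (h2 * tc2 + (1 - h2) * tm)
print(round(T, 5))  # 1.57075
```

Note that adding the L2 cache cuts the average from 3.82 to about 1.57 cycles, because most L1 misses are now served in 15 cycles rather than 150.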
