Intel’s 3D XPoint
            A revolutionary breakthrough in Memory Technology
Jose Jestin George, MCA-S5, Kristu Jyoti College of Management and Technology
   Abstract— The explosion of connected devices and digital services is generating a massive amount of new data. For this data to be useful, it must be stored and analyzed very quickly. 3D XPoint technology is an entirely new class of non-volatile memory that can help turn immense amounts of data into valuable information in real time. With up to 1,000 times lower latency and exponentially greater endurance than NAND, 3D XPoint technology can deliver game-changing performance for big data applications and transactional workloads. Its ability to enable high-speed, high-capacity data storage close to the processor creates new possibilities for system architects and promises to enable new applications.
   The innovative, transistor-less cross-point architecture of 3D XPoint technology creates a three-dimensional checkerboard where memory cells sit at the intersection of word lines and bit lines, allowing the cells to be addressed individually. As a result, data can be written and read in small sizes, leading to fast and efficient read/write processes.
   Index Terms— 3D XPoint, NAND, Flash Memory.

                     I. INTRODUCTION

   In computing, memory refers to the computer hardware integrated circuits that store information for immediate use in a computer; it is synonymous with the term "primary storage". Computer memory, for example random-access memory (RAM), operates at a high speed, in contrast to storage, which is slower to access but offers higher capacities. If needed, contents of the computer memory can be transferred to secondary storage through a memory management technique called "virtual memory". An archaic synonym for memory is store.

   In the early 1940s, memory technology often permitted a capacity of only a few bytes. The first electronic programmable digital computer, the ENIAC, using thousands of octal-base radio vacuum tubes, could perform simple calculations involving 20 numbers of ten decimal digits, which were held in the vacuum tube accumulators.

   The next significant advance in computer memory came with acoustic delay line memory, developed by J. Presper Eckert in the early 1940s. Through the construction of a glass tube filled with mercury and plugged at each end with a quartz crystal, delay lines could store bits of information in the form of sound waves propagating through mercury, with the quartz crystals acting as transducers to read and write bits. Delay line memory was limited to a capacity of up to a few hundred thousand bits to remain efficient.

   Two alternatives to the delay line, the Williams tube and the Selectron tube, originated in 1946, both using electron beams in glass tubes as a means of storage. Using cathode ray tubes, Fred Williams invented the Williams tube, which was the first random-access computer memory. The Williams tube proved more capacious than the Selectron tube (the Selectron was limited to 256 bits, while the Williams tube could store thousands) and less expensive. The Williams tube nevertheless proved to be frustratingly sensitive to environmental disturbances.

   Efforts began in the late 1940s to find non-volatile memory. Jay Forrester, Jan A. Rajchman and An Wang developed magnetic-core memory, which allowed for recall of memory after power loss. Magnetic-core memory became the dominant form of memory until the development of transistor-based memory in the late 1960s.

   Developments in technology and economies of scale have made possible so-called Very Large Memory (VLM) computers.

                     II. CURRENT MEMORY TECHNOLOGIES

   The term "memory", meaning "primary storage" or "main memory", is often associated with addressable semiconductor memory, i.e. integrated circuits consisting of silicon-based transistors, used for example as primary storage but also for other purposes in computers and other digital electronic devices.

   Most semiconductor memory is organized into memory cells or bistable flip-flops, each storing one bit (0 or 1). Flash memory organization includes both one bit per memory cell and multiple bits per cell (called MLC, multi-level cell). The memory cells are grouped into words of fixed word length, for example 1, 2, 4, 8, 16, 32, 64 or 128 bits. Each word can be accessed by a binary address of N bits, making it possible to store 2 raised to the power N words in the memory. This implies that processor registers normally are not considered as
                              Kristu Jyoti College of Management and Technology, Changanacherry                                     2
memory, since they only store one word and do not include an addressing mechanism. Typical secondary storage devices are hard disk drives and solid-state drives.

   There are two main kinds of semiconductor memory, volatile and non-volatile.

 A. Volatile memory

   Volatile memory is computer storage that only maintains its data while the device is powered. Most RAM (random access memory) used for primary storage in personal computers is volatile memory. RAM is much faster to read from and write to than the other kinds of storage in a computer, such as the hard disk or removable media. However, the data in RAM stays there only while the computer is running; when the computer is shut off, RAM loses its data.

   Volatile memory contrasts with non-volatile memory, which does not lose content when power is lost. Non-volatile memory does not need a continuous source of power to retain its data and does not need to have its memory content periodically refreshed. Examples of non-volatile memory are flash memory (used as secondary memory) and ROM, PROM, EPROM and EEPROM memory (used for storing firmware such as BIOS).

 B. Non-volatile memory

   Non-volatile memory (NVM) is a type of computer memory that has the capability to hold saved data even if the power is turned off. Unlike volatile memory, NVM does not require its memory data to be periodically refreshed. It is commonly used for secondary storage or long-term consistent storage.

   Non-volatile memory is highly popular among digital media; it is widely used in memory chips for USB memory sticks and digital cameras. Non-volatile memory can eliminate the need for relatively slow types of secondary storage systems, including hard disks. Non-volatile memory is also known as non-volatile storage.

   Examples of volatile memory are primary storage, which is typically dynamic random-access memory (DRAM), and fast CPU cache memory, which is typically static random-access memory (SRAM) that is fast but energy-consuming, offering lower memory areal density than DRAM.

   Disk memory is what holds all of our files and programs when not in use. It is the memory we are all most familiar with.

   We want to design computers that can store lots of data and operate very fast. However, when we build memory storage devices, there is a trade-off between speed and capacity. Either you can build very large memory, like a spinning hard disk drive, or very fast memory, like the registers in a CPU. The memory hierarchy lets us have the best of both worlds: speed and size. You have a small amount of ultra-fast memory, a larger amount of slower memory, and a huge amount of very slow memory. By cleverly choosing what data to store in which type of memory, we can appear to have a huge amount of very fast memory.

   Chip designers have to know a great deal about how to pass data from registers to caches, from cache to main memory, and from main memory to the hard disk. Low-level programmers need to be aware of how memory management works if they want to write programs that manipulate lots of data quickly.

   Memory hierarchy describes each level of computer storage by response time. For example, RAM is faster to access than a hard drive, so it sits above it in the memory hierarchy. It is important to note that capacity and complexity are intricately related.

 A. Internal register:

   An internal register in a CPU is used for holding variables and temporary results. Internal registers have very little storage capacity; however, they can be accessed almost instantly. Accessing data from an internal register is the fastest way to access memory.
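
   The idea that a memory hierarchy can "appear to have a huge amount of very fast memory" can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the tier names, latencies and hit rates are invented round numbers, not measurements, and `Level` and `average_access_time_ns` are hypothetical helper names. Every request is assumed to try the fastest tier first and fall through to the next tier on a miss.

```python
from dataclasses import dataclass


@dataclass
class Level:
    """One tier of the memory hierarchy (all numbers are illustrative assumptions)."""
    name: str
    latency_ns: float  # time spent looking up data at this tier
    hit_rate: float    # fraction of requests reaching this tier that it can serve


def average_access_time_ns(levels):
    """Expected latency when each miss falls through to the next, slower tier."""
    expected = 0.0
    reach_probability = 1.0  # chance that a request gets as far as this tier
    for level in levels:
        expected += reach_probability * level.latency_ns
        reach_probability *= 1.0 - level.hit_rate
    return expected


# Assumed round-number tiers; the last tier always holds the data.
hierarchy = [
    Level("cache", latency_ns=1.0, hit_rate=0.95),
    Level("RAM", latency_ns=100.0, hit_rate=0.999),
    Level("hard disk", latency_ns=10_000_000.0, hit_rate=1.0),
]
disk_only = [Level("hard disk", latency_ns=10_000_000.0, hit_rate=1.0)]

print(f"hierarchy: {average_access_time_ns(hierarchy):,.0f} ns")
print(f"disk only: {average_access_time_ns(disk_only):,.0f} ns")
```

   With these assumed numbers, the average access time of the three-tier hierarchy stays within a few hundred nanoseconds, even though almost all of the capacity sits on a disk that takes milliseconds to respond. The result depends heavily on the hit rates, which is why choosing what data to keep in which tier matters so much.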
   The innovative, transistor-less cross-point architecture of 3D XPoint technology creates a three-dimensional checkerboard where memory cells sit at the intersection of word lines and bit lines, allowing the cells to be addressed individually. As a result, data can be written and read in small sizes, leading to fast and efficient read/write processes.

Selector:
   Memory cells are written or read by varying the amount of voltage sent to each selector. This eliminates the need for transistors, increasing capacity and reducing cost.

Fast Switching Cell:
   With a small cell size, fast switching selector, low-latency cross-point array, and fast write algorithm, the cell is able to switch states faster than any existing nonvolatile memory technology today.

   In their 2015 announcement of the technology, Intel and Micron claimed 3D XPoint would be up to 1,000 times faster and have up to 1,000 times more endurance than NAND flash, and have 10 times the storage density of conventional memory. Early products are faster and more durable than NAND and denser than conventional memory, but they haven't lived up to the full extent of the vendors' claims.

   3D XPoint has a different architecture from NAND flash products. It's reputed to be based on phase-change memory technology, with a transistor-less, cross-point architecture that positions selectors and memory cells at the intersection of perpendicular wires. Those cells, made of an unspecified material, can be accessed individually by a current sent through the top and bottom wires touching each cell. To improve storage density, the 3D XPoint cells can be stacked in three dimensions.

   Each cell stores a single bit of data, representing either a 1 or a 0 through a bulk property change in the cell material, which modifies the cell's resistance level. The cell can occupy either a high- or low-resistance state, and changing the resistance level of the cell changes whether the cell is read as a 1 or a 0. Because the cells are persistent, they hold their values indefinitely, even when there is a power loss.

   Read and write operations occur by varying the amount of voltage sent to each selector. For write operations, a specific voltage is sent through the wires around a cell and selector. This activates the selector and enables voltage through to the cell to initiate the bulk property change. For read operations, a different voltage is sent through to determine whether the cell is in a high- or low-resistance state.

   3D XPoint has the ability to write data at a bit level, an advantage over NAND. All the bits in a NAND flash block must be erased before data can be written. In theory, this capability enables 3D XPoint to have higher performance and lower power consumption than NAND flash.
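
   The difference between bit-level writes in a cross-point array and erase-before-write in a NAND block can be sketched with a toy model. This is not how either device is implemented; it only counts how many cells are disturbed by a single-bit update. The class names are hypothetical, the mapping of low resistance to 1 and high resistance to 0 is an assumption (the source does not say which state represents which value), and the NAND model is deliberately simplified: real NAND programs data in pages and erases in blocks.

```python
HIGH_RESISTANCE = "high"
LOW_RESISTANCE = "low"


class CrossPointArray:
    """Toy grid of cells, one at each word-line / bit-line intersection."""

    def __init__(self, word_lines, bit_lines):
        self.cells = [[HIGH_RESISTANCE] * bit_lines for _ in range(word_lines)]
        self.cells_touched = 0

    def write_bit(self, word_line, bit_line, bit):
        # Assumed mapping: low resistance reads as 1, high resistance as 0.
        self.cells[word_line][bit_line] = LOW_RESISTANCE if bit else HIGH_RESISTANCE
        self.cells_touched += 1  # only the addressed cell changes state

    def read_bit(self, word_line, bit_line):
        return 1 if self.cells[word_line][bit_line] == LOW_RESISTANCE else 0


class NandBlock:
    """Toy NAND block: changing stored data means erasing the whole block first."""

    def __init__(self, size):
        self.bits = [1] * size  # erased flash cells read as 1
        self.cells_touched = 0

    def update_bit(self, index, bit):
        updated = list(self.bits)
        updated[index] = bit
        self.bits = [1] * len(updated)  # block erase
        self.bits = updated             # reprogram every cell in the block
        self.cells_touched += len(updated)


xpoint = CrossPointArray(word_lines=64, bit_lines=64)
xpoint.write_bit(3, 5, 1)

nand = NandBlock(size=64 * 64)
nand.update_bit(3 * 64 + 5, 0)

print("cross-point cells touched:", xpoint.cells_touched)
print("NAND cells touched:", nand.cells_touched)
```

   In this model, one logical update touches a single cross-point cell but every cell of the NAND block, which is the intuition behind the claimed performance and power advantage of bit-level writes.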
even if system power is removed. While we don't know how the resistance change works, one thing that we do know is that, unlike DRAM, each data cell does not need any transistors, which gives rise to Optane's next important property: it's a lot denser than DRAM, with Intel and Micron variously claiming a density improvement of four to ten times.

                VII. FEATURES OF 3D XPOINT

 A. Speed and performance

   With the 3D XPoint architecture, data no longer has to be stored in 4 KB blocks using a slow file I/O stack. The new technology enables small amounts of data to be written and read, making the read/write process faster and more efficient than NAND. Initial products using the 3D XPoint technology bear this out, though not at the speed and performance levels Intel and Micron promised when they rolled out the technology.

   While not as fast as DRAM, 3D XPoint has the advantage of being nonvolatile memory. From a performance and price standpoint, 3D XPoint technology falls between fast but costly DRAM and slower, cheaper NAND flash.

   According to Intel, the P4800X drive performed five to eight times faster than the company's NAND flash-based DC P3700 in internal tests at low queue depths using a mixed workload. The P4800X can reach as much as 500,000 IOPS, or approximately 2 GBps, at a queue depth of 11, Intel claimed. Observers have speculated that the PCI Express (PCIe) bus used by the P4800X is holding it back from the promised speed of 1,000 times faster than NAND. Other system changes thought to be needed for the 3D XPoint technology to meet higher performance goals include segregating persistent from nonpersistent memory when handling machine check errors and using a compiler that enables persistent memory to be declared, along with using link editors that can build that memory into an application. The applications themselves must be rewritten to eliminate file I/O and to use single instruction and vector operations.

   Nonvolatile 3D XPoint dual in-line memory modules (DIMMs) that fit into DRAM slots and use the double data rate bus also may help 3D XPoint reach its full performance potential.

 B. Cost:

   As of August 2017, the 375 GB Optane P4800X add-in card is priced at $1,520, or $4.05 per gigabyte. By comparison, Intel's 400 GB flash-based NVMe PCIe P3700 SSD is $879, or approximately $2.20 per gigabyte.

   Intel Optane memory for PCs is $44 for a 16 GB module and $79 for a 32 GB module.

 C. Use cases

   3D XPoint is used as an additional layer of storage between flash and DRAM. It's a relatively common practice to tier storage between hard disk drives (HDDs) and flash. High-intensity data and applications that benefit more from high speeds are stored on the flash layer, while data and applications that are accessed less frequently are put on disk. 3D XPoint is another layer of storage above flash for data and applications that need even greater speeds.

   Intel expects the 3D XPoint Optane SSD will be used for high-performance storage and caching, as well as to extend and replace memory. According to the company's projections, users will be able to increase server memory by as much as eight times and displace DRAM by as much as a 10:1 ratio for select workloads.

 D. Extensions:

   Intel has provided three ways to extend memory with 3D XPoint Optane SSDs:

• via an operating system paging mechanism that moves data out to the PCIe-attached SSD when DRAM fills for a workload;
• via optimized applications; or
• via Intel's Memory Drive Technology supported on its Xeon processors.

   In the future, it will be possible to extend memory with the 3D XPoint DIMMs that Intel plans to release. Observers speculate that 3D XPoint Optane, and particularly Optane NVDIMMs, will be used to:

• expand the apparent size of DRAM;
• enable bigger, more-effective databases;
• help overcome big data network bottlenecks;
• facilitate high-performance computing applications;
• extend memory and boost instance storage performance in the cloud;
• provide the storage capacity and speed that hybrid clouds need; and
• possibly serve as primary memory tiers in hyper-converged systems.

 E. Future possibilities:

   3D XPoint drives excel at servicing random, transactional data sets that are not optimized for in-memory processing. For businesses that rely on complex, random analytics, 3D XPoint drives would be useful for performing limited real-time analytics on current data sets or for storing and updating records in real time.

   Intel clearly sees a broad range of other analytical uses for its implementation of 3D XPoint. According to the company’s Intel Optane web pages, the advanced memory could be used by retailers to more quickly identify fraud detection patterns,