Compression
As designs become larger and more complex, design testing becomes more important to catch silicon
manufacturing defects. Designs increasingly require on-chip test hardware—called scan compression—to
compress automatic test pattern generation (ATPG) tests to manageable budgets. This white paper discusses the
compression architectures available with Cadence® Encounter® Test that can help customers meet their design
and test criteria.
        Contents
        Introduction
        What Is Scan Compression and How Can It Help?
        Different Scan Compression Architectures
        Compression Architectures Comparison
        Analyzing Design for Optimal Compression Efficiency
        Insertion and Validation of Scan Compression
        Conclusion
        References

        Introduction
        Fact:
        As designs become larger, the number of scannable flops significantly increases, making design testability even
        more important to catch silicon manufacturing defects.
        Reality:
        Test costs have to be kept as low as possible. Traditional FULLSCAN test patterns, where flops are connected up
        into long scannable chains between I/Os, are quite expensive to use on testers, both from a test time and test
        data volume perspective.
        Need:
        Designs require on-chip test hardware to compress the time and memory of automatic test pattern generation
        (ATPG) tests to manageable budgets. This on-chip test hardware is generally referred to as scan compression or
        simply compression.
        The ratio of the number of scan channels to the external FULLSCAN chains is the target compression ratio. When
        the scan chains are properly balanced, you can reduce test time and test data volume close to the target ratio. In
        many designs, using the right architecture, DFT engineers can expect to achieve 200X compression efficiency or
        more, translating into equivalent test time and data volume savings.
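        The arithmetic behind the target ratio can be sketched as follows. This is only an illustration: the function
        names and the example numbers (100,000 flops, 8 FULLSCAN chains, 800 channels) are assumptions, not values
        from Encounter Test, and the sketch assumes perfectly balanced chains.

```python
import math

def target_compression_ratio(num_channels: int, num_fullscan_chains: int) -> float:
    """Ratio of internal scan channels to external FULLSCAN chains."""
    return num_channels / num_fullscan_chains

def shift_cycles_per_pattern(num_flops: int, num_chains: int) -> int:
    """Length of the longest chain when the flops are perfectly balanced."""
    return math.ceil(num_flops / num_chains)

# Hypothetical design parameters, chosen only for illustration.
flops = 100_000
fullscan_chains = 8
channels = 800

ratio = target_compression_ratio(channels, fullscan_chains)
fullscan_cycles = shift_cycles_per_pattern(flops, fullscan_chains)
compressed_cycles = shift_cycles_per_pattern(flops, channels)

print(ratio, fullscan_cycles, compressed_cycles)
```

        With balanced chains, the shift cycles per pattern drop by the same factor as the target ratio. Unbalanced
        chains lengthen the longest channel and reduce the saving.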
        [Figure: scan compression block diagram. Labels: Scan in from Tester, Mask Controls, X-MASK, Mbs, COMPRESSOR,
        Scan out to Tester]
        Inevitably, designs contain X-sources (static and/or dynamic) that would make their way into the scan chains during
        ATPG capture. As these Xs are shifted out through the compression logic, they adversely affect efficiency and
        result in higher pattern count and lower test coverage. The X-mask block prevents the X-sources from entering
        the compressor and corrupting it, resulting in higher compression efficiency. Figure 2 shows sample results. The
        X-mask block consists of shift register segments made up of mask registers that are loaded via the scan inputs with
        mask/no-mask values, on a per pattern basis as required.
        Encounter Test offers two types of X-mask logic: WIDE1 and WIDE2. In WIDE1 mode, each scan channel terminates
        at a single mask register and provides the ability to mask the channel on a per scan cycle basis. In WIDE2 mode,
        each scan channel has two mask registers, giving ATPG much greater flexibility in suppressing the Xs from affecting
        compression.
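        The effect of masking can be illustrated with a small three-valued simulation. This is a simplified sketch, not
        the WIDE1 or WIDE2 hardware: the names `xor3` and `compact` are hypothetical, one mask bit per channel is
        assumed, and a masked channel is assumed to be forced to a known 0 before the XOR tree.

```python
X = "X"  # unknown (unpredictable) captured value

def xor3(a, b):
    """XOR over {0, 1, X}: any unknown input makes the result unknown."""
    if a == X or b == X:
        return X
    return a ^ b

def compact(slice_values, mask_bits=None):
    """XOR one scan slice down to a single output bit.

    mask_bits[i] == 1 forces channel i to a known 0 ahead of the XOR tree.
    """
    if mask_bits is None:
        mask_bits = [0] * len(slice_values)
    out = 0
    for value, masked in zip(slice_values, mask_bits):
        out = xor3(out, 0 if masked else value)
    return out

slice_values = [1, X, 0, 1]                   # channel 1 captured an X
unmasked = compact(slice_values)              # X reaches the scan output
masked = compact(slice_values, [0, 1, 0, 0])  # channel 1 is masked

print(unmasked, masked)
```

        Without the mask, one X makes the whole slice unobservable. With the mask, the other three channels in the
        slice can still be compared.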
        [Figure 2: Compression ratio (0 to 120) across five designs, comparing Target, With X-Mask, and Without X-Mask]
www.cadence.com                                                                                                                                             2
                                                                Choosing the Right Scan Compression Architecture for Your Design
        Decompressor options
        Two possible implementations are currently available to decompress the input scan data: broadcast and spreader.
        In the broadcast decompressor, the scan input pins directly load the scan input stimuli onto the channels, i.e.,
        broadcast the data. Because there are far more channels than scan input pins, the data is fanned out to the
        rest of the channels as shown in Figure 3a. This scheme is also referred to as Illinois fan-out. The number of unique
        values the channels receive is equal to the number of scan input pins.
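        A minimal sketch of the broadcast fan-out follows. The modulo assignment of channels to pins is an assumption
        made for illustration; the actual channel-to-pin mapping is decided by the tool.

```python
def broadcast(scan_in_bits, num_channels):
    """Fan each scan input pin out to every channel assigned to it."""
    num_pins = len(scan_in_bits)
    return [scan_in_bits[ch % num_pins] for ch in range(num_channels)]

# 4 scan input pins driving 12 internal channels.
channels = broadcast([1, 0, 1, 1], 12)
print(channels)
```

        Channels 0, 4, and 8 always carry the same bit because they share a pin, so the 12 channels never receive more
        than 4 independent values in a given scan cycle.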
        In the spreader decompressor, an XOR network is used to decompress the data from the scan input pins as shown in
        Figure 3b. The number of uniquely controllable scan channels is equal to $\sum_{r=1}^{n} {}^{n}C_{r} = 2^{n} - 1$,
        where n is the number of scan input pins and r ranges from 1 to n. For example, for 8 scan input pins, the total
        number of channels that can be uniquely controlled is
        ${}^{8}C_{1} + {}^{8}C_{2} + {}^{8}C_{3} + {}^{8}C_{4} + {}^{8}C_{5} + {}^{8}C_{6} + {}^{8}C_{7} + {}^{8}C_{8} = 255$.
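        The count can be checked by enumerating every non-empty subset of scan input pins, with each subset driving one
        channel through an XOR. This is a sketch under that assumption; `spreader_taps` and `spread` are hypothetical
        names, and a real spreader need not use every subset.

```python
from itertools import combinations
from math import comb

def spreader_taps(num_pins):
    """Every non-empty subset of pin indices, smallest subsets first."""
    taps = []
    for r in range(1, num_pins + 1):
        taps.extend(combinations(range(num_pins), r))
    return taps

def spread(scan_in_bits, taps):
    """Each channel receives the XOR of the pins in its tap set."""
    channel_bits = []
    for tap in taps:
        bit = 0
        for pin in tap:
            bit ^= scan_in_bits[pin]
        channel_bits.append(bit)
    return channel_bits

taps = spreader_taps(8)
count = len(taps)
by_formula = sum(comb(8, r) for r in range(1, 9))
channel_bits = spread([1, 0, 0, 0, 0, 0, 0, 0], taps)

print(count, by_formula)
```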
        Both the broadcast and spreader decompressors are supported in all the compression architectures discussed in
        this white paper. Unless the design has stringent area constraints for test, it is recommended that you include both
        decompressors in the compression macro to take advantage of the different scan input combinations to reduce the
        effect of correlation and improve test coverage. Correlation occurs when a fault requires two or more scan flops
        in a scan slice to have opposite values to test, but those channels are fed by the same scan data. A scan slice is
        defined as a list of scan flops across all the channels at a particular controllable or observable position.
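        Correlation can be checked by brute force on a small example. In this sketch, each channel is described by the
        set of scan input pins that are XORed to drive it (a single pin for broadcast). The function names are
        hypothetical and the two-pin configuration is chosen only for illustration.

```python
from itertools import product

def channel_value(scan_in_bits, tap):
    """Value seen by a channel driven by the XOR of the pins in tap."""
    value = 0
    for pin in tap:
        value ^= scan_in_bits[pin]
    return value

def can_take_opposite_values(tap_a, tap_b, num_pins):
    """True if some scan input combination gives the two channels different bits."""
    return any(
        channel_value(bits, tap_a) != channel_value(bits, tap_b)
        for bits in product((0, 1), repeat=num_pins)
    )

# Broadcast: both channels are fed by pin 0, so they are correlated.
broadcast_ok = can_take_opposite_values((0,), (0,), num_pins=2)
# Spreader: one channel sees pin 0, the other sees pin 0 XOR pin 1.
spreader_ok = can_take_opposite_values((0,), (0, 1), num_pins=2)

print(broadcast_ok, spreader_ok)
```

        A fault that needs opposite values on the two channels is untestable through the broadcast path in this
        example, but testable through the spreader path.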
        XOR compression
        In this architecture, the compressor is an XOR network as shown in Figure 4. This non-proprietary combinational
        compression logic is designed to reduce logic area overhead and to achieve better timing on the paths through
        the XOR trees. To minimize the aliasing effect in the compression, each scan channel is designed to be observed
        across multiple scan outputs as shown by the green paths in Figure 4. Aliasing occurs when faults on two scan
        channels in the same scan slice cancel each other out in an XOR operation, so the scan outputs show the expected
        (fault-free) value even though faults are present.
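        The following sketch illustrates aliasing and why observing each channel on several outputs helps. The
        channel-to-output assignments are made up for illustration and do not represent the network that Encounter
        Test builds.

```python
def compact(slice_bits, observe):
    """XOR-compact one scan slice; observe[ch] lists the scan outputs fed by channel ch."""
    num_outputs = 1 + max(out for outs in observe for out in outs)
    result = [0] * num_outputs
    for bit, outs in zip(slice_bits, observe):
        for out in outs:
            result[out] ^= bit
    return result

good = [0, 1, 1, 0]    # fault-free slice
faulty = [1, 0, 1, 0]  # channels 0 and 1 both carry a fault effect

# All channels feed a single XOR tree: the two errors cancel.
single_output = [(0,), (0,), (0,), (0,)]
# Each channel feeds a distinct set of outputs: the errors no longer cancel.
multi_output = [(0, 1), (0, 2), (1, 2), (0, 1, 2)]

aliased = compact(good, single_output) == compact(faulty, single_output)
detected = compact(good, multi_output) != compact(faulty, multi_output)

print(aliased, detected)
```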
        The advantage of this XOR compression is simplicity—no dedicated test pins required, easy debugging, efficient
        diagnostic support, smaller area overhead, and a fair amount of X-tolerance. With this architecture, you can target
        compression ratios of 100X or more, and it lends itself well to almost any design style.
        OPMISR compression
        This architecture, based on logic built-in self test (BIST), utilizes an on-product multiple input signature register
        (MISR) within the compressor logic. Each scan channel is observed at an MISR, and these registers are connected
        to make multiple MISRs as shown in Figure 5. As the data gets scanned from the channels, the MISRs capture
        and recirculate the data to convert into signatures. At the end of the pattern’s scan operation, the MISR values go
        through an XOR in the space compactor logic and are available at the output pins for measurement.
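        The capture-and-recirculate behavior can be sketched with a small software model of a MISR. The 4-bit width,
        the feedback taps, and the function names are assumptions for illustration only; they are not the register
        configuration used by Encounter Test.

```python
def misr_step(state, inputs, taps):
    """One clock of a MISR: shift with feedback, then XOR in the parallel inputs."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    shifted = [feedback] + state[:-1]
    return [s ^ i for s, i in zip(shifted, inputs)]

def signature(slices, width, taps):
    """Fold a sequence of scan slices (one bit per channel) into a signature."""
    state = [0] * width
    for scan_slice in slices:
        state = misr_step(state, scan_slice, taps)
    return state

TAPS = (3, 2)  # hypothetical feedback taps for a 4-bit register

good_slices = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]
faulty_slices = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]  # one bit flipped

good_signature = signature(good_slices, 4, TAPS)
faulty_signature = signature(faulty_slices, 4, TAPS)

print(good_signature, faulty_signature)
```

        Only the final signature has to be compared against the expected value, rather than every bit unloaded from
        every channel.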
        In on-product MISR (OPMISR) mode [1], the scan output pins do not participate in pattern comparison on a
        cycle-by-cycle basis, but instead are only used during the MISR signature compare event. This would allow the scan output
        pins to be used as scan inputs, effectively doubling (2X) the test compression. The test application time savings
        is primarily due to shorter scan channels, while the test data volume savings occurs because only MISR response
        values—not all scan element values—have to be stored on the automated test equipment (ATE).
        [Figure 5: OPMISR compression. Labels: Decompressor, Space Compactor]
        OPMISR compression is intolerant of Xs in the design, as any X captured into the MISR would corrupt the calculated
        signature and effectively mark the pattern as invalid. This will result in significant degradation of the overall quality
        of results (QoR) (pattern count and coverage), making X-masking a requirement. WIDE2 masking provides the best
        possible suppression of Xs, but WIDE1 provides acceptable results if the design does not have too many X-sources.
        In addition, Cadence recommends using test points to block any static X-sources [2].
        The advantage of OPMISR architecture is much higher possible scan compression (200X+) compared to XOR
        compression. OPMISR is also designed to reduce the effect of aliasing and correlation, improving compression
        efficiency. Debugging and diagnostics of silicon failures are possible in a single-pass operation. OPMISR compression
        lends itself well to larger designs and packages where more scan and test control pins are available.
        Hybrid compression
        Hybrid compression architecture merges the simplicity and easy debugging of XOR compression with the higher
        compression efficiency provided by OPMISR. For easier debugging and diagnostics during initial silicon bring-up,
        XOR compression mode is preferred. Once test plans are fairly stable and the design enters production flow, users
        can switch to OPMISR mode to take the 2X test time and test data volume reduction advantages.
        From a hardware implementation view, a multiplexer (MUX) must be added to switch between the MISR outputs
        driving the space compactor and the channel tails. The MUX selection can be controlled via a programmable test
        data register (TDR) during test mode setup.
        Any design implementing OPMISR compression can take advantage of hybrid architecture with trivial impact to
        implementation or test development.
        The advantage of serial MISR compression is that the signature can be read out at the end of all the patterns
        instead of on a per pattern basis. This architecture is suited for pin-limited designs that require higher compression
        and in production mode where failure observation can be postponed until the end of the test.
        SmartScan compression
        XOR and OPMISR compressions require a minimum number of scan input pins for the ATPG to perform well
        and provide a high compression efficiency. For low-pin-count test designs or for multi-site testing, SmartScan [3]
        architecture provides all the advantages of XOR compression with as few as a single scan input/output pair. In
        SmartScan mode, data is loaded onto a shift register called the deserializer and then applied to the decompressor.
        On the output side, the XOR compressed values are serially loaded on the serializer before being observed on a
        scan output pin. This architecture also includes a finite-state machine (FSM) controller as shown in Figure 6. When
        generating ATPG patterns, the deserializer/serializer is made transparent and the parallel pin interface into the XOR
        compressor is used. If the parallel pins are also available at the package level, SmartScan mode can be disabled
        during silicon debugging mode to access XOR compression directly via the pins.
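        The serial-to-parallel conversion can be sketched as follows. This is a behavioral illustration only: the
        function names are hypothetical, the 4-bit register width is an assumption, and the FSM controller and mask
        logic are not modeled.

```python
def deserialize(serial_bits, width):
    """Group a serial stream into width-bit words, one word per internal scan shift."""
    if len(serial_bits) % width:
        raise ValueError("stream length must be a multiple of the register width")
    return [serial_bits[i:i + width] for i in range(0, len(serial_bits), width)]

def serialize(parallel_words):
    """Unload each parallel word one bit at a time onto the serial output."""
    return [bit for word in parallel_words for bit in word]

serial_in = [1, 0, 1, 1, 0, 0, 1, 0]
words = deserialize(serial_in, width=4)  # what the decompressor inputs see
serial_out = serialize(words)            # what the scan output pin sees

print(words, serial_out)
```

        Each internal scan shift costs `width` serial clock cycles in this model, which is why a faster clock on the
        serial registers helps recover test time.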
        [Figure 6: SmartScan architecture. Labels: SERIAL_SCAN_IN, Deserializer (N-bit serially loaded flops),
        DECOMPRESSOR, Internal Channels, X-MASK, Mbs, Mask Enable, COMPRESSOR, Serializer (N-bit serially unloaded
        flops), SERIAL_SCAN_OUT]
        Figure 7 shows the comparison between FULLSCAN, single-scan pair XOR compression, and SmartScan for test
        data volume and test application time. Using a single pair for XOR compression achieves very low efficiency due to
        correlation and aliasing effects. With SmartScan, an 8-bit wide ATPG parallel interface is used to achieve the higher
        compression efficiency. Further test time savings can be obtained by using a faster test clock for the SmartScan
        registers while still using the slower scan clock for the design.
        SmartScan architecture is suitable for automotive and mixed-signal designs that are very pin-limited and for
        packages being targeted for multi-site testing. It can also be configured to provide a low-power scan shift
        operation and would be useful on low pin count test (LPCT) and low-power designs. Since it is based on XOR
        compression, validation and tester diagnostics are straightforward.
        Hierarchical compression
        When designs are very large and contain multiple IP blocks, hierarchical physical synthesis and implementation is
        the preferred approach. In this case, it would be quite difficult, and in some situations impossible, to implement a
        single compression logic for the design. Power, timing, routing, and area considerations would have a much bigger
        impact on DFT. A hierarchical approach to compression is therefore also desirable.
        In hierarchical compression architecture, multiple levels of compression are implemented. The lowest-level blocks
        would have scan channels, with compression logic placed around them. Many compressed blocks are then further
        compressed at the next level until the chip-level I/Os are accessible as shown in Figure 8. With this approach, test
        budget goals and physical parameters can be met at each of the compressed blocks, reducing the overall impact at
        the chip level.
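        The two-level idea can be sketched with plain XOR compaction at both levels. This is a deliberately simplified
        model: each block is reduced to a single output bit, and the block selectors, masking, and decompressor side
        are omitted. The function names are hypothetical.

```python
from functools import reduce
from operator import xor

def xor_compact(bits):
    """XOR a list of bits down to one bit."""
    return reduce(xor, bits, 0)

def hierarchical_compact(blocks):
    """blocks: one list of channel bits per block, all for the same scan slice."""
    block_outputs = [xor_compact(channel_bits) for channel_bits in blocks]
    chip_output = xor_compact(block_outputs)
    return block_outputs, chip_output

blocks = [[1, 0, 1], [0, 0, 1], [1, 1, 1, 0]]
block_outputs, chip_output = hierarchical_compact(blocks)

print(block_outputs, chip_output)
```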
        [Figure 8: Hierarchical compression. Labels: XOR Spreader/Broadcast, XOR/Broadcast (per block), Mask Load Bus,
        Compression Control, Block Selector (per block), Block Select 1, Block Select 2, Block Select 3, XOR Compactor]
        When a hierarchical methodology is implemented, two approaches to pattern generation are available. In the first
        approach, ATPG patterns can be generated and validated at the block level, but then are discarded. Instead, when
        the chip-level design is available, ATPG patterns are generated from the top, targeting one or more of the specific blocks and
        the glue logic.
        In the second approach, once the ATPG patterns are validated at the block level, they are migrated to the top level
        and made available for testing. Faults tested are marked off, and only the glue logic and the interconnect paths
        between the migrated blocks are tested at the chip level. The advantage of this approach is that the CPU
        requirements for chip-level ATPG are far lower.
        With hierarchical compression, the lower-level blocks can use either XOR or OPMISR compression. At the next level
        and beyond, only XOR compression is possible because Encounter Test does not allow one MISR to feed
        another MISR as part of the hierarchical flow. Because data paths from the scan channels to the top-level
        scan pins via XOR trees are long, pipelines are recommended to ensure scan timing goals are met. Pipelines placed
        between each level of hierarchy are known as embedded pipelines, and those placed at the scan I/Os are known as
        external pipelines.
        Hierarchical compression is suitable for large designs that use hierarchical methodologies. The choice
        of pattern generation would depend upon the rest of the DFT architecture, including IEEE 1500 usage and test
        partitioning plans.
        [Figure: SmartScan Methodology. Labels: Encounter Conformal LEC (formal verification), Incisive NcVerilog
        (validation of ATPG patterns)]
Conclusion
The various Encounter Test compression architectures from Cadence are designed to meet any design and test
criteria—small or large designs, simplicity, low pin count, higher compression or hierarchical implementations.
References
1. OPMISR Compression – Architecture, Insertion and ATPG Application Note
4. Integrating DFT during Synthesis Rapid Adoption Kit, available for download at Cadence Support
Cadence Design Systems enables global electronic design innovation and plays an essential role in the
creation of today’s electronics. Customers use Cadence software, hardware, IP, and expertise to design
and verify today’s mobile, cloud and connectivity applications. www.cadence.com
© 2015 Cadence Design Systems, Inc. All rights reserved. Cadence, the Cadence logo, Conformal, Encounter, and Incisive are registered trademarks
of Cadence Design Systems, Inc. All other trademarks are the property of their respective owners. 3818 01/15 SC/DM/PDF