
Ranger Goldstone Test PRD

The document outlines the product test requirements for the Ranger Goldstone Switchboard manufacturing and diagnostics. It includes detailed sections on nomenclature, document structure, test requirements, and definitions, along with a version history indicating updates and revisions. The purpose is to ensure that all necessary features and behaviors are defined and verified for product viability.

Uploaded by

Raviteja Patnam

Goldstone Switchboard Manufacturing Test Specification

Product Requirements Document


Version 1.0

NVIDIA Confidential

The controlled copy of this document resides in PDP and printed copies of it are for reference only. Printed Date: 9/28/2021 2:50 PM

Version History
Revised on | Revision | Description | Changed by
11/8/2022 | 0.1 | Document created | Jinshui Liu
1/3/2023 | 0.2 | Update document with most sections | Jinshui Liu
1/12/2023 | 0.3 | Added some test items, please see items in this shading color | Jinshui Liu
1/16/2023 | 1.0 | Added some test items, Submitted for Review & MODS/DIAG development | Jinshui Liu

Contents
1 PURPOSE
2 NOMENCLATURE
3 DOCUMENT STRUCTURE
4 REFERENCE DOCUMENTS
5 DEFINITIONS
6 RANGER SYSTEM & GOLDSTONE SWITCH BOARD OVERVIEW
7 TEST REQUIREMENTS
7.1 FRONT PANEL LED/BUTTON CHECK
7.2 OS
7.3 COME
7.4 LS10 NVLINK4 SWITCH IC
7.5 OSFP PORTS
7.6 COME & LS10 I2C TREE
7.7 GOLDSTONE SWITCHBOARD SENSORS
7.8 USB
7.9 COME UART PORT
7.10 GOLDSTONE SWITCH NODE PDB AND FANS
7.11 SYSTEM THERMAL & POWER STRESS TEST
7.12 PCIE I/O DEVICES
7.13 PCIE ENHANCEMENT
7.14 M.2 SSDS
7.15 FPGA/CPLD DEVICES
7.16 FRU EEPROM
7.17 EROT & EROT-PROTECTED SW/FW
8 SENSOR LIST
9 GOLDSTONE NODE I2C TREES
10 JTAG & BOUNDARY SCAN TEST
11 EQUIPMENT LIST
12 REFERENCES

Table of Figures
FIGURE 1. RANGER CHASSIS FRONT / REAR / SIDE VIEWS
FIGURE 2. RANGER CHASSIS CABLE BACKPLANE CARTRIDGE (CBC)
FIGURE 3. RANGER SYSTEM EXPLODED VIEW
FIGURE 4 GOLDSTONE SWITCH NODE AND SWITCHBOARD
FIGURE 5. GOLDSTONE SWITCHBOARD BLOCK DIAGRAM
FIGURE 6. GOLDSTONE SWITCHBOARD BLOCK DIAGRAM
FIGURE 7. GOLDSTONE NODE VS. GOLDSTONE SWITCHBOARD
FIGURE 8. 1-SLOT GOLDSTONE TESTER FOR MANUFACTURING TEST
FIGURE 9. DGX-GOLDSTONE SYSTEM INTEGRATION MFG TEST FLOW
FIGURE 10. DGX-GOLDSTONE BOM STRUCTURE (692-24262-0000-000)
FIGURE 11. GOLDSTONE SWITCHBOARD & KEYSTONE BASEBOARD TS1A BUILD PLAN
FIGURE 12. GOLDSTONE SWITCHBOARD FACEPLATE
FIGURE 13. NBU P2318 COME BLOCK DIAGRAM
FIGURE 14. COME DETAILED BLOCK DIAGRAM
FIGURE 15. COME TYPE 7 FORM-FACTOR
FIGURE 16. COME TYPE 7 2X220-PIN BOARD-2-BOARD CONNECTORS
FIGURE 17. LS10 NVLINK4 SWITCH IC EXTERNAL INTERFACES
FIGURE 18. GOLDSTONE SWITCHBOARD LS10 PORT ASSIGNMENT
FIGURE 19. TE 2344064-4 OSFP CONNECTOR ON GOLDSTONE
FIGURE 20. OSFP LOOPBACK DONGLE
FIGURE 21. GOLDSTONE SWITCHBOARD I2C TREE
FIGURE 22. SWITCHBOARD I2C DEVICES CONNECTED TO COME LPC2I2C INTERFACE
FIGURE 23. I2C DEVICES ON COME PCH SML1
FIGURE 24. I2C DEVICES & INTERFACES ON COME
FIGURE 25. USB 2.0 TYPE 2.0 PINOUT
FIGURE 26. GOLDSTONE SWITCHBOARD RS-232 RJ-45 UART PORT PINOUT
FIGURE 27. SWITCHBOARD - PDB - FANS INTERCONNECT
FIGURE 28. PCIE CONFIGURATION SPACE LAYOUT, 4KB
FIGURE 29. PCIE CAPABILITY REGISTERS (THE BASE ADDRESS IS TYPICALLY 0X70)
FIGURE 30. GOLDSTONE NODE PCIE DEVICES
FIGURE 31. PCIE LANE MARGINING AT RECEIVER REGISTERS
FIGURE 32. COME PCH FLEX I/O LANE MAPPING
FIGURE 33. SSD INTERFACE MARKET FORECAST (SOURCE: IDC, 2020.12)
FIGURE 34. EXAMPLE OF SMART LOG RETURNED BY NVME-CLI / NVME SMART-LOG
FIGURE 35. DIFFERENCES BETWEEN MACHX03D AND LCMX03D FAMILIES
FIGURE 36. MACHX03D FPGA CONFIGURATION PROCESS
FIGURE 37. MACHX03D FPGA CONFIGURATION PORTS (SYSCONFIG)
FIGURE 38. GOLDSTONE SWITCHBOARD IN-SYSTEM PROGRAMMING PATH
FIGURE 39. MACHX03D INTERNAL FLASH LAYOUT
FIGURE 40. MACHX03D FEATURE ROW ELEMENTS
FIGURE 41. MAIN AND PORT CPLDS CPU ACCESS PATHS
FIGURE 42. GOLDSTONE NODE'S EROT-PROTECTED AP FWS
FIGURE 43. EROT-PROTECTED AP-FW UPDATE FLOW WITH BMC
FIGURE 44. EROT/CEC1736 & EC-FW FLASH
FIGURE 45. GOLDSTONE SWITCHBOARD FW UPDATE PATHS
FIGURE 46. OSFP LOOPBACK DONGLE (2000-2250 MATING CYCLES)
FIGURE 47. EZDUPE M.2 NVME SSD DUPLICATOR (DM-HE0-8V07NTP)
FIGURE 48. PERLE 24-PORT RS-232 TERMINAL SERVER
FIGURE 49. M.2 NVME/SATA DUPLICATOR (PRODUPLICATOR.COM)
FIGURE 50. QSFP-112 PINOUT
FIGURE 51. QSFP-DD800 VS. QSFP112 PINOUT
FIGURE 52. QSFP-DD/QSFP-DD800 CONNECTOR
FIGURE 53. OSFP CONNECTOR & PINOUT IN GOLDSTONE
FIGURE 54. QSFP FROM QSFP+ TO QSFP-DD800
FIGURE 55. QSFP-DD MSA CONNECTOR PCB LAYOUT
FIGURE 56. QSFP-DD VS. OSFP
FIGURE 57. QSFP-DD VS. OSFP IN SIZES
FIGURE 58. CFP VS. QSFP VS. OSFP VS. QSFP-DD
FIGURE 59. QSFP-DD VS. OSFP FOR 400G
FIGURE 60. USB CONNECTOR PINOUT
FIGURE 61. SATA PINOUT

Table of Tables
TABLE 1 DOCUMENT REFERENCE
TABLE 2. FRONT PANEL TEST ITEMS
TABLE 3. OS VERSION
TABLE 4. COME B2B CONNECTOR INTERFACES
TABLE 5. COME TEST ITEMS
TABLE 6. LS10 TEST ITEMS
TABLE 7. OSFP PORTS NON-TRAFFIC TEST ITEMS
TABLE 8. GS SWITCHBOARD AND PDB I2C DEVICES TEST ITEMS
TABLE 9. COME & LS10 I2C DEVICES ON SWITCHBOARD & PDB
TABLE 10. GOLDSTONE SWITCHBOARD SENSOR TEST ITEMS
TABLE 11. USB TEST ITEMS
TABLE 12. BMC & CG1 CPU UARTS
TABLE 13. SUMMARY OF PDB I2C DEVICES (8-BIT I2C ADDRESS)
TABLE 14. SWITCHBOARD-PDB INTERCONNECT
TABLE 15. TEST ITEMS FOR NODE PDB AND FANS’ SIGNALS
TABLE 16. SYSTEM THERMAL & POWER STRESS TEST
TABLE 17 GOLDSTONE SWITCHBOARD PCIE DEVICES
TABLE 18. PCIE DEVICE GENERAL TEST ITEMS
TABLE 19. PCIE ENHANCEMENT TEST ITEMS
TABLE 20. M.2 SATA SSD TEST ITEMS (DEFAULT)
TABLE 21. M.2 & NVME SSD TEST ITEMS (ONLY APPLICABLE WITH M.2 NVME SSD, NOT USED IN CURRENT VERSION)
TABLE 22. MAIN CPLD TEST ITEMS
TABLE 23. FRU EEPROM TEST ITEMS
TABLE 24. EROT TEST ITEMS
TABLE 25 GOLDSTONE NODE SENSORS
TABLE 26. QSFP COMPARISON

1 Purpose
This document defines product test requirements for Ranger Goldstone Switchboard manufacturing testing & diagnostics.

2 Nomenclature
Requirements are numbered with IDs that are roughly, but not strictly, sequential. The terms “shall,” “should,” and “may” are used
interchangeably to specify requirements; the term “will” is not used. Rather than affixing a special meaning to each of these terms,
each statement carries one of seven descriptors (the requirement type) that indicates its importance or role. See below.

MANDATORY: Defines features or behaviors which must be included for product viability. These requirements must
be implemented and verified for product release.
REQUIRED: Defines features or behaviors which are intended to be included in the product design, for which both
resources and schedule are to be allocated for implementation, but which may be omitted from a particular
product release for budgetary, scheduling, technical, or commercial reasons.
DESIRED: Defines features or behaviors which are desirable, but not necessary for product use, maintenance,
marketing, or manufacture, and for which no significant resources or schedule allocation are intended during
product development; these requirements are sometimes described as “optional” or “nice-to-have”.
PERFORMANCE: Defines characteristics for which no narrow boundary exists between acceptable and
unacceptable behavior; this type of requirement defines characteristics that are sometimes referred to as
“goals”. These requirements indicate a target performance level and should be treated as aspirational, rather
than as prescriptive. In general, PERFORMANCE indicates that improved performance equates to improved
marketability or usability, and the implicit design requirement is “best possible performance within reasonable
bounds for budget and schedule”. Performance characteristics must be characterized and reported as part of
verification and any shortfalls with respect to the specification must be reviewed and approved prior to product
release.
EXPECTED RESULTS: Defines what the test outcome should be or what the operator needs to look for.
LIMITS: Defines the test limits if any.
GUIDANCE: Defines information that is provided only for clarification, to provide context or justification for other
requirements, as guidance to the system developer, as guidance to the project manager, or to indicate planned
design or implementation strategies; this information is not to be treated as a verifiable requirement. This
designator is provided to ensure there is no ambiguity when guidance statements appear interspersed with
requirements or appear in a form that might be misinterpreted as a verifiable requirement.

3 Document structure
The requirements are structured in sections and presented in tables. Sections can be organized by category (inventory,
functional, stress, monitor), by component, by functional area, or in any other way that is natural to the product. The suggested
requirement enumeration/naming uses three capital letters representing the section, followed by three digits. This creates
separate enumerated lists of requirements that are easy to manage and allows requirements to be added and deleted as the
understanding of the product increases. The number sequence can be strictly sequential or can have gaps, allowing for insertion
of new requirements into the sequence. The goal is to name requirements, not to maintain an ordered sequence of
requirements.
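
The naming convention above lends itself to a mechanical check. The following is a minimal sketch, assuming requirement IDs of the form `<PRODUCT>-<SECTION><NNN>` as they appear in the test tables of this document (for example GOLDSTONE-FP000 and GOLDSTONE-COME001). Note that the section codes used in those tables are two to four letters long rather than strictly three, so the pattern allows that range. The regular expression, the `ReqId` type and the `parse_req_id` helper are illustrative names, not part of any existing tool.

```python
import re
from typing import NamedTuple

# Assumed pattern: product prefix, hyphen, 2-4 capital letters, 3 digits.
REQ_ID = re.compile(r"^(?P<product>[A-Z]+)-(?P<section>[A-Z]{2,4})(?P<number>\d{3})$")


class ReqId(NamedTuple):
    product: str
    section: str
    number: int


def parse_req_id(text: str) -> ReqId:
    """Parse a requirement ID, ignoring whitespace introduced by line wrapping."""
    compact = "".join(text.split())
    match = REQ_ID.match(compact)
    if match is None:
        raise ValueError(f"not a requirement ID: {text!r}")
    return ReqId(match["product"], match["section"], int(match["number"]))


if __name__ == "__main__":
    print(parse_req_id("GOLDSTONE-FP000"))
    print(parse_req_id("GOLDSTONE-COME001"))
```

Because the goal is to name requirements rather than to keep them in order, the parser deliberately does not check that numbers are contiguous.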

The body of the requirements shall have a “Name:”, which is a short name used for the requirement, followed by one or more
<Requirement Type> stating the requirement. See the following sections for further guidance and examples.

4 Reference Documents
Table 1 Document Reference

Doc # | Document Description | Document Link
01 | OPQ71045 Standard Specifications and / or Requirements Template |
02 | |
03 | |
04 | Ranger HW PAS | https://docs.google.com/document/d/1LpuP0pVP6BffiIc1RZFdzUl1wVtJMkJvhRhU30J_YC8/edit?usp=share_link
05 | CMIS Common Management Interface Specification | http://www.qsfp-dd.com/wp-content/uploads/2021/05/CMIS5p0.pdf

5 Definitions
For the purposes of this document, the following definitions apply:

Acronym | Definition
ODM | Original Design Manufacturer
CM | Contract Manufacturer
NVEX | NVIDIA Enterprise Experience
SPE | Server Product Engineering
WWFO | Worldwide Field Ops
SN | Serial Number
FRU | Field Replaceable Unit
CRU | Customer Replaceable Unit
BOM | Bill of Materials
BMC | Baseboard Management Controller
COMe | Computer on Module Express, a computer module form factor defined by PICMG, https://en.wikipedia.org/wiki/COM_Express
CBC | Cable Backplane Cartridge
FCT-B | Functional Test at Bench / Assembly Line
FCT-R | Functional testing at Rack
FLA | Flashing, firmware programming
RIN | Run-in testing or Stress testing
FIN | Final testing

Priority | Description
Must have - P0 | Critical for the release and delivery target date. The release must be delayed if any of the requirements marked P0 are missing or incomplete.
Should have - P1 | Important but not necessary for the delivery target date. While these requirements can be equally important as P0, they are often not as time-critical or may be satisfied through other means so they can be postponed to a later time.
Could have - P2 | Wanted or desired but not necessary. They may improve user experience and customer satisfaction and will typically be included if time and resources permit.
Won't have - P3 | Agreed by stakeholders to not be scheduled for the release. These requirements are not planned into the schedule for release and are dropped or may be reconsidered for inclusion in a later release or target date.
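
The P0 rule (the release must be delayed if any P0 requirement is missing or incomplete) can be expressed as a simple gate. This is a minimal sketch only: the `Priority` enum and `release_blockers` function are hypothetical names, and the completion flags in the usage example are made up for illustration.

```python
from enum import IntEnum
from typing import Iterable, List, Tuple


class Priority(IntEnum):
    P0 = 0  # Must have
    P1 = 1  # Should have
    P2 = 2  # Could have
    P3 = 3  # Won't have


def release_blockers(requirements: Iterable[Tuple[str, Priority, bool]]) -> List[str]:
    """Return the IDs of P0 requirements that are not yet complete.

    Each item is (requirement ID, priority, completed flag).
    """
    return [req_id for req_id, priority, done in requirements
            if priority == Priority.P0 and not done]


if __name__ == "__main__":
    # Completion flags below are illustrative, not real status.
    status = [
        ("GOLDSTONE-OS000", Priority.P0, False),
        ("GOLDSTONE-FP000", Priority.P1, False),
        ("GOLDSTONE-COME009", Priority.P0, True),
    ]
    print(release_blockers(status))
```

An empty result means no P0 item blocks the release; incomplete P1 to P3 items never appear in the list.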

6 Ranger System & Goldstone Switch Board Overview
The term “Ranger System” can refer to a Ranger Cluster (or Ranger POD), a Ranger Rack, or a Ranger Chassis:
✔ Ranger Chassis: a 15-OU height chassis with 8x Keystone Compute Nodes (Compute Node) and 3x Goldstone NVLink4 Switch
Nodes (Switch Node) that are interconnected with a cable backplane (called the Cable Backplane Cartridge, CBC). There is no
dedicated management node or Power Supply Unit (PSU) in the chassis. Only 54VDC is delivered to the Compute Nodes and
Switch Nodes via a bus bar from the PSU Node external to the Ranger Chassis.
✔ Ranger Rack: a 21” wide OCP rack with 1-2 Ranger Chassis, a PSU Node, and TOR MGNT & In-Band Ethernet / IB
switches; a bus bar connects the Ranger Chassis with its PSU Node.
✔ Ranger Cluster / POD: a system with multiple Ranger Racks interconnected with Kong NVLink4 Switches and Ethernet / IB
switches for GPU scale-up applications.
✔ Cable Backplane Cartridge: a backplane, formed with cables (not PCB traces), that connects the NVLink4 links between the
Compute Nodes and the Switch Nodes within the Ranger Chassis. There is no interconnect between the Switch Nodes.

Figure 1. Ranger Chassis Front / Rear / Side Views

The Strada Whisper Absolute backplane connectors from TE are used for the Ranger backplane NVLink4 interconnect:
✔ Goldstone Switchboard Node: the 8-Pair x 12 connector (TE P/N 2416358) is used; 2 pcs provide a total of 192 differential pairs (96 lanes),
that is, 12 lanes (6 NVLink4 links / ports) per Keystone Compute Node.
✔ Keystone Compute Node: the 4-Pair x 9 connector (TE P/N 2416357) is used; 2 pcs provide 72 differential pairs (36 lanes)
for NVLink4 connections with the Goldstone Nodes.

The Goldstone Node provides NVLink4 protocol switching between the Keystone Compute Nodes for scale-up GPU-accelerated
computing, such as ultra-large AI models that use model parallelism (rather than data parallelism on a scale-out GPU cluster interconnected
with IB or RDMA Ethernet switching). These ultra-large AI models are too large to fit into a single node’s GPU
HBM2E/HBM3 memory.

The 699- P/N of Goldstone Switchboard is 699-24262-0000-000, and the P/N for the Switchboard PCBA to be built and tested
at FXN SJ is 692-24262-0000-000.

The Goldstone Node consists of the following modules and components:
✔ The Goldstone Switchboard with 2 pcs LS10 NVLink4 switching devices and one COMe module for control and management;
this is what NVDA is responsible for manufacturing and testing.
✔ The PDB board that converts the 54VDC input to the voltages required by the Goldstone Switchboard
✔ Cables used on the Goldstone Node
✔ The Goldstone Node mechanical tray.

Please refer to Figure 4 Goldstone Switch Node and Switchboard for Goldstone Switch node layout and Figure 5. Goldstone
Switchboard Block Diagram for schematics block diagram.

Each LS10 provides 64 NVLink4 links/ports with 2 lanes per link/port. In the Goldstone design, only 56 of the 64 ports are
used: 24 links/ports for NVLink4 interconnects via the backplane and 32 links/ports via 8x OSFP connectors on the front panel.
With 2 LS10 devices per Switchboard, this gives:
✔ 112 NVLink4 ports in total per Goldstone Switchboard across the OSFP and backplane connectors
✔ 64 NVLink4 ports via OSFP for interconnect to Kong Switches
✔ 48 NVLink4 ports via backplane connectors for intra-chassis interconnects with the 8x Keystone Compute Nodes.
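
The board-level totals follow directly from the per-LS10 figures (24 backplane links plus 32 OSFP links on each of the 2 LS10 devices). The sketch below works them out; the `PortBudget` class and its field names are illustrative only. On these figures the backplane total comes to 48 links, which also matches the 6 links per Compute Node times 8 Compute Nodes given for the backplane connectors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortBudget:
    ls10_per_board: int = 2
    backplane_links_per_ls10: int = 24
    osfp_links_per_ls10: int = 32
    osfp_cages_per_ls10: int = 8
    compute_nodes: int = 8

    @property
    def used_links_per_ls10(self) -> int:
        return self.backplane_links_per_ls10 + self.osfp_links_per_ls10

    @property
    def board_links(self) -> int:
        return self.used_links_per_ls10 * self.ls10_per_board

    @property
    def board_osfp_links(self) -> int:
        return self.osfp_links_per_ls10 * self.ls10_per_board

    @property
    def board_backplane_links(self) -> int:
        return self.backplane_links_per_ls10 * self.ls10_per_board

    @property
    def links_per_osfp_cage(self) -> int:
        return self.osfp_links_per_ls10 // self.osfp_cages_per_ls10

    @property
    def backplane_links_per_compute_node(self) -> int:
        return self.board_backplane_links // self.compute_nodes


if __name__ == "__main__":
    budget = PortBudget()
    print("Used links per LS10:", budget.used_links_per_ls10)
    print("Links per board:", budget.board_links)
    print("OSFP links per board:", budget.board_osfp_links)
    print("Backplane links per board:", budget.board_backplane_links)
    print("Links per OSFP cage:", budget.links_per_osfp_cage)
```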

The Goldstone Switchboard itself is an orderable product for OEM parts, and the BOM structure is shown in Figure 10.

Figure 2. Ranger Chassis Cable Backplane Cartridge (CBC)

Figure 3. Ranger System Exploded View

Figure 4 Goldstone Switch Node and Switchboard

Figure 5. Goldstone Switchboard Block Diagram

Figure 6. Goldstone Switchboard Block Diagram

7 Test Requirements
The NVBUGS for Goldstone/DGX-Goldstone MODS Requirements is Bug 3902273.
These requirements encompass the Ranger Goldstone Switchboard test requirements. The overall Goldstone Switchboard
integration manufacturing test flow is shown in Figure 9. Please note that NVDA is only responsible for manufacturing and
testing the Goldstone Switchboard, not the whole Goldstone Node; the following modules / components are therefore golden parts
for Goldstone Switchboard manufacturing at the CM (FXN SJ):
● Mechanical tray with opening for JTAG header access for Boundary Scan testing (BSI)
● PDB: power distribution board, from ZT
● Cable set: cables for Goldstone Node internal connections: Power cable, Fan cables, etc.
● OSFP Loopback Dongles: a total of 16 pcs is required per board, and the dongles are reused from board to board; this corresponds to 96 pcs per 2-chassis Rack.
● 1-Slot Goldstone Tester: 1-slot Goldstone Test chassis with a 3KW PSU, six cooling fans and 2x loopback cables for
Goldstone Switchboard backplane interface, as shown in Figure 8.

Figure 7. Goldstone Node vs. Goldstone Switchboard

Figure 8. 1-Slot Goldstone Tester for Manufacturing Test

Figure 9. DGX-Goldstone System Integration MFG Test Flow

The DGX-Goldstone BOM structure is shown in Figure 10, indicating that the Dragon Chassis (w/ PSUs) is a DOM subsystem.
If DGX-Goldstone is not built at the factory, the flow shown in Figure 9 could be executed at the lab or data center
where the DGX-Goldstone integration is done, and the Run-in and Power cycling tests could be waived.

Figure 10. DGX-Goldstone BOM Structure (692-24262-0000-000)

Figure 11. Goldstone Switchboard & Keystone Baseboard TS1A Build Plan

7.1 Front Panel LED/Button Check
The Goldstone Switchboard/Node faceplate is shown in Figure 12. The following items are on the faceplate:
● One RJ45 for COMe 1000Base_T LAN Port: mounted on Switchboard PCB
● One RJ45 for COMe UART RS232 Port: mounted on Switchboard PCB
● One USB 2.0 Type A Port: Mounted on Switchboard PCB
● 16x OSFP Ports: Mounted on Switchboard PCB, See Sections 7.4 and 7.5 for details.
The following items are also on front panel but are mounted on a Front Panel PCB that is connected to the Switchboard
PCBA via a cable:
● One Power Push Button & LED: Mounted on Front Panel PCB and connected to Switchboard via a cable
● One UID Push Button & LED: Mounted on Front Panel PCB and connected to Switchboard via a cable
● One Reset Push Button: Mounted on Front Panel PCB and connected to Switchboard via a cable
● One System Status LED: Mounted on Front Panel PCB and connected to Switchboard via a cable
● One System Fault LED: Mounted on Front Panel PCB and connected to Switchboard via a cable
● Two NVLink Status LEDs: Mounted on Front Panel PCB and connected to Switchboard via a cable

In addition, a 7-Segment Display is on the front panel, connected via a cable to the Switchboard PCBA.

These items and the cable are golden parts, but to test related circuits and connections on the Switchboard, the
following tests are required.

Figure 12. Goldstone Switchboard Faceplate

Table 2. Front Panel Test Items

Front Panel LED/Button Connection Check

Req-#: GOLDSTONE-FP000
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: MFG Diag; Production Stage: FCT-B
Requirement:
Name: Power Button & Power Status LED Connections
Location: Power button & LED at Right Front Panel.
Mandatory: Shall check power button & Power Status LED signal path connections.
Expected Result: Pressing the button powers the system ON/OFF and the Power LED turns ON/OFF.
Guidance: Manually press the power button and observe the Power LED status change.

Req-#: GOLDSTONE-FP001
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: N/A - One Diag cannot implement manual checks; Production Stage: FCT-B
Requirement:
Name: UID Button & UID LED Connections
Location: UID button & LED at Right Front Panel.
Mandatory: Shall check UID Button and UID LED signal path connections.
Expected Result: Default status: blue color off. Identify server location: blue color flashing at 1Hz.
Guidance: Manual Check. Controlled by Main CPLD

Req-#: GOLDSTONE-FP002
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: N/A - One Diag cannot implement manual checks; Production Stage: FCT-B
Requirement:
Name: Reset Button Connection
Location: Reset Button at Right Front Panel.
Mandatory: Shall check Reset Button signal path connection.
Expected Result: System should reset once the reset button is pressed & released.
Guidance: Manual Check. Reset signal to Main CPLD

Req-#: GOLDSTONE-FP003
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: N/A - One Diag cannot implement manual checks; Production Stage: FCT-B
Requirement:
Name: System Status LED Connection
Location: LED at Right Front Panel.
Mandatory: Shall check System Status LED signal path connection.
Expected Result: System Status LED should be ON/OFF as specified by Main CPLD Spec.
Guidance: Manual Check. Driven by Main CPLD

Req-#: GOLDSTONE-FP004
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: N/A - One Diag cannot implement manual checks; Production Stage: FCT-B
Requirement:
Name: System Fault LED Connection
Location: LED at Right Front Panel.
Mandatory: Shall check System Fault LED signal path connection.
Expected Result: System Fault LED should be ON/OFF as specified by Main CPLD Spec.
Guidance: Manual Check. Driven by Main CPLD

Req-#: GOLDSTONE-FP005
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: N/A - One Diag cannot implement manual checks; Production Stage: FCT-B
Requirement:
Name: NVLink LED Connection (2x LEDs)
Location: LED at Right Front Panel.
Mandatory: Shall check NVLink LED signal path connections.
Expected Result: The NVLink LEDs should be ON/OFF as specified by Main CPLD Spec.
Guidance: Manual Check. Driven by Port CPLD

Req-#: GOLDSTONE-FP006
Test Method: Manual; Priority: P1 EVT; Nvbug tracking: N/A - One Diag cannot implement manual checks; Production Stage: FCT-B
Requirement:
Name: 7-Segment Display Connection
Location: at left Front Panel.
Mandatory: Shall check 7-Segment Display signal path connections.
Expected Result: The 7-Segment Display should work as specified by Main CPLD Spec.
Guidance: Manual Check. Driven by Main CPLD

7.2 OS
Table 3. OS Version

Req-#: GOLDSTONE-OS000
Test Method: Auto; Priority: P0 EVT; Nvbug tracking: Same as GOLDSTONE-COME001; Production Stage: FCT-R
Requirement:
Name: Test OS version check
Mandatory: Shall verify the test OS version.
Expected Result: OS version should match the pre-defined OS version.
Guidance: Goldstone MFG Diag uses OPT_OS and the shipping OS is NVOS.
NVOS: NVIDIA Networking OS, formerly known as MLNX-OS.
MLNX-OS provides a full suite of management options, including
support for UFM® (Unified Fabric Manager), SNMPv1, 2, 3, and web
user interface (Web UI). In addition, it incorporates a familiar
industry standard CLI, which enables administrators to easily
configure and manage the system.
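
An automated OS version check of this kind reduces to comparing reported version fields against a pre-defined value. The sketch below is an illustration only: it assumes the test OS exposes a standard `/etc/os-release` style file, which this document does not state, and the `ID` and `VERSION_ID` values in the usage example are placeholders rather than the real OPT_OS or NVOS values. Function names are hypothetical.

```python
from typing import Dict


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines in os-release format into a dictionary."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def os_version_matches(text: str, expected_id: str, expected_version: str) -> bool:
    """Return True when both the OS ID and version match the expected values."""
    fields = parse_os_release(text)
    return (fields.get("ID") == expected_id
            and fields.get("VERSION_ID") == expected_version)


if __name__ == "__main__":
    # Placeholder content; a real test would read the file from the unit under test.
    sample = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="22.04"\n'
    print(os_version_matches(sample, "ubuntu", "22.04"))
```

Keeping the parser separate from the comparison lets the same log line record both the observed and the expected version when the check fails.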

7.3 COMe
Goldstone Switchboard uses a COMe module as the control and management CPU. The COMe used is developed by
NBU and has the following features:
● Shared folder: CFL, E3684
● Product Code: P2318
● Form-Factor: COMe Type 7, Basic form-factor of 95x125mm
● CPU: Intel Coffee Lake H (CFL-H), I3-8100H, 4-Core, 45W TDP. ECC DRAM is supported.
● Memory: up to 2x 260-Pin ECC SO-DIMMs are supported; only 1 pc 8GB DDR4-2666 SO-DIMM is used.
● Connector Type: TE 3-1827231-6, 5mm, 2x220-pin, 0.5mm pitch (the COM Express Type 6 / 7 standard
connector), as one connector, as shown in Figure 16. COMe Type 7 2x220-Pin Board-2-Board Connectors.

Figure 13. NBU P2318 COMe Block Diagram

Figure 14. COMe Detailed Block Diagram
For Goldstone Switchboard, the COMe is a tested Finished Good (FG), so there is no need to run a full functional test during the
Goldstone Switchboard manufacturing test. A sanity test is still required to catch component damage and defects introduced during
transportation and storage, as well as to produce test logs.
As shown in Figure 13. NBU P2318 COMe Block Diagram, the COMe board has the following features:
• CPU: I3-8100H, 4 cores, 6MB cache, 3.0GHz base frequency, 45W TDP

• PCH: Intel FH82CM246 PLATFORM CONTROLLER HUB (PCH)
• DDR Configuration:
✔ 2 channels, One SO-DIMM per channel.
✔ Each channel: up to 2666 MT/s, 16GB, DDR4, 1.2V, ECC SO-DIMM
• 256Mb SPI flash for BIOS code (Known Good Image)
• TPM2.0 (SPI bus)
• Die Temperature monitoring mechanism from carrier
• Real Time clock mechanism based on external feed
• COM Express Module VPD/FRU EEPROM
• Voltage monitoring with A2D.
• Interrupt controller (In CPLD)
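
A COMe sanity check of the CPU attributes listed above can be as small as parsing the processor information reported by the OS. The sketch below assumes a Linux test OS that exposes `/proc/cpuinfo`, and assumes that the 4-core I3-8100H presents 4 logical CPUs (that is, no hyper-threading); neither assumption comes from this document. The sample text is illustrative, and the function names are hypothetical.

```python
from typing import Set, Tuple


def summarize_cpuinfo(text: str) -> Tuple[Set[str], int]:
    """Return (distinct model names, logical CPU count) from cpuinfo-style text."""
    models: Set[str] = set()
    logical_cpus = 0
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key == "processor":
            logical_cpus += 1
        elif key == "model name":
            models.add(value.strip())
    return models, logical_cpus


def cpu_matches(text: str, expected_model: str = "i3-8100H",
                expected_logical_cpus: int = 4) -> bool:
    """Check that exactly one CPU model is present and it matches expectations."""
    models, logical_cpus = summarize_cpuinfo(text)
    if logical_cpus != expected_logical_cpus or len(models) != 1:
        return False
    return expected_model.lower() in next(iter(models)).lower()


if __name__ == "__main__":
    # Illustrative sample; a real test would read /proc/cpuinfo on the COMe.
    sample = "".join(
        f"processor\t: {i}\nmodel name\t: Intel(R) Core(TM) i3-8100H CPU @ 3.00GHz\n\n"
        for i in range(4)
    )
    print(cpu_matches(sample))
```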

The COMe 2x220-Pin B2B connector provides the interfaces to Goldstone Switchboard as shown in Table 4.

Table 4. COMe B2B Connector Interfaces
B2B Interface | Device on COMe | Device on Goldstone Switch Board
PCIe Gen3 X16: Lanes PEG[31:16] | CPU: Intel I3-8100H | PEG[23:16] & PEG[31:28] not used; PEG[27:24] for M.2 NVMe SSD (X4). See M.2 NVMe SSD Section
PCIe Gen3 X15: Lanes PCIe[15:8][6:0] | PCH: Intel FH82CM246 | PCIe[15:8] & [6:4] not used, PCIe[1:0] for LS1, PCIe[3:2] for LS2
1x PCIe clock | PCH: Intel FH82CM246 | PCIE_REFCLK_B2B, to M.2 and LS10
1x optional PCIe clock | PCH: Intel FH82CM246 | As Recovery clock, but NC on COMe
1GbE MDI: MDI[3:0]P/N; ACT, LINK, LINK100, LINK1000, CTREF (not used) | Intel I219 1000Base-T Ethernet PHY | J101 RJ45 w/ transformer. ACT, LINK, LINK100, LINK1000 to Main CPLD and Main CPLD to drive RJ45 ACT & LINK LEDs
2x SATA 3.0 | PCH: SATA Port 0 & SATA Port 1 | Only SATA Port 0 is used for M.2 SATA SSD
2x UART (TX/RX only) | PCH: |
4x USB3.1 | PCH: USB 3.1 Ports 1, 2, 3, 4 | Not used in Goldstone
4x USB 2.0 | PCH: USB 2.0 Ports 1-4 (total 14 for PCH) | Only USB 2.0 Port 1 is used in Goldstone for Front Panel USB 2.0 Port. See USB Section.
1x LPC bus | PCH: LPC, 9 signals | Main CPLD internal registers for COMe
I2C Master from PCH (optional): SMB-CLK/DAT | PCH: SML1: I2C_B2B_SMB_SCL/SDA | I2C_TESTING_GPIO_SCL/SDA,
I2C Master from CPLD (LPC2I2C) (I2C-B2B-SCL/SDA) | CPLD: I2C-CPLD-B2B-SCL/SDA | B2B: I2C-B2B-SCL/SDA >> I2C-SW-SCL/SDA >> U66/U67 >> Many I2C devices on Switchboard
Optional Carrier BMC I2C bus on SMBus pins | PCH: | N/A as no BMC on Switchboard
SPI0 for Programming | MUX with PCH to program SBIOS Flash: U23 > U53 (MT25QL256) | Connector J60 for use with an SPI programmer
Optional GSPI0 for future usage | PCH: GSPI0, 4 bits: CEX_EROT_OOB_* | LS10 CEC1736 QSPI1 for EC-FW & AP-FW update, selected w/ 1-to-2 MUX controlled by Main CPLD
CPLD field upgrade via B2B JTAG with fail safe capabilities | |
CPLD, CPU and PCH JTAG for testing | |
5VSB and 12V | |
3V RTC supply: 3V from Battery supplied by carrier | |
4 GPI and 4 GPO (CPLD Field Upgrade by SW based on GPO/GPI pins) | CPLD GPIO_JTAG, SW drives GPIOs for JTAG to update GS CPLD FWs | GS Switchboard Main CPLD and Port CPLD JTAG interface
COM Express spec Misc. signals | 1. TYPE2-0: COMe Type | 1. Tied to GND on Switchboard
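
The PCIe lane assignments in Table 4 can be checked for completeness: every lane on the B2B connector should be either assigned to exactly one device or marked unused. The sketch below encodes the two PCIe rows of the table and verifies that the assignments partition the available lanes. The data layout and helper names are illustrative; only the lane ranges come from the table.

```python
from typing import Dict, Set, Tuple


def lanes(high: int, low: int) -> Set[int]:
    """Expand a [high:low] bit-range style lane range into a set of lane numbers."""
    return set(range(low, high + 1))


# CPU PEG port: PEG[31:16]
PEG_LANES = lanes(31, 16)
PEG_USAGE = {
    "not used": lanes(23, 16) | lanes(31, 28),
    "M.2 NVMe SSD (x4)": lanes(27, 24),
}

# PCH port: PCIe[15:8] and PCIe[6:0]
PCH_LANES = lanes(15, 8) | lanes(6, 0)
PCH_USAGE = {
    "not used": lanes(15, 8) | lanes(6, 4),
    "LS1": lanes(1, 0),
    "LS2": lanes(3, 2),
}


def check_partition(available: Set[int],
                    usage: Dict[str, Set[int]]) -> Tuple[Set[int], Set[int]]:
    """Return (unassigned lanes, lanes assigned but not available).

    Raises ValueError if two entries claim the same lane.
    """
    seen: Set[int] = set()
    for name, assigned in usage.items():
        overlap = seen & assigned
        if overlap:
            raise ValueError(f"{name} overlaps on lanes {sorted(overlap)}")
        seen |= assigned
    return available - seen, seen - available


if __name__ == "__main__":
    print("PEG:", check_partition(PEG_LANES, PEG_USAGE))
    print("PCH:", check_partition(PCH_LANES, PCH_USAGE))
```

Two empty sets per port mean the table accounts for every lane; the PCH set also has 15 members, consistent with the "X15" label.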

Table 5. COMe Test Items

Test Priorit Nvbug Production


Requirement
Method y tracking Stage
Req-#

GOLDSTO Name: COMe Reset by Reset Button Manual P1 N/A - One FCT-B
NE-COME Location: Reset Button on front panel. EVT Diag cannot
000 Mandatory: Shall check COME reset button function. implement
Expected Result: COME reset after button is pressed. manual
Guidance: Front panel S6, Manual Check. As Part of front panel checks
Reset Button test

GOLDSTO Name: COME CPU Version Auto P1 Jinshui L… FCT-R


NE-COME Mandatory: Shall check COME CPU Version EVT to file and
001 Guidance: COMe I3-8100H post bug #
for all
Goldstone
attributes in
inventory

GOLDSTO Name: COME CPU FW Version Auto P1 Same bug # FCT-R


NE-COME Mandatory: Shall check COME CPU FW Version EVT as
002 Guidance: SPI Flash attached to PCH for BIOS GOLDSTONE-
COME001

GOLDSTO Name: COME CPU SO-DIMM Quantity & Capacity Auto P1 Same bug # FCT-R
NE-COME Mandatory: Shall check COME SO-DIMM Quantity & Capacity EVT as
003 Guidance: GOLDSTONE-
COME001

Req-#: GOLDSTONE-COME004
Name: COMe CPU Memory Test
Mandatory: Shall scan COMe CPU SO-DIMM memory and report ECC status
Guidance:
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui L… to file and post bug #
Production Stage: FCT-R

Req-#: GOLDSTONE-COME005
Name: COMe LAN Port MAC address
Mandatory: Shall collect & report COMe LAN Port MAC addresses.
Guidance: COMe U9, I219-AT
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Same bug # as GOLDSTONE-COME001
Production Stage: FCT-B

Req-#: GOLDSTONE-COME006
Name: COMe LAN Port Link Speed Check
Mandatory: Shall check link speed of COMe LAN Port
Guidance: COMe U9, I219-AT
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Kong-like coverage
Production Stage: FCT-B

Req-#: GOLDSTONE-COME007
Name: COMe LAN Port Link Up LED
Location: please refer to Figure 12 for location
Mandatory: Shall check LED status of COMe LAN port.
Expected Result: The Link Up LED should be on / GREEN when connected to a TOR or Ethernet Switch
Guidance: Manual Check
Test Method: Manual
Priority: P1 EVT
Nvbug tracking: N/A - One Diag cannot implement manual checks
Production Stage: FCT-B

Req-#: GOLDSTONE-COME008
Name: COMe LAN Port Activity LED
Location: please refer to for location
Mandatory: Shall check LED status of COMe LAN port.
Expected Result: The Activity LED should be blinking when connected to a TOR or Ethernet Switch
Guidance: Manual Check
Test Method: Manual
Priority: P1 EVT
Nvbug tracking: N/A - One Diag cannot implement manual checks
Production Stage: FCT-B

Req-#: GOLDSTONE-COME009
Name: COMe LAN Port Connection
Mandatory: Shall check COMe LAN Port network connection by pinging a known server at the factory.
Guidance: Ping a known server at the factory
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Jinshui L… to file and post "DGX Diag Tools" Bug
Production Stage: FCT-R

Req-#: GOLDSTONE-COME010
Name: COMe LAN Port MISC Signals Checking
Mandatory: Shall check the ACT, LINK, LINK100 & LINK1000 Signals from COMe to Switchboard Main CPLD.
Expected Result: With LAN Port connected to an external Working Switch, ACT, LINK and LINK1000 should be asserted, LINK100 deasserted.
Guidance: Please see Switchboard Main CPLD Register Definition
Test Method: Auto
Priority: P1
Nvbug tracking: FA enhancement, lower priority request. Jinshui L… to provide what cable is used for manufacturing test to know which signal to check
Production Stage: FCT-R

Req-#: GOLDSTONE-COME011
Name: COMe LPC Bus Checking
Mandatory: Shall check COMe LPC works with Switchboard Main CPLD successfully and reliably.
Method: Read and Write Main CPLD Registers via LPC interface
Guidance: Please see Switchboard Main CPLD Register Definition
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Jinshui L… to file and post bug #. Jinshui L… to specify which registers to check
Production Stage: FCT-R

Req-#: GOLDSTONE-COME012
Name: COMe-Main CPLD SYNC Bus Checking
Mandatory: Shall check COMe - Switchboard Main CPLD SYNC Bus works successfully and reliably.
Method: Please see COMe CPLD and Switchboard Main CPLD Specs
Guidance: B2B Connector Pins A15, A18, A24, B18
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui L… to clarify requirements before filing bug - One Diag team needs more details on how exactly to test SYNC bus and assess effort
Production Stage: FCT-R

Req-#: GOLDSTONE-COME013
Name: Switchboard 3V RTC Power Checking
Mandatory: Shall check the 3V RTC Power from Switchboard is Present.
Method: Please see COMe CPLD Spec
Guidance: B2B Connector Pin A47
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui L… to file and post "DGX Diag Tools" bug
Production Stage: FCT-R
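
As an illustration of the GOLDSTONE-COME010 pass criterion, the following minimal sketch compares sampled LAN sideband signals against the expected states listed above. How the signals are read from the Main CPLD is not covered here; the `observed` dictionary and the function name are hypothetical stand-ins for whatever the diag tool decodes from the CPLD registers.

```python
# Expected COMe LAN sideband states with the port linked to a working switch,
# taken from the GOLDSTONE-COME010 expected result. True means asserted.
EXPECTED_LAN_SIGNALS = {
    "ACT": True,
    "LINK": True,
    "LINK1000": True,
    "LINK100": False,
}


def check_lan_signals(observed):
    """Return a list of (signal, expected, observed) tuples for every mismatch.

    `observed` is a hypothetical dict of signal name -> bool, e.g. decoded
    from the Main CPLD status register by the diag tool.
    """
    mismatches = []
    for name, expected in EXPECTED_LAN_SIGNALS.items():
        actual = observed.get(name)  # None if the signal was not sampled
        if actual is not expected:
            mismatches.append((name, expected, actual))
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample in which LINK100 is wrongly asserted.
    sample = {"ACT": True, "LINK": True, "LINK1000": True, "LINK100": True}
    for name, expected, actual in check_lan_signals(sample):
        print(f"FAIL {name}: expected {expected}, got {actual}")
```

An empty result list means all four signals match the expected states.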

Figure 15. COMe Type 7 Form-Factor

Figure 16. COMe Type 7 2x220-Pin Board-2-Board Connectors

7.4 LS10 NVLink4 Switch IC
Goldstone Switchboard uses two LS10 NVLink Switch ICs to implement NVLink4 protocol switching between GPUs.
Each LS10 device has 64 NVLink4 ports, and each port has 2 lanes (2x TX SerDes + 2x RX SerDes) running at a 50GT/s PAM4 signaling rate for a 100Gb/s data rate per SerDes. This gives 200Gb/s of bandwidth in each of the TX and RX directions, i.e., 25GB/s for TX and 25GB/s for RX.
With 64 ports per LS10, the throughput per chip is 12.8Tb/s x 2 (TX+RX) = 25.6Tb/s.
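
The bandwidth figures above follow from simple arithmetic. The short sketch below reproduces them from the per-lane data rate and the port count stated in this section; the variable names are illustrative only.

```python
# Figures taken from the section text.
LANES_PER_PORT = 2       # SerDes lanes per direction, per NVLink4 port
GBPS_PER_LANE = 100      # data rate per SerDes lane, in Gb/s
PORTS_PER_LS10 = 64      # NVLink4 ports per LS10 device

# Per-port bandwidth in one direction (TX or RX).
port_gbps = LANES_PER_PORT * GBPS_PER_LANE
port_gbytes_per_s = port_gbps / 8

# Per-chip throughput, one direction and both directions combined.
chip_tbps_one_way = PORTS_PER_LS10 * port_gbps / 1000
chip_tbps_total = 2 * chip_tbps_one_way

print(f"Per port, per direction: {port_gbps} Gb/s = {port_gbytes_per_s} GB/s")
print(f"Per chip, per direction: {chip_tbps_one_way} Tb/s")
print(f"Per chip, TX + RX:       {chip_tbps_total} Tb/s")
```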
LS10 provides a 2-lane PCIe Gen3 EP to the host/CPU (COMe in Goldstone) for control & management via a 64MB memory space. In addition, an I2C Slave (I2CS) is available for OOB communication to the LS10 internal security processor.
LS10 uses a 4-pin SPI interface for the LS10 ROM/FW Flash. In the Goldstone Switch design, LS10 accesses the ROM/FW Flash via the EROT CEC1736 device for secure boot.
For more details of LS10 interfaces, please refer to Figure 17. LS10 NVLink4 Switch IC External Interfaces.

Figure 17. LS10 NVLink4 Switch IC External Interfaces

Figure 18. Goldstone Switchboard LS10 Port Assignment
Goldstone Switchboard uses two LS10 NVLink4 Switch devices to provide a total of 128 NVLink4 ports as shown in Figure 18. Goldstone Switchboard LS10 Port Assignment. As shown, for each LS10, 56 out of the 64 ports are used and 8 ports are not used. Out of the 56 ports used, 24 ports are connected to an 8-Pair x 12 backplane connector from TE and 32 ports are connected to 8 OSFP connectors.
Only limited information is available about LS10 testing capabilities. The known capabilities are:
✔ LS10 supports internal near-end digital loopback for each Port

✔ LS10 supports external PHY level loopback with TX connected to RX on the same port

✔ LS10 supports EOM measurement per lane

✔ LS10 supports traffic generator per port (thus we could run loopback test on all ports simultaneously for stress testing).

For Goldstone Switchboard product manufacturing testing, OSFP loopback dongles and Backplane loopback cables are
used to connect each link’s TX to its RX.
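
The port allocation described above can be cross-checked with a few lines of arithmetic. This is a sketch using only the counts stated in this section; the figure of 4 NVLink4 ports per OSFP connector is derived here (32 ports over 8 connectors) rather than stated in the source.

```python
# Counts taken from the section text.
LS10_COUNT = 2
PORTS_PER_LS10 = 64
BACKPLANE_PORTS_PER_LS10 = 24
OSFP_PORTS_PER_LS10 = 32
OSFP_CONNECTORS_PER_LS10 = 8

used_per_ls10 = BACKPLANE_PORTS_PER_LS10 + OSFP_PORTS_PER_LS10
unused_per_ls10 = PORTS_PER_LS10 - used_per_ls10

summary = {
    "total_ports": LS10_COUNT * PORTS_PER_LS10,
    "used_ports": LS10_COUNT * used_per_ls10,
    "unused_ports": LS10_COUNT * unused_per_ls10,
    # One loopback dongle per OSFP connector during manufacturing test.
    "osfp_dongles": LS10_COUNT * OSFP_CONNECTORS_PER_LS10,
    # Derived value, not stated explicitly in the source.
    "ports_per_osfp_connector": OSFP_PORTS_PER_LS10 // OSFP_CONNECTORS_PER_LS10,
}

for key, value in summary.items():
    print(f"{key}: {value}")
```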

Table 6. LS10 Test Items

Columns: Req-#, Requirement, Test Method, Priority, Nvbug tracking, Bug # & TS

Req-#: GOLDSTONE-LS000
Name: LS10 PCIe ID & Version Check
Mandatory: Shall check each of the 2pcs LS10 Devices’ PCIe ID and Version.
Guidance: LS10 PCIe Configuration Space
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Same bug as GOLDSTONE-COME001
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS001
Name: LS10 PCIe I/F width & speed check
Mandatory: Shall check each LS10’s PCIe I/F width & speed
Guidance: LS10 PCIe Gen3, X2
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Same as GOLDSTONE-COME001
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS002
Name: LS10 Memory Address Space Read/Write access check
Mandatory: Shall check the read/write accesses to each LS10’s Scratch memory/register
Guidance: Each LS10 presents 64MB memory space to Host/COMe
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Indirectly covered by MODS initialization
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS003
Name: LS10 FW Loading Status Check
Mandatory: Shall check each LS10 has loaded its FW successfully.
Guidance: LS10
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Covered in GOLDSTONE-LS004
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS004
Name: LS10 FW Version check
Mandatory: Shall check each LS10’s FW Version.
Guidance: LS10
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Same bug # as GOLDSTONE-LS000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS005
Name: LS10 NVLink4 Port Internal loopback test
Required: Shall perform internal loopback test without any error for all used NVLink4 Ports.
Guidance: Please refer to Figure 18. Goldstone Switchboard LS10 Port Assignment for used ports on each LS10
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Kong-like coverage
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS006
Name: LS10 NVLink4 Port OPT loopback test
Required: Shall perform external loopback test without any error for all used NVLink4 Ports using OSFP loopback dongles / backplane loopback.
Guidance: Please refer to Figure 18. Goldstone Switchboard LS10 Port Assignment for used ports on each LS10
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Kong-like coverage
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS007
Name: LS10 NVLink4 Port EOM check w/ loopback
Required: Shall check & report NVLink4 port EOM for all used NVLink4 Ports using OSFP loopback dongles / backplane loopback.
Guidance: Please refer to Figure 18. Goldstone Switchboard LS10 Port Assignment for used ports on each LS10
Test Method: Auto
Priority: P0 TS1B
Nvbug tracking: Kong-like coverage
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS008
Name: LS10 NVLink4 Port Loopback on all used ports simultaneously
Required: Shall run max traffic test on all used LS10 ports with external loopback simultaneously for 30 minutes to stress SI, PI, power and thermal, and test should finish within specific error limit (BER < 1E-13).
Guidance: Please refer to Figure 18. Goldstone Switchboard LS10 Port Assignment for used ports on each LS10
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Kong-like coverage. Note limitation in TREX - not all ports can be stressed at the same time as some would need to be dedicated as generator
Bug # & TS: Runin Test / stress test

Req-#: GOLDSTONE-LS008
Name: LS10 PCIe Reset Test
Required: Shall check if each LS10 device’s PCIe Reset (PEX-RST) is working properly during the final test.
Guidance: Set LS10’s PEX-RST to LOW (via U80/PCA9505) and check LS10’s PCIe Configuration Space Command Register Bits[2:0] = 0b000
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Same as GOLDSTONE-PCI001. Same as SBR test
Bug # & TS: Final Test

Req-#: GOLDSTONE-LS009
Name: COMe-LS10 I2CS Link Test
Required: Shall check the COMe-LS10 I2CS Links working normally (2x LS10).
Expected Result: COMe CPU/PCH/CPLD communicates with LS10 (2x) over LS10 I2CS link successfully and reliably
Guidance: See Figure 22
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui … to clarify the difference between this and GOLDSTONE-LS010
Bug # & TS: FCT-R

Req-#: GOLDSTONE-LS010
Name: COMe-LS10 I2CC Link Test
Required: Shall check the COMe-LS10 I2CC Links working normally (2x LS10).
Expected Result: COMe CPU/PCH/CPLD communicates with LS10 (2x) over LS10 I2CC link successfully and reliably
Guidance: See Figure 22
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: to clarify the difference between this and GOLDSTONE-LS009
Bug # & TS: FCT-RI2C
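
To give a feel for the BER < 1E-13 limit in the simultaneous loopback stress test, the sketch below estimates how many bits one port moves in one direction during a 30-minute run at 200Gb/s and how many bit errors that limit corresponds to. It assumes the link runs at full rate for the whole run and that the pass criterion is simply errors divided by bits; the actual MODS/diag accounting may differ.

```python
def bits_transferred(rate_gbps, seconds):
    """Bits moved in one direction at a constant rate (integer arithmetic)."""
    return rate_gbps * 1_000_000_000 * seconds


def max_errors_allowed(bits, ber_limit):
    """Number of bit errors that corresponds to the BER limit."""
    return bits * ber_limit


def ber_passes(error_count, bits, ber_limit=1e-13):
    """True if the measured BER is strictly below the limit."""
    return (error_count / bits) < ber_limit


if __name__ == "__main__":
    PORT_RATE_GBPS = 200       # per port, per direction (Section 7.4)
    RUN_SECONDS = 30 * 60      # 30-minute stress run

    bits = bits_transferred(PORT_RATE_GBPS, RUN_SECONDS)
    print(f"Bits per port per direction: {bits:.3e}")
    print(f"Errors at the 1E-13 limit:   {max_errors_allowed(bits, 1e-13):.1f}")
    print(f"35 errors passes: {ber_passes(35, bits)}")
    print(f"37 errors passes: {ber_passes(37, bits)}")
```

Under these assumptions a single port direction carries about 3.6E14 bits in 30 minutes, so the limit corresponds to roughly 36 bit errors per port direction.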

Figure 19. TE 2344064-4 OSFP Connector on Goldstone

Figure 20. OSFP Loopback Dongle

7.5 OSFP Ports
During Switchboard MFG testing, all OSFP ports are populated with OSFP loopback dongles for NVLink connectivity and traffic testing; please refer to Section 7.4 for details. This section lists the other OSFP port items to be tested during manufacturing testing:
● +3.3V to OSFP: indirectly tested
● GND to OSFP: indirectly tested
● I2C: SCL and SDA, described here.
● INT_RST_N: Active Low Interrupt from OSFP and Active Low Reset to OSFP
● PRN_LPW_N: Active Low Present from OSFP and Active Low Low-Power mode to OSFP
● OSFP_POWER_GOOD: OSFP Port Load Power Switch MP5087 Status, Active High, not from OSFP Port
● OSFP_POWER_ENABLE: OSFP Port Load Power Switch MP5087 Enable Signal, Active High.

OSFP transceivers adopt the “Common Management Interface Specification (CMIS)” for module management over the I2C interface, and the I2C address used is 0xA0 (8b); please refer to CMIS for details.
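
Addresses in this document are given in 8-bit form (marked "8b"), which includes the read/write bit in the least significant position. Many host-side I2C tools and libraries expect the 7-bit form instead. The following small helper, offered as a sketch with a hypothetical function name, shows the conversion.

```python
def to_7bit_address(addr_8bit):
    """Convert an 8-bit (write-form) I2C address to its 7-bit form.

    The 8-bit form is the 7-bit address shifted left by one, with the
    R/W bit in bit 0, so the conversion is a right shift by one.
    """
    if not 0 <= addr_8bit <= 0xFF:
        raise ValueError(f"not an 8-bit address: {addr_8bit!r}")
    return addr_8bit >> 1


if __name__ == "__main__":
    # 0xA0 is the OSFP module address given above; 0x94 is one of the
    # ADT75 temperature sensor addresses listed in Table 9.
    for addr in (0xA0, 0x94):
        print(f"8-bit 0x{addr:02X} -> 7-bit 0x{to_7bit_address(addr):02X}")
```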

As shown in Figure 22, the OSFP modules’ I2C management interfaces can be accessed by LS10 via its I2CB (LS1 for OSFP Ports 1-8 and LS2 for OSFP Ports 9-16), or by the COMe CPU via the COMe CPU > PCH > LPC > COMe CPLD > I2C_B2B/I2C_SW interface. By default, during normal operation, LS10 accesses and manages the OSFP modules.
Table 7. OSFP Ports Non-traffic test items
Columns: Req-#, Requirement, Test Method, Priority, Nvbug tracking, Bug # & TS

Req-#: GOLDSTONE-OSFP01
Name: OSFP Ports’ I2C Access (16x OSFP Ports)
Mandatory: Shall check each OSFP Port’s I2C SCL/SDA signals by reading OSFP Loopback Dongle’s internal FRU EEPROM
Guidance: Reading OSFP FRU EEPROM, no need to write
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Jinshui … to file and post bug #
Bug # & TS: FCT-R

Req-#: GOLDSTONE-OSFP02
Name: OSFP Ports’ INT/RST-N Signals (16x OSFP Ports)
Mandatory: Shall check each OSFP Port’s INT/RST-N signal
Method: Port CPLD to toggle RST-N and read INT-N status, should be the same. See OSFP Spec & Port CPLD Spec for more details
Guidance: Controlled by Port CPLD
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: FA enhancement
Bug # & TS: FCT-R

Req-#: GOLDSTONE-OSFP03
Name: OSFP Ports’ PRN-LPW-N Signals (16x OSFP Ports)
Mandatory: Shall check each OSFP Port’s PRN-LPW-N signal
Method: Port CPLD to toggle LPW-N and read PRN-N status, should be the same. Need to check OSFP Spec & Port CPLD Spec for more details
Guidance: Controlled by Port CPLD
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: FA enhancement
Bug # & TS: FCT-R

Req-#: GOLDSTONE-OSFP04
Name: OSFP Ports’ Green LED1 (16x OSFP Ports)
Mandatory: Shall check each OSFP Port’s Green LED1 by toggling Port CPLD related Register bit and check the LED status
Guidance: Controlled by Port CPLD
Test Method: Manual
Priority: P1 EVT
Nvbug tracking: N/A - One Diag cannot implement manual checks
Bug # & TS: FCT-B

Req-#: GOLDSTONE-OSFP05
Name: OSFP Ports’ Green LED2 (16x OSFP Ports)
Mandatory: Shall check each OSFP Port’s Green LED2 by toggling Port CPLD related Register bit and check the LED status
Guidance: Controlled by Port CPLD
Test Method: Manual
Priority: P1 EVT
Nvbug tracking: N/A - One Diag cannot implement manual checks
Bug # & TS: FCT-B

Req-#: GOLDSTONE-OSFP06
Name: OSFP Ports’ Power Good Signal Status
Mandatory: Shall check each OSFP Port’s Power Good Signal
Method: Toggle an OSFP port’s Power enable bit and check Power Good signal status
Guidance: Port CPLD Power Enable and Good Register
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: FA enhancement
Bug # & TS: FCT-R

Req-#: GOLDSTONE-OSFP07
Name: OSFP Ports’ Power Enable Signal
Mandatory: Shall check each OSFP Port’s Power Enable Signal
Method: Toggle an OSFP port’s Power enable bit and check Power Good signal status and read OSFP Port FRU EEPROM. When enabled, the Power Good signal should be high and the FRU EEPROM should be readable; when disabled, the Power Good signal should be low and the FRU EEPROM should not be readable.
Guidance: Port CPLD Power Enable and Good Register
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: FA enhancement
Bug # & TS: FCT-R

Req-#: GOLDSTONE-OSFP08
Name: OSFP Ports’ Temperature
Mandatory: Shall check each OSFP Port’s Temperature
Method: Read OSFP modules’ Temperature sensor value via its I2C.
Expected Result: All OSFP modules’ temperature reading should be below 70°C.
Guidance: OSFP Module’s CMIS Spec
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui … to file and post bug # for OSFP telemetry
Bug # & TS: FCT-R

Req-#: GOLDSTONE-OSFP09
Name: OSFP Ports’ Power Voltage
Mandatory: Shall check each OSFP Port’s Power Voltage
Method: Read OSFP modules’ power voltage sensor value via its I2C.
Expected Result: All OSFP modules’ voltage reading should be within spec (nominal 3.3V).
Guidance: OSFP Module’s CMIS Spec
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Same as GOLDSTONE-OSFP08 - include Voltage
Bug # & TS: FCT-R
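
The temperature and voltage checks in GOLDSTONE-OSFP08 and GOLDSTONE-OSFP09 depend on decoding the module monitor values. The sketch below assumes the commonly cited CMIS encoding: module temperature as a signed 16-bit value in units of 1/256 °C and supply voltage as an unsigned 16-bit value in units of 100 µV. These encodings, the byte order, and the voltage window used here (3.3V ±5%) are assumptions to be confirmed against the CMIS and OSFP specifications; the function names are hypothetical.

```python
MAX_TEMP_C = 70.0                  # limit from GOLDSTONE-OSFP08
VCC_MIN_V, VCC_MAX_V = 3.135, 3.465  # hypothetical window: 3.3 V +/- 5%


def decode_temperature_c(msb, lsb):
    """Assumed CMIS encoding: signed 16-bit, 1/256 degC per LSB, MSB first."""
    raw = (msb << 8) | lsb
    if raw & 0x8000:           # sign-extend two's complement
        raw -= 0x10000
    return raw / 256.0


def decode_vcc_volts(msb, lsb):
    """Assumed CMIS encoding: unsigned 16-bit, 100 uV per LSB, MSB first."""
    return ((msb << 8) | lsb) / 10_000


def check_module(temp_bytes, vcc_bytes):
    """Return (temperature_ok, voltage_ok) for one module's raw monitor bytes."""
    temp_c = decode_temperature_c(*temp_bytes)
    vcc_v = decode_vcc_volts(*vcc_bytes)
    return temp_c < MAX_TEMP_C, VCC_MIN_V <= vcc_v <= VCC_MAX_V


if __name__ == "__main__":
    # Hypothetical raw readings: 25.0 degC and 3.3 V.
    print(decode_temperature_c(0x19, 0x00), decode_vcc_volts(0x80, 0xE8))
    print(check_module((0x19, 0x00), (0x80, 0xE8)))
```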

7.6 COMe & LS10 I2C Tree
Please refer to Figure 21 for the Goldstone I2C tree and Table 9 for detailed information about the I2C devices connected to the COMe. The related interfaces between COMe and Goldstone Switchboard are:
● I2C_B2B_SCL/SDA: Main I2C interface from COMe CPLD LPC2I2C for the majority of I2C devices; it is accessed via the COMe PCH LPC interface.
● LPC: from COMe PCH to Switchboard Main CPLD
● EROT_ATTEST_I2C: not used as EROT_ATTEST_I2C on Switchboard is connected to I2C_B2B_SCL/SDA
● SMB_SCL/SDA: I2C_TESTING_GPIO_SCL/SDA on Switchboard, from COMe PCH SML1

Figure 21. Goldstone Switchboard I2C Tree

Figure 22. Switchboard I2C Devices connected to COMe LPC2I2C interface

Figure 23. I2C Devices on COMe PCH SML1
Table 8. GS Switchboard and PDB I2C Devices Test Items

Columns: Req-#, Requirement, Test Method, Priority, Nvbug tracking, Bug # & TS

Req-#: GOLDSTONE-I2C000
Name: COMe CPLD LPCI2C I2C_B2B Tree Scan
Mandatory: Shall check connectivity of all I2C devices on COMe CPLD LPCI2C I2C_B2B SCL/SDA bus
Method: Scan the I2C devices on I2C_B2B SCL/SDA bus
Expected Result: The I2C devices found match the I2C Device list on Table 9
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same as GOLDSTONE-COME011
Bug # & TS: FCT-R

Req-#: GOLDSTONE-I2C001
Name: COMe PCH SML1 I2C_TESTING_GPIO Tree Scan
Mandatory: Shall check connectivity of all I2C devices on COMe PCH SML1 I2C_TESTING_GPIO SCL/SDA bus
Method: Scan the I2C devices on I2C_TESTING_GPIO bus
Expected Result: The I2C devices found match the I2C Device list on Table 9
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui … to file and post bug #
Bug # & TS: FCT-R

Req-#: GOLDSTONE-I2C002
Name: LS1 I2CA Tree Scan
Mandatory: Shall check connectivity of all I2C devices on LS1 I2CA bus
Method: Scan the I2C devices on LS1 I2CA bus from LS1
Expected Result: The I2C devices found match the I2C Device list on Table 9
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Kong-like coverage for I2C
Bug # & TS: FCT-R

Req-#: GOLDSTONE-I2C003
Name: LS1 I2CB Tree Scan
Mandatory: Shall check connectivity of all I2C devices on LS1 I2CB bus
Method: Scan the I2C devices on LS1 I2CB bus from LS1
Expected Result: The I2C devices found match the I2C Device list on Table 9
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Kong-like coverage for I2C
Bug # & TS: FCT-R

Req-#: GOLDSTONE-I2C004
Name: LS2 I2CA Tree Scan
Mandatory: Shall check connectivity of all I2C devices on LS2 I2CA bus
Method: Scan the I2C devices on LS2 I2CA bus from LS2
Expected Result: The I2C devices found match the I2C Device list on Table 9
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Kong-like coverage for I2C
Bug # & TS: FCT-R

Req-#: GOLDSTONE-I2C005
Name: LS2 I2CB Tree Scan
Mandatory: Shall check connectivity of all I2C devices on LS2 I2CB bus
Method: Scan the I2C devices on LS2 I2CB bus from LS2
Expected Result: The I2C devices found match the I2C Device list on Table 9
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Kong-like coverage for I2C
Bug # & TS: FCT-R

Req-#: GOLDSTONE-I2C006
Name: I2C Stability Stress Test
Mandatory: Shall check above GOLDSTONE-I2C000 to GOLDSTONE-I2C005 I2C bus device access robustness
Method: Repeat above GOLDSTONE-I2C000 to GOLDSTONE-I2C005 I2C bus device scan for 5 times
Expected Result: The I2C devices found match the I2C Device list on Table 9 every time
Guidance: See Table 9 for I2C devices
Test Method: Auto
Priority: P1
Nvbug tracking: Covered in –loops
Bug # & TS: RIN
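
The tree scan and stress items above share one pattern: scan a bus, compare the addresses found against the expected list from Table 9, and repeat. The sketch below illustrates that pattern. The `scan_bus` callable is a hypothetical placeholder for whatever the diag tool uses to probe a bus; it is injected so the comparison logic can be exercised without hardware.

```python
def compare_scan(expected, found):
    """Return (missing, unexpected) address lists, each sorted."""
    expected, found = set(expected), set(found)
    return sorted(expected - found), sorted(found - expected)


def stress_scan(scan_bus, expected, loops=5):
    """Scan the bus `loops` times and collect every failing iteration.

    `scan_bus` is a hypothetical callable returning the addresses that
    acknowledged on the bus. Returns a list of
    (iteration, missing, unexpected) tuples; empty means every scan matched.
    """
    failures = []
    for iteration in range(1, loops + 1):
        missing, unexpected = compare_scan(expected, scan_bus())
        if missing or unexpected:
            failures.append((iteration, missing, unexpected))
    return failures


if __name__ == "__main__":
    # A few 8-bit addresses from Table 9 (ADT75 sensors U139, U191, U192).
    expected = [0x90, 0x92, 0x94]

    def fake_scan():
        # Stand-in for a real bus scan; pretends U192 (0x94) is absent.
        return [0x90, 0x92]

    for iteration, missing, unexpected in stress_scan(fake_scan, expected):
        print(iteration, [hex(a) for a in missing], [hex(a) for a in unexpected])
```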

Table 9. COMe & LS10 I2C Devices on Switchboard & PDB

Columns: Device | I2C Add (8b), grouped by I2C Port # and LOC

I2C Port #: COMe: PCH > CPLD LPCI2C > I2C_B2B / I2C_SW

LOC: GS Switchboard
G1_A: LS10-1, G1_A, I2CS interface | See LS10 Spec
G1_A: LS10-1, G1_A, I2CC interface
G1_B: LS10-2, G1_B, I2CS interface
G1_B: LS10-2, G1_B, I2CC Interface
U77: Main CPLD I2C_SW interface | See Main CPLD Spec
U1_CP1: Port CPLD I2C_SW Interface | See Port CPLD Spec
M.2 SSD Management | 0x3A
M.2 SSD SPD | 0xA6
U192: ADT75 Temp Sensor | 0x94
U191: ADT75 Temp Sensor | 0x92
U139: ADT75 Temp Sensor | 0x90
U193: ADT75 Temp Sensor | 0x96
U195: ADT75 Temp Sensor | 0x9C
U189: ADT75 Temp Sensor | 0x98
U196: ADT75 Temp Sensor | 0x9E
U194: ADT75 Temp Sensor | 0x9A
U95: MAX11603 Voltage sensor, default N/A | 0xDA
U92: M24512, Switchboard FRU EEPROM | 0xA2
U190: M24512, Reserved FRU EEPROM, default N/A | 0xA4
U97: CEC1736 I2C06 Attest Interface for LS10-1 (G1_A) | See CEC1736 Spec
U124: CEC1736 I2C06 Attest interface for LS10-2 (G1_B) | See CEC1736 Spec

LOC: PDB: I2C_HSC_
PU4: LM5066 54V HSC | 0x22
PU7: U50SU4P180 DC-DC P12V for Switchboard | 0x26
PU8: U50SU4P180 DC-DC P12V for Switchboard | 0x2E
U30: M24C02, PDB FRU EEPROM | 0xA0
PU5: QS54SH12060 DC-DC 12V for Fans | 0x36

PU10: MP8880 LDO P12V_STBY | 0x2C
PU11: MP8880 LDO P12V_STBY | 0x24

LOC: PDB & Fan Tray via PDB: I2C_TEMP_*
U29: TMP75 Temperature sensor | 0x9A
U8: TMP75 Temperature sensor | 0x9C
U43: PCA9555 I2C 2x8bits GPIO Registers | 0x44
CB&C FRU EEPROM via J14, Device TBD | TBD

I2C Port #: COMe: PCH > SML1 > B2B > I2C_TESTING_GPIO_*

LOC: GS Switchboard
U77: Main CPLD | See Main CPLD Spec
U80: PCA9505 40 Pins GPIO Device | 0x42

I2C Port #: LS1 (LS10-A) I2CA

LOC: GS Switchboard
U1_CP1: Port CPLD LS1 I2CA for OSFP Ports 1-8 control & Status | See Port CPLD Spec
U5_35: MP2975 for LS1 VDD | 0xC4
U4_35: MP2975 for LS1 DVDD & HVDD | 0xCA
U1_33: MP2975 for OSFP Ports 1-16 P3.3V via MP86975 | 0x54

I2C Port #: LS1 (LS10-A) I2CB

LOC: GS Switchboard
U5: PCA9847, G1_A LS1 I2CB fanout buffer | 0xE2: select which port
J1_01: OSFP Port 1 NVLink transceiver I2C | 0xA0
J1_02: OSFP Port 2 NVLink transceiver I2C | 0xA0
J1_03: OSFP Port 3 NVLink transceiver I2C | 0xA0
J1_04: OSFP Port 4 NVLink transceiver I2C | 0xA0
J1_05: OSFP Port 5 NVLink transceiver I2C | 0xA0
J1_06: OSFP Port 6 NVLink transceiver I2C | 0xA0
J1_07: OSFP Port 7 NVLink transceiver I2C | 0xA0
J1_08: OSFP Port 8 NVLink transceiver I2C | 0xA0

I2C Port #: LS2 (G1_B) I2CA

LOC: GS Switchboard
U1_CP1: Port CPLD LS2 I2CA for Ports 9-16 control & status | See Port CPLD Spec
U5_36: MP2975 for LS2 VDD | 0xC4
U4_36: MP2975 for LS2 DVDD & HVDD | 0xCA

I2C Port #: LS2 (G1_B) I2CB

LOC: GS Switchboard
U7: PCA9847 I2C Fanout buffer for LS2 I2CB | 0xE2: select which port
J1_09: OSFP Port 9 NVLink transceiver I2C | 0xA0
J1_10: OSFP Port 10 NVLink transceiver I2C | 0xA0
J1_11: OSFP Port 11 NVLink transceiver I2C | 0xA0
J1_12: OSFP Port 12 NVLink transceiver I2C | 0xA0
J1_13: OSFP Port 13 NVLink transceiver I2C | 0xA0
J1_14: OSFP Port 14 NVLink transceiver I2C | 0xA0
J1_15: OSFP Port 15 NVLink transceiver I2C | 0xA0
J1_16: OSFP Port 16 NVLink transceiver I2C | 0xA0

Figure 24. I2C Devices & Interfaces on COMe

7.7 Goldstone Switchboard Sensors
Please refer to Section 8, Sensor List, for the full sensor list. This section covers only the standalone sensors. It does not cover the built-in sensors of the M.2 SSD and the OSFP Transceivers / Loopback Dongles, as they are covered in their related sections. The sensors on the PDB are covered here to detect any PDB error during Switchboard manufacturing testing, even though the PDB is a golden board for Switchboard manufacturing testing.
Please note that during normal operations, each LS10 accesses its Voltage Regulators and the OSFP modules’ sensors via its I2CA and I2CB interfaces, but the COMe CPU can access them under Main CPLD Register control.

Table 10. Goldstone Switchboard Sensor Test Items

Columns: Req-#, Requirement, Test Method, Priority, Nvbug tracking, Bug # & TS

Req-#: GOLDSTONE-SNR000
Name: Switchboard Temperature Sensor Checking
Mandatory: Shall check the following 8x ADT75 Temperature sensors on the Goldstone Switchboard: U139, U189, U191, U192, U193, U194, U195 and U196.
Method: Read these temperature sensors’ internal registers for current temperature values over the I2C bus.
Expected Result: All these 8x temperature sensors should return valid temperature values that are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address.
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Jinshui … to file and post bug #
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR001
Name: Switchboard PDB Temperature Sensor Checking
Mandatory: Shall check the following 2x TMP75 Temperature sensors on the Goldstone Switchboard PDB: U8, U29.
Method: Read these temperature sensors’ internal registers for current temperature values over the I2C bus.
Expected Result: All these 2x temperature sensors should return valid temperature values that are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address.
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR002
Name: Switchboard PDB 54V HSC Sensor Checking
Mandatory: Shall check PDB LM5066 54V HSC (PU4) Status, Voltage & Current sensors.
Method: Read PU4 LM5066’s internal registers for Input/Output/MOSFET Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and LM5066 data sheet for internal register description.
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR003
Name: Switchboard PDB P12V DC-DC Converter Sensor Checking
Mandatory: Shall check PDB Switchboard P12V PU7 & PU8 DC-DC Converters’ Status, Voltage & Current sensors.
Method: Read PU7 & PU8 U50SU4P180’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current, Output Voltage / Current, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and U50SU4P180 data sheet for internal register description.
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR004
Name: Switchboard PDB 12V_STBY DC-DC Converter Sensor Checking
Mandatory: Shall check PDB 12V_STBY MP8880 DC-DC Converter (PU10, PU11) Status, Voltage & Current sensor values.
Method: Read PU10 & PU11 MP8880’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current, Output Voltage / Current, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP8880 data sheet for internal register description.
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR005
Name: Switchboard PDB 12V_FAN DC-DC Converter Sensor Checking
Mandatory: Shall check PDB 12V_FAN QS54SH12060 DC-DC Converter (PU5) Status, Voltage & Current sensor values.
Method: Read PU5 QS54SH12060’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and QS54SH12060 data sheet for internal register description.
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR006
Name: Switchboard LS1 VDD DC-DC Converter Sensor Checking
Mandatory: Shall check LS1 VDD MP2975 DC-DC Converter (U5_35) Status, Voltage, Current & Power sensor values.
Method: Read U5_35 MP2975’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current / Power, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP2975 data sheet for internal register description. Only PWM1-PWM4 are used.
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR007
Name: Switchboard LS1 DVDD & HVDD DC-DC Converter Sensor Checking
Mandatory: Shall check LS1 DVDD & HVDD MP2975 DC-DC Converter (U4_35) Status, Voltage, Current & Power sensor values.
Method: Read U4_35 MP2975’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current / Power, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP2975 data sheet for internal register description. MP2975’s PWM1-PWM4 for DVDD & PWM6-PWM8 for HVDD
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR008
Name: Switchboard LS2 VDD DC-DC Converter Sensor Checking
Mandatory: Shall check LS2 VDD MP2975 DC-DC Converter (U5_36) Status, Voltage, Current & Power sensor values.
Method: Read U5_36 MP2975’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current / Power, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP2975 data sheet for internal register description. Only PWM1-PWM4 are used
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR009
Name: Switchboard LS2 DVDD & HVDD DC-DC Converter Sensor Checking
Mandatory: Shall check LS2 DVDD & HVDD MP2975 DC-DC Converter (U4_36) Status, Voltage, Current & Power sensor values.
Method: Read U4_36 MP2975’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current / Power, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP2975 data sheet for internal register description. MP2975’s PWM1-PWM4 for DVDD & PWM6-PWM8 for HVDD
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR010
Name: Switchboard OSFP Ports 3.3V DC-DC Converter Sensor Checking
Mandatory: Shall check OSFP Ports 3.3V MP2975 DC-DC Converter (U1_33) Status, Voltage, Current & Power sensor values.
Method: Read U1_33 MP2975’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current / Power, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP2975 data sheet for internal register description. Only PWM1, PWM2, PWM7 & PWM8 are used.
Test Method: Auto
Priority: P0 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

Req-#: GOLDSTONE-SNR011
Name: COMe VCORE & VCCSA DC-DC Converter Sensor Checking
Mandatory: Shall check COMe VCORE & VCCSA MP2975 DC-DC Converter (U43) Status, Voltage, Current & Power sensor values.
Method: Read U43 MP2975’s internal registers for Input/Output Status, and sensor values for Input Voltage / Current / Power, Output Voltage / Current / Power, and Temperature.
Expected Result: Return valid values without any fault event and Voltage, Current, Power and Temperature sensor values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and MP2975 data sheet for internal register description. Only PWM1, PWM2, PWM3 & PWM8 are used.
Test Method: Auto
Priority: P1 EVT
Nvbug tracking: Same bug # as GOLDSTONE-SNR000
Bug # & TS: FCT-R

GOLDSTO Name: COMe Voltage Sensor Checking Auto P1 Same bug # FCT-R
NE-SNR01 Mandatory: Shall check COMe Voltage Sensor (U40) MAX11603’s EVT as
2 values GOLDSTON
Method: Read U40 MAX11603’s internal registers for voltage values. E-SNR000
Expected Result: Return valid values without any fault event and
Voltage values are within the normal ranges per Spec.
Guidance: See Table 9 & Table 25 for sensors’ I2C Address, and
MAX11603 data sheet for internal register description. All 8x input
channels are used.

7.8 USB
There is one USB 2.0 Type A Port on Goldstone Switchboard accessible from the Goldstone Node faceplate as shown in Figure 12.
This USB 2.0 Type A Port is connected to COMe CPU USB Port 1.
The USB 2.0 Type A connector pinout on Goldstone Faceplate is shown in Figure 25. The USB 2.0 port’s current is limited with a TI
TPS25200 eFuse device; in addition, the USB 2.0 Port could be switched to I2C-B2B-SCL/SDA (to/from COMe CPLD) under software
control via a Register in Main CPLD. Manufacturing testing will not test this optional USB-I2C MUX feature.
During manufacturing test, the USB 2.0 Port will be tested with a USB 2.0 Flash Drive.
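The write / read-back comparison used for the flash drive test could be sketched as follows. This is a minimal illustration only: the function names, the test file name and the payload size are hypothetical, and the mount point of the flash drive is assumed to be supplied by the test environment. A real test would also need to bypass or flush the OS page cache so that the read actually reaches the drive.

```python
import hashlib
import os
import tempfile


def write_read_verify(directory: str, size_bytes: int = 1 << 20) -> bool:
    """Write random data into `directory`, read it back and compare digests."""
    payload = os.urandom(size_bytes)
    path = os.path.join(directory, "usb_rw_test.bin")  # hypothetical file name
    try:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the data out of user-space buffers
        with open(path, "rb") as f:
            readback = f.read()
        return hashlib.sha256(readback).digest() == hashlib.sha256(payload).digest()
    finally:
        # Always clean up the test file, even if the comparison fails.
        if os.path.exists(path):
            os.remove(path)


def self_check() -> bool:
    """Exercise the routine against a temporary directory instead of a real drive."""
    with tempfile.TemporaryDirectory() as scratch:
        return write_read_verify(scratch, size_bytes=4096)
```

In use, `write_read_verify` would be called with the directory where the USB 2.0 Flash Drive is mounted; a `False` return or an `OSError` would count as a test failure.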

Figure 25. USB 2.0 Type A Pinout


Table 11. USB test Items

Test Priorit Nvbug


Req-# Requirement Bug # & TS
Method y tracking

GOLDSTO Name: USB 2.0 Port Read / Write Test Auto - P1 Jinshui … FCT-R
NE-USB00 Mandatory: Shall check the USB 2.0 Port Read / Write operations Jins… EVT to confirm if
0 with a USB2.0 Flash Drive. how ? - the
Method: With a USB 2.0 Flash Drive attached to the USB 2.0 Port, USB manufacturi
perform Write and Read operations to the Flash Drive, compare the installat ng test
read back data against the data written. ion is environmen
Expected result: The read & write operations on the USB 2.0 Flash not t will be
Drive should finish normally and the readback data should match the automa able to
data written. ted install a USB
drive.

GOLDSTO Name: USB 2.0 Port Read / Write Performance Auto P1 Same as FCT-R
NE-USB00 Mandatory: Shall check the USB 2.0 Port Read / Write operations EVT GOLDSTONE
1 performance with a USB2.0 Flash Drive. -USB000
Method: With a USB 2.0 Flash Drive attached to the USB 2.0 Port,
perform Write and Read operations to the Flash Drive, and measure
the performance
Expected result: The read & write operation performance should be
75% or higher of the specification.

GOLDSTO Name: USB 2.0 Port Enable Control Auto P1 Same as FCT-R
NE-USB00 Mandatory: Shall check USB 2.0 Port Power Enable Control function. EVT GOLDSTONE
2 Method: With a USB 2.0 Flash Drive attached to the USB 2.0 Port, -USB000
disable the USB Power eFuse TPS25200 via the register of Main
CPLD, wait 20s, then read the USB Flash drive.
Expected result: The USB 2.0 Flash Drive Read operation should fail
when the eFuse is disabled.

GOLDSTO Name: USB 2.0 Port eFuse Fault Signal Auto P1 FA FCT-R
NE-USB00 Mandatory: Shall check USB 2.0 Port eFuse Fault signal status. EVT enhanceme
3 Method: With a USB 2.0 Flash Drive attached to the USB 2.0 Port, nt
read the USB 2.0 Port eFuse Fault Signal status from COMe CPLD
Register.
Expected result: The eFuse Fault signal should be High, indicating no
fault (over-current / temperature / voltage) during normal
operations. Please note that a fault condition cannot be generated
during normal operations.
Guidance: COMe CPLD Pin A10, Signal GP_USB01_OC_B2B_L, and
PCH USB2_OC0# and USB2_OC1# Pins

7.9 COMe UART port
The Goldstone Switchboard has one RS-232 UART port with an RJ45 connector accessed via the front panel; please refer to Figure 12 for its
location. The RJ45 pinout for this RS-232 port is shown in Figure 26. This RS-232 UART port is connected to UART0 of the COMe PCH
(FH82CM246-SR40E):
● PCH UART0: SE_UART0_* on B2B Connector, and connected to Front panel RS-232 RJ45 connector below
● PCH UART1: SE_UART1_* on B2B Connector, not used on GS Switchboard.

During manufacturing test, this RS-232 port is connected to a Terminal Server (IOLAN STS24) for remote access over Internet or to a
local computer for COMe Console access.

Figure 26. Goldstone Switchboard RS-232 RJ-45 UART Port Pinout


Table 12. BMC & CG1 CPU UARTs

Test Priorit Nvbug


Req-# Requirement Bug # & TS
Method y tracking

GOLDSTON Name: COMe UART Port Auto P0 N/A - FCT-B or


E-COM000 Mandatory: Shall check the COMe UART RJ-45 port being able to TS1B manufacturi FCT-R
send and receive text data normally. ng test
Method: Connect the Goldstone Switchboard RJ-45 RS-232 Port to environmen
a Terminal Server (IOLAN STS24) or a local computer’s UART port, t should
and test input and output text message. indirectly
cover this
from
GOLDSTONE
-COME006

7.10 Goldstone Switch Node PDB and Fans
The Goldstone Switch Nodes (as well as the Keystone Compute Nodes) are powered by the 54VDC distributed through the backplane
bus bar. On each node, there is a Power Distribution Board (PDB) for converting the 54VDC to the voltages required for the node.
As shown in Figure 27, the Switchboard PDB receives the 54VDC via bus bar clips J8 & J9; the 54VDC then passes through a 125A
fuse before feeding a Hot-Swap Controller (HSC). The PDB is designed with multiple HSC options for supply chain considerations.
The default HSC is the LM5066 from TI, and the other two HSC options are XDP710 and LTC4285 from Linear Tech. All 3 HSCs
have an I2C interface to internal registers for status, control and voltage/current measurement.
The I2C addresses for these HSCs are strapped as follows:
● LM5066: 0x22 (8-bit)
● XDP710: 0x22 (8-bit)
● LTC4285: 0x22 (8-bit)
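The addresses in this section are given in 8-bit form, whereas Linux I2C tools and drivers typically expect the 7-bit form. A small conversion sketch is shown below, assuming that the 8-bit value is the 7-bit address shifted left by one with the R/W bit in bit 0; the helper name is hypothetical.

```python
def to_7bit(addr_8bit: int) -> int:
    """Convert an 8-bit (write) I2C address to the 7-bit form.

    Assumes bit 0 of the 8-bit value is the R/W bit, which is dropped.
    """
    if not 0 <= addr_8bit <= 0xFF:
        raise ValueError("8-bit I2C address must be in the range 0x00-0xFF")
    return addr_8bit >> 1


# Example: the HSC address strapped as 0x22 (8-bit) becomes 0x11 (7-bit).
HSC_ADDR_7BIT = to_7bit(0x22)
```

Under this assumption the other PDB devices convert the same way, for example 0xA0 (FRU EEPROM) to 0x50 and 0x9A (TMP75) to 0x4D.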

The 54VDC after the HSC is P54V_HSC, which is used to generate the following voltages:
● P12V: 12VDC to Switchboard via 2pcs U50SU4P180P DC-DC converters. The I2C addresses for the 2pcs DC-DC converters are:
▪ 0x26 (8-bit) for PU7
▪ 0x2E (8-bit) for PU8
● P12V_STBY: 12VDC Standby Power to Switchboard via one FAN65004 LDO; its power source is P54V_HSC and it is enabled
by the HSC’s Power Good output. The HSC can be enabled or disabled from the Switchboard, and once P54V_HSC is gone,
P12V_STBY is also unavailable. The design supports two LDO sources: FAN65004 and MP888. The I2C addresses for these
LDOs are:
▪ N/A: PU9, FAN65004
▪ 0x2C: PU10, MP888
▪ 0x24: PU11, MP888
● P12V_FAN: 12VDC to fans before the E-Fuse for each fan via a Q54SH12060 DC-DC converter. The I2C address for
Q54SH12060 is:
▪ 0x36: PU5, Q54SH12060
● P3V3_STBY: Generated from the P12V_STBY with a TPS563211 for the 3.3V circuits on the PDB.

There are 2pcs of connectors for interconnect between the Switchboard and the PDB with cables as shown in Table 14.

Table 13. Summary of PDB I2C Devices (8-bit I2C Address)

I2C-HSC-SCL/SDA

I2C ADDR  Ref #  I2C Device Description
0x22      PU4    LM5066 HSC (co-layout XDP710, LTC4285)
0x36      PU5    Q54SH12060 DC-DC converter
0x26      PU7    U50SU4P180P DC-DC for P12V
0x2C      PU10   MP888 LDO
0x2E      PU8    U50SU4P180P DC-DC for P12V
0x24      PU11   MP888 LDO
0xA0      U30    M24C02 PDB FRU EEPROM

I2C-Temp-SCL/SDA

I2C ADDR  Ref #  I2C Device Description
0x9A      U29    TMP75 Temperature Sensor
0x9C      U8     TMP75 Temperature Sensor
0x44      U43    PCA9555 I/O Expander
TBD       TBD    Chassis & BP FRU EEPROM

Table 14. Switchboard-PDB Interconnect

Connector (SW/PDB) Signal Description

P3V3-STBY-MB Standby 3.3V from Switchboard (MB), generated from P12V-STBY on MB. Not
able to FCT automatically as it is not used on PDB

P3V3-STBY-PDB Standby 3.3V from PDB, generated from P12V-STBY on PDB. If not available,
the I2C interfaces to PDB will not work.

FAN[X]-Present-N Active Low PRESENT signal from FAN[X], check w/ Main CPLD register read

FAN[X]-TACH-Rear TACH signal from FAN[X] Rear Fan

FAN[X]-TACH-Front TACH signal from FAN[X] Front Fan

FAN[X]-PWM PWM signal from Switchboard Main CPLD to FAN[X]


J84 – J15
FAN[X]-LED-N Active LOW FAN[X] LED signal from Switchboard Main CPLD

I2C-TEMP-SCL/SDA I2C interface from Switchboard Main CPLD to temperature sensors on PDB

Global-WP Active High Write Protection Signal from Switchboard Main CPLD to PDB FRU
24C02 EEPROM

GP-TEMP-SNS-Alert Active Low PDB Temperature sensor (over-temp) Alert to Switchboard Main
CPLD. There are 2pcs TMP75 temperature sensors on PDB (U8 & U29); either one
going over-temperature will drive this signal LOW. The Alert signal status can also be
read via I2C-TEMP-SCL/SDA

GND Signal GND reference

P12V +12VDC from PDB to Switchboard

P12V-STBY +12VDC Standby from PDB to Switchboard

GND GND

I2C-HSC-SCL/SDA I2C Interface between Switchboard and PDB

J69 – J16 FORCED-POWER-OFF From Switchboard Main CPLD, LOW to force power off / (Disable HSC) all
power supplies

HOLD-POWER-ON From Switchboard CPLD, High to force Power ON (Enable HSC), but
FORCED-POWER-OFF takes priority

TRAY-PRESNT-N Active low PDB is attached, from PDB to Switchboard Main CPLD

TRAY_ID[2-0] Tray ID, together w/ Tray-Present-N, from PDB J12 (BP)

PS-ON-N From Switchboard Main CPLD, Active Low to enable Fans’ 12VDC, Switchboard
12VDC, each fan’s 12VDC, but not the P12V-STBY & P3V3-STBY

P12V_IBC_PGD Active High Power Good from PDB, High – both P54V to P12V DC-DC are good;
Low – one or both P54V to P12V DC-DC are bad

P54V_HSC_PGD Active High P54V HSC is good to Switchboard CPLD

I2C_HSC_ALERT_N Active Low I2C Alert signal for all HSC and DC-DC converters, to Switchboard
Main CPLD

Each Goldstone Node has 6x Fans attached to the Goldstone PDB, but the fans are fully controlled by the Goldstone Switchboard Main CPLD
(U77, MACHX03D_9400LUT_BG400). For Goldstone Switchboard manufacturing testing, the Goldstone Node PDB is a finished good, but it
is still required to test the fan control signals between the Goldstone Switchboard and the fans.
Each fan has the following signals to Goldstone Switchboard Main CPLD:
● FAN[X]_PWM: Output from Switchboard Main CPLD to fan for Fan’s PWM control.
● FAN[X]_TACH_FRONT: TACH signal from front FAN of each fan unit (each fan unit has dual fans) to Main CPLD.
● FAN[X]_TACH_REAR: TACH signal from rear FAN of each fan unit (each fan unit has dual fans) to Main CPLD.
● FAN[X]_PRESENT: Active low Fan present signal from fan unit to Main CPLD.
● FAN[X]_LED: Active low Fan LED from Main CPLD to Fan tray

The TACH (tachometer) signal is a square wave with 50% duty cycle that indicates the fan’s speed; the fan’s RPM is
calculated as follows (Cycles per Rotation is from the fan’s specification):
RPM = (Tach Pulse Frequency / Cycles per Rotation) * 60.
The PDB generates the 12V (P12V_FAN) for the fans from the 54V input on the bus bar using a Q54SH12060RNDH DC-DC converter;
in addition, an NCP81295MNTXG Hot Swap Smart Fuse is used to control & regulate the P12V_FAN for
each fan, but the NCP81295 has no internal registers for software to read or write. So, the NCP81295’s function is tested
indirectly: if a given fan is working normally, then the corresponding NCP81295 circuits are good.
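The RPM formula above, together with a simple check that the TACH reading follows a PWM duty change, could be sketched as below. The default of 2 cycles per rotation and the 1.2x response threshold are illustrative assumptions, not values from the fan specification, and both function names are hypothetical.

```python
def tach_to_rpm(tach_hz: float, cycles_per_rotation: int = 2) -> float:
    """RPM = (Tach Pulse Frequency / Cycles per Rotation) * 60."""
    if cycles_per_rotation <= 0:
        raise ValueError("cycles_per_rotation must be positive")
    return tach_hz / cycles_per_rotation * 60.0


def tach_follows_pwm(rpm_low_duty: float, rpm_high_duty: float,
                     min_ratio: float = 1.2) -> bool:
    """True if RPM rose by at least `min_ratio` after the PWM duty was raised.

    `min_ratio` is an assumed pass criterion; the real limit should come
    from the fan specification.
    """
    return rpm_low_duty > 0 and rpm_high_duty >= rpm_low_duty * min_ratio
```

For example, a 200 Hz TACH signal on a fan with 2 cycles per rotation corresponds to 6000 RPM. The same conversion would be applied to both FAN[X]_TACH_FRONT and FAN[X]_TACH_REAR.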

Table 15. Test Items for Node PDB and Fans’ Signals

Test Priorit Nvbug


Req-# Requirement Bug # & TS
Method y tracking

GOLDSTON Name: FAN’s PWM Signal and TACH Signals (6x Fans) Auto P1 Jinshui … FCT-R
E-FAN000 Mandatory: Shall check each fan’s PWM signal from Goldstone EVT to file and
Switchboard Main CPLD post bug #
Method: During a fan’s normal operation, change its PWM duty as fan
cycle via Goldstone Switchboard’s Main CPLD internal registers, sanity
and then check FAN[X]_TACH_FRONT and FAN[X]_TACH_REAR check
signals to see if the TACH signals change accordingly.
Guidance: Goldstone Switchboard Main CPLD Registers

GOLDSTON Name: FAN’s PRESENT Signal Auto P1 Same as FCT-R


E-FAN001 Mandatory: Shall check each fan’s PRESENT Signal. EVT GOLDSTON
Method: Check the Main CPLD’s Fan PRESENT register, the E-FAN000
PRESENT signal should be low during normal operation when the
fan is attached. No FAN connection and disconnection test (Risk:
Fan PRESENT is always LOW).
Guidance: Goldstone Switchboard Main CPLD Registers

GOLDSTON Name: FAN’s LED Signal Manual P1 N/A - One FCT-B
E-FAN002 Mandatory: Shall check each fan’s LED Signal from Main CPLD EVT Diag cannot
Method: Toggle each fan’s LED register bit of Main CPLD and implement
manually check the corresponding LED status. manual
Guidance: Goldstone Switchboard Main CPLD Registers checks

Figure 27. Switchboard - PDB - Fans Interconnect

7.11 System Thermal & Power Stress Test

The testing specified here verifies that the Goldstone Switchboard works reliably with the devices at maximum TDP under
high DC operating ambient temperatures of around 31-35°C; this exercise also tests the power system.
Please also note that while the fans used in the Tester chassis are the same as those used in the Ranger production chassis,
the slot pitch of the Tester chassis is 2 OUs, larger than the Ranger chassis’ 1.36 OUs; thus the Tester chassis may present
lower or higher thermal resistance and does not reflect the Ranger chassis’ thermal environment.

Table 16. System Thermal & Power Stress Test

Test Priorit Nvbug


Req-# Requirement Bug # & TS
Method y tracking

Name: Temperature Read & Log before System Thermal & Power Auto P0 Same as RIN
Stress Test EVT GOLDSTON
Mandatory: Shall check Goldstone Node temperature sensors E-SNR000
before Stress Testing
Method: Read & log all temperature sensors .
Expected result: all temperature sensors in normal range.
Guidance: Please see Section 8 for Temperature sensor list.

GOLDSTO Name: System Thermal & Power Stress Test Auto P0 Jinshui … RIN
NE-SYS001 Mandatory: Shall check Goldstone Switchboard working normally EVT to file bug
under maximum NVLink traffic at 31-35°C ambient temperature and post
Method: Run maximum NVLink4 traffic on all used ports of the bug #
2pcs LS10 devices for 20 minutes; then check the number of NVLink Note that
errors, read temperature and power sensors. One Diag
Expected result: the number of errors per port should not be more cannot
than 12 (10^-13 x 100G x 1200s = 12); and all sensors in normal control
range. ambient
Guidance: Please see LS10 for details of PRBS testing temperatur
e

GOLDSTO Name: System Thermal & Power Stress Test w/ Fan-1 Failure Auto P1 N/A - this RIN
NE-SYS002 Mandatory: Shall check Goldstone Switchboard working normally should be
under maximum NVLink traffic at 31-35°C ambient temperature w/ part of
Fan-1 failed. validation
Method: at the end of “GOLDSTONE-SYS001” test, turn off Fan-1 instead of
and continue running maximum NVLink4 traffic on all used ports of manufactur
the 2pcs LS10 devices for 5 minutes; then check the number of ing test
NVLink errors, read temperature and power sensors.
Expected result: the number of errors per port should not be more
than 3 (10^-13 x 100G x 300s = 3); and all sensors in normal range.
Guidance: Please see LS10 for details of PRBS testing

GOLDSTO Name: System Thermal & Power Stress Test w/ Fan-2 Failure Auto P1 N/A - this RIN
NE-SYS003 Mandatory: Shall check Goldstone Switchboard working normally should be
under maximum NVLink traffic at 31-35°C ambient temperature w/ part of
Fan-2 failed. validation
Method: at the end of “GOLDSTONE-SYS002” test, turn on Fan 1 instead of
and turn off Fan-2, continue running maximum NVLink4 traffic on manufactur
all used ports of the 2pcs LS10 devices for 5 minutes; then check ing test
the number of NVLink errors, read temperature and power sensors.

Expected result: the number of errors per port should not be more
than 3 (10^-13 x 100G x 300s = 3); and all sensors in normal range.
Guidance: Please see LS10 for details of PRBS testing

GOLDSTO Name: System Thermal & Power Stress Test w/ Fan-3 Failure Auto P1 N/A - this RIN
NE-SYS004 Mandatory: Shall check Goldstone Switchboard working normally should be
under maximum NVLink traffic at 31-35°C ambient temperature w/ part of
Fan-3 failed. validation
Method: at the end of “GOLDSTONE-SYS003” test, turn on Fan-2 instead of
and turn off Fan-3, continue running maximum NVLink4 traffic on manufactur
all used ports of the 2pcs LS10 devices for 5 minutes; then check ing test
the number of NVLink errors, read all temperature and power
sensors.
Expected result: the number of errors per port should not be more
than 3 (10^-13 x 100G x 300s = 3); and all sensors in normal range.
Guidance: Please see LS10 for details of PRBS testing

GOLDSTO Name: System Thermal & Power Stress Test w/ Fan-4 Failure Auto P1 N/A - this RIN
NE-SYS005 Mandatory: Shall check Goldstone Switchboard working normally should be
under maximum NVLink traffic at 31-35°C ambient temperature w/ part of
Fan-4 failed. validation
Method: at the end of “GOLDSTONE-SYS004” test, turn on Fan-3 instead of
and turn off Fan-4, continue running maximum NVLink4 traffic on manufactur
all used ports of the 2pcs LS10 devices for 5 minutes; then check ing test
the number of NVLink errors, read all temperature and power
sensors.
Expected result: the number of errors per port should not be more
than 3 (10^-13 x 100G x 300s = 3); and all sensors in normal range.
Guidance: Please see LS10 for details of PRBS testing

GOLDSTO Name: System Thermal & Power Stress Test w/ Fan-5 Failure Auto P1 N/A - this RIN
NE-SYS006 Mandatory: Shall check Goldstone Switchboard working normally should be
under maximum NVLink traffic at 31-35°C ambient temperature w/ part of
Fan-5 failed. validation
Method: at the end of “GOLDSTONE-SYS005” test, turn on Fan-4 instead of
and turn off Fan-5, continue running maximum NVLink4 traffic on manufactur
all used ports of the 2pcs LS10 devices for 5 minutes; then check ing test
the number of NVLink errors, read all temperature and power
sensors.
Expected result: the number of errors per port should not be more
than 3 (10^-13 x 100G x 300s = 3); and all sensors in normal range.
Guidance: Please see LS10 for details of PRBS testing

GOLDSTO Name: System Thermal & Power Stress Test w/ Fan-6 Failure Auto P1 N/A - this RIN
NE-SYS007 Mandatory: Shall check Goldstone Switchboard working normally should be
under maximum NVLink traffic at 31-35°C ambient temperature w/ part of
Fan-6 failed. validation
Method: at the end of “GOLDSTONE-SYS006” test, turn on Fan 5 instead of
and turn off Fan 6, continue running maximum NVLink4 traffic on manufactur
all used ports of the 2pcs LS10 devices for 5 minutes; then check ing test
the number of NVLink errors, read all temperature and power
sensors. At the end of this test, turn on Fan-6 to run all 6pcs fans in
normal state.
Expected result: the number of errors per port should not be more
than 3 (10^-13 x 100G x 300s = 3); and all sensors in normal range.

Guidance: Please see LS10 for details of PRBS testing

GOLDSTO Name: M.2 SSD Test Under System Thermal & Power Stress Testing Auto P1 Jinshui … RIN
NE-SYS008 Mandatory: Shall check Goldstone Switchboard M.2 SSD work to clarify
normally during System Thermal & Power Stress Testing. this test
Method: At the end of “GOLDSTONE-SYS007” test, perform M.2 case.
SSD read & write stress testing. Mandatory
Expected result: the M.2 SSD read/write stress testing should finish states SSD
within error threshold limit. workload
Guidance: Please see M.2 SSD for details during
GOLDSTON
E-SYS000,
but
Method
states SSD
test should
run at the
end of the
thermal
stress test.

Is this
separate
test than
GOLDSTON
E-SYS001 or
combine
SSD test in
GOLDSTON
E-SYS001?
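The per-port error limits quoted in Table 16 follow from multiplying a bit error rate of 10^-13 by the 100G line rate and the test duration. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the rounding step is only there to absorb floating-point noise.

```python
import math


def max_link_errors(ber: float, line_rate_bps: float, duration_s: float) -> int:
    """Allowed error count = BER x line rate (bit/s) x duration (s), rounded down."""
    expected = ber * line_rate_bps * duration_s
    # Round first so that a value like 11.999999999 is not floored to 11.
    return math.floor(round(expected, 6))


def port_passes(error_count: int, duration_s: float,
                ber: float = 1e-13, line_rate_bps: float = 100e9) -> bool:
    """True if the observed error count is within the budget for the run."""
    return error_count <= max_link_errors(ber, line_rate_bps, duration_s)
```

With these inputs, a 20-minute (1200 s) run allows up to 12 errors per port and each 5-minute (300 s) fan-failure run allows up to 3.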

7.12 PCIe I/O Devices
Each PCIe device/function has a 4KB Configuration Space accessible with PCI Configuration Read/Write commands, and the
first 256 bytes are compatible with the PCI Base specification.

The host (CPU) can test whether a PCIe device/function is present by reading the Vendor ID and Device ID registers at offsets 0x0-0x3;
the returned value should not be all 0s or all 1s.
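On a Linux host, this presence check could be sketched by reading the device's `config` file under sysfs. The helper names below are hypothetical, the S/B/D/F string must be supplied by the caller, and the device ID used in the example is a made-up placeholder rather than the real LS10 ID.

```python
import struct


def parse_ids(config: bytes) -> tuple[int, int]:
    """Return (vendor_id, device_id) from offsets 0x0-0x3 (little-endian)."""
    vendor_id, device_id = struct.unpack_from("<HH", config, 0)
    return vendor_id, device_id


def device_present(config: bytes) -> bool:
    """A device is treated as absent if the ID dword is all 0s or all 1s."""
    if len(config) < 4:
        return False
    return config[:4] not in (b"\x00" * 4, b"\xff" * 4)


def read_config(bdf: str, length: int = 64) -> bytes:
    """Read the start of config space, e.g. bdf = '0000:01:00.0' (Linux sysfs)."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read(length)


# Example with a synthetic header: vendor 0x10DE (Nvidia), placeholder device 0x1234.
EXAMPLE_HEADER = bytes.fromhex("de103412")
```

A test script would call `device_present(read_config(bdf))` for each expected device and then compare `parse_ids` against the plan-of-record values.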

Figure 28. PCIe Configuration Space Layout, 4KB


Offsets 0x70-0xA3 of the PCI Configuration Space hold the PCIe capability registers, as shown in Figure 29. The PCI Express Link
Capabilities Register (PCIE_LCAP) shows the maximum link width and speed that the device can support, and the PCI
Express Link Status Register (PCIE_LS) shows the current link width and link speed. By comparing these two registers, it
can be determined whether the link is established as expected:
PCIE_LCAP [3:0]:
0x1 – 2.5GT/s,
0x2 – 5GT/s,
0x3 – 8GT/s,
0x4 – 16GT/s,
0x5 – 32GT/s,
0x6 – 64GT/s,
0x7 – 128GT/s
PCIE_LS [9:4]:
0x1 – X1,
0x2 – X2,
0x4 – X4,
0x8 – X8,

0x10 – X16,
0x20 – X32.
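A minimal sketch of decoding the 16-bit Link Status value with the encodings listed above is shown below. It assumes the current link speed occupies bits [3:0] of PCIE_LS with the same codes as PCIE_LCAP and the negotiated width occupies bits [9:4]; the function names are hypothetical, and the default expectation of Gen3 X2 matches the PCH - LS10 links.

```python
# Speed codes (GT/s); only the codes up to Gen5 are mapped here.
SPEED_GTS = {0x1: 2.5, 0x2: 5.0, 0x3: 8.0, 0x4: 16.0, 0x5: 32.0}


def decode_link_status(link_status: int) -> tuple[float, int]:
    """Return (speed in GT/s, lane count) from a Link Status register value."""
    speed_code = link_status & 0xF          # bits [3:0]
    width = (link_status >> 4) & 0x3F       # bits [9:4]
    return SPEED_GTS.get(speed_code, 0.0), width


def link_as_expected(link_status: int, expected_gts: float = 8.0,
                     expected_width: int = 2) -> bool:
    """Compare the trained link against the expected speed and width."""
    return decode_link_status(link_status) == (expected_gts, expected_width)
```

For instance, a Link Status value with speed code 0x3 and width 0x2 in the low bits decodes to 8 GT/s, X2, regardless of the status flags in the upper bits.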

Please note that a PCIe device starts link training after reset without any software involvement, and software can trigger
a link retrain by setting the Retrain Link bit of the Link Control Register (offset 0x10) to 0x1.
The PCIe Capability Device Status register indicates whether any fatal or non-fatal error has happened; during normal operation its
error bits should be all 0s.
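The two register operations described here could be sketched as pure bit manipulation, leaving the actual configuration read/write to the test framework. The sketch assumes the standard PCIe layout, with Retrain Link at bit 5 of Link Control and the error-detected flags at bits [3:0] of Device Status; the names are hypothetical.

```python
RETRAIN_LINK = 1 << 5  # Link Control register, Retrain Link bit

# Device Status register error-detected flags (bits 0-3).
DEVSTA_ERROR_BITS = {
    0: "CorrectableError",
    1: "NonFatalError",
    2: "FatalError",
    3: "UnsupportedRequest",
}


def with_retrain_link(link_control: int) -> int:
    """Return the Link Control value to write back in order to retrain the link."""
    return (link_control | RETRAIN_LINK) & 0xFFFF


def device_status_errors(device_status: int) -> list[str]:
    """List the error flags that are set; an empty list means no error."""
    return [name for bit, name in DEVSTA_ERROR_BITS.items()
            if device_status & (1 << bit)]
```

Only the four error flags are examined, so other Device Status bits (such as AUX power detected) are not reported as failures.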

Figure 29. PCIe Capability Registers (the base address is typically 0x70)
Starting from PCIe Gen4, “Lane Margining at the Receiver Extended Capability” is defined as an optional but very useful
feature with Extended Capability ID of 0x27. These registers show the timing and voltage margins of each receiver lane. Please
refer to PCIe Gen4 or Gen 5 Base Specification Sections 4.2.13 & 7.7.6 for details.

Intel provides a PCIe Lane Margin tool at


https://www.intel.com/content/www/us/en/forms/developer/standard-registration.html; and

Nvidia provides mlxlink utility for lane margin scan at


https://www.intel.com/content/www/us/en/forms/developer/standard-registration.html.

lspci is a Linux command (part of pciutils) that lists all PCIe devices and can be used for some PCIe testing:
https://opensource.com/article/21/9/lspci-linux-hardware.

A PCIe endpoint device is identified as S/B/D/F (Segment Number, Bus Number, Device Number and Function Number
within a Device).

For the Goldstone Switchboard, the PCIe devices are listed in Table 17 and Figure 30. Please note that while the hardware design
supports M.2 NVMe SSD, the current configuration only uses M.2 SATA SSD for backward compatibility with the
standalone Kong Switch.

As PCIe Gen3 is used for the PCH - LS10 PCIe interconnect, advanced features such as “lane margining” are not available, as
they were introduced in PCIe Gen4.
Table 17. Goldstone Switchboard PCIe Devices
Item PCIe Device Description Note

01 CPU Intel Core i3-8100H Mobile CPU, 45W TDP, 4 cores, 3GHz, 6MB shared L3, 2x 64-bit ECC Root Port
DDR4 channels, 16x PCIe Gen3 (Up to 1x16,2x8,1x8+2x4)

02 PCH-H Chipset CM246 Chipset / PCH, 8GT/s DMI3 X4, max 16 ports with 24 lanes PCIe Gen3 in X1/X2/X4, Root Port
max 14x USB 3.1/2.0 ports, Max 6x 6G SATA3 ports, integrated 1GE LAN

03 LS10 (#1) Nvidia 64 Ports 100G NVLink4 Switch, connected to PCH PCIe[18:17], PCIe Gen3 X2 EP

04 LS10 (#2) Nvidia 64 Ports 100G NVLink4 Switch, connected to PCH PCIe[20:19], PCIe Gen3 X2 EP

05 I219AT Intel I219 1GE MAC Controller + PHY EP

06 M.2 NVMe SSD M.2 NVMe SSD, PCIe Gen4, X4, not being used as SATA M.2 SSD is default EP, N/A

Figure 30. Goldstone Node PCIe Devices

Figure 31. PCIe Lane Margining at Receiver Registers

Figure 32. COMe PCH Flex I/O Lane Mapping

Table 18. PCIe Device General Test Items

Test Nvbug Bug # &


Req-# Requirement Priority
Method tracking TS

GOLDSTO Name: PCH – LS10 (A) PCIe Link Width and Speed Auto P0 Same FCT-R
NE-DEV00 Mandatory: Should check the PCH – LS10 (A) PCIe Link training TS1B bug # as
0 result: Link Width and Speed GOLDSTO
Method: Read PCH corresponding Root Port’s PCI Express Link Status NE-LS000
Register (PCIE_LS)
Expected Result: PCIe Gen3, X2

GOLDSTO Name: PCH – LS10 (B) PCIe Link Width and Speed Auto P0 Same FCT-R
NE-DEV00 Mandatory: Should check the PCH – LS10 (B) PCIe Link training TS1B bug # as
1 result: Link Width and Speed GOLDSTO
Method: Read PCH corresponding Root Port’s PCI Express Link Status NE-LS000
Register (PCIE_LS)
Expected Result: PCIe Gen3, X2

GOLDSTO Name: LS10 (A) – PCH PCIe Link Width and Speed Auto P0 Same FCT-R
NE-DEV00 Mandatory: Should check the LS10 (A) – PCH PCIe Link training TS1B bug # as
2 result: Link Width and Speed GOLDSTO
Method: Read LS10 (A) ’s PCI Express Link Status Register (PCIE_LS) NE-LS000
Expected Result: PCIe Gen3, X2

GOLDSTO Name: LS10 (B) – PCH PCIe Link Width and Speed Auto P0 Same FCT-R
NE-DEV00 Mandatory: Should check the LS10 (B) – PCH PCIe Link training TS1B bug # as
3 result: Link Width and Speed GOLDSTO
Method: Read LS10 (B)’s PCI Express Link Status Register (PCIE_LS) NE-LS000
Expected Result: PCIe Gen3, X2

GOLDSTO Name: LS10 (A) ID and Revision Checking Auto P0 Same FCT-R
NE-DEV00 Mandatory: Should check LS10 (A) Vendor ID, Device ID and Revision TS1B bug # as
4 against POR GOLDSTO
Method: Read LS10 (A) PCIe Configuration Space’s Vendor ID, Device NE-LS000
ID and Revision Registers
Expected Result: Nvidia, LS10

GOLDSTO Name: LS10 (B) ID and Revision Checking Auto P0 Same FCT-R
NE-DEV00 Mandatory: Should check LS10 (B) Vendor ID, Device ID and Revision TS1B bug # as
5 against POR GOLDSTO
Method: Read LS10 (B) PCIe Configuration Space’s Vendor ID, Device NE-LS000
ID and Revision Registers
Expected Result: Nvidia, LS10

GOLDSTO Name: LS10 (A) PCIe Reset Function Auto P1 Same as FIN
NE-DEV00 Mandatory: Should check if LS10 (A)’s PCIe Reset Signal works EVT GOLDSTO
6 normally NE-LS008
Method: Assert LS10 (A)’s PCIe Reset signal for 10ms then Deassert
it, and wait for 1ms then read LS10 (A)’s PCIe Configuration Space’s
Command Register
Expected Result: The Command Register’s “Bus Master Enable” and
“ Memory Space Enable” bits should be cleared. Once this test is
done, LS10 (A) is disabled.

GOLDSTO Name: LS10 (B) PCIe Reset Function Auto P1 Same as FIN
NE-DEV00 Mandatory: Should check if LS10 (B)’s PCIe Reset Signal works EVT GOLDSTO
7 normally NE-LS008
Method: Assert LS10 (B)’s PCIe Reset signal for 10ms then Deassert
it, and wait for 1ms then read LS10 (B)’s PCIe Configuration Space’s
Command Register
Expected Result: The Command Register’s “Bus Master Enable” and
“ Memory Space Enable” bits should be cleared. Once this test is
done, LS10 (B) is disabled.

GOLDSTO Name: CPU – M.2 NVMe SSD PCIe Link Width and Speed Auto P0 Same as FCT-R
NE-DEV00 Mandatory: Should check the CPU – M.2 NVMe SSD PCIe Link EVT GOLDSTO
8 training result: Link Width and Speed only if NE-COM
Method: Read CPU corresponding Root Port’s PCI Express Link Status M.2 E001
Register (PCIE_LS) NVMe
Expected Result: PCIe Gen3, X4 SSD is
used

GOLDSTO Name: M.2 NVMe SSD – PCH PCIe Link Width and Speed Auto P0 Same as FCT-R
NE-DEV00 Mandatory: Should check the M.2 NVMe SSD – PCH PCIe Link EVT GOLDSTO
9 training result: Link Width and Speed only if NE-COM
Method: Read M.2 NVMe SSD’s PCI Express Link Status Register M.2 E001
(PCIE_LS) NVMe
Expected Result: PCIe Gen3, X4 SSD is
used

GOLDSTO Name: M.2 NVMe SSD ID and Revision Checking Auto P0 Same as FCT-R
NE-DEV01 Mandatory: Should check M.2 NVMe SSD Vendor ID, Device ID and EVT GOLDSTO
0 Revision against POR only if NE-COM
Method: Read M.2 NVMe SSD PCIe Configuration Space’s Vendor ID, M.2 E001
Device ID and Revision Registers NVMe
Expected Result: Per POR SSD is
used

GOLDSTO Name: M.2 NVMe SSD PCIe Reset Function Auto P1 N/A - OS FIN
NE-DEV01 Mandatory: Should check if M.2 NVMe SSD’s PCIe Reset Signal works EVT is
1 normally only if installed
Method: Assert M.2 NVMe SSD’s PCIe Reset signal for 10ms then M.2 on a only
Deassert it, wait for 1ms and then read M.2 NVMe SSD’s PCIe NVMe M.2 on
Configuration Space’s Command Register SSD is the
Expected Result: The Command Register’s “Bus Master Enable” and used system.
“ Memory Space Enable” bits should be cleared. Once this test is This will
done, M.2 NVMe SSD is disabled. crash the
system
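The expected result of the PCIe reset tests in Table 18 can be expressed as a check on two bits of the Command register. The sketch below assumes the standard PCI layout (Memory Space Enable at bit 1, Bus Master Enable at bit 2); the function name is hypothetical, and reading the register itself is left to the test framework.

```python
MEMORY_SPACE_ENABLE = 1 << 1  # Command register bit 1
BUS_MASTER_ENABLE = 1 << 2    # Command register bit 2


def reset_cleared_command(command: int) -> bool:
    """True if both enable bits are cleared, as expected right after a reset."""
    return (command & (MEMORY_SPACE_ENABLE | BUS_MASTER_ENABLE)) == 0
```

A value of 0x0006 read before the reset and 0x0000 read after it would indicate that the reset signal reached the device.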

7.13 PCIe Enhancement
Please refer to the following links about PCI Express Advanced Error Reporting (AER) mechanism that applies to PCIe
Root Port(s), as well as the driver for Linux OS to use:
https://www.kernel.org/doc/html/latest/PCI/pcieaer-howto.html
https://www.design-reuse.com/articles/38374/pcie-error-logging-and-handling-on-a-typical-soc.html
The Linux OS captures and saves PCIe AER error statistical counters at /sys/bus/pci/devices/<dev>/ as documented by:
https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
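A minimal sketch of parsing one of the per-device AER statistics files (for example `aer_dev_correctable`) is shown below. It assumes each line holds an error name followed by a count, as described in the kernel ABI document above; the sample text is illustrative only and the function names are hypothetical.

```python
def parse_aer_stats(text: str) -> dict[str, int]:
    """Parse 'name count' lines from an aer_dev_* sysfs file into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The error name may contain spaces; the count is the last token.
        name, _, count = line.rpartition(" ")
        stats[name.strip()] = int(count)
    return stats


def nonzero_counters(stats: dict[str, int]) -> dict[str, int]:
    """Return only the counters that recorded at least one error."""
    return {name: count for name, count in stats.items() if count > 0}


# Illustrative sample content, not captured from real hardware.
SAMPLE = """Receiver Error 2
Bad TLP 0
Bad DLLP 0
TOTAL_ERR_COR 2
"""
```

In a test, the file contents would be read from `/sys/bus/pci/devices/<dev>/aer_dev_correctable` (and the fatal / nonfatal counterparts), and any entry returned by `nonzero_counters` would be logged against the pass criteria.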
Table 19. PCIe Enhancement Test Items

Test Nvbug Bug # &


Req-# Requirement Priority
Method tracking TS

GOLDSTO Name: PCIe AER Counter Check Auto P0 Jinsh… FCT-R


NE-PCI000 Mandatory: Shall check the correctable errors, uncorrectable fatal EVT to file
errors, uncorrectable nonfatal errors, aer_rootport_total_err_cor, and post
aer_rootport_total_err_fatal, aer_rootport_total_err_nonfatal, bug #
defined by
https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-p
ci-devices-aer_stats
Guidance: System, Linux

GOLDSTO Name: PCH-LS10 PCIe Interop Auto P1 Jinsh… FCT-R


NE-PCI001 Mandatory: Shall re-train PCH-LS10 (A) and PCH-LS10 (B) PCIe links EVT to file
25 times using the PCIe Base Spec 7.5.3.7 Link Control Register and post
(address offset 0x10), and check link training result. Please refer to bug #
PCIe Base Specification 7.5.3.7 for special attention.
Guidance: System.

GOLDSTO Name: CPU-M.2 NVMe SSD PCIe Interop Auto P1 N/A - FCT-R
NE-PCI002 Mandatory: Shall re-train CPU-M.2 NVMe SSD PCIe links 25 times EVT Manufact
using the PCIe Base Spec 7.5.3.7 Link Control Register (address offset only if uring test
0x10), and check link training result. Please refer to PCIe Base M.2 environm
Specification 7.5.3.7 for special attention. NVMe ent
Guidance: System. SSD is would
used have OS
installed
on the
only M.2
in
Goldston
e.
Resetting
the M.2
would
crash the
system

GOLDSTO Name: PCIe Error Check Auto P0 Same as FCT-R


NE-PCI003 Mandatory: Shall Check and report error bits in the Status Register EVT GOLDSTO
(address offset 0x6) of PCIe Configuration Space and Device Status NE-PCI00
Register(address offset 0xA) in PCIe Capability Structure, at the end 0
of FCT

NVIDIA Confidential

The controlled copy of this document resides in PDP and printed copies of it are for reference only. Printed Date: 9/28/2021 2:50 PM

68
Guidance: System, parse lspci -vvvs output
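The retrain and error-bit checks above can be sketched as register decoding logic. The bit positions are those of the standard PCI Status, PCIe Link Control, Link Status and Device Status registers; the `read16`/`write16` accessors, the polling limit and all function names are hypothetical stand-ins for whatever configuration-space access the diag provides. Note that the Retrain Link bit is set on the Downstream Port side of the link (for example the Root Port), not on the endpoint.

```python
# Offsets relative to the PCI Express Capability structure.
LINK_CONTROL_OFFSET = 0x10
LINK_STATUS_OFFSET = 0x12
RETRAIN_LINK = 1 << 5       # Link Control bit 5: Retrain Link
LINK_TRAINING = 1 << 11     # Link Status bit 11: Link Training in progress

STATUS_ERROR_BITS = {       # Status Register, configuration space offset 0x06
    8: "Master Data Parity Error",
    11: "Signaled Target Abort",
    12: "Received Target Abort",
    13: "Received Master Abort",
    14: "Signaled System Error",
    15: "Detected Parity Error",
}
DEVICE_STATUS_ERROR_BITS = {  # Device Status Register, capability offset 0x0A
    0: "Correctable Error Detected",
    1: "Non-Fatal Error Detected",
    2: "Fatal Error Detected",
    3: "Unsupported Request Detected",
}


def decode_link_status(value):
    """Split a Link Status value into speed code, width and training flag."""
    return {
        "speed": value & 0xF,            # 1=2.5, 2=5, 3=8, 4=16 GT/s
        "width": (value >> 4) & 0x3F,    # negotiated lane count
        "training": bool(value & LINK_TRAINING),
    }


def decode_error_bits(value, bit_names):
    """Return the names of all error bits that are set in a status register."""
    return [bit_names[b] for b in sorted(bit_names) if value & (1 << b)]


def retrain_and_check(read16, write16, cap_base, expected_speed,
                      expected_width, loops=25, max_polls=1000):
    """Retrain the link `loops` times; return the number of failed retrains.

    read16(addr) and write16(addr, value) are hypothetical config accessors.
    """
    failures = 0
    for _ in range(loops):
        ctrl = read16(cap_base + LINK_CONTROL_OFFSET)
        write16(cap_base + LINK_CONTROL_OFFSET, ctrl | RETRAIN_LINK)
        status = read16(cap_base + LINK_STATUS_OFFSET)
        polls = 0
        while status & LINK_TRAINING and polls < max_polls:
            status = read16(cap_base + LINK_STATUS_OFFSET)
            polls += 1
        link = decode_link_status(status)
        if (link["training"] or link["speed"] != expected_speed
                or link["width"] != expected_width):
            failures += 1
    return failures
```

A real implementation would also add a time-based timeout and would read the error registers after the loop, for example with `decode_error_bits(status_value, STATUS_ERROR_BITS)`.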

https://pcisig.com/faq?field_category_value%5B%5D=pci_express_3.0&keys=

https://mjmwired.net/kernel/Documentation/ABI/testing/sysfs-bus-pci

7.14 M.2 SSDs
Each Goldstone node has one M.2 SSD for OS and Fabric Manager software storage. The hardware design supports both M.2 SATA and M.2 NVMe SSD options, and defaults to a SATA 6G M.2 SSD for compatibility with Kong Switch. As of the writing of this document, the M.2 SATA SSD model number is under selection. As shown in Figure 33, NVMe SSDs have increasingly dominated the SSD market since 2020 for the following reasons:
● Much higher IOPS & BW performance than SATA SSD
● Minor price premium over SATA SSD (NAND Flash devices dominate the SSD cost)
● Built-in NVMe SSD device driver in OS
● Richer performance benchmarking and debugging software tools

Figure 33. SSD Interface Market Forecast (Source: IDC, 2020.12)

The following software tools are very useful for PCIe & NVMe SSD testing and diagnosis:

● PCIe Lane Margin Tool: available at https://www.intel.com/content/www/us/en/forms/developer/standard-registration.html, or the NVIDIA mlxlink utility.
● NVMe-CLI: an open-source tool for Linux to manage NVMe SSDs, available for download at https://github.com/linux-nvme/nvme-cli
● IOMETER: a very popular open-source HDD/SSD performance benchmarking tool.

Table 20. M.2 SATA SSD Test Items (default)

Req-#: GOLDSTONE-SATA000
Name: M.2 SATA SSD Vendor ID, Model No. & Quantity
Mandatory: Shall check the quantity of M.2 SATA SSD devices (1 x M.2).
Method: Use SATA Identify Device Command to access M.2 SATA SSD.
Expected Result: The Vendor ID and Model No. should match POR or BOM, and the Qty should be 1.
Guidance: PCH SATA Port 0A, Identify Device Return Word
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA001
Name: M.2 SATA SSD Capacity
Mandatory: Shall check that the M.2 SATA SSD capacity matches POR or BOM.
Method: Use SATA Identify Device Command to access M.2 SATA SSD.
Expected Result: The M.2 SATA SSD capacity should match POR or BOM.
Guidance: PCH SATA Port 0A, Identify Device Return Word 106
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA002
Name: M.2 SATA SSD FW Version
Mandatory: Shall check that the M.2 SATA SSD FW version is equal to or higher than POR or BOM.
Method: Use SATA Identify Device Command to access M.2 SATA SSD.
Expected Result: The M.2 SATA SSD FW version should be equal to or higher than POR or BOM.
Guidance: PCH SATA Port 0A, Identify Device Return Word 23-26
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA003
Name: M.2 SATA SSD Interface Speed
Mandatory: Shall check the M.2 SATA SSD interface speed.
Method: Use SATA Identify Device Command to access M.2 SATA SSD.
Expected Result: SATA 3 (6G)
Guidance: PCH SATA Port 0A, Identify Device Return Word 79.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA004
Name: M.2 SATA SSD SMART Data Read
Mandatory: Shall read and check M.2 SATA SSD SMART data for E2E Error Correction Count, Uncorrectable Error Count and Temperature.
Method: Use SATA SMART Data Read Command.
Expected Result: Log the first-read error counts; temperature below 72°C.
Guidance: PCH SATA Port 0A, SMART Data IDs 0xB8, 0xBB, 0xBE
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA005
Name: M.2 SATA SSD Read-Write Operations & Performance
Mandatory: Shall check the M.2 SATA SSD read-write operations and performance.
Method: Write 1MB of data to the M.2 SATA SSD; read back and compare. May use the IOMeter utility.
Expected Result: The read-back data should match the written data; I/O performance should be 75% or higher of the M.2 SSD specification.
Guidance: PCH SATA Port 0A, IOMeter, SATA Read & Write Commands
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Jinshui … to file and post bug # | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA006
Name: M.2 SATA SSD SMART Data Read to Check Error Count
Mandatory: Shall read and check M.2 SATA SSD SMART data for E2E Error Correction Count, Uncorrectable Error Count and Temperature after the 1MB data write and read.
Method: Use SATA SMART Data Read Command.
Expected Result: Compare the new error counts to the values logged in GOLDSTONE-SATA004; the error counts should be within threshold limits; temperature below 72°C.
Guidance: PCH SATA Port 0A, SMART Data IDs 0xB8, 0xBB, 0xBE
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same as GOLDSTONE-SATA004 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-SATA007
Name: M.2 SATA SSD FW Update
Mandatory: Shall check that the M.2 SATA SSD FW can be updated successfully.
Method: Please refer to the M.2 SATA SSD FW update tools.
Expected Result: The M.2 SATA SSD FW can be updated successfully and reliably.
Guidance: M.2 SATA SSD FW Tools from Vendor
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: N/A - One Diag does not support FW flashing | Bug # & TS: FLA
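The before/after SMART comparison in the SATA test items can be sketched as follows. The attribute IDs and the 72°C limit come from the table; the allowed error growth (`max_new_errors`) is a hypothetical parameter, since the threshold limits are not defined here, and the values are assumed to be already-decoded integers (vendors may pack the raw temperature attribute differently).

```python
SMART_E2E_ERROR = 0xB8       # End-to-End error count
SMART_UNCORRECTABLE = 0xBB   # Uncorrectable error count
SMART_TEMPERATURE = 0xBE     # Temperature in degrees Celsius
TEMP_LIMIT_C = 72


def check_smart(baseline, current, max_new_errors=0):
    """Compare two {attribute_id: value} snapshots; return a list of problems.

    `baseline` is the first read (before I/O), `current` the read after I/O.
    An empty list means the check passed.
    """
    problems = []
    for attr in (SMART_E2E_ERROR, SMART_UNCORRECTABLE):
        delta = current[attr] - baseline[attr]
        if delta > max_new_errors:
            problems.append(f"attribute 0x{attr:02X} grew by {delta}")
    temp = current[SMART_TEMPERATURE]
    if temp >= TEMP_LIMIT_C:
        problems.append(f"temperature {temp}C is not below {TEMP_LIMIT_C}C")
    return problems
```

Returning a list of problems rather than a single boolean makes it easy to log every violated limit in one pass.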

Table 21. M.2 & NVMe SSD Test Items (only applicable with M.2 NVMe SSD, not used in current version)

Req-#: GOLDSTONE-NVM000
Name: M.2 NVMe SSD Quantity
Mandatory: Shall check the quantity of M.2 NVMe SSD devices (1 x M.2).
Method: Read the M.2 NVMe SSD’s PCIe Configuration Space Vendor ID and Device ID registers.
Expected Result: The Vendor ID in the SSD’s PCIe Configuration Space should match POR or BOM; the quantity should be 1.
Guidance: COMe CPU PCIe Lanes [11:8]
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM001
Name: M.2 NVMe SSD Version
Mandatory: Shall check the version of the M.2 NVMe SSD device (1 x M.2).
Method: COMe CPU to access the M.2 NVMe SSD connected to CPU PCIe Lanes [11:8] with PCIe Configuration Read CMD.
Expected Result: The REVID in the SSD’s PCIe Configuration Space should be 0x00 or higher.
Guidance: COMe CPU PCIe Lanes [11:8]
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM002
Name: M.2 NVMe SSD PCIe Interface Capability and Status
Mandatory: Shall check the PCIe interface width and speed of the M.2 NVMe SSD device (1 x M.2).
Method: COMe CPU to access the M.2 NVMe SSD connected to CPU PCIe Lanes [11:8] with PCIe Configuration Read CMD.
Expected Result: PCI Express Link Capabilities Register: “Max Link Width” should be 0x4, “Max Link Speed” should be 0x4 (PCIe Gen4); PCI Express Link Status Register: “Negotiated Link Width” should be 0x4, “Current Link Speed” should be 0x3 (Gen3); the Device Status Register in the PCI Express Capability Structure should be 0x00, indicating no error.
Guidance: COMe CPU PCIe Lanes [11:8]
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM003
Name: M.2 NVMe SSD Capacity
Mandatory: Shall check the size of the M.2 NVMe SSD device (1 x M.2).
Method: COMe CPU to access the M.2 NVMe SSD connected to CPU PCIe Lanes [11:8] with the NVMe Identify Command or the nvme-cli utility.
Expected Result: The Total NVM Capacity returned by the NVMe Identify Command should match POR or BOM.
Guidance: COMe CPU PCIe Lanes [11:8] & nvme-cli.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM004
Name: M.2 SMART Data Access
Mandatory: Shall check access to the SMART data of the M.2 NVMe SSD device (1 x M.2).
Method: COMe CPU to access the M.2 NVMe SSD connected to CPU PCIe Lanes [11:8] with the NVMe Get Log Page Command or the nvme-cli nvme smart-log command. nvme-cli is a utility for Linux.
Expected Result: The SMART critical warning should be 0; the SMART temperature should be below 72°C.
Guidance: COMe CPU PCIe Lanes [11:8] & nvme-cli.
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same as GOLDSTONE-SATA004. Jinshui … to confirm if manufacturing test setup will have SATA or NVMe M.2 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM005
Name: M.2 Read/Write Operations and Performance Test
Mandatory: Shall check read/write operations of the M.2 NVMe SSD device (1 x M.2).
Method: COMe CPU to access the M.2 NVMe SSD connected to CPU PCIe Lanes [11:8] with NVMe Read / Write commands or the IOMETER software (preferred).
Expected Result: The read / write operations should finish normally, the data read should match the data written, and the sequential and random read / write IOPS / bandwidth performance should not be below 75% of specification.
Guidance: COMe CPU PCIe Lanes [11:8] & IOMeter
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same as GOLDSTONE-SATA005 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM006
Name: M.2 SSD SMBus Interface VPD (SMBus address 1010011b)
Mandatory: Shall check read/write operations of the M.2 devices (1 x M.2).
Method: COMe CPU to access the VPD data of each M.2 (1 x M.2) with SMBus read CMD.
Expected Result: The Vendor ID & Model Number should match POR or BOM; the M.2 NVMe SSD Port 0 max speed and Port 0 max width should be 0x04; the warning threshold should be 0x0480.
Guidance: COMe CPU PCIe Lanes [11:8] & M.2 SSD VPD
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same as GOLDSTONE-SATA004 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-NVM007
Name: M.2 SSD SMBus Interface Temperature (SMBus address 1101010b)
Mandatory: Shall check the temperature of the M.2 device (1 x M.2).
Method: COMe CPU to access the VPD data of each M.2 (1 x M.2) via COMe I2C with SMBus read and write CMDs; set the high temperature limit to 72°C and the low temperature limit to 5°C.
Expected Result: The values read back for the high / low temperature limits should equal the values written; the current ambient temperature should be below 72°C.
Guidance: COMe CPU PCIe Lanes [11:8] & M.2 SSD VPD
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same bug # as GOLDSTONE-SNR000 | Bug # & TS: FCT-R
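The read/write compare and the 75% performance criterion used by the SATA and NVMe read-write items can be illustrated with a small file-based sketch. This is an assumption-laden illustration rather than the production method: it writes through the filesystem instead of issuing raw SATA/NVMe commands, the read may be served from the OS page cache, and a 1MB transfer is far too small for a meaningful benchmark, which is why IOMeter is the preferred tool. The function names and `path` argument are hypothetical.

```python
import os
import time


def write_read_compare(path, size=1024 * 1024):
    """Write `size` random bytes to `path`, read them back and compare.

    Returns (data_matches, write_bytes_per_s, read_bytes_per_s).
    """
    data = os.urandom(size)

    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the data to the device before timing stops
    write_s = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    with open(path, "rb") as f:
        read_back = f.read()
    read_s = max(time.perf_counter() - start, 1e-9)

    return read_back == data, size / write_s, size / read_s


def meets_spec(measured, spec, ratio=0.75):
    """True when the measured figure is at least `ratio` of the spec figure."""
    return measured >= ratio * spec
```

For example, `meets_spec(measured_read_bw, datasheet_read_bw)` applies the 75% rule to any IOPS or bandwidth figure, whichever tool produced the measurement.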

Figure 34. Example of SMART log returned by nvme-cli / nvme smart-log
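A SMART log like the one in Figure 34 can also be checked programmatically when nvme-cli is asked for JSON output (for example `nvme smart-log <device> -o json`). The sketch below only parses text that has already been captured. It assumes the JSON carries integer `critical_warning` and `temperature` fields and that the temperature is reported in Kelvin; both assumptions should be verified against the nvme-cli version in use.

```python
import json


def check_smart_log(json_text, limit_c=72):
    """Check an nvme-cli smart-log JSON dump.

    Returns (passed, temperature_in_celsius). Passing requires a critical
    warning of 0 and a temperature below `limit_c`.
    """
    log = json.loads(json_text)
    temp_c = log["temperature"] - 273  # assumption: JSON temperature is in Kelvin
    passed = log["critical_warning"] == 0 and temp_c < limit_c
    return passed, temp_c
```

Usage, with a hypothetical captured string: `check_smart_log('{"critical_warning": 0, "temperature": 313}')` evaluates a drive at 40°C with no critical warning.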

7.15 FPGA/CPLD Devices
There are two FPGA devices on the Goldstone Switchboard, as follows, even though they may be called CPLDs in some design documents:
● Main CPLD: U77, MACHX03D_9400LUT_BG400 from Lattice Semi, for Switchboard main control functions, accessible to the COMe CPU via the PCH LPC interface.
● Port CPLD: U1_CP1, LCMXO3D-9400HC-5BG256C from Lattice Semi, for OSFP port related control. As each LS10 has 8x OSFP ports, there are 2x I2C interfaces on the Port CPLD, one for each LS10’s I2CA port.
The MachX03 is an FPGA because it uses internal SRAM to hold the configuration bits that define the user’s chip functions. Upon power-on or reconfiguration, the MachX03 internal configuration engine (boot loader) automatically downloads the configuration bits from internal Flash (dual copies) or from an external device/interface, as determined by the Configuration Mode.
As shown in Figure 35, LCMX03D and MACHX03D are in the same MachX03D family, which uses internal dual-boot Flash for configuration. The MachX03D CPLD can be configured by one of the following methods, as shown in Figure 37:
● JTAG: Uses the Lattice Semi Diamond Programmer software and a USB-JTAG dongle to program the MachX03D FPGA internal Configuration Flash. This mode is used in the Goldstone Switchboard for in-system programming by driving the JTAG signals from COMe CPLD registers; please refer to Figure 38.
● SDM: Self Download Mode. The FPGA downloads the configuration bits from internal Configuration Flash to configuration SRAM.
● MSPI: Master SPI Mode. The FPGA acts as an SPI master and loads the configuration bits from external SPI/QSPI Flash; FPGA FW update is done by updating the external SPI Flash. Not used in the Goldstone Switchboard design.
● SSPI: Slave SPI Mode. The FPGA acts as an SPI slave, and an external controller acting as the SPI master downloads the configuration bits to the FPGA’s internal Configuration SRAM or SPI Flash. Not used in the Goldstone Switchboard design.
● I2C: The FPGA acts as an I2C slave device, and an external I2C master downloads the configuration bits to the FPGA’s internal Configuration SRAM or SPI Flash. Not used in the Goldstone Switchboard design. The FPGA’s I2C slave address is 0x40 (7b) or 0x3C0 (10b).

As shown in Figure 36, the MachX03D FPGA’s configuration status could be monitored by checking the INIT and DONE pins, but the current design leaves them just pulled up. If the Main CPLD cannot be configured successfully, the COMe will not be able to boot.

Also as shown in Figure 36, the MachX03D FPGA could be reconfigured by toggling its PROGRAMN pin, but the current design leaves it just pulled up.

Figure 35. Differences between MACHX03D and LCMX03D Families


The layout of the MachX03D internal Configuration Flash is shown in Figure 39. Two copies of the configuration bits are supported, one in CFG0 and another in CFG1. The Feature Row defines some of the FPGA’s initial resource configurations, as shown in Figure 40. As shown, the JTAG and Slave SPI interfaces are always enabled.

When the FPGA internal Configuration Flash is being programmed (FPGA FW updated) via the JTAG interface, the FPGA continues running its previously configured user functions until a “Transfer Refresh” JTAG command is issued.

The Goldstone Switchboard supports updating the Main CPLD and Port CPLD (actually FPGAs) through an external JTAG USB dongle and the Diamond Programmer software from Lattice Semi via connector J67.

Figure 36. MachX03D FPGA Configuration Process

Figure 37. MachX03D FPGA Configuration Ports (sysCONFIG)

Figure 38. Goldstone Switchboard In-System Programming Path

Figure 39. MachX03D Internal Flash Layout

Figure 40. MachX03D Feature Row Elements

Figure 41. Main and Port CPLDs CPU Access Paths

Table 22. MAIN CPLD Test Items

Req-#: GOLDSTONE-CPLD000
Name: Main CPLD Version
Mandatory: Shall check for the correct Main CPLD version.
Method: via LPC or B2B_I2C interface.
Expected Result: Correct Main CPLD version per TS.
Guidance: See Main CPLD Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-CPLD001
Name: Main CPLD B2B_I2C Interface
Mandatory: Shall check the Main CPLD B2B_I2C interface with the COMe CPLD.
Method: Read/write a Main CPLD internal register via the B2B_I2C interface.
Expected Result: Correct Main CPLD internal register access.
Guidance: See Main CPLD Spec
Note: Main CPLD LPC checking is in “GOLDSTONE-COME011”
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-COME011 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-CPLD002
Name: Main CPLD SML Interface
Mandatory: Shall check the Main CPLD SML interface with the COMe PCH.
Method: Read/write a Main CPLD internal register via the SML interface.
Expected Result: Correct Main CPLD internal register access.
Guidance: See Main CPLD Spec
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Jinshui … to clarify this request - unable to find information about checking CPLD SML interface in main CPLD spec | Bug # & TS: FCT-R

Req-#: GOLDSTONE-CPLD003
Name: Main CPLD GPIO-JTAG Programming Interface
Mandatory: Shall check the Main CPLD GPIO-JTAG interface with the COMe CPLD.
Method: Check the Main CPLD JTAG Device ID from its JTAG interface.
Expected Result: Correct Main CPLD JTAG Device ID.
Guidance: See Main CPLD Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Jinshui … to clarify how this is checked - is this checked from CPLD version name? | Bug # & TS: FLA

Req-#: GOLDSTONE-CPLD004
Name: Main CPLD FW Update
Mandatory: Shall check that the Main CPLD FW can be updated successfully & reliably.
Method: Update the Main CPLD FW via the COMe CPLD GPIO-JTAG interface.
Expected Result: The Main CPLD FW is updated successfully every time.
Guidance: See MachX03D Programming Guide & cpldupdate --gpio <file>
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: N/A - One Diag does not support flashing | Bug # & TS: FLA

Req-#: GOLDSTONE-CPLD005
Name: Main CPLD and Port CPLD Sync Interface
Mandatory: Shall check the SYNC interface between the Main CPLD & Port CPLD.
Method: See Main CPLD and Port CPLD Design Spec.
Expected Result: Correct Sync interface communications between the Main CPLD & Port CPLD.
Guidance: See Main CPLD and Port CPLD Design Specs.
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Jinshui … to clarify guidance - what can be used to check SYNC bus status similar to GOLDSTONE-COME012 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-CPLD005
Name: Port CPLD Version
Mandatory: Shall check for the correct Port CPLD version.
Method: via B2B_I2C interface.
Expected Result: Correct Port CPLD version per TS.
Guidance: See Port CPLD Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-CPLD006
Name: Port CPLD GPIO-JTAG Programming Interface
Mandatory: Shall check the Port CPLD GPIO-JTAG interface with the COMe CPLD.
Method: Check the Port CPLD JTAG Device ID from its JTAG interface.
Expected Result: Correct Port CPLD JTAG Device ID.
Guidance: See Port CPLD Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-CPLD003 | Bug # & TS: FLA

Req-#: GOLDSTONE-CPLD007
Name: Port CPLD FW Update
Mandatory: Shall check that the Port CPLD FW can be updated successfully & reliably.
Method: Update the Port CPLD FW via the COMe CPLD GPIO-JTAG interface.
Expected Result: The Port CPLD FW is updated successfully every time.
Guidance: See MachX03D Programming Guide & cpldupdate --gpio <file>
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: N/A - One Diag does not support flashing | Bug # & TS: FLA

Req-#: GOLDSTONE-CPLD008
Name: LS10(A) – Port CPLD I2C Interface
Mandatory: Shall check the LS10 (A) – Port CPLD I2C interface.
Method: Read/write Port CPLD internal registers via LS10 (A) I2CA.
Expected Result: Port CPLD internal register(s) are read/written correctly.
Guidance: See LS10 I2CA and Port CPLD Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-I2C002 | Bug # & TS: FCT-R

Req-#: GOLDSTONE-CPLD009
Name: LS10(B) – Port CPLD I2C Interface
Mandatory: Shall check the LS10 (B) – Port CPLD I2C interface.
Method: Read/write Port CPLD internal registers via LS10 (B) I2CA.
Expected Result: Port CPLD internal register(s) are read/written correctly.
Guidance: See LS10 I2CA and Port CPLD Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-I2C002 | Bug # & TS: FCT-R

7.16 FRU EEPROM
The Goldstone Switchboard works with three FRU EEPROM chips: one for the Goldstone Switchboard FRU, one reserved for the Cable Backplane & Chassis FRU, and one Cable Backplane & Chassis FRU located in the chassis.
The Cable Backplane & Chassis FRU (CB&C FRU) EEPROM is in the chassis and is accessed via an I2C interface on the Power connector; it does not move with the Keystone node or the Goldstone node. There is one CB&C FRU EEPROM per slot, for a total of 11 in the Ranger Chassis. Ranger Chassis-level integration manufacturing testing needs to ensure FRU content consistency across these 11 CB&C FRU EEPROM devices.
The two on-board FRU EEPROM devices are connected through COMe CPU-PCH-(LPC)-CPLD U10 and are controlled by the U10 CPLD, as shown in Figure 22.
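The chassis-level consistency requirement across the 11 CB&C FRU EEPROM devices could be checked with a simple majority comparison, sketched below. It assumes the compared FRU content (or the selected fields of it) has already been read into bytes per slot and is expected to be identical in every slot; which fields must match is not defined here, and the function name is hypothetical.

```python
from collections import Counter


def inconsistent_slots(contents):
    """Return the slots whose FRU content differs from the majority content.

    `contents` maps slot number -> bytes read from that slot's CB&C FRU EEPROM.
    An empty list means all slots agree.
    """
    majority, _ = Counter(contents.values()).most_common(1)[0]
    return sorted(slot for slot, data in contents.items() if data != majority)
```

Reporting the outlier slots, rather than a plain pass/fail, points the operator directly at the EEPROM that needs reprogramming.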

Table 23. FRU EEPROM Test Items

FRU # | FRU Description | COMe I2C | COMe I2C Address
1 | Goldstone Switchboard FRU: U92, M24512 (64KB), U66/U67 MUX < B2B Connector < COMe. | CPLD LPCI2C | 0xA2 (8b)
2 | Reserved FRU for Cable Backplane: U190, M24512 (64KB), U66/U67 MUX < B2B Connector < COMe. Not programmed with unique FRU data during Switchboard MFG FCT | CPLD LPCI2C | 0xA4 (8b)
3 | Cable Backplane & Chassis (CB&C) FRU: in the chassis and accessed via an I2C interface on the Power Signal connector | CPLD LPCI2C | TBD

Req-#: GOLDSTONE-FRU000
Name: Switchboard FRU EEPROM Read
Mandatory: Shall check the ability to read the Switchboard FRU EEPROM content.
Guidance: U92 (M24512) via COMe CPLD LPC2I2C I2C
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU001
Name: Switchboard FRU EEPROM Programming
Performance: Shall check the ability to program the Switchboard FRU EEPROM content.
Guidance: U92 (M24512) via COMe CPLD LPC2I2C I2C
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: N/A - One Diag does not support flashing | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU002
Name: Switchboard FRU EEPROM Write Protection
Mandatory: Shall check the write protection of the Switchboard FRU EEPROM.
Guidance: U92 (M24512) via COMe CPLD LPC2I2C I2C. Write protection is controlled by the Main CPLD.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU003
Name: Switchboard FRU EEPROM with FRU Data
Mandatory: Shall report, program and check the Switchboard FRU EEPROM with board-unique data, at an early stage of FCT.
Guidance: Program each Switchboard FRU EEPROM with its unique FRU data.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001. Note that One Diag does not support programming FRU data | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU004
Name: Reserved FRU EEPROM Read
Mandatory: Shall check the ability to read the Reserved FRU EEPROM content.
Guidance: U190 (M24512) via COMe CPLD LPC2I2C I2C
Test Method: Auto | Priority: P1 DVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU005
Name: Reserved FRU EEPROM Programming
Performance: Shall check the ability to program the Reserved FRU EEPROM content.
Guidance: U190 (M24512) via COMe CPLD LPC2I2C I2C
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001. Note that One Diag does not support programming FRU data | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU006
Name: Reserved FRU EEPROM Write Protection
Mandatory: Shall check the write protection of the Reserved FRU EEPROM.
Guidance: U190 (M24512) via COMe CPLD LPC2I2C I2C. Write protection is controlled by the Main CPLD.
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FLA

Req-#: GOLDSTONE-FRU007
Name: CB&C FRU EEPROM Read
Mandatory: Shall check the ability to read the CB&C FRU EEPROM content.
Guidance: via COMe CPLD LPCI2C
Test Method: Auto | Priority: P0 Ranger Chassis | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT

Req-#: GOLDSTONE-FRU008
Name: CB&C FRU EEPROM Programming
Performance: Shall check the ability to program the CB&C FRU EEPROM content.
Guidance: via COMe CPLD LPCI2C
Test Method: Auto | Priority: P0 Ranger Chassis | Nvbug tracking: N/A - One Diag does not support programming FRU data | Bug # & TS: FCT

Req-#: GOLDSTONE-FRU009
Name: CB&C FRU EEPROM Write Protection
Mandatory: Shall check the write protection of the CB&C FRU EEPROM.
Guidance: via COMe CPLD LPCI2C
Test Method: Auto | Priority: P0 Ranger Chassis | Nvbug tracking: Same bug # as GOLDSTONE-COME001 | Bug # & TS: FCT

Note: No FLA is planned for Ranger Chassis level integration manufacturing testing, only FCT + RIN + ACC (AC Cycling)
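The write-protection items for the FRU EEPROMs follow a common pattern: attempt a write while protection is enabled, then confirm the content is unchanged. A minimal sketch follows; `read_byte` and `write_byte` are hypothetical accessors for the EEPROM behind the CPLD I2C path, and the choice of probing a single byte is an assumption.

```python
def check_write_protect(read_byte, write_byte, offset):
    """Return True if a write to `offset` is blocked by write protection.

    read_byte(offset) -> int and write_byte(offset, value) are hypothetical
    accessors. If the write unexpectedly succeeds, the original byte is
    restored before returning False.
    """
    original = read_byte(offset)
    probe = original ^ 0xFF  # guaranteed to differ from the stored value
    try:
        write_byte(offset, probe)
    except OSError:
        pass  # a rejected (NACKed) write is consistent with protection
    if read_byte(offset) == original:
        return True
    write_byte(offset, original)  # not protected: undo the probe write
    return False
```

A real test would also wait for the EEPROM's internal write cycle to complete before the read-back, and would typically repeat the check with protection disabled to prove the control path works in both directions.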

7.17 EROT & EROT-Protected SW/FW
The programming paths for Goldstone Switchboard FW updates are shown in Figure 45.
As shown in Figure 42, the FW for the following chips on the Goldstone Node is protected by the CEC1736 EROT controller:
● LS10 NVLink4 Switch IC: 2pcs (LS1/G1_A: LS10-A and LS2/G1_B: LS10-B), each with its own EROT/CEC1736 chip.
✔ The FW SPI Flash for each LS10 is a W25Q32 (4MB/32Mbits) from Winbond
✔ No EROT-bypass path is provided for LS10’s FW access from LS10
✔ Both EROT/CEC1736 devices have FW_CONF[7:0] strapped as 0x20 (0b0010_0000): for GPU/NVSwitch
For each LS10 (AP), the connections are as follows:
● LS1’s FW SPI Flash U126 is connected to LS1-EROT/CEC1736 U97’s QSPI0_IO Interface
● LS2’s FW SPI Flash U115 is connected to LS2-EROT/CEC1736 U124’s QSPI0_IO interface
● LS1’s QSPI (ROM) interface is connected to LS1-EROT/CEC1736 U97’s QSPI0_IN interface
● LS2’s QSPI (ROM) interface is connected to LS2-EROT/CEC1736 U124’s QSPI0_IN interface
● LS1-EROT/CEC1736 U97’s QSPI1_IN and LS2-EROT/CEC1736 U124’s QSPI1_IN are connected to COMe U2
PCH GSPI0 for Out of Band (OOB) EROT communications and LS10 AP-FW and EROT FW (EC-FW) update.
● LS1-EROT and LS2-EROT’s I2C06 is used as EROT ATTEST I2C Interface and connected to COMe
I2C_B2B_SCL/SDA (COMe CPLD: I2C_CPLD_B2B_SCL/SDA). The I2C addresses used are:
✔ 0xA4 (8b) for Runtime access
✔ 0xA6 (8b) for Recovery access

Figure 42. Goldstone Node's EROT-Protected AP FWs


As shown in Figure 42 and Figure 43, the EROT-protected AP-FW update solution for the Goldstone LS10 NVSwitch devices is a little different from that of other boards with a BMC.
For Goldstone Switchboard, the LS10 FW update path is:
NVFLASH on COMe > PCH > EROT/CEC1736 > LS10 FW SPI Flash Device.

Figure 43. EROT-Protected AP-FW Update Flow with BMC
The EROT/CEC1736 has its own FW (EC-FW), stored in its internal built-in Flash as shown in Figure 44. The EC-FW is pre-programmed with a device programmer at Microchip or at the CM (FXSJ) before the EROT/CEC1736 is assembled onto the Goldstone Switchboard PCB, and the EROT’s JTAG interface is disabled using OTP.

Figure 44. EROT/CEC1736 & EC-FW Flash

Figure 45. Goldstone Switchboard FW Update Paths

Table 24. EROT Test Items

Req-#: GOLDSTONE-EROT000
Name: EROT Fatal ERROR Checking
Mandatory: Shall check both EROT/CEC1736 devices’ Fatal ERROR signals.
Expected Result: No fatal error.
Guidance: Goldstone Switchboard MAIN CPLD Register & U80
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: FA enhancement | Bug # & TS: FLA

Req-#: GOLDSTONE-EROT001
Name: EROT RESET Input & AP0 RESET Output Checking
Mandatory: Shall check both EROT/CEC1736 devices’ RESET and AP0 Reset signals.
Expected Result: When the EROT’s RESET input is asserted (Low), its AP0 RESET output should be Low to hold the LS10/AP in reset state; when the EROT’s RESET input is deasserted (High), its AP0 RESET output should also go High to release the LS10/AP from reset, after some delay (TBD).
Guidance: Goldstone Switchboard MAIN CPLD Register & U80
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: N/A | Bug # & TS: FLA

Req-#: GOLDSTONE-EROT002
Name: EROT Device EC-FW Boot Checking
Mandatory: Shall check that both EROT/CEC1736 devices have booted successfully.
Expected Result: Both EROT/CEC1736 devices boot their EC-FW successfully.
Guidance: Please refer to CEC1736 Datasheet and NV EROT Design Spec.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-EROT000 | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT003
Name: EROT Device & EC-FW Version Checking
Mandatory: Shall check that both EROT/CEC1736 devices have the correct device ID and EC-FW version.
Expected Result: Correct EROT/CEC1736 Device ID and EC-FW version, matching POR or BOM.
Guidance: Please refer to CEC1736 Datasheet and NV EROT Design Spec.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-EROT000 | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT004
Name: LS10 AP-FW Boot Checking
Mandatory: Shall check that both LS10 devices can boot their AP-FW from their EROT-protected SPI Flash devices successfully after the EROT’s RESET input is released.
Guidance: Please refer to CEC1736 Datasheet and NV EROT Design Spec
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-EROT000 | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT005
Name: LS10 AP-FW Version Checking
Mandatory: Shall check that both LS10 devices have the correct device ID and AP-FW version.
Expected Result: Correct LS10 Device ID and AP-FW version, matching POR or BOM.
Guidance: Please refer to LS10 Programming manual and.
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: Same as GOLDSTONE-EROT000 | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT006
Name: EROT Device EC-FW Update
Mandatory: Shall check that both EROT/CEC1736 devices’ EC-FW can be updated successfully and reliably.
Guidance: NV EROT Design guide & NVFLASH
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: N/A - One Diag does not support flashing | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT007
Name: LS10 Device AP-FW Update
Mandatory: Shall check that both LS10 devices’ AP-FW can be updated successfully and reliably.
Guidance: NV EROT Design guide & NVFLASH
Test Method: Auto | Priority: P0 EVT | Nvbug tracking: N/A - One Diag does not support flashing | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT008
Name: EROT Device ATTEST I2C Interface
Mandatory: Shall check that the COMe can communicate with both EROT/CEC1736 devices over the ATTEST I2C interface, for both the Runtime and Recovery I2C addresses.
Guidance: NV EROT Design guide & NVFLASH
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Jinshui … to clarify the requirement - If LS10 VBIOS is present, this is covered indirectly | Bug # & TS: PLA

Req-#: GOLDSTONE-EROT009
Name: LS10 Device AP-FW Write-Protection Checking
Mandatory: Shall check that both LS10 devices’ AP-FW can be write-protected successfully and reliably.
Guidance: NV EROT Design guide; the AP-FW Flash’s WP is from the EROT
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: Jinshui … to clarify the requirement - is it checking for WP enabled/disabled? If there is more to it, then this seems like a validation test. | Bug # & TS: FLA

Req-#: GOLDSTONE-EROT010
Name: LS10 Device AP-FW Kill Checking
Mandatory: Shall check that both LS10 devices’ AP-FW can be killed successfully and reliably.
Guidance: NV EROT Design guide; the KILL signal is from the EROT, and once KILL is active, the AP-FW Flash has no power.
Test Method: Auto | Priority: P1 EVT | Nvbug tracking: N/A - One Diag does not support flashing, but also this seems like a validation test. | Bug # & TS: FLA
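The expected RESET behavior in the EROT RESET Input & AP0 RESET Output item amounts to checking that the AP0 RESET output follows the RESET input in both states. The sketch below shows that logic only. `set_reset` and `read_ap0_reset` are hypothetical accessors (for example, backed by Main CPLD registers), and `settle` stands in for the delay that the specification still lists as TBD.

```python
def check_reset_follow(set_reset, read_ap0_reset, settle=lambda: None):
    """Check that the AP0 RESET output tracks the EROT RESET input.

    set_reset(level) drives the RESET input (0 = asserted/Low, 1 = High);
    read_ap0_reset() returns the sampled AP0 RESET output level.
    The RESET input is left deasserted (High) on return.
    """
    set_reset(0)
    settle()
    held_in_reset = read_ap0_reset() == 0    # LS10/AP must be held in reset

    set_reset(1)
    settle()
    released = read_ap0_reset() == 1         # LS10/AP must be released

    return held_in_reset and released
```

The same function would be run once per EROT/CEC1736 device, with accessors bound to that device's signals.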

8 Sensor List
Please refer to Table 9 for the I2C addresses of the sensors on the Goldstone Switchboard, including the PDB, and to Figure 24 for the I2C addresses of the sensors on the COMe module.
In addition to the standalone sensors, the SODIMM, M.2 SSD and OSFP modules all include temperature and/or voltage sensors accessible through their management interfaces (typically I2C).
During normal operation, the LS10 devices manage their power sensors and the OSFP modules’ sensors, but the COMe CPU can access them under the Main CPLD’s control.
Table 25. Goldstone Node Sensors

Device | LOC | I2C Add (8b)
Temperature sensor inside M.2 SSD (please see the M.2 SSD section) | GS Switchboard | 0x3A
U192: ADT75 Temp Sensor | | 0x94
U191: ADT75 Temp Sensor | | 0x92
U139: ADT75 Temp Sensor | | 0x90
U193: ADT75 Temp Sensor | | 0x96
U195: ADT75 Temp Sensor | | 0x9C
U189: ADT75 Temp Sensor | | 0x98
U196: ADT75 Temp Sensor | | 0x9E
U194: ADT75 Temp Sensor | | 0x9A
U95: MAX11603 Voltage sensor, default N/A | | 0xDA
PU4: LM5066 54V HSC | PDB: I2C_HSC_ | 0x22
PU7: U50SU4P180 DC-DC P12V for Switchboard | | 0x26
PU8: U50SU4P180 DC-DC P12V for Switchboard | | 0x2E
PU5: QS54SH12060 DC-DC 12V for Fans | | 0x36
PU10: MP8880 LDO P12V_STBY | | 0x2C
PU11: MP8880 LDO P12V_STBY | | 0x24
U29: TMP75 Temperature sensor | PDB & Fan Tray via PDB: I2C_TEMP_* | 0x9A
U8: TMP75 Temperature sensor | | 0x9C
U5_35: MP2975 for LS1 VDD | GS Switchboard | 0xC4
U4_35: MP2975 for LS1 DVDD & HVDD | | 0xCA
U1_33: MP2975 for OSFP Ports 1-16 P3.3V via MP86975 | | 0x54
J1_01: OSFP Port 1 NVLink transceiver’s Temperature & Voltage sensors | OSFP Module | 0xA0
J1_02: OSFP Port 2 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_03: OSFP Port 3 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_04: OSFP Port 4 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_05: OSFP Port 5 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_06: OSFP Port 6 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_07: OSFP Port 7 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_08: OSFP Port 8 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_09: OSFP Port 9 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_10: OSFP Port 10 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_11: OSFP Port 11 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_12: OSFP Port 12 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_13: OSFP Port 13 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_14: OSFP Port 14 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_15: OSFP Port 15 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
J1_16: OSFP Port 16 NVLink transceiver’s Temperature & Voltage sensors | | 0xA0
U5_36: MP2975 for LS2 VDD | GS Switchboard | 0xC4
U4_36: MP2975 for LS2 DVDD & HVDD | | 0xCA
U40: MAX11603 COMe Voltage sensor | COMe | 0xDA
U43: MP2975 IMVP8 | | 0xD6
J1: SODIMM Temperature Sensor | | 0x38
J2: SODIMM Temperature Sensor | | 0x34
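
Table 25 gives addresses in 8-bit form, while many host-side tools (for example Linux i2c-tools) expect the 7-bit form. The sketch below shows that conversion and also decodes a raw ADT75/TMP75 temperature register word. The function names are illustrative and are not part of any Goldstone diagnostic. It assumes the table's values are write addresses (R/W bit cleared) and that the temperature register holds a left-justified 12-bit two's-complement value at 0.0625 °C per LSB, as described in the ADT75 and TMP75 datasheets.

```python
def to_7bit(addr_8bit: int) -> int:
    """Convert an 8-bit (write) I2C address, as listed in Table 25, to 7-bit form."""
    if not 0 <= addr_8bit <= 0xFF:
        raise ValueError(f"not an 8-bit address: {addr_8bit:#x}")
    if addr_8bit & 1:
        raise ValueError(f"{addr_8bit:#x} has the R/W bit set; expected a write address")
    return addr_8bit >> 1


def decode_temp_12bit(raw: int) -> float:
    """Decode a 16-bit ADT75/TMP75 temperature register word (MSB first) to deg C."""
    value = (raw & 0xFFFF) >> 4   # keep the 12 left-justified data bits
    if value & 0x800:             # sign bit of the 12-bit value
        value -= 0x1000           # two's-complement sign extension
    return value * 0.0625


# A few rows copied from Table 25 (reference designator -> 8-bit address).
SENSORS_8BIT = {"U192": 0x94, "U191": 0x92, "U139": 0x90, "PU4": 0x22}

if __name__ == "__main__":
    for ref, addr in SENSORS_8BIT.items():
        print(f"{ref}: 8-bit {addr:#04x} -> 7-bit {to_7bit(addr):#04x}")
    print(decode_temp_12bit(0x1900))  # 25.0
```

Rejecting odd addresses is a deliberate choice: it catches a read address or a 7-bit value pasted in by mistake before it reaches the bus.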

9 Goldstone Node I2C Trees
Please refer to Figure 24 for the I2C tree on the COMe module and to Table 9 for the I2C tree on the Switchboard (including the PDB).

10 JTAG & Boundary Scan Test

To be added in V1.1

11 Equipment List

Item-# | Description | Vendor & P/N
EQP-001 | OSFP Loopback Dongle for NVLink Loopback | Nvidia P/N = MOP4OPT-NFLMD3U2; Amphenol P/N = NLMACE-0001, ~$XX
EQP-002 | M.2 Clone Machine to clone OS & Manufacturing Test SW, 2mins/4GB, cloning to | eSystor.com, P/N = SYSNVME-M2205, ~$5400
EQP-003 | 24-Port RS-232 Terminal Server, 1RU, C13 Power Cord | Perle.com, IOLAN STS24, P/N = 04030464, ~$3000
EQP-004 | 1-Slot Goldstone Node Test Fixture (Tester) from ZT | ZT
EQP-005 | Golden parts for EQP-004 (ME Tray, PDB, Cable set, etc.) | ZT
EQP-006 | 40-OU OCP Rack with 20x IT Rail Kits | Delta
EQP-007 | 12x C19 + >=12 C13 PDU for OCP Rack |
EQP-008 | Type A USB 2.0 Flash Drive, 1GB |
EQP-009 | CAT5 RJ45 1GE Ethernet Cable |
EQP-010 | CAT5 RJ45 RS232 Cable for Goldstone and IOLAN STS24 Pinout |

Figure 46. OSFP Loopback Dongle (2000-2250 mating cycles)

Figure 47. EZDUPE M.2 NVMe SSD Duplicator (DM-HE0-8V07NTP)

Figure 48. Perle 24-Port RS-232 Terminal Server

Figure 49. M.2 NVMe/SATA Duplicator (Produplicator.com)

12 References
● SMBPBI High-Speed Proxy Interface - Compute Resman Software - Confluence (nvidia.com): out-of-band GPU management from the BMC via SMBus (I2C bus).
● OSFP (Octal Small Form-factor Pluggable): a pluggable form factor with 8 lanes and a 60-pin connector (vs. 76 pins for QSFP-DD)
● QSFP112 vs. QSFP-DD800:
  ○ QSFP112: 8 differential pairs (4x TX + 4x RX) on a 38-pin connector, http://qsfp112.com/
  ○ QSFP-DD800: 16 differential pairs (8x TX + 8x RX) on a 76-pin connector, http://www.qsfp-dd.com/. Goldstone Switchboard uses QSFP-DD800 connectors for NVLink4 links.

Figure 50. QSFP-112 Pinout

Figure 51. QSFP-DD800 vs. QSFP112 Pinout

Figure 52. QSFP-DD/QSFP-DD800 Connector

● OSFP for Goldstone NVLink4 Port:


The OSFP connector used for Goldstone NVLink4 ports is TE 2344064-4, a 60-pin OSFP connector with 16 differential pairs (32 pins), 8 power and sideband signals, and 20 ground pins.

Figure 53. OSFP Connector & Pinout in Goldstone


Table 26. QSFP Comparison

Form-Factor | QSFP+ | QSFP28 | QSFP56 | QSFP112 | QSFP-DD | QSFP-DD800
Backward Compatible | N/A | QSFP+ | QSFP+/28 | QSFP+/28/56 | QSFP+/28/56/112 | QSFP+/28/56/112
Differential data rate | 10G | 10-25G | 10-56G | 10-112G | 10-112G | 10-112G
Modulation Type | NRZ | NRZ | PAM4 | PAM4 | NRZ & PAM4 | NRZ & PAM4
Lanes / Differential Pairs | 4/8 | 4/8 | 4/8 | 4/8 | 8/16 | 8/16
MSA Connector Pins | 38 | 38 | 38 | 38 | 76 | 76
Applications | 40G | 100G | 200G | 400G | 800G | 800G
Max Speed | 4x10G | 4x25G | 4x53.125G | 4x106.25G | 8x106.25G | 8x106.25G
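
The Max Speed row can be cross-checked against the Applications row by multiplying the lane count by the per-lane rate. The following is a minimal sketch using only the values from Table 26; the dictionary and function names are illustrative. In every column the gross rate is at or slightly above the nominal application rate. The gap is presumably encoding and FEC overhead, which the table does not break out.

```python
# (lanes, per-lane rate in Gb/s, nominal application rate in Gb/s), from Table 26.
QSFP_TABLE = {
    "QSFP+":      (4, 10.0,   40),
    "QSFP28":     (4, 25.0,   100),
    "QSFP56":     (4, 53.125, 200),
    "QSFP112":    (4, 106.25, 400),
    "QSFP-DD":    (8, 106.25, 800),
    "QSFP-DD800": (8, 106.25, 800),
}


def aggregate_gbps(form_factor: str) -> float:
    """Gross aggregate rate: number of lanes times the per-lane rate."""
    lanes, lane_rate, _nominal = QSFP_TABLE[form_factor]
    return lanes * lane_rate


if __name__ == "__main__":
    for name, (_lanes, _rate, nominal) in QSFP_TABLE.items():
        total = aggregate_gbps(name)
        print(f"{name:<11} {total:7.1f} Gb/s gross, {nominal}G nominal")
```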

Note:

Figure 54. QSFP from QSFP+ to QSFP-DD800

Figure 55. QSFP-DD MSA Connector PCB Layout

Figure 56. QSFP-DD vs. OSFP

Figure 57. QSFP-DD vs. OSFP in Sizes

Figure 58. CFP vs. QSFP vs. OSFP vs. QSFP-DD

Figure 59. QSFP-DD vs. OSFP for 400G

● PCI Standard Capability IDs:

● PCIe Extended Capability IDs:

● USB Connector Pinout

Figure 60. USB Connector Pinout

● SATA Connector Pinout

Figure 61. SATA Pinout
