PCT Detects Link Training Failures

One of our customers was experiencing field returns when 10 Gigabit Ethernet ports started failing to pass traffic at full line rate. How could they test for these failing boards in manufacturing and prevent them from getting out to customers?

First, some technical background: on circuit board designs supporting 10GbE, a PHY device is usually connected to a MAC/NIC chip. The PHY acts as a transceiver for copper, coax, or fiber media access to the outside world. The NIC acts as a switch and provides useful capabilities like TCP/IP offload and support for upper-layer protocols. The two devices are connected via the 10 Gigabit Attachment Unit Interface, XAUI (pronounced "Zowie"), a 4-lane × 3.125 Gbps bus.
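As a quick sanity check on those numbers: four lanes at 3.125 Gbps add up to 12.5 Gbps of raw signaling, not 10. The difference comes from the 8b/10b line coding that XAUI uses, in which every 8 data bits travel as 10 bits on the wire. The short sketch below works through the arithmetic; the function and constant names are made up for this illustration.

```python
# Illustrative arithmetic only; names are hypothetical, not from any tool.
LANES = 4                # XAUI lanes per direction
LANE_RATE_GBPS = 3.125   # raw signaling rate per lane
DATA_BITS = 8            # 8b/10b coding: 8 data bits ...
CODED_BITS = 10          # ... are sent as 10 bits on the wire


def xaui_raw_gbps(lanes=LANES, lane_rate=LANE_RATE_GBPS):
    """Total raw signaling rate across all lanes."""
    return lanes * lane_rate


def xaui_payload_gbps(lanes=LANES, lane_rate=LANE_RATE_GBPS):
    """Usable data rate once the 8b/10b coding overhead is removed."""
    return xaui_raw_gbps(lanes, lane_rate) * DATA_BITS / CODED_BITS


if __name__ == "__main__":
    print("raw:", xaui_raw_gbps(), "Gbps")
    print("payload:", xaui_payload_gbps(), "Gbps")
```

So a single lane that fails to train does not merely slow the link down a little; the bus depends on all four lanes to reach the full 10 Gbps payload rate.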

For the system to be able to pass 10 gigabit Ethernet traffic at full speed, both PHY and NIC must be fully operational and the XAUI bus between these two devices must be "trained". More specifically, there must be no structural defects or marginalities on any of the XAUI nets that might impact the traffic-carrying capacity of the bus.

Testing XAUI on a board using legacy technologies can be tough. It's a high-speed bus and very sensitive to resonant stubs, so placing In-Circuit Test (ICT) test points on these nets is not a good idea. And even if it were possible, ICT won't detect such marginalities as bad clocks, missing or wrong terminations, or flaky power rails.

Checking for full line-rate performance using external equipment on the manufacturing floor is also problematic. 10GbE load generators/analyzers are expensive and take too much test time for high-volume manufacturing applications.

So how did our customer solve their problem? By using ScanWorks, of course (as you may have expected, since this is an ASSET blog). In particular, processor-controlled test (PCT) was used to read configuration and status registers within both the NIC and PHY to verify link training status. PCT can access any device register that is reachable via PCI/PCIe configuration or extended configuration space. A simplified example of how these registers can be used to verify link training status is shown in the following table extract from a PHY data sheet:

PHY Status Register:

Field Bits   Field Length   Field Name       Description
0:6          7              RESERVED
7            1              PWR_DN           Power Down capability (1 = ability to power down)
8:14         7              RESERVED
15           1              XGXS_TX_STATUS   PHY XS link status (1 = link error)

So, it was easy enough to add in a check for the link status bit in their existing PCT test profile. If the link had a fault, the bit would say so, and the board would not ship to a customer.
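To make the check concrete, here is a minimal sketch of the pass/fail logic applied to a 16-bit status value once it has been read back. It is not ScanWorks code and uses no ScanWorks API. The function names are hypothetical, and it assumes that bit 0 in the data sheet extract is the least significant bit; if the data sheet numbers bits from the other end, the positions would need to be mirrored.

```python
# Hypothetical sketch of the pass/fail decision, not actual PCT syntax.
# Assumption: bit 0 is the least significant bit of a 16-bit register.
PWR_DN_BIT = 7            # 1 = device is able to power down
XGXS_TX_STATUS_BIT = 15   # 1 = link error


def decode_phy_status(value):
    """Split a raw 16-bit status value into its two named fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("status register value must fit in 16 bits")
    return {
        "PWR_DN": (value >> PWR_DN_BIT) & 1,
        "XGXS_TX_STATUS": (value >> XGXS_TX_STATUS_BIT) & 1,
    }


def link_ok(value):
    """Return True when the link status bit reports no error."""
    return decode_phy_status(value)["XGXS_TX_STATUS"] == 0


if __name__ == "__main__":
    # Made-up sample readings for illustration, not real board data.
    for reading in (0x0080, 0x8080):
        verdict = "PASS" if link_ok(reading) else "FAIL"
        print(f"0x{reading:04X}: {verdict}")
```

Masking out only the named bits keeps the check independent of whatever the reserved fields happen to contain.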

PCT's ability to read any device's configuration and status registers (and there are typically hundreds per device, providing all sorts of useful information) non-intrusively via the JTAG port puts great power into the hands of test engineers.