In the first two parts of this multi-part blog, we reviewed different kinds of short-circuit, open-circuit, and stuck-at faults and how they might affect link performance. Let’s recap and rank these defects and see what we can do about them.
In Structural Defects on High-Speed I/O and Structural Defects on High-Speed I/O – Part 2, we covered a list of defects and explored how high-speed differential SerDes channels respond to each of them. The following table summarizes a handful of these defects and their effects:
Now let’s consider three technologies that can detect some or all of these defects:
Boundary-Scan Test (BST), a combination of IEEE 1149.1 and 1149.6 also known as JTAG, will detect every one of the listed defects, provided the associated devices comply with these IEEE specifications. Comprehensive DC and AC boundary-scan coverage is needed to detect the full range of short-circuit and open-circuit defects on both sides of the capacitors. Because a comprehensive 1149.6 implementation is complex, many vendors’ solutions fall short of 100% shorts and opens coverage.
Processor-Controlled Test (PCT) uses a processor’s debug port and run-control to examine the serial I/O status registers of associated devices and detect CRC errors as well as changes in link width and speed. This technology will detect all defects labeled “High” and “Medium” under the Effect column. It is important to use a PCT tool with a comprehensive library of device support, because devices from PLX, Broadcom, Mellanox, IDT, and others all have different status register definitions, and researching these can take person-months of effort. PCT also runs below the operating system and BIOS/boot loader, making it extremely effective for detecting defects which prevent a board from booting.
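To make the status-register idea concrete, here is a minimal Python sketch of the kind of check a PCT test performs once a register value has been read back. It assumes a PCI Express device whose Link Status register follows the standard layout (Current Link Speed in bits 3:0, Negotiated Link Width in bits 9:4); as noted above, vendor-specific registers differ. The names LinkState, decode_link_status, and check_link are hypothetical, and the raw value and CRC count are assumed to have been fetched already by whatever tool drives the debug port.

```python
from dataclasses import dataclass
from typing import List

# Field layout of the standard PCI Express Link Status register (16 bits).
# Vendor-specific status registers will use different layouts.
LINK_SPEED_MASK = 0x000F   # bits 3:0, Current Link Speed
LINK_WIDTH_MASK = 0x03F0   # bits 9:4, Negotiated Link Width
LINK_WIDTH_SHIFT = 4


@dataclass(frozen=True)
class LinkState:
    """Hypothetical container for a decoded link speed and width."""
    speed_gen: int  # 1 = 2.5 GT/s, 2 = 5 GT/s, 3 = 8 GT/s, ...
    width: int      # number of lanes


def decode_link_status(raw: int) -> LinkState:
    """Split a raw Link Status value into speed and width fields."""
    speed = raw & LINK_SPEED_MASK
    width = (raw & LINK_WIDTH_MASK) >> LINK_WIDTH_SHIFT
    return LinkState(speed_gen=speed, width=width)


def check_link(raw: int, expected: LinkState, crc_errors: int = 0) -> List[str]:
    """Return a list of findings; an empty list means the link looks healthy."""
    actual = decode_link_status(raw)
    findings = []
    if actual.width < expected.width:
        findings.append(f"link width degraded: x{expected.width} -> x{actual.width}")
    if actual.speed_gen < expected.speed_gen:
        findings.append(f"link speed degraded: Gen{expected.speed_gen} -> Gen{actual.speed_gen}")
    if crc_errors > 0:
        findings.append(f"{crc_errors} CRC error(s) logged")
    return findings


if __name__ == "__main__":
    # Illustrative values only: a link expected to train at x8 Gen3
    # that reports x4 Gen2 with some CRC errors.
    for finding in check_link(0x0042, LinkState(speed_gen=3, width=8), crc_errors=5):
        print(finding)
```

A link that trained at reduced width or speed still passes traffic, which is why a functional test can miss it; comparing the decoded fields against the design intent is what exposes the defect.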
Intel IBIST is a bit error rate and margining tool which uses embedded instrumentation within Intel’s silicon to detect all defects labeled “Low” under the Effect column. Different defect types push the I/O outside its allowable range for voltage, timing, or both, so it is important to margin both and compare the results against a baseline to look for violations and skew.
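The baseline comparison can be sketched in a few lines of Python. This is not the Intel IBIST interface; it only illustrates the comparison step, assuming the per-lane voltage and timing margins have already been collected by the margining tool. The function name find_margin_violations, the dictionary layout, the units, and the 10% tolerance are all hypothetical choices for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: lane number -> (voltage margin, timing margin).
# Units are whatever the margining tool reports; only ratios matter here.
Margins = Dict[int, Tuple[float, float]]


def find_margin_violations(
    baseline: Margins,
    measured: Margins,
    tolerance: float = 0.10,
) -> List[Tuple[int, str, float, float]]:
    """Flag lanes whose margin fell more than `tolerance` below baseline.

    Returns (lane, kind, baseline_value, measured_value) tuples, where
    kind is "voltage" or "timing".
    """
    violations = []
    for lane, (base_v, base_t) in sorted(baseline.items()):
        meas_v, meas_t = measured[lane]
        if meas_v < base_v * (1.0 - tolerance):
            violations.append((lane, "voltage", base_v, meas_v))
        if meas_t < base_t * (1.0 - tolerance):
            violations.append((lane, "timing", base_t, meas_t))
    return violations


if __name__ == "__main__":
    # Illustrative numbers only: lane 1 has lost 30% of its voltage margin.
    golden = {0: (100.0, 20.0), 1: (100.0, 20.0)}
    board = {0: (98.0, 19.5), 1: (70.0, 20.0)}
    for lane, kind, base, meas in find_margin_violations(golden, board):
        print(f"lane {lane}: {kind} margin {meas} vs baseline {base}")
```

Checking voltage and timing separately matters because a given defect may collapse only one of the two margins while the link continues to run without logged errors.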
All of the above technologies have trade-offs in terms of ease of test implementation, test time, and test coverage. Moreover, each technology will detect a class of defects which the others may miss. All three technologies are therefore needed to detect defects in high-speed serial I/O.