Structural Defects on High-Speed Serial I/O

Shorts and open circuits on high-speed SerDes buses, such as PCI Express, can have subtle, difficult-to-diagnose effects on system performance. In other words, you might not know about them until customers start complaining and warranty returns arrive. What are these effects, and how can they be prevented?

Let’s look at a typical AC-coupled differential bus, such as PCI Express. A differential pair, running from the transmitter on one chip to the receiver on another, can be illustrated as follows:

  [Figure: high-speed I/O defects, base picture]

Now let’s review a couple of failure scenarios, and see what happens.

AN OPEN-CIRCUIT: MISSING CAPACITOR

In this example, a problem during system assembly caused a capacitor to be left off, or the capacitor became detached or disabled at some point during its life in the field. This open circuit on one net will not, however, necessarily prevent all signal from reaching the Rx1- net at the receiver, as can be seen below:

  [Figure: high-speed I/O defects, missing capacitor]

Receivers work by reconstructing the data from the difference between the + and – legs of a pair. In this case there may be sufficient coupling present for the lane to train and operate, albeit at a reduced level of performance. The lane will be more susceptible to crosstalk, power distribution network (PDN) noise, jitter, and inter-symbol interference (ISI), so it will operate with a higher bit error rate (BER). If the BER crosses the relevant thresholds, the results will be PHY layer re-initializations, data link layer retransmissions, and ultimately lane drop-outs, either soft (intermittent) or hard.
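To get a feel for why a "slightly worse" BER matters, here is a minimal arithmetic sketch. It assumes a lane running at 8 GT/s (the PCI Express Gen3 rate) and compares the PCI Express target BER of 1e-12 with two degraded values that are purely illustrative, not measurements from any real defect. The function names are my own.

```python
def errors_per_second(ber: float, line_rate_bps: float) -> float:
    """Expected raw bit errors per second on a single lane."""
    return ber * line_rate_bps


def mean_seconds_between_errors(ber: float, line_rate_bps: float) -> float:
    """Average time between bit errors on a single lane."""
    return 1.0 / errors_per_second(ber, line_rate_bps)


if __name__ == "__main__":
    LINE_RATE_BPS = 8e9  # assumed: one lane at 8 GT/s

    # 1e-12 is the spec target; 1e-9 and 1e-6 are illustrative degraded values.
    for ber in (1e-12, 1e-9, 1e-6):
        eps = errors_per_second(ber, LINE_RATE_BPS)
        gap = mean_seconds_between_errors(ber, LINE_RATE_BPS)
        print(f"BER {ber:.0e}: {eps:g} errors/s, one error every {gap:g} s")
```

At the target BER this lane sees roughly one bit error every two minutes. At 1e-9 it sees about eight per second, each of which has to be caught and retransmitted, and that is where the performance loss comes from.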

A SHORT-CIRCUIT: Tx1- TO GND

In this example, a short exists between one of the transmit nets and ground. As in the example above, this will impair the signal arriving at the corresponding receiver.

  [Figure: high-speed I/O defects, Tx1- short to GND]

But again, the receiver operates on the difference between the two signals it receives, and it may still be able to reconstruct the data stream. Whether the link drops out or continues to operate depends on a large number of factors, and the bit error rate ultimately determines the outcome.
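The idea that a receiver can still recover data with one leg compromised can be sketched with a toy model. This is a deliberately idealized assumption, not a circuit simulation: the faulty leg is treated as simply contributing no signal (0 V), the receiver is a plain sign comparator, and the noise is Gaussian. Real shorts also disturb driver loading, termination, and common-mode behavior, none of which is modeled here. All names and values are hypothetical.

```python
import random


def transmit(bits, swing=1.0):
    """Return (plus, minus) leg voltages per bit, as complementary signals."""
    plus = [swing / 2 if b else -swing / 2 for b in bits]
    minus = [-p for p in plus]
    return plus, minus


def receive(plus, minus, noise_rms, rng):
    """Decide each bit from the sign of (plus - minus) plus Gaussian noise."""
    return [1 if (p - m) + rng.gauss(0.0, noise_rms) > 0 else 0
            for p, m in zip(plus, minus)]


def count_errors(n_bits, minus_leg_dead, noise_rms, seed=0):
    """Count bit errors for a healthy pair or one with a dead minus leg."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(n_bits)]
    plus, minus = transmit(bits)
    if minus_leg_dead:
        # Assumed fault model: the shorted or open leg carries no signal.
        minus = [0.0] * n_bits
    received = receive(plus, minus, noise_rms, rng)
    return sum(1 for sent, got in zip(bits, received) if sent != got)


if __name__ == "__main__":
    for noise in (0.0, 0.25):
        healthy = count_errors(20000, False, noise)
        faulty = count_errors(20000, True, noise)
        print(f"noise {noise}: healthy errors={healthy}, faulty errors={faulty}")
```

With no noise, both cases decode every bit correctly, because the faulty pair still presents half of the normal differential swing. Once noise is added, the faulty pair has far less margin and its error count climbs well above the healthy pair's, which matches the "works, but with a higher BER" behavior described above.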

There are many other kinds of failure scenarios, such as shorts between Tx1+ and Tx1-, Tx1+ and Tx2+, or Tx1+ and Rx1+, two missing capacitors, and so on. I’ll describe this defect universe and its effects on system performance and reliability in a future blog. I’ll also describe the technologies needed to detect these defects.

These, of course, are just “hard faults”, resulting for example from missing components, excess solder, and other common assembly defects. High-speed serial I/O is also sensitive to the “quality” of the interconnects, which can be degraded by defects such as incompletely plated vias, high trace surface roughness, or head-in-pillow. A good reference on manufacturing and assembly variances and their effects on high-speed serial I/O is How to Avoid Poor Serdes Performance Caused by Circuit Board Manufacturing Variances.