Larry Traylor, an industry luminary in the world of x86 architecture debug and trace technologies, has published a new eBook on this fascinating topic. Here’s my review.
Larry Traylor is well known to most people who work in the esoteric world of low-level debug of Intel-based systems. As CEO of American Arium, purveyor of the SourcePoint debugger, he has been in this business for the better part of 30 years. When Larry’s company merged with ASSET InterTech in 2013, he joined a larger team to support innovative new trace technologies in next-generation Intel silicon platforms. As he says in the preface of the book, “Programmers debugging BIOS (now UEFI) have been without trace for the last 20 years,” and these innovations have helped change the landscape of what it takes to get an Intel design to market.
What does Larry mean by his “20 year” comment? Well, for the longest time, one of the only ways to debug the early boot stages of an Intel design was via “run-control”, the essence of which is to single-step through code and look at register and memory contents. Although very powerful by itself, single-stepping is by its nature forward-looking: you stop the code at a breakpoint, and then go forward in time from there. To look backwards in time and identify what might have been the root cause of a bug, you need trace. And for a very long time, Intel silicon supported only rudimentary trace functions. An example of this is Last Branch Record (LBR). LBR uses a small number (typically 8-16) of model-specific registers (MSRs) to record the “from address” and “to address” pairs of program execution branches. If you’re really interested in how this works, I’d recommend chapters 16-18 of the MinnowBoard Chronicles eBook.
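To make the LBR idea concrete, here is a minimal sketch in Python of a fixed-depth record of “from address” and “to address” pairs. It is a toy software model, not how the hardware or any debugger implements it: the class name `LastBranchRecord`, the method names, and the addresses are all hypothetical, and real LBR state lives in MSRs rather than in a Python object.

```python
from collections import deque


class LastBranchRecord:
    """Toy model of an LBR stack: a fixed number of (from, to) address pairs.

    Hypothetical illustration only; real LBR entries are held in MSRs.
    """

    def __init__(self, depth=16):
        # A deque with maxlen silently drops the oldest entry once full,
        # which mimics a small, fixed-size branch history.
        self._stack = deque(maxlen=depth)

    def record_branch(self, from_addr, to_addr):
        """Store one taken branch as a (from address, to address) pair."""
        self._stack.append((from_addr, to_addr))

    def history(self):
        """Return the recorded branches, newest first."""
        return list(reversed(self._stack))


def demo():
    """Record six branches into a four-entry stack and return what survives."""
    lbr = LastBranchRecord(depth=4)
    for i in range(6):
        lbr.record_branch(0x1000 + i * 0x10, 0x2000 + i * 0x10)
    return lbr.history()


if __name__ == "__main__":
    for from_addr, to_addr in demo():
        print(f"{from_addr:#x} -> {to_addr:#x}")
```

In this sketch, six branches are recorded but only the four most recent can be read back. The two oldest are gone, so you can only look as far back in time as the depth of the record allows.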
LBR and some of its cousins don’t offer enough trace depth, or have other limitations that make them less than ideal. And most engineers know from experience with agent-based debuggers that trace turns a debug “point solution” into a tool that is integral to any benchtop integrated development environment (IDE). So, Intel embarked on a journey to enhance the trace capabilities within its silicon. New features such as Intel Processor Trace and Intel Trace Hub brought the silicon into the 21st century for engineers who craved better debugging tools.
Larry’s book covers some of the history of this, and describes the new capabilities in detail. Since engineers working with Xeon, Core and Atom platforms have relied on basic run-control alone for the better part of 20 years, and it takes time to ramp up on new technologies, I think the book will be a welcome addition to debug engineers’ libraries. It describes the differences between LBR, BTS, AET, and other forms of trace. And it demonstrates how trace data, when presented within a meaningful code context, makes for a powerful debugging tool to identify root cause. Examples are given throughout, with ASSET’s SourcePoint product demonstrating how the trace logic within Intel silicon can make debug sessions more effective.
What did I not like about the book? Although it was very technically detailed, I thought that it left the reader thirsting for more. A little more technical depth on some of the subjects would have been welcome. I do realize that some of the material cannot be revealed except under non-disclosure agreement, but some more application examples, specifically regarding UEFI and kernel debug, could probably have been provided without breaking any confidentiality rules. Larry does have a sequel planned, and hopefully some more depth will be in there. Stay tuned.