Intel Processor Trace for UEFI Debug

Intel Processor Trace (Intel PT) is a capability of recent Intel processors that captures information about software execution using dedicated hardware facilities inside the chip. How is it used to debug UEFI?


Intel PT is well suited to UEFI debugging because, unlike agent-based debuggers, its logic resides inside the silicon, is independent of the OS, and causes only minimal performance perturbation to the software being traced. Intel PT works by collecting trace information in data packets. The packets include timing information, program flow information (e.g., branch targets and branch taken/not-taken indications), and program-induced mode information (e.g., Intel TSX state transitions and CR3 changes). The processor sends these packets to the memory subsystem or to another output mechanism available on the platform. A software decoder, such as the one in SourcePoint, then processes them.
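To make the "taken/not-taken" packets concrete, here is a minimal sketch of how a decoder might unpack one short TNT packet. It assumes the one-byte layout described in the Software Developer's Manual (bit 0 clear, the highest set bit acting as a stop marker, and the bits below it holding the branch outcomes, oldest first). The function name decode_short_tnt is made up for illustration and is not part of SourcePoint or any Intel library.

```python
def decode_short_tnt(byte: int) -> list[bool]:
    """Return the taken (True) / not-taken (False) bits, oldest branch first.

    Assumed layout: bit 0 is 0, the highest set bit is a stop marker,
    and the bits between the two carry one outcome per conditional branch.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("a short TNT packet is a single byte")
    # 0x00 is a PAD packet and 0x02 introduces extended opcodes, so neither
    # is a short TNT packet; an odd byte belongs to another packet type.
    if byte & 1 or byte in (0x00, 0x02):
        raise ValueError("not a short TNT packet")
    payload = byte >> 1
    stop = payload.bit_length() - 1  # position of the stop marker
    return [bool((payload >> i) & 1) for i in range(stop - 1, -1, -1)]


if __name__ == "__main__":
    outcomes = decode_short_tnt(0b11010110)
    print("".join("T" if taken else "N" for taken in outcomes))
```

A real decoder also needs the program image to know which branch each bit belongs to, which is why the packets alone are compact but not self-describing.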

Much of the detail on Intel PT is in Chapter 36 of the Intel 64 and IA-32 Architectures Software Developer's Manual. Hardware tracing of this sort generates huge amounts of data, in the megabytes-per-second range. It is often helpful to view this voluminous trace data in a meaningful code context: going back in time from the current instruction pointer, with source code display, and capturing and filtering based on the target processor(s), instruction pointer (IP) values or ranges, and other criteria. The SourcePoint Trace Configuration window, shown below, makes setting up Intel PT very easy:

[Figure: SourcePoint Trace Configuration window]

As the window shows, Intel PT can be set up to capture trace data from all processors or from a defined list of processors.

Intel PT can optionally be configured to trace instructions only when the processor(s) are executing code within certain IP ranges. If the IP is outside of these ranges, tracing is disabled.

There are two IP ranges: Range 1 and Range 2. The checkboxes are used to enable these ranges. If neither range is enabled, all instructions are traced. The edit control to the right is used to specify the IP range. It can contain a symbolic range (e.g., a function or module name), or two addresses separated by a hyphen (e.g., 1000-2000). The Symbol finder button to the right can be used to look up symbol names. Whether IP ranges are supported depends on the processor.
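The range behavior described above can be modeled in a few lines. This is only an illustrative sketch, not SourcePoint's implementation: it assumes the two addresses are hexadecimal, treats the end address as inclusive, and does not handle symbolic ranges. The names parse_ip_range and ip_is_traced are hypothetical.

```python
from typing import Optional, Sequence, Tuple

IpRange = Tuple[int, int]


def parse_ip_range(text: str) -> IpRange:
    """Parse 'start-end' (assumed hexadecimal) into an integer pair."""
    start_text, end_text = text.split("-")
    start, end = int(start_text, 16), int(end_text, 16)
    if start > end:
        raise ValueError("range start must not exceed range end")
    return start, end


def ip_is_traced(ip: int, ranges: Sequence[Optional[IpRange]]) -> bool:
    """Model the filter: None stands for a range whose checkbox is off."""
    enabled = [r for r in ranges if r is not None]
    if not enabled:
        return True  # neither range enabled: all instructions are traced
    return any(low <= ip <= high for low, high in enabled)


if __name__ == "__main__":
    range1 = parse_ip_range("1000-2000")
    print(ip_is_traced(0x1800, [range1, None]))
    print(ip_is_traced(0x3000, [range1, None]))
```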

Intel PT provides the ability to specify whether tracing occurs in OS (CPL=0) or User (CPL>0) mode. The CPL checkbox enables this capability. The dropdown list to the right selects which mode to trace. If the checkbox is not enabled, all privilege levels are traced.

Intel PT can also enable or disable trace generation depending on the current CR3 value. The CR3 checkbox enables this capability. The edit control to the right specifies the CR3 value to trace.
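Taken together, the CPL and CR3 settings act as a predicate on the processor's current state. The sketch below models that predicate under the rules stated above. The TraceFilter class and its field names are invented for illustration, and it assumes that a disabled CR3 checkbox means no CR3 filtering at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceFilter:
    # None models an unchecked checkbox (no filtering on that criterion).
    cpl_mode: Optional[str] = None   # "os" (CPL=0) or "user" (CPL>0)
    cr3_match: Optional[int] = None  # CR3 value to trace

    def allows(self, cpl: int, cr3: int) -> bool:
        """Return True if trace generation is enabled for this state."""
        if self.cpl_mode == "os" and cpl != 0:
            return False
        if self.cpl_mode == "user" and cpl == 0:
            return False
        if self.cr3_match is not None and cr3 != self.cr3_match:
            return False
        return True


if __name__ == "__main__":
    os_only = TraceFilter(cpl_mode="os", cr3_match=0x1AB000)
    print(os_only.allows(cpl=0, cr3=0x1AB000))  # matching state
    print(os_only.allows(cpl=3, cr3=0x1AB000))  # user mode is filtered out
```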

In the Timestamp section, Intel PT can be configured to generate timestamp information in the trace stream. Timestamps can be used to measure time within the trace data, or to time align with other Trace views. There are three types of timestamp packets that can be enabled.

TSC. TSC (Timestamp Counter) packets contain TSC values. These are the same values read from the TSC MSR or by using the RDTSC instruction. These packets are required in order to time align with other Trace views, such as Intel PT from other processors or SW/FW trace from the Trace Hub.
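Time alignment amounts to merging independently captured streams on their shared TSC values. The sketch below shows the idea, assuming each stream is already sorted by TSC and each record is a (tsc, source, text) tuple. The record layout, the sample data, and the align_by_tsc name are all invented for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

Record = Tuple[int, str, str]  # (tsc, source, text), a hypothetical layout


def align_by_tsc(*streams: Iterable[Record]) -> List[Record]:
    """Merge streams that are each sorted by TSC into one timeline."""
    return list(heapq.merge(*streams, key=lambda record: record[0]))


if __name__ == "__main__":
    cpu0 = [(100, "cpu0", "call FuncA"), (250, "cpu0", "ret")]
    trace_hub = [(180, "hub", "firmware message")]
    for tsc, source, text in align_by_tsc(cpu0, trace_hub):
        print(f"{tsc:>6} {source:<5} {text}")
```

Without TSC packets in the Intel PT stream there is no common key to merge on, which is why they are required for alignment.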

MTC. MTC (Mini Time Counter) packets contain incremental updates to the CTC (a component of TSC). These values provide slightly more accuracy than TSC packets alone. The frequency of these packets is controlled with the Frequency setting. A value of CTC 6 indicates that a packet is generated whenever bit 6 of the CTC counter changes. These packets are generally not used.
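The Frequency setting translates directly into a packet period: bit n of a free-running counter changes once every 2^n counts. The short sketch below works through that arithmetic. The function names are hypothetical, and the model assumes the CTC increments by one per tick.

```python
def mtc_period_ticks(ctc_bit: int) -> int:
    """CTC ticks between MTC packets when bit `ctc_bit` is selected."""
    return 1 << ctc_bit


def mtc_packets_between(start_ctc: int, end_ctc: int, ctc_bit: int) -> int:
    """Count how often bit `ctc_bit` changes as CTC counts from start to end."""
    if end_ctc < start_ctc:
        raise ValueError("end_ctc must not be less than start_ctc")
    return (end_ctc >> ctc_bit) - (start_ctc >> ctc_bit)


if __name__ == "__main__":
    print(mtc_period_ticks(6))             # ticks per packet at CTC 6
    print(mtc_packets_between(0, 640, 6))  # packets over 640 ticks
```

Selecting a lower bit gives finer time resolution at the cost of more packets in the trace buffer.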

Cycle Accurate. Cycle accurate packets contain information about the number of processor clocks that have occurred. These packets can be used by themselves to measure time within a trace buffer, but TSC packets are still required to time align with other Trace views. The frequency of these packets is controlled with the Threshold setting. This value indicates how many processor clocks occur before a packet is emitted.
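Measuring time within a trace buffer from cycle accurate packets is a matter of summing their counts. The sketch below assumes each packet carries the number of processor clocks since the previous one, and that the core frequency is constant and known. Both are simplifying assumptions, and the function names are illustrative only.

```python
from typing import Iterable


def elapsed_cycles(cycle_counts: Iterable[int]) -> int:
    """Total processor clocks covered by a run of cycle accurate packets."""
    return sum(cycle_counts)


def cycles_to_ns(cycles: int, core_hz: float) -> float:
    """Convert a cycle count to nanoseconds at an assumed fixed frequency."""
    if core_hz <= 0:
        raise ValueError("core_hz must be positive")
    return cycles * 1e9 / core_hz


if __name__ == "__main__":
    counts = [120, 64, 300]  # made-up per-packet cycle deltas
    total = elapsed_cycles(counts)
    print(total, "cycles,", cycles_to_ns(total, 2_000_000_000), "ns at 2 GHz")
```

A larger Threshold value means fewer packets, so individual events between packets are located less precisely even though the running total stays the same.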

Want to learn more about Intel Processor Trace? See our eBook, Intel Adds High-Speed Instruction Trace (note: requires registration).