swtrace is a software tracing mechanism. swtrace is normally run from a command prompt by issuing the swtrace command with the appropriate arguments.
swtrace uses software trace hooks to collect data. Trace hooks are identified by both a major code and a minor code. Trace data is collected in a trace buffer that is allocated when swtrace is initialized or when swtrace is turned on. The size of the trace buffer can be set when swtrace is initialized. The swtrace command allows the user to select which major codes are traced, when tracing starts, when tracing stops, when data is transferred from the trace buffer to disk, and the formatting of the trace data.
Arguments in "<>" (angle brackets) are OPTIONAL.
Arguments in "[]" (square brackets) are REQUIRED.
Arguments not preceded by an option keyword ("-x") are POSITIONAL.
Syntax:
swtrace command <options>
Commands:
[? | -? | help | -help | syntax]
* Display summary swtrace command syntax help.
[?? | -?? | --help]
* Display detailed swtrace command help.
init <-s trace_buffer_size> <-t trace_mode> <-f file_name>
<-sm mte_buffer_size> <-ss section_size>
* Allocate/reallocate SWTRACE trace buffers.
* '-s trace_buffer_size' specifies trace buffer size in MB, per processor.
Default trace buffer size: 3MB per processor.
* '-sm mte_buffer_size' specifies MTE buffer size in MB, per processor.
Default MTE buffer size: 5MB per processor.
* '-t trace_mode' specifies the mode of tracing.
trace_mode values:
- norm - normal mode: when a trace buffer is full, tracing stops;
- wrap - wraparound mode: when a trace buffer is full,
new records are written from the buffer start;
- cont - continuous mode: trace records are continuously written to a file.
Default trace_mode is norm.
* '-f file_name' specifies file name for continuous/wraparound modes.
This option is ignored with the normal mode.
Default name is swtrace.nrm2
* '-ss section_size' specifies the size of each trace/MTE section in KB written to the trace file
in the continuous mode, and the size of MTE section in the wraparound mode.
This option is ignored with the normal mode.
Default value is 64KB.
free
* Deallocate swtrace trace buffer(s), disable all major codes,
and turn off tracing.
get <nrm2_filename>
* Get copy of swtrace trace buffer.
- Default nrm2_filename: swtrace.nrm2
on
* Turn swtrace on (start collecting hooks).
off
* Turn swtrace off (stop collecting hooks).
enable <m ...>
* Enable swtrace major codes.
- M is a blank-delimited list of major codes and/or the word 'tprof'.
- Major codes can be specified in decimal or hexadecimal notation.
- Some often-used major codes are:
4: Interrupts
16: tprof (can also use the word 'tprof')
18: Dispatches
- Enable without arguments enables all major codes.
disable <m ...>
* Disable SWTRACE major codes.
- M is a blank-delimited list of major codes and/or the word 'tprof'.
- Major codes can be specified in decimal or hexadecimal (with -x option).
- Disable without arguments disables all codes.
maj
* Display active swtrace major codes.
- Major codes are displayed in decimal and hexadecimal notation.
setrate N
* Set tprof tick rate to approximately N hooks/second.
* Used when doing "time-based" profiling.
* Note: N will be rounded to (CPU tick rate)/(CPU tick rate/N).
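The rounding rule in the note above can be worked through numerically. The sketch below assumes the inner division (CPU tick rate/N) truncates to a whole number of ticks, which the help text does not state. The helper name effective_rate and the tick rate of 1000 are illustrative only and are not part of swtrace.

```python
def effective_rate(cpu_tick_rate: int, n: int) -> float:
    """Approximate the hook rate obtained for a requested rate n.

    Assumption: (cpu_tick_rate / n) is an integer division, giving a whole
    number of CPU ticks between hooks. The clamp to at least 1 tick is also
    an assumption, added so the sketch never divides by zero.
    """
    if cpu_tick_rate < 1 or n < 1:
        raise ValueError("rates must be positive integers")
    ticks_per_hook = max(1, cpu_tick_rate // n)
    return cpu_tick_rate / ticks_per_hook


# With a hypothetical tick rate of 1000, a request for 300 hooks/second
# becomes one hook every 3 ticks.
print(effective_rate(1000, 300))
```

Under these assumptions, requested rates that divide the tick rate evenly are kept as-is, and other values are adjusted to a rate that corresponds to a whole number of ticks.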
event [ctr_name | event_name | list] <-c event_cnt>
* Set TPROF event and event count.
* Used when doing "event-based" profiling.
- 'event_name' specifies the event to use to drive TPROF. To get a list
of supported names use the "swtrace event list" command.
Supported events vary by hardware architecture.
+ swtrace takes care of setting up the hardware performance counters
for the selected event.
- 'ctr_name' specifies the performance counter to use to drive TPROF. To
get a list of supported names use the "swtrace event list" command.
Supported counters vary by hardware architecture.
+ If you specify a counter then you are responsible for setting it up
to count the desired event.
+ You *CANNOT* specify a counter if HyperThreading is enabled.
- 'list' lists the available events and counters for the system.
- '-c event_cnt' specifies the number of events to be counted before a
TPROF interrupt occurs.
+ If not specified the default count is 10,000,000 events.
+ Too low a count may cause you to flood the system with interrupts,
thus distorting the scenario you are profiling.
+ Too high a count may cause you not to get enough samples.
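The relationship between event_cnt and the TPROF interrupt rate is a simple division, since one interrupt occurs per event_cnt events. The sketch below works it through. The helper name and the event rate of 2,000,000,000 events/second are illustrative assumptions, not values reported by swtrace.

```python
DEFAULT_EVENT_CNT = 10_000_000  # default count documented for '-c event_cnt'


def tprof_interrupts_per_second(events_per_second: float,
                                event_cnt: int = DEFAULT_EVENT_CNT) -> float:
    """Estimate how often a TPROF interrupt fires.

    One interrupt occurs after every event_cnt events, so the interrupt
    rate is the event rate divided by event_cnt.
    """
    if event_cnt < 1:
        raise ValueError("event_cnt must be at least 1")
    return events_per_second / event_cnt


# Hypothetical event that fires 2,000,000,000 times per second.
print(tprof_interrupts_per_second(2e9))           # default count
print(tprof_interrupts_per_second(2e9, 100_000))  # much smaller count
```

Estimating the rate this way before choosing a count helps keep the number of samples useful without overloading the system.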
info
* Display swtrace status/information.
ai <interval> <[num_samples | -r run_time]> <-t> <-s | -sp> <-l log_file>
* Display CPU utilization.
- 'interval' specifies how often CPU utilization is displayed.
+ It is specified as a decimal number, >= 1, in seconds.
+ Default is 1 (i.e., display utilization once a second).
- 'num_samples' specifies the number of times CPU utilization is
displayed.
+ It is specified as a decimal number, >= 1.
+ Default is infinity (i.e., display utilization forever).
+ You cannot specify both 'num_samples' and 'run_time'.
- 'run_time' is the length of time for which to run AI.
+ It is specified as a decimal number, >= 1,
immediately followed by the desired time unit suffix.
+ If no unit is specified the default is Seconds.
+ Valid unit suffixes are:
' ' no suffix (ex. -r 60 means: run for 60 seconds)
's' for Seconds (ex. -r 60s means: run for 60 seconds)
'm' for Minutes (ex. -r 20m means: run for 20 minutes)
'h' for Hours (ex. -r 5h means: run for 5 hours)
'd' for Days (ex. -r 10d means: run for 10 days)
- '-t' causes a timestamp to be appended to each line displayed.
+ Default is to not append a timestamp.
- '-s' causes system-wide utilization information to be displayed.
- '-sp' causes both per-processor and system-wide information to be
displayed.
+ Default is to display utilization per-processor.
+ This option is ignored on non-SMP systems.
- 'log_file' is the name of a file to which AI will log its output.
+ Output is always displayed (sent to stdout).
+ This option works much the way "ai | tee log_file" would.
* When no parameters are specified, ai will run until manually stopped
and will display the per-processor and system-wide CPU utilization every second.
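The run_time suffix rules listed for the '-r' option can be sketched as a small converter. The function run_time_to_seconds is a hypothetical helper for scripts that build ai command lines, not part of swtrace. It assumes suffixes are lowercase, as in the examples in the help text.

```python
import re

# Seconds per unit suffix; an empty suffix means seconds.
_UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400}


def run_time_to_seconds(value: str) -> int:
    """Convert a run_time value such as '60', '20m' or '5h' to seconds."""
    match = re.fullmatch(r"([0-9]+)([smhd]?)", value)
    if match is None:
        raise ValueError(f"invalid run_time: {value!r}")
    count = int(match.group(1))
    if count < 1:
        raise ValueError("run_time must be >= 1")
    return count * _UNIT_SECONDS[match.group(2)]


print(run_time_to_seconds("20m"))
```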
it_install
* Enable ITrace.
it_install -ss
* On PowerPC64, enable ITrace to collect a single step trace.
If not specified, ITrace does branch tracing.
it_remove
* Disable ITrace.
ptt_int [on | off]
* On x86 and x86_64 systems, enables/disables counting of time spent in interrupts in per thread time.
Running swtrace
The following sections assume swtrace resides either in a directory listed in the PATH environment variable, or it resides in the current directory.
swtrace must be initialized using the swtrace init command.
To initialize swtrace, enable tracing for major codes 16 and 18, and begin tracing, enter:
swtrace init
swtrace enable 16 18
swtrace on
Major codes to be traced are enabled using the swtrace enable command.
To enable tracing for major codes 16 and 18, enter:
swtrace enable 16 18
- OR -
swtrace enable 0x10 0x12
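The two forms above name the same major codes, once in decimal and once in hexadecimal. A minimal sketch of that equivalence follows. The helper parse_major_code is hypothetical and only mirrors the notations the help text describes (decimal, 0x-prefixed hexadecimal, and the word 'tprof' for major code 16).

```python
TPROF_MAJOR = 16  # the help text lists 16 as the tprof major code


def parse_major_code(token: str) -> int:
    """Turn one enable/disable argument into its numeric major code."""
    if token.lower() == "tprof":
        return TPROF_MAJOR
    # Base 0 lets int() accept plain decimal and 0x-prefixed hexadecimal.
    return int(token, 0)


decimal_form = [parse_major_code(t) for t in "16 18".split()]
hex_form = [parse_major_code(t) for t in "0x10 0x12".split()]
print(decimal_form == hex_form)
```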
After the desired set of major codes has been enabled, tracing is started using
the swtrace on command and stopped using the swtrace off command.
To start/stop tracing, enter:
swtrace enable <desired major codes>
swtrace on
--- run workload to be traced ---
swtrace off
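When the workload is driven from a script, the same sequence can be wrapped so that tracing is turned off even if the workload fails. The sketch below is one possible arrangement, not a supported interface. The function trace_workload and its run parameter are hypothetical, and it assumes swtrace is reachable through PATH.

```python
import subprocess
from typing import Callable, Iterable


def trace_workload(workload: Callable[[], None],
                   major_codes: Iterable[int],
                   run=subprocess.run) -> None:
    """Initialize swtrace, trace the workload, then turn tracing off.

    'run' defaults to subprocess.run; passing a stand-in makes it possible
    to check the command sequence without swtrace installed.
    """
    codes = [str(code) for code in major_codes]
    run(["swtrace", "init"], check=True)
    # Per the help text, an empty list here would enable all major codes.
    run(["swtrace", "enable", *codes], check=True)
    run(["swtrace", "on"], check=True)
    try:
        workload()
    finally:
        # Stop collecting hooks even if the workload raised an exception.
        run(["swtrace", "off"], check=True)


# Example call (not executed here):
#   trace_workload(my_benchmark, [16, 18])
```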
After tracing has completed and swtrace has been turned off, use the swtrace get command to copy the contents of the trace buffer to a file.
To capture the contents of the trace buffer to file swtrace.nrm2 and post-process it, enter:
swtrace get
post -showx
The format of the ASCII dump file produced by the post command is described below.
A very small excerpt of a post.show file is shown here:
CPU MAJOR MINOR TmHI TmLO     [INT]* [STR]*
0   a8    1     2ac0:38b9aaff 2ac0
0   11    b5    2ac0:38b9aaff 223
0   11    b4    2ac0:38b9b146 4000
0   11    b6    2ac0:38b9b3bd beeb0280 0 beec0c62 0
0   11    b9    2ac0:38b9b67e 0
0   11    ba    2ac0:38b9b8ec 8
0   19    1     2ac0:38e2e4e9 0 151e0 39a9fd0c 1ade7 bfeb2000 [7]Mup.sys
0   19    1     2ac0:38e3c6df 0 27c20 3a68bea9 37a77 bfec8000 [8]NDIS.sys
0   10    20    2ac0:ae5e9a8a 2f0 2c0 8049b96f
The general format for trace output is:
p MM mmmmm TTTT:tttttttt dddddd ....
| |  |     |    |        |
| |  |     |    |        +----- Hook-specific data
| |  |     |    |
| |  |     |    +-------------- Low-order 32 bits of Pentium
| |  |     |                    Time Stamp Counter, in hexadecimal
| |  |     |
| |  |     +------------------- High-order 32 bits of Pentium
| |  |                          Time Stamp Counter, in hexadecimal
| |  |
| |  +------------------------- Minor Code (in hex)
| |
| +---------------------------- Major Code (in hex)
|
+------------------------------ Processor ID in decimal (0,1,...,63)
                                (Always 0 for a uniprocessor).
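The field layout above can be turned into a small parser for post-processing scripts. The sketch below follows the layout as described (decimal processor ID, hexadecimal major and minor codes, and a timestamp split into high-order and low-order 32-bit halves). The TraceRecord type and parse_record function are hypothetical names introduced for illustration. Hook-specific data is kept as raw strings because its meaning depends on the hook.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceRecord:
    cpu: int          # processor ID (decimal in the dump)
    major: int        # major code (hexadecimal in the dump)
    minor: int        # minor code (hexadecimal in the dump)
    timestamp: int    # 64-bit Time Stamp Counter value
    data: List[str]   # hook-specific fields, left unparsed


def parse_record(line: str) -> TraceRecord:
    """Parse one trace line of the form 'p MM mmmmm TTTT:tttttttt data...'."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"not a trace record: {line!r}")
    hi_text, lo_text = fields[3].split(":")
    # Combine the high-order and low-order 32-bit halves into one value.
    timestamp = (int(hi_text, 16) << 32) | int(lo_text, 16)
    return TraceRecord(
        cpu=int(fields[0]),
        major=int(fields[1], 16),
        minor=int(fields[2], 16),
        timestamp=timestamp,
        data=fields[4:],
    )


rec = parse_record(
    "0 19 1 2ac0:38e2e4e9 0 151e0 39a9fd0c 1ade7 bfeb2000 [7]Mup.sys"
)
print(rec.major, rec.minor, hex(rec.timestamp), rec.data[-1])
```

Combining the two timestamp halves into a single integer makes it straightforward to subtract timestamps of two records and obtain an elapsed tick count.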