The Performance Inspector™ packages contain suites of performance analysis tools for Linux®, Windows®, AIX®, and z/OS® platforms. These tools help you understand the performance of your applications and the resources they consume. They can help identify performance problems in your application and show how it interacts with the operating system. Performance Inspector works with both Java™ and C/C++ applications and may be used with the Visual Performance Analyzer.
This documentation describes the maximum capabilities of our tools; the capabilities available on a given platform may be fewer, depending on what that platform supports. Performance Inspector uses the performance counters provided by the CPUs, if available, to measure system events at the process level. The package may also include a device driver, and may use a pinned buffer per CPU along with kernel hooks, to capture performance-related information.
The majority of this project is LGPL code, and the rest is GPL. See the Licensing Information section for more information.
Tools with capabilities similar to some of ours can be found in the Linux Trace Toolkit, OProfile, and DProbes.
The first style is "time" profiling. This style records the address currently being executed by each processor, usually at regular time intervals, although a sample can also be recorded whenever a specific event occurs. At post-processing time, these addresses are converted to symbols, and counts of the samples are reported for each symbol, module, thread, and process. Because the number of samples can be very small compared to the total number of instructions being executed, the impact on the overall performance of the application being profiled can also be very small. However, time profiling only shows "hotspots", the methods in which most of the execution time is spent, without giving any indication of why those methods are being executed.
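The post-processing step described above (addresses converted to symbols, then counted) can be sketched roughly as follows. This is only an illustration and not the package's own code: the symbol table, the addresses, and the function names `resolve` and `count_samples` are all hypothetical, and real address-to-symbol resolution also has to deal with modules, processes, and jitted code.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start_address, symbol_name), sorted by address.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "parse_input"),
    (0x2000, "compute"),
]
SYMBOL_STARTS = [start for start, _ in SYMBOLS]


def resolve(address):
    """Map a sampled address to the nearest symbol starting at or below it."""
    index = bisect.bisect_right(SYMBOL_STARTS, address) - 1
    if index < 0:
        return "<unknown>"  # address lies below every known symbol
    return SYMBOLS[index][1]


def count_samples(addresses):
    """Count samples per symbol, the core of a hotspot report."""
    return Counter(resolve(address) for address in addresses)


# Made-up sample addresses standing in for recorded profiling samples.
samples = [0x1010, 0x2004, 0x2010, 0x1404, 0x2ABC, 0x0800]
report = count_samples(samples)
for symbol, hits in report.most_common():
    print(f"{hits:4d}  {symbol}")
```

The symbol with the most samples appears first, which is all a hotspot listing is; note that nothing in the data says who called `compute`.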
The second style is callflow profiling. Unlike time profiling, callflow profiling does not merely collect samples of what is happening. It gathers information about every method entry and exit by intercepting the method entry and exit events generated by the Java Virtual Machine (JVM) or by the HOOKIT library. Thus, it reports not only which methods were executed, but also the context in which they were executed. That is, if A calls B, which then calls C, then the entire sequence A-B-C is recorded. This provides the detailed information that time profiling lacks, but because of the large number of method entry and exit events that must be handled, callflow profiling has a much greater impact on the execution of the application.
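To make the A-B-C example concrete, here is a minimal sketch of how a stream of entry and exit events could be folded into a call tree. The event format (`("entry", name)` / `("exit", name)` tuples) and the names `CallNode`, `build_call_tree`, and `format_tree` are assumptions made for illustration; they do not reflect the actual JVM or HOOKIT event interfaces.

```python
class CallNode:
    """One method in one calling context, with the number of times it was entered."""

    def __init__(self, name):
        self.name = name
        self.calls = 0
        self.children = {}

    def child(self, name):
        # Create the child node on first use so each context appears once.
        if name not in self.children:
            self.children[name] = CallNode(name)
        return self.children[name]


def build_call_tree(events):
    """Fold ("entry"|"exit", method) events into a tree of call contexts."""
    root = CallNode("<root>")
    stack = [root]
    for kind, method in events:
        if kind == "entry":
            node = stack[-1].child(method)
            node.calls += 1
            stack.append(node)
        elif kind == "exit":
            # Only pop when the exit matches the method currently on top.
            if len(stack) > 1 and stack[-1].name == method:
                stack.pop()
    return root


def format_tree(node, depth=0):
    """Return indented lines, one per call context, with entry counts."""
    lines = []
    for child in node.children.values():
        lines.append(f"{'  ' * depth}{child.name} ({child.calls})")
        lines.extend(format_tree(child, depth + 1))
    return lines


# A calls B twice; each time, B calls C once.
events = [
    ("entry", "A"),
    ("entry", "B"), ("entry", "C"), ("exit", "C"), ("exit", "B"),
    ("entry", "B"), ("entry", "C"), ("exit", "C"), ("exit", "B"),
    ("exit", "A"),
]
tree = build_call_tree(events)
print("\n".join(format_tree(tree)))
```

Every single entry and exit passes through `build_call_tree`, which is why this style costs so much more than sampling.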
The third style is callstack sampling, which combines the best features of both time profiling and callflow profiling when analyzing Java applications. It can only be used with Java applications, because callstacks cannot be reliably determined for non-Java applications. Callstack sampling is triggered by events in the same way as time profiling. However, instead of merely recording the address being executed, it records the entire callstack for that thread, as reported by the JVM. That is, if method C is executing when the event is received, and C was called by B, which was called by A, then the sequence A-B-C is recorded, just as it would have been by callflow profiling. Thus, callstack sampling falls somewhere between time profiling and callflow profiling, providing more detailed information than time profiling with less execution overhead than callflow profiling.
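As a rough illustration of how sampled callstacks might be summarized, the sketch below counts each distinct stack and also the method at the top of each stack. The sample data and the `summarize` helper are hypothetical; in practice the stacks would come from the JVM at each sampling event.

```python
from collections import Counter


def summarize(stacks):
    """Count samples per full callstack and per executing (leaf) method."""
    stack_counts = Counter(tuple(stack) for stack in stacks)
    leaf_counts = Counter(stack[-1] for stack in stacks if stack)
    return stack_counts, leaf_counts


# Each sample is the callstack at the moment of the event, outermost caller first.
sampled_stacks = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "D"],
    ["A", "B"],
]
stack_counts, leaf_counts = summarize(sampled_stacks)
for stack, hits in stack_counts.most_common():
    print(f"{hits:4d}  {'-'.join(stack)}")
```

The leaf counts correspond to what time profiling would report, while the full-stack counts add the calling context, at the cost of capturing a stack only when a sample is taken rather than on every call.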
The first style is heap dump analysis, which is performed by a post-processor. It reports a variety of information about the heap, including:
The A2N service is used by POST to resolve code execution addresses to symbols in the application being profiled/traced. For platform specific details, see A2N for Linux or A2N for Windows.
The cpi tool measures application CPI or overall system CPI over a specified time interval.
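CPI is cycles per instruction: the number of processor cycles elapsed over an interval divided by the number of instructions completed in that interval. A minimal sketch of the arithmetic, assuming two counter snapshots taken at the start and end of the interval (the dictionary layout, the counter values, and the function name are hypothetical and say nothing about how the cpi tool itself reads counters):

```python
def cycles_per_instruction(start, end):
    """Compute CPI from two snapshots of cycle and instruction counters."""
    cycles = end["cycles"] - start["cycles"]
    instructions = end["instructions"] - start["instructions"]
    if instructions <= 0:
        raise ValueError("no instructions completed during the interval")
    return cycles / instructions


# Made-up counter readings at the start and end of a measurement interval.
start = {"cycles": 1_000_000, "instructions": 400_000}
end = {"cycles": 4_000_000, "instructions": 2_400_000}

cpi = cycles_per_instruction(start, end)
print(f"CPI over interval: {cpi:.2f}")  # 3,000,000 cycles / 2,000,000 instructions = 1.50
```

A lower CPI means more instructions complete per cycle; a rising CPI for the same workload often points at stalls such as cache misses.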
hdump post-processes the information collected by JPROF about the Java heap. It provides a snapshot of references between live memory objects on the heap, in an attempt to locate memory leaks.
hookit is a set of interfaces to JPROF to profile C/C++ applications. It provides a way to instrument C/C++ function entries and exits that allows JPROF to capture call flow. For platform specific details, see HOOKIT for Linux or HOOKIT for Windows.
ITRACE traces code execution at the instruction level and reports code flow. It traces both application and kernel code, including jitted code (compiled by IBM Java's Just-In-Time compiler).
JLM works with the IBM Java Runtime Environment to provide statistics on monitors (locks).
jprof is a profiling agent that is loaded by Java using either the JVM Profiler Interface (JVMPI) or the JVM Tools Interface (JVMTI). It is the primary tool of the package which provides:
Runtime control is provided via RTDRIVER using sockets or via the RTDriver class from Java.
mpevt provides access to pre-defined processor hardware Performance Counter Events. The set of defined events varies by processor type.
msr reads from and writes to any valid processor MSR. This is a powerful, but very dangerous, tool.
post generates human-readable reports from the output of TPROF and ITRACE.
ptt is a user-mode command that allows some control over the Per-Thread Time (PTT) facility, which tracks the time spent or the number of instructions executed by each thread. You can turn it on and off, read summary information, and dump information about threads (all or subsets) for which PTT data is available.
rtdriver connects to JPROF via sockets to allow runtime commands, such as starting and stopping data collection and dumping collected data to files.
SCS samples Java callstacks, providing more information than TPROF with lower execution overhead than callflow profiling. It can only be used with Java applications, because callstacks cannot be reliably determined for non-Java applications.
skew provides a way to skew/offset the processor Time Stamp Counters on SMP systems. This can sometimes make it easier to determine whether or not back-to-back reads of the TSC occurred on the same processor.
SWTRACE allows applications to write trace hooks to a kernel trace buffer. The swtrace command controls the Software Trace Facility. It supports starting and stopping data collection and dumping collected trace data to a file.
TPROF samples code execution and reports "hotspots" in applications and the kernel, including IBM's Java jitted code. The sampling rate and the sampling event are user-specified.
Mailing list (perfinsp-list@lists.sourceforge.net).
GNU Library General Public License
IBM is a trademark of International Business Machines Corporation in the
United States, other countries, or both.
Notwithstanding the terms of any agreement you may have with IBM or any of its related or affiliated companies (collectively "IBM"),
the following terms and conditions apply to the Performance Inspector ("Program"):
(a) the Program is provided on an "AS IS" basis;
(b) IBM DISCLAIMS ANY AND ALL EXPRESS AND IMPLIED WARRANTIES AND CONDITIONS INCLUDING, BUT NOT LIMITED TO, THE WARRANTY OF NON-INFRINGEMENT OR INTERFERENCE AND THE IMPLIED WARRANTIES AND CONDITIONS OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE,
(c) IBM will not be liable to you or indemnify you for any claims related to the Program; and (d) IBM will not be liable for any direct, indirect, incidental, special, exemplary, punitive or consequential damages with respect to the Program.
The Program and future updates and fixpacks to the Program may contain certain third party components which are provided to you under terms and conditions which are different from this Agreement,
or which require IBM to provide you with certain notices and/or information.
For each such third party component, either IBM will identify such third party component in a "README" file
(or in an updated "README" file accompanying the fixpack or update),
or in a file or files referenced in such "README" files
(and shall include any associated license agreement, notices and other related information therein),
or the third party component will contain or be accompanied by its own license agreement
(for example, provided when installing or starting such component,
or accompanying such component in a file entitled "README", "COPYING", "LICENSE"
or a substantially similar title, or included among the Program's paper documentation, if any).
Your use of each third party component which contains or is accompanied by its own license agreement,
or for which IBM has identified a license agreement in one of the above "README" files
(or in a file or files referenced therein), will be subject to the terms and conditions of such other license agreement,
and not this Agreement.
By using or not uninstalling such third party components after the initial installation of such third party components
(thereby giving you access to the applicable license agreements, notices and information),
you acknowledge and agree to all such license agreements, notices and information, including those provided only in the English language.
You agree to review any updated "README" files which accompany updates and fixpacks to the Program.
Performance Inspector is a trademark of IBM.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in
the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks
of others.
MMX, Pentium, and ProShare are trademarks or registered trademarks of Intel
Corporation in the United States, other countries, or both.
Image of the detective was derived from the Art Explosion 300,000 package by Nova Development Corporation.