Application-level Tuning

The second level of performance tuning is application-level tuning. Its goal is to speed up the application by improving the application's algorithms, improving its threading implementation, using APIs or primitives, or a combination of these.

The two main application-level tuning strategies are improving the threading model and improving the efficiency of computation.

Improving the threading model can require more effort, but it can yield large performance gains on multiprocessor systems and on uniprocessor systems with Hyper-Threading Technology. Improving the efficiency of computation often requires less effort, but it typically yields a smaller performance gain on those same systems. For more information on the opportunities for speeding up your application, start with Improving the Threading Model.

Improving the Threading Model

Using an effective threading model is important because it enables your application's performance to scale well, meaning that performance increases proportionally as users add more processing units (both physical processors and Hyper-Threading Technology hardware threads) to their systems. If your application is single-threaded, the first step in improving scaling is to add multithreading to the application. If your application is already multithreaded, look for opportunities to improve performance and scaling on multiprocessor systems (and uniprocessor systems with Hyper-Threading Technology).
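
To make "scaling well" concrete, the following minimal sketch estimates the best-case speedup from adding processing units using Amdahl's law. The function name and the 90% parallel fraction are hypothetical, chosen only for illustration. The model treats every processing unit as a full processor, so it likely overstates the benefit of Hyper-Threading Technology hardware threads, which share execution resources.

```python
def amdahl_speedup(parallel_fraction: float, processing_units: int) -> float:
    """Best-case speedup when only part of the run time can run in parallel.

    parallel_fraction: share of single-threaded run time that can be threaded (0..1).
    processing_units:  number of units working on the parallel part (assumed equal).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processing_units < 1:
        raise ValueError("processing_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many units are added.
    return 1.0 / (serial_fraction + parallel_fraction / processing_units)


if __name__ == "__main__":
    # Hypothetical application in which 90% of the run time can be threaded.
    for units in (1, 2, 4, 8):
        print(f"{units} unit(s): up to {amdahl_speedup(0.9, units):.2f}x")
```

Under this model, speedup is proportional to the number of units only when almost all of the run time is threaded, which is why the remaining serial regions of an already multithreaded application are worth examining.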

Improving the Efficiency of Computation

Identify code regions that have a high impact on application performance using the sampling and/or call graph data collectors:

Using Sampling

  1. Use the Sampling wizard to create an Activity. For application-level tuning, monitor the Clockticks event. If you are interested in microarchitecture-level tuning, you may wish to collect additional events in this step. See Tuning Methodology for Specific Goals for more information.

  2. Look for code regions with a relatively high Clockticks count; these are the code regions that have a high impact on application performance. Drill down to the source view to investigate algorithmic improvements that could speed up your application.
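
As an illustration of the kind of algorithmic improvement to look for in a hot code region, the sketch below shows two versions of a hypothetical duplicate check. Both function names are invented for this example, and the second version assumes the items are hashable. The results are the same, but the second version does far less work on large inputs.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: O(n^2) comparisons."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Remember values already seen: O(n) expected time, O(n) extra memory."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


if __name__ == "__main__":
    data = list(range(2000)) + [7]  # the final element repeats an earlier value
    print(has_duplicates_quadratic(data), has_duplicates_linear(data))
```

The trade-off is extra memory for the set, so a change like this is only worthwhile when the region actually shows up as a hotspot.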

Using Call Graph

  1. Use the Call Graph wizard to create an Activity.

  2. Analyze the data in the call graph view to determine your program flow and identify the most time-consuming function calls and call sequences.

  3. Select high-impact code regions identified in the call graph views, and drill down to the source view to investigate algorithmic improvements that could speed up your application.
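
For readers without the Call Graph collector at hand, the same idea (rank functions by time, then see who calls them) can be sketched with Python's standard cProfile and pstats modules. This is a stand-in for illustration only, not the Call Graph wizard, and the workload functions are hypothetical.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Hypothetical hot function: sums squares in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical application entry point that calls the hot function repeatedly."""
    return sum(slow_square_sum(2000) for _ in range(50))


def profile_top_functions(func, limit=5):
    """Run func under cProfile and return a text report.

    The report lists up to `limit` functions sorted by cumulative time,
    followed by the callers of those functions.
    """
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    stats.print_callers(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_top_functions(workload))
```

Functions with a high cumulative time that are reached through few call paths are usually the first candidates for a closer look at the source.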

Once you have resolved application-level performance issues, or have determined that there are no significant application-level issues, you can move on to the next tuning level, microarchitecture-level tuning.