Using Per CPU Buffering

Use the Per CPU Buffering feature to enhance the quality of sampling results on systems with a large number of processors. This feature is automatically enabled on NUMA platforms, such as the SGI* Altix.

To manually enable Per CPU Buffering, set SEP_PERCPU_BUFFER=1 prior to running a sampling session.

To manually disable Per CPU Buffering, set SEP_PERCPU_BUFFER=0 prior to running a sampling session.
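The two settings above can be applied from a launcher script. The following is a minimal Python sketch, not part of the product: the helper names `percpu_buffer_env` and `run_sampling` are hypothetical, and the sampling command is supplied by the caller because no command line is given here. Only the SEP_PERCPU_BUFFER variable and its values 1 and 0 come from this section.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def percpu_buffer_env(enabled: bool,
                      base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of an environment with SEP_PERCPU_BUFFER set.

    Hypothetical helper: "1" enables Per CPU Buffering, "0" disables it.
    The mapping passed in (or os.environ) is copied, never modified.
    """
    env = dict(os.environ if base is None else base)
    env["SEP_PERCPU_BUFFER"] = "1" if enabled else "0"
    return env


def run_sampling(command: Sequence[str], enabled: bool) -> int:
    """Run a caller-supplied sampling command with the variable set.

    `command` is a placeholder: pass the command line you normally use
    to start a sampling session. Returns the command's exit code.
    """
    result = subprocess.run(list(command), env=percpu_buffer_env(enabled))
    return result.returncode
```

Because the variable is set only in the child process environment, the setting applies to that one sampling session and leaves the calling shell unchanged.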

Note

When Per CPU Buffering is enabled, the results from each processor must be sorted and merged into a single, final sampling data file. This adds time to the post-collection phase.

Adjusting the Priority of the VTune(TM) Performance Analyzer Process

During sampling collection, data is stored in internal kernel buffers. When the buffers become full, the VTune analyzer flushes the data to the disk. While it is flushing, the VTune analyzer does not collect samples. To minimize the loss of samples, the VTune analyzer's process needs to run at a higher-than-normal priority while flushing data to the disk.

To set the process priority, use the SEP_PRIORITY environment variable. Its value is the priority that this process uses while it flushes data to the disk. The valid range is -20 to 19. A negative number is a higher priority and tends to miss fewer samples. A positive number is a lower priority and gives the collector less overhead, at the risk of missing some samples. A value of 0 is normal priority and may miss some samples. If the value is outside this range, or if SEP_PRIORITY is not set in the environment, the default value of -1 is used.
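The range and default rules can be modeled in a few lines. The following Python sketch is an illustration only, not the collector's implementation: the function name `effective_sep_priority` is hypothetical, and treating a non-numeric value as out of range is an assumption, since this section does not say how such values are handled.

```python
from typing import Optional

SEP_PRIORITY_MIN = -20      # highest priority in the valid range
SEP_PRIORITY_MAX = 19       # lowest priority in the valid range
SEP_PRIORITY_DEFAULT = -1   # used when the variable is unset or out of range


def effective_sep_priority(raw: Optional[str]) -> int:
    """Return the priority that would apply for a SEP_PRIORITY setting.

    `raw` is the variable's string value, or None when it is not set.
    """
    if raw is None:
        return SEP_PRIORITY_DEFAULT
    try:
        value = int(raw.strip())
    except ValueError:
        # Assumption: a non-numeric value is handled like an
        # out-of-range value and falls back to the default.
        return SEP_PRIORITY_DEFAULT
    if SEP_PRIORITY_MIN <= value <= SEP_PRIORITY_MAX:
        return value
    return SEP_PRIORITY_DEFAULT
```

For example, calling `effective_sep_priority(os.environ.get("SEP_PRIORITY"))` after `import os` reports the priority that the current environment would select under these rules.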

Note

The higher the priority, the more likely the VTune analyzer is to take precedence over other applications, including the application being profiled, and the fewer samples are dropped.