Index

Altix UV Hub
Performance Tuning Running MPI on Altix UV 100 and Altix UV 1000 Systems

Amdahl's law
Understanding Parallel Speedup and Amdahl's Law
execution time given n and p
Predicting Execution Time with n CPUs
parallel fraction p
Understanding Amdahl's Law
parallel fraction p given speedup( n )
Calculating the Parallel Fraction of a Program
speedup( n ) given p
Understanding Amdahl's Law
superlinear speedup
Understanding Superlinear Speedup

application placement and I/O resources
Application Placement and I/O Resources

application tuning process
Performance Analysis and Debugging

automatic parallelization
limitations
Use Compiler Options

avoiding segmentation faults
Avoiding Segmentation Faults

cache bank conflicts
Tuning the Cache Performance

cache coherency
Cache Coherency

Cache coherent non-uniform memory access (ccNUMA) systems
Other ccNUMA Performance Issues

cache performance
Tuning the Cache Performance

ccNUMA
Other ccNUMA Performance Issues
See Also cache coherent non-uniform memory access

ccNUMA architecture
ccNUMA Architecture

cluster environment
Scalable Computing

commands
dlook
Using the dlook Command
dplace
Using the dplace Command
topology
topology(1) Command

common compiler options
Compiler Overview

compiler command line
Compiler Overview

compiler libraries
C/C++
C/C++ Libraries
dynamic libraries
Dynamic Libraries
message passing
SHMEM Message Passing Libraries
overview
Library Overview
static libraries
Static Libraries

compiler options
tracing and porting
Getting the Correct Results

compiler options for tuning
Using Compiler Options Where Possible

compiling environment
The SGI Compiling Environment
compiler overview
Compiler Overview
debugger overview
Other Compiling Environment Features
libraries
Library Overview
modules
Environment Modules

CPU-bound processes
Sources of Performance Problems

Cpuset Facility
advantages
An Overview of the Advantages Gained by Using Cpusets
cpuset
definition
An Overview of the Advantages Gained by Using Cpusets
determine if cpusets are installed
How to Determine if Cpusets are Installed
overview
An Overview of the Advantages Gained by Using Cpusets
system calls
mbind
An Overview of the Advantages Gained by Using Cpusets
sched_setaffinity
An Overview of the Advantages Gained by Using Cpusets
set_mempolicy
An Overview of the Advantages Gained by Using Cpusets

data decomposition
Data Decomposition

data dependency
Identifying Parallel Opportunities in Existing Code

data parallelism
Data Decomposition

data placement tools
Data Placement Tools
cpusets
Data Placement Practices
dplace
Data Placement Practices
overview
Data Placement Tools Overview
taskset
Data Placement Practices

data placement practices
Data Placement Practices

debugger overview
Other Compiling Environment Features

debuggers
Debugging Tools
gdb
Other Compiling Environment Features
idb
Other Compiling Environment Features
TotalView
Other Compiling Environment Features

denormalized arithmetic
Compiler Overview

determining parallel code amount
Parallelizing Your Code

determining tuning needs
tools used
Determining Tuning Needs

distributed shared memory (DSM)
Distributed Shared Memory (DSM)

dlook command
Using the dlook Command

dplace command
Using the dplace Command

Environment variables
Environment Variables for Performance Tuning

explicit data decomposition
Data Decomposition

False sharing
Fixing False Sharing

file limit resources
resetting
Resetting the File Limit Resource Default

Flexible File I/O (FFIO)
Multithreading Considerations
environment variables to set
Environment Variables
operation
FFIO Operation
overview
Flexible File I/O
simple examples
Simple Examples

functional parallelism
Data Decomposition

gdb tool
Debugging Tools

Global reference unit (GRU)
Performance Tuning Running MPI on Altix UV 100 and Altix UV 1000 Systems

GNU debugger
Debugging Tools

gtopology command
gtopology(1) Command

Gustafson's law
Gustafson's Law

hwinfo command
hwinfo(1) Command

idb tool
Debugging Tools

implicit data decomposition
Data Decomposition

I/O tuning
application placement
I/O Tuning
layout of filesystems
Layout of Filesystems and XVM for Multiple RAIDs

I/O-bound processes
Sources of Performance Problems

iostat command
System Usage Commands

latency
Scalable Computing

layout of filesystems
Layout of Filesystems and XVM for Multiple RAIDs

limits
system
Resetting System Limits

linkstat command
linkstat-uv(1) Command

Linux shared memory accounting
Linux Shared Memory Accounting

memory
cache coherency
Cache Coherency
ccNUMA architecture
ccNUMA Architecture
distributed shared memory (DSM)
Distributed Shared Memory (DSM)
non-uniform memory access (NUMA)
Non-uniform Memory Access (NUMA)

memory accounting
Linux Shared Memory Accounting

memory management
The Basics of Memory Management
Managing Memory

memory page
The Basics of Memory Management

memory strides
Tuning the Cache Performance

memory-bound processes
Sources of Performance Problems

Message Passing Toolkit
for parallelization
Use MPT

modules
Environment Modules
command examples
Environment Modules

MPI on Altix UV systems
Performance Tuning Running MPI on Altix UV 100 and Altix UV 1000 Systems
general considerations
General Considerations
job performance types
Job Performance Types
other ccNUMA performance issues
Other ccNUMA Performance Issues

MPI profiling
MPInside Profiling Tool

MPInside profiling tool
MPInside Profiling Tool

MPP definition
Scalable Computing

non-uniform memory access (NUMA)
Non-uniform Memory Access (NUMA)

NUMA Tools
command
dlook
dlook Command
dplace
Using the dplace Command
installing
Installing NUMA Tools

OpenMP
Use OpenMP
environment variables
Environment Variables for Performance Tuning
Guide OpenMP Compiler
Other Performance Tools

parallel execution
Amdahl's law
Understanding Parallel Speedup and Amdahl's Law
parallel fraction p
Understanding Amdahl's Law

parallel speedup
Understanding Parallel Speedup

parallelization
automatic
Use Compiler Options
using MPI
Use MPT
using OpenMP
Use OpenMP

perf tool
Profiling with perf

Perfcatcher
Perfcatcher

performance
Assure Thread Analyzer
Other Performance Tools
Guide OpenMP Compiler
Other Performance Tools
VTune
Using VTune for Remote Sampling

performance analysis
Performance Analysis and Debugging

Performance Co-Pilot monitoring tools
Performance Co-Pilot Monitoring Tools
hubstats
hubstats(1) Command
linkstat
linkstat-uv(1) Command
Other Performance Co-Pilot monitoring tools
Other Performance Co-Pilot Monitoring Tools

performance gains
types of
Performance Analysis and Debugging

performance problems
sources
Sources of Performance Problems

PerfSuite script
Profiling with PerfSuite

process placement
Determining Process Placement
MPI and OpenMP
Combination Example (MPI and OpenMP)
set-up
Determining Process Placement
using OpenMP
Example Using OpenMP
using pthreads
Example Using pthreads

profiling
MPI
MPInside Profiling Tool
perf
Profiling with perf
PerfSuite
Profiling with PerfSuite

ps command
System Usage Commands

resetting default system stack size
Resetting the Default Stack Size

resetting file limit resources
Resetting the File Limit Resource Default

resetting system limit resources
Resetting System Limits

resetting virtual memory size
Resetting Virtual Memory Size

resident set size
The Basics of Memory Management

sar command
System Usage Commands

scalable computing
Scalable Computing

segmentation faults
Avoiding Segmentation Faults

SGI PerfBoost
SGI PerfBoost

SHMEM
SHMEM Message Passing Libraries

shortening execution time
Adding CPUs to Shorten Execution Time

shubstats command
hubstats(1) Command

SMP definition
Scalable Computing

stack size
resetting
Resetting the Default Stack Size

superlinear speedup
Understanding Superlinear Speedup

swap space
The Basics of Memory Management

system
overview
System Overview

system configuration
Determining System Configuration

system limit resources
resetting
Resetting System Limits

system limits
address space limit
Resetting System Limits
core file size
Resetting System Limits
CPU time
Resetting System Limits
data size
Resetting System Limits
file locks
Resetting System Limits
file size
Resetting System Limits
locked-in-memory address space
Resetting System Limits
number of logins
Resetting System Limits
number of open files
Resetting System Limits
number of processes
Resetting System Limits
priority of user process
Resetting System Limits
resetting
Resetting System Limits
resident set size
Resetting System Limits
stack size
Resetting System Limits

system monitoring tools
Monitoring Tools
command
hwinfo
hwinfo(1) Command
topology
topology(1) Command

system usage commands
System Usage Commands
iostat
System Usage Commands
ps
System Usage Commands
sar
System Usage Commands
top
System Usage Commands
uptime
System Usage Commands
vmstat
System Usage Commands
w
System Usage Commands

taskset command
taskset Command

tools
Assure Thread Analyzer
Other Performance Tools
Guide OpenMP Compiler
Other Performance Tools
perf
Profiling with perf
PerfSuite
Profiling with PerfSuite
VTune
Using VTune for Remote Sampling

top command
System Usage Commands

topology command
topology(1) Command

tuning
cache performance
Tuning the Cache Performance
debugging tools
idb
Debugging Tools
dplace
Using dplace and taskset
environment variables
Environment Variables for Performance Tuning
false sharing
Fixing False Sharing
heap corruption
Managing Heap Corruption Problems
managing memory
Managing Memory
multiprocessor code
Multiprocessor Code Tuning
parallelization
Parallelizing Your Code
profiling
perf
Profiling with perf
PerfSuite script
Profiling with PerfSuite
VTune analyzer
Using VTune for Remote Sampling
single processor code
Single Processor Code Tuning
using compiler options
Using Compiler Options Where Possible
using dplace
Using dplace and taskset
using math functions
Using Tuned Code
using taskset
Using dplace and taskset
verifying correct results
Getting the Correct Results

uname command
Determining System Configuration

underflow arithmetic
effects of
Compiler Overview

uptime command
System Usage Commands

virtual addressing
The Basics of Memory Management

virtual memory
The Basics of Memory Management

vmstat command
System Usage Commands

VTune performance analyzer
Using VTune for Remote Sampling

w command
System Usage Commands