Bank_Conflict indicates a penalty condition.
The data cache in Pentium(R) processors contains eight banks interleaved on four-byte boundaries. The data cache can be accessed simultaneously by instructions in both the U pipe and the V pipe. However, a bank conflict occurs when both instructions access data from the same data-cache bank. When a bank conflict occurs, the instruction pair stalls for one cycle.
The VTune(TM) Performance Analyzer indicates this penalty condition only when the instructions' base or index registers are 32 (or a multiple of 32) bytes apart.
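The 32-byte spacing follows from the cache layout described above: eight banks of four bytes each repeat every 32 bytes. The following is a minimal sketch of that mapping, assuming the bank is selected purely by the address divided by four, modulo eight. The names `bank_index` and `same_bank` and the address `0x1000` are hypothetical and used only for illustration.

```python
BANK_COUNT = 8   # eight banks, as stated in the text
BANK_WIDTH = 4   # banks are interleaved on four-byte boundaries


def bank_index(address: int) -> int:
    """Return the data-cache bank that an address maps to (simplified model)."""
    return (address // BANK_WIDTH) % BANK_COUNT


def same_bank(addr_a: int, addr_b: int) -> bool:
    """True when both addresses map to the same bank in this model."""
    return bank_index(addr_a) == bank_index(addr_b)


mem = 0x1000  # hypothetical address, chosen only for illustration
print(same_bank(mem, mem + 32))  # True: 32 bytes apart wraps to the same bank
print(same_bank(mem, mem + 64))  # True: any multiple of 32 behaves the same
print(same_bank(mem, mem + 4))   # False: the next four-byte slot is the next bank
```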
To prevent bank conflicts:
Reorder your code so that two instructions that access the same bank do not pair.
Reorder your data.
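As a rough illustration of reordering data, the sketch below finds how far a second data item would have to move so that it no longer shares a bank with the first. It assumes the same simplified bank model (address divided by four, modulo eight). The helper name `padding_to_avoid_conflict` is hypothetical and not part of any tool.

```python
def bank_index(address: int, bank_width: int = 4, bank_count: int = 8) -> int:
    """Simplified model: eight banks interleaved on four-byte boundaries."""
    return (address // bank_width) % bank_count


def padding_to_avoid_conflict(offset_a: int, offset_b: int) -> int:
    """Smallest padding (in four-byte steps) to add to offset_b so that
    the two items fall into different banks."""
    padding = 0
    while bank_index(offset_a) == bank_index(offset_b + padding):
        padding += 4  # moving by one bank width always changes the bank
    return padding


print(padding_to_avoid_conflict(0, 32))  # 4: moving the item to offset 36 avoids the conflict
print(padding_to_avoid_conflict(0, 4))   # 0: already in different banks
```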
This example shows how the stall is prevented by reordering the code so that the two instructions that access the same bank no longer pair.
Example:
Original | Cycle | Optimized | Cycle |
---|---|---|---|
1. mov ecx, mem | 1 | 1. mov ecx, mem | |
2. mov edx, mem+32 | 2 | 3. add eax, 4 | |
3. add eax, 4 | 2 | 2. mov edx, mem+32 | |
4. add ebx, 4 | 3 | 4. add ebx, 4 | |

Original: Although Instructions 1 and 2 can pair, the pair stalls for one cycle because of the bank conflict that occurs when accessing MEM and MEM+32. Total execution time: 3 cycles.

Optimized: The code sequence is reordered so that the bank conflict between Instructions 1 and 2 is prevented. Total execution time: 2 cycles.
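The cycle totals in this example can be reproduced with a small model. This is a sketch under strong assumptions: every two consecutive instructions pair (real Pentium pairing rules are ignored), each pair takes one cycle, and a pair whose two memory operands map to the same bank costs one extra cycle. The function `count_cycles` and the address `MEM = 0x1000` are hypothetical.

```python
def bank_index(address: int) -> int:
    """Simplified model: eight banks interleaved on four-byte boundaries."""
    return (address // 4) % 8


def count_cycles(instructions):
    """instructions: list of (text, memory_address_or_None).
    Assumes consecutive instructions always pair, two at a time."""
    cycles = 0
    for i in range(0, len(instructions), 2):
        pair = instructions[i:i + 2]
        cycles += 1  # one cycle per issued pair
        addresses = [addr for _, addr in pair if addr is not None]
        if len(addresses) == 2 and bank_index(addresses[0]) == bank_index(addresses[1]):
            cycles += 1  # bank conflict: the pair stalls for one cycle
    return cycles


MEM = 0x1000  # hypothetical address of mem

original = [
    ("mov ecx, mem", MEM),
    ("mov edx, mem+32", MEM + 32),
    ("add eax, 4", None),
    ("add ebx, 4", None),
]
optimized = [
    ("mov ecx, mem", MEM),
    ("add eax, 4", None),
    ("mov edx, mem+32", MEM + 32),
    ("add ebx, 4", None),
]

print(count_cycles(original))   # 3
print(count_cycles(optimized))  # 2
```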