Lessons Learned from Jacobi:
* The reduction clause needs to be on the outer loop; otherwise each team computes its own error instead of one global value
* The problem size needs to be big enough; otherwise the OMP3.1 version outperforms the ACConHost version
  A (80000, 1920) grid with 100 iterations and a 1e-120 tolerance shows close timings between the OMP3.1 and ACConHost versions
* These timings were obtained after adding -O3 -align to the jacc.host.intel build, together with -arch SSE and -qopenmp-simd
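
To illustrate the first lesson, here is a minimal sketch of one Jacobi sweep with the reduction placed on the outer loop. It is written with an OpenMP pragma for concreteness; the actual Jacobi source is not shown in these notes, so the function name `jacobi_sweep`, the array names `A` and `Anew`, and the row-major layout are all assumptions made for this example. The same placement idea should apply to an OpenACC version, with the reduction clause on the outer parallel loop.

```c
#include <assert.h>

/* One Jacobi sweep over the interior of an n x m grid (row-major).
 * Returns the largest absolute change of any interior point.
 * All names here are hypothetical, chosen for illustration only. */
double jacobi_sweep(int n, int m, const double *A, double *Anew)
{
    assert(n >= 3 && m >= 3);   /* need at least one interior point */
    double error = 0.0;

    /* The reduction sits on the OUTER loop, so the partial maxima of all
     * threads are combined into a single global error value.
     * (max reductions in C require OpenMP 3.1 or later.) */
    #pragma omp parallel for reduction(max:error)
    for (int j = 1; j < n - 1; j++) {
        for (int i = 1; i < m - 1; i++) {
            /* Average of the four neighbours. */
            Anew[j * m + i] = 0.25 * (A[j * m + i + 1] + A[j * m + i - 1]
                                    + A[(j - 1) * m + i] + A[(j + 1) * m + i]);

            /* Absolute difference without needing math.h. */
            double diff = Anew[j * m + i] - A[j * m + i];
            if (diff < 0.0) diff = -diff;
            if (diff > error) error = diff;
        }
    }
    return error;
}
```

The caller would compare the returned value against the tolerance to decide whether to keep iterating. Without OpenMP enabled the pragma is ignored and the function runs serially, which gives a reference value for checking the parallel result.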