[TSan] Ignore reads if not stored early
As documented in this paper https://publications.rwth-aachen.de/record/840022/files/840022.pdf, we traced a significant runtime overhead observed with certain HPC/scientific applications back to concurrent shared read accesses. A typical scenario for such read accesses is matrix-vector multiplication, which is frequently used to solve linear equation systems. Incidentally, similar operations also appear in various machine learning algorithms. The performance issue typically arises when the code executes with more than 4 threads, and it gets worse when the threads are spread across different NUMA domains / sockets.

The proposed change is to skip logging a read if it is not stored early. Previous reads by the current thread are still updated, and empty shadow cells are still used for logging. The change also prevents a read access from randomly overwriting a previous write.

Under review as #74575
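The read-logging policy described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ShadowCell`, `ShadowWord`, `LogRead`, `ReadResult`, and the count of four cells per word are hypothetical names and values chosen for illustration, not TSan's actual shadow encoding or runtime code. Race checking is left out; the sketch only models where a read gets stored.

```cpp
#include <array>
#include <cstdint>

// Toy model only: hypothetical types, not TSan's real shadow layout.
struct ShadowCell {
  bool used = false;        // false means the cell is empty
  int tid = 0;              // thread that performed the recorded access
  bool is_write = false;    // true for a write, false for a read
  std::uint64_t epoch = 0;  // logical time of the recorded access
};

constexpr int kCells = 4;  // assumed number of shadow cells per word
using ShadowWord = std::array<ShadowCell, kCells>;

enum class ReadResult { Updated, StoredInEmpty, Skipped };

// Record a read by thread `tid` at time `epoch`:
//  1. a previous read by the same thread is updated in place,
//  2. otherwise the first empty cell is used,
//  3. otherwise the read is not logged, so no existing entry
//     (in particular no previous write) is evicted.
ReadResult LogRead(ShadowWord& word, int tid, std::uint64_t epoch) {
  ShadowCell* empty = nullptr;
  for (ShadowCell& cell : word) {
    if (!cell.used) {
      if (empty == nullptr) empty = &cell;  // remember first empty cell
      continue;
    }
    if (cell.tid == tid && !cell.is_write) {
      cell.epoch = epoch;  // refresh this thread's earlier read
      return ReadResult::Updated;
    }
  }
  if (empty != nullptr) {
    empty->used = true;
    empty->tid = tid;
    empty->is_write = false;
    empty->epoch = epoch;
    return ReadResult::StoredInEmpty;
  }
  return ReadResult::Skipped;  // all cells occupied: drop the read
}
```

In this model, once all cells of a word are occupied, further reads from other threads leave the shadow word untouched, which is the property that keeps many threads reading the same matrix or vector data from repeatedly rewriting the same shadow memory.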