Loading
[MLIR][NVVM] Fix register mapping in `wgmma.mma_async`
WgmmaMmaAsync Op generates `wgmma.mma_async` PTX instruction that uses the same registers as read and write with mapping. Therefore, the registers count needs to be increased 2 times for the following registers.
This works changes this:
```
llvm.inline_asm has_side_effects asm_dialect = att "{wgmma.mma_async... {$0, $1, $2, $3, $4}, $5, $6, p", "=f,=f,=f,=f,0,1,2,3,l,l"
```
Into this one below. The only different is the number of registers ($8 and $9) that comes after read/write.
```
llvm.inline_asm has_side_effects asm_dialect = att "{wgmma.mma_async... {$0, $1, $2, $3, $4}, $8, $9, p", "=f,=f,=f,=f,0,1,2,3,l,l"
```
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D157843