Fix runs on Ascent with FP32

By removing the second model from the forward pass, we can halve the memory usage.
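
As a rough illustration of the halving claim, here is a minimal back-of-the-envelope sketch. It assumes the two models are the same size and that weight storage dominates memory; neither is confirmed by this merge request. The parameter count and the helper `param_memory_bytes` are hypothetical and are not taken from the codebase.

```python
BYTES_PER_FP32 = 4  # one single-precision float occupies 4 bytes


def param_memory_bytes(num_params: int, num_model_copies: int) -> int:
    """Bytes needed to hold the FP32 weights of identical model copies."""
    return num_params * BYTES_PER_FP32 * num_model_copies


# Hypothetical parameter count, chosen only for illustration.
num_params = 100_000_000

two_models = param_memory_bytes(num_params, 2)  # before: two models held in memory
one_model = param_memory_bytes(num_params, 1)   # after: second model removed

print(f"two models: {two_models / 2**20:.1f} MiB")
print(f"one model:  {one_model / 2**20:.1f} MiB")
```

In practice, activations and any optimizer state also contribute, so the actual saving may differ from exactly one half.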
