Let's put together a list of things we'd like to have in the harness; then we can create issues for the features we agree to include. As a start, here are suggestions Hai Ah shared a while back:
* More granular testing options (checkout, build, submit, kill after X iterations, triage results) – I would like a build-only option for PE testing, plus a quick way to see which tests built and which broke.
* Bug: when a build fails, the failure is not logged properly because the run script still submits the job.
* Launch tests by specific name (e.g. GTC, IMB, or all) so there doesn't have to be a separate rgt.input file for each launch (see the first sketch after this list).
* Timers on builds, especially long ones – build-time variability is something we should monitor.
* Easy tuning of the job-submission frequency to keep the workload balanced.
* Output workload statistics (job size, duration, etc.) so runs can mimic the OLCF workload.
* A setting for how many times a given test should be executed (also covered in the first sketch after this list).
* Move code that repeats in every test into the harness (e.g. send_to_scheduler, make_batch_script) to make test creation easier (see the second sketch after this list).
* Output test metrics similar to rgt_status.txt: an rgt_metrics.txt could hold app-specific values (e.g. GTC checkpoint-restart time, S3D I/O rate; see the third sketch after this list).
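
For the launch-by-name and repeat-count items above, here is a minimal sketch of what the command-line interface could look like. The option names (`--tests`, `--repeat`) and the `launch_test` helper are hypothetical, not part of the current harness:

```python
import argparse

# Hypothetical registry of tests the harness knows about.
KNOWN_TESTS = {"GTC", "IMB", "S3D"}

def launch_test(name, iteration):
    # Placeholder: the real harness would check out, build,
    # and submit the named test here.
    print(f"launching {name} (iteration {iteration})")

def main():
    parser = argparse.ArgumentParser(description="Launch harness tests by name")
    parser.add_argument("--tests", nargs="+", default=["all"],
                        help="test names to launch, or 'all'")
    parser.add_argument("--repeat", type=int, default=1,
                        help="number of times to execute each test")
    args = parser.parse_args()

    selected = KNOWN_TESTS if "all" in args.tests else KNOWN_TESTS & set(args.tests)
    for name in sorted(selected):
        for i in range(1, args.repeat + 1):
            launch_test(name, i)

if __name__ == "__main__":
    main()
```

With something like this, one rgt.input could describe all tests, and a single invocation (`--tests GTC IMB --repeat 5`) would pick the subset and count for that launch.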
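For the shared-code item, one possible shape is a base class that owns make_batch_script and send_to_scheduler (the helpers named above), with per-test classes overriding only what differs; a build timer, per the timing item, fits naturally here. This is a sketch under assumed names, not the harness's actual structure:

```python
import subprocess
import time

class HarnessTest:
    """Common machinery shared by every test; subclasses override what differs."""
    name = "base"
    build_cmd = ["true"]  # overridden per test

    def build(self):
        # Time the build so long-build variability can be tracked.
        start = time.time()
        result = subprocess.run(self.build_cmd)
        elapsed = time.time() - start
        print(f"{self.name}: build took {elapsed:.1f}s (rc={result.returncode})")
        return result.returncode == 0

    def make_batch_script(self):
        # Placeholder: render the batch script from a shared template.
        return f"#!/bin/bash\n# job for {self.name}\n"

    def send_to_scheduler(self):
        # Placeholder: hand the rendered script to the site scheduler.
        script = self.make_batch_script()
        print(f"{self.name}: submitting\n{script}")

class GTCTest(HarnessTest):
    name = "GTC"
    build_cmd = ["echo", "building GTC"]  # placeholder for the real build command

if __name__ == "__main__":
    test = GTCTest()
    if test.build():
        test.send_to_scheduler()
```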
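For the metrics item, here is one way app-specific values could be appended to an rgt_metrics.txt alongside the status file. The column layout, the record_metric helper, and the numeric values are all placeholders, not an existing format:

```python
import time

def record_metric(path, test_name, metric, value, unit):
    # Append one app-specific metric per line, analogous to rgt_status.txt rows.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as f:
        f.write(f"{stamp}\t{test_name}\t{metric}\t{value}\t{unit}\n")

# Example entries drawn from the list above (values are made up):
record_metric("rgt_metrics.txt", "GTC", "checkpoint_restart_time", 42.7, "s")
record_metric("rgt_metrics.txt", "S3D", "io_rate", 1.9, "GB/s")
```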