Reconstruction of EM data using FCDenseNet looks promising. As such, FCDenseNet is a top candidate model for a GB run and/or an SC'19 paper submission.
Carry out performance (single GPU, for FLOPs) and scaling (multiple nodes, for communication) studies of FCDenseNet.
Input sizes from simulation will vary; some relevant sizes are [x,256,256], with x=16x16, 32x32, etc.
Output size from simulation is [256,256].
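To get a feel for the memory footprint of these sizes, here is a minimal sketch. It assumes x=16x16 means 256 channels (and 32x32 means 1024) and that data is stored as float32; both are assumptions, and the helper names are hypothetical.

```python
# Sketch: per-sample shapes and memory footprint for the sizes listed above.
# Assumptions (not confirmed here): x = grid * grid channels, float32 storage.

def input_shape(grid, height=256, width=256):
    """Return the per-sample input shape [x, height, width] with x = grid * grid."""
    return (grid * grid, height, width)

def nbytes(shape, bytes_per_elem=4):
    """Return the size in bytes of one array of the given shape (float32 by default)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_elem

OUTPUT_SHAPE = (256, 256)

if __name__ == "__main__":
    for grid in (16, 32):
        shape = input_shape(grid)
        print(f"input {shape}: {nbytes(shape) / 2**20:.2f} MiB per sample")
    print(f"output {OUTPUT_SHAPE}: {nbytes(OUTPUT_SHAPE) / 2**20:.2f} MiB per sample")
```

Under these assumptions a single 16x16 input is 64 MiB and a 32x32 input is 256 MiB, so batch size and I/O bandwidth are likely to matter for the scaling study.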
Necessary code (to build FCDenseNet) is in stemdl/network.
Benchmarks (might) require the following code mods:
- 1. Modify stemdl/inputs/DatasetTFRecords to generate batches of inputs+outputs on the fly.
- 2. Create dummy TFRecords (for relevant inputs+outputs) to assess the impact of I/O.
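A rough sketch of what on-the-fly generation (mod 1) could look like, using NumPy only. This is not the stemdl implementation; the function name, its parameters, and the use of uniform random float32 data are assumptions for illustration.

```python
import numpy as np

def synthetic_batches(batch_size, grid=16, height=256, width=256,
                      num_batches=None, seed=0):
    """Yield (inputs, targets) batches of random float32 data.

    inputs:  [batch_size, grid * grid, height, width]
    targets: [batch_size, height, width]
    Hypothetical helper; random values stand in for simulation data.
    """
    rng = np.random.default_rng(seed)
    in_shape = (batch_size, grid * grid, height, width)
    out_shape = (batch_size, height, width)
    produced = 0
    # Run forever when num_batches is None, otherwise stop after num_batches.
    while num_batches is None or produced < num_batches:
        inputs = rng.random(in_shape, dtype=np.float32)
        targets = rng.random(out_shape, dtype=np.float32)
        yield inputs, targets
        produced += 1

if __name__ == "__main__":
    # Small shapes here to keep the demo cheap; real runs would use 256x256.
    for inputs, targets in synthetic_batches(2, grid=2, height=8, width=8, num_batches=3):
        print(inputs.shape, targets.shape)
```

Because no file is read, timing a run fed by such a generator against a run fed by the dummy TFRecords (mod 2) would isolate the I/O cost. A generator like this could presumably be wrapped with `tf.data.Dataset.from_generator`; how it would plug into stemdl/inputs/DatasetTFRecords is left open here.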
Benchmarks can be quantified using:
- 1. GPU timeline traces.
- 2. Analytical FLOPs.
- 3. Model's data processing throughput as a function of ranks.
- Most of the necessary code is already implemented (in stemdl).
- Coordinate with Sean T. In particular, Sean has a binary that forces direct convolutions, which gives 2x performance.
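The analytical FLOPs and throughput-vs-ranks metrics listed above can be sketched as follows. The layer dimensions (64 output channels, 3x3 kernel) and the throughput numbers are placeholders, not FCDenseNet's actual configuration or measured results; the function names are hypothetical.

```python
def conv2d_flops(h_out, w_out, c_in, c_out, k_h, k_w):
    """Forward-pass FLOPs of one 2D convolution for a single sample.

    Counts 2 FLOPs per multiply-accumulate; ignores bias and activation.
    """
    return 2 * h_out * w_out * c_in * c_out * k_h * k_w

def scaling_efficiency(throughputs):
    """Return {ranks: efficiency} from {ranks: samples_per_second}.

    Efficiency is measured throughput divided by ideal linear scaling
    from the smallest rank count present in the input.
    """
    base_ranks = min(throughputs)
    per_rank_base = throughputs[base_ranks] / base_ranks
    return {ranks: rate / (ranks * per_rank_base)
            for ranks, rate in sorted(throughputs.items())}

if __name__ == "__main__":
    # Placeholder layer: 256 input channels (x=16x16), 64 output channels, 3x3 kernel.
    flops = conv2d_flops(256, 256, c_in=256, c_out=64, k_h=3, k_w=3)
    print(f"{flops / 1e9:.2f} GFLOPs per sample")

    # Placeholder throughputs (samples/s), not measurements.
    print(scaling_efficiency({1: 100.0, 2: 190.0, 4: 360.0}))
```

Summing `conv2d_flops` over all layers and dividing by the measured step time gives an achieved FLOP rate to compare against the timeline traces.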