Load root directories directly

When implementing STEMDataModule, we allowed custom datasets to override prepare_data(), which lets a dataset customize all of its data preprocessing. The result is written to a "root" directory (the --root argument) containing a directory of .pth or .npy files, an index.csv giving the locations and other parameters of those files, and the hparams and microscope parameters saved as JSON files.
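As a rough illustration of what reading such a root directory involves, here is a minimal sketch using only the standard library. The file names hparams.json and microscope.json, the "path" column in index.csv, and the load_root helper are all assumptions made for the example; the actual names in the repository may differ.

```python
import csv
import json
from pathlib import Path


def load_root(root):
    """Read the metadata of a preprocessed root directory.

    Assumed layout (file and column names are hypothetical):
        root/index.csv        one row per sample, with a "path" column
        root/hparams.json     preprocessing hyperparameters
        root/microscope.json  microscope parameters
    """
    root = Path(root)
    with open(root / "index.csv", newline="") as f:
        index = list(csv.DictReader(f))
    hparams = json.loads((root / "hparams.json").read_text())
    microscope = json.loads((root / "microscope.json").read_text())
    # Resolve each sample path relative to the root so the directory
    # can be moved between machines without rewriting the index.
    for row in index:
        row["path"] = root / row["path"]
    return index, hparams, microscope
```

Storing paths relative to the root, as assumed here, is what would make the directory portable between machines.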

For large datasets that have already been processed, it would be convenient to load these preprocessed root directories directly, instead of specifying all of the datamodule parameters and keeping the unprocessed data around. For this, it would be nice to have another datamodule, named preprocessed or similar, that takes only the default parameters (including --root). This would let us pass these directories around, e.g. to OLCF, and would simplify our command-line arguments considerably.

Related: #4 (closed)

Plan

All that's needed is to implement this inside STEMDataModule. We'll need to move the --root argument and the setup() method there, so that PTODataModule essentially implements only prepare_data(). Then we'll add the preprocessed datamodule name to data/__init__.py and test that we can load it properly.
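The intended split could look roughly like the sketch below. It is a plain-Python stand-in, not the actual implementation: the real classes presumably derive from a framework datamodule base class, which is omitted here so the example runs on its own. The DATAMODULES registry, the "pto" key, and the placeholder preprocessing are hypothetical.

```python
import csv
from pathlib import Path


class STEMDataModule:
    """Base module: owns `root` and `setup()`; `prepare_data()` is a no-op."""

    def __init__(self, root):
        self.root = Path(root)
        self.index = None

    def prepare_data(self):
        # Nothing to do: the root directory is assumed to be preprocessed.
        pass

    def setup(self, stage=None):
        # Load the sample index written by preprocessing.
        with open(self.root / "index.csv", newline="") as f:
            self.index = list(csv.DictReader(f))


class PTODataModule(STEMDataModule):
    """Dataset-specific module: only overrides preprocessing."""

    def prepare_data(self):
        # Placeholder preprocessing: writes an index with a single sample.
        self.root.mkdir(parents=True, exist_ok=True)
        with open(self.root / "index.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["path"])
            writer.writerow(["sample_0.npy"])


# Hypothetical name-to-class registry, as might live in data/__init__.py.
DATAMODULES = {"pto": PTODataModule, "preprocessed": STEMDataModule}
```

With this layout, selecting the preprocessed name simply instantiates the base class, so loading an existing root needs no dataset-specific arguments.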