# `STEMDL`
A Python package for distributed deep learning with a special focus on inverse problems in materials imaging.  
`stemdl` was used in the following (applied and fundamental) deep learning research projects:   
1. *3-D reconstruction of Structural Distortions from Electron Microscopy* ([Link to Paper](https://arxiv.org/abs/1902.06876))  
2. *27,600 V100 GPUs and 7MW(h) of Power to solve an age-old scientific inverse problem* ([Link to Paper](https://arxiv.org/abs/1909.11150) and [Medium story](https://medium.com/syncedreview/nvidia-ornl-researchers-train-ai-model-on-worlds-top-supercomputer-using-27-600-nvidia-gpus-1165e0d5da7b))
3. *YNet: a Physics-Constrained and Semi-Supervised Learning Approach to Inverse Problems*
---
#### Getting Started
See the __scripts__ folder for the following:  
1. __stemdl_run.py__:    
  Python script. Runs from the CLI to set up neural networks and start training/evaluation operations.
2. __generate_json.py__:  
  Python script. Generates the .json files needed as input for stemdl_run.py.
---
#### Brief description of Modules:  
1. __inputs.py__:  
  Classes to read training/evaluation data, create training batches, and apply image transformations.  
  Can handle I/O operations on TFRecords, NumPy arrays, and LMDB files.
2. __network.py__:  
  Classes to set up various kinds of neural networks (ConvNets, ResNets, etc.).  
3. __runtime.py__:  
  Functions and classes to perform (low-level) network training/evaluation.
4. __io_utils.py__:  
  Functions to generate the .json input files for model architectures, hyperparameters, and training-run configurations.
5. __losses.py__:   
   Functions to generate and manipulate loss functions
6. __optimizers.py__:   
   Optimizer setup, gradient pre-processing, and gradient reduction.
7. __automatic_loss_scaler.py__:  
   Python module for dynamic loss scaling during fp16 training (taken as is from [OpenSeq2Seq](https://nvidia.github.io/OpenSeq2Seq/html/index.html))
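
The idea behind dynamic loss scaling can be sketched in a few lines of plain Python. This is a generic backoff-style scaler written for illustration only. It is not the code in `automatic_loss_scaler.py` or OpenSeq2Seq, and the class name `DynamicLossScaler`, its parameters, and its defaults are all assumptions. The loss is multiplied by `scale` before backpropagation so that small fp16 gradients do not underflow. If any gradient overflows to inf/NaN, the step is skipped and the scale is reduced. After `window` consecutive clean steps, the scale is increased again.

```python
import math


class DynamicLossScaler:
    """Hypothetical backoff-style loss scaler (illustrative, not stemdl's code)."""

    def __init__(self, init_scale=2.0 ** 15, factor=2.0, window=2000, min_scale=1.0):
        self.scale = init_scale      # current multiplier applied to the loss
        self.factor = factor         # grow/shrink factor for the scale
        self.window = window         # clean steps required before growing
        self.min_scale = min_scale   # lower bound for the scale
        self._good_steps = 0         # consecutive steps without overflow

    def update(self, scaled_grads):
        """Inspect scaled gradients; return True if the step should be applied."""
        overflow = any(not math.isfinite(g) for g in scaled_grads)
        if overflow:
            # Overflow: shrink the scale, reset the counter, skip this step.
            self.scale = max(self.scale / self.factor, self.min_scale)
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps >= self.window:
            # Enough clean steps in a row: try a larger scale.
            self.scale *= self.factor
            self._good_steps = 0
        return True

    def unscale(self, scaled_grads):
        """Divide gradients by the current scale before the optimizer update."""
        return [g / self.scale for g in scaled_grads]
```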
---
#### Software Requirements:  
1. __numpy__ >= 1.13
2. __tensorflow__ >= 1.2
3. __python__ 3.6
4. __horovod__ >= 0.16
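
A quick way to check an environment against these minimums is sketched below. It is not part of `stemdl`. The helper names (`version_tuple`, `meets_minimum`, `check_requirements`) are hypothetical, and the sketch assumes each package exposes a `__version__` attribute and uses simple dotted numeric versions.

```python
import importlib
import re

# Minimum versions copied from the list above.
MIN_VERSIONS = {"numpy": "1.13", "tensorflow": "1.2", "horovod": "0.16"}


def version_tuple(version):
    """Turn '1.13.3' or '2.4.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically, not lexically."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MIN_VERSIONS):
    """Return a {package: status} report for the current environment."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        installed = getattr(module, "__version__", None)
        if installed is None:
            report[name] = "unknown version"
        elif meets_minimum(installed, minimum):
            report[name] = "ok ({})".format(installed)
        else:
            report[name] = "too old ({} < {})".format(installed, minimum)
    return report
```

Calling `check_requirements()` in the target environment returns a dictionary with one status string per package. The numeric comparison matters because a plain string comparison would wrongly rank `1.2` above `1.13`.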

#### Hardware Requirements:  
1. At least one CUDA-compatible GPU

#### Install:
The project is not yet on PyPI. For now, install from source:
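```bash
# Clone the repository (the URL is not given in this README)
git clone <repository-url>
cd stemdl
# Install the package into the current Python environment
pip install .
```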