Basic DDP example
We should implement beam parallelism with the existing PyTorch distributed backend. Based on the lightning.py example, this may be as simple as adding a command line option, but it will require testing on a multi-GPU or multi-node machine, ideally Summit.
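As a rough sketch of what that option could look like, assuming lightning.py builds its Trainer from argparse flags: the flag names below (`--gpus`, `--num_nodes`, `--accelerator`) are illustrative, and the accepted accelerator values depend on the installed Lightning version.

```python
# Hypothetical sketch: wire a distributed-backend choice into the Trainer.
import argparse
import pytorch_lightning as pl

parser = argparse.ArgumentParser()
parser.add_argument("--gpus", type=int, default=1,
                    help="GPUs per node (Summit nodes have 6)")
parser.add_argument("--num_nodes", type=int, default=1,
                    help="number of nodes in the job")
parser.add_argument("--accelerator", type=str, default=None,
                    choices=["ddp", "ddp2"],
                    help="distributed backend to use, if any")
args = parser.parse_args()

trainer = pl.Trainer(
    gpus=args.gpus,
    num_nodes=args.num_nodes,
    accelerator=args.accelerator,  # None keeps single-process behavior
)
# trainer.fit(model, dataloader) would follow as in the existing script.
```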
Plan
Test whether lightning.py runs with the ddp or ddp2 accelerators on Summit. If not, fix it here, or file separate issues for any larger problems we need to address.
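Before touching lightning.py itself, a standalone smoke test could confirm that DDP initializes at all on a Summit node. This is a minimal sketch, assuming a Lightning version where `Trainer(accelerator='ddp')` is accepted (later releases moved this to `strategy='ddp'`); the model and data are throwaway.

```python
# Smoke test: train a trivial model for one epoch under DDP on 2 GPUs.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class SmokeModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

if __name__ == "__main__":
    data = TensorDataset(torch.randn(256, 32), torch.randn(256, 2))
    trainer = pl.Trainer(gpus=2, accelerator="ddp", max_epochs=1)
    trainer.fit(SmokeModel(), DataLoader(data, batch_size=32))
```

If this runs, the remaining work is plumbing the same Trainer arguments through lightning.py; if it hangs or crashes, the problem is likely the distributed environment on Summit rather than our script.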