Commit 3f57f46a authored by Robert Clark

Add seq-length argument to distributed BERT example



The --seq-length argument was missing from the example distributed BERT
pre-training script, which caused an error during argument parsing because
the code expects every argument to be specified with a non-None value.
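
For context, the failure mode resembles the minimal sketch below. This is a
hypothetical illustration, not Megatron-LM's actual argument-parsing code: the
parser defaults required arguments to None and a later check rejects any
argument that was left unset.

    import argparse

    def parse_args():
        parser = argparse.ArgumentParser(description='BERT pre-training (illustration)')
        # Every model argument defaults to None; downstream code requires a real value.
        parser.add_argument('--seq-length', type=int, default=None)
        parser.add_argument('--max-position-embeddings', type=int, default=None)
        args = parser.parse_args()

        # Leaving an argument unset, as the example script did with --seq-length,
        # trips this non-None check and aborts before training starts.
        for name in ('seq_length', 'max_position_embeddings'):
            if getattr(args, name) is None:
                raise ValueError('--{} must be specified and have a non-None value'
                                 .format(name.replace('_', '-')))
        return args

    if __name__ == '__main__':
        print(parse_args())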

Signed-off-by: Robert Clark <roclark@nvidia.com>
parent 90e0a0dd
+1 −0
@@ -23,6 +23,7 @@ python -m torch.distributed.launch $DISTRIBUTED_ARGS \
       --num-attention-heads 16 \
       --micro-batch-size 2 \
       --global-batch-size 16 \
+     --seq-length 512 \
       --max-position-embeddings 512 \
       --train-iters 1000000 \
       --save $CHECKPOINT_PATH \