pytorch_tutorial (Tsaris, Aristeidis (aris))
Commit 3454fd8a
authored Oct 25, 2021 by Tsaris, Aristeidis
clean up
parent 4483a884
Changes 4
ascent/README.md
deleted 100644 → 0
# imagenet_tutorial
This code is from [NVIDIA-DeepLearningExamples](https://github.com/NVIDIA/DeepLearningExamples) with some modifications.
\ No newline at end of file
ascent/export_DDP_envvars.sh
deleted 100644 → 0
export RANK=$OMPI_COMM_WORLD_RANK
export LOCAL_RANK=$OMPI_COMM_WORLD_LOCAL_RANK
export WORLD_SIZE=$OMPI_COMM_WORLD_SIZE
export MASTER_ADDR=$(cat $LSB_DJOB_HOSTFILE | sort | uniq | grep -v batch | grep -v login | head -1)
export MASTER_PORT=29500 # default from torch launcher
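The deleted file above copies Open MPI's per-process variables into the names that PyTorch's environment-variable rendezvous reads (`RANK`, `LOCAL_RANK`, `WORLD_SIZE`, `MASTER_ADDR`, `MASTER_PORT`), and takes the first compute host in the LSF host file as the master address. The following is only a sketch of the same mapping wrapped in a function. The function name `export_ddp_envvars` and the fallback values for runs started without Open MPI or LSF are assumptions for illustration and are not part of the original commit.

```shell
#!/bin/bash
# Hypothetical helper: same mapping as export_DDP_envvars.sh, plus fallback
# defaults (assumed, not in the original) for runs outside Open MPI / LSF.
export_ddp_envvars() {
    export RANK="${OMPI_COMM_WORLD_RANK:-0}"
    export LOCAL_RANK="${OMPI_COMM_WORLD_LOCAL_RANK:-0}"
    export WORLD_SIZE="${OMPI_COMM_WORLD_SIZE:-1}"

    if [ -n "${LSB_DJOB_HOSTFILE:-}" ] && [ -r "${LSB_DJOB_HOSTFILE}" ]; then
        # First unique host that is neither a batch nor a login node,
        # using the same filter as the original one-liner.
        MASTER_ADDR="$(sort "${LSB_DJOB_HOSTFILE}" | uniq | grep -v batch | grep -v login | head -1)"
    else
        MASTER_ADDR="localhost"  # assumed fallback for a single-host run
    fi
    export MASTER_ADDR
    export MASTER_PORT="${MASTER_PORT:-29500}"
}

export_ddp_envvars
echo "rank ${RANK}/${WORLD_SIZE} (local ${LOCAL_RANK}) -> ${MASTER_ADDR}:${MASTER_PORT}"
```

Because the function only reads variables that are already set by the launcher, sourcing it inside the `bash -c` command of each `jsrun` task would give every rank its own values, in the same way the original `source export_DDP_envvars.sh` step does.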
ascent/sub_ibmwml.lsf
deleted 100755 → 0
#!/bin/bash
# Begin LSF directives
#BSUB -P stf011
#BSUB -J sc21
#BSUB -o logs/sc21.o%J
#BSUB -W 0:30
#BSUB -nnodes 1
#BSUB -alloc_flags "nvme smt4"
####BSUB -N
# End LSF directives and begin shell commands
nnodes=$(cat ${LSB_DJOB_HOSTFILE} | sort | uniq | grep -v login | grep -v batch | wc -l)
DATA_DIR=/gpfs/wolf/gen166/proj-shared/atsaris/imagenet/data/
LOG_DIR=logs/
#source /gpfs/wolf/stf011/proj-shared/28t/env/open-ce/activate.sh
source /gpfs/wolf/stf011/proj-shared/28t/env/ibm-wml/activate.sh
jsrun --smpiargs="-disable_gpu_hooks" -n ${nnodes} -a1 -c42 -g1 -r1 \
    --bind=proportional-packed:7 --launch_distribution=packed \
    bash -c "\
    source export_DDP_envvars.sh && \
    python -u ../imagenet/main.py \
    --arch resnet50 \
    -j 28 \
    -p 10 \
    -b 128 \
    --training-only \
    --raport-file ${LOG_DIR}/synthetic.ibmwml.1GPU.json \
    --epochs 1 \
    --prof 100 \
    --no-checkpoints \
    --data-backend sythetic \
    --amp \
    --memory-format nhwc \
    ${DATA_DIR}"
ascent/sub_opence.lsf
deleted 100755 → 0
#!/bin/bash
# Begin LSF directives
#BSUB -P stf011
#BSUB -J sc21
#BSUB -o logs/sc21.o%J
#BSUB -W 0:30
#BSUB -nnodes 1
#BSUB -alloc_flags "nvme smt4"
####BSUB -N
# End LSF directives and begin shell commands
nnodes=$(cat ${LSB_DJOB_HOSTFILE} | sort | uniq | grep -v login | grep -v batch | wc -l)
DATA_DIR=/gpfs/wolf/gen166/proj-shared/atsaris/imagenet/data/
LOG_DIR=logs/
source /gpfs/wolf/stf011/proj-shared/28t/env/open-ce/activate.sh
#source /gpfs/wolf/stf011/proj-shared/28t/env/ibm-wml/activate.sh
jsrun --smpiargs="-disable_gpu_hooks" -n ${nnodes} -a1 -c42 -g1 -r1 \
    --bind=proportional-packed:7 --launch_distribution=packed \
    bash -c "\
    source export_DDP_envvars.sh && \
    python -u ../imagenet/main.py \
    --arch resnet50 \
    -j 28 \
    -p 10 \
    -b 128 \
    --training-only \
    --raport-file ${LOG_DIR}/pytorch.opence.1GPU.json \
    --epochs 1 \
    --prof 100 \
    --no-checkpoints \
    --data-backend pytorch \
    --amp \
    --memory-format nhwc \
    ${DATA_DIR}"