config/philly/2-gpu.yaml (+5 −5)

```diff
@@ -13,15 +13,15 @@ system:
   cpus_per_node: 2
   cores_per_cpu: 20
   gpus_per_node: 2
-  cpu_peak_flops: 1248000000000.0
-  gpu_peak_flops: 7800000000000.0
+  cpu_peak_flops: 1248000000000.0 # assume Xeon E5-2690v4 CPU 64-bit
+  gpu_peak_flops: 9300000000000.0 # assume 12G P100 32-bit
   cpu_fp_ratio: 0.667
   gpu_fp_ratio: 0.667
 power:
-  power_gpu_idle: 75
-  power_gpu_max: 300
+  power_gpu_idle: 30
+  power_gpu_max: 250
   power_cpu_idle: 90
-  power_cpu_max: 280
+  power_cpu_max: 270
   power_mem: 74.26
   power_nvme: 30
   power_nic: 20
```
config/philly/8-gpu.yaml (+5 −5)

```diff
@@ -13,15 +13,15 @@ system:
   cpus_per_node: 2
   cores_per_cpu: 20
   gpus_per_node: 8
-  cpu_peak_flops: 1248000000000.0
-  gpu_peak_flops: 7800000000000.0
+  cpu_peak_flops: 1248000000000.0 # assume Xeon E5-2690v4 CPU 64-bit
+  gpu_peak_flops: 12000000000000.0 # assume 24G P40 32-bit
   cpu_fp_ratio: 0.667
   gpu_fp_ratio: 0.667
 power:
-  power_gpu_idle: 50
-  power_gpu_max: 300
+  power_gpu_idle: 50
+  power_gpu_max: 250
   power_cpu_idle: 90
-  power_cpu_max: 280
+  power_cpu_max: 270
   power_mem: 74.26
   power_nvme: 30
   power_nic: 20
```
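The new FP32 peaks in both files match NVIDIA's published spec sheets for the assumed cards. As a quick cross-check (a sketch, not part of the repo; the CUDA core counts and boost clocks below come from NVIDIA's P100 PCIe and P40 datasheets):

```python
# Cross-check (not part of the repo): reproduce the configured gpu_peak_flops
# values from published NVIDIA specs. FP32 peak for Pascal GPUs is
# CUDA cores x boost clock x 2 (one fused multiply-add = 2 FLOPs per cycle).

GPU_SPECS = {
    # file / card: (CUDA cores, boost clock in Hz, value configured in the YAML)
    "2-gpu.yaml / P100 12G": (3584, 1.303e9, 9.3e12),
    "8-gpu.yaml / P40 24G": (3840, 1.531e9, 12.0e12),
}

for name, (cores, boost_hz, configured) in GPU_SPECS.items():
    peak = cores * boost_hz * 2
    print(f"{name}: computed {peak / 1e12:.2f} TFLOPS, "
          f"configured {configured / 1e12:.1f} TFLOPS")

# 2-gpu.yaml / P100 12G: computed 9.34 TFLOPS, configured 9.3 TFLOPS
# 8-gpu.yaml / P40 24G: computed 11.76 TFLOPS, configured 12.0 TFLOPS
```

Both cards are also rated at a 250 W TDP, which is consistent with the new `power_gpu_max: 250` in both configs.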
raps/dataloaders/philly.py (+12 −1)

```diff
 """
-Main reference to Philly traces:
+This is the dataloader for the Philly traces, which are documented in this paper:
+
 Jeon, Myeongjae, et al. "Analysis of Large-Scale Multi-Tenant GPU clusters for
 DNN training workloads." 2019 USENIX Annual Technical Conference (USENIX ATC 19). 2019.
 https://www.usenix.org/system/files/atc19-jeon.pdf

+Note on hardware specs:
+Philly only provides GPU memory sizes (12G & 24G) without clarifying the GPU
+models; see also Hu et al. (2024):
+https://arxiv.org/html/2403.07648v1
+For estimating system power and FLOPS performance, we assume that the 2-GPU
+nodes used Tesla P100 (12 GB) GPUs and the 8-GPU nodes used Tesla P40 (24 GB)
+GPUs, consistent with the hardware Microsoft deployed around 2017. Training is
+assumed to have been performed in 32-bit (FP32), and the CPUs are assumed to
+be 64-bit Intel Xeon E5-2690 v4.
+
 The repository is available here:
 https://github.com/msr-fiddle/philly-traces
 """
```
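To make the docstring's assumptions concrete, here is a minimal sketch of how the updated config values combine into node-level peak FLOPS and a power envelope. This is not the RAPS API; the top-level `power` section shown in the diffs and the linear idle-to-max power model are assumptions for illustration:

```python
# Minimal sketch (not the RAPS API): load one of the updated configs and derive
# node-level peak FLOPS plus a power envelope. The "power" section is read as
# top-level per the diff layout; the linear model is an assumption here.
import yaml

with open("config/philly/8-gpu.yaml") as f:
    cfg = yaml.safe_load(f)

sys_cfg, pwr = cfg["system"], cfg["power"]

# 8 x 12.0 TFLOPS (P40 FP32) + 2 x 1.248 TFLOPS (Xeon E5-2690v4) ~= 98.5 TFLOPS
node_peak_flops = (sys_cfg["gpus_per_node"] * sys_cfg["gpu_peak_flops"]
                   + sys_cfg["cpus_per_node"] * sys_cfg["cpu_peak_flops"])

def node_power(util: float) -> float:
    """Interpolate each component linearly between its idle and max draw."""
    gpu = pwr["power_gpu_idle"] + util * (pwr["power_gpu_max"] - pwr["power_gpu_idle"])
    cpu = pwr["power_cpu_idle"] + util * (pwr["power_cpu_max"] - pwr["power_cpu_idle"])
    return (sys_cfg["gpus_per_node"] * gpu
            + sys_cfg["cpus_per_node"] * cpu
            + pwr["power_mem"] + pwr["power_nvme"] + pwr["power_nic"])

print(f"peak: {node_peak_flops / 1e12:.1f} TFLOPS")        # 98.5 TFLOPS
print(f"idle: {node_power(0.0):.0f} W, max: {node_power(1.0):.0f} W")
# idle: 8*50 + 2*90 + 74.26 + 30 + 20 = 704 W
# max:  8*250 + 2*270 + 74.26 + 30 + 20 = 2664 W
```

Under the old values (300 W max per GPU, 280 W per CPU), the same 8-GPU node would top out at 3084 W, so the corrected P40/P100 figures lower the estimated full-load draw by roughly 14%.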