Commit f397377b authored by Brewer, Wes's avatar Brewer, Wes

Refined and documented the hardware TDP and performance specs that were used for Philly

parent cdff947d
+5 −5
@@ -13,15 +13,15 @@ system:
   cpus_per_node: 2
   cores_per_cpu: 20
   gpus_per_node: 2
-  cpu_peak_flops: 1248000000000.0
-  gpu_peak_flops: 7800000000000.0
+  cpu_peak_flops: 1248000000000.0 # assume Xeon E5-2690v4 CPU 64-bit
+  gpu_peak_flops: 9300000000000.0 # assume 12G P100 32-bit
   cpu_fp_ratio: 0.667
   gpu_fp_ratio: 0.667
 power:
-  power_gpu_idle: 75
-  power_gpu_max: 300
+  power_gpu_idle: 30
+  power_gpu_max: 250
   power_cpu_idle: 90
-  power_cpu_max: 280
+  power_cpu_max: 270
   power_mem: 74.26
   power_nvme: 30
   power_nic: 20
+5 −5
@@ -13,15 +13,15 @@ system:
   cpus_per_node: 2
   cores_per_cpu: 20
   gpus_per_node: 8
-  cpu_peak_flops: 1248000000000.0
-  gpu_peak_flops: 7800000000000.0
+  cpu_peak_flops: 1248000000000.0  # assume Xeon E5-2690v4 CPU 64-bit
+  gpu_peak_flops: 12000000000000.0 # assume 24G P40 32-bit
   cpu_fp_ratio: 0.667
   gpu_fp_ratio: 0.667
 power:
-  power_gpu_idle: 75
-  power_gpu_max: 300
+  power_gpu_idle: 50
+  power_gpu_max: 250
   power_cpu_idle: 90
-  power_cpu_max: 280
+  power_cpu_max: 270
   power_mem: 74.26
   power_nvme: 30
   power_nic: 20
+12 −1
"""
This is the dataloader for the Philly traces, which are documented in this paper:

    Jeon, Myeongjae, et al. "Analysis of Large-Scale Multi-Tenant GPU clusters for DNN training workloads." 
    2019 USENIX Annual Technical Conference (USENIX ATC 19). 2019.
    https://www.usenix.org/system/files/atc19-jeon.pdf

Note on hardware specs:

    Philly only provides GPU memory sizes (12 GB and 24 GB) without specifying
    the GPU models; see also Hu et al. (2024), https://arxiv.org/html/2403.07648v1

    For estimating system power and FLOPS performance, we assume that the 2-GPU
    nodes used Tesla P100 (12 GB) GPUs and the 8-GPU nodes used Tesla P40 (24 GB) 
    GPUs, consistent with hardware Microsoft deployed around 2017. Training is 
    assumed to have been performed in 32-bit (FP32), and the CPUs are assumed 
    to be 64-bit Intel Xeon E5-2690 v4.
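
As an illustration of how these assumed specs fit together (this is a hypothetical
sketch, not code from the repository: the function names and the linear
interpolation model are assumptions), node power could be estimated by
interpolating each component between its idle and max draw from the 2-GPU (P100)
config, and sustained throughput by derating peak FLOPS with the fp_ratio:

```python
# Hypothetical sketch, not code from the philly-traces repo: estimate node
# power from the 2-GPU (P100) node config above by linearly interpolating
# CPU and GPU draw between their idle and max values.

POWER = {  # watts, from the 2-GPU node config above
    "power_gpu_idle": 30, "power_gpu_max": 250,
    "power_cpu_idle": 90, "power_cpu_max": 270,
    "power_mem": 74.26, "power_nvme": 30, "power_nic": 20,
}

def node_power(cpu_util, gpu_util, cpus=2, gpus=2, p=POWER):
    """Estimated node power (W) at CPU/GPU utilizations in [0, 1]."""
    cpu_w = cpus * (p["power_cpu_idle"]
                    + cpu_util * (p["power_cpu_max"] - p["power_cpu_idle"]))
    gpu_w = gpus * (p["power_gpu_idle"]
                    + gpu_util * (p["power_gpu_max"] - p["power_gpu_idle"]))
    return cpu_w + gpu_w + p["power_mem"] + p["power_nvme"] + p["power_nic"]

def effective_flops(peak_flops, fp_ratio=0.667):
    """Rough sustained throughput: peak FLOPS derated by the fp_ratio."""
    return peak_flops * fp_ratio
```

Here `node_power(0.0, 0.0)` gives the idle floor (memory, NVMe, and NIC draw
are treated as constant) and `node_power(1.0, 1.0)` the maximum.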

The repository is available here:

    https://github.com/msr-fiddle/philly-traces