Commit f397377b authored by Brewer, Wes's avatar Brewer, Wes

Refined and documented the hardware TDP and performance specs that were used for Philly

parent cdff947d
+5 −5
@@ -13,15 +13,15 @@ system:
   cpus_per_node: 2
   cores_per_cpu: 20
   gpus_per_node: 2
-  cpu_peak_flops: 1248000000000.0
-  gpu_peak_flops: 7800000000000.0
+  cpu_peak_flops: 1248000000000.0 # assume Xeon E5-2690v4 CPU 64-bit
+  gpu_peak_flops: 9300000000000.0 # assume 12G P100 32-bit
   cpu_fp_ratio: 0.667
   gpu_fp_ratio: 0.667
 power:
-  power_gpu_idle: 75
-  power_gpu_max: 300
+  power_gpu_idle: 30
+  power_gpu_max: 250
   power_cpu_idle: 90
-  power_cpu_max: 280
+  power_cpu_max: 270
   power_mem: 74.26
   power_nvme: 30
   power_nic: 20
+5 −5
@@ -13,15 +13,15 @@ system:
   cpus_per_node: 2
   cores_per_cpu: 20
   gpus_per_node: 8
-  cpu_peak_flops: 1248000000000.0
-  gpu_peak_flops: 7800000000000.0
+  cpu_peak_flops: 1248000000000.0  # assume Xeon E5-2690v4 CPU 64-bit
+  gpu_peak_flops: 12000000000000.0 # assume 24G P40 32-bit
   cpu_fp_ratio: 0.667
   gpu_fp_ratio: 0.667
 power:
-  power_gpu_idle: 75
-  power_gpu_max: 300
+  power_gpu_idle: 50
+  power_gpu_max: 250
   power_cpu_idle: 90
-  power_cpu_max: 280
+  power_cpu_max: 270
   power_mem: 74.26
   power_nvme: 30
   power_nic: 20
+12 −1
"""
This is the dataloader for the Philly traces, which are documented in this paper:

    Jeon, Myeongjae, et al. "Analysis of Large-Scale Multi-Tenant GPU clusters for DNN training workloads." 
    2019 USENIX Annual Technical Conference (USENIX ATC 19). 2019.
    https://www.usenix.org/system/files/atc19-jeon.pdf

Note on hardware specs:

    Philly only provides GPU memory sizes (12 GB and 24 GB) without specifying
    the GPU models; see also Hu et al. (2024), https://arxiv.org/html/2403.07648v1

    For estimating system power and FLOPS performance, we assume that the 2-GPU
    nodes used Tesla P100 (12 GB) GPUs and the 8-GPU nodes used Tesla P40 (24 GB) 
    GPUs, consistent with hardware Microsoft deployed around 2017. Training is 
    assumed to have been performed in 32-bit (FP32), and the CPUs are assumed 
    to be 64-bit Intel Xeon E5-2690 v4.
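
As an illustration of how these assumed specs fit together (this is a hypothetical
sketch, not code from the repository: the function names and the linear
interpolation model are assumptions), node power could be estimated by
interpolating each component between its idle and max draw from the 2-GPU (P100)
config, and sustained throughput by derating peak FLOPS with the fp_ratio:

```python
# Hypothetical sketch, not code from the philly-traces repo: estimate node
# power from the 2-GPU (P100) node config above by linearly interpolating
# CPU and GPU draw between their idle and max values.

POWER = {  # watts, from the 2-GPU node config above
    "power_gpu_idle": 30, "power_gpu_max": 250,
    "power_cpu_idle": 90, "power_cpu_max": 270,
    "power_mem": 74.26, "power_nvme": 30, "power_nic": 20,
}

def node_power(cpu_util, gpu_util, cpus=2, gpus=2, p=POWER):
    """Estimated node power (W) at CPU/GPU utilizations in [0, 1]."""
    cpu_w = cpus * (p["power_cpu_idle"]
                    + cpu_util * (p["power_cpu_max"] - p["power_cpu_idle"]))
    gpu_w = gpus * (p["power_gpu_idle"]
                    + gpu_util * (p["power_gpu_max"] - p["power_gpu_idle"]))
    return cpu_w + gpu_w + p["power_mem"] + p["power_nvme"] + p["power_nic"]

def effective_flops(peak_flops, fp_ratio=0.667):
    """Rough sustained throughput: peak FLOPS derated by the fp_ratio."""
    return peak_flops * fp_ratio
```

Here `node_power(0.0, 0.0)` gives the idle floor (memory, NVMe, and NIC draw
are treated as constant) and `node_power(1.0, 1.0)` the maximum.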

The repository is available here:

    https://github.com/msr-fiddle/philly-traces