- Rafael Ferreira da Silva (silvarf@ornl.gov)

## Purpose of Benchmark

The objective of this workflow benchmark is to assess the capability of a High-Performance Computing (HPC) system to support dynamic workloads originating from various data stream sources. These workloads are processed within the compute nodes, and the processed data is then made available to a wide array of consumers, each potentially consuming a unique abstraction of the data. We target an ML4NSE (Machine Learning for Neutron Scattering Experiment) application that employs a Temporal Fusion Transformer (TFT) model to both train on and predict the measurement time for a distinct cluster of peaks. This cluster includes a robust nuclear peak along with six weaker satellite peaks arising from the magnetic ordering within a single-crystal sample. The primary objective of this code is to enable near real-time decision-making by leveraging the combined power of Machine Learning (ML) and HPC. This benchmark provides a self-contained, end-to-end evaluation of a coupled compute/data problem.

## Characteristics of Benchmark

Data is channeled from multiple sources to a gateway node using an open-source streaming transport layer capable of handling structured data, not just byte sequences (e.g., ZeroMQ, RabbitMQ). One input stream to the gateway node may serve as a control stream, directing how to multiplex, filter, window, or otherwise service the other data streams. The gateway thus produces a single stream of structured data, possibly distinct in structure from the input streams, which is forwarded to a listener at the Services Cluster, ensuring efficient information flow.
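The control-driven multiplexing described above can be sketched in transport-agnostic form. This is a minimal illustration, not part of the benchmark code: the event fields (`source`, `action`) and the `multiplex` function are hypothetical names, and the actual implementation would sit behind a transport layer such as ZeroMQ or RabbitMQ rather than operate on in-memory lists.

```python
def multiplex(control_events, data_streams):
    """Merge several named input streams into one structured stream,
    following directives carried on a control stream.

    control_events: iterable of dicts like {"source": name, "action": "forward" | "drop"}
    data_streams:   dict mapping source name -> list of payload dicts
    Returns one ordered list of structured events tagged with their origin.
    """
    # Build the per-source servicing policy from the control stream.
    policy = {ev["source"]: ev["action"] for ev in control_events}

    merged = []
    for source, payloads in data_streams.items():
        if policy.get(source, "forward") == "drop":
            continue  # control stream told us to filter this source out
        for payload in payloads:
            # The output structure (source tag + payload) may differ
            # from the structure of the raw input streams.
            merged.append({"source": source, "payload": payload})
    return merged
```

In a real deployment the same policy lookup would be applied per message as it arrives at the gateway, before forwarding to the Services Cluster listener.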

The integration between the source streams (at the edge/service nodes) and the ML4NSE (running within the compute nodes) facilitates the swift analysis of data generated during an experiment, offering timely feedback to researchers (consumers). Such prompt insights allow for the precise adjustment of experimental parameters, enhancing the effectiveness and responsiveness of the ongoing research.

The Data Orchestrator process has connectivity to both the fast, reliable internal network of the Compute Cluster (e.g., RDMA, but potentially other technologies) and a network physically disjoint from that internal network. It can potentially communicate with any process in the job running on the Compute Cluster. Periodically, the Data Orchestrator creates output data events based on the inputs it receives from the compute job. These events are forwarded to the current subscriber set on one or more output data streams. The output events may be customized (filtered/windowed/transformed) differently for different data streams: for instance, one output stream may guarantee no more than one event per wall-clock second, while another may send only summarized data based on input from all processes in the compute job.
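The per-stream customization above (rate limiting, summarization) can be sketched as a small policy object. This is an illustrative assumption, not the benchmark's actual orchestrator: the class name `OutputStream` and its parameters (`min_interval`, `summarize`) are hypothetical, and the clock is passed in explicitly so the policy logic is easy to exercise.

```python
class OutputStream:
    """One subscriber-facing output stream with its own delivery policy."""

    def __init__(self, min_interval=0.0, summarize=None):
        self.min_interval = min_interval   # e.g. 1.0 => at most one event per second
        self.summarize = summarize         # optional reducer over a batch of inputs
        self._last_sent = float("-inf")
        self.delivered = []                # stand-in for the actual transport send

    def offer(self, events, now):
        """Offer a batch of compute-job inputs; deliver if policy allows."""
        # Enforce the per-stream wall-clock rate guarantee.
        if now - self._last_sent < self.min_interval:
            return
        # Either summarize the whole batch or forward the latest event.
        payload = self.summarize(events) if self.summarize else events[-1]
        self.delivered.append(payload)
        self._last_sent = now
```

For example, one subscriber might use `OutputStream(min_interval=1.0)` while another uses `OutputStream(summarize=lambda evs: sum(evs) / len(evs))` to receive only aggregated values.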

## Mechanics of Building Benchmark

```
bash setup.sh
```

## Mechanics of Running Benchmark
```
sbatch job.sh
```

## Run Rules

## Figure of Merit