Update section headers and add section of software design. (1fee604c) · Commits · Cianciosa, Mark / graph_framework

graph_paper/paper.bib

+14 −3

Original line number	Diff line number	Diff line
		@@ -290,7 +290,7 @@
		Program summary
		Program Title: Dream CPC Library link to program files: https://doi.org/10.17632/vs3yvnrzg6.1 Developer's repository link: https://github.com/chalmersplasmatheory/DREAM Licensing provisions: MIT Programming language: C++, Python Nature of problem: Self-consistently simulates the plasma evolution in a tokamak disruption, with specific emphasis on runaway electron dynamics. The runaway electrons can be simulated either as a fluid, fully kinetically, or as a mix of the two. Plasma temperature, current density, electric field, ion density and charge states are all evolved self-consistently, where kinetic non-thermal contributions are captured using an orbit-averaged relativistic electron Fokker-Planck equation, which couples to the plasma evolution. In the typical use case, the electrons are represented by two distinct populations: a cold fluid population and a kinetic superthermal population. Solution method: The system of equations is solved using a standard multidimensional Newton's method. Partial differential equations---most prominently the bounce-averaged Fokker--Planck and current diffusion equations---are discretized using a high-resolution finite volume scheme that preserves density and positivity.},
		author = {Mathias Hoppe and Ola Embreus and T{\"u}nde F{\"u}l{\"o}p},
		doi = {https://doi.org/10.1016/j.cpc.2021.108098},
		doi = {10.1016/j.cpc.2021.108098},
		issn = {0010-4655},
		journal = {Computer Physics Communications},
		keywords = {Runaway electrons, Tokamak disruptions, Fokker-Planck},
		@@ -394,7 +394,7 @@
		pages = {3054-3069},
		year = {2014},
		issn = {0920-3796},
		doi = {https://doi.org/10.1016/j.fusengdes.2014.09.018},
		doi = {10.1016/j.fusengdes.2014.09.018},
		url = {https://www.sciencedirect.com/science/article/pii/S0920379614005961},
		author = {M. Kovari and R. Kemp and H. Lux and P. Knight and J. Morris and D.J. Ward},
		keywords = {Fusion reactor, Thermonuclear, Deuterium, Tritium, Economics},
		@@ -407,8 +407,19 @@
		pages = {9-20},
		year = {2016},
		issn = {0920-3796},
		doi = {https://doi.org/10.1016/j.fusengdes.2016.01.007},
		doi = {10.1016/j.fusengdes.2016.01.007},
		url = {https://www.sciencedirect.com/science/article/pii/S0920379616300072},
		author = {M. Kovari and F. Fox and C. Harrington and R. Kembleton and P. Knight and H. Lux and J. Morris},
		keywords = {Fusion reactor, Thermonuclear, Deuterium, Tritium, Economics, Magnet, Neutronics, Reliability, Availability, Capacity factor, Blanket, Divertor, DEMO},
		abstract = {PROCESS is a reactor systems code – it assesses the engineering and economic viability of a hypothetical fusion power station using simple models of all parts of a reactor system. PROCESS allows the user to choose which constraints to impose and which to ignore, so when evaluating the results it is vital to study the list of constraints used. New algorithms submitted by collaborators can be incorporated – for example safety, first wall erosion, and fatigue life will be crucial and are not yet taken into account. This paper describes algorithms relating to the engineering aspects of the plant. The toroidal field (TF) coils and the central solenoid are assumed by default to be wound from niobium-tin superconductor with the same properties as the ITER conductors. The winding temperature and induced voltage during a quench provide a limit on the current density in the TF coils. Upper limits are placed on the stresses in the structural materials of the TF coil, using a simple two-layer model of the inboard leg of the coil. The thermal efficiency of the plant can be estimated using the maximum coolant temperature, and the capacity factor is derived from estimates of the planned and unplanned downtime, and the duty cycle if the reactor is pulsed. An example of a pulsed power plant is given. The need for a large central solenoid to induce most of the plasma current, and physics assumptions that are conservative compared to some other studies, result in a large machine, with a cryostat 36m in diameter. Multiple constraints, working together, restrict the parameter space of the optimised model. For example, even when the ratio of operating current to critical current in the TF coils is increased by a factor of five, the total coil cross-section decreases only a little, because of the need for copper stabiliser, insulation, and structural support. The result is that the plasma major radius hardly changes. 
It is these surprising results that justify the development of systems codes.}}

		@INPROCEEDINGS{Lattner,
		author={Lattner, C. and Adve, V.},
		booktitle={International Symposium on Code Generation and Optimization, 2004. CGO 2004.},
		title={{LLVM}: a compilation framework for lifelong program analysis \& transformation},
		year={2004},
		volume={},
		number={},
		pages={75-86},
		doi={10.1109/CGO.2004.1281665}}

graph_paper/paper.md

+31 −3

		@@ -111,7 +111,7 @@ will describe the framework's design and capabilities. Demonstrate applications
		to problems in radio frequency (RF) heating and particle tracing, and show its
		performance scaling.

		# Background
		# State of the field

		\| Framework \| Language \| CUDA Support \| Metal Support \| ROCm Support \| Auto Differentiation \|
		\|:---------------:\|:------------------:\|:------------------:\|:---------------------:\|:------------------:\|:--------------------:\|
		@@ -171,7 +171,32 @@ what the framework is doing. Additionally cross platform support is often
		unofficial and can be incomplete. Table \ref{frameworks} shows an overview of
		these frameworks.

		# Performance
		# Software design
		The core of this software is built around a graph data structure representing
		mathematical expressions. In graph form, the expressions can be treated
		symbolically, enabling two critical functions: algebraic rules can be applied
		to reduce graphs to simpler forms, and the chain rule can be applied to
		transform graphs into expressions for derivatives.

		Since the goal of this framework is not to target machine learning
		applications, it is not necessary to compute gradients of expressions with
		large numbers of parameters. This symbolic approach was chosen for its
		simplicity and greater flexibility. In contrast to machine learning
		frameworks, this framework makes no distinction between variables and
		functions. Derivatives can be taken with respect to any other expression.
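
The symbolic approach described above can be illustrated with a minimal sketch. It is not the `graph_framework` API: the `Node` class and the `const`, `var`, `add`, `mul`, and `df` helpers are hypothetical names chosen for this example, and only addition and multiplication are covered. Algebraic reduction happens when a node is built, and the derivative is itself a graph obtained by applying the sum and product rules. Matching the differentiation target by structural equality is an assumption of this sketch, used to show how a derivative can be taken with respect to a subexpression rather than only a named variable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical expression node; frozen so nodes compare structurally."""
    op: str              # "const", "var", "add", or "mul"
    args: tuple = ()     # child nodes for "add" and "mul"
    value: float = 0.0   # payload for "const"
    name: str = ""       # payload for "var"


def const(v):
    return Node("const", value=float(v))


def var(name):
    return Node("var", name=name)


def add(a, b):
    # Algebraic reduction: fold constants and drop additive zeros.
    if a.op == "const" and b.op == "const":
        return const(a.value + b.value)
    if a.op == "const" and a.value == 0.0:
        return b
    if b.op == "const" and b.value == 0.0:
        return a
    return Node("add", (a, b))


def mul(a, b):
    # Algebraic reduction: fold constants, 0 * e -> 0, 1 * e -> e.
    if a.op == "const" and b.op == "const":
        return const(a.value * b.value)
    for c, other in ((a, b), (b, a)):
        if c.op == "const" and c.value == 0.0:
            return const(0)
        if c.op == "const" and c.value == 1.0:
            return other
    return Node("mul", (a, b))


def df(expr, wrt):
    """Return a new graph for d(expr)/d(wrt); wrt may be any expression."""
    if expr == wrt:
        return const(1)
    if expr.op in ("const", "var"):
        return const(0)
    a, b = expr.args
    if expr.op == "add":   # sum rule
        return add(df(a, wrt), df(b, wrt))
    if expr.op == "mul":   # product rule
        return add(mul(df(a, wrt), b), mul(a, df(b, wrt)))
    raise ValueError(f"unsupported operation: {expr.op}")


# Example: f = x*x + 3*y
x, y = var("x"), var("y")
g = mul(x, x)
f = add(g, mul(const(3), y))
dfdx = df(f, x)   # reduces to the graph for x + x
dfdg = df(f, g)   # derivative with respect to the subexpression x*x
```

Because reduction is applied as each node is constructed, the derivative graphs come out already simplified: the zero and one factors produced by the product rule never appear in the result.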

		After expressions are built, workflows are created. A workflow is defined from
		one or more workflow items. A workflow item is defined from input nodes, output
		nodes, and maps between inputs and outputs. For each input and output node, a
		device buffer is allocated. Then, starting from a given output, device-specific
		kernel source code is created by traversing the graph and adding a line
		appropriate for each expression. Duplicate expressions are avoided by tracking
		a list of registers. Kernel sources are JIT compiled using the vendor API, or
		using the Low Level Virtual Machine (LLVM) [@Lattner] for CPUs. A workflow is
		run by iterating through the workflow items.
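
The kernel generation step can be sketched as follows. This is an illustration under stated assumptions, not the framework's generator: expressions are represented as plain tuples, the `emit` and `generate_kernel` functions are hypothetical, and the emitted C-like text is only indicative of what device-specific source might look like. The sketch traverses the graph from each output, emits one line per expression, and uses a dictionary of registers so that an expression appearing more than once is computed only once. JIT compilation and buffer allocation are omitted.

```python
def emit(node, registers, lines):
    """Emit one source line for node and return the register holding it.

    Nodes are hypothetical tuples: ("var", name), ("const", value),
    ("add", left, right), or ("mul", left, right).
    """
    if node in registers:
        # Expression already generated: reuse its register.
        return registers[node]
    kind = node[0]
    if kind == "var":
        rhs = f"{node[1]}[i]"          # load from the input buffer
    elif kind == "const":
        rhs = repr(float(node[1]))
    else:
        symbol = {"add": "+", "mul": "*"}[kind]
        left = emit(node[1], registers, lines)
        right = emit(node[2], registers, lines)
        rhs = f"{left} {symbol} {right}"
    reg = f"r{len(registers)}"
    registers[node] = reg
    lines.append(f"    const float {reg} = {rhs};")
    return reg


def generate_kernel(name, inputs, outputs):
    """Build C-like kernel source text.

    inputs is a list of input buffer names; outputs maps an output
    buffer name to the expression stored in it.
    """
    registers, lines = {}, []
    results = [(out, emit(node, registers, lines))
               for out, node in outputs.items()]
    params = [f"const float *{n}" for n in inputs]
    params += [f"float *{n}" for n in outputs]
    params.append("const int i")
    source = [f"void {name}({', '.join(params)}) {{"]
    source += lines
    source += [f"    {out}[i] = {reg};" for out, reg in results]
    source.append("}")
    return "\n".join(source)


# Example: out = x*y + x*y, where x*y appears twice in the graph.
x, y = ("var", "x"), ("var", "y")
xy = ("mul", x, y)
kernel_source = generate_kernel("example_kernel", ["x", "y"],
                                {"out": ("add", xy, xy)})
print(kernel_source)
```

In this example the product `x*y` is assigned to a single register and referenced twice by the addition, which is the effect the register list is meant to achieve.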

		# Research impact statement

		To demonstrate the performance of the optimized kernels created using this
		framework, we measured the strong scaling using the RF ray tracing problem
		in a realistic tokamak geometry. To compare against other frameworks we
		@@ -228,6 +253,9 @@ $10^{3}$ time steps. The `graph_framework` consistently shows the best
		throughput on both CPUs and GPUs. Note that MLX CPU throughput could be improved
		by splitting the problem across multiple threads.

		# AI usage disclosure
		No AI technology was used in the development of this software.

		# Acknowledgements
		The authors would like to thank Dr. Yashika Ghai, Dr. Rhea Barnett, and Dr.
		David Green for their valuable insights when setting up test cases for the