graph_c_binding/graph_c_binding.h +2 −2

/// @section graph_c_binding_into Introduction
/// This section assumes the reader is already familiar with developing C codes.
/// The simplest method to link framework code into a C code is to create a C++
/// function with @code extern "C" @endcode linkage. First create a header file
/// <tt><i>c_callable</i>.h</tt>
/// @code
/// ...
/// @endcode
///
/// Next create a source file <tt><i>c_callable</i>.c</tt> and add the
/// framework. This example uses the equation of a line example from the
/// @ref tutorial_workflow "making workflows" tutorial.
/// @code
/// // Include the necessary framework headers.
/// ...

graph_docs/compiling.dox +40 −37

 * <hr>
 * @section build_system_user User Guide
 * The following section is for users of the framework.
 *
 * @subsection build_system_user_dependencies Dependencies
 * The graph_framework requires three external dependencies and one optional
 * dependency. <a href="https://llvm.org">LLVM</a> is another dependency, used
 * for generating CPU code; however, it is automatically obtained via the build
 * system. The graph_framework is written using the
 * <a href="https://www.cppreference.com/w/cpp/20.html">C++20</a> standard.
 * The C interface uses
 * <a href="https://www.cppreference.com/w/cpp/compiler_support/17.html">C17</a>
 * and the Fortran interface uses
 * <a href="https://fortranwiki.org/fortran/show/Fortran+2008">Fortran 2008</a>.
 *
 * @subsubsection build_system_user_dependencies_required Required
 * * <a href="http://www.cmake.org">cmake</a> version greater than 3.21.
 * ...
 * * <a href="https://www.doxygen.nl/index.html">Doxygen</a> for generating this documentation.
 *
 * @subsection build_system_clone Obtaining the code
 * The framework code itself can be obtained from the
 * <a href="https://github.com/ORNL-Fusion/graph_framework">graph_framework</a>
 * Github repository.
 * @code
 * ...
 * Where <tt>../</tt> points to the source directory containing the top
 * level <tt>CMakeLists.txt</tt> file.
 *
 * The recommended method is to use the interactive <tt>ccmake</tt> command
 * instead.
 * @code ccmake ../
 * ...
 * <tt>-D</tt> option.
 *
 * @subsubsection build_system_user_options Build system Options
 * Initially, there will be no options. Along the bottom, there are several
 * commands. Use the 'c' command to start the configuration process. Once
 * configured, several options will appear. During this process cmake is cloning
 * the LLVM repository, so this step may take some time initially. Most of the
 * options are for configuring LLVM and can be ignored. The important options
 * are listed below.
 * ...
 * * <tt>MinSizeRel</tt>
 * * <tt>RelWithDebInfo</tt>
 * <tr><td><tt>USE_VERBOSE</tt>           <td>Show verbose information about compute kernels.
 * <tr><td><tt>BUILD_C_BINDING</tt>      <td>Generate the @ref graph_c_binding.h "C language interface".
 * <tr><td><tt>BUILD_Fortran_BINDING</tt><td>Generate the @ref graph_fortran "Fortran language interface".
 * <tr><td><tt>USE_METAL</tt>            <td>Enable the <a href="https://developer.apple.com/metal/">Metal</a> backend (macOS only).
 * <tr><td><tt>USE_CUDA</tt>             <td>Enable the <a href="https://developer.nvidia.com/cuda-zone">Cuda</a> backend (Linux only).
 * <tr><td><tt>USE_HIP</tt>              <td>Enable the <a href="https://www.amd.com/en/products/software/rocm.html">Hip</a> backend (Linux only, Hip branch).
 * <tr><td><tt>USE_SSH</tt>              <td>Use ssh for git instead of https.
 * </table>
 *
 * @note macOS users will need to change the default option for
 * <tt>CMAKE_CXX_COMPILER</tt> to <tt>clang++</tt>. This is due to the way the
 * build system determines default include directories for system libraries.
 * This can be accomplished using the advanced options accessed from the <tt>t</tt>
 * command, or by setting this via the command line.
 * @code cmake -DCMAKE_CXX_COMPILER=clang++ ../ @endcode
 *
 * Any time an option is changed, or a new option becomes available, you need
 * to use the configure <tt>c</tt> command for changes to take effect. Once all
 * options are set, a generate <tt>g</tt> option will appear. Using this option
 * will generate the Makefile.
 *
 * @subsubsection build_system_trouble_shooting Troubleshooting
 * Sometimes, cmake will fail to locate the NetCDF library if it is not
 * ...
 * @code make @endcode
 * command. Note that the build system first starts by pulling the latest
 * revision of LLVM. The build system then has to build LLVM first, which can
 * take a while. It is recommended to use a limited parallel build.
 * @code make -j10 @endcode
 * The <tt>-j<i>num_processes</i></tt> option determines the number of parallel
 * instances to run. The build products will be found in associated directories
 * in the <tt>build</tt> directory.
 *
 * A list of individual components which can be built can be identified using
 * @code
 * make -h
 * make help
 * @endcode
 *
 * @subsection build_system_test Running unit tests
 * Unit tests can be run using the command
 * @code make test ARGS=-j10 @endcode
 * Like the parallel build, the <tt>-j<i>num_processes</i></tt> option
 * determines the number of parallel instances to run.
 *
 * <hr>
 * @section build_system_dev Developer Guide
 * ...
 * The build system defines some macros for defining targets, configuring debug
 * options, and configuring external dependencies.
 *
 * @subsubsection build_system_targets Tool targets
 *
 * <hr>
 * <tt>add_tool_target(target lang)</tt>\n\n
 * ...
 * <tt>[in] <b>target</b></tt> The name of the target.\n
 * <tt>[in] <b>lang</b></tt> File extension for the target (c, cpp, f90).\n\n
 * The target assumes there is a source file defined as <tt>target.lang</tt>.
 * For instance, a C++ source file named <tt>foo.cpp</tt> is configured as
 * @code add_tool_target(foo cpp) @endcode
 * ...
 * Register a sanitizer option.\n\n
 * <b>Parameters</b>\n
 * <tt>[in] <b>name</b></tt> The name of the sanitizer flags.\n\n
 * This adds a new cmake option <tt>SANITIZE_<i>NAME</i></tt> to add
 * <tt>-fsanitize=<i>name</i></tt> to the command line arguments.
 * <hr>
 *
 * @subsubsection build_system_project Register an external project
 * ...
 * In addition to the standard build options, there are several debugging
 * options that can be enabled.
 *
 * @subsubsection build_system_dev_options Build System Options
 * <table>
 * <caption id="build_system_user_cmake_dev_opts">Build options for developers.</caption>
 * <tr><th>Option <th>Description
 * <tr><td><tt>USE_PCH</tt>                <td>Use precompiled headers during compilation. Most users should keep this on.
 * <tr><td><tt>SAVE_KERNEL_SOURCE</tt>     <td>Option to dump the generated compute kernel source code to disk.
 * <tr><td><tt>USE_INPUT_CACHE</tt>        <td>Option to cache registers for the kernel arguments.
 * <tr><td><tt>USE_CONSTANT_CACHE</tt>    <td>Option to use registers to cache constant values; otherwise constants are inlined.
 * <tr><td><tt>SHOW_USE_COUNT</tt>        <td>Generates information on the number of times a register is used.
 * <tr><td><tt>USE_INDEX_CACHE</tt>       <td>Option to use registers to cache array indices.
 * <tr><th colspan="2">Sanitizer Flags
 * ...

graph_docs/general.dox +27 −26

 * <caption id="general_concepts_glossery">Glossary of terms</caption>
 * <tr><th>Concept <th>Definition
 * <tr><td><b>node</b>                 <td>A leaf or branch on the graph tree.
 * <tr><td><b>graph</b>                <td>A data structure connecting nodes.
 * <tr><td><b>reduce</b>               <td>A transformation of the graph to remove nodes.
 * <tr><td><b>auto differentiation</b> <td>A transformation of the graph to build derivatives.
 * <tr><td><b>compiler</b>             <td>A tool for translating from one language to another.
 * <tr><td><b>JIT</b>                  <td>Just-in-time compile.
 * <tr><td><b>kernel</b>               <td>A code function that runs on a batch of data.
 * ...
 * <tr><td><b>safe math</b>            <td>Run time checks to avoid off normal conditions.
 * <tr><td><b>API</b>                  <td>Application programming interface.
 * <tr><td><b>Host</b>                 <td>The side where kernels are launched from.
 * <tr><td><b>Device</b>               <td>The side where kernels are run.
 * </table>
 *
 * <hr>
 * @section general_concepts_graph Graph
 * The graph_framework operates by building a tree structure of math operations.
 * For an example of building expression structures see the
 * @ref tutorial_expression "basic expressions tutorial".
 In tree form it is
 * easy to traverse the nodes in the graph. Take the example of the equation of
 * a line
 * @f{equation}{y=mx + b@f}
 * This equation consists of five nodes. The ends of the tree are classified
 * as either variables @f$x@f$ or constants @f$m,b@f$. These nodes are connected
 * by nodes for the multiply and addition operations. The output @f$y@f$
 * represents the entire graph of operations.
 * @image{} html line_graph.png "The graph structure for y = mx + b."
 * Evaluation of a graph starts from the top most node, in this case the @f$+@f$
 * operation. Evaluation of a node is not performed until all subnodes are
 * evaluated, starting with the left operand. Evaluation starts by recursively
 * evaluating the left operands until the last node, @f$m@f$, is reached.
 * @image{} html line_graph_eval1.png ""
 * Once @f$m@f$ is evaluated, the result is returned to the @f$+@f$, then the
 * right operand is evaluated.
 * @image{} html line_graph_eval2.png ""
 * Evaluation is repeated until every node in the graph is evaluated.
 * ...
 *
 * <hr>
 * @section general_concepts_diff Auto Differentiation
 * The previous @ref general_concepts_graph "section" showed how graphs can be
 * evaluated. This same evaluation can be applied to build graphs of a
 * function's derivative. For an example of taking derivatives see the
 * @ref tutorial_derivatives "auto differentiation tutorial". Let's say that we
 * want to take the derivative @f$\frac{\partial y}{\partial x}@f$.
 This is
 * achieved by evaluating the graph until the bottom left most node is reached.
 * Then a new graph is built starting with @f$\frac{\partial m}{\partial x}=0@f$.
 * Applying the first half of the chain rule, we build a new graph for @f$0x@f$
 * @image{} html line_graph_dydf1.png ""
 * Then we take the derivative of the right operand and apply the second half
 * of the chain rule to build a new graph for @f$0x=0@f$.
 * ...
 *
 * <hr>
 * @section general_concepts_reduction Reduction
 * The final expression for @f$\frac{\partial y}{\partial x}@f$ contains many
 * unnecessary nodes in the graph. Instead of building full graphs, we can
 * simplify and eliminate nodes as we build them. For instance, when the
 * expression @f$0x@f$ is created, it can be immediately reduced to a single
 * node.
 * @image{} html line_graph_reduce1.png ""
 * Applying all possible reductions reduces the final expression to
 * @f$\frac{\partial y}{\partial x}=m@f$.
 * @image{} html line_graph_reduce_final.png ""
 * By reducing graphs as they are built, we can eliminate nodes one by one.
 *
 * <hr>
 * @section general_concepts_compile Compile
 * Once graph expressions are built, they can be compiled to a compute kernel.
 * For an example of compiling expression trees into kernels see the
 * @ref tutorial_workflow "workflow tutorial".
 * Using the same recursive evaluation, we can visit each node of a graph and
 * ...
 * be generated from multiple outputs and maps.
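As an illustration of the recursive visit described above, here is a plain-C++ sketch of a tiny expression tree that can either be evaluated (left operand first, then the right, then the parent operation) or walked to emit kernel-like source text. The `Node`, `eval`, and `emit` names are hypothetical; this is a minimal sketch, not the framework's `graph` node classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Minimal expression node: a leaf holding a named value, or a binary branch.
// Hypothetical sketch, not the framework's graph::leaf_node hierarchy.
struct Node {
    char op;                        // '+' or '*' for a branch, 0 for a leaf
    double value;                   // value of a leaf
    std::string name;               // name of a leaf, used when emitting source
    std::shared_ptr<Node> left, right;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr leaf(const std::string &name, double value) {
    return std::make_shared<Node>(Node{0, value, name, nullptr, nullptr});
}
NodePtr branch(char op, NodePtr l, NodePtr r) {
    return std::make_shared<Node>(Node{op, 0.0, "", std::move(l), std::move(r)});
}

// Evaluation recurses into the left operand first, then the right operand,
// and only then applies the branch operation -- the order described above.
double eval(const NodePtr &n) {
    if (n->op == 0) {
        return n->value;            // a leaf evaluates to its stored value
    }
    const double l = eval(n->left); // left subtree is fully evaluated first
    const double r = eval(n->right);
    return n->op == '+' ? l + r : l * r;
}

// The same traversal can emit source text instead of a value, which is the
// essence of turning a graph into a compute kernel.
std::string emit(const NodePtr &n) {
    if (n->op == 0) {
        return n->name;
    }
    return "(" + emit(n->left) + n->op + emit(n->right) + ")";
}
```

For @f$y=mx+b@f$ with @f$m=2@f$, @f$x=3@f$, @f$b=1@f$, `eval` returns 7 and `emit` produces `((m*x)+b)`.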
 *
 * @subsection general_concepts_compile_inputs Inputs
 * Inputs are the variable nodes that define the graph. In the line
 * example @f$\frac{\partial y}{\partial x}@f$, the input variable would be the
 * node for @f$x@f$. Some graphs have no inputs. The graph for
 * @f$\frac{\partial y}{\partial x}=m@f$ has eliminated all the variable nodes
 * in the graph.
 * ...
 * are never stored.
 *
 * @subsection general_concepts_compile_maps Maps
 * Maps enable the results of an output node to be stored in an input node. This
 * is used for a wide variety of steps. For instance, take a gradient descent
 * step
 * @f{equation}{y = y + \frac{\partial f}{\partial x}@f}
 * In this case the output of the expression
 * @f$y + \frac{\partial f}{\partial x}@f$
 * ...
 *
 * <hr>
 * @section general_concepts_safe_math Safe Math
 * There are some conditions where, mathematically, a graph should evaluate to a
 * normal number but, when evaluated using floating point precision, the result
 * can be <tt>Inf</tt> or <tt>NaN</tt>. An example of this is the
 * @f$\exp\left(x\right)@f$ function. For large argument values,
 * ...

graph_docs/main.dox +7 −8

 * @tableofcontents
 * @section introduction Introduction
 * The <a href="https://github.com/ORNL-Fusion/graph_framework">graph_framework</a>
 * is a domain specific compiler for translating physics equations to optimized
 * code that runs on GPUs and CPUs.
 The domain
 * specific aspect limits this to classes of problems where the same physics is
 * applied to an ensemble. Examples include RF ray tracing, particle pushing,
 * and field line following.
 *
 * @subsection purpose Purpose
 * The purpose of this framework is to enable domain scientists to write code
 * ...
 * This framework enables:
 * * Portability to Nvidia, AMD, and Apple GPUs and CPUs.
 * * Abstraction of the physics from the compute.
 * * Auto differentiation.
 * * Easy embedding in C, C++, and Fortran codes.
 *
 * <hr>
 * @section tools User guides for tools
 * @subsection rf_ray_tracing RF Ray tracing
 * This section covers user guides to run the RF ray tracing code. To run this
 * code, a user selects an equilibrium, a wave distribution function, a solver
 * method, initial ray conditions, and a power absorption model. To run an
 * example, follow the instructions for the @ref xrays_commandline_example.
 *

graph_docs/tutorial.dox +19 −18

/*!
 * @page tutorial Tutorial
 * @brief Hands on tutorial for building expressions and running workflows.
 * @tableofcontents
 *
 * @section tutorial_introduction Introduction
 * In this tutorial we will put the basic @ref general_concepts of the
 * graph_framework into action.
 This will discuss building trees, generating
 * kernels, and executing workflows.
 *
 * To accomplish this there is a playground tool in the <tt>graph_framework</tt>
 * directory. This playground is a preconfigured executable target which can be
 * ...
}
@endcode
 * To start, create a template function above main and call that function from
 * main. This will allow us to play with different floating point types. For now
 * we will start with a simple floating point type.
 * @code
#include "../graph_framework/jit.hpp"
...
int main(int argc, const char * argv[]) {
    END_GPU
}
@endcode
 * Here @ref jit::float_scalar is a
 * <a href="https://www.cppreference.com/w/cpp/20.html">C++20</a>
 * <a href="https://www.cppreference.com/w/cpp/concepts.html">Concept</a> for
 * valid floating point types allowed by the framework.
 *
 * <hr>
 * @section tutorial_basic Basic Nodes
 * ...
}
@endcode
 * An explicit @ref graph::constant_node is created for <tt>m</tt> while an
 * implicit constant was defined for <tt>b</tt>. Note that in the implicit case,
 * the actual node for <tt>b</tt> is not created until we use it in an
 * expression.
 *
 * <hr>
 * ...
}
@endcode
 * Here we take derivatives using the @ref graph::leaf_node::df method. We can
 * also take several variations of this.
 * @code
auto dydm = y->df(m);
auto dydy = y->df(y);
auto dydb = y->df(m*x);
@endcode
 * The results will be @f$x@f$, @f$1@f$, and @f$0@f$, respectively.
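Outside the framework, the quoted result @f$\frac{\partial y}{\partial m}=x@f$ can be sanity-checked with a central finite difference. This is a hedged plain-C++ sketch; the `fdiff` and `line` helpers are hypothetical, and the `y->df(y)` and `y->df(m*x)` cases follow framework conventions that a finite difference cannot reproduce, so only the ordinary partial derivatives are checked numerically.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// y = m*x + b evaluated as an ordinary function of its inputs.
double line(double m, double x, double b) {
    return m * x + b;
}

// Central finite difference: numerically approximates df/dt at t.
// Used here only to sanity-check symbolic derivative results.
double fdiff(const std::function<double(double)> &f, double t, double h = 1e-6) {
    return (f(t + h) - f(t - h)) / (2.0 * h);
}
```

For @f$m=2@f$, @f$x=3@f$, @f$b=1@f$ this recovers @f$\partial y/\partial m \approx 3 = x@f$ and @f$\partial y/\partial x \approx 2 = m@f$, matching the auto-differentiated graphs.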
 *
 * <hr>
 * @section tutorial_workflow Making workflows.
 * In this section we will build a workflow from the nodes we created. For
 * simplicity, we will decrease the number of elements in the variable so we can
 * set the values more easily. The first thing we do is create a
 * @ref workflow::manager.
 * @code
template<jit::float_scalar T>
void run_tutorial() {
...
work.print(2, {x, y, dydx});
@endcode
 *
 * @subsection tutorial_workflow_iter Iteration
 * In this section we are going to make use of maps to iterate a variable. We
 * want to evaluate the value of @f$y@f$ and set it as the new value of @f$x@f$.
 * We do this by modifying the call to @ref workflow::manager::add_item to
 * define a map. This generates a kernel where, after @f$y@f$ is computed, it is
 * stored
 * ...
@endcode
 *
 * <hr>
 * @section tutorial_workflow_newton Newton's Method
 * In this tutorial we are going to show how we can put all these concepts
 * together to implement Newton's method. Newton's method is defined as
 * @f{equation}{x = x - \frac{f\left(x\right)}{\frac{\partial}{\partial x}f\left(x\right)}@f}
 * From the iteration example, its step update can be handled by a simple map.
 * However, we need a measure of convergence. To do that, we output the value of
 * @f$f\left(x\right)@f$. Let's set up a test function.
 * @code
 * ...
 * @endcode
 *
 * However, there are some things that are not optimal here. We are performing
 * a reduction on the host side and transferring the entire array to the host.
 * To improve this, we can use a converge item instead.
 * @code
// Create a workflow manager.
...
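As a plain-C++ cross-check of the Newton iteration above, the following sketch applies the same update map @f$x \leftarrow x - f(x)/f'(x)@f$ and uses @f$|f(x)|@f$ as the convergence measure, mirroring the converge item. The test function @f$f(x)=x^{2}-2@f$ is a stand-in, since the tutorial's actual test function is not shown in this excerpt.

```cpp
#include <cassert>
#include <cmath>

// One Newton step: x <- x - f(x)/f'(x), the map described in the tutorial.
double newton_step(double x, double (*f)(double), double (*df)(double)) {
    return x - f(x) / df(x);
}

// Iterate until |f(x)| falls below a tolerance (the convergence measure),
// or until a maximum number of steps is reached.
double newton_solve(double x, double (*f)(double), double (*df)(double),
                    double tol = 1e-12, int max_steps = 50) {
    for (int i = 0; i < max_steps && std::fabs(f(x)) > tol; ++i) {
        x = newton_step(x, f, df);
    }
    return x;
}

// Stand-in test function: f(x) = x^2 - 2, whose positive root is sqrt(2).
double f(double x)  { return x * x - 2.0; }
double df(double x) { return 2.0 * x; }
```

Starting from @f$x=1@f$, `newton_solve(1.0, f, df)` converges to @f$\sqrt{2}@f$ in a handful of steps, which is the behavior the workflow version reproduces on the device.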
graph_c_binding/graph_c_binding.h +2 −2 Original line number Diff line number Diff line Loading @@ -9,7 +9,7 @@ /// /// @section graph_c_binding_into Introduction /// This section assumes the reader is already familar with developing C codes. /// The simplist method to link framework code into a C code is to create a c++ /// The simplist method to link framework code into a C code is to create a C++ /// function with @code extern "C" @endcode First create a header file /// <tt><i>c_callable</i>.h</tt> /// @code Loading @@ -19,7 +19,7 @@ /// @endcode /// /// Next create a source file <tt><i>c_callable</i>.c</tt> and add the /// framework. This example uses the line /// framework. This example uses the equation of a line example from the /// @ref tutorial_workflow "making workflows" turorial. /// @code /// // Include the necessary framework headers. Loading
graph_docs/compiling.dox +40 −37 Original line number Diff line number Diff line Loading @@ -9,15 +9,18 @@ * * <hr> * @section build_system_user User Guide * The following section is for users of framework. * The following section is for users of the framework. * * @subsection build_system_user_dependencies Dependencies * The graph_framwork requires three requires external dependencies and one * optional dependency. <a href="https://llvm.org">LLVM</a> is another * dependency that is used for generating CPU code. However this is * automatically obtained via the build system. The graph_frame is written using * the C++20 standard. The C interface using C17 and the fortran interface using * Fortran 2008. * The graph_framwork requires three external dependencies and one optional * dependency. <a href="https://llvm.org">LLVM</a> is another dependency that is * used for generating CPU code. However this is automatically obtained via the * build system. The graph_frame is written using the * <a href="https://www.cppreference.com/w/cpp/20.html">C++20</a> standard. The * C interface uses * <a href="https://www.cppreference.com/w/cpp/compiler_support/17.html">C17</a> * and the fortran interface uses * <a href="https://fortranwiki.org/fortran/show/Fortran+2008">Fortran 2008</a>. * * @subsubsection build_system_user_dependencies_required Required * * <a href="http://www.cmake.org">cmake</a> version greater than 3.21. Loading @@ -28,7 +31,7 @@ * * <a href="https://www.doxygen.nl/index.html">Doxygen</a> for generating this documentation. * * @subsection build_system_clone Obtaining the code * The framework code itself be obtained from the * The framework code itself can be obtained from the * <a href="https://github.com/ORNL-Fusion/graph_framework">graph_framework</a> * Github repository. * @code Loading @@ -53,7 +56,7 @@ * Where <tt>../</tt> points to the source directory containing the top * level <tt>CMakeLists.txt</tt> file. 
* * The recommended method is to use the interatice <tt>ccmake</tt> command * The recommended method is to use the interactive <tt>ccmake</tt> command * instead. * @code ccmake ../ Loading @@ -62,10 +65,10 @@ * <tt>-D</tt> option. * * @subsubsection build_system_user_options Build system Options * Initally, there will be no options. Along the botton, there are several * command. Use the 'c' command to start the configuation process. Once * Initially, there will be no options. Along the botton, there are several * commands. Use the 'c' command to start the configuation process. Once * configured several options will apear. During this process cmake is cloning * the LLVM repository. So this step may take some time initally. Mode of the * the LLVM repository. So this step may take some time initally. Most of the * are various options for configuing LLVM and can be ignored. The important * options are listed below. * Loading @@ -78,27 +81,27 @@ * * <tt>MinSizeRel</tt> * * <tt>RelWithDebInfo</tt> * <tr><td><tt>USE_VERBOSE</tt> <td>Show verbose information about compute kernels. * <tr><td><tt>BUILD_C_BINDING</tt> <td>Generate the C langauge interface. * <tr><td><tt>BUILD_Fortran_BINDING</tt><td>Generate the Fortran language interface. * <tr><td><tt>BUILD_C_BINDING</tt> <td>Generate the @ref graph_c_binding.h "C langauge interface". * <tr><td><tt>BUILD_Fortran_BINDING</tt><td>Generate the @ref graph_fortran "Fortran language interface". * <tr><td><tt>USE_METAL</tt> <td>Enable the <a href="https://developer.apple.com/metal/">Metal</a> backend (macOS only). * <tr><td><tt>USE_CUDA</tt> <td>Enable the <a href="https://developer.nvidia.com/cuda-zone">Cuda</a> backend (Linux only). * <tr><td><tt>USE_HIP</tt> <td>Enable the <a href="https://www.amd.com/en/products/software/rocm.html">Hip</a> backend (Linux only, Hip branch). * <tr><td><tt>USE_SSH</tt> <td>Use ssh for git instead of html. 
* </table> * * @note macOS uses will need to change the default option for * @note macOS users will need to change the default option for * <tt>CMAKE_CXX_COMPILER</tt> to <tt>clang++</tt>. This is due to the way the * build systems determines default include directories for system libraries. * This can be accomplished using the advacned options using the <tt>t</tt> * This can be accomplished using the advacned options accessed from the <tt>t</tt> * command or setting this via the command line. * @code cmake -DCMAKE_CXX_COMPILER=clang++ ../ @endcode * * Every time an option is changed, or a new option is available, you need to * use the configure <tt>c</tt> command for changes to take affect. Once all * options are set, the a generate <tt>g</tt> options will appear. Using this * option will build a make file. * Any time an option is changed, or a new option becomes is available, you need * to use the configure <tt>c</tt> command for changes to take affect. Once all * options are set, a generate <tt>g</tt> options will appear. Using this option * will generate the Makefile. * * @subsubsection build_system_trouble_shooting Trouble Shooting. * Some times, cmake will fail to locate the NetCDF library if it is not Loading @@ -113,28 +116,28 @@ * @code make @endcode * command. Note that due build system first starts by pulling the latest * of LLVM. The build system then has to build LLVM first which can take a * while. It is recommended to use a limited parallel build. * command. Note that the build system first starts by pulling the latest * revision of LLVM. The build system then has to build LLVM first which can * take a while. It is recommended to use a limited parallel build. * @code make -j10 @endcode * The <tt>-j<i>num_processes</i></tt> option determines number of parallel * instances to run. The build products will be found in assocated build * directories in the <tt>build</tt> directory. * instances to run. 
The build products will be found in assocated directories * in the <tt>build</tt> directory. * * A list of individual components which can be build can be identified using * A list of individual components which can be built can be identified using * @code make -h make help @endcode * * @subsection build_system_test Running unit tests. * @subsection build_system_test Running unit tests * Units tests can be run using the command. * @code make test ARGS=-j10 @endcode * Like the parallel build the <tt>-j<i>num_processes</i></tt> option determines * number of parallel instances to run. * the number of parallel instances to run. * * <hr> * @section build_system_dev Developer Guide Loading @@ -144,7 +147,7 @@ * The build system defines some macros for defining targets, configuring debug * options, and configuing external dependences. * * @subsubsection build_system_targets Tool targets. * @subsubsection build_system_targets Tool targets * * <hr> * <tt>add_tool_target(target lang)</tt>\n\n Loading @@ -153,7 +156,7 @@ * <tt>[in] <b>target</b></tt> The name of the target.\n * <tt>[in] <b>lang</b></tt> File extention for the target (c, cpp, f90).\n\n * Target assumes there is a source file defined as <tt>target.lang</tt>. For * instance a C++ source file named <tt>foo.cpp</tt> is configures as * instance a C++ source file named <tt>foo.cpp</tt> is configured as * @code add_tool_target(foo cpp) @endcode Loading @@ -176,8 +179,8 @@ * Register a sanitizer option.\n\n * <b>Parameters</b>\n * <tt>[in] <b>name</b></tt> The name of the sanitizer flags.\n\n * This add new for using the <tt>SANITIZE_<i>NAME</i></tt> cmake option and * add <tt>-fsanitize=<i>name</i></tt> to the command line arguments. * This adds a new cmake option <tt>SANITIZE_<i>NAME</i></tt> to add * <tt>-fsanitize=<i>name</i></tt> to the command line arguments. 
 * <hr>
 *
 * @subsubsection build_system_project Register an external project
 * @@ -203,14 +206,14 @@
 * In addition to the standard build options there are several debugging options
 * that can be enabled.
 *
 * @subsubsection build_system_dev_options Build System Options
 * <table>
 * <caption id="build_system_user_cmake_dev_opts">Build options for developers.</caption>
 * <tr><th>Option <th>Description
 * <tr><td><tt>USE_PCH</tt> <td>Use precompiled headers during compilation. Most users should keep this on.
 * <tr><td><tt>SAVE_KERNEL_SOURCE</tt> <td>Option to dump the generated compute kernel source code to disk.
 * <tr><td><tt>USE_INPUT_CACHE</tt> <td>Option to cache registers for the kernel arguments.
 * <tr><td><tt>USE_CONSTANT_CACHE</tt> <td>Option to use registers to cache constant values otherwise constants are inlined.
 * <tr><td><tt>SHOW_USE_COUNT</tt> <td>Generates information on the number of times a register is used.
 * <tr><td><tt>USE_INDEX_CACHE</tt> <td>Option to use registers to cache array indices.
 * <tr><th colspan="2">Sanitizer Flags
graph_docs/general.dox +27 −26
 * @@ -10,9 +10,9 @@
 * <caption id="general_concepts_glossery">Glossary of terms</caption>
 * <tr><th>Concept <th>Definition
 * <tr><td><b>node</b> <td>A leaf or branch on the graph tree.
 * <tr><td><b>graph</b> <td>A data structure connecting nodes.
 * <tr><td><b>reduce</b> <td>A transformation of the graph to remove leaf_nodes.
 * <tr><td><b>auto differentiation</b><td>A transformation of the graph to build derivatives.
 * <tr><td><b>compiler</b> <td>A tool for translating from one language to another.
 * <tr><td><b>JIT</b> <td>Just-in-time compile.
 * <tr><td><b>kernel</b> <td>A code function that runs on a batch of data.
 * @@ -25,27 +25,27 @@
 * <tr><td><b>safe math</b> <td>Run time checks to avoid off normal conditions.
 * <tr><td><b>API</b> <td>Application programming interface.
 * <tr><td><b>Host</b> <td>The place where kernels are launched from.
 * <tr><td><b>Device</b> <td>The side where kernels are run.
 * </table>
 *
 * <hr>
 * @section general_concepts_graph Graph
 * The graph_framework operates by building a tree structure of math operations.
 * For an example of building expression structures see the
 * @ref tutorial_expression "basic expressions tutorial". In tree form it is
 * easy to traverse nodes in the graph. Take the example of the equation of a
 * line.
 * @f{equation}{y=mx + b@f}
 * This equation consists of five nodes. The ends of the tree are classified
 * as either variables @f$x@f$ or constants @f$m,b@f$. These nodes are connected
 * by nodes for multiply and addition operations. The output @f$y@f$ represents
 * the entire graph of operations.
 * @image{} html line_graph.png "The graph structure for y = mx + b."
 * Evaluation of graphs starts from the top most node, in this case the @f$+@f$
 * operation. Evaluation of a node is not performed until all subnodes are
 * evaluated, starting with the left operand. Evaluation starts by recursively
 * evaluating the left operands until the last node is reached @f$m@f$.
 * @image{} html line_graph_eval1.png ""
 * Once @f$m@f$ is evaluated, the result is returned to the @f$+@f$ and then the
 * right operand is evaluated.
 * @image{} html line_graph_eval2.png ""
 * Evaluation is repeated until every node in the graph is evaluated.
 * @@ -54,13 +54,13 @@
 * <hr>
 * @section general_concepts_diff Auto Differentiation
 * From the previous @ref general_concepts_graph "section", it was shown how
 * graphs can be evaluated. This same evaluation can be applied to build
 * graphs of a function derivative. For an example of taking derivatives see the
 * @ref tutorial_derivatives "auto differentiation tutorial". Let's say that we
 * want to compute the derivative @f$\frac{\partial y}{\partial x}@f$. This is
 * achieved by evaluating the graph until the bottom left most node is reached.
 * Then a new graph is built starting with @f$\frac{\partial m}{\partial x}=0@f$.
 * Applying the first half of the chain rule, we build a new graph for @f$0x@f$
 * @image{} html line_graph_dydf1.png ""
 * Then we take the derivative of the right operand and apply the second half
 * of the chain rule to build a new graph for @f$0x=0@f$.
 * @@ -72,17 +72,18 @@
 * @section general_concepts_reduction Reduction
 * The final expression for @f$\frac{\partial y}{\partial x}@f$ contains many
 * unnecessary nodes in the graph. Instead of building full graphs, we can
 * simplify and eliminate nodes as we build them. For instance, when the
 * expression @f$0x@f$ is created, it can be immediately reduced to a single
 * node.
 * @image{} html line_graph_reduce1.png ""
 * Applying all possible reductions reduces the final expression to
 * @f$\frac{\partial y}{\partial x}=m@f$.
 * @image{} html line_graph_reduce_final.png ""
 * By reducing graphs as they are built, we can eliminate nodes one by one.
 *
 * <hr>
 * @section general_concepts_compile Compile
 * Once graph expressions are built, they can be compiled to a compute kernel.
 * For an example of compiling expression trees into kernels see the
 * @ref tutorial_workflow "workflow tutorial".
 * Using the same recursive evaluation, we can visit each node of a graph and
 * @@ -92,9 +93,9 @@
 * be generated from multiple outputs and maps.
 *
 * @subsection general_concepts_compile_inputs Inputs
 * Inputs are the variable nodes that define the graph. In the line example
 * @f$\frac{\partial y}{\partial x}@f$, the input variable would be the node
 * for @f$x@f$. Some graphs have no inputs. The graph for
 * @f$\frac{\partial y}{\partial x}=m@f$ has eliminated all the variable nodes
 * in the graph.
 * @@ -106,8 +107,8 @@
 * are never stored.
 *
 * @subsection general_concepts_compile_maps Maps
 * Maps enable the results of an output node to be stored in an input node. This
 * is used for a wide variety of steps. For instance, take a gradient descent
 * step.
 * @f{equation}{y = y + \frac{\partial f}{\partial x}@f}
 * In this case the output of the expression
 * @f$y + \frac{\partial f}{\partial x}@f$
 * @@ -120,7 +121,7 @@
 *
 * <hr>
 * @section general_concepts_safe_math Safe Math
 * There are some conditions where, mathematically, a graph should evaluate to a
 * normal number. However, when evaluated using floating point precision, this
 * can lead to <tt>Inf</tt> or <tt>NaN</tt>. An example of this is the
 * @f$\exp\left(x\right)@f$ function. For large argument values, the result
 * overflows to <tt>Inf</tt>.
graph_docs/main.dox +7 −8
 * @@ -3,11 +3,10 @@
 * @tableofcontents
 * @section introduction Introduction
 * The <a href="https://github.com/ORNL-Fusion/graph_framework">graph_framework</a>
 * is a domain specific compiler for translating physics equations to optimized
 * code that runs on GPUs and CPUs. The domain specific aspect limits this to
 * classes of problems where the same physics is applied to an ensemble.
 * Examples include RF ray tracing, particle pushing, and field line following.
 *
 * @subsection purpose Purpose
 * The purpose of this framework is to enable domain scientists to write code
 * @@ -15,15 +14,15 @@
 *
 * This framework enables:
 * * Portability to Nvidia, AMD, and Apple GPUs and CPUs.
 * * Abstraction of the physics from the compute.
 * * Auto differentiation.
 * * Easy embedding in C, C++, and Fortran codes.
 *
 * <hr>
 * @section tools User guides for tools
 * @subsection rf_ray_tracing RF Ray tracing
 * This section covers user guides to run the RF ray tracing code. To run this
 * code, a user selects an equilibrium, a wave distribution function, a solver
 * method, initial ray conditions, and a power absorption model. To run an
 * example follow the instructions for the @ref xrays_commandline_example.
graph_docs/tutorial.dox +19 −18
/*!
 * @page tutorial Tutorial
 * @brief Hands on tutorial for building expressions and running workflows.
 * @tableofcontents
 *
 * @section tutorial_introduction Introduction
 * In this tutorial we will put the basic @ref general_concepts of the
 * graph_framework into action. This will discuss building trees, generating
 * kernels, and executing workflows.
 *
 * To accomplish this there is a playground tool in the <tt>graph_framework</tt>
 * directory. This playground is a preconfigured executable target which can be
 * @@ -27,8 +27,8 @@ int main(int argc, const char * argv[]) { } @endcode
 * To start, create a template function above main and call that function from
 * main. This will allow us to play with different floating point types. For now
 * we will start with a simple floating point type.
 * @code
#include "../graph_framework/jit.hpp"
 * @@ -47,9 +47,10 @@
int main(int argc, const char * argv[]) {
    END_GPU
}
 * @endcode
 * Here @ref jit::float_scalar is a
 * <a href="https://www.cppreference.com/w/cpp/20.html">C++20</a>
 * <a href="https://www.cppreference.com/w/cpp/concepts.html">Concept</a> for
 * valid floating point types allowed by the framework.
 *
 * <hr>
 * @section tutorial_basic Basic Nodes
 * @@ -97,7 +98,7 @@ void run_tutorial() { } @endcode
 * An explicit @ref graph::constant_node is created for <tt>m</tt> while an
 * implicit constant was defined for <tt>b</tt>.
 * Note that in the implicit case, the actual node for <tt>b</tt> is not created
 * until we use it in an expression.
 *
 * <hr>
 * @@ -147,19 +148,19 @@ void run_tutorial() { } @endcode
 * Here we take derivatives using the @ref graph::leaf_node::df method. We can
 * also take several variations of this.
 * @code
auto dydm = y->df(m);
auto dydy = y->df(y);
auto dydb = y->df(m*x);
 * @endcode
 * The results will be @f$x@f$, @f$1@f$, and @f$0@f$ respectively.
 *
 * <hr>
 * @section tutorial_workflow Making workflows
 * In this section we will build a workflow from the nodes we created. For
 * simplicity we will decrease the number of elements in the variable so we can
 * set the values more easily. The first thing we do is create a
 * @ref workflow::manager.
 * @code
template<jit::float_scalar T>
void run_tutorial() {
 * @@ -270,7 +271,7 @@ work.print(2, {x, y, dydx}); @endcode
 *
 * @subsection tutorial_workflow_iter Iteration
 * In this section we are going to make use of maps to iterate a variable. We
 * want to evaluate the value of @f$y@f$ and set it as the new value of @f$x@f$.
 * We do this by modifying the call to @ref workflow::manager::add_item to
 * define a map. This generates a kernel where, after @f$y@f$ is computed, it is
 * stored
 * @@ -306,11 +307,11 @@ for (size_t i = 0; i < 10; i++) { @endcode
 *
 * <hr>
 * @section tutorial_workflow_newton Newton's Method
 * In this tutorial we are going to show how we can put all these concepts
 * together to implement Newton's method. Newton's method is defined as
 * @f{equation}{x = x - \frac{f\left(x\right)}{\frac{\partial}{\partial x}f\left(x\right)}@f}
 * From the iteration example, its step update can be handled by a simple map.
 * However, we need a measure of convergence. To do that we output the value of
 * @f$f\left(x\right)@f$. Let's set up a test function.
 * @code
 * @@ -366,7 +367,7 @@ void run_tutorial() { @endcode
 *
 * However, there are some things that are not optimal here. We are performing
 * a reduction on the host side and transferring the entire array to the host.
 * To improve this we can use a converge item instead.
 * @code
// Create a workflow manager.