Commit 9cd9e44c authored by Lee, Seyong

Update

parent f14a3449
+16 −14
@@ -18,7 +18,7 @@ find more details on OpenARC.
ENVIRONMENT SETUP
-------------------------------------------------------------------------------
* Module Setting
-	- Load the following modules: java, cuda, pgi, and cudampi
+	- Load the following modules (java, cuda, pgi, and cudampi):
	//Run the commands below:
	$ module load java
	$ module load cuda
@@ -27,6 +27,9 @@ ENVIRONMENT SETUP

	//Or run the following command:
	$ source /home01/kedu01/shared/OpenARCSetup.bash
+	//The above command also sets the following environment variables:
+	$ export openarc=/scratch/${USER}/local/openarc
+	$ export openarcexamples=/scratch/${USER}/local/openarcexamples
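A quick way to confirm that the setup script took effect is to check that both variables are set and point at existing directories. The following is a minimal sketch, not part of OpenARC; `check_dir_var` is a hypothetical helper name introduced here for illustration only.

```shell
# Hypothetical helper (not part of OpenARC): report whether the variable
# named by $1 is set and names an existing directory.
check_dir_var() {
    name=$1
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$name=$value is not a directory"
        return 1
    fi
    echo "$name=$value"
}

# Check the two variables named in the setup notes above.
check_dir_var openarc || true
check_dir_var openarcexamples || true
```

A relative or mistyped path shows up here as "is not a directory", which is easier to diagnose than a failing `cd` later in the experiments.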

* OpenARC Environment Setting
	- Set the following environment variables (targeting NVIDIA CUDA GPUs):
@@ -91,7 +94,7 @@ EXPERIMENTS
	$ vi ./matmul/openarcConf.txt
	//Run O2GBuild.script script again.
	$ ./O2GBuild.script
-	//Check the generated output files
+	//Check the generated output files.
	$ vi ./cetus_output/openarc_kernel.cu
	$ vi ./cetus_output/matmul.cpp
	//Repeat the above steps by changing showInternalAnnotations up to 3.
@@ -102,7 +105,7 @@ EXPERIMENTS
	$ vi ./matmul/openarcConf.txt
	//Run O2GBuild.script script again.
	$ ./O2GBuild.script
-	//Check the generated output files
+	//Check the generated output files.
	$ vi ./cetus_output/openarc_kernel.cu
	$ vi ./cetus_output/matmul.cpp

@@ -112,7 +115,7 @@ EXPERIMENTS
	$ vi ./matmul/openarcConf.txt
	//Run O2GBuild.script script again.
	$ ./O2GBuild.script
-	//Check the generated output files
+	//Check the generated output files.
	$ vi ./cetus_output/matmul.cpp

	- Task4: learn how to use a commandline option to set default number of workers: defaultNumWorkers
@@ -121,17 +124,17 @@ EXPERIMENTS
	$ vi ./matmul/openarcConf.txt
	//Run O2GBuild.script script again.
	$ ./O2GBuild.script
-	//Check the generated output files
+	//Check the generated output files.
	$ vi ./cetus_output/openarc_kernel.cu
	$ vi ./cetus_output/matmul.cpp

	- Task5: learn how to use a commandline option to set the maximum number of gangs: maxNumGangs
	$ cd ${openarcexamples}/matmul
-	//Open the OpenARC configuration file and set maxNumGangs to 32 (e.g., maxNumGangs=32)
+	//Open the OpenARC configuration file and set maxNumGangs to 32 (e.g., maxNumGangs=32).
	$ vi ./matmul/openarcConf.txt
	//Run O2GBuild.script script again.
	$ ./O2GBuild.script
-	//Check the generated output files
+	//Check the generated output files.
	$ vi ./cetus_output/openarc_kernel.cu
	$ vi ./cetus_output/matmul.cpp
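
The tasks above repeat the same edit-and-rebuild cycle, so the configuration change can also be scripted instead of made in vi. The sketch below assumes that openarcConf.txt holds one option per line in the form key=value, as the maxNumGangs=32 example suggests; `set_conf_option` is a hypothetical helper, not an OpenARC tool, and the values in the scratch file are illustrative.

```shell
# Hypothetical helper (not an OpenARC tool): set key=value in a config file,
# assuming one option per line in the form key=value.
set_conf_option() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        # Rewrite the existing line via a temporary file (portable sed usage).
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        # The option is absent, so append it.
        printf '%s\n' "${key}=${value}" >> "$file"
    fi
}

# Demonstration on a scratch copy, so a real configuration stays untouched.
conf=$(mktemp)
printf '%s\n' 'defaultNumWorkers=64' 'maxNumGangs=16' > "$conf"
set_conf_option "$conf" maxNumGangs 32
cat "$conf"
rm -f "$conf"
```

If the real configuration file comments options out or uses a different syntax, the pattern in the helper would need to be adjusted before pointing it at ./matmul/openarcConf.txt.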

@@ -144,7 +147,7 @@ EXPERIMENTS
	//Check the generated output files
	$ vi ./cetus_output/openarc_kernel.cu
	$ vi ./cetus_output/matmul.cpp
-	//To compile and run the generated output programs, environment variable OPENARC_ARCH should be also set accordingly (e.g., OPENARC_ARCH=1)
+	//To compile and run the generated output programs, environment variable OPENARC_ARCH should be also set accordingly (e.g., OPENARC_ARCH=1).
	//If the commandline option targetArch is not used, environment variable OPENARC_ARCH will be used to decide the target architecture.

* Experiment3: learn how to use CUDA Unified Memory with OpenARC
@@ -152,15 +155,14 @@ EXPERIMENTS
		- To use Unified Memory, 1) host data should be allocated using OpenARC's Unified Memory APIs such as acc_create_unified(), and 
		2) environment variable OPENARCRT_UNIFIEDMEM should be set to 1 (e.g., OPENARCRT_UNIFIEDMEM=1).
		- To create an OpenACC program that works both with Unified Memory and without Unified Memory, write the OpenACC program as if targeting devices without Unified Memory, and use OpenARC's Unified Memory APIs for host memory allocation.
-			- If OPENARCRT_UNIFIEDMEM=0, OpenARC's Unified Memory APIs work as if using corresponding OpenACC data managment runtime APIs (acc_create_unified() ==> acc_create()).
+			- If OPENARCRT_UNIFIEDMEM=0, OpenARC's Unified Memory APIs work as if using corresponding OpenACC data management runtime APIs (acc_create_unified() ==> acc_create()).
			- If OPENARCRT_UNIFIEDMEM=1, OpenACC data clauses are ignored for the data allocated on Unified Memory.
-		- Using the OpenARC's Unified Memory APIs allows users to apply CUDA

	- Task2: compile and run an example program using OpenARC's Unified Memory APIs.
	$ cd ${openarcexamples}/unifiedmemory
	//Run O2GBuild.script script.
	$ ./O2GBuild.script
-	//Check the generated output files
+	//Check the generated output files.
	$ vi ./cetus_output/jacobi.cpp
	//Compile the generated output files.
	$ make
@@ -188,7 +190,7 @@ EXPERIMENTS

* Experiment5: learn how to use MPI with OpenARC
	- Task1: learn how to use MPI with OpenARC.
-		- To use MPI with OpenACC, 1) commandline option addIncludePath should be set to the MPI include path, it is not in the default include search path, and
+		- To use MPI with OpenACC, 1) commandline option addIncludePath should be set to the MPI include path, if it is not in the default include search path, and
		2) compile the OpenARC-generated output program using an MPI compiler (e.g., mpicxx).

	- Task2: compile and run jacobi_mpi example.
@@ -242,8 +244,8 @@ EXPERIMENTS
* Experiment8: learn how to use OpenARC's built-in interactive debugging tools
	- Task1: learn how to use OpenARC's built-in interactive debugging tools.
		- Use commandline option programVerification:
-			- programVerification=1 //verify the correctness of CPU-GPU memory transfers
-			- programVerification=2 //verify the correctness of GPU kernel translation
+			- programVerification=1 //verify the correctness of CPU-GPU memory transfers.
+			- programVerification=2 //verify the correctness of GPU kernel translation.
			- OpenARC offers various sub-options to control the interactive debugging tools.
				- verificationOptions, defaultMarginOfError, minValueToCheck, etc.