FK6D issues
https://code.ornl.gov/groups/FK6D/-/issues
2018-10-22T20:41:26Z

https://code.ornl.gov/FK6D/FK6D/-/issues/36
Fix write_output functions to work with single/double prec according to macro USE_FLOAT32 set in control.hpp
2018-10-22T20:41:26Z
McDaniel, Tyler

@3bm @dg6

https://code.ornl.gov/FK6D/FK6D/-/issues/35
Cuda failure /home/dg6/code/FK6D/include/GPU_misc.h:24: 11 'invalid argument'
2018-10-19T15:05:16Z
GREENDL1 email

```
dg6@fusiont5:~/code/FK6D/build$ ./fk6d -p 4 -l 7 -d 2 -n 100 -w 1 -c 20.0 -f -i
--run begin--
dim :2:
deg :2:
lev :7:
using implicit timestepping
dt :0.255254:
operator_two_scale :14: ms
operator_two_scale :13: ms
forwardMWT setup :12: ms
forwardMWT setup :12: ms
hash_table :20: ms
Connect2D :4227: ms
initial_condition_vector :0: ms
matrix_coeffs_independent :357: ms
global_matrix_SG :1870: ms
poisson_factor :7: ms
GPU DRAM capacity (MB) :12621.4:
GPU library :MAGMA:
Cuda failure /home/dg6/code/FK6D/include/GPU_misc.h:24: 11 'invalid argument'
```

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/34
Full-grid C++ doesn't match Full-grid matlab.
2018-10-19T16:43:35Z
GREENDL1 email

For Vlasov43
```
./fk6d -p 4 -l 3 -d 3 -n 369 -w 10 -f -c 0.1
```
matlab gives ...
![Screen_Shot_2018-10-18_at_11.34.11_AM](/uploads/209bc50a2e2426aa1c1ad2de33bd06fc/Screen_Shot_2018-10-18_at_11.34.11_AM.png)
but C++ gives ...
![Screen_Shot_2018-10-18_at_11.34.06_AM](/uploads/bf1d3529f073e76bf260b0bb7df5f35c/Screen_Shot_2018-10-18_at_11.34.06_AM.png)
Notes :
1. Looks fine for Vlasov8, so that means the problem is likely in the Poisson solve.
2. Vlasov4 (`-p 0`) C++ goes unstable after about 900 time steps. I'm running the matlab out to that now to check.
I may get to tracking this down today, but if not we can take a look tomorrow.

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/33
realspace conversion doesn't seem to match matlab conversion
2019-01-02T17:22:42Z
GREENDL1 email

This is a very brief go at this, but with 0c0eea0262df454622ba8fb39a523ab1e7fd2bc1 I've added Python plotting of the real space conversion. I run with the following ...
```
./fk6d -p 3 -l 5 -d 3 -n 11 -w 1 -v 1
```
then
```
python ../python/plot.py
```
with the resulting .png files going into `output/*`. However, they look mostly like ...
![f2d_11](/uploads/9bba8761c6b2ed53a8f90f27c1bb855a/f2d_11.png)
whereas I was expecting
![Screen_Shot_2018-10-15_at_11.14.35_AM](/uploads/ada09f0eef86769cc5c37a158901d05f/Screen_Shot_2018-10-15_at_11.14.35_AM.png)
Note that I have likely made a mistake. I just wanted to document how you can now produce figures, and what they should look like.
I'm not sure I got the t=0 subtraction correct either.

NVIDIA demo
Elwasif, Wael

https://code.ornl.gov/FK6D/FK6D/-/issues/32
Vlasov5 not matching matlab
2018-10-17T20:33:11Z
GREENDL1 email

Here is the output of running Vlasov5 (the case we want to use for the NVIDIA demo) from the matlab ...
On the reference branch run 11 time steps of the matlab (lev=5,deg=3,explicit) ...
```
fk6d(Vlasov5,5,3,.1)
```
![Screen_Shot_2018-10-15_at_11.09.10_AM](/uploads/26bb9b060b4829c1dfeb0db64a68b080/Screen_Shot_2018-10-15_at_11.09.10_AM.png)
And then let's run the C code (built with `cmake -DWITH_CUDA=1 -DCMAKE_CXX_COMPILER=mpiCC -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_FLAGS=-O3 ../` on fusiont5) with the following `include/control.hpp` ...
```
#pragma once
//these have no meaning if you run with -r reference switch - that's always on CPU
#define USE_GPU
//below must be undefined if you don't use GPU
#define USE_MAGMA
```
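The comment in this `control.hpp` says `USE_MAGMA` must be undefined when `USE_GPU` is not set. A minimal sketch of how that rule could be enforced at compile time, assuming nothing about the real header beyond the two macro names shown above; `backend_name` is a hypothetical helper added only for illustration:

```cpp
// Sketch only: a guarded variant of the control.hpp settings shown above.
#define USE_GPU
// USE_MAGMA must stay undefined if USE_GPU is undefined.
#define USE_MAGMA

// Turn the rule from the comment into a hard compile-time error.
#if defined(USE_MAGMA) && !defined(USE_GPU)
#error "USE_MAGMA requires USE_GPU: undefine USE_MAGMA for CPU-only builds"
#endif

#include <string>

// Hypothetical helper (not part of FK6D): reports which code path the
// macros above select, so a run log can record the configuration.
inline std::string backend_name() {
#if defined(USE_GPU) && defined(USE_MAGMA)
  return "GPU with MAGMA";
#elif defined(USE_GPU)
  return "GPU without MAGMA";
#else
  return "CPU";
#endif
}
```

With a guard like this, commenting out `USE_GPU` while leaving `USE_MAGMA` defined stops the build with a clear message instead of producing a mismatched configuration.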
First with the reference branch (shows an error that looks like a left-right offset) ...
```
./fk6d -p 3 -l 5 -d 3 -n 11 -w 1 -r
```
![Screen_Shot_2018-10-15_at_11.13.03_AM](/uploads/059eacf21c831cddbbf9c2a1ca191c84/Screen_Shot_2018-10-15_at_11.13.03_AM.png)
Second with the explicit optimized version (shows a rather different df)
```
./fk6d -p 3 -l 5 -d 3 -n 11 -w 1
```
![Screen_Shot_2018-10-15_at_11.14.35_AM](/uploads/8f2228e7bb17753ae3e8825fd79bbf2e/Screen_Shot_2018-10-15_at_11.14.35_AM.png)
Third with the implicit optimized version
```
./fk6d -p 3 -l 5 -d 3 -n 11 -w 1 -i
```
![Screen_Shot_2018-10-15_at_11.16.31_AM](/uploads/8b532cc3c367230a646e7938fc52c68b/Screen_Shot_2018-10-15_at_11.16.31_AM.png)
So, in summary: the implicit and explicit optimized versions agree. However, neither agrees with the reference implementation, which is itself close to, but not quite in agreement with, the matlab.
And let the bug chasing begin (again) ...

NVIDIA demo
McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/31
Implicit time advance disagrees with explicit time advance
2018-10-14T21:34:11Z
GREENDL1 email

On master, running with
Explicit
```
#define USE_GPU
#define USE_MAGMA
./fk6d -p 0 -l 5 -d 4 -n 100 -w 10
```
gives
![Screen_Shot_2018-10-10_at_11.02.54_AM](/uploads/12299cf4371512ae6e67a5c0c622ad8e/Screen_Shot_2018-10-10_at_11.02.54_AM.png)
Implicit
```
#define USE_GPU
#define USE_MAGMA
./fk6d -p 0 -l 5 -d 4 -n 100 -w 10 -i
```
gives
![Screen_Shot_2018-10-10_at_11.03.29_AM](/uploads/ac9270411d0919eb54bc4d0124f8a173/Screen_Shot_2018-10-10_at_11.03.29_AM.png)
Implicit
```
//#define USE_GPU
//#define USE_MAGMA
./fk6d -p 0 -l 5 -d 4 -n 100 -w 10 -i
```
gives
![Screen_Shot_2018-10-10_at_11.07.28_AM](/uploads/a2627e13fcfd446d1661554f86a8bb2b/Screen_Shot_2018-10-10_at_11.07.28_AM.png)
Implicit
```
#define USE_GPU
//#define USE_MAGMA
./fk6d -p 0 -l 5 -d 4 -n 100 -w 10 -i
```
gives
![Screen_Shot_2018-10-10_at_11.08.38_AM](/uploads/fe4fd913363789a1b51ca98501af453f/Screen_Shot_2018-10-10_at_11.08.38_AM.png)

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/30
missing scale_co and phi_co variables
2018-10-11T13:41:40Z
Elwasif, Wael

Matlab two_scale_rel_NN.mat files contain scale_co (deg x deg) and phi_co (2deg x deg), which do not exist in the C++ data files. They are needed to convert to real space.

GREENDL1 email

https://code.ornl.gov/FK6D/FK6D/-/issues/29
Poisson solve in master giving incorrect result
2018-10-10T19:32:38Z
GREENDL1 email

A long-time run of both the reference and optimized versions shows instabilities for Vlasov4. Upon investigation, it seems the electric field does not compare well with the matlab solution after the very first solve of the electric field. The b and DeltaX look to match, so I'm unsure where the problem lies.
I tried comparing the factors with those computed via matlab (`[L,U]=lu(DeltaX); factors=L+U;`), and they do seem different, but I'm not sure if that's a great comparison.
Just writing this down here so I don't forget the problem details.
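One caveat on that comparison, offered as a hedged note: MATLAB's two-output `lu` returns a row-permuted `L` with a unit diagonal, so `L+U` carries an extra 1 on each diagonal entry and bakes in the pivoting. LAPACK-style `getrf` routines instead return the factors packed as `(L - I) + U`, with the pivots kept separately. If the C++ factors use that packed layout (an assumption; nothing here says which routine `poisson_factor` calls), a closer MATLAB reference would be `[L,U,P]=lu(DeltaX); factors=L-eye(size(L))+U;`. The sketch below uses a small hand-written factorization, not the FK6D code, to show the packed layout:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Illustrative only: LU with partial pivoting, returning getrf-style packed
// factors (strict lower part = L without its unit diagonal, upper part = U).
// perm[i] is the index of the original row that ends up in row i.
Matrix packed_lu(Matrix a, std::vector<std::size_t>& perm) {
  const std::size_t n = a.size();
  perm.resize(n);
  for (std::size_t i = 0; i < n; ++i) perm[i] = i;
  for (std::size_t k = 0; k < n; ++k) {
    std::size_t p = k;  // pick the largest pivot in column k
    for (std::size_t i = k + 1; i < n; ++i)
      if (std::fabs(a[i][k]) > std::fabs(a[p][k])) p = i;
    std::swap(a[k], a[p]);
    std::swap(perm[k], perm[p]);
    for (std::size_t i = k + 1; i < n; ++i) {
      a[i][k] /= a[k][k];  // multiplier, stored where L(i,k) lives
      for (std::size_t j = k + 1; j < n; ++j) a[i][j] -= a[i][k] * a[k][j];
    }
  }
  return a;
}

// Rebuild (P*A)(i,j) = sum_k L(i,k) * U(k,j) from the packed factors.
double reconstruct(const Matrix& f, std::size_t i, std::size_t j) {
  double sum = 0.0;
  for (std::size_t k = 0; k <= std::min(i, j); ++k) {
    const double l = (k == i) ? 1.0 : f[i][k];  // implicit unit diagonal
    sum += l * f[k][j];
  }
  return sum;
}
```

For `A = [4 3; 6 3]` the packed factors come out as `[6 3; 2/3 1]` with the two rows swapped, whereas the two-output MATLAB form gives `L+U = [6.667 4; 1 1]`. The two can look quite different even when both factorizations are correct.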
@3bm

https://code.ornl.gov/FK6D/FK6D/-/issues/28
Reference implementation does not compile
2018-09-24T03:01:22Z
GREENDL1 email

I'm trying to build the reference implementation (so I can implement the implicit solve for Azzam). However, it appears that after the recent PDE overhaul, the reference implementation wasn't tested?
I've gotten past one compile error I couldn't figure out by removing the time() wrapper, but am now stuck on an error I don't understand, since I thought time_advance_ref wasn't a member function (even in the updated PDE implementation)
```
david@dlg-ubuntu-18:~/FK6D/build$ make
[ 15%] Built target implementation_lib
Scanning dependencies of target fk6d
[ 20%] Building CXX object CMakeFiles/fk6d.dir/src/main.cpp.o
In file included from /home/david/FK6D/src/main.cpp:39:0:
/home/david/FK6D/include/time_advance_ref.hpp: In instantiation of ‘matrix2d<T> runge_kutta_3(A_DATA<T>&, matrix2d<T>&, T, T, PDE<T>&, INVHASH&) [with T = double; INVHASH = std::vector<std::array<int, 6> >]’:
/home/david/FK6D/include/time_advance_ref.hpp:9:25: required from ‘matrix2d<T> time_advance_ref(A_DATA<T>&, matrix2d<T>&, T, T, PDE<T>&, INVHASH&) [with T = double; INVHASH = std::vector<std::array<int, 6> >]’
/home/david/FK6D/src/main.cpp:209:118: required from here
/home/david/FK6D/include/time_advance_ref.hpp:16:20: error: ‘unsigned int PDE<double>::deg’ is protected within this context
auto deg = pde.deg;
~~~~^~~
In file included from /home/david/FK6D/include/device_data.hpp:22:0,
from /home/david/FK6D/src/main.cpp:12:
/home/david/FK6D/include/pde.hpp:19:13: note: declared protected here
unsigned deg;
^~~
```

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/27
Replace "degSquared" everywhere in code with "deg^dimension"
2018-08-24T20:20:48Z
McDaniel, Tyler

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/26
Replace string based hashing
2018-08-24T20:21:19Z
McDaniel, Tyler

Replace the use of strings as keys in hash table.

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/25
Replace string based hashing
2018-08-24T20:20:02Z
McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/24
MAGMA - Fix batch count problem - also change to fixed size batch gemm
2018-10-22T20:24:54Z
McDaniel, Tyler

Leaving this as a reminder to myself.

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/23
How to create tests with different include files?
2018-09-25T15:02:25Z
GREENDL1 email

At present the way you specify the equation to be solved is by selecting a different `include/pde-vlasovX.hpp` file in `include/pde.hpp` at compile time.
Also, the present set of tests only runs successfully when using `include/pde-vlasov4.hpp`.
I'd like to add tests for `include/pde-vlasov7.hpp`, which would be whole code tests, i.e., you run the code for some lev, deg, and number of time steps, and it returns a single number (the error as compared with the analytic solution).
I see two difficulties here ...
1. How do I call the whole code in a test? (At present the "integration" test has most of `main.cpp` replicated in the test.)
2. How do I have tests for the two different pde cases where the selection must be done at compile time?
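One possible direction for both difficulties, sketched under assumptions rather than taken from the FK6D code: if the PDE were a template parameter of a callable driver instead of a header chosen in `include/pde.hpp`, a single test binary could instantiate several cases and check the one number each returns. `ToyDecay`, `ToyGrowth`, and `run_case` below are hypothetical stand-ins, with a forward-Euler loop on a scalar ODE in place of the real solver:

```cpp
#include <cmath>

// Hypothetical stand-ins for the pde-vlasovX.hpp variants: each supplies a
// right-hand side and an analytic solution to compare against.
struct ToyDecay {   // du/dt = -u, u(0) = 1
  static double rhs(double u) { return -u; }
  static double exact(double t) { return std::exp(-t); }
};

struct ToyGrowth {  // du/dt = u/2, u(0) = 1
  static double rhs(double u) { return 0.5 * u; }
  static double exact(double t) { return std::exp(0.5 * t); }
};

// Hypothetical whole-code entry point: take `steps` forward-Euler steps of
// size `dt` and return the absolute error against the analytic solution.
template <class Pde>
double run_case(int steps, double dt) {
  double u = Pde::exact(0.0);
  for (int n = 0; n < steps; ++n) u += dt * Pde::rhs(u);
  return std::fabs(u - Pde::exact(steps * dt));
}
```

A test could then call `run_case<ToyDecay>(1000, 1e-3)` and `run_case<ToyGrowth>(1000, 1e-3)` in the same executable and assert that each error stays below a tolerance, without replicating `main.cpp` or rebuilding with a different include.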
@bgl Any thoughts?
Thanks.

Lopez, Matthew Graham

https://code.ornl.gov/FK6D/FK6D/-/issues/22
Building on a system without CUDA
2018-08-15T15:35:16Z
GREENDL1 email

How do I build on a system without CUDA?
```
[fk6d@fk6d build]$ cmake -DCMAKE_CXX_COMPILER=mpicxx -DCMAKE_CUDA_COMPILER=mpicxx -DCMAKE_CXX_FLAGS=-g ../
-- The CXX compiler identification is GNU 8.1.1
-- The CUDA compiler identification is unknown
-- Check for working CXX compiler: /usr/lib64/openmpi/bin/mpicxx
-- Check for working CXX compiler: /usr/lib64/openmpi/bin/mpicxx -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Check for working CUDA compiler: /usr/lib64/openmpi/bin/mpicxx
-- Check for working CUDA compiler: /usr/lib64/openmpi/bin/mpicxx -- broken
CMake Error at /usr/share/cmake/Modules/CMakeTestCUDACompiler.cmake:46 (message):
The CUDA compiler
"/usr/lib64/openmpi/bin/mpicxx"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: /home/fk6d/code/FK6D/build/CMakeFiles/CMakeTmp
Run Build Command:"/usr/bin/gmake" "cmTC_22761/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_22761.dir/build.make CMakeFiles/cmTC_22761.dir/build
gmake[1]: Entering directory '/home/fk6d/code/FK6D/build/CMakeFiles/CMakeTmp'
Building CUDA object CMakeFiles/cmTC_22761.dir/main.cu.o
/usr/lib64/openmpi/bin/mpicxx -x cu -c /home/fk6d/code/FK6D/build/CMakeFiles/CMakeTmp/main.cu -o CMakeFiles/cmTC_22761.dir/main.cu.o
g++: error: language cu not recognized
g++: error: language cu not recognized
gmake[1]: *** [CMakeFiles/cmTC_22761.dir/build.make:66: CMakeFiles/cmTC_22761.dir/main.cu.o] Error 1
gmake[1]: Leaving directory '/home/fk6d/code/FK6D/build/CMakeFiles/CMakeTmp'
gmake: *** [Makefile:126: cmTC_22761/fast] Error 2
CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:10 (project)
-- Configuring incomplete, errors occurred!
See also "/home/fk6d/code/FK6D/build/CMakeFiles/CMakeOutput.log".
See also "/home/fk6d/code/FK6D/build/CMakeFiles/CMakeError.log".
```

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/21
Implement ProjCoef2Wav_v2
2018-10-10T19:34:00Z
GREENDL1 email

Implement `ProjCoef2Wav_v2` and `source_vector_2` from the reference branch of the Matlab to enable the Vlasov7 test case, which provides analytic and lev / deg independent testing.

https://code.ornl.gov/FK6D/FK6D/-/issues/20
Implement entire time advance loop on GPU.
2018-09-05T20:09:48Z
GREENDL1 email

McDaniel, Tyler

https://code.ornl.gov/FK6D/FK6D/-/issues/19
Merge non-string based hash into master.
2018-08-06T19:56:56Z
GREENDL1 email

https://code.ornl.gov/FK6D/FK6D/-/issues/18
Merge new Legendre routines into master to allow larger degree sizes for other branches.
2018-08-09T16:02:08Z
GREENDL1 email

https://code.ornl.gov/FK6D/FK6D/-/issues/17
Profile the setup routines for large problem sizes and determine where the time is being spent.
2018-10-10T19:32:55Z
GREENDL1 email