Commit d5e50c4f authored by Dmitry I. Lyakh

Updated tensor domain existence compatibility requirement



Signed-off-by: Dmitry I. Lyakh <quant4me@gmail.com>
parent 34a15d66
Pipeline #166056 failed with stage in 29 minutes and 11 seconds
/** ExaTN::Numerics: General client header (free function API)
REVISION: 2021/09/25
REVISION: 2021/09/27
Copyright (C) 2018-2021 Dmitry I. Lyakh (Liakh)
Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
......@@ -88,13 +88,20 @@ Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
MPI processes are aware of the existence of the created tensor. Note that the concrete
physical distribution of the tensor body among the MPI processes is hidden from the user
(either fully replicated or fully distributed or a mix of the two).
(c) All tensor operands of any non-unary tensor operation must have the same domain of existence,
otherwise the code is non-compliant, resulting in undefined behavior.
(c) A subset of MPI processes participating in a given tensor operation defines
its execution domain. The tensor operation execution domain must be compatible
with the existence domains of its tensor operands:
(1) The existence domains of all output tensor operands must be the same;
(2) The tensor operation execution domain must coincide with the existence
domains of all output tensor operands;
(3) The existence domain of each input tensor operand must include
the tensor operation execution domain AND each input tensor operand
must be fully available within the tensor operation execution domain.
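The compatibility rules (1)-(3) above can be sketched as a standalone check. This is a minimal illustration, not ExaTN code: the hypothetical `Domain` type models an existence domain as a sorted list of MPI ranks, and the "fully available" part of rule (3) is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in: an existence domain as a sorted list of MPI ranks.
using Domain = std::vector<unsigned int>;

// True if 'outer' contains every rank of 'inner' (both assumed sorted).
bool includesDomain(const Domain & outer, const Domain & inner)
{
  return std::includes(outer.begin(), outer.end(), inner.begin(), inner.end());
}

// Rules (1)-(3): the execution domain must coincide with the (shared) output
// domain and be included in the existence domain of every input operand.
bool domainsCompatible(const Domain & execution,
                       const Domain & output,
                       const std::vector<Domain> & inputs)
{
  if(execution != output) return false;              // rules (1)-(2)
  for(const auto & in : inputs)
    if(!includesDomain(in, execution)) return false; // rule (3)
  return true;
}
```

For example, executing on ranks {0,1} with an output tensor existing on {0,1} and an input tensor existing on {0,1,2,3} is compatible, while an input existing only on {2,3} is not.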
(d) By default, the tensor body is replicated across all MPI processes in its domain of existence.
The user also has an option to create a distributed tensor by specifying which dimensions of
this tensor to split into segments, thus inducing a block-wise decomposition of the tensor body.
Each tensor dimension chosen for splitting must be given its splitting depth, that is, the number
of recursive bisections applied to that dimension (a depth of D results in 2^D segments).
of recursive bisections applied to that dimension (a depth of D results in 2^D dimension segments).
As a consequence, the total number of tensor blocks will be a power of 2. Because of this,
the size of the domain of existence of the corresponding composite tensor must also be a power of 2.
In general, the user is also allowed to provide a Lambda predicate to select which tensor blocks
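The block count implied by note (d) can be written down directly. A small sketch (hypothetical helper name, not part of the ExaTN API): each split dimension with depth D contributes 2^D segments, so the total number of tensor blocks is the product of these powers of two, which is itself a power of 2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: given the splitting depth of every split dimension,
// return the total number of tensor blocks. Each depth D contributes a
// factor of 2^D, so the product is always a power of 2.
std::size_t numTensorBlocks(const std::vector<unsigned int> & split_depths)
{
  std::size_t blocks = 1;
  for(auto depth : split_depths) blocks <<= depth; // multiply by 2^depth
  return blocks;
}
```

For instance, splitting two dimensions with depths 1 and 2 yields 2 * 4 = 8 blocks, matching the requirement that the composite tensor's existence domain size also be a power of 2.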
......@@ -105,7 +112,7 @@ Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
(f) Tensor creation generally does not initialize a tensor to any value. Setting a tensor to some value
requires calling the tensor initialization operation.
(g) Any other unary tensor operation can be implemented as a tensor transformation operation with
a specific transformation functor.
a specific transformation functor (exatn::TensorMethod interface).
(h) Tensor addition is the main binary tensor operation which also implements tensor copy
when the output tensor operand is initialized to zero.
(i) Tensor contraction and tensor decomposition are the main ternary tensor operations, being
......@@ -252,9 +259,12 @@ inline bool withinTensorExistenceDomain(Args&&... tensor_names) //in: tensor nam
{return numericalServer->withinTensorExistenceDomain(std::forward<Args>(tensor_names)...);}
/** Returns the process group associated with the given tensors.
The calling process must be within the tensor existence domain,
which must be the same for all tensors. **/
/** Returns the process group associated with the given tensors, that is,
the overlap of existence domains of the given tensors. Note that the
existence domains of the given tensors must be properly nested,
tensorA <= tensorB <= tensorC <= ... <= tensorZ,
otherwise the code will result in undefined behavior. As a useful
rule, always place output tensors in front of input tensors. **/
template <typename... Args>
inline const ProcessGroup & getTensorProcessGroup(Args&&... tensor_names) //in: tensor names
{return numericalServer->getTensorProcessGroup(std::forward<Args>(tensor_names)...);}
......
/** ExaTN::Numerics: Numerical server
REVISION: 2021/09/25
REVISION: 2021/09/27
Copyright (C) 2018-2021 Dmitry I. Lyakh (Liakh)
Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
......@@ -1091,7 +1091,7 @@ const ProcessGroup & NumServer::getTensorProcessGroup(const std::string & tensor
bool exists = withinTensorExistenceDomain(tensor_name);
if(!exists){
std::cout << "#ERROR(exatn::getTensorProcessGroup): Process " << getProcessRank()
<< " is not within the existence domain of tensor " << tensor_name << std::endl << std::flush;
<< " is not within the existence domain of tensor " << tensor_name << std::endl;
assert(false);
}
auto iter = tensor_comms_.find(tensor_name);
......@@ -1168,7 +1168,7 @@ bool NumServer::createTensor(const ProcessGroup & process_group,
std::dynamic_pointer_cast<numerics::TensorOpCreate>(op)->resetTensorElementType(element_type);
submitted = submit(op,tensor_mapper);
if(submitted){
if(process_group != getDefaultProcessGroup()){
if(!(process_group == getDefaultProcessGroup())){
auto saved = tensor_comms_.emplace(std::make_pair(tensor->getName(),process_group));
assert(saved.second);
}
......@@ -1220,7 +1220,7 @@ bool NumServer::createTensorSync(const ProcessGroup & process_group,
std::dynamic_pointer_cast<numerics::TensorOpCreate>(op)->resetTensorElementType(element_type);
submitted = submit(op,tensor_mapper);
if(submitted){
if(process_group != getDefaultProcessGroup()){
if(!(process_group == getDefaultProcessGroup())){
auto saved = tensor_comms_.emplace(std::make_pair(tensor->getName(),process_group));
assert(saved.second);
}
......@@ -1660,6 +1660,11 @@ bool NumServer::replicateTensor(const ProcessGroup & process_group, const std::s
int byte_packet_len = 0;
if(local_rank == root_process_rank){
if(iter != tensors_.end()){
if(iter->second->isComposite()){
std::cout << "#ERROR(exatn::NumServer::replicateTensor): Tensor " << name
<< " is composite, replication not allowed!" << std::endl << std::flush;
return false;
}
iter->second->pack(byte_packet_);
byte_packet_len = static_cast<int>(byte_packet_.size_bytes); assert(byte_packet_len > 0);
}else{
......@@ -1686,6 +1691,12 @@ bool NumServer::replicateTensor(const ProcessGroup & process_group, const std::s
auto submitted = submit(op,tensor_mapper);
if(submitted) submitted = sync(*op);
assert(submitted);
}else{
auto num_deleted = tensor_comms_.erase(name);
}
if(!(process_group == getDefaultProcessGroup())){
auto saved = tensor_comms_.emplace(std::make_pair(name,process_group));
assert(saved.second);
}
clearBytePacket(&byte_packet_);
//Broadcast the tensor body:
......@@ -1705,6 +1716,11 @@ bool NumServer::replicateTensorSync(const ProcessGroup & process_group, const st
int byte_packet_len = 0;
if(local_rank == root_process_rank){
if(iter != tensors_.end()){
if(iter->second->isComposite()){
std::cout << "#ERROR(exatn::NumServer::replicateTensorSync): Tensor " << name
<< " is composite, replication not allowed!" << std::endl << std::flush;
return false;
}
iter->second->pack(byte_packet_);
byte_packet_len = static_cast<int>(byte_packet_.size_bytes); assert(byte_packet_len > 0);
}else{
......@@ -1731,6 +1747,12 @@ bool NumServer::replicateTensorSync(const ProcessGroup & process_group, const st
auto submitted = submit(op,tensor_mapper);
if(submitted) submitted = sync(*op);
assert(submitted);
}else{
auto num_deleted = tensor_comms_.erase(name);
}
if(!(process_group == getDefaultProcessGroup())){
auto saved = tensor_comms_.emplace(std::make_pair(name,process_group));
assert(saved.second);
}
clearBytePacket(&byte_packet_);
//Broadcast the tensor body:
......@@ -1895,7 +1917,7 @@ bool NumServer::extractTensorSlice(const std::string & tensor_name,
iter = tensors_.find(slice_name);
if(iter != tensors_.end()){
auto tensor1 = iter->second;
auto tensor_mapper = getTensorMapper(getTensorProcessGroup(tensor_name,slice_name));
auto tensor_mapper = getTensorMapper(getTensorProcessGroup(slice_name,tensor_name));
std::shared_ptr<TensorOperation> op = tensor_op_factory_->createTensorOp(TensorOpCode::SLICE);
op->setTensorOperand(tensor1);
op->setTensorOperand(tensor0);
......@@ -1921,7 +1943,7 @@ bool NumServer::extractTensorSliceSync(const std::string & tensor_name,
iter = tensors_.find(slice_name);
if(iter != tensors_.end()){
auto tensor1 = iter->second;
const auto & process_group = getTensorProcessGroup(tensor_name,slice_name);
const auto & process_group = getTensorProcessGroup(slice_name,tensor_name);
auto tensor_mapper = getTensorMapper(process_group);
std::shared_ptr<TensorOperation> op = tensor_op_factory_->createTensorOp(TensorOpCode::SLICE);
op->setTensorOperand(tensor1);
......
/** ExaTN::Numerics: Numerical server
REVISION: 2021/09/25
REVISION: 2021/09/27
Copyright (C) 2018-2021 Dmitry I. Lyakh (Liakh)
Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
......@@ -488,15 +488,22 @@ public:
bool withinTensorExistenceDomain(const std::string & tensor_name) const; //in: tensor name
/** Returns the process group associated with the given tensors.
The calling process must be within the tensor existence domain,
which must be the same for all tensors. **/
/** Returns the process group associated with the given tensors, that is,
the overlap of existence domains of the given tensors. Note that the
existence domains of the given tensors must be properly nested,
tensorA <= tensorB <= tensorC <= ... <= tensorZ,
otherwise the code will result in undefined behavior. As a useful
rule, always place output tensors in front of input tensors. **/
template <typename... Args>
const ProcessGroup & getTensorProcessGroup(const std::string & tensor_name, Args&&... tensor_names) const //in: tensor names
{
const auto & tensor_domain = getTensorProcessGroup(tensor_name);
const auto & other_tensors_domain = getTensorProcessGroup(std::forward<Args>(tensor_names)...);
assert(other_tensors_domain == tensor_domain);
if(!tensor_domain.isContainedIn(other_tensors_domain)){
std::cout << "#ERROR(exatn::getTensorProcessGroup): Tensor operand existence domains must be properly nested: "
<< "Tensor " << tensor_name << " violates this requirement!" << std::endl;
assert(false);
};
return tensor_domain;
}
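The variadic overload above peels off one tensor name per recursion step and checks that its domain is contained in the result for the remaining names, so a properly nested chain returns the smallest (front) domain. A self-contained sketch of that recursion pattern, with a hypothetical `Domain` type standing in for `ProcessGroup`:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for ProcessGroup: a sorted list of MPI ranks.
using Domain = std::vector<unsigned int>;

bool contained(const Domain & inner, const Domain & outer)
{
  return std::includes(outer.begin(), outer.end(), inner.begin(), inner.end());
}

// Base case: a single domain is trivially nested.
const Domain & smallestNested(const Domain & d) {return d;}

// Recursive case, mirroring the variadic overload: the head domain must be
// contained in the (recursively checked) tail; the head, being the smallest,
// is the overlap of the whole chain.
template <typename... Rest>
const Domain & smallestNested(const Domain & head, const Rest &... rest)
{
  const Domain & tail = smallestNested(rest...);
  assert(contained(head, tail)); // properly nested, otherwise non-compliant
  return head;
}
```

Placing output tensors first therefore hands back their (smallest) existence domain, which is exactly the required execution domain of the operation.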
......@@ -1042,7 +1049,7 @@ bool NumServer::createTensor(const ProcessGroup & process_group,
std::dynamic_pointer_cast<numerics::TensorOpCreate>(op)->resetTensorElementType(element_type);
submitted = submit(op,getTensorMapper(process_group));
if(submitted){
if(process_group != getDefaultProcessGroup()){
if(!(process_group == getDefaultProcessGroup())){
auto saved = tensor_comms_.emplace(std::make_pair(name,process_group));
assert(saved.second);
}
......@@ -1067,7 +1074,7 @@ bool NumServer::createTensorSync(const ProcessGroup & process_group,
std::dynamic_pointer_cast<numerics::TensorOpCreate>(op)->resetTensorElementType(element_type);
submitted = submit(op,getTensorMapper(process_group));
if(submitted){
if(process_group != getDefaultProcessGroup()){
if(!(process_group == getDefaultProcessGroup())){
auto saved = tensor_comms_.emplace(std::make_pair(name,process_group));
assert(saved.second);
}
......
......@@ -47,7 +47,7 @@
#define EXATN_TEST26
//#define EXATN_TEST27 //requires input file from source
//#define EXATN_TEST28 //requires input file from source
//#define EXATN_TEST30
#define EXATN_TEST30
#ifdef EXATN_TEST0
......
/** ExaTN: MPI Communicator Proxy & Process group
REVISION: 2021/07/13
REVISION: 2021/09/26
Copyright (C) 2018-2021 Dmitry I. Lyakh (Liakh)
Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
......@@ -13,9 +13,16 @@ Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
#include <cstdlib>
#include <iostream>
#include <algorithm>
namespace exatn {
//Temporary buffers:
constexpr std::size_t MAX_NUM_MPI_PROCESSES = 65536;
std::vector<unsigned int> processes1;
std::vector<unsigned int> processes2;
MPICommProxy::~MPICommProxy()
{
#ifdef MPI_ENABLED
......@@ -52,6 +59,44 @@ bool MPICommProxy::operator==(const MPICommProxy & another) const
}
bool ProcessGroup::isCongruentTo(const ProcessGroup & another) const
{
bool is_congruent = (*this == another);
if(!is_congruent){
is_congruent = (this->process_ranks_.size() == another.process_ranks_.size());
if(is_congruent && this->process_ranks_.size() > 0){
processes1.reserve(MAX_NUM_MPI_PROCESSES);
processes2.reserve(MAX_NUM_MPI_PROCESSES);
processes1 = this->process_ranks_;
std::sort(processes1.begin(),processes1.end());
processes2 = another.process_ranks_;
std::sort(processes2.begin(),processes2.end());
is_congruent = (processes1 == processes2);
}
}
return is_congruent;
}
bool ProcessGroup::isContainedIn(const ProcessGroup & another) const
{
bool is_contained = (*this == another);
if(!is_contained){
is_contained = (this->process_ranks_.size() <= another.process_ranks_.size());
if(is_contained){
processes1.reserve(MAX_NUM_MPI_PROCESSES);
processes2.reserve(MAX_NUM_MPI_PROCESSES);
processes1 = this->process_ranks_;
std::sort(processes1.begin(),processes1.end());
processes2 = another.process_ranks_;
std::sort(processes2.begin(),processes2.end());
is_contained = std::includes(processes2.begin(),processes2.end(),processes1.begin(),processes1.end());
}
}
return is_contained;
}
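The two predicates differ only in the final comparison: congruence demands the same rank multiset, containment only a subset relation. A standalone sketch of that distinction (free functions over plain rank vectors, not the `ProcessGroup` members themselves):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Congruence: the same processes regardless of rank order.
bool congruentRanks(std::vector<unsigned int> a, std::vector<unsigned int> b)
{
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());
  return a == b;
}

// Containment: every rank of 'a' appears in 'b' (subset test on sorted ranges).
bool containedRanks(std::vector<unsigned int> a, std::vector<unsigned int> b)
{
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());
  return std::includes(b.begin(), b.end(), a.begin(), a.end());
}
```

Note that `std::includes` requires both ranges to be sorted, which is why both implementations sort copies of the rank lists first.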
std::shared_ptr<ProcessGroup> ProcessGroup::split(int my_subgroup) const
{
std::shared_ptr<ProcessGroup> subgroup(nullptr);
......
/** ExaTN: MPI Communicator Proxy & Process group
REVISION: 2021/07/13
REVISION: 2021/09/27
Copyright (C) 2018-2021 Dmitry I. Lyakh (Liakh)
Copyright (C) 2018-2021 Oak Ridge National Laboratory (UT-Battelle) **/
......@@ -86,15 +86,22 @@ public:
ProcessGroup & operator=(ProcessGroup &&) noexcept = default;
~ProcessGroup() = default;
/** Checks whether the process group has the same
intra-communicator as another process group,
hence both process groups being fully identical. **/
inline bool operator==(const ProcessGroup & another) const
{
return (intra_comm_ == another.intra_comm_);
}
inline bool operator!=(const ProcessGroup & another) const
{
return !(*this == another);
}
/** Returns TRUE if the process group is composed
of the same processes as another process group,
with no regard to their intra-communicators. **/
bool isCongruentTo(const ProcessGroup & another) const;
/** Returns TRUE if the process group is contained in
or congruent to another process group. **/
bool isContainedIn(const ProcessGroup & another) const;
/** Returns the size of the process group (number of MPI processes). **/
unsigned int getSize() const {return process_ranks_.size();}
......