Do not assume CUDA reduce operator is unary
The `Reduce` algorithm is sometimes used to convert an input type to a different output type. For example, you can compute the min and max at the same time by making the output of the binary functor a pair of the input type. However, for this to work with the CUDA algorithm, the input type must also be convertible to the output type. Previously, this conversion was done by treating the binary operator as if it were also a unary operator. That's fine for custom operators, but something like `thrust::plus` has no unary operation. (Why would it?) So, detect whether the operator has a unary operation. If it does, use it to convert values read from the input portal to the output type. If it does not, just use `static_cast`. Thus, the operator needs a unary operation only when `static_cast` does not work.
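
The detection described above could be sketched roughly as follows. This is a host-side C++17 illustration under stated assumptions, not the code used in the CUDA device adapter: `CastForReduce` and `MinMax` are hypothetical names, and `std::plus` stands in for `thrust::plus` so the example compiles without Thrust.

```cpp
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical helper (not the library's actual name): converts one input
// value to the output type of the reduction.
template <typename OutT, typename Op, typename InT>
OutT CastForReduce(const Op& op, const InT& value)
{
  if constexpr (std::is_invocable_v<const Op&, const InT&>)
  {
    // The operator can be called with a single argument, so let it
    // perform the conversion.
    return static_cast<OutT>(op(value));
  }
  else
  {
    // No unary call is available (e.g. std::plus): fall back to static_cast.
    return static_cast<OutT>(value);
  }
}

// Hypothetical custom operator that computes min and max together.
// static_cast cannot turn an int into a pair, so it supplies a unary overload.
struct MinMax
{
  using Pair = std::pair<int, int>;

  // Unary form: a single value is both the min and the max.
  Pair operator()(int v) const { return Pair(v, v); }

  // Binary form: combine two partial (min, max) results.
  Pair operator()(const Pair& a, const Pair& b) const
  {
    return Pair(a.first < b.first ? a.first : b.first,
                a.second > b.second ? a.second : b.second);
  }
};
```

With this sketch, `CastForReduce<double>(std::plus<double>{}, 3)` takes the `static_cast` branch, while `CastForReduce<MinMax::Pair>(MinMax{}, 7)` goes through the operator's unary overload.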