I believe you may need to contact gpudirect@nvidia.com for help with the application / CUDA implementation.
There is useful material here to get you started on the basics of GPUDirect RDMA, which implements RDMA communication between GPU devices over InfiniBand or RoCE.
You might also find useful implementations and methods in Open MPI 1.7.4 or later, which supports CUDA / GPUDirect RDMA.