I found a cool feature, on-demand paging, in the user manual. This feature seems to be a good fit for my application. However, while evaluating its performance, I see some unexpected behavior. For example, if I issue RDMA operations from the same buffers for thousands of iterations, the HCA still reports some page faults, even though I prefetch these buffers at the beginning. That's why I was looking at the trace information, to see whether my card is OK. Do you have any idea why this is happening?