8.
mKernel: fused compute and communication kernels for multi-node GPUs (t.co)
mKernel fuses compute and communication into persistent GPU kernels that span intra- and inter-node systems, targeting a real bottleneck in distributed training.
1 appearance on the backlist front page in the last 30 days.