The performance of GPU-based algorithms can be reduced significantly by contention among memory accesses and by locking. We focus on high-volume output in GPU-based algorithms for streaming query processing: a very large number of cores process input streams and simultaneously produce a sustained output stream whose volume is sometimes orders of magnitude larger than that of the input streams. In this context, several cores can simultaneously produce results that must be written to the output buffer in some order and without conflicting with other writers. To enable this behavior, we propose a wait-free bitmap-based data structure and a usage pattern that together obviate the use of locks and atomic operations. In our experiments, where the GPU-based algorithm considered is otherwise unchanged, introducing the new wait-free data structure yields a performance improvement of one order of magnitude.
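The abstract does not spell out the layout of the data structure, so the following is only a minimal CPU-side sketch of one way a bitmap can remove the need for locks and atomics. It assumes that each worker exclusively owns a fixed region of the output buffer together with one bitmap word. The names `SLOTS_PER_WORKER`, `produce`, `compact`, and `run` are hypothetical and do not come from the paper.

```python
from threading import Thread

# Assumption: each worker owns exactly one bitmap word covering 8 output slots.
SLOTS_PER_WORKER = 8


def produce(worker_id, values, out, bitmap):
    """Write results into the region owned by worker_id and flag the used slots."""
    base = worker_id * SLOTS_PER_WORKER
    word = 0
    for i, value in enumerate(values):
        out[base + i] = value   # no other worker ever writes to this slot
        word |= 1 << i          # mark slot i of this worker as valid
    bitmap[worker_id] = word    # single writer per word, so no lock is needed


def compact(out, bitmap):
    """Collect the valid slots in worker order, then slot order."""
    result = []
    for worker_id, word in enumerate(bitmap):
        base = worker_id * SLOTS_PER_WORKER
        for i in range(SLOTS_PER_WORKER):
            if word & (1 << i):
                result.append(out[base + i])
    return result


def run(per_worker_results):
    """Run one producer thread per worker, then compact the shared buffer."""
    for values in per_worker_results:
        if len(values) > SLOTS_PER_WORKER:
            raise ValueError("a worker produced more results than it has slots")
    n = len(per_worker_results)
    out = [None] * (n * SLOTS_PER_WORKER)
    bitmap = [0] * n
    threads = [
        Thread(target=produce, args=(w, values, out, bitmap))
        for w, values in enumerate(per_worker_results)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                # all writes are complete before compaction reads
    return compact(out, bitmap)
```

For example, `run([[1, 2], [], [3]])` gathers the results of three workers in worker order. Because every slot and every bitmap word has exactly one writer, no worker waits on another; the trade-off in this sketch is that the buffer must be sized for the worst-case output of each worker.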
|Title:||A wait-free output data structure for GPU-based streaming query processing|
|Publication date:||2015|
|Appears in categories:||4.1 Conference proceedings paper|