Publication

Efficient SIMD Vectorization for Hashing in OpenCL

Tobias Behrens; Viktor Rosenfeld; Jonas Traub; Sebastian Breß; Volker Markl
In: Michael Böhlen; Reinhard Pichler; Norman May; Erhard Rahm; Shan-Hung Wu; Katja Hose (Eds.). Advances in Database Technology — EDBT 2018. International Conference on Extending Database Technology (EDBT-2018), 21st International Conference on Extending Database Technology, March 26-29, Vienna, Austria, Pages 489-492, ISBN 978-3-89318-078-3, OpenProceedings, Konstanz, Germany, 2018.

Abstract

Hashing is at the core of many efficient database operators, such as hash-based joins and aggregations. Vectorization is a technique that uses Single Instruction Multiple Data (SIMD) instructions to process multiple data elements at once. Applying vectorization to hash tables results in promising speedups for build and probe operations. However, vectorization typically requires intrinsics – low-level APIs in which functions map to processor-specific SIMD instructions. Intrinsics are specific to a processor architecture and result in complex, difficult-to-maintain code. OpenCL is a parallel programming framework that provides a higher abstraction level than intrinsics and is portable across different processors. Thus, OpenCL avoids processor dependencies, which results in improved code maintainability. In this paper, we add efficient, vectorized hashing primitives to OpenCL. Our results show that OpenCL-based vectorization is competitive with intrinsics on CPUs but not on Xeon Phi coprocessors.
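
To make the idea of processing multiple keys at once concrete, the following is a minimal sketch in portable C, not the paper's OpenCL primitives. It assumes a multiplicative hash with the constant 2654435761 and a lane width of 8; the names `LANES`, `hash_scalar`, and `hash_lanes` are hypothetical and chosen only for illustration. The inner loop has a fixed trip count and no cross-lane dependencies, which is the shape a compiler may turn into SIMD instructions without intrinsics.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed lane count: 8 x 32-bit keys, e.g. one 256-bit vector register. */
#define LANES 8

/* Scalar baseline: multiplicative hash of one 32-bit key.
 * Keeps the top table_bits bits; requires 1 <= table_bits <= 32. */
uint32_t hash_scalar(uint32_t key, unsigned table_bits) {
    return (key * 2654435761u) >> (32u - table_bits);
}

/* Lane-wise variant: applies the same hash to LANES keys per outer
 * iteration. Each lane is independent, so the inner loop is a candidate
 * for auto-vectorization; leftover keys fall back to the scalar path. */
void hash_lanes(const uint32_t *keys, uint32_t *buckets, size_t n,
                unsigned table_bits) {
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            buckets[i + lane] =
                (keys[i + lane] * 2654435761u) >> (32u - table_bits);
        }
    }
    for (; i < n; ++i) {
        buckets[i] = hash_scalar(keys[i], table_bits);
    }
}
```

Calling `hash_lanes(keys, buckets, n, 10)` fills `buckets` with indices into a 1024-slot table, and every entry matches what `hash_scalar` returns for the same key. Whether the inner loop is actually vectorized depends on the compiler and target, which is the portability trade-off the abstract describes.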
