Database Acceleration




doppioDB is a hardware-accelerated database. It extends MonetDB with hardware user-defined functions (HUDFs) to integrate hardware operators seamlessly into the database engine. doppioDB targets hybrid multicore architectures such as Intel's Xeon+FPGA platform or IBM's Power8 with CAPI, in which the accelerator (FPGA) has direct access to main memory. On traditional accelerators, data movement to and from the accelerator is explicit and often involves reformatting the data to suit the accelerator's execution model. In doppioDB, by contrast, the accelerator is integrated as a specialized co-processor that operates on the same data as the database engine. doppioDB relies on Centaur to bridge the gap between MonetDB on the CPU side and the hardware operators implemented on the FPGA. Centaur provides a software API that the HUDFs use to create and monitor jobs on the FPGA.

The figure on the right shows the system with MonetDB, Centaur and the FPGA operators. Apart from the bootstrap process (through Intel's AAL), all control and data communication between CPU and FPGA occurs through shared memory data structures.

Operators that can currently be accelerated on the FPGA include LIKE, REGEXP_LIKE, SKYLINE, and SGD (stochastic gradient descent), among others.

doppioDB is open source.



Centaur is a framework for developing applications on CPU-FPGA shared-memory platforms, bridging the gap between application software and accelerators on the FPGA. Centaur was first deployed in doppioDB to accelerate database operators, but it can be used with any other application. The main objectives of Centaur are to (1) provide software abstractions that hide the complexities of invoking FPGA accelerators from software, and (2) facilitate concurrent access to FPGA accelerators and their sharing among multiple users.

Centaur exposes an FPGA accelerator as a thread called an FThread, which can be created and joined in a similar fashion to C++ threads. In addition, Centaur's software abstractions allow multiple FThreads to be combined dynamically into a pipeline on the FPGA using on-chip FIFOs. Centaur also provides data structures for pipelining an FThread with a software function running on the CPU through shared memory. On the FPGA, Centaur implements a thin hardware layer that handles the translation of the FThread abstraction and manages the sharing of on-chip resources among multiple users.
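To make the create-and-join usage pattern concrete, the following is a minimal sketch, not Centaur's actual API. The class `FThreadSketch`, the `Operator` type, and the function `count_greater` are hypothetical names introduced only for illustration, and the "accelerator" here is an ordinary CPU thread built on `std::thread`.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an FPGA operator: here it is just a CPU callable.
using Operator = std::function<void()>;

// Hypothetical FThread-like handle (an assumption, not Centaur's real class).
// It only mimics the "create, then join" usage pattern by wrapping std::thread.
class FThreadSketch {
public:
    // Creating the handle starts the job immediately, like std::thread.
    explicit FThreadSketch(Operator op) : worker_(std::move(op)) {}

    // Block until the job has finished; safe to call more than once.
    void join() {
        if (worker_.joinable()) worker_.join();
    }

    ~FThreadSketch() { join(); }

    FThreadSketch(const FThreadSketch&) = delete;
    FThreadSketch& operator=(const FThreadSketch&) = delete;

private:
    std::thread worker_;
};

// Example job: count the values in a column that exceed a threshold.
// The column and the result both live in memory shared by caller and job.
std::size_t count_greater(const std::vector<std::int32_t>& column,
                          std::int32_t threshold) {
    std::size_t matches = 0;
    FThreadSketch job([&] {
        for (std::int32_t value : column) {
            if (value > threshold) ++matches;
        }
    });
    job.join();  // the result is only read after the job has completed
    return matches;
}
```

In this sketch the caller and the job communicate only through memory they both can see, which loosely mirrors the shared-memory model described above; scheduling, on-chip FIFOs, and resource sharing are not modeled.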

Centaur decouples the development of accelerators from the development of applications through an operator library model. An accelerator developer exposes an accelerator as an operator, describes it using Centaur's operator model, and adds it to the operator library. An application developer can then use operators freely through Centaur's FThread abstraction and the corresponding operator descriptor in the library. This also makes applications written with Centaur portable to other platforms, as long as Centaur supports them.

Centaur is open source.




Data partitioning is often used in databases to improve the data access patterns of query execution engines. One prominent example is the radix join algorithm, which partitions large input tables into small, cache-fitting parts so that a classic hash join can be performed with cache-fitting hash tables. The initial partitioning phase improves overall join performance significantly, especially if the input tables are large.
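As an illustration of the partitioning phase, here is a minimal single-pass software sketch of radix partitioning on the low-order bits of the key. It is a generic CPU version under simplifying assumptions (32-bit keys only, no payloads), not the doppioDB FPGA operator; the names `Partitioned` and `radix_partition` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Partitioned {
    std::vector<std::uint32_t> keys;   // keys reordered so each partition is contiguous
    std::vector<std::size_t> offsets;  // partition p occupies [offsets[p], offsets[p + 1])
};

// Partition `input` into 2^radix_bits partitions using the low-order key bits.
Partitioned radix_partition(const std::vector<std::uint32_t>& input,
                            unsigned radix_bits) {
    const std::size_t fanout = std::size_t{1} << radix_bits;
    const std::uint32_t mask = static_cast<std::uint32_t>(fanout - 1);

    // Pass 1: histogram of partition sizes, stored shifted by one slot.
    std::vector<std::size_t> offsets(fanout + 1, 0);
    for (std::uint32_t key : input) {
        ++offsets[(key & mask) + 1];
    }

    // Prefix sum turns the sizes into partition start offsets.
    for (std::size_t p = 0; p < fanout; ++p) {
        offsets[p + 1] += offsets[p];
    }

    // Pass 2: scatter each key to the next free slot of its partition.
    std::vector<std::size_t> cursor(offsets.begin(), offsets.end() - 1);
    std::vector<std::uint32_t> keys(input.size());
    for (std::uint32_t key : input) {
        keys[cursor[key & mask]++] = key;
    }

    return {std::move(keys), std::move(offsets)};
}
```

The scatter in the second pass writes to as many different memory locations as there are partitions, which is the random-access pattern that makes large fanouts expensive on a CPU.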

With the partitioning operator in doppioDB, our goal is to perform high-performance, large-fanout partitioning using an FPGA. FPGAs can perform partitioning efficiently, because (1) distributed on-chip memory (block RAMs) can be used to do specialized caching to improve random-access behavior, (2) partitioning can be implemented as a deep dataflow, enabling pipeline parallelism to improve throughput, (3) spatial parallelism on the FPGA can be exploited to create a vector-like instruction that is specialized for partitioning.

Stochastic Gradient Descent (SGD)

Running machine learning algorithms directly on relational data residing in a relational database has many advantages. In particular, the ability to perform machine learning tasks on relational data, while retaining the robust and declarative way of interacting with it that a database offers, is very attractive.

With the SGD operator, our goal is to provide the capability to perform linear model training directly on relational data residing in doppioDB, and to accelerate this using an FPGA. Machine learning tasks benefit greatly from FPGA acceleration, because (1) they are mostly based on vector algebra and inherently parallel, (2) they can be implemented as deep dataflow pipelines, (3) they tolerate errors introduced by quantization and can therefore be performed more efficiently using custom-precision arithmetic.



David Sidler, Zsolt Istvan, Muhsen Owaida, Gustavo Alonso.
ACM SIGMOD International Conference on Management of Data, Chicago, IL, May 2017.
Kaan Kara, Jana Giceva, Gustavo Alonso.
ACM SIGMOD International Conference on Management of Data, Chicago, IL, May 2017.
Muhsen Owaida, David Sidler, Kaan Kara, Gustavo Alonso.
25th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), Napa, CA, April 2017.
Kaan Kara, Gustavo Alonso.
26th International Conference on Field Programmable Logic and Applications (FPL), Lausanne, Switzerland, September 2016.
Zsolt Istvan, David Sidler, Gustavo Alonso.
24th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), Washington DC, May 2016.


David Sidler, Muhsen Owaida, Zsolt Istvan, Kaan Kara, Gustavo Alonso.
27th International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium, September 2017.
David Sidler, Zsolt Istvan, Muhsen Owaida, Kaan Kara, Gustavo Alonso.
ACM SIGMOD International Conference on Management of Data, Chicago, IL, May 2017.