Compiling Text Analytics Queries to FPGAs

Extracting information from unstructured text data is a compute-intensive task, and the performance of general-purpose processors cannot keep up with the rapid growth of textual data. We therefore discuss the use of FPGAs to perform large-scale text analytics. We present a framework, consisting of a compiler and an operator library, that generates a Verilog processing pipeline from a text analytics query specified in the Annotation Query Language (AQL). The operator library comprises a set of configurable modules that perform relational and extraction tasks; the compiler assembles these modules to represent a full annotation operator graph. Leveraging the nature of text processing, we show that most tasks can be performed in an efficient streaming fashion.
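
As a rough software analogy only (the report targets Verilog on FPGAs, and nothing below is taken from it), the idea of assembling extraction and relational operators into a streaming pipeline can be sketched with Python generators. The names Span, extract_regex, select and build_pipeline are hypothetical, and the hard-coded "query" is not AQL syntax.

```python
import re
from typing import Callable, Iterable, Iterator, NamedTuple, Tuple


class Span(NamedTuple):
    """One annotation: a matched region of a document (hypothetical type)."""
    doc_id: int
    begin: int
    end: int
    text: str


def extract_regex(docs: Iterable[Tuple[int, str]], pattern: str) -> Iterator[Span]:
    """Extraction operator: emit one span per regex match, one document at a time."""
    regex = re.compile(pattern)
    for doc_id, text in docs:
        for match in regex.finditer(text):
            yield Span(doc_id, match.start(), match.end(), match.group())


def select(spans: Iterable[Span], predicate: Callable[[Span], bool]) -> Iterator[Span]:
    """Relational operator: pass through only the spans that satisfy the predicate."""
    for span in spans:
        if predicate(span):
            yield span


def build_pipeline(docs: Iterable[Tuple[int, str]]) -> Iterator[Span]:
    """Stand-in for the compiler: wire operators into a small operator graph.

    Assumed example query: find digit sequences with at least three digits.
    """
    numbers = extract_regex(docs, r"\d+")
    return select(numbers, lambda span: len(span.text) >= 3)


docs = [(0, "call 555 or 42"), (1, "room 1024")]
# Each span flows through both operators as soon as it is produced;
# no operator buffers the whole input.
results = list(build_pipeline(iter(docs)))
```

Because every operator consumes and produces an iterator, each result is available as soon as its input has been read, which loosely mirrors the streaming behaviour described above.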

A copy of this report is available from pol@zurich.ibm.com

By: Raphael Polig, Kubilay Atasu, Heiner Giefers and Laura Chiticariu

Published in: RZ3864 in 2014

This Research Report is not available electronically. Please request a copy from the contact listed above. IBM employees should contact ITIRC for a copy.
