Parallelization of XPath Queries using Multi-Core Processors: Challenges and Experiences

In this study, we present our experiences parallelizing XPath queries using the Xalan XPath engine on shared-address-space multi-core systems. For our evaluation, we consider a scenario where an XPath processor uses multiple threads to concurrently navigate and execute individual XPath queries on a shared XML document. Given the constraints of the XML execution and data models, we propose three strategies for parallelizing individual XPath queries: data partitioning, query partitioning, and hybrid (query and data) partitioning. We experimentally evaluated these strategies on an x86/Linux multi-core system using a set of XPath queries, invoked on a variety of XML documents using the Xalan XPath APIs. Experimental results demonstrate that the proposed parallelization strategies are effective in practice: for a majority of the XPath queries under evaluation, execution performance scaled linearly as the number of threads was increased. The results also revealed the pros and cons of the different parallelization strategies for different XPath query patterns.
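The three strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Xalan APIs: it uses Python's standard xml.etree.ElementTree (which supports only a small XPath subset) and a thread pool. The toy document, the element names (lib, shelf, book, year), and the function names are all hypothetical. The sketch assumes that data partitioning means running the same query on disjoint parts of the document, that query partitioning means splitting one query into sub-queries whose results are merged, and that the hybrid strategy combines both.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical toy document: 3 shelves, 4 books each, years 2000..2003.
XML = "<lib>" + "".join(
    "<shelf>" + "".join(
        f"<book id='{s * 4 + b}'><year>{2000 + b}</year></book>" for b in range(4)
    ) + "</shelf>"
    for s in range(3)
) + "</lib>"
ROOT = ET.fromstring(XML)


def ids(elements):
    """Return the sorted integer ids of the matched <book> elements."""
    return sorted(int(e.get("id")) for e in elements)


def data_partitioned(root, query, workers=3):
    """Same query, different data: one task per <shelf> subtree."""
    parts = root.findall("shelf")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda part: part.findall(query), parts)
    return ids(e for r in results for e in r)


def query_partitioned(root, queries, workers=2):
    """Different sub-queries, same data: one task per sub-query."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(root.findall, queries)
    # A set gives union semantics if sub-queries match the same node.
    return ids({e for r in results for e in r})


def hybrid(root, queries, workers=4):
    """One task per (subtree, sub-query) pair."""
    tasks = list(product(root.findall("shelf"), queries))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: t[0].findall(t[1]), tasks)
    return ids({e for r in results for e in r})


if __name__ == "__main__":
    print(data_partitioned(ROOT, "book[year='2001']"))
    print(query_partitioned(ROOT, [".//book[year='2001']", ".//book[year='2003']"]))
    print(hybrid(ROOT, ["book[year='2001']", "book[year='2003']"]))
```

Here the two sub-queries stand in for a single query with a disjunctive predicate on year. Because of CPython's global interpreter lock, this sketch shows only how the work is divided and merged; it is not expected to reproduce the scaling reported for the multi-threaded Xalan experiments.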

By: Rajesh Bordawekar; Lipyeow Lim; Oded Shmueli

Published in: RC24659 in 2008


This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

