Highly Concurrent B-Trees Using Atomic Blocks

We present a new highly-concurrent Blink-tree algorithm and discuss our implementation of it. Our design is novel in that it supports high concurrency while using atomic blocks in the implementation. Atomic blocks impose a discipline of static declaration of regions in which the system enforces atomicity of accesses. However, their static structure precludes lock coupling down through the levels of the tree, the usual method for traversing concurrent B-trees. We consider atomic blocks because of their software engineering advantages over unstructured use of locks. For example, it is easy to see that our design is deadlock-free.

In our approach, structure modifying operations (SMOs), such as B-tree node splits, occur as separate deferred, even asynchronous, operations at each affected level. This increases concurrency and leads to interesting data structure invariants and traversal algorithms. Our fine-grained concurrency approach locks as few nodes as possible at a time, most commonly just one node, but sometimes two or three. We also present coarser-grained strategies, which trade off locking overhead versus maximum concurrency. Our design further supports cursors, which avoid traversing the tree as much when a series of operations has locality. In sum, our design contributions consist in: using atomic blocks, deferred structure modifying operations, fine-grained locking, adjustable lock granularity, and support for cursors. We further include some informal arguments on the correctness of our design.

By: Rajesh Bordawekar; J. Eliot B. Moss

Published in: RC24658 in 2008

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

rc24658.pdf

Questions about this service can be mailed to reports@us.ibm.com .