Data Consistency in a Loosely Coupled Transaction Model

Managing a combination of database data and file data in a consistent manner is an interesting challenge in content management technology. Typically, meta-data that references or indexes external files is created and stored in a database for efficient search and retrieval of file data. Tight coupling of meta-data updates and file updates is unsuitable because it makes the meta-data inaccessible during the potentially long process of editing and refining content. We propose an efficient solution to the problem of maintaining consistency between the content of a file and its associated meta-data from a reader's point of view, without holding long-duration locks on meta-data tables. In the model, an object is accessed directly and edited in place through normal filesystem APIs, using a reference obtained via an SQL query on the database. To relate file modifications to meta-data updates, the user issues an update through the DBMS and commits both file and meta-data updates together. A temporally unique version indicator associated with the last committed update transaction is encoded in the object reference associated with a given meta-data state. The current last-modification timestamp of the file is available from the filesystem. A thin interceptor layer at the object's native store compares these two values with the latest information it has about the last committed version of the file and the last-modification timestamp recorded for that version. A mismatch of either the versions or the timestamps indicates that the meta-data may not correspond to the current contents of the file, and access to the file may be denied. The solution is simple because it relies on the internal timestamping and cache-coherency mechanisms already employed by the independent stores being linked, i.e., the database and the filesystem. The approach prevents an inconsistent view of contents even when the interceptor becomes out of sync with the DBMS. This property is used to enable reliable client-initiated caching schemes that optimize performance in a distributed filesystem setup with authoritative caching, by triggering sync-ups only when a potential inconsistency is detected.
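
The check performed by the interceptor layer can be illustrated with a minimal sketch. This is not the report's implementation: the class and method names (`Interceptor`, `record_commit`, `check_access`), the use of an integer as the version indicator, and the choice to deny access when nothing is known about a file are all assumptions made for illustration. It shows only the two comparisons the abstract describes: the version carried in the object reference against the last committed version, and the file's current modification timestamp against the timestamp recorded for that version.

```python
import os
from dataclasses import dataclass


@dataclass
class CommittedState:
    """What the interceptor last learned about one file (hypothetical structure)."""
    version: int    # version indicator of the last committed update transaction
    mtime_ns: int   # file's last-modification timestamp recorded for that version


class Interceptor:
    """Illustrative thin layer in front of the file store; names are assumptions."""

    def __init__(self):
        self._committed = {}  # path -> CommittedState

    def record_commit(self, path, version):
        # On commit of an update transaction, remember the new version and the
        # file's modification timestamp as of that commit.
        self._committed[path] = CommittedState(version, os.stat(path).st_mtime_ns)

    def check_access(self, path, ref_version):
        """Return True only if the reference and the file both match the last commit."""
        state = self._committed.get(path)
        if state is None:
            return False  # assumption: unknown files are treated as inconsistent
        if ref_version != state.version:
            return False  # reference and interceptor disagree on the version
        if os.stat(path).st_mtime_ns != state.mtime_ns:
            return False  # file was modified since the last committed version
        return True
```

A reader holding a reference with version `v` would call `check_access(path, v)` before opening the file. In a fuller system, a `False` result could trigger a sync-up with the DBMS instead of an outright denial, which is the behavior the abstract alludes to for the caching case.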

By: Suparna Bhattacharya, Karen W. Brannon, Hui-I Hsiao, Inderpal Narang

Published in: RJ10232 in 2002

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
