A Tree Projection Algorithm for Generation of Large Itemsets for Assocation Rules

        In this paper we propose algorithms for generation of frequent itemsets by successive construction of the nodes of a lexicographic tree of itemsets. We discuss different strategies in generation and traversal of the lexicographic tree such as breadth-first search, depth-first search or a combination of the two. These techniques provide different trade-offs in terms of the I/O, memory and computational time requirements. We use the hierarchical structure of the lexicographic tree to successively project transactions at each node of the lexicographic tree, and use matrix counting on this reduced set of transactions for finding frequent itemsets. We tested our algorithm on both real and synthetic data. We provide an implementation of the tree projection method which is up to one order of magnitude faster than other recent techniques in the literature.

By: Ramesh C. Agarwal, Charu C. Aggarwal, V. V. V. Prasad, Vivane Crestana

Published in: RC21341 in 1998

This Research Report is not available electronically. Please request a copy from the contact listed below. IBM employees should contact ITIRC for a copy.

Questions about this service can be mailed to reports@us.ibm.com .