Mining Associations With the Collective Strength Approach

Copyright [©] (2001) by IEEE. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee.

The large itemset model has been proposed in the literature for finding associations in a large database of sales transactions. We propose a different method for evaluating and finding itemsets, which we refer to as strongly collective itemsets. Our criterion stresses the importance of the actual correlation of the items with one another rather than their absolute level of presence. Previous techniques for finding correlated itemsets are not necessarily applicable to very large databases. We present an algorithm that offers very good computational efficiency while maintaining statistical robustness. Because this algorithm relies on relative measures rather than absolute measures such as support, the method can also be applied to find association rules in datasets in which items may appear in a sizeable percentage of the transactions (dense datasets), in datasets in which the items have varying density, or even to find negative association rules.
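The abstract does not state the criterion itself. As a rough illustration only, the sketch below assumes the commonly cited definition of collective strength: an itemset is "violated" by a transaction that contains some but not all of its items, and the strength compares the observed violation rate v with the rate E[v] expected if the items were statistically independent, as C = ((1 - v) / (1 - E[v])) * (E[v] / v). The function name, argument names, and the handling of the zero-violation case are hypothetical choices made for this example, not the authors' algorithm.

```python
from math import prod


def collective_strength(transactions, itemset):
    """Sketch of a collective strength computation (assumed definition).

    transactions: list of sets of items; itemset: iterable of items.
    """
    items = set(itemset)
    n = len(transactions)
    if n == 0 or len(items) < 2:
        raise ValueError("need at least one transaction and two items")

    # Observed violation rate: transactions holding some but not all items.
    violations = sum(1 for t in transactions if 0 < len(items & set(t)) < len(items))
    v = violations / n

    # Expected violation rate under independence of the individual items.
    p = [sum(1 for t in transactions if item in t) / n for item in items]
    expected_v = 1 - prod(p) - prod(1 - x for x in p)

    if v == 0:
        # No violations observed: treat as unboundedly strong (assumption).
        return float("inf")
    if expected_v == 1:
        # Degenerate case: agreement is impossible under independence.
        return 0.0
    return ((1 - v) / (1 - expected_v)) * (expected_v / v)


if __name__ == "__main__":
    baskets = [{"a", "b"}, {"a"}, {"b"}, set()]
    print(collective_strength(baskets, {"a", "b"}))
```

Under this definition, a value near 1 indicates items that behave as if independent, values above 1 indicate positive correlation, and values below 1 indicate negative correlation. Support never enters as an absolute threshold, which matches the abstract's point about relative measures.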

By: Charu C. Aggarwal, Philip S. Yu

Published in: IEEE Transactions on Knowledge and Data Engineering, volume 13, number 6, pages 863-873, 2001

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.

Questions about this service can be mailed to reports@us.ibm.com.