A Fast Algorithm for Mining Frequent Connected Subgraphs

Graph structured data mining to derive frequent subgraphs from a graph dataset is difficult because the search for subgraphs is combinatorially explosive, and includes subgraph isomorphism matching.
The past research on this issue has not show the sufficient performance for practical use.
In this paper, we propose a new approach called AcGM which achieves the complete search of frequent connected (induced) subgraphs in a massive labeled graph dataset within highly practical time.
The power of AcGM comes from the algebraic representation of graphs, its associated operations and well-organized constraints to limit the search space efficiently.
Its performance has been evaluated through some artificial simulation data and real world data. The high scalability of AcGM has been confirmed for the amount of data, the complexity of graphs and the computation time.

By: Akihiro Inokuchi, Takshi Washio(Osaka Univ), Kunio Nishimura(Osaka Univ) and Hiroshi Motoda(Osaka Univ)

Published in: RT0448 in 2002

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

rt0448.pdf

Questions about this service can be mailed to reports@us.ibm.com .