Search:
Lehrstuhl  |  Institut  |  Fakultät  |  LMU
print

Discriminative Frequent Subgraph Mining with Optimality Guarantees

Journal: Statistical Analysis and Data Mining
Volume: 3
Issue: 5
Pages: 302–318
Editor: Joseph Verducci

Abstract

The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining.

Copyright Notes:

Marisa Thoma, Hong Cheng, Arthur Gretton, Jiawei Han, Hans-Peter Kriegel, Alex Smola, Le Song, Philip S. Yu, Xifeng Yan and Karsten M. Borgwardt, Discriminative Frequent Subgraph Mining with Optimality Guarantees. Statistical Analysis and Data Mining, 3: 302–318. doi: 10.1002/sam.10084.

Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 302-318, 2010

DOI: 10.1002/sam.10084

Documents:

This is the author’s version of the work. It is posted here by permission of Wiley for your personal use. Not for redistribution.

Paper pdf.gif
Code for gSpanCORK zip.gif
used Datasets zip.gif

BibTeX:

@ARTICLE{ThoCheGreHanetal10,
  AUTHOR    = {Marisa Thoma and Hong Cheng and Arthur Gretton and Jiawei Han and Hans-Peter Kriegel 
               and Alex Smola and Le Song and Philip S. Yu and Xifeng Yan and Karsten M. Borgwardt},
  TITLE     = {Discriminative Frequent Subgraph Mining with Optimality Guarantees},
  JOURNAL   = {Statistical Analysis and Data Mining},
  VOLUME    = {3},
  NUMBER    = {5},
  YEAR      = {2010},
  PAGES     = {302--318},
  DOI       = {10.1002/sam.10084},
}
blank