LINQS

STATISTICAL RELATIONAL LEARNING GROUP @ UMD



 

Effective Label Acquisition for Collective Classification

Mustafa Bilgic and Lise Getoor

ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 43--51, 2008
Note: Winner of the KDD'08 Best Student Paper Award.  
Download the publication: bilgic-kdd08.pdf [776 KB], bilgic-kdd08.ps [2.2 MB]
Information diffusion, viral marketing, and collective classification all attempt to model and exploit the relationships in a network to make inferences about the labels of the nodes. A variety of techniques have been introduced, and methods that combine the attribute information and neighboring label information have been shown to be effective for collective labeling of the nodes in a network. However, in part because of the correlation between node labels that the techniques exploit, it is easy to find cases in which, once a misclassification is made, incorrect information propagates throughout the network. This problem can be mitigated if the system is allowed to judiciously acquire the labels for a small number of nodes. Unfortunately, under relatively general assumptions, determining the optimal set of labels to acquire is intractable. Here we propose an acquisition method that learns the cases when a given collective classification algorithm makes mistakes, and suggests acquisitions to correct those mistakes. We empirically show on both real and synthetic datasets that this method significantly outperforms a greedy approximate inference approach, a viral marketing approach, and approaches based on network structural measures such as node degree and network clustering. In addition to significantly improving accuracy with just a small amount of label data, our method is tractable on large networks.
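
The abstract's observation that one misclassification can propagate through a network, and that acquiring a few labels can stop it, can be illustrated with a toy sketch. This is not the acquisition method proposed in the paper. It assumes a simple neighbor-majority-vote classifier and uses the node-degree heuristic that the abstract names as a baseline. The function names `collective_classify` and `acquire_by_degree`, the five-node star graph, and the labels "A" and "B" are all hypothetical.

```python
from collections import Counter


def collective_classify(graph, local_pred, known=None, max_iters=10):
    """Toy collective classifier (an assumption, not the paper's algorithm).

    Each unlabeled node repeatedly takes the majority label of its
    neighbors; ties fall back to the node's own attribute-based guess.
    """
    known = known or {}
    labels = {**local_pred, **known}  # acquired labels override predictions
    for _ in range(max_iters):
        changed = False
        for node in sorted(graph):
            if node in known or not graph[node]:
                continue  # acquired labels are fixed; isolated nodes keep their guess
            votes = Counter(labels[nbr] for nbr in graph[node])
            best = max(votes.values())
            winners = sorted(lab for lab, c in votes.items() if c == best)
            new = local_pred[node] if local_pred[node] in winners else winners[0]
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break
    return labels


def acquire_by_degree(graph, budget):
    """Structural baseline: request labels for the highest-degree nodes."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:budget]


# Hypothetical star graph: node 0 is the hub, and every true label is "A".
graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
truth = {n: "A" for n in graph}
# Attribute-only guesses; the hub and two leaves start out wrong.
local_pred = {0: "B", 1: "B", 2: "B", 3: "A", 4: "A"}

before = collective_classify(graph, local_pred)
picked = acquire_by_degree(graph, budget=1)
after = collective_classify(graph, local_pred, known={n: truth[n] for n in picked})
```

In this toy setting the hub's wrong guess spreads to every leaf, so `before` labels all five nodes incorrectly, while acquiring the hub's true label alone lets `after` recover every label. Degree works here only because the graph is a star; the abstract reports that the proposed learned acquisition method outperforms such structural heuristics.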

BibTeX reference

@InProceedings{bilgic:kdd08,
  author       = "Bilgic, Mustafa and Getoor, Lise",
  title        = "Effective Label Acquisition for Collective Classification",
  booktitle    = "ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
  pages        = "43--51",
  year         = "2008",
  note         = "Winner of the KDD'08 Best Student Paper Award.",
}
