
LINQS
STATISTICAL RELATIONAL LEARNING GROUP @ UMD
Deduplication and Group Detection using Links
ACM SIGKDD Workshop on Link Analysis and Group Detection (LinkKDD) - 2004
Clustering is a fundamental problem in data mining. Traditionally,
clustering is done based on the similarity of the attribute values of
the entities to be clustered. More recently, there has been greater
interest in clustering relational and structured data. Such data is
often best described as a graph, in which there are both
entities, described by a collection of attributes, and links between
entities, representing the relations between them. Clustering in
these scenarios becomes more complex, as we should also take into
account the similarity of the entity links. We
propose novel distance measures for clustering linked data, and show
how they can be used to solve two important data mining tasks, entity
deduplication and group discovery.
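To make the idea of a link-aware distance concrete, here is a minimal sketch in Python. It is not one of the measures proposed in the paper. It assumes a simple weighted sum of an attribute distance (string dissimilarity) and a link distance (Jaccard distance over each entity's set of linked entities). The function names, the weight `alpha`, and the treatment of empty link sets are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def attribute_distance(a: str, b: str) -> float:
    """Distance in [0, 1] between two attribute strings (case-insensitive)."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_distance(links_a: set, links_b: set) -> float:
    """Jaccard distance between the sets of entities each entity links to."""
    union = links_a | links_b
    if not union:
        # Assumption: with no links on either side, links add no distance.
        return 0.0
    return 1.0 - len(links_a & links_b) / len(union)


def combined_distance(attr_a: str, links_a: set,
                      attr_b: str, links_b: set,
                      alpha: float = 0.5) -> float:
    """Weighted sum of attribute and link distance.

    alpha = 0 ignores links entirely; alpha = 1 ignores attributes.
    The linear combination is an illustrative assumption.
    """
    return ((1.0 - alpha) * attribute_distance(attr_a, attr_b)
            + alpha * link_distance(links_a, links_b))


# Two author references with similar names: shared co-authors pull them
# together, disjoint co-authors push them apart.
same_group = combined_distance("J. Smith", {"A. Kumar", "B. Lee"},
                               "John Smith", {"A. Kumar", "B. Lee"})
diff_group = combined_distance("J. Smith", {"A. Kumar", "B. Lee"},
                               "John Smith", {"C. Wong"})
print(same_group < diff_group)
```

In a deduplication setting, a distance of this kind could be fed to any standard clustering procedure, so that references with similar attributes and overlapping links end up in the same cluster.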
BibTeX reference
@InProceedings{bhattacharya:kdd04-wkshp,
author = "Bhattacharya, Indrajit and Getoor, Lise",
title = "Deduplication and Group Detection using Links",
booktitle = "ACM SIGKDD Workshop on Link Analysis and Group Detection (LinkKDD)",
year = "2004",
}

