Clustering Relational Data Based on Randomized Propositionalization

dc.contributor.author: Anderson, Grant
dc.contributor.author: Pfahringer, Bernhard
dc.coverage.spatial: Conference held at Corvallis, OR, USA
dc.date.accessioned: 2008-12-19T00:46:52Z
dc.date.available: 2008-12-19T00:46:52Z
dc.date.issued: 2008
dc.description.abstract: Clustering of relational data has so far received a lot less attention than classification of such data. In this paper we investigate a simple approach based on randomized propositionalization, which allows for applying standard clustering algorithms like KMeans to multi-relational data. We describe how random rules are generated and then turned into Boolean-valued features. Clustering generally is not straightforward to evaluate, but preliminary experimental results on a number of standard ILP datasets show promising results. Clusters generated without class information usually agree well with the true class labels of cluster members, i.e. class distributions inside clusters generally differ significantly from the global class distributions. The two-tiered algorithm described shows good scalability due to the randomized nature of the first step and the availability of efficient propositional clustering algorithms for the second step.
dc.identifier.citation: Anderson, G. & Pfahringer, B. (2008) Clustering Relational Data Based on Randomized Propositionalization. In Proceedings of 17th International Conference, ILP 2007, Corvallis, OR, USA, June 19-21, 2007 (pp. 39-48). Berlin: Springer
dc.identifier.doi: 10.1007/978-3-540-78469-2_8
dc.identifier.uri: https://hdl.handle.net/10289/1726
dc.language.iso: en
dc.publisher: Springer, Berlin
dc.relation.isPartOf: Proc 17th International Conference on Inductive Logic Programming
dc.relation.uri: http://www.springerlink.com/content/d05u76x584395081/?p=5206d3a803a344a487f82ff4fd0beec0&pi=7
dc.source: ILP 2007
dc.subject: computer science
dc.subject: clustering
dc.subject: propositionalization
dc.subject: randomization
dc.subject: Machine learning
dc.title: Clustering Relational Data Based on Randomized Propositionalization
dc.type: Conference Contribution
dspace.entity.type: Publication
pubs.begin-page: 39
pubs.end-page: 48
pubs.finish-date: 2007-06-21
pubs.start-date: 2007-06-19
pubs.volume: 4894
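
The two-tiered idea summarized in the abstract (random rules turned into Boolean-valued features, then a standard propositional clusterer) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy molecule data, the rule form (one existential test over an object's related tuples), and the small hand-written k-means are all assumptions made for illustration, and the names MOLECULES, random_rule, propositionalize and kmeans are hypothetical.

```python
import random

# Hypothetical toy relational data: each molecule owns a list of atoms,
# and each atom is an (element, charge) tuple. Values are illustrative only.
MOLECULES = {
    "m1": [("c", -0.1), ("c", 0.2), ("o", -0.4)],
    "m2": [("c", 0.1), ("n", -0.3)],
    "m3": [("o", -0.5), ("o", -0.4), ("h", 0.3)],
    "m4": [("c", 0.0), ("h", 0.1), ("h", 0.2)],
}

def random_rule(rng):
    """Draw one random rule: 'some atom has element E and charge >= T' (or '< T')."""
    element = rng.choice(["c", "n", "o", "h"])
    threshold = rng.uniform(-0.5, 0.3)
    greater = rng.random() < 0.5

    def rule(atoms):
        # Existential test over the tuples related to one object.
        return any(
            e == element and ((q >= threshold) if greater else (q < threshold))
            for e, q in atoms
        )
    return rule

def propositionalize(data, n_rules, seed=0):
    """Step 1: evaluate n_rules random rules on every object to get 0/1 features."""
    rng = random.Random(seed)
    rules = [random_rule(rng) for _ in range(n_rules)]
    keys = sorted(data)
    return keys, [[int(r(data[k])) for r in rules] for k in keys]

def kmeans(rows, k, iters=20, seed=0):
    """Step 2: plain k-means (Lloyd's iterations) on the propositional table."""
    rng = random.Random(seed)
    centers = [[float(v) for v in row] for row in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every row to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centers[c])))
            for row in rows
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

keys, features = propositionalize(MOLECULES, n_rules=16)
labels = kmeans(features, k=2)
for key, label in zip(keys, labels):
    print(key, label)
```

Because the features form an ordinary Boolean table, the second step could be replaced by any off-the-shelf propositional clustering algorithm; the hand-written k-means is used here only to keep the sketch dependency-free.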
