We classify clustering algorithms into sequence-based
techniques - which transform the object net into a linear sequence -
and partition-based clustering algorithms. Tsangaris
and Naughton have shown that the partition-based techniques are superior.
However, their work is based on a single partitioning algorithm,
the Kernighan and Lin heuristics, which is not applicable to
realistically large object bases because of its high running-time complexity.
The contribution of this paper is two-fold:
- We devise a new class of greedy object graph partitioning
algorithms (GGP) whose running-time complexity is moderate while
still yielding good quality results.
- Our extensive quantitative analysis of all well-known
partitioning algorithms indicates that no one algorithm
performs superior for all object net characteristics.
Therefore, we propose an adaptable clustering strategy according
to a multi-dimensional grid: the dimensions correspond to particular
characteristics of the object base - given by, e.g.,
number and size of objects, degree of object sharing - and the grid entries indicate
the most suitable clustering algorithm for the particular configuration.
An extended version of this paper is available as
technical report.