We classify clustering algorithms into sequence-based
techniques - which transform the object net into a linear sequence -
and partition-based clustering algorithms. Tsangaris
and Naughton have shown that the partition-based techniques are superior.
However, their work is based on a single partitioning algorithm,
the Kernighan and Lin heuristics, which is not applicable to
realistically large object bases because of its high running-time complexity.
The contribution of this paper is two-fold:
Our analysis yields that no one algorithm performs superior
for all object net characteristics. Therefore, we derive a
multi-dimensional grid: the dimensions correspond to particular
characteristics of the object base configurations and the grid entries
indicate the best clustering algorithm for the particular configuration.
We propose an adaptable clustering strategy by determining
first the characteristics of the clustering problem - given by, e.g.,
number and size of objects, degree of object sharing - and then applying
the most suitable algorithm which is obtained from the multi-dimensional grid.
- we devise a new class of greedy object graph partitioning
algorithms (GGP) whose running-time complexity is moderate while
still yielding good quality results. For large object graphs GGP
is the best known heuristics with an acceptable running-time.
- We carry out an extensive quantitative analysis of all well-known
partitioning algorithms for clustering object graphs.
A shorter version of this paper is published at the
FODO 1993 conference.