A clustering algorithm to identify information subsystems
This paper presents an algorithm that clusters the entities and relationships identified by database designers into a set of internally cohesive subsystems. The algorithm is based on a distance score that is inversely related to the similarity of the interactions of a pair of entities with the relationships in a binary entity-relationship matrix. It avoids the manual manipulation of rows and columns required by some of the available approaches (Feldman et al., 1986; Teorey et al., 1989). It has been implemented on a PC and does not require a supercomputer, as the Wei and Gaither (1990) method does. Using a part-machine clustering problem presented by King (1980), we also show that our algorithm is superior to King's "rank order cluster" algorithm, which requires manual intervention to suppress exceptional entries before one can arrive at the final solution. Directions for further research are identified.
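The general idea of distance-based clustering on a binary entity-relationship matrix can be illustrated with a minimal sketch. The abstract does not give the paper's distance formula or grouping rule, so the sketch below substitutes assumptions: Jaccard distance stands in for the distance score, and a simple threshold with single-linkage merging stands in for the clustering step. The entity names, relationship columns, and the 0.7 threshold are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def jaccard_distance(row_a, row_b):
    """Assumed stand-in for the distance score: 0.0 when two entities take
    part in exactly the same relationships, 1.0 when they share none."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return 1.0 if either == 0 else 1.0 - both / either


def cluster_entities(matrix, threshold=0.7):
    """Merge entities whose pairwise distance is below `threshold`
    (single linkage, implemented with a small union-find)."""
    names = list(matrix)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if jaccard_distance(matrix[a], matrix[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical binary entity-relationship matrix.
# Columns (relationships): places, contains, works_in, manages
example = {
    "CUSTOMER":   [1, 0, 0, 0],
    "ORDER":      [1, 1, 0, 0],
    "PRODUCT":    [0, 1, 0, 0],
    "EMPLOYEE":   [0, 0, 1, 1],
    "DEPARTMENT": [0, 0, 1, 1],
}

subsystems = cluster_entities(example)
print(subsystems)
```

With this toy matrix, the order-related entities fall into one subsystem and the personnel-related entities into another, with no manual reordering of rows or columns. A different distance measure or threshold could produce different groupings.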
Joglekar, Prafulla; Tavana, Madjid; and Banerjee, Snehamay, "A clustering algorithm to identify information subsystems" (1994). Business Systems and Analytics Faculty Work. 325.
Joglekar, P., Tavana, M. and Banerjee, S. (1994) ‘A Clustering Algorithm for Identifying Information Subsystems,’ Journal of International Information Management, Vol. 3, Special Edition, pp. 129-141.