New Publication!
Graph-Based Clustering and Data Visualization Algorithms
Ágnes Vathy-Fogarassy, János Abonyi, University of Veszprém, Hungary
2013, XIII / 110 p. / 62 illus / eBook / Softcover
ISBN 9781447151586 Price: 32,13 € (eBook) / 42,79 € (Softcover)
This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on the synergistic combination of clustering, graph theory, neural networks, data visualization, dimensionality reduction, fuzzy methods, and topology learning. The work contains numerous examples to aid in the understanding and implementation of the proposed algorithms, supported by a MATLAB toolbox available at an associated website.
Key features:
Examines vector quantization methods, and discusses the advantages and disadvantages of minimal spanning tree-based clustering
Presents a novel similarity measure to improve the classical Jarvis-Patrick clustering algorithm
Reviews distance-, neighborhood- and topology-based dimensionality reduction methods, and introduces new graph-based visualization algorithms
The book is aimed primarily at researchers, practitioners, and professionals in graph theory and clustering, but it is also accessible to graduate students in electrical, chemical, and process engineering. Technical prerequisites include an undergraduate-level knowledge of graph theory and linear algebra. Additional familiarity with clustering methods is helpful but not required.
Table of Contents (PDF)
Introduction
Application of graphs in clustering and visualisation has several advantages. Edges characterise relations, and weights represent similarities or distances. A graph of important edges gives a compact representation of the whole complex data set. In this book we present clustering and visualisation methods that are able to utilise information hidden in these graphs, based on the synergistic combination of classical tools of clustering, graph theory, neural networks, data visualisation, dimensionality reduction, fuzzy methods, and topology learning. The understanding of the proposed algorithms is supported by:
 figures (over 110);
 references (170), which give a good overview of the current state of clustering, vector quantisation and visualisation methods, and suggest further reading for students and researchers interested in the details of the discussed algorithms;
 algorithms (17), which help the reader understand the methods in detail and implement them;
 examples (over 30);
 software packages that incorporate the introduced algorithms. These MATLAB files are downloadable from the website of the author (www.abonyilab.com).
Vector Quantisation and Topology Based Graph Representation
Compact graph-based representation of complex data can be used for clustering and visualisation. In this chapter we introduce basic concepts of graph theory and present approaches that generate graphs from data. The computational complexity of clustering and visualisation algorithms can be reduced by replacing the original objects with their representative elements (code vectors or fingerprints) through vector quantisation. We introduce two widespread vector quantisation methods, the k-means and the neural gas algorithms. Topology representing networks, obtained by modifying the neural gas algorithm, create graphs that are useful for the low-dimensional visualisation of a data set. The basic algorithm of topology representing networks and its variants (Dynamic Topology Representing Network and Weighted Incremental Neural Network) are presented in detail.
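The vector quantisation idea described in this chapter can be illustrated with a small sketch of the k-means algorithm (Lloyd's iteration). This is not taken from the book's MATLAB toolbox; the function name kmeans_quantise, its parameters and the random initialisation from the data points are assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code_vector(point, codebook):
    """Index of the code vector closest to the given point."""
    return min(range(len(codebook)),
               key=lambda j: squared_distance(point, codebook[j]))


def kmeans_quantise(points, k, iterations=50, seed=0):
    """Hypothetical helper: Lloyd's k-means.

    Returns (codebook, labels), where codebook holds k code vectors and
    labels[i] is the index of the code vector representing points[i].
    """
    rng = random.Random(seed)
    # Assumption: initialise the code vectors with k distinct data points.
    codebook = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: map every object to its nearest code vector.
        labels = [nearest_code_vector(p, codebook) for p in points]
        # Update step: move each code vector to the mean of its members.
        new_codebook = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                new_codebook.append(
                    [sum(m[d] for m in members) / len(members)
                     for d in range(dim)])
            else:
                # An empty cell keeps its previous code vector.
                new_codebook.append(codebook[j])
        if new_codebook == codebook:
            break  # converged: no code vector moved
        codebook = new_codebook
    # Final assignment so that labels match the returned codebook.
    labels = [nearest_code_vector(p, codebook) for p in points]
    return codebook, labels
```

After quantisation, later clustering or visualisation steps can work on the k code vectors instead of all original objects, which is the source of the reduction in computational cost mentioned above.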
Graph-Based Clustering Algorithms
Graph-based clustering algorithms utilise graphs for partitioning data in many different ways. In this chapter two approaches are presented. The first, a hierarchical clustering algorithm, combines minimal spanning trees and Gath-Geva fuzzy clustering. The second algorithm utilises a neighbourhood-based fuzzy similarity measure to improve $k$-nearest neighbour graph-based Jarvis-Patrick clustering.
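For orientation, the classical Jarvis-Patrick scheme that the second algorithm builds on can be sketched as follows. This is the plain shared-nearest-neighbour version, not the fuzzy similarity measure proposed in the book. The function name jarvis_patrick, the parameter min_shared and the choice to exclude a point from its own neighbour list are assumptions of this sketch.

```python
def jarvis_patrick(points, k, min_shared):
    """Hypothetical helper: classical Jarvis-Patrick clustering.

    Two points are linked when each is among the other's k nearest
    neighbours and their neighbour lists share at least min_shared
    points. Clusters are the connected components of these links.
    Returns one cluster representative (a point index) per point.
    """
    n = len(points)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # k-nearest-neighbour sets (assumption: a point is not its own neighbour).
    neighbours = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist2(points[i], points[j]))
        neighbours.append(set(others[:k]))

    # Union-find structure to merge linked points into clusters.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in neighbours[i]:
            mutual = j > i and i in neighbours[j]
            if mutual and len(neighbours[i] & neighbours[j]) >= min_shared:
                parent[find(i)] = find(j)

    return [find(i) for i in range(n)]
```

Points with the same returned representative belong to the same cluster. The hard threshold on the number of shared neighbours is the part that a similarity measure such as the one described in this chapter would replace.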
Graph-Based Visualisation of High Dimensional Data
In this chapter we give an overview of classical dimensionality reduction and graph-based visualisation methods that are able to uncover the hidden structure of high-dimensional data and visualise it in a low-dimensional vector space.
