Graph-Based Clustering and Data Visualization Algorithms


About this book

The application of graphs in clustering and visualisation has several advantages. Edges characterise relations, and weights represent similarities or distances. A graph of important edges gives a compact representation of the whole complex data set. In this book we present clustering and visualisation methods that are able to utilise the information hidden in these graphs, based on the synergistic combination of classical tools of clustering, graph theory, neural networks, data visualisation, dimensionality reduction, fuzzy methods, and topology learning.

The understanding of the proposed algorithms is supported by numerous examples and by a MATLAB toolbox available at an associated website.

Graph-Based Clustering and Data Visualization Algorithms

Ágnes Vathy-Fogarassy, János Abonyi, University of Veszprém, Hungary

2013, XIII / 110 p. / 62 illus / eBook / Softcover

ISBN 978-1-4471-5158-6

Price: 32,13 € (eBook) / 42,79 € (Softcover)

This work presents a data visualization technique that combines graph-based topology representation and dimensionality reduction methods to visualize the intrinsic data structure in a low-dimensional vector space. The application of graphs in clustering and visualization has several advantages. A graph of important edges (where edges characterize relations and weights represent similarities or distances) provides a compact representation of the entire complex data set. This text describes clustering and visualization methods that are able to utilize information hidden in these graphs, based on the synergistic combination of clustering, graph theory, neural networks, data visualization, dimensionality reduction, fuzzy methods, and topology learning. The work contains numerous examples to aid in the understanding and implementation of the proposed algorithms, supported by a MATLAB toolbox available at an associated website.

The book is available from Springer-NY and also from Amazon.com.

Key features:

Examines vector quantization methods, and discusses the advantages and disadvantages of minimal spanning tree-based clustering

Presents a novel similarity measure to improve the classical Jarvis-Patrick clustering algorithm

Reviews distance-, neighborhood- and topology-based dimensionality reduction methods, and introduces new graph-based visualization algorithms

The book is aimed primarily at researchers, practitioners, and professionals in graph theory and clustering, but it is also accessible to graduate students in electrical, chemical, and process engineering. Technical prerequisites include an undergraduate-level knowledge of graph theory and linear algebra. Additional familiarity with clustering methods is helpful but not required.

A compact graph-based representation of complex data can be used for clustering and visualisation. In this chapter we introduce basic concepts of graph theory and present approaches for generating graphs from data. The computational complexity of clustering and visualisation algorithms can be reduced by replacing the original objects with their representative elements (code vectors or fingerprints) obtained by vector quantisation. We introduce two widespread vector quantisation methods, the k-means and the neural gas algorithms. Topology representing networks, obtained by modifying the neural gas algorithm, create graphs that are useful for the low-dimensional visualisation of a data set. The basic algorithm of the topology representing network and its variants (Dynamic Topology Representing Network and Weighted Incremental Neural Network) are presented in detail.
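The combination of neural gas adaptation and edge creation described above can be illustrated with a short sketch. It is written in Python rather than with the book's MATLAB toolbox, and it is a simplified reading of a topology representing network, not the authors' implementation. The function name, the parameter names, and all default values (step sizes, neighbourhood widths, edge age limits) are assumptions chosen for illustration.

```python
import math
import random


def topology_representing_network(data, n_units=6, t_max=2000, seed=0,
                                  eps=(0.3, 0.05), lam=(2.0, 0.1),
                                  max_age=(10, 50)):
    """Neural gas adaptation plus competitive Hebbian learning (sketch).

    Returns (codebook, edges); edges is a set of index pairs (i, j), i < j.
    Each tuple parameter holds an assumed (initial, final) value.
    """
    rng = random.Random(seed)
    # Initialise the code vectors from randomly chosen data points.
    codebook = [list(x) for x in rng.sample(data, n_units)]
    age = {}  # (i, j) with i < j -> age of that edge

    def decay(start_end, frac):
        # Exponential interpolation from the initial to the final value.
        start, end = start_end
        return start * (end / start) ** frac

    for t in range(t_max):
        frac = t / t_max
        x = rng.choice(data)
        # Rank all units by their distance to the sample (rank 0 = closest).
        order = sorted(range(n_units), key=lambda i: math.dist(x, codebook[i]))
        step, width = decay(eps, frac), decay(lam, frac)
        # Neural gas update: every unit moves towards x, less so at higher rank.
        for rank, i in enumerate(order):
            h = step * math.exp(-rank / width)
            codebook[i] = [w + h * (xi - w) for w, xi in zip(codebook[i], x)]
        # Competitive Hebbian learning: age the winner's edges, then
        # create or refresh the edge between the two closest units.
        first, second = order[0], order[1]
        for edge in age:
            if first in edge:
                age[edge] += 1
        age[tuple(sorted((first, second)))] = 0
        # Drop edges that have not been refreshed for too long.
        limit = decay(max_age, frac)
        age = {e: a for e, a in age.items() if a <= limit}
    return codebook, set(age)
```

Calling `topology_representing_network(data)` on a list of equal-length coordinate tuples returns the code vectors together with the edges of the learned graph. Because every update moves a code vector a fraction of the way towards a data point, the code vectors stay inside the region covered by the data.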

Graph-Based Clustering Algorithms

Graph-based clustering algorithms utilise graphs for partitioning data in many different ways. In this chapter two approaches are presented. The first, a hierarchical clustering algorithm, combines minimal spanning trees and Gath-Geva fuzzy clustering. The second utilises a neighbourhood-based fuzzy similarity measure to improve $k$-nearest neighbour graph-based Jarvis-Patrick clustering.
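As background for the second approach, the sketch below shows the classical Jarvis-Patrick rule that the chapter sets out to improve. It does not include the fuzzy similarity measure proposed in the book. The function name and the parameter names `k` and `k_min` are assumptions, as is the convention that a point is not counted in its own neighbour list.

```python
import math


def jarvis_patrick(points, k, k_min):
    """Classical Jarvis-Patrick clustering (sketch).

    Two points are merged when each is in the other's k-nearest-neighbour
    list and the two lists share at least k_min points. Returns one label
    per point, numbered from 0 in order of first appearance.
    """
    n = len(points)
    # k nearest neighbours of every point (the point itself is excluded).
    neighbours = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        neighbours.append(set(others[:k]))

    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in neighbours[i]:
            mutual = i in neighbours[j]
            shared = len(neighbours[i] & neighbours[j])
            if j > i and mutual and shared >= k_min:
                parent[find(j)] = find(i)

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]
```

For two well-separated unit squares of four points each, `jarvis_patrick(points, k=3, k_min=2)` puts each square in its own cluster, while `k_min=3` leaves every point on its own, because two points in the same square share only two neighbours.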

Graph-Based Visualisation of High Dimensional Data

In this chapter we give an overview of classical dimensionality reduction and graph-based visualisation methods that are able to uncover the hidden structure of high-dimensional data and visualise it in a low-dimensional vector space.
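One ingredient shared by several graph-based visualisation methods (Isomap is a well-known example, though this page does not name it) is to measure distances along a neighbourhood graph instead of in a straight line. The sketch below illustrates only that step and is not taken from the book. The function name and the choice of a symmetric k-nearest-neighbour graph with Floyd-Warshall shortest paths are assumptions.

```python
import math


def geodesic_distances(points, k):
    """Shortest-path distances over a symmetric k-nearest-neighbour graph."""
    n = len(points)
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    # Connect every point to its k nearest neighbours (undirected edges).
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        for j in order[:k]:
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d
    # Floyd-Warshall: allow every point m as an intermediate stop.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist
```

For points sampled along a half circle, the graph distance between the two end points follows the arc and is therefore larger than their straight-line distance. A low-dimensional embedding that preserves these graph distances unrolls the curve instead of collapsing it. Pairs of points in disconnected parts of the graph keep an infinite distance.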

References (PDF)

Index (PDF)

MATLAB files, examples, and detailed manual