The analysis of asymmetry and orthogonality presented by John Gower in his
famous 1977 paper inspired many methods in asymmetric multidimensional
scaling. This presentation reviews those methods, focusing on methods
for skew-symmetry with graphical capabilities.
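As a minimal sketch of the starting point for such methods, an asymmetric square table N can be split into a symmetric part M = (N + N')/2 and a skew-symmetric part K = (N - N')/2, whose sums of squares add up to that of N. The function names and the small data table below are illustrative assumptions, not taken from the talk.

```python
def decompose(N):
    """Split a square matrix into symmetric (M) and skew-symmetric (K) parts."""
    n = len(N)
    M = [[(N[i][j] + N[j][i]) / 2 for j in range(n)] for i in range(n)]
    K = [[(N[i][j] - N[j][i]) / 2 for j in range(n)] for i in range(n)]
    return M, K

def sum_sq(A):
    """Sum of squared entries (squared Frobenius norm)."""
    return sum(v * v for row in A for v in row)

# Hypothetical asymmetric proximity table (e.g. flows from row to column).
N = [[0, 5, 2],
     [1, 0, 4],
     [6, 2, 0]]
M, K = decompose(N)

# The two parts are orthogonal, so their sums of squares are additive.
print(sum_sq(N), sum_sq(M), sum_sq(K))
```

The symmetric part can then be handled by ordinary scaling methods, while the skew-symmetric part is the one that calls for the dedicated graphical methods reviewed in the talk.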
Such situations require a blend of cluster analysis and discriminant
analysis, striking a compromise between the compactness and integrity of the
clustering on one hand and the accuracy of the future assignment to clusters on
the other. This talk will describe two algorithms for achieving such a
compromise, discuss some of their features, and illustrate their performance for
the above financial example.
Based on a series of examples, this review offers several opportunities to
revisit the scientific trajectory of John Gower and, in so doing, to recall
some of his significant contributions.
In the second part of the talk we examine the use of triadic distances.
The multidimensional scaling (MDS) of triadic distances (MDS3) and a
conventional MDS of dyadic distances (MDS2) both give Euclidean representations.
Our analysis suggests that, when used as analysis methods for multivariate data,
MDS2 and MDS3 can be expected to give very similar results, and this is strongly
supported by numerical examples. A situation where MDS3 does provide quite
different solutions from MDS2 is when both are applied to three-way contingency
tables. In such a case MDS2 models marginal association, whereas MDS3
models conditional association.
Biplots, written by Gower and Hand in 1996, provides a unified theory underlying
different types of biplot. It sparked off many a research project and caught
the attention of users of statistics in diverse fields of application. However,
over time, shortcomings in Biplots were identified. It is written in a
rather concentrated style, making it difficult for research workers to appreciate
fully the geometric subtleties of the biplot family. Subsequently Gower, Lubbe
and Le Roux embarked on a project to:
make more readily accessible the geometric background essential for the understanding of biplots and monoplots;
provide detailed measures of fit for various types of biplot;
develop an extensive collection of R functions for constructing biplots and monoplots;
develop tools to create more informative biplots;
provide a wealth of illustrative examples drawn from a wide variety of fields of application, illustrating different representatives of the biplot family.
After nearly a decade, a milestone is about to be reached with the forthcoming appearance of
Understanding Biplots. In this presentation, we preview some of the material
in Understanding Biplots. In particular the following topics receive
attention: the geometry of biplot and monoplot reference systems; sample and
axis predictivity; the geometry of canonical variate analysis (CVA) as an
application of principal component analysis (PCA) in a two-step procedure;
creating analysis of distance (AoD) biplots as an application of nonlinear
biplot methodology; using R to construct 1D, 2D and 3D biplots; uses of
circle projection; parallel axis shift, lambda-scaling and other novelties for
creating better biplots; the capabilities of the collection of R functions
UBbipl.
The Geometry of Data Analysis (Gowerfest II)
Abstracts for invited oral presentations
Casper Albers
(University of Groningen, The Netherlands):
Missing values in general forms of Procrustes Analysis
For given configuration matrices X_k, the aim of the General Procrustes Problem
is to find transformations T_k such that the Euclidean distance
between the transformed configurations X_k T_k is minimised.
The T_k are restricted to some class of matrices. When missing values
occur in the X_k, these need to be estimated before standard Procrustes
algorithms can be applied. Previous work in this field addressed special cases,
e.g. when the T_k are required to be orthogonal or when complete rows
of X_k are missing. In this talk, a more general approach is taken. An
iterative algorithm is developed that alternates between an algorithm from
constrained quadratic optimisation (Albers, Critchley, Gower, 2009) and
a general Procrustes algorithm. This algorithm is shown to be
applicable to a very broad class of missing-value Procrustes problems. This
class includes those problems where isotropic scaling, centring
and/or standardisation is required.
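To make the basic fitting step concrete, here is a minimal sketch of the simplest special case only: two complete planar configurations, with the transformation restricted to a rotation. It is not the missing-value algorithm of the talk. The function names and the sample points are illustrative assumptions.

```python
import math

def rotate(points, theta):
    """Rotate each 2D point counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def best_rotation(X, Y):
    """Angle minimising the sum of squared distances between rotate(X) and Y."""
    dot = sum(x1 * y1 + x2 * y2 for (x1, x2), (y1, y2) in zip(X, Y))
    cross = sum(x1 * y2 - x2 * y1 for (x1, x2), (y1, y2) in zip(X, Y))
    return math.atan2(cross, dot)

def residual(X, Y):
    """Sum of squared Euclidean distances between matched points."""
    return sum((a - c) ** 2 + (b - d) ** 2 for (a, b), (c, d) in zip(X, Y))

# Hypothetical configuration and a rotated copy of it.
X = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0), (2.0, 1.0)]
Y = rotate(X, 0.7)

theta = best_rotation(X, Y)
fitted = rotate(X, theta)
print(theta, residual(fitted, Y))
```

The closed form follows because the sum of squares depends on the angle only through dot * cos(theta) + cross * sin(theta), which is largest at atan2(cross, dot). Wider classes of T_k, more than two configurations, or missing cells need the iterative treatment described in the abstract.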
Giuseppe Bove (University
Roma Tre, Italy):
Asymmetry in proximity data
Frank Critchley (The Open University):
Conics, cones and ...
All will be revealed!
Patrick J. F. Groenen (
Biplots,
Multidimensional Scaling, and Eigenvalues
The work of John Gower is very diverse and broad. However, some
main themes can be distinguished, such as the visualization of multivariate data,
preferably through biplots, multidimensional scaling, and
eigendecompositions. In all cases, this is combined with rigorous and deep
mathematical insights. In this presentation, I will highlight several of his
works, including the application of the modified Leverrier-Faddeev algorithm to
multidimensional scaling, the area biplot, and a new set of icons that should
help readers to make the proper interpretation of these visualization methods.
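For readers unfamiliar with it, the classical (unmodified) Leverrier-Faddeev recursion computes the coefficients of the characteristic polynomial of a square matrix from traces of matrix products. The sketch below shows only that classical version, not the modification discussed in the talk. The function names and the 2 x 2 example are illustrative assumptions.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients of det(lambda*I - A), highest power first."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k   # next coefficient from a trace
        coeffs.append(c)
        # Update the auxiliary matrix: M <- A*M + c*I
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

# Symmetric example with eigenvalues 1 and 3: lambda^2 - 4*lambda + 3.
coeffs = char_poly([[2, 1], [1, 2]])
print([int(c) for c in coeffs])
```

Exact rational arithmetic is used here so that the coefficients are not affected by rounding.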
Wojtek Krzanowski (University of Exeter):
Optimal Predictive
Partitioning
In many situations, it may be desired to group objects into well-defined classes
on the basis of one set of variables and then subsequently to predict the
classes of new objects from another set of variables. For example, a bank
may categorise customers into distinct classes reflecting their financial
behaviour over a period of years, and then wish to assign new customers to
future behaviour classes using information obtained from them when they open an
account.
Ludovic Lebart (Telecom-ParisTech,
France):
Data compression, summary and knowledge
We present the viewpoint of a practitioner on the data
analysis techniques related to data compression. The role of geometry both in
designing the methods and in interpreting the results is discussed. In this
context, we also deal with the problem of metadata, together with the problem of
the articulation between exploration and inference. How do we use what we know
to learn more from the data, and how do we use what has been discovered from the
data to continue learning from the same data? Although less geometrically
oriented, the assessment of the observed patterns nonetheless remains an
essential phase of the knowledge process.
Mark de Rooij (
The geometry and use of triadic distance
models
Triadic distance models define Euclidean distances between
triples of points. In the first part of this talk we study the geometry of
triadic distances t defined as functions of the Euclidean (dyadic) distances a1,
a2, a3 between three points. Special attention is paid to the
contours of all points giving the same value of t when a3 is kept constant.
These isocontours allow some general comments to be made about the suitability,
or not, for practical investigations of certain definitions of triadic distance.
We are especially interested in those definitions of triadic distance,
designated as canonical, that have optimal properties.
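The idea of an isocontour can be illustrated with a minimal sketch. The abstract does not say which definitions of t are studied, so the two used below (a perimeter form, a1 + a2 + a3, and a generalized Euclidean form, the square root of a1^2 + a2^2 + a3^2) are assumptions chosen for illustration. With two points held fixed, so that a3 is constant, the third point traces an ellipse under the first definition and a circle under the second.

```python
import math

def dist(p, q):
    """Ordinary (dyadic) Euclidean distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def perimeter(a1, a2, a3):
    return a1 + a2 + a3

def generalized_euclidean(a1, a2, a3):
    return math.sqrt(a1 ** 2 + a2 ** 2 + a3 ** 2)

def triadic(p, a, b, rule):
    """Triadic distance of a moving point p and two fixed points a, b."""
    return rule(dist(p, a), dist(p, b), dist(a, b))

A, B = (-1.0, 0.0), (1.0, 0.0)          # fixed pair, so a3 = 2
angles = [2 * math.pi * k / 12 for k in range(12)]

# Ellipse with foci A and B (semi-axes 2 and sqrt(3)).
ellipse = [(2.0 * math.cos(u), math.sqrt(3.0) * math.sin(u)) for u in angles]
# Circle of radius 2 centred at the midpoint of A and B.
circle = [(2.0 * math.cos(u), 2.0 * math.sin(u)) for u in angles]

t_perimeter = [triadic(p, A, B, perimeter) for p in ellipse]
t_euclid = [triadic(p, A, B, generalized_euclidean) for p in circle]
print(t_perimeter[0], t_euclid[0])
```

Each list is constant up to rounding, which is what it means for the ellipse and the circle to be isocontours of the respective definitions.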
Niel le Roux and Sugnet Lubbe (University
of Stellenbosch & University of Cape Town, South Africa):
FROM BIPLOTS TO UNDERSTANDING BIPLOTS:
A decade of studying biplot methodology with John Gower
Jose L. Vicente-Villardon (University of
Salamanca, Spain):
Geometry of
Logistic Biplots for Categorical Data
In many practical situations data are presented in a matrix containing binary or
categorical variables. For such cases the classical linear biplots are
unsuitable, just as linear regression is unsuitable for binary or
categorical responses. In this paper we present a generalization of
linear biplots to categorical data and study its properties and geometry.
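A minimal sketch of the binary special case may help fix ideas. It assumes that a variable is fitted by a logistic response surface over the two-dimensional plot, with hypothetical intercept b0 and slope vector b, so that every plotted point receives a predicted probability and the variable's axis can be marked with the position where that probability equals a chosen value. The names and numbers are illustrative assumptions, not the author's implementation.

```python
import math

def predicted_probability(point, b0, b):
    """Logistic prediction for a row point in the 2D plot."""
    eta = b0 + sum(p * w for p, w in zip(point, b))
    return 1.0 / (1.0 + math.exp(-eta))

def marker(prob, b0, b):
    """Point on the axis through the origin along b where the prediction equals prob."""
    logit = math.log(prob / (1.0 - prob))
    scale = (logit - b0) / sum(w * w for w in b)
    return tuple(scale * w for w in b)

# Hypothetical fitted parameters for one binary variable.
b0, b = -1.0, (2.0, 1.0)

half = marker(0.5, b0, b)
# A point shifted perpendicular to the axis projects onto the same marker.
shifted = (half[0] - 3.0 * b[1], half[1] + 3.0 * b[0])
print(half, predicted_probability(half, b0, b), predicted_probability(shifted, b0, b))
```

As in a linear biplot, prediction is by projection onto the variable's axis; the difference in this sketch is that the markers are spaced on a logit scale rather than evenly.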