Casper Albers
(University of Groningen, The Netherlands):
Missing values in general forms of Procrustes Analysis
For given configuration matrices X_{k}, the aim of the General Procrustes Problem is to find transformations T_{k} such that the Euclidean distance between the transformed configurations X_{k}T_{k} is minimised. The T_{k} are restricted to some class of matrices. When missing values occur in X_{k}, these need to be estimated before standard Procrustes algorithms can be applied. Previous work in this field addressed special cases, e.g. when the T_{k} are required to be orthogonal or when complete rows in X_{k} are missing. In this talk, a more general approach is taken. An iterative algorithm is developed that alternates between an algorithm from constrained quadratic optimisation (Albers, Critchley, Gower, 2009) and a general Procrustes algorithm. This algorithm is shown to be applicable to a very broad class of missing-value Procrustes problems. This class includes those problems where isotropic scaling, centring and/or standardisation is required.
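The alternating idea can be illustrated on the simplest special case mentioned above: two configurations and an orthogonal transformation. The sketch below is not the algorithm of the talk. It assumes a fixed target configuration Y, marks missing cells of X as NaN, and uses the fact that for orthogonal T the criterion ||XT - Y|| equals ||X - YT'||, so the best fill-in for a missing cell is the matching cell of YT'. The function names are hypothetical, and every column of X is assumed to have at least one observed value.

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    """Orthogonal T minimising ||X T - Y||_F, from the SVD of X'Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def procrustes_with_missing(X, Y, n_iter=100, tol=1e-10):
    """Alternate between fitting T and re-estimating the missing cells of X.

    Illustrative sketch only: missing cells are NaN, Y is complete.
    """
    X = X.copy()
    missing = np.isnan(X)
    # Starting values: column means of the observed cells.
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])
    prev = np.inf
    for _ in range(n_iter):
        # Step 1: Procrustes fit for the current completed X.
        T = orthogonal_procrustes(X, Y)
        # Step 2: for fixed orthogonal T, the optimal missing cells
        # are the corresponding cells of Y T'.
        target = Y @ T.T
        X[missing] = target[missing]
        loss = float(np.sum((X @ T - Y) ** 2))
        if prev - loss < tol:   # the loss never increases between steps
            break
        prev = loss
    return X, T, loss
```

Each step can only lower the least-squares criterion, so the iteration converges in loss value, although not necessarily to the global minimum. Scaling, centring and non-orthogonal classes of T are outside the scope of this sketch.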
Giuseppe Bove (University Roma Tre, Italy):
Asymmetry in proximity data
The analysis of asymmetry and orthogonality presented by John Gower in his famous 1977 paper inspired many methods in asymmetric multidimensional scaling. This presentation reviews those methods, focusing on methods for skew-symmetry with graphical capabilities.
Frank Critchley (The Open University):
Conics, cones and ...
All will be revealed!
Patrick J. F. Groenen (
Biplots, Multidimensional Scaling, and Eigenvalues
The work of John Gower is very diverse and broad. However, main themes can be
distinguished, such as visualization of multivariate data, preferably done
through biplots, multidimensional scaling, and eigendecompositions. In all
cases, this is combined with rigorous and deep mathematical insights. In this
presentation, I will highlight several of his works, including the application
of the modified Leverrier-Faddeev algorithm to multidimensional scaling, the
area biplot, and a new set of icons that should help readers to make the proper
interpretation of these visualization methods.
Wojtek Krzanowski (University of Exeter):
Optimal Predictive Partitioning
In many situations, it may be desired to group objects into well-defined classes
on the basis of one set of variables and then subsequently to predict the
classes of new objects from another set of variables. For example, a bank
may categorise customers into distinct classes reflecting their financial
behaviour over a period of years, and then wish to assign new customers to
future behaviour classes using information obtained from them when they open an
account.
Such situations require a blend of cluster analysis and discriminant analysis, striking a compromise between the compactness and integrity of the clustering on one hand and the accuracy of the future assignment to clusters on the other. This talk will describe two algorithms for achieving such a compromise, discuss some of their features, and illustrate their performance for the above financial example.
Ludovic Lebart (Telecom-ParisTech,
France):
Data compression, summary and knowledge
We present the viewpoint of a practitioner on the data
analysis techniques related to data compression. The role of geometry both in
designing the methods and in interpreting the results is discussed. In this
context, we deal also with the problem of metadata, together with the problem of
the articulation between exploration and inference: how do we use what we know
to learn more from the data, and how do we use what has been discovered from the
data to continue learning from the same data? Although less geometrically
oriented, the assessment of the observed patterns remains an essential phase of
the knowledge process.
Based on a series of examples, this review offers several opportunities to encounter the scientific trajectory of John Gower and, in so doing, to recall some of his significant contributions.
Mark de Rooij (
The geometry and use of triadic distance models
Triadic distance models define Euclidean distances between
triples of points. In the first part of this talk we study the geometry of
triadic distances t defined as functions of the Euclidean (dyadic) distances a1,
a2, a3 between three points. Special attention is paid to the
contours of all points giving the same value of t when a3 is kept constant.
These isocontours allow some general comments to be made about the suitability,
or not, for practical investigations of certain definitions of triadic distance.
We are especially interested in those definitions of triadic distance,
designated as canonical, that have optimal properties.
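As a small illustration of such isocontours, the sketch below evaluates two candidate definitions of t: the perimeter a1 + a2 + a3 and the root sum of squares of a1, a2, a3. These two definitions, the labelling of a3 as the distance between the two fixed points, and the function names are assumptions made for the example, not the canonical definitions of the talk. With a3 fixed, the perimeter is constant on an ellipse whose foci are the two fixed points, and the sum-of-squares version is constant on a circle centred at their midpoint.

```python
import math

def dyadic_distances(p1, p2, p3):
    """Euclidean distances; a3 is the distance between the fixed points p1, p2."""
    a1 = math.dist(p2, p3)
    a2 = math.dist(p1, p3)
    a3 = math.dist(p1, p2)
    return a1, a2, a3

def triadic_perimeter(p1, p2, p3):
    """Illustrative triadic distance: t = a1 + a2 + a3."""
    return sum(dyadic_distances(p1, p2, p3))

def triadic_sum_of_squares(p1, p2, p3):
    """Illustrative triadic distance: t = sqrt(a1^2 + a2^2 + a3^2)."""
    return math.sqrt(sum(a * a for a in dyadic_distances(p1, p2, p3)))

# Fixed points at distance a3 = 2.
P1, P2 = (-1.0, 0.0), (1.0, 0.0)

# Third point moving along the ellipse with foci P1, P2 and semi-axes 2, sqrt(3).
ellipse = [(2 * math.cos(th), math.sqrt(3) * math.sin(th))
           for th in (0.0, 0.7, 1.9, 3.1, 4.4)]
perimeters = [triadic_perimeter(P1, P2, p) for p in ellipse]

# Third point moving along the circle of radius 1.5 around the midpoint.
circle = [(1.5 * math.cos(th), 1.5 * math.sin(th))
          for th in (0.0, 0.7, 1.9, 3.1, 4.4)]
root_sums = [triadic_sum_of_squares(P1, P2, p) for p in circle]
```

Here every entry of `perimeters` equals 6 and every entry of `root_sums` equals sqrt(10.5), up to rounding, so the two definitions have differently shaped isocontours around the same pair of fixed points.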
In the second part of the talk we examine the use of triadic distances. The multidimensional scaling (MDS) of triadic distances (MDS3) and a conventional MDS of dyadic distances (MDS2) both give Euclidean representations. Our analysis suggests that, when used as analysis methods for multivariate data, MDS2 and MDS3 can be expected to give very similar results, and this is strongly supported by numerical examples. A situation where MDS3 does provide quite different solutions from MDS2 is when both are applied to three-way contingency tables. In such a case MDS2 models marginal association, whereas MDS3 models conditional association.
Niel le Roux and Sugnet Lubbe (University
of Stellenbosch & University of Cape Town, South Africa):
FROM BIPLOTS TO UNDERSTANDING BIPLOTS:
A decade of studying biplot methodology with John Gower
Biplots, authored by Gower and Hand in 1996, provides a unified theory underlying different types of biplot. It sparked off many a research project and caught the attention of users of statistics in diverse fields of application. However, over time shortcomings in Biplots were identified. It is written in a rather concentrated style, making it difficult for research workers to appreciate fully the geometric subtleties of the biplot family. Subsequently Gower, Lubbe and Le Roux embarked on a project to:
make more readily accessible the geometric background essential for the understanding of biplots and monoplots;
provide detailed measures of fit for various types of biplot;
develop an extensive collection of R functions for constructing biplots and monoplots;
develop tools to create more informative biplots;
provide a wealth of illustrative examples drawn from a wide variety of fields of application, illustrating different representatives of the biplot family.
After nearly a decade, a milestone is about to be reached with the forthcoming appearance of Understanding Biplots. In this presentation, we preview some of the material in Understanding Biplots. In particular the following topics receive attention: the geometry of biplot and monoplot reference systems; sample and axis predictivity; the geometry of canonical variate analysis (CVA) as an application of principal component analysis (PCA) in a two-step procedure; creating analysis of distance (AoD) biplots as an application of nonlinear biplot methodology; using R to construct 1D, 2D and 3D biplots; usages of circle projection; parallel axis shift, lambda-scaling and other novelties for creating better biplots; the capabilities of the collection of R functions UBbipl.
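For readers unfamiliar with the basic construction, a minimal PCA biplot can be sketched from the singular value decomposition of the column-centred data matrix. The sketch below uses Python with NumPy rather than the R collection UBbipl, and the function name `pca_biplot` is hypothetical. The returned quality is the usual proportion of total variation captured by the first r dimensions, which is one common overall measure of fit and is assumed here only for illustration; it is not claimed to match the predictivity measures of the book.

```python
import numpy as np

def pca_biplot(X, r=2):
    """Sample points and variable-axis directions for a rank-r PCA biplot.

    Illustrative sketch: X is an n x p data matrix with samples in rows.
    """
    Xc = X - X.mean(axis=0)                      # column-centre the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    samples = U[:, :r] * s[:r]                   # sample coordinates (PC scores)
    axes = Vt[:r].T                              # one direction per variable
    quality = float(np.sum(s[:r] ** 2) / np.sum(s ** 2))
    return samples, axes, quality
```

Projecting a sample point onto the axis of a variable gives the predicted centred value, since `samples @ axes.T` is the best rank-r least-squares approximation of the centred data.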
Posters
Jose L. Vicente-Villardon (University of
Salamanca, Spain):
Geometry of Logistic Biplots for Categorical Data
In many practical situations data is presented in a matrix containing binary or
categorical variables. For such cases the classical linear biplots are not
suitable in the same way as linear regressions are not suitable for binary or
categorical responses. In this paper we present a generalization of the
linear biplots for categorical data and study its properties and geometry.