Compositional data analysis: a business case application to cross selling

POLITECNICO DI MILANO

School of Systems Engineering

Master's Degree Programme in Mathematical Engineering

Department of Mathematics

Compositional Data Analysis: a business case application to Cross Selling

Supervisor: Prof. Simone VANTINI

Co-supervisor: Dr. Alessandra MENAFOGLIO

Master's thesis by: Luca FACCHETTI, Student ID 755105

Summary

This work presents the main features of Compositional Data Analysis, focusing on the problems that arise when classical multivariate statistical techniques are applied to compositional data.

Methods for the analysis and transformation of compositional data are then applied to a dataset from a Cross Selling business case, concerning the quantities of product macro-families purchased by the customers of the industrial automation company Festo S.p.A. (and the relations among those families). Through multivariate statistical analyses such as PCA, Clustering, K-means and Supervised Classification, the aim is to understand whether the company's strategies for developing Cross Selling marketing campaigns have statistical evidence behind them.

KEYWORDS: Compositional Data Analysis, Cross Selling, PCA, Clustering, K-means, Festo.

Abstract

In this work the main characteristics of Compositional Data Analysis are presented, focusing in particular on the issues raised by applying classical multivariate statistical techniques to compositional data.

Methods of compositional data transformation are then applied to a dataset from a Cross Selling business case, concerning the product macro-families purchased by customers of the Industrial Automation company Festo Ltd (and the relations between product families). The aim is to understand, through multivariate statistical methods such as PCA, Clustering, K-means and Supervised Classification, whether the company's strategies for developing Cross Selling marketing campaigns are supported by statistical results.

KEYWORDS: Compositional Data Analysis, Cross Selling, PCA, Clustering, K-means, Festo

List of Figures

2.1 Festo TC in Esslingen

2.2 Example of typical Festo products and the final result, an industry machine where these products are applied

2.3 Two typical Customer Solutions composed by standard products; in general these are not serial projects.

2.4 F-IT market, in NTO and number of customers, split by customer typology.

2.5 F-IT market proportion, in NTO (blue) and number of customers (grey), split by Industry Sector

2.6 Related items suggested by Amazon website

2.7 Bundling on Amazon website

2.8 Proposal based on customer purchasing choices.

2.9 Specific section that shows related products: a first Cross Selling approach.

2.10 Example of how products have been aggregated into bigger families

2.11 Final macro categories taken into account for Level 1 analysis (Customer Solutions have been excluded).

2.12 YTD Result, on NTO and Volumes, of all product families for F-IT

2.13 Template Excel Analysis for Level 1 in Food & Beverage sector

2.14 Relations between Drives and other product families

2.15 After the analysis, the resume sheet shows us the ratios for each F-IT customer

3.1 Simplex in $\mathbb{R}^3_+$ and its representation as ternary diagram (Source: [6])

3.2 Example of distance in $S^3$

3.3 Previous example with customers perturbed by y = [5, 30, 80] and y = [10, 100, 10]

3.4 Previous example with powering applied to customers with α = 0.5 and α = 2

3.5 Template for ternary diagrams and importance of amount components proportion in $S^3$

3.6 Similar customers (by shape or colour) lying on same proportion lines.

3.7 Parallel lines, circles and ellipses on ternary diagram. Source: [6]

4.1 Final macro families taken into analysis

4.2 Example of dataset

4.3 Representation of dataset as absolute amounts

4.4 Representation of dataset on simplex

4.5 Results summary applied to all 5 components

4.6 Proportion of variance on 4 Principal Components

4.7 Biplot of first 2 principal components

4.8 Loadings on first 2 Principal Components

4.9 Barplot representing loadings value on all 4 principal components

4.10 Dataset with first (solid line) and second (dashed line) PCs

4.11 Representation of first (solid line) and second (dashed line) PCs on subcompositions for Sensors, Pneumatic Drives and Cylinder Mountings coloured by ISM

4.12 Dendrograms for Euclidean and Manhattan distances

4.13 Dendrograms for Euclidean and Manhattan distances with Ward

4.14 Cophenetic correlations for Euclidean and Manhattan distances

4.15 Data grouped in 3 clusters

4.16 Data grouped in 6 clusters

4.17 Subcompositions data grouped in 3 clusters for average, complete and Ward linkage: first row Euclidean distance, second row Manhattan distance

4.18 Subcompositions data grouped in 6 clusters for average, complete and Ward linkage: first row Euclidean distance, second row Manhattan distance

4.19 Results for k-means algorithm with 6 clusters and starting centres from hierarchical clustering

4.20 Ternary diagrams for k-means with 6 clusters, purple squares represent centres of the clusters

4.21 Subcompositions’ original data coloured by ISM and coloured by k-means clusters

4.22 Extreme centres applied to k-means algorithm

4.23 Comparing results of k-means with 5 clusters

4.24 Variance explained percentage in function of k

4.25 Comparing with k=3 and different initial centres

4.26 Values of final clusters’ centres in both cases

4.27 Results for lda analysis using proportional priors

4.28 Scalings of lda

4.29 Dataset with new classification for ISM

4.30 Dataset with new classification computed with LDA and QDA algorithms.

List of Tables

2.1 Example table: which Ratio is the correct one?

2.2 Opposite effect on ratios generated by Drives

3.1 Euclidean Distance between customers on $S^3$

3.2 Aitchison Distance between customers on $S^3$

4.1 Extreme centres applied to k-means algorithm

4.2 Number of customers per ISM

4.3 ISM included in analysis

4.4 Confusion matrix of dataset: on rows original classes, on column predicted classes

4.5 Confusion matrix for LDA using uniform priors for groups

Chapter 1 Introduction

1.1 Framework and motivation

This work presents the basic theory behind the branch of statistics called Compositional Data Analysis: the set of statistical techniques developed to analyse multivariate datasets as compositional data, that is, as quantitative descriptions of the parts of some whole that convey exclusively relative information.

The basic concepts that led to the formulation of the principles behind these techniques have old origins: in a paper that begins with the words "On a form of spurious correlation ...", Karl Pearson (1897) [7] already wrote about the problems that arise when data are analysed from a relative point of view rather than with the standard quantitative approach.

After a long period in which most of the statistical community focused on the development of multivariate statistical analysis (in the classic meaning of the term), in the last decades these techniques have been developed especially thanks to the work of John Aitchison and his team, who in 1986 first defined the principles behind the analysis of data treated as compositions and then developed a dedicated geometry and data transformations that make it possible to apply most classical multivariate statistical techniques (such as Principal Components Analysis, Cluster Analysis, Data sampling, Classification etc.).

Nowadays Compositional Data Analysis has found its deserved place among statistical techniques, since the literature is full of quantities and data for which the relations between components are more interesting than the absolute quantities. The fields with most applications are biosciences, geosciences and chemistry (think about the proportions of elements in a chemical compound, or the relations between the parts of different materials in an earth sample).

1.2 Case study: application of Compositional Data Analysis to Cross Selling

1.2.1 Cross Selling in a nutshell

One of the most important developing fields in marketing analysis for industrial and sales companies is the so-called Cross Selling; its specific concepts and applications are described in depth in the next chapter. In a nutshell, Cross Selling is the set of practices related to the optimization of the sale of products or accessories belonging to a specific basket of related products. These analyses are of critical importance, as they give the company latent information about the missing potential market, relying on sales data for its customers’ baskets of purchased products.

Let’s make a simple example of Cross Selling logic. Festo sells two product families, Pneumatic Drives and Sensors. It is well known that in almost all applications a Sensor has to be mounted on a Pneumatic Drive, so the quantities of these two product families sold by Festo are correlated. If Customer A is buying only Pneumatic Drives from Festo, it is potentially buying Positioning Sensors from a Festo competitor, and the amount of Sensors that it is buying from the competitor can also be quantified (in relation to the quantity of Pneumatic Drives sold to Customer A by Festo). The concept can be generalized to more than two product families: in this case we are interested in the relation between the quantity sold of one product family and the quantity of the whole set of components.
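The two-family example can be sketched in a few lines of Python. This is only an illustration: the function name `estimated_missing_sensors` and the default of one Sensor per Pneumatic Drive are assumptions made here, not figures or methods taken from Festo data.

```python
def estimated_missing_sensors(drives_sold, sensors_sold, sensors_per_drive=1.0):
    """Estimate the number of sensors a customer may be buying elsewhere.

    sensors_per_drive is an ASSUMED technical ratio (hypothetical default of
    one sensor per drive); it is not a figure taken from Festo data.
    """
    if drives_sold < 0 or sensors_sold < 0 or sensors_per_drive < 0:
        raise ValueError("quantities and ratio must be non-negative")
    expected_sensors = drives_sold * sensors_per_drive
    # A customer buying more sensors than expected has no missing potential.
    return max(0.0, expected_sensors - sensors_sold)


# Customer A buys only drives: all the expected sensors are missing potential.
print(estimated_missing_sensors(drives_sold=1000, sensors_sold=0))
# A customer that already buys part of the expected sensors from Festo.
print(estimated_missing_sensors(drives_sold=1000, sensors_sold=800))
```

The estimate is only as good as the assumed ratio between the two families, which may well differ from one application to another.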

1.2.2 Cross Selling: from heuristic to quantitative approach

Unfortunately, Cross Selling is generally used in a heuristic way: data are analysed from a qualitative point of view and only the absolute quantities of single product families are taken into account.

The target of this work is to propose the proper mathematical environment and a set of quantitative (and formally correct) methods for Cross Selling analysis, applied to a business case based on Cross Selling data for customers of the Italian subsidiary of Festo, a German company that is a worldwide leader in Industrial Automation.

Once the right analysis environment for the dataset of Festo customers has been defined, some classical multivariate techniques are applied to the dataset in order to verify whether the company strategy for defining Cross Selling campaigns is supported by statistical analysis of the data.

1.3 Chapters description

The first part of Chapter 2 briefly describes Festo’s history and company structure, followed by the market in which the company operates, the strategic attributes that the company defines for its customers and a general description of the products sold by Festo. The second part of the chapter describes the status quo of Cross Selling analysis in the company: the issues and crucial weaknesses of the analysis are identified.

Chapter 3 describes the theory of Compositional Data Analysis: after a formal description of the basic principles of Compositional Data Analysis stated by John Aitchison, and after verifying on part of a dataset of compositions the issues related to the application of standard statistical techniques, a proper data transformation using Aitchison Geometry is developed.

In the last part, given that the geometry rules for compositional data on the simplex differ from those of standard Euclidean spaces, geometry on the simplex is described with examples of parallel lines, curves, circles and ellipses.

Chapter 4 is the link between the second and third chapters: Festo Cross Selling data is transformed into compositional data and some multivariate analysis techniques (such as Principal Components Analysis, Hierarchical Cluster Analysis, K-means Clustering and Linear and Quadratic Discriminant Analysis) are applied in order to understand whether the company’s strategies are supported by data analysis.

For each type of analysis, conclusions on the nature of Festo customers’ purchasing attitudes are drawn, and the differences from the conclusions of a standard approach are described.

Chapter 2 Business case: Festo and Cross Selling analysis

2.1 Introduction of Festo

Festo is a worldwide leader in the Industrial Automation sector and in training and updating of industrial systems for its customers. Its headquarters are located in Esslingen am Neckar, near Stuttgart¹; it is also present almost worldwide, with 61 foreign subsidiaries and almost 18,700 employees.

Figure 2.1 – Festo Technology Centre in Esslingen

1. One of the most important European areas for industrial development; for this reason its region, Baden-Württemberg, is called one of the Four Motors for Europe, together with the Italian region Lombardia, the Spanish Catalunya and the French Rhône-Alpes.

Founded in 1925 as a builder of woodworking machines, Festo is now both a global player and an independent family-run company that covers its financial needs by its own means.

For this reason the company does not depend on any kind of capital market; it relies instead on its relations with customers, employees and business partners. This has allowed the company to plan for the long term even in highly dynamic and competitive markets, often different from its original market. The company is synonymous with innovation in industrial and process automation, from individual products up to customer solutions ready for installation.

Festo’s innovative strength is demonstrated by the introduction of about 100 new products every year, the investment of 9% of its total turnover in Research and Development and the filing of over 2900 patents worldwide.

With a total turnover of 2.64 billion Euros in 2015, Festo supplies pneumatic and electrical automation technology to 300,000 customers in over 35 different industry fields.

2.2 Festo Italy and Industry Automation Italian market

2.2.1 Festo Italy

Festo SpA, from now on F-IT, was founded in 1956 as the first foreign Festo subsidiary, only one year after Festo started to spread in Germany pneumatic technology applied to industrial sectors other than its original one, the woodworking sector.

It is from Italy that its vocation as an international company started to become a reality. Contributing to the affirmation of industrial development, Festo Italy witnessed the birth of the small, medium and large enterprises that arose during the post-war period and that, starting from simple technology, progressively reached more complex and efficient working models.

With approximately 240 employees and an established network of authorized distributors, F-IT successfully underpins the group’s mission. The flagship of the company offices in Assago is the Application Center, an exclusive operating environment fully available to F-IT customers: a functional technological space divided into four application areas, with training classrooms where customers can test solutions and simulate real-world applications.

Festo C.T.E. Srl also operates from the Assago headquarters. Festo C.T.E. is part of the international Festo Didactic, one of the world-leading providers of equipment and solutions for technical education. The product and service portfolio offers customers holistic education solutions for all areas of technology in factory and process automation, such as pneumatics, hydraulics, electrical engineering, production technology, mechanical engineering, mechatronics, CNC, HVAC and telecommunications.

2.2.2 F-IT market: Festo products

As mentioned before, F-IT is a leading company in the Italian manufacturing market for industrial automation. The core business of F-IT is the sale of almost all components for material handling, mainly pneumatic (i.e. all you need to build an industrial machine, such as valves, valve terminals, cylinders and related accessories, positioning sensors, flow regulators, cables, wires, tubes, pipes and fittings etc.) and, to a lesser extent but growing strongly, for electric movement (electric drives, brushless motors, etc.) and Process Automation (products for the process industry market: Pharma, Oil & Gas and wastewater treatment). Figure 2.2 depicts some examples of Festo products and one typical final application of them.

Figure 2.2 – Example of typical Festo products and the final result, an industry machine where these products are applied

The most important part of the Festo market is composed of standard products, i.e. single products that can be used as components of a more complex system. A smaller part of its market is composed of the so-called Customer Solutions, which are single projects specifically assembled from both Festo and external products and customized for our customers (see two examples in Figure 2.3).

2.2.3 F-IT customer segmentation

Excluding the Distributors channel, the Italian market is mainly composed of OEMs (Original Equipment Manufacturers), i.e. manufacturers of industrial machines for handling, production and packaging of products in all manufacturing sectors. These represent about 80% of the F-IT market; the remaining part is made up of End Users, i.e. final users that purchase products only in relation to their actual business requirements (these companies, for example, are in most cases customers of OEM companies, from which they buy one or more machines, and require only single products for their own needs, such as replacing a damaged Festo component or modifying a machine project).

Figure 2.3 – Two typical Customer Solutions composed by standard products; in general these are not serial projects.

[Chart: Customer Structure. Market Sales [LC] and number of customers, as percentages, by customer type (OEM, End User, Dealer) for IT. Value labels visible in the chart: 16,1%, 6,6%, 77,3%, 23,1%, 72,6%.]

Figure 2.4 – F-IT market, in NTO and number of customers, split by customer typology.

An important additional segmentation in F-IT (and similarly in the total Festo market) is the allocation of its customers to a specific industry sector. This classification is crucial since the market in which F-IT operates is vast and eclectic. The business is therefore divided into different industry sectors, among which the most significant for F-IT are:

— Food & Beverage
— Packaging
— Automotive
— Electronic-Light Assembly
— Printing & Plastic
— Machine Tool & Handling
— Process Automation

[Chart: Customer Structure. Market Sales [LC] and number of customers by ISM (short description) for IT.]

Figure 2.5 – F-IT market proportion, in NTO (blue) and number of customers (grey), split by Industry Sector

This classification is essential, as the market in which F-IT operates is so differentiated that reducing everything to a single action strategy would be not only simplistic but also misleading. In fact, in the European market Italy is second only to Germany in the manufacture of industrial machines, and our market is also recognized as the one with the most innovative and creative solutions.

Every sector is different and has its own particular industrial applications. By way of example, consider the machinery market for Food & Beverage, where sanitary certifications are much more stringent than in other markets. A market like this requires specific products, waterproof or totally aseptic, for the treatment of food or beverages, whereas a market such as Automotive has different applications and standards in force (think about machines that bend the metal sheets of vehicle bodies).

2.3 Cross Selling as marketing and statistical approach

2.3.1 Cross and Up Selling

Because of the varied composition of its market, Festo implements and analytically supports different marketing campaigns placed on different planes of action: there are campaigns related to specific industry sectors and campaigns related to individual products or to macro-families of products. This is where the Cross Selling concept comes into play.

The term Cross Selling defines the set of practices, marketing actions, statistical analyses and business KPIs² related to the optimization of the sale of products or accessories belonging to a specific basket of related products. This tool necessarily relies on partially loyal customers that are already buying a consolidated basket of products from Festo.

2. Key Performance Indicators: a type of performance measurement calculated to evaluate the success of an organization or an activity with respect to a particular business need.

The practice of Cross Selling is widely used in B2C³ markets, where buyers are mainly End Users and the logic of the market is mainly characterized by direct marketing⁴.

A classic and simple example of this practice is the one implemented by major e-commerce companies like Amazon: once a buyer has completed a purchase on their site, the service provider shows the buyer some deals related to the products of the latest purchase. This practice not only bypasses all the logic of indirect marketing campaigns and the field studies needed to understand the purchasing behaviour of a particular customer segment, but also makes it possible to use a customer's purchasing information directly and to apply other sales techniques. A classic example is the formulation of a spot price for a bundle of different products, with a further saving for customers who decide to buy the entire package instead of the individual products.

Figure 2.6 shows an example of Cross Selling implemented by Amazon, the largest e-commerce site on the European market. In Figure 2.7 we can see how Cross Selling could be applied to bundling campaigns.

3. Business to Consumer (B2C) is business or transactions conducted directly between a company and consumers who are the end-users of its products or services.

4. From Wikipedia: Direct Marketing is a form of advertising which allows businesses and nonprofit organizations to communicate directly to customers through a variety of media including cell phone text messaging, email, websites, online adverts, database marketing, fliers, catalogue distribution, promotional letters and targeted television, newspaper and magazine advertisements as well as outdoor advertising. Among practitioners, it is also known as direct response.

Figure 2.6 – Related items suggested by Amazon website

Figure 2.7 – Bundling on Amazon website

The first image shows how, after a careful categorization of products and of the links between them, the management and proposal of related products is implemented almost automatically. In the second image we have an example of the next automatic step for a cumulated offer.

2.3.2 Why Cross Selling in Festo?

As explained earlier in this chapter, the F-IT market is mainly composed of OEMs (Original Equipment Manufacturers).

It is easy to understand that a practice like Cross Selling is much better suited to the part of the Festo market made up of industrial machine manufacturers (OEMs), as they require almost all of the products and components sold by Festo for the manufacture of their final products, as opposed to End Users, who purchase only the products needed to solve a momentary need, or buy products from Festo because they are "forced" to by the composition of their industrial plant (e.g. they buy pneumatic components for replacement).

The ability to examine the basket of goods purchased by an OEM customer therefore gives important information about what it is purchasing from Festo and what it is necessarily buying from Festo's direct competitors.

2.3.3 Festo first steps in Cross Selling

Festo has already activated Cross Selling strategies as part of its commercial actions. An example is a service similar to the Amazon one described earlier in this chapter: when customers use our Online Shop, they can view in a few clicks a series of accessory products related to the product they have just bought.

Figure 2.8 – Proposal based on customer purchasing choices.

Figure 2.9 – Specific section that shows related products: a first Cross Selling approach.

However, this happens only when customers use the indirect acquisition channel, not when, as often happens, the customer is followed directly by a Sales Engineer. The target was to identify and develop the right tools to support and monitor the Sales Force's Cross Selling actions in the best way. There was also the problem of a lack of specific statistical analysis for monitoring Cross Selling for direct customers and for developing specific campaigns on this topic.

2.4 Festo analytics and Status Quo

2.4.1 Level 1: total amounts

A few years ago Festo decided to start a centralized analysis by forming a global team in charge of setting and developing KPIs and measurements in order to identify and position customers from a Cross Selling point of view.

The first step was to aggregate Festo’s whole product basket into macro-families of products that can be compared with each other (e.g. all the Pneumatic Drives, all the Sensors etc.), so that Cross Selling works on the comparison of the absolute amounts of these different product families.

Figure 2.10 – Example of how products have been aggregated into bigger families

The result is the possibility of computing, for each Festo customer, sector or country (depending on the analysis level), the amount of each macro-family over a certain analysis period, generally a moving year, and the total amount for every single product family. This concept was already present in the past for the main product families (such as Drives, Sensors or Valves), but with the new analysis we have the full view of F-IT split by all product categories and for every single customer (see Figure 2.11). This allows us to run specific and strategic campaigns based on the information coming from this analysis, and to carry out specific statistical analyses on it. We recall that the product category called Customer Solutions is excluded from the analysis, as these are not serial products (except in specific cases) but single customizations made by aggregating single products.
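The aggregation step can be illustrated with a minimal Python sketch. The product codes, the `PRODUCT_TO_FAMILY` mapping and the `aggregate_by_family` helper are hypothetical names introduced here for illustration; the real Festo classification is far larger, and filtering the order lines to the moving-year window is assumed to have happened beforehand.

```python
from collections import defaultdict

# Hypothetical mapping from product code to macro-family (illustrative only).
PRODUCT_TO_FAMILY = {
    "DRV-001": "Pneumatic Drives",
    "DRV-002": "Pneumatic Drives",
    "SNS-001": "Sensors",
    "VLV-001": "Valves",
    "CS-001": "Customer Solutions",
}

# Customer Solutions are not serial products, so they are left out.
EXCLUDED_FAMILIES = {"Customer Solutions"}


def aggregate_by_family(order_lines):
    """Sum amounts per customer and macro-family.

    order_lines is an iterable of (customer, product_code, amount) tuples,
    assumed to be already restricted to the analysis period.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for customer, product, amount in order_lines:
        family = PRODUCT_TO_FAMILY.get(product, "not classified")
        if family in EXCLUDED_FAMILIES:
            continue
        totals[customer][family] += amount
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {customer: dict(families) for customer, families in totals.items()}


lines = [
    ("CUST 1", "DRV-001", 10.0),
    ("CUST 1", "DRV-002", 5.0),
    ("CUST 1", "SNS-001", 3.0),
    ("CUST 1", "CS-001", 99.0),
]
print(aggregate_by_family(lines))
```

The same function works for Market Sales or for Volumes, depending on which quantity is passed as `amount`.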

We can now aggregate the yearly Market Sales or Volumes generated by every product family for F-IT customers and start making some assumptions that will explain some strategic choices in the Cross Selling Level 2 analysis.

Figure 2.11 – Final macro categories taken into account for Level 1 analysis (Customer Solutions have been excluded).

In Figure 2.12 F-IT yearly market sales and volumes in quantity are shown.

[Chart: Cross Selling Basic Analysis (LEV 1), only OEM. Two panels for EW and IT: "Proportion of Market Sales [LC] into defined cluster" and "Proportion of Quantities into defined cluster". Product clusters: Air supply, Pneumatic Drives, Valves, Valve Terminals, Throttles, Tubings, Fittings, Sensors, Electric Drives, Connecting cables, Other products, Pneumatic Drives Acc., Valve Accessories, Other Accessories, Customer Solutions (VCC 3-13), not classified.]

Figure 2.12 – YTD Result, on NTO and Volumes, of all product families for F-IT

It is easy to see from this picture that Pneumatic Drives is the most profitable family (from the NTO point of view), and it is well known that Festo is a worldwide leader in the production of Pneumatic Drives, Valves and Valve Terminals. This information will be important in the next paragraphs.

There is a lot of variability in the relation between the total amounts of product families across different customers. This could be due to a lack of sales action on a customer for certain product families, or it could be a pathological situation, due to the incompatibility of a customer’s final product with a particular product family.

2.4.2 Pros and Cons of Level 1 analysis

Level 1 of the Cross Selling analysis is therefore useful for a rough analysis of the amounts of all macro categories for all customers, and for finding where the amount of a particular product family is null or very small. An example of this comparison is given in the template of Figure 2.13 for the customers of a particular F-IT industry sector (Food & Beverage). We can immediately see which customers are not buying a particular product family. Every quarter, each F-IT Sales Engineer receives an Excel file with the data for his own customers.

Figure 2.13 – Template Excel Analysis for Level 1 in Food & Beverage sector

The main benefit of the Level 1 analysis is that it identifies, quickly and easily, the customers with which F-IT can choose to start a marketing campaign, relying on the information of a null (red cells in the Excel table) or very small amount for a certain product family.
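The Level 1 screening just described can be sketched as follows. The function `flag_gaps` and the `small_threshold` parameter are assumptions made for this illustration: the text does not say what counts as a "very small" amount, so the threshold has to be chosen by the analyst.

```python
def flag_gaps(amounts_by_customer, families, small_threshold):
    """Flag, per customer, the families with a null or very small amount.

    amounts_by_customer maps customer -> {family: amount}; a family missing
    from a customer's dict is treated as a null amount. small_threshold is
    an ASSUMED cut-off, not a value defined by the Level 1 analysis.
    """
    flags = {}
    for customer, amounts in amounts_by_customer.items():
        customer_flags = {}
        for family in families:
            amount = amounts.get(family, 0.0)
            if amount == 0:
                customer_flags[family] = "null"   # a red cell in the Excel table
            elif amount < small_threshold:
                customer_flags[family] = "small"
        if customer_flags:
            flags[customer] = customer_flags
    return flags


data = {
    "CUST 1": {"Pneumatic Drives": 3500, "Sensors": 0, "Valves": 40},
    "CUST 2": {"Pneumatic Drives": 23459, "Sensors": 27021, "Valves": 9000},
}
print(flag_gaps(data, ["Pneumatic Drives", "Sensors", "Valves"], small_threshold=100))
```

A fixed absolute threshold treats small and large customers alike, which is one of the limitations of working on absolute amounts.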

There are two big problems with Level 1 analysis:

1. It does not take into account the relation between different product families: for a particular customer (a row in the Excel file), we cannot work on the distribution of the composition of the product basket; we can only act on single product families (such as Drives, Valves, Sensors etc.), not on the relation between one family and the others.

2. The second one is even more crucial: since it works on the absolute quantities of each product family, it does not allow us to compare properly customers of different sizes (in terms of either Volumes or NTO). What is the benefit of comparing customers whose absolute amounts on the same product families differ widely?

As we have already seen, the relation between different families allows us to understand the customer's level of loyalty: if we know that two product families are positively correlated (e.g. if I buy a Pneumatic Drive I will need some accessories for it, such as Positioning Sensors, a Valve or a Valve Terminal to make it move, a Cylinder Mounting etc.) and we note that this relation is not satisfied for some customers, it means that the latter are necessarily buying some products from our competitors. Hence, if we were able to define a statistically significant relation between correlated product families, we would be able to carry out exploratory analysis and to cluster customers with methods that are supported by statistical instruments.
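The second problem in the list above can be made concrete with a small Python sketch. The two customers and their amounts are invented for illustration; `closure` is the usual name for rescaling a vector of amounts so that its parts sum to 1, and it is used here only to show that customers of very different size can share the same basket mix.

```python
import math


def closure(amounts):
    """Rescale non-negative amounts so that they sum to 1 (relative shares)."""
    total = sum(amounts)
    if total <= 0:
        raise ValueError("at least one amount must be positive")
    return [a / total for a in amounts]


# Invented amounts for (Pneumatic Drives, Sensors, Valves).
small_customer = [100.0, 80.0, 20.0]
large_customer = [10000.0, 8000.0, 2000.0]

# On absolute amounts the two customers look very far apart...
absolute_gap = math.dist(small_customer, large_customer)

# ...while their basket mix is identical once size is removed.
same_mix = all(
    math.isclose(a, b)
    for a, b in zip(closure(small_customer), closure(large_customer))
)

print(absolute_gap, same_mix)
```

Removing the size effect in this way is only a first step: it says nothing yet about how distances between such shares should be measured.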


2.4.3 Level 2 analysis: Ratios between product families

To compare the amounts of different product families for a customer, Festo chose the most feasible and easily implemented method: after defining which macro families are related, it calculates the ratios between the amounts of related macro families taken in pairs and studies the distance of each ratio from 1. The aim is to understand whether this distance is a reason for marketing actions or whether it has a technical motivation (e.g. the Sensors/Drives ratio cannot ideally be 1 for every industry application: in some cases we need 2 Sensors for each Pneumatic Drive to be controlled; in other, less frequent, cases we need no Sensor at all for a Pneumatic Drive). For this kind of analysis Festo needed to limit the data taken into account, as a ratio could be set only between products whose technical relation we were sure about.

The first issue to be managed with ratios was to decide which of the two product families in each ratio leads the meaning of the ratio. As a simple example, consider the dummy values for the 3 customers in Table 2.1.

Table 2.1 – Example table: which Ratio is the correct one?

Name               Pneumatic Drives   Sensors   Sens./Drives   Drives/Sens.
CUST 1                         3500      2800           0.80           1.25
CUST 2                        23459     27021           1.15           0.87
CUST 3 (perfect)               5500      5500           1.00           1.00

Consider the Sensors/Drives ratio of Customer 1, equal to 0.8: it means that this customer buys 80 sensors for every 100 drives purchased. Does it mean that Festo is not good at selling Sensors to this customer, or that it is very good at selling Drives to it?


So how do we manage ratios with a value greater than 1? Do we leave them out of campaigns or do we have to deal with them in another way?

2.4.4 Level 2 Ratios logic: Pneumatic Drives as family basis

We understand that, to use ratios between two product families, we have to fix one of the two values composing the ratio: in this way we are able to create a feasible positioning for our customers by relating them to their ratios (and thus give meaning to ratios higher than 1). We also understand that this decision is completely strategic and will set the direction of the future campaigns related to Cross Selling analysis.

As already explained in the second section of this chapter, the core business of F-IT (and of Festo worldwide) is historically constituted by the Pneumatic Drives product family. This is due to the historic technical competence of the company and to the quality of these products. Pneumatic Drives are also the products with the highest number of other product families correlated with them: by fixing this product family, we can set a good number of ratios with other product families or with products that are accessories for Pneumatic Drives (necessarily correlated with the quantity of Drives sold).

The final result is 6 main ratios, all based on drives quantities: 4 of them have a 1-to-1 relation with the number of drives, and 2 are defined by formulas with more than 2 different components (see Figure 2.14). The result is an analysis built around drives that allows us to compare customers generating different volumes, since ratios make them comparable: this enables us to identify groups of customers with the same ratio positioning and to define specific strategies for them.


Figure 2.14 – Relations between Drives and other product families

Figure 2.15 – After the analysis, the summary sheet shows the ratios for each F-IT customer

On the right side of Figure 2.15, the Analysis Area, we can see the values of all 6 ratios for some F-IT customers and the logic (or at least the simplest logic) used to decide, on the basis of the ratio values, whether or not to take action with a customer.

The header of the Analysis Area also shows that all ratios (with the exception of the last one) are defined by keeping the amount of Pneumatic Drives fixed in the denominator, in order to treat all customers in the same way, thinking of Pneumatic Drives as the leading product family and studying the differences between all the other macro families.

2.4.5 Level 2 problems: the dependence from Drives trend

As we explained, the whole Ratio Analysis is based on the amount of Pneumatic Drives. This means that if this family follows some specific trend, all Ratios move in the opposite direction (since the amount of Drives is the denominator of every ratio). As a simple example, look at Table 2.2 and consider the case where the total amount of F-IT Pneumatic Drives decreases, instead of leading the other product families, while the amounts of all the other families remain the same: what happens to the first 2 Ratios?

Name      Pneumatic Drives   Sensors   Mountings   Sens./Drives   Mount./Drives
CUST 1                3500      2800        2600           0.80            0.74
CUST 1b               2500      2800        2600           1.12            1.04

Table 2.2 – Opposite effect on ratios generated by Drives

By the nature of ratios, this would be read as a positive effect on all ratios rather than as a loss of market share in the Pneumatic Drives family. This is very misleading and counter-intuitive, and it prevents us from setting up a coherent statistical analysis of our dataset.
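The effect shown in Table 2.2 can be reproduced in a few lines of Python. This is only a minimal sketch: the dictionary keys and the helper names `drive_ratios` and `shares` are illustrative assumptions and not part of the Festo analysis, while the amounts are the dummy values of Table 2.2.

```python
# Dummy amounts from Table 2.2 (one customer before and after a drop in Drives).
cust_1 = {"drives": 3500, "sensors": 2800, "mountings": 2600}
cust_1b = {"drives": 2500, "sensors": 2800, "mountings": 2600}


def drive_ratios(amounts):
    """Level 2 style ratios: each family divided by the Pneumatic Drives amount."""
    drives = amounts["drives"]
    return {family: value / drives
            for family, value in amounts.items() if family != "drives"}


def shares(amounts):
    """Share of each family in the customer's total basket (shares sum to 1)."""
    total = sum(amounts.values())
    return {family: value / total for family, value in amounts.items()}


for name, cust in (("CUST 1", cust_1), ("CUST 1b", cust_1b)):
    ratios = {f: round(r, 2) for f, r in drive_ratios(cust).items()}
    drive_share = round(shares(cust)["drives"], 3)
    print(name, ratios, "drives share:", drive_share)
```

Both ratios rise from CUST 1 to CUST 1b, while the share of Drives in the basket falls: the ratios alone hide the loss on the leading family, whereas the shares of the whole basket make it visible.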

The solution to this issue is a middle way between Level 1 and Level 2 analyses: to reconsider the total amounts of the macro-families (so as to take the trend of Pneumatic Drives into account again) and to study each customer as a composition (in the statistical meaning of the term) of the amounts of the single families.

In this way we are able to avoid the Pneumatic Drives trend problem and we also have a complete view of how a customer is positioned with respect to its ideal situation.

Compositional Data Analysis. We will also see that we have to define the right geometry (and so the right basis transformation) for our case study in order to obtain results coherent with our scope, namely to study the relation between the single components of a customer and not just the size of the

Chapter 3 Compositional Data Analysis and the Aitchison Simplex

3.1 D-compositions and Simplex $\mathcal{S}^D$

As we introduced at the end of the last chapter we are more interested in the study, for each F-IT customer, of the relation between the amounts of product families than their absolute size.

By treating customers on an absolute scale we can run into misleading situations: we may find spurious correlations between the components of a composition, and most standard multivariate statistical methods are not applicable to this type of data.

This problem has been known since the end of the nineteenth century: Karl Pearson, in a paper of 1897 [7], pointed out the problems arising from the use of standard statistical methods with proportions. Since then different schools of thought have arisen and there are different approaches to Compositional Data Analysis, some of which seem to meet our needs. Before introducing these techniques, let us first define what a composition is from a statistical point of view.

Definition 1. A vector $x = [x_1, x_2, \ldots, x_D]$ is a D-part composition when all its components are strictly positive real numbers and carry only relative information.

In our case study we are more interested in the relation between the components of a customer's product basket than in the amount of a single product family. It thus seems reasonable to treat our data as compositions instead of as a standard multivariate dataset (with every customer as a single observation and the absolute amounts of the product families as variables).

Generally a composition has the intrinsic property that all its components sum to a constant $\kappa$ (in this case we talk about closed data), with $\kappa = 1$ if we are dealing with compositions that represent proportions (i.e. every component describes a part per unit) or $\kappa = 100$ if we are dealing with percentages. In our case the observations sum to a different total every time, since we are dealing with customers with different total amounts. A normalization or transformation of the data is therefore essential in order to manage the data properly. We will see different approaches to this purpose, coming from different schools of thought. First of all we have to fix the concept that it does not matter whether we are dealing with compositions whose components have different total sums, as what we need to study is the ratios between the components relative to the total.

Definition 2. Two vectors of D positive real components $x, y \in \mathbb{R}^D_+$ ($x_i, y_i > 0$, $\forall i = 1, 2, \ldots, D$) are compositionally equivalent if there exists a positive constant $c \in \mathbb{R}_+$ such that $x = c \cdot y$.

This definition suggests that this statistical approach is feasible for our purpose: it treats two customers (vectors) with the same ratios between components but different total amounts (so that corresponding components differ only by a constant factor c) as compositionally equivalent, that is, as representing the same composition even though their total amounts differ. It follows that, with the appropriate scaling factor for each composition, we can represent all our customers' D-compositions as elements of the hyperplane containing the vectors whose components sum to a given constant $\kappa$. To do this we need to define the closure operation, which assigns a constant sum to all our D-compositions.

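Definition 3. For any vector of D strictly positive real components,

$$z = [z_1, z_2, \ldots, z_D] \in \mathbb{R}^D_+, \quad z_i > 0 \ \ \forall i = 1, 2, \ldots, D,$$

the closure of z to $\kappa > 0$ is defined as

$$\mathcal{C}(z) = \left[ \frac{\kappa \cdot z_1}{\sum_{i=1}^{D} z_i}, \frac{\kappa \cdot z_2}{\sum_{i=1}^{D} z_i}, \ldots, \frac{\kappa \cdot z_D}{\sum_{i=1}^{D} z_i} \right]. \tag{3.1}$$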

So we can restate compositional equivalence as follows: two vectors $x, y \in \mathbb{R}^D_+$ are compositionally equivalent if $\mathcal{C}(x) = \mathcal{C}(y)$ for any choice of the closure constant $\kappa$.
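The closure operation and the equivalence test built on it can be sketched in Python as follows. The function names and the two baskets are illustrative assumptions (the second basket is simply the first one multiplied by ten), not data from the F-IT dataset.

```python
def closure(z, kappa=1.0):
    """Rescale a vector of strictly positive parts so that it sums to kappa."""
    if any(zi <= 0 for zi in z):
        raise ValueError("all parts must be strictly positive")
    total = sum(z)
    return [kappa * zi / total for zi in z]


def compositionally_equivalent(x, y, tol=1e-12):
    """True when x and y differ only by a positive scale factor."""
    if len(x) != len(y):
        return False
    return all(abs(a - b) <= tol for a, b in zip(closure(x), closure(y)))


# Hypothetical baskets: same proportions, ten times the volume.
small_customer = [35, 35, 10]
large_customer = [350, 350, 100]

print(closure(small_customer))
print(compositionally_equivalent(small_customer, large_customer))
```

Comparing closed vectors rather than searching for the constant c keeps the test independent of the size of the customer, which is the property the definition is meant to capture.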

With the closure operation we represent all our D-compositions (the customers' amounts for all the product families under analysis) rescaled to sum to the same constant. From now on we will consider $\kappa = 1$, as we are used to dealing with Cross Selling ratios that are ideally close to 1.

With this operation we will be able to define the sample space for our compositions.

Definition 4. The sample space of compositional data is the simplex,

$$\mathcal{S}^D = \left\{ x = [x_1, x_2, \ldots, x_D] \ \middle|\ x_i > 0,\ i = 1, 2, \ldots, D;\ \sum_{i=1}^{D} x_i = \kappa \right\}. \tag{3.2}$$

Figure 3.1 shows the simplex $\mathcal{S}^D$ of constant sum $\kappa$ for $D = 3$, embedded in $\mathbb{R}^3_+$:


Figure 3.1 – Simplex in $\mathbb{R}^3_+$ and its representation as ternary diagram (Source: [6])

Now that the sample space for compositions is defined, we can also introduce the concept of subcomposition.

Definition 5. Given a composition x and a selection of indices $S = \{i_1, i_2, \ldots, i_s\}$, a subcomposition $x_S$, with s parts, is obtained by applying the closure operation to the subvector $[x_{i_1}, x_{i_2}, \ldots, x_{i_s}]$ of x.

The set of subscripts S indicates which parts are selected in the subcomposition, not necessarily the first s ones.
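A subcomposition is just a selection followed by a closure, as the following Python sketch shows. The 5-part basket is a hypothetical example, and the indices are 0-based as usual in Python.

```python
def closure(z, kappa=1.0):
    """Rescale a vector of positive parts so that it sums to kappa."""
    total = sum(z)
    return [kappa * zi / total for zi in z]


def subcomposition(x, indices, kappa=1.0):
    """Select the parts listed in `indices` (0-based) and re-close them."""
    return closure([x[i] for i in indices], kappa)


# Hypothetical 5-part basket (amounts per macro family, arbitrary units).
basket = [120, 60, 30, 20, 10]
sub = subcomposition(basket, [0, 1, 3])

# The ratio between two selected parts is the same before and after.
print(sub, sub[0] / sub[1], basket[0] / basket[1])
```

The selected parts need not be the first ones: here the first, second and fourth parts are kept, and their mutual ratios are unchanged.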

It is essential to remark that in most situations we are able to measure only subcompositions (think, for instance, of the measurement of the basic components of a chemical compound: some components are not measured because they are truly absent, others because of instrument errors or other external factors). This generally happens in analyses in biological fields. We will see in Chapter 4 how important it is to treat subcompositions properly in relation to the whole compositions of the same dataset. In our specific case we will carry out some analyses on subcompositions of our initial customers to obtain results on 3 components, also because this is more intuitive and helps us draw some graphical conclusions. Subcompositions are one way to reduce the dimensionality of a compositional data set: another quite commonly used way is to amalgamate some components, that is, to sum them into one new part.

Property 1. Amalgamation: Given a composition $x \in \mathcal{S}^D$, a selection of indices $A = \{i_1, \ldots, i_a\}$ (not necessarily the first ones), with $D - a \geq 1$, and the set of remaining indices $\bar{A}$, the value

$$x_A = \sum_{j \in A} x_j$$

is called amalgamated part or amalgamated component. The vector $x' = [x_{\bar{A}}, x_A]$, containing the components with subscript in $\bar{A}$ grouped in $x_{\bar{A}}$ and the amalgamated component $x_A$, is called amalgamated composition, which is in $\mathcal{S}^{D-a+1}$. Note that using a fill-up or residual value is equivalent to using an amalgamated composition.
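Amalgamation can be sketched in Python as below. The function name `amalgamate` and the 5-part composition are illustrative assumptions; indices are 0-based.

```python
def amalgamate(x, indices):
    """Return the parts not in `indices`, followed by the sum of the parts in `indices`."""
    group = set(indices)
    if not group or len(group) >= len(x):
        raise ValueError("need 1 <= a <= D - 1 amalgamated parts")
    kept = [xi for i, xi in enumerate(x) if i not in group]
    merged = sum(xi for i, xi in enumerate(x) if i in group)
    return kept + [merged]


# Hypothetical closed 5-part composition; the last two parts are merged.
x = [0.40, 0.25, 0.15, 0.12, 0.08]
x_amalgamated = amalgamate(x, [3, 4])
print(x_amalgamated)
```

With D = 5 and a = 2 the result has D - a + 1 = 4 parts and still sums to the same constant, so no closure is needed after amalgamating an already closed composition.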

3.2 Compositional analysis principles

Since their nature is based on the relation between the components and the whole, it is clear that compositions (as a statistical tool) must meet certain requirements (or principles). Aitchison in 1986 defined three principles that induce geometric transformations of the dataset, allowing classic statistical tools to be used on the transformed dataset.

3.2.1 Scale invariance

As we mentioned, compositions carry only relative information. In our business case it does not matter whether we are comparing a customer whose generated volumes are much larger than another's: we want to analyse them through the relations between the different product families that they buy, so we need to define the concept of scale invariance.

Definition 6. A function $f(\cdot)$ defined on $\mathbb{R}^D_+$ is scale invariant if, for any positive real value $\lambda$ and any $x \in \mathbb{R}^D_+$, it satisfies $f(\lambda x) = f(x)$, that is, it returns the same result for all compositionally equivalent compositions.

Many functions satisfy this principle. An example is the simple ratio function: given compositions $x = [x_1, x_2, \ldots, x_D]$ and $y = \lambda x$, it is easy to see that $f(x) = x_1/x_2 = (\lambda x_1)/(\lambda x_2) = f(y) = f(\lambda x)$. In our business case it means that if the ratio between the first 2 product families were the same for two customers (even with different amounts), we would treat the customers as equivalent with respect to these two families.

However, ratios between parts of a composition, as we defined them, are strictly positive and depend on the ordering of the parts: $x_1/x_2$ and $x_2/x_1$ return two completely different results. A convenient transformation of ratios is the corresponding logratio, $f(x) = \ln(x_1/x_2)$.

In this way the inversion of the ratio only produces a change of sign, which makes the function symmetric with respect to the ordering of the composition's parts.
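A short numerical check, using the Sensors and Drives amounts of Customer 1 in Table 2.1, illustrates both points: the two ratios look unrelated, while the two logratios differ only in sign and are unaffected by a common scale factor. The variable names and the factor 10 are illustrative assumptions.

```python
import math

sensors, drives = 2800, 3500   # Customer 1 of Table 2.1

ratio = sensors / drives
inverse_ratio = drives / sensors
logratio = math.log(sensors / drives)
inverse_logratio = math.log(drives / sensors)

# Scale invariance: multiplying both amounts by the same factor changes nothing.
scaled_logratio = math.log((10 * sensors) / (10 * drives))

print(ratio, inverse_ratio)
print(logratio, inverse_logratio, scaled_logratio)
```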

3.2.2 Permutation invariance

Another crucial point is that the results of the analysis must not depend on the order of the components in our dataset.

Definition 7. A function $f(\cdot)$ of a vector argument $x = [x_1, x_2, \ldots, x_D]$ is permutation invariant if the value of $f(\cdot)$ does not change when we permute the components of x.

If we think of the log-ratio approach, this is a very important principle when we ask which methods can be meaningfully applied to coordinates of compositional data. A naive Euclidean distance of alr transformed data is not permutation invariant and is not the proper tool for cluster analysis: we would risk obtaining different clusterings depending on which variable comes last in the dataset. An alternative transformation will be discussed below.
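The dependence on the reference part can be checked numerically. The sketch below assumes the usual definition of the additive log-ratio (alr), in which every part is divided by one chosen reference part before taking logarithms; the function name `alr` and its `denominator` argument are illustrative, and the two customers are those listed with Figure 3.2.

```python
import math


def alr(x, denominator):
    """Additive log-ratio: log of every other part over the chosen reference part."""
    return [math.log(xi / x[denominator])
            for i, xi in enumerate(x) if i != denominator]


cust1 = [115, 25, 5]   # Drives, Sensors, Mountings
cust2 = [90, 45, 10]

# Same two customers, two different choices of the reference part.
d_last = math.dist(alr(cust1, 2), alr(cust2, 2))    # Mountings as denominator
d_first = math.dist(alr(cust1, 0), alr(cust2, 0))   # Drives as denominator
print(d_last, d_first)
```

The two printed distances differ, so a clustering based on them would depend on an arbitrary ordering of the columns.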

3.2.3 Subcompositional coherence

The final principle we need is subcompositional coherence: subcompositions behave like the orthogonal projections of real analysis, applied to their original compositions. This leads to two important consequences:

- The distance (however we define it) between two compositions must be greater than or equal to the distance between them when any subcomposition is considered. This is called subcompositional dominance. It can easily be shown that the Euclidean distance between compositional vectors does not fulfil this condition.

- If a non-informative part is removed, results should not change: measures of association or dissimilarity between components, for example correlations or distances, are unaffected by considering subcompositions instead of the whole composition.

Scale invariance of the results is preserved within arbitrary subcompositions, that is, the ratios between any parts in the subcomposition are equal to the corresponding ratios in the original composition.
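A small counterexample makes the failure of the Euclidean distance concrete. The two 3-part compositions below are hypothetical and differ only by a swap of their first two parts; the Aitchison distance used for comparison is the one defined in Section 3.3.2, written here in its equivalent centred form.

```python
import math


def closure(z):
    """Rescale a vector of positive parts so that it sums to 1."""
    total = sum(z)
    return [zi / total for zi in z]


def aitchison_distance(x, y):
    """Euclidean distance between the centred log-ratio vectors of x and y."""
    logs = [math.log(a / b) for a, b in zip(x, y)]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((v - mean) ** 2 for v in logs))


# Hypothetical compositions that only swap their first two parts.
x = [0.1, 0.2, 0.7]
y = [0.2, 0.1, 0.7]
x_sub, y_sub = closure(x[:2]), closure(y[:2])

print("Euclidean:", math.dist(x, y), math.dist(x_sub, y_sub))
print("Aitchison:", aitchison_distance(x, y), aitchison_distance(x_sub, y_sub))
```

The Euclidean distance grows when the third part is dropped, which violates subcompositional dominance, while the Aitchison distance does not increase. The ratio between the two retained parts is the same in the subcomposition and in the full composition.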

3.3 The Aitchison geometry

To work with our dataset and to treat it as a set of observations we need to build a proper environment for it and, at the same time, preserve the principles defined above.

If we were working in the real space, we could add vectors, multiply them by scalar values and look for properties such as orthogonality, or compute the distance between two points.

        Drives   Sensors   Mountings
Cust1      115        25           5
Cust2       90        45          10
Cust3       60        15          15
Cust4       35        35          10

Figure 3.2 – Example of representation of 4 Customers on $\mathcal{S}^3$ (ternary diagram with vertices Drives, Sensors, Mountings)

        Cust1      Cust2      Cust3
Cust2   32.40370
Cust3   56.78908   42.72002
Cust4   80.77747   55.90170   32.40370

Table 3.1 – Euclidean Distance between customers on $\mathcal{S}^3$

All of this, and much more, is possible because the real space is a linear vector space with a metric structure. We are familiar with its geometric structure, the Euclidean geometry, and we are used to representing observations within it. But this geometry is generally not a proper geometry for compositional data, because it is not able to capture the relations between the components of a composition. Let us take as an example the simple case of the 4 customers represented in Figure 3.2.

If we treat the simplex as a standard Euclidean space, we come to the (graphical) conclusion that the distance between Cust1 and Cust2 differs from the distance between Cust3 and Cust4. The results given by the Euclidean distance are in Table 3.1.

The Euclidean distance is certainly the same for the two pairs, as in both there is a difference of 25, 20 and 5 units between the three respective components. But in the second pair the first component is almost doubled from one customer to the other, while in the first pair the relative difference is lower; the relative differences are also not the same in the third component. An approach that takes relative differences into account seems more adequate to describe compositional variability.
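The values of Table 3.1 can be checked with the standard library alone. The sketch below computes the Euclidean distance on the raw amounts listed with Figure 3.2, which is what reproduces the table; the dictionary layout is an illustrative choice.

```python
import math

# Amounts of the 4 customers of Figure 3.2 (Drives, Sensors, Mountings).
customers = {
    "Cust1": [115, 25, 5],
    "Cust2": [90, 45, 10],
    "Cust3": [60, 15, 15],
    "Cust4": [35, 35, 10],
}

names = list(customers)
euclid = {
    (a, b): math.dist(customers[a], customers[b])
    for i, a in enumerate(names) for b in names[i + 1:]
}

for pair, d in euclid.items():
    print(pair, round(d, 5))
```

The pairs (Cust1, Cust2) and (Cust3, Cust4) get exactly the same distance because the absolute differences are 25, 20 and 5 units in both cases, regardless of how large those differences are relative to the amounts involved.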


This is not the only reason for discarding the usual Euclidean geometry as a proper tool for analysing compositional data. Problems may appear in other situations, such as those where results end up outside the sample space, for example when translating compositional vectors, when computing joint confidence regions for random compositions under assumptions of normality, or when drawing ellipses and lines. So we need to build a geometry to work with compositional data in the simplex, where things are not as simple as they are in real space.

To do this we have to define two operations that equip the simplex with a vector space structure. The first is perturbation, which is analogous to addition in real space; the second is powering, which is analogous to multiplication by a scalar in real space. Moreover, it is possible to obtain a Euclidean vector space structure on the simplex by adding an inner product, a norm and a distance to the previous definitions. With all these definitions we can operate in the simplex in the same way as one operates in real space.

3.3.1 Defining a vector space structure

We thus have to define the operations analogous to addition and multiplication by a scalar in real space, using the closure operation of Definition 3:

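Definition 8. Perturbation of $x \in \mathcal{S}^D$ by $y \in \mathcal{S}^D$,

$$x \oplus y = \mathcal{C}[x_1 y_1, x_2 y_2, \ldots, x_D y_D] \in \mathcal{S}^D$$

Definition 9. Power transformation or powering of $x \in \mathcal{S}^D$ by a constant $\alpha \in \mathbb{R}$,

$$\alpha \odot x = \mathcal{C}[x_1^{\alpha}, x_2^{\alpha}, \ldots, x_D^{\alpha}] \in \mathcal{S}^D$$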

Now we can say that the triple $(\mathcal{S}^D, \oplus, \odot)$, with perturbation and powering, is a vector space. This means that these operations are analogous to translation and scalar multiplication in real space.
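The two operations translate directly into code. The following Python sketch uses illustrative function names (`perturb`, `power`); the customer is Cust4 of Figure 3.2, and the perturbation vector and the exponent are among those quoted in the captions of Figures 3.3 and 3.4.

```python
def closure(z, kappa=1.0):
    """Rescale a vector of positive parts so that it sums to kappa."""
    total = sum(z)
    return [kappa * zi / total for zi in z]


def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([xi * yi for xi, yi in zip(x, y)])


def power(alpha, x):
    """Powering: component-wise power followed by closure."""
    return closure([xi ** alpha for xi in x])


cust4 = closure([35, 35, 10])          # customer 4 of Figure 3.2
shifted = perturb(cust4, [5, 30, 80])  # perturbation vector from Figure 3.3
stretched = power(2, cust4)            # exponent from Figure 3.4
print(shifted, stretched)
```

Both results are again closed compositions, so the operations never leave the simplex.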

Figure 3.3 – Previous example with customers perturbed by y = [5, 30, 80] and y = [10, 100, 10]

Figure 3.4 – Previous example with powering applied to customers with α = 0.5 and α = 2

Property 2. $(\mathcal{S}^D, \oplus)$ has a commutative group structure; that is, for $x, y, z \in \mathcal{S}^D$, it holds:

1. commutative property: $x \oplus y = y \oplus x$;
2. associative property: $(x \oplus y) \oplus z = x \oplus (y \oplus z)$;
3. neutral element: $n = \mathcal{C}[1, 1, \ldots, 1] = \left[\frac{1}{D}, \frac{1}{D}, \ldots, \frac{1}{D}\right]$; n is the barycentre of the simplex and is unique;
4. inverse of x: $x^{-1} = \mathcal{C}[x_1^{-1}, x_2^{-1}, \ldots, x_D^{-1}]$; thus, $x \oplus x^{-1} = n$.

By analogy with standard operations in real space, for the perturbation difference we will write $x \oplus ((-1) \odot y) = x \ominus y$.

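Property 3. Powering satisfies the properties of an external product. For $x, y \in \mathcal{S}^D$, $\alpha, \beta \in \mathbb{R}$, it holds:

1. associative property: $\alpha \odot (\beta \odot x) = (\alpha \cdot \beta) \odot x$;
2. distributive property 1: $\alpha \odot (x \oplus y) = (\alpha \odot x) \oplus (\alpha \odot y)$;
3. distributive property 2: $(\alpha + \beta) \odot x = (\alpha \odot x) \oplus (\beta \odot x)$;
4. neutral element: $1 \odot x = x$; the neutral element is unique.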

Note that the closure operation cancels out any constant and, thus, the closure constant itself is not important from a mathematical point of view. This fact allows us to omit the closure in intermediate steps of any computation without problem.

It also has significant implications for practical reasons, as shall be seen during simplicial principal component analysis. We can express this property for $z \in \mathbb{R}^D_+$ and $x \in \mathcal{S}^D$ as

$$x \oplus (\alpha \odot z) = x \oplus (\alpha \odot \mathcal{C}(z))$$

Nevertheless, one should be always aware that the closure constant is very important for the interpretation of the units of the problem at hand. Therefore, controlling for the right units should be the last step in any analysis.
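Some of the properties above can be verified numerically. This is a sketch under stated assumptions: the helper names are illustrative, the vectors are the amounts of Cust1, Cust2 and Cust3 of Figure 3.2, and equality is checked only up to floating-point tolerance.

```python
import math


def closure(z):
    """Rescale a vector of positive parts so that it sums to 1."""
    total = sum(z)
    return [zi / total for zi in z]


def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([xi * yi for xi, yi in zip(x, y)])


def power(alpha, x):
    """Powering: component-wise power followed by closure."""
    return closure([xi ** alpha for xi in x])


def close_to(u, v, tol=1e-12):
    """Component-wise comparison up to a small absolute tolerance."""
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(u, v))


x = closure([115, 25, 5])
y = closure([90, 45, 10])
z = [60, 15, 15]            # deliberately left unclosed
n = closure([1, 1, 1])      # neutral element (barycentre)

checks = {
    "commutative": close_to(perturb(x, y), perturb(y, x)),
    "neutral element": close_to(perturb(x, n), x),
    "inverse": close_to(perturb(x, power(-1, x)), n),
    "distributive": close_to(power(0.5, perturb(x, y)),
                             perturb(power(0.5, x), power(0.5, y))),
    "closure can be omitted": close_to(perturb(x, power(2, z)),
                                       perturb(x, power(2, closure(z)))),
}
print(checks)
```

The last check corresponds to the remark on the closure constant: closing z before or after the operation gives the same composition, so intermediate closures can be skipped.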

3.3.2 Aitchison inner product, norm, and distance

To obtain a Euclidean vector space structure, we take the following inner product, with associated norm and distance (the subindex A stands for Aitchison).

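Definition 10. Aitchison scalar product for compositions $x, y \in \mathcal{S}^D$,

$$\langle x, y \rangle_A = \frac{1}{D} \sum_{i > j}^{D} \ln\frac{x_i}{x_j} \ln\frac{y_i}{y_j}$$

Definition 11. Aitchison norm for composition $x \in \mathcal{S}^D$,

$$\|x\|_A = \sqrt{\frac{1}{2D} \sum_{i=1}^{D} \sum_{j=1}^{D} \left( \ln\frac{x_i}{x_j} \right)^2}$$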

Now we are also able to define a proper distance that will allow us to work on our compositional dataset as in real space.

Definition 12. Aitchison distance: the distance between compositions $x, y \in \mathcal{S}^D$ is defined as

$$d_A(x, y) = \|x \ominus y\|_A = \sqrt{\frac{1}{2D} \sum_{i=1}^{D} \sum_{j=1}^{D} \left( \ln\frac{x_i}{x_j} - \ln\frac{y_i}{y_j} \right)^2}$$

With this new definition of distance, let us consider again the example of the 4 customers in Figure 3.2 and compute the Aitchison distance between Customers 1 and 2 and between Customers 3 and 4, recalling that these two distances were identical under the Euclidean distance (although that conclusion was misleading from a logical point of view):

        Cust1       Cust2       Cust3
Cust2   0.7269086
Cust3   1.3747150   1.0646298
Cust4   1.4143010   0.6917658   1.0815202

Table 3.2 – Aitchison Distance between customers on $\mathcal{S}^3$

We can see that the two distances are now different, since this definition also takes into account the relation between the components of our customers. We thus understand that geometry on the simplex follows a logic that differs from our standard geometric conceptions: in the ternary diagram, too, the two distances appear different (in contrast with the Euclidean distance between customers).
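The distances of Table 3.2 can be recomputed directly from the double-sum form of Definition 12. The sketch below is plain Python written for illustration; it is not the implementation of the R package used later in this work.

```python
import math


def aitchison_distance(x, y):
    """Aitchison distance via the double sum over all pairs of parts."""
    D = len(x)
    total = 0.0
    for i in range(D):
        for j in range(D):
            total += (math.log(x[i] / x[j]) - math.log(y[i] / y[j])) ** 2
    return math.sqrt(total / (2 * D))


# Amounts of the 4 customers of Figure 3.2 (Drives, Sensors, Mountings).
customers = {
    "Cust1": [115, 25, 5],
    "Cust2": [90, 45, 10],
    "Cust3": [60, 15, 15],
    "Cust4": [35, 35, 10],
}

names = list(customers)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, round(aitchison_distance(customers[a], customers[b]), 7))
```

Since only ratios between parts enter the formula, the raw amounts can be used without closing them first.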

Transformations

Different log-ratio transformations can be applied to compositional data, each giving different results: the most used are the Centered log-ratio transformation (clr) and the Isometric log-ratio transformation (ilr):

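- Centered log-ratio transform, with $g(x) = \sqrt[D]{x_1 \cdots x_D}$ the geometric mean of the parts:

$$\mathrm{clr}(x) = z = \left[ \ln\frac{x_1}{g(x)}; \ldots; \ln\frac{x_D}{g(x)} \right] = \frac{\ln(x)}{D} \cdot \begin{pmatrix} D-1 & -1 & \cdots & -1 \\ -1 & D-1 & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & D-1 \end{pmatrix}$$

$$\mathrm{clr}^{-1}(z) = \mathcal{C}[\exp(z)]$$

- Isometric log-ratio transform

$$\mathrm{ilr}_V(x) = \mathrm{clr}(x) \cdot V = \ln(x) \cdot V$$

for a given matrix V of D rows and (D−1) columns such that $V^t \cdot V = I_{D-1}$ (identity matrix of D−1 elements) and $V \cdot V^t = I_D + a\mathbf{1}$, where a may be any value and $\mathbf{1}$ is a matrix full of ones. The inverse is:

$$\mathrm{ilr}_V^{-1}(x) = \mathcal{C}[\exp(x \cdot V^t)]$$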
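Both transformations can be sketched in plain Python for D = 3. The matrix V below is only one possible choice and is an assumption of this example (any matrix with orthonormal columns orthogonal to the vector of ones would do); the function names are illustrative and do not reproduce the interface of any package.

```python
import math


def closure(z):
    """Rescale a vector of positive parts so that it sums to 1."""
    total = sum(z)
    return [zi / total for zi in z]


def clr(x):
    """Centred log-ratio: log of each part over the geometric mean."""
    logs = [math.log(xi) for xi in x]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def clr_inv(z):
    """Inverse clr: closure of the exponentials."""
    return closure([math.exp(zi) for zi in z])


# One possible V for D = 3 (assumed here for illustration).
V = [
    [1 / math.sqrt(2), 1 / math.sqrt(6)],
    [-1 / math.sqrt(2), 1 / math.sqrt(6)],
    [0.0, -2 / math.sqrt(6)],
]


def ilr(x):
    """Isometric log-ratio coordinates: clr(x) times V (row vector times matrix)."""
    c = clr(x)
    return [sum(c[i] * V[i][k] for i in range(len(c))) for k in range(len(V[0]))]


def ilr_inv(coords):
    """Inverse ilr: closure of exp(coords times the transpose of V)."""
    z = [sum(coords[k] * V[i][k] for k in range(len(coords))) for i in range(len(V))]
    return closure([math.exp(zi) for zi in z])


cust1, cust2 = [115, 25, 5], [90, 45, 10]   # customers 1 and 2 of Figure 3.2
print(clr(cust1), ilr(cust1))
print(math.dist(ilr(cust1), ilr(cust2)))
```

The clr coordinates sum to zero, and the ordinary Euclidean distance between ilr coordinates coincides with the Aitchison distance between the original compositions, which is why standard methods such as K-means can be applied to the transformed data.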

3.3.3 Geometry on $\mathcal{S}^3$: figures on ternary diagrams

Ternary diagrams interpretation

Now that we have a well-defined geometry, it is useful to show how some geometric figures are represented on the simplex $\mathcal{S}^3$ using ternary diagrams. These elements have to be interpreted in order to understand the results of our statistical analyses: for instance, to understand the meaning of the Principal Component directions in PCA, to interpret the position of our customers on the ternary diagram and to understand the meaning of the clusters derived from classification.

Figure 3.5 – Template for ternary diagrams and importance of the components' proportions in $\mathcal{S}^3$.

In Figure 3.6 some customers are represented on ternary diagrams in order to show how they behave there: customers with the same shape are similar. The nearer a customer is to a vertex, the more important the proportion of that vertex's component is in the composition (Figure 3.5): conversely, if a point lies near the side opposite a certain vertex, the component related to that vertex is very weak.

Shapes on ternary diagrams

As we are working on the simplex, geometric shapes in ternary diagrams behave differently from real space. In the left ternary diagram of Figure 3.7 we show how parallel lines are represented, and in the right ternary diagram how circles and ellipses change with respect to real space.

Figure 3.6 – Similar customers (by shape or colour) lying on same proportion lines.

Some aspects of ternary diagram interpretation are intuitive, for instance the position of a point on the diagram and its relation to the distance from a vertex; others are less familiar: the direction and shape of lines, together with the translation of elements on the diagram, have to be treated with proper attention in order not to mislead data interpretation.

Now we have a proper mathematical environment to work on the simplex with F-IT customers' data, and we also have a proper representation tool, the ternary diagram, to support the analysis results.


Chapter 4 Business case: Compositional data analysis of Cross Selling data

Now, with the Aitchison geometry, we have the sample space for our compositional data, that is, the 5-part simplex. The latter is the 4-dimensional subset of $\mathbb{R}^D$, with $D = 5$, that contains all 5-part compositions that sum up to a prescribed constant, in our case 1.

For all the analyses of this chapter we used a specific R package named compositions [8], which already provides some operations on compositional data (such as the Aitchison distance and Compositional PCA). For analyses like K-means clustering and Linear and Quadratic Discriminant Analysis, specific operations have been defined using the ilr and clr transformations.

4.1 F-IT Customer dataset as Compositional dataset

We have already introduced the Festo approach to Cross Selling, with the definition of its Ratios and its weaknesses, and we have noted the importance of the Industry Sector variable in the analysis. We now want to define our data so as to be coherent with the Company's strategies and to be able to carry out statistical analyses on them.


4.1.1 Considering Customer as compositional observations

As explained in Chapter 1, so far customers have been analysed for Cross Selling on Level 1, as observations of absolute amounts for each product family (ignoring the relations between these macro-families), and on Level 2, through ratios for some product families (the ones that we are sure are related to Drives). We can indeed take all the macro-families that we are sure are related to each other (i.e. all the macro families used in the Level 2 ratios, together with the amount of Pneumatic Drives).

Our aim is to be able to use classic statistical procedures such as PCA, Cluster Analysis, and Linear and Quadratic Discriminant Analysis in order to confirm or not the direction that Festo wants to take for its next marketing campaigns, and to establish which variable is best taken as a guide. For the theory describing the above-mentioned classical statistical methods we refer to (ref Johnson-Wichern).

Our first target is to reduce the variables of our dataset.

4.2 Simplification of Dataset

4.2.1 Variables taken into consideration

The focus is to reduce dataset variables in order to keep only the ones that we want to use for our analyses. We split the data reduction based on their nature.

In our final dataset every observation will correspond to a single customer, with some categorical and some numerical variables.

Categorical variables

To be able to interpret our final results we will keep the Customer Number and Customer Name variables, even though they take unique values for each observation. Another variable we will keep is the code of the F-IT Sales Engineer of each customer. As explained in the first chapter, our aim is to analyse the Industry Sector of F-IT customers: this is crucial, as we want to understand whether the idea of F-IT to run specific Cross Selling campaigns differentiated by ISM is reflected in the results of our classification analyses.

Numerical variables: macro-families amounts taken into account

Not all macro families are perfectly related to each other. The ratios of Level 2 analysis are just six, and two of them (the last two) are stand-alone ratios not related to the others. Thus we decided to take into account only strictly related macro families: Pneumatic Drives, Proximity Sensors, Throttles, Cylinder Mountings and Piston Rod Attachment. We stated in Chapter 3 that the neutral element n of perturbation is the composition with uniform amounts in each part, i.e., for our dataset it would be

$$n = \mathcal{C}[k, k, k, k, k] = \left[\frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}\right]$$

We will also find some issues in interpreting graphical results for some analyses applied to the complete dataset with compositions of 5 components, since we are not able to represent data in $\mathcal{S}^5$ and need to represent them on the $\mathcal{S}^3$ subspace. In these cases we will repeat some analyses on subcompositions of three components, where results can be explained more clearly.

In Figure 4.1 we have some examples of products composing the selected macro families.
