Regression

[Photo: FIAT 500 C, 1953]

[Chart: Segment C, price (M£) vs. weight/power (g/W): scatter of segment-C cars. How is our preferred car, the Rover 45, positioned?]

[Chart: the same scatter with a regression (trend) line added.]
Regression

Definition. Let Ψ be a class of curves of ℝ². We call trend or regression line of the set A in the class Ψ the element C* of Ψ that lies at a minimum distance from A.

For any x, y ∈ ℝ², let d(x, y) be the distance of x from y based on some metric, with
d(x, y) ≥ 0, d(x, y) = 0 ⇔ x = y (non-negativity)
d(x, y) = d(y, x) (symmetry)
d(x, y) + d(y, z) ≥ d(x, z) (triangle inequality)

Let C be a curve of ℝ², x a point of ℝ², and A a discrete and finite subset of ℝ².

Definition. d(x, C) = min {d(x, y): y ∈ C} is called the distance of x from C.

Definition. d(A, C) = (1/|A|) Σx∈A d(x, C) is called the distance of A from C.

[Figure: a point Pk = (xk, yk) and a curve C, with the distance dk = d((xk, yk), C) = |yk – f(xk)| and the normal to C by Pk.]
On the metric

The metric chosen depends on the situation modeled:

1) If in an experiment all the record entries are affected by error, it is convenient to define the distance as the length of the shortest segment touching both Pk and C.

2) If uncertainty affects just the dependent variable y, one can choose the absolute value of the difference between the y-coordinate of Pk and the value attained by the curve at xk.

[Figure: point Pk = (xk, yk), the curve value f(xk), and the distance d((xk, yk), C).]
Linear regression

If the curve is a straight line y = c1x + c0, we speak of linear regression. Consider case (2), in which the distance of Pk = (xk, yk) ∈ A from C is given by the absolute value of the difference between the y-coordinate of Pk and the value attained by the line at xk, namely

    dk = |yk – c1xk – c0|

[Figure: point Pk = (xk, yk), the line y = c1x + c0, and the vertical residual dk.]
Linear regression

Basically, the distance dk of Pk from C depends on c0 and c1; the choice of the straight line also depends on these two coefficients. Hence a line at a minimum distance from the point set {P1, P2, …, Pm} is one given by those c0 and c1 that minimize

    (1/m)(d1 + d2 + … + dm)
Linear regression

Since the constant factor 1/m does not affect the minimizer, we look for the c0 and c1 that minimize the sum d1 + d2 + … + dm, with dk = |yk – c1xk – c0|. Formulation:

    min d1 + … + dm
    d1 ≥ y1 – c1x1 – c0
    d1 ≥ c1x1 + c0 – y1
    …
    dm ≥ ym – c1xm – c0
    dm ≥ c1xm + c0 – ym
Linear regression

Equivalently, moving all the variables to the left-hand side, the constraints take the standard LP form:

    min d1 + … + dm
    d1 + x1c1 + c0 ≥ y1
    d1 – x1c1 – c0 ≥ –y1
    …
    dm + xmc1 + c0 ≥ ym
    dm – xmc1 – c0 ≥ –ym

At the optimum, each dk equals |yk – c1xk – c0|, the vertical distance of Pk from the line.
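As a minimal sketch (not part of the slides), this LP can be solved with scipy.optimize.linprog; the data below are illustrative, with one outlier added to show the robustness of the L1 fit:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data: points on y = 2x + 1, plus one outlier at x = 5.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 100.0])
m = len(x)

# Variables z = (c1, c0, d1, ..., dm); minimize d1 + ... + dm.
obj = np.r_[0.0, 0.0, np.ones(m)]

# dk >= yk - c1*xk - c0   ->  -xk*c1 - c0 - dk <= -yk
# dk >= c1*xk + c0 - yk   ->   xk*c1 + c0 - dk <=  yk
I = np.eye(m)
A_ub = np.block([[-x[:, None], -np.ones((m, 1)), -I],
                 [ x[:, None],  np.ones((m, 1)), -I]])
b_ub = np.r_[-y, y]

bounds = [(None, None)] * 2 + [(0, None)] * m
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
c1, c0 = res.x[:2]
print(c1, c0)   # the L1 line ignores the outlier: c1 = 2, c0 = 1
```

Unlike least squares, the L1 objective leaves the line on the four collinear points and charges the whole error to the outlier's dk.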
Linear regression (least squares)

Alternatively, one can define the distance d(A, c0, c1) = d(c0, c1) of the line y = c1x + c0 from the set A via

    d²(c0, c1) = (1/m)(d1² + … + dm²),  with  dk² = (yk – c1xk – c0)²

This is known as the least squares method.

[Figure: points P1, …, P4 and a line; the squared residuals d1², …, d4² are drawn as squares on the vertical gaps.]

The surface z = d²(c0, c1) is an upward-opening (convex) paraboloid of ℝ³, so its minimum (c0*, c1*) ∈ ℝ² can be computed by setting the partial derivatives in c0 and c1 to zero:

    ∂d²(c0, c1)/∂c0 = –(1/m) Σk=1..m 2(yk – c1xk – c0) = (2/m)(c1 Σk=1..m xk – Σk=1..m yk) + 2c0

    ∂d²(c0, c1)/∂c1 = –(1/m) Σk=1..m 2xk(yk – c1xk – c0) = (2/m)(c1 Σk=1..m xk² + c0 Σk=1..m xk – Σk=1..m xkyk)

Equating both to zero and dividing by 2 gives the linear system

    c0 Σk xk + c1 Σk xk² = Σk xkyk
    m c0 + c1 Σk xk = Σk yk

or, dividing by m and writing mean(·) for the sample mean (1/m)Σk=1..m(·),

    mean(x) c0 + mean(x²) c1 = mean(xy)
    c0 + mean(x) c1 = mean(y)

[Figure: the paraboloid z = d²(c0, c1) over the (c0, c1) plane, with minimum at (c0*, c1*).]

Solving the linear system returns the values of c0* and c1*, and hence the regression line y = c1*x + c0*.
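The two normal equations can be solved directly with numpy; a small sketch on illustrative data:

```python
import numpy as np

# Illustrative points
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 6.0])

mx  = x.mean()          # mean of xk
mx2 = (x * x).mean()    # mean of xk^2
my  = y.mean()          # mean of yk
mxy = (x * y).mean()    # mean of xk*yk

# Normal equations:  mx*c0 + mx2*c1 = mxy  and  c0 + mx*c1 = my
A = np.array([[mx, mx2],
              [1.0, mx]])
b = np.array([mxy, my])
c0, c1 = np.linalg.solve(A, b)
print(c0, c1)   # 0.5 1.4 for this data
```

The same coefficients come out of any least-squares routine; solving the 2×2 system just makes the derivation above concrete.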
Polynomial regression

Let us now pass to a curve of the form y = ct x^t + ct–1 x^(t–1) + … + c1x + c0. If we set

    dk = |yk – ct xk^t – ct–1 xk^(t–1) – … – c0|

the parameters of C can be obtained by solving the LP

    min d1 + … + dm
    d1 ≥ y1 – ct x1^t – ct–1 x1^(t–1) – … – c0
    d1 ≥ ct x1^t + ct–1 x1^(t–1) + … + c0 – y1
    …
    dm ≥ ym – ct xm^t – ct–1 xm^(t–1) – … – c0
    dm ≥ ct xm^t + ct–1 xm^(t–1) + … + c0 – ym

[Figure: point Pk = (xk, yk), the polynomial curve f, and the vertical residual dk = |yk – f(xk)|.]
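The slides formulate the polynomial fit as an LP; as a quick least-squares counterpart (a sketch, not the slides' method), numpy fits a degree-t polynomial in one call:

```python
import numpy as np

# Points on an exact parabola, so the fit recovers it
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = x**2 - 2.0 * x + 1.0

coeffs = np.polyfit(x, y, deg=2)  # least-squares fit, returns [c2, c1, c0]
print(coeffs)                     # recovers [1, -2, 1]
```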
Other non linear models

Generalizing, a model of the type

    y = g [ Σi=1..t ci fi(x) ]

with fi, g known functions and g monotone invertible, can be determined by setting

    dk = |g⁻¹(yk) – c1f1(xk) – … – ct ft(xk)|

and solving the LP

    min d1 + … + dm
    d1 ≥ g⁻¹(y1) – c1f1(x1) – … – ct ft(x1)
    d1 ≥ c1f1(x1) + … + ct ft(x1) – g⁻¹(y1)
    …
    dm ≥ g⁻¹(ym) – c1f1(xm) – … – ct ft(xm)
    dm ≥ c1f1(xm) + … + ct ft(xm) – g⁻¹(ym)

or, with minimum squares, by defining

    dk² = [g⁻¹(yk) – c1f1(xk) – … – ct ft(xk)]²

and solving the relevant linear system.
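A concrete instance of this scheme (a sketch on synthetic data): with g = exp and f1(x) = x, fitting y = exp(c1x + c0) reduces to a linear fit of the points (xk, log yk):

```python
import numpy as np

# Synthetic data following the model y = exp(0.5*x + 1) exactly
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.exp(0.5 * x + 1.0)

# g^-1 = log turns the model into a linear regression on (xk, log yk)
c1, c0 = np.polyfit(x, np.log(y), deg=1)
print(c1, c0)   # 0.5 1.0
```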
Example

Let us try to estimate, by the linear model y = c1x + c0, the growth of the US population:

    min Σk=1..m |yk – c1xk – c0|

    year                  1810   1850   1890   1930   1970
    population (millions) 7.2    23.2   62.9   122.8  203.2

The outcome is clearly unacceptable. The fitted line gives –26.6, 23.2, 73.0, 122.8, 172.6 for 1810–1970 and 222.4 for 2010: it even provides a negative value for the year 1810!
Example

Let us then try an exponential model of the type y = e^(c1x + c0):

    min Σk=1..m |log(yk) – c1xk – c0|

In practice, this amounts to applying a linear regression to the points (xk, log(yk)). The fitted values become 10.1, 23.2, 53.4, 122.8, 282.5 for 1810–1970, and the model forecast is that in 2010 the US population will reach 650 million.
Example

The minimum squares method, i.e. minimizing Σk=1..m (log(yk) – c1xk – c0)², provides a slightly reduced trend (fitted values 9.1, 21.0, 48.3, 111.2, 256.0 for 1810–1970) and a forecast of 589.1 million inhabitants for the year 2010.
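A sketch reproducing the least-squares exponential fit on the slide's data; the forecast obtained this way lands near the quoted 589.1 million (small discrepancies may come from rounding in the slides):

```python
import numpy as np

years = np.array([1810.0, 1850.0, 1890.0, 1930.0, 1970.0])
pop   = np.array([7.2, 23.2, 62.9, 122.8, 203.2])   # millions

# Least-squares line through (year, log population)
c1, c0 = np.polyfit(years, np.log(pop), deg=1)
forecast = np.exp(c1 * 2010.0 + c0)
print(round(forecast, 1))
```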
Linear regression in ℝⁿ

The locus of points x of ℝⁿ whose scalar product by a given versor u = (u1, …, un) returns a known value β is a hyperplane H. The points of H fulfill the linear equation

    u1x1 + … + unxn = u·x = β

or, equivalently,

    a1x1 + … + anxn = a·x = b,  where  u = a/|a|  and  b = β|a| = β √(a1² + … + an²)

[Figure: hyperplane H in coordinates x1, x2, with normal versor u and offset β.]

If distance is expressed as in case (2), then we can describe the hyperplane by its explicit form xn = c0 + c1x1 + … + cn–1xn–1, ending up with an LP or a linear system similar to that seen in ℝ².
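In case (2) the explicit form reduces to an ordinary least-squares problem; a minimal sketch in ℝ³ with numpy, on illustrative data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                  # sample points (x1, x2)
xn = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]      # exact plane x3 = 1 + 2x1 - 3x2

A = np.column_stack([np.ones(len(X)), X])     # columns: 1, x1, x2
coef, *_ = np.linalg.lstsq(A, xn, rcond=None)
print(coef)   # recovers [1, 2, -3]
```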
Linear regression in ℝⁿ

If distance is expressed as in case (1), things get a bit more complicated. The distance of a point xk = (x1k, …, xnk) from the hyperplane u·x = β is

    dk = |u·xk – β|

[Figure: point xk, its projection u·xk along u, and its distance dk from the hyperplane.]
Linear regression in ℝⁿ

The optimization problem reads

    min d1 + … + dm
    d1 ≥ x11u1 + … + xn1un – β
    d1 ≥ β – x11u1 – … – xn1un
    …
    dm ≥ x1mu1 + … + xnmun – β
    dm ≥ β – x1mu1 – … – xnmun
    u1² + … + un² = 1

and is a non linear problem, because of the normalization constraint on u.
Linear regression in ℝⁿ

However, the information carried by versor u is redundant. For one, since β is unconstrained, we can limit our attention to the directions belonging to a semi-sphere.
Linear regression in ℝⁿ

And in the very end we are just interested in finding a direction: to this purpose, vectors a whose modulus differs from 1 can also be used.
Linear regression in ℝⁿ

Note that dk = |u·xk – β| = |(a/|a|)·xk – β|, and for any choice of a there exists a real b such that the residual can be written as |a·xk – b|. Replacing the quadratic normalization of u by the linear one a1 + … + an = 1, ai ≥ 0, the problem becomes

    min d1 + … + dm
    d1 ≥ x11a1 + … + xn1an – b
    d1 ≥ b – x11a1 – … – xn1an
    …
    dm ≥ x1ma1 + … + xnman – b
    dm ≥ b – x1ma1 – … – xnman
    a1 + … + an = 1,  ai ≥ 0

This calls for solving 2^(n–1) linear problems in the n + 1 variables a1, …, an, b (one for each pattern of signs of the ai covering the semi-sphere of directions), and keeping the best solution found.
Efficiency curve

    car                     1      2      3      4      5      6
    consumption (l/100 km)  4.8    5.2    5.6    5.0    5.1    5.4
    price (K€)              12.4   9.8    8.5    10.4   10.6   8.9
Model

Let us make the (basically wrong) assumption that clients choose a car adopting a rational behaviour and, prior to purchase, compute its cost on a five-year basis. For the sake of simplicity, suppose that
• costs are given by the car price and by fuel consumption
• the car yearly covers 20,000 km
• a litre of fuel costs 1.3 €

Since a consumption of x l/100 km over 20,000 km/year means 200·x litres per year, the five-year cost of car k is

    cost(k) = price(k) + 5 · consumption(k) · 200 · 1.3

Suppose that, since less fuel consumption means less pollution, we are interested in finding the best solutions under both respects.
Efficiency curve

    car                     1      2      3      4      5      6
    consumption (l/100 km)  4.8    5.2    5.6    5.0    5.1    5.4
    cost (K€)               18.64  16.56  15.78  16.90  17.23  15.92
Model

[Chart: cost (K€) vs. consumption (l/100 km) scatter of the six cars.]

We look for a math model giving the cost c(x) associated with consumption x as

    c(x) = a/x + bx + c    (a, b, c ≥ 0)

The a/x term captures "more consumption, less technology" (a lower price), while the bx term captures "more consumption, more fuel".
Efficiency curve

We want the model curve to run below all observed points (an efficiency frontier), so for each car k the slack

    dk = yk – a/xk – bxk – c ≥ 0

Multiplying by xk > 0, these conditions become linear in (a, b, c):

    a + x1²b + x1c ≤ x1y1
    …
    a + xn²b + xnc ≤ xnyn
    a, b, c ≥ 0
Model

The curve should fit the points from below as tightly as possible:

    min d1 + … + dn,  with  dk = yk – a/xk – bxk – c ≥ 0

Since Σk dk = Σk yk – a Σk(1/xk) – b Σk xk – nc and the yk are constants, this is equivalent to

    max (1/x1 + … + 1/xn) a + (x1 + … + xn) b + nc

Solving the resulting LP yields the optimal solution

    a = 75.678,  b = 0.353,  c = 0
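A sketch of this efficiency-frontier LP with scipy.optimize.linprog, on the six (consumption, cost) points of the example:

```python
import numpy as np
from scipy.optimize import linprog

x = np.array([5.4, 5.1, 5.0, 5.6, 5.2, 4.8])              # consumption (l/100 km)
y = np.array([15.92, 17.23, 16.90, 15.78, 16.56, 18.64])  # 5-year cost (K€)
n = len(x)

# Variables (a, b, c); maximize a*sum(1/xk) + b*sum(xk) + n*c,
# i.e. minimize its negation.
obj = -np.array([(1.0 / x).sum(), x.sum(), float(n)])

# The curve must stay below every point: a/xk + b*xk + c <= yk
A_ub = np.column_stack([1.0 / x, x, np.ones(n)])
res = linprog(obj, A_ub=A_ub, b_ub=y, bounds=[(0, None)] * 3)
a, b, c = res.x
print(a, b, c)   # the slide reports a = 75.678, b = 0.353, c = 0
```

At the optimum the curve touches the two efficient cars (consumptions 5.0 and 5.4); all other points lie strictly above it.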
Efficiency curve

[Chart: cost (K€) vs. consumption (l/100 km): the six sample points together with the fitted efficiency curve.]