mixed fractional factorial designs

(1)

Complex and real codings for

mixed fractional factorial designs

Maria Piera Rogantin

DIMA – Universit`a di Genova – [email protected]

Giovanni Pistone

DIMAT – Politecnico di Torino – [email protected]

W-optimum design and related statistical issues May 26-28 2005, Juan-Les-Pins, France

A partial announcement of unpublished results was presented at ICODOE Memphis TN, May 13-15 2005

(2)

Polynomial algebra and design of experiments

The application of commutative computer algebra to the study of estima- bility, confounding on the fractions of factorial designs has been proposed by Pistone & Wynn (Biometrika 1996).

• The points of a design are the solutions of a system of polynomial equations

• The functions defined on a design are polynomial functions

• Linear models can be specified as hierarchical polynomial models

• A hierarchical set of identifiable monomials on a fraction is com- puted from a proper version of the system of polynomial equations (Gr¨obner basis).

(3)

Qualitative and quantitative factorial designs

• Qualitative factors

The relevant properties of a fraction F are mainly combinatorial.

Responses on F and bases of the responses: there is no notion of increasing complexity associated to the sequence of the element of the basis (there is no meaningful notion of linear, quadratic, . . . , terms); models are best specified by a list of interaction spaces, i.e.

by including in the linear model all the elements of the basis of given interaction spaces.

Pistone, G., Rogantin, M.-P., 2005. Indicator function and complex coding for mixed fractional factorial designs. Submitted.

• Quantitative factors

The levels of the factors are given as real-rational-natural numbers.

The relevant statistical properties of a fraction F ⊂ D are mainly of geometric or algebraic nature.

Work in progress

(4)

Complex coding for qualitative factors

We code the n levels of a factor with the n-th roots of the unity:

ω_k = exp

µ

i 2π n k

¶

for k = 0, . . . , n − 1 ^ω⁰ ω1

ω2

The full factorial design D with m factors, as a subset of C^m, is defined by the system of equations

ζ_jⁿ^j − 1 = 0 for j = 1, . . . , m

(5)

The complex coding allows one to have:

1. an orthonormal monomial basis of the functions defined on D

2. equivalence between different definitions of orthogonality

3. equivalence between different definitions of regular fraction

(6)

1. Monomial basis of the responses on the full design

The set of all the monomial responses on D:

{X^α, α ∈ L}

X^α = X₁^α¹ · · · Xm^α^m L = {α : α = (α¹, . . . , αm) 0 ≤ αⁱ ≤ nⁱ − 1, i = 1, . . . , m}

is an orthonormal monomial basis of the set of all the complex functions defined on the full factorial design C(D).

Each function defined on full factorial design is represented in a unique way by an identifiable complete regression model:

f = ^X

α∈L

θ_α X^α with θ_α ∈ C where the coefficients are uniquely defined by:

θ_α = E_D ^³f X^α^´

The discrete Fourier transform formulas supply an equivalent interpreta- tion of the equations relating the response f and the θ’s coefficients.

(7)

2. Two orthogonalities

of functions defined in a vector space, based on a scalar or Hermitian product:

< f, g >= E_D(f g) = 0

“Two orthogonal responses are not confounded and the estimators of their coefficients in a model are not correlated.”

of factors : “All level combinations of two factors appear equally often”

Vector orthogonality is affected by the coding of the levels, while factor orthogonality is not.

If the levels are coded with the complex roots of the unity the two notions of orthogonality are essentially equivalent

(8)

Equivalence between the two orthogonalities (Pistone-Rogantin 2005)

1. Let X^α be a term with level set Ω_s on the full factorial design D.

The s levels of X^α appear equally often on a fraction F if and only if

s prime

the term X^α is centered on F

s not prime

the terms X^α, (X^α)^r are centered on F

2. A fraction F is an orthogonal array of strength t if and only if

all the terms X^α up to the order t are centered.

Then, for any β and γ of order up to t s.t α = [β − γ] or α = [γ − β], X^β is orthogonal to X^γ on F

(9)

3. Regular fractions and indicator function

(Pistone-Rogantin 2005) - F a fraction without replicates where all the factors have n levels

- Ω_n the set of the n-th roots of the unity, Ωn = {ω⁰, . . . , ω_n−1}

- L a subset of exponents, L ⊂ L = (Z_n₎^m containing (0, . . . , 0), l = #L - e a map from L to Ωⁿ, e : L → Ωⁿ

F is regular if the following equivalent conditions are satisfied.

(1) The defining equations are of the form

X^α = e(α) , α ∈ L

where L is a sub-group of L, e is a group homomorphism (2) The indicator function of the fraction has the form

F (ζ) = 1 l

X α∈L

e(α) X^α(ζ) ζ ∈ D where L is a subset of L, e is a map

(3) The terms X^α and X^β are either orthogonal or totally confounded, for each α, β ∈ L

(10)

Quantitative factors

We code (re-label) quantitative factors according to Bayley (1983) and we show on an example how to apply the algebraic technology to ideas developed by various authors, e.g.

- Bayley, R. A., 1983. The decomposition of treatment degrees of freedom in quantitative factorial experiments. J. R. Statist. Soc. B 44 (1), 63–70.

- Caboara, M., Riccomagno, E., 1998. An algebraic computational approach to the identifiability of Fourier models. Journal of Symbolic Computation 26, 245–260.

- Edmondson, R. N., 1994. Fractional factorial designs for factors with a prime number of quantitative levels. J. R. Statist. Soc., B 56 (4), 611–622.

- Giglio, B., Riccomagno, E., Wynn, H. P., 2000. Gr¨obner basis strategies in regression.

J. Appl. Statist. 27 (7), 923–938.

- Kobilinsky, A., 1990. Complex linear model and cyclic designs. Linear Algebra and its Applications 127, 227–282.

- Kobilinsky, A., Monod, H., 1991. Experimental design generated by group morphism:

An introduction. Scand. J. Statist. 18, 119–134.

(11)

Real responses

(ordered or quantitative factors)

In applications, we are interested on the real responses on the design.

A response f (f = P

α∈L θα X^α) is real valued if and only if θ_α = θ_[−α], α ∈ L Both the real vector space R(D) and the complex vector space C(D) of the responses on the design D have a real basis.

For each factor j with n_j levels, an orthogonal basis of the real functions R(Dj) defined on the design restricted to the j factor Dj is:











1 constant term

ℜ(X_j^k) for 1 ≤ k ≤ nj/2 ℜ(X_j^k) = X_j^k + X^k_j 2

ℑ(X_j^k) for 1 ≤ k < nj/2 ℑ(X_j^k) = −i X_j^k − X^k_j 2

The basis of R(D) is obtained via the Kronecker product:

R(D) = R(D¹) ⊗ · · · ⊗ R(D^j) · · · ⊗ R(D^m)

(12)

Real recoding for factors with a prime number of levels

If n_j is an odd prime, then the map

ω_k ←→ ℑ(ω_k) is one-to-one.

ℑ(Xj) can be considered as the linear term of a real basis of the responses defined on Dj.

We denote by u_1j this term.

The trigonometric expression of the coding is {0, ℑ(ω1), . . . , ℑ(ωn_j−1)} =

(

0, sin

Ã2 π n_j

!

, . . . , sin

Ã2 π

n_j (n_j − 1)

!)

• The new real levels are not equally spaced

• Increasing order differs from the numbering k = 0, 1, . . . , nj − 1

(13)

Example: n

_j

= 5

The roots of the unity and the real coding are:

0 2

3

4 1

−

p10 + 2√ 5

4 sin

µ2π 5 4

¶

−

p10 − 2√ 5

4 sin

µ2π 5 3

¶

0 sin

µ2π 5 0

¶

p10 − 2√ 5

4 sin

µ2π 5 2

¶

p10 + 2√ 5

4 sin

µ2π 5 1

¶

The values are part of a computable field: exact, not floating point, computations can be done with a symbolic computation software, e.g.

the system CoCoA developed at the University of Genoa and available at http://cocoa.dima.unige.it

(14)

The real basis is polynomial

The elements of the basis are representable as polynomials of degrees from 0 up to n_j − 1.

The degrees induce a hierarchical ordering on the elements of the basis.

We consider the re-ordered basis u_0j, u_1j, . . . , u_kj, . . . , u_(n

j−1)j

defined by











u_0j =1

u_kj =ℜ(X_j^k) for k 6= 0, k even u_kj =ℑ(X_j^k) for k odd

The interest of the unusual numbering consists on the following triangular system of equations

u^k_1j =







Pk/2

s=0 γ_sk u_(2s)j for k 6= 0, k even

P[k/2]

s=0 γ_sk u_(2s+1)j for k odd where the γ_rk’s are rational coefficients.

The basis consists of the orthogonal polynomials on the real coding.

Hierarchical polynomial models are not the same as using trigonometric representation

(15)

Example: n

_j

= 5 (cont.)

The re-ordered basis u_0j, u_1j, u_2j, u_3j, u_4j real basis is:











u0 = 1

u1 = ℑ(X) = −i ³

X−X2

´

u2 = ℜ(X²) = ³

X²+X² 2

´

u3 = ℑ(X³) = −i ³

X³−X³ 2

´ = i ³

X²−X² 2

´ = −ℑ(X²) u4 = ℜ(X⁴) = ³

X⁴+X⁴ 2

´

= ³

X+X 2

´

= ℜ(X)

The powers of the linear term u₁ and the polynomial expressions are:











u²₁ = −1

2u2 + 1 2 u³₁ = −1

4u3 + 3 4u1

u⁴₁ = 1

8u4 − 1

2u2 + 3 8

The orthogonalized system of the monomial 1, u₁, u²₁, u³₁, u⁴₁ on the given points is:









 1 u1

u2 = −2u²1 + 1 u3 = −4u³1 + 3u1

u4 = 8u⁴₁ − 8u²1 + 1

(16)

Minimal polynomial for the real coding

A better description of the values of the real coding is through the minimal polynomial (with indeterminate s):

S_p(s) =

(p−1)/2 X k=0

(−1)^k^³ p 2k + 1

´(1 − s²)^{(p−1)/2−k}s^2k+1

general form in: Beslan and de Angelis , 2004

Note that the complexity has been reduced, because the coefficients of the minimal polynomial are integers.

Mapping between real and complex codings







ω = c + i s c = u_n_j₋₁ in fact u_n_j₋₁ = ℜ(X)

(17)

Example: n

_j

= 5 (cont.)

The minimal polynomial is:

16s⁵ − 20s³ + 5s = 0

The mapping between real and complex codings is:

(ω = c + i s

c = 8s⁴ − 8s² + 1

(18)

Recoding of monomial responses

A monomial response X^α, X^α = X₁^α¹ · · · X_m^α^m, is recoded as ℑ^³X₁^α¹ · · · X_m^α^m^´

imaginary part of the

product of k roots of the unity ⇐⇒ sinus of the

sum of k angles

If c^(k), s^(k) denote the cos and sin of the sum of k angles, then











c^(k) = ^X

h=k,k−2,...

h≥0

(−1)^k−h² ^X

I⊆{1,...,k}

#I=h

Y i∈I

c_i ^Y

i /∈I

s_i

s^(k) = ^X

h=k−1,k−3,...

h≥0

(−1)^k−h−1² ^X

I⊆{1,...,k}

#I=h

Y i∈I

c_i ^Y

i /∈I

s_i

(19)

Regular fractions

Regular fractions are defined by equations of the form X^α = constant

The polynomial formulas for: - cos as a function of sin

- sin and cos for sum of k angles

allow the translation of every generating equation in a generating equation for the real coding.

Example: n

_j

= 5 (cont.)

Generating equation for a 5³⁻¹:

XY Z = 1 or ℑ(XY Z) = 0

½c_i = 8s⁴_i − 8s²i + 1 i = 1, . . . , 3 0 = s1c2c3 + c1s2c3 + c1c2s3 − s¹s2s3

Defining equations for the full factorial design:

16s⁵_i − 20s³i + 5s_i = 0 i = 1, . . . , 3

(20)

Example: n

_j

= 5 (cont.) CoCoA script

Use R::=Q[x,y,z]; -- rational coefficients and 3 indeterminate ring Cx:=8*x^4-8*x^2+1; -- equations for cosines

Cy:=8*y^4-8*y^2+1;

Cz:=8*z^4-8*z^2+1;

F:=x*Cy*Cz + Cx*y*Cz+ Cx*Cy*z -x*y*z;

-- addition formulas and generating equations

Dx:=16*x^5-20*x^3+5*x; -- defining equations of the full design Dy:=16*y^5-20*y^3+5*y;

Dz:=16*z^5-20*z^3+5*z;

I:=Ideal(F,Dx,Dy,Dz); -- ideal generated by equations G:=GBasis(I); -- Groebner basis

Set Indentation;

G;

Est:=QuotientBasis(I); -- set of estimable terms Sort(Est); Est;

(21)

Example: n

_j

= 5 (cont.) CoCoA output of Gr¨ obner basis

[

16z^5 - 20z^3 + 5z, 16y^5 - 20y^3 + 5y,

-19/35xyz^3 - 19/140x^2z + 57/140xyz - 19/140y^2z + 19/140z^3,

-9/14xy^2z + 3/7y^3z - 3/14x^2z^2 + 3/14xyz^2 - 3/7y^2z^2 - 9/14yz^3 + 3/14z^4 + 9/56xz + 9/56yz + 9/56z^2,

-3/5xz^4 - 3/5yz^4 + 1/4x^3 - 1/2x^2y - 1/2xy^2 + 1/4y^3 - 1/2x^2z + 1/4xyz

- 1/2y^2z + 1/4xz^2 + 1/4yz^2 + 1/4z^3 + 1/8x + 1/8y + 5/16z, -1/12x^3 + 1/6x^2y + 1/6xy^2 - 1/12y^3 + 1/6x^2z - 1/12xyz + 1/6y^2z + 1/6xz^2

+ 1/6yz^2 - 1/12z^3 - 5/48x - 5/48y - 5/48z, y^3z + x^2z^2 + 2xyz^2 + 2y^2z^2 - z^4 - 3/4xy - 3/4y^2 - 3/4yz,

3y^2z^3 + 3yz^4 + 3/4xy^2 + 3/4y^3 + 3/4x^2z + 3/4xyz - 3/2y^2z - 3yz^2 - 3/4z^3 - 3/16x - 3/16y - 3/16z,

3/4x^2y + 3/4xy^2 + 3/4x^2z + 3/4y^2z + 3/4xz^2 + 3/4yz^2 - 3/4x - 3/4y - 3/4z, -1/2xy^3 - 1/2y^4 - 1/2xyz^2 - 1/2y^2z^2 + 1/2xz^3 - 1/2yz^3 + 1/2z^4 - 1/8x^2

+ 5/8xy + 3/4y^2 - 1/2xz + 3/8yz - 3/8z^2,

-3x^2z^2 - 3xyz^2 - 3y^2z^2 + 3z^4 + 3/4x^2 + 3/2xy + 3/4y^2 - 3/4z^2]

(22)

Example: n

_j

= 5 (cont.) CoCoA output for Est

[ 1, z, y, x, z^2, yz, xz, y^2, xy, x^2, z^3, yz^2, xz^2, y^2z, xyz, x^2z, y^3, xy^2, z^4, yz^3, xz^3, y^2z^2, xyz^2, y^4, yz^4 ]

The monomial terms in the list are linearly independent of the fraction. A possible choice of a symmetric hierarchical model based on this list is shown below in red. There are many other hierarchical monomial bases;

not all bases are obtained via Gr¨obener algorithm.

1,

z,y,x, z²,y²,x², z³, y³, z⁴, y⁴, yz,xz,xy,

yz², xz², y²z, x²z, xy², yz³, xz³, y²z²,

yz⁴, xyz, xyz²

(23)

Conclusions

• Fractions are defined by polynomial equations in the complex coding

• Defining equations translate to polynomial equations in the real coding

• Gr¨obner basis software can produce a list of estimable monomial terms

• The computations of interest are exact

• Not all the hierarchical bases are produced this way

• Inverse problems can be considered

(24)

References: Indicator function

- Fontana, R., Pistone, G. and Rogantin, M. P. (2000). Classification of two-level factorial fractions, J. Statist. Plann. Inference 87(1), 149–172.

- Tang, B., Deng, L. Y., 1999. Minimum G2-aberration for nonregular fractinal factorial designs. The Annals of Statistics 27 (6), 1914–1926.

- Ye, K. Q., 2003. Indicator function and its application in two-level factorial designs.

The Annals of Statistics 31 (3), 984–994.

- Pistone, G., Rogantin, M.-P., 2003. Complex coding for multilevel factorial designs.

Technical report n. 22 October 2003 Dipartimento di Matematica Politecnico di Torino.

- Ye, K. Q., 2004. A note on regular fractional factorial designs. Statistica sinica 14 (4), 1069–1074.

- Cheng, S.-W., Ye, K. Q., 2004. Geometric isomorphism and minimum aberration for factorial designs with quantitative factors. The Annals of Statistics 32 (5).

- Pistone, G., Rogantin, M.-P., 2005. Indicator function and complex coding for mixed fractional factorial designs. Technical report n. 13 April 2005 Dipartimento di Matematica Politecnico di Torino. Submitted.