
Indicator function and different codings for fractional factorial designs

1

Giovanni Pistone

Department of Mathematics - Politecnico di Torino, Italy

Maria-Piera Rogantin ∗

Department of Mathematics - Università di Genova, Italy

Abstract

First, we discuss the relationship between ideal theory and indicator function in fractional factorial designs. Secondly, we deal with the algebraic theory of Bayley (1983) level codings in the framework of ideal theory.

Key words: Algebraic Statistics, Complex coding, Indicator function, Mixed designs, Regular fractions, Orthogonal Arrays

1 Introduction

We discuss here the properties of a fractional factorial design with $m$ factors $X_1, \dots, X_m$, where the set of treatments is described by the solutions of a system of polynomial equations with rational coefficients. The solutions themselves are not restricted to be rational. In fact they could be real or complex algebraic numbers. We discuss both qualitative and quantitative factors. The approach has been introduced in Pistone and Wynn (1996) and Pistone et al. (2001). Other references to current research are given.

A design is usually a list or array of treatments. In our approach, the design is coded into the coefficients of the polynomials whose solutions are the treatment levels, then the levels are coded by numeric values. This indirect approach may seem baroque, but it has some advantages. In fact, the algebraic complexity of the coefficients of a polynomial is smaller than the algebraic complexity of its solutions.

∗ Corresponding author.

Email addresses: giovanni.pistone@polito.it (Giovanni Pistone), rogantin@dima.unige.it (Maria-Piera Rogantin).

1 Partially supported by Italian grant PRIN03 coordinated by G. Consonni (University of Pavia, Italy)

A classical example of such a representation is the definition of regular fractions using the generating words. The translation from generating words into the language of polynomial equations is discussed in Pistone and Rogantin (2005).

Our aim is to push forward as far as possible the analysis without actually doing any computation on the treatment codings. As most of the computations are done on polynomials, we use extensively symbolic algebra software, especially CoCoA. This and other similar software systems can perform exact computations on computable number fields, i.e. where the numbers can be stored in a finite or potentially infinite memory. Examples are the mod-$p$ fields $\mathbb{Z}_p$, the Galois fields $GF(p^s)$ and the rational numbers $\mathbb{Q}$.

A second feature of the approach we use in this paper is the systematic embedding of a generic design into a full factorial design containing it. For this reason, we call fraction a generic design, while the term design is usually reserved to the full factorial, possibly mixed, case. Each function, or response, defined on the design has a polynomial representation. Two different polynomials represent the same function if their difference is zero on the design points.

Indicator polynomials are one way to give defining equations. The main results of the paper are Theorems 1 to 3, which relate the indicator function to Gröbner bases.

In the case of qualitative factors with numerical coding of levels, the monomial terms are not directly meaningful, but they could supply a handy linear basis for the interaction spaces, see Galetto et al. (2003). In the quantitative case, the levels of factors are given as real values and most of the relevant properties are of geometric or algebraic type. Designs and their fractions can be properly specified as solutions of systems of polynomial equations, while linear models can be specified as hierarchical polynomial models. It is interesting to remark that the orthogonalization of monomial terms in a hierarchical polynomial model can be shown to produce a set of orthogonal polynomials whose leading terms coincide with the monomials in the model, giving rise to a notion of interaction adapted to the hierarchy, see Giglio and Wynn (2000).

Acknowledgment

Section 3 of this paper is a revised version of the talk presented to ICODOE 2005 conference, Memphis, May 13-15, 2005. The authors wish to thank the Organizers for the opportunity to present these ideas to such a highly qualified audience.

Eva Riccomagno made useful comments on a previous version of the paper, pointing out in particular some similarities of the second part with Caboara and Riccomagno (1998).

2 Designs, Gröbner bases and indicator functions

In this Section we review basic facts about the algebraic study of DOE, while discussing some new material not published elsewhere. A general reference to the polynomial algebra we use is Cox et al. (1997) or Kreuzer and Robbiano (2000). The informed reader can go directly to the main results in Section 2.3.

2.1 Full factorial designs

Let $k[x_1, \dots, x_m]$ be a polynomial ring containing $\mathbb{Q}$, and let $d_1(x_1), \dots, d_m(x_m)$ be univariate polynomials. We denote by $a_{ij}$, $i = 1, \dots, n_j$, the solutions of each equation $d_j(x_j) = 0$. We assume that these solutions are all distinct and that they belong to some extension $K$ of the number field $k$. Let $\mathcal{D}$ be the full factorial design consisting of the solutions of the system $d_1(x_1) = 0, \dots, d_m(x_m) = 0$.

As each equation contains one single indeterminate, the solution set of the system is the Cartesian product of each set of solutions. We usually drop the sub-j when discussing a single factor.

The ideal generated by the previous polynomials,

$$I(\mathcal{D}) = \langle d_1(x_1), \dots, d_m(x_m) \rangle ,$$

is, by definition, the set of all the polynomials which vanish on $\mathcal{D}$. It is called the ideal of the full factorial design. Each polynomial $f \in I(\mathcal{D})$ is of the form $f = \sum_{j=1}^m h_j d_j$ with $h_j \in k[x_1, \dots, x_m]$, $j = 1, \dots, m$. Two polynomials whose difference belongs to $I(\mathcal{D})$ represent models confounded on $\mathcal{D}$.

Example. Our main examples are the $5^m$ factorial designs, with different codings of levels. However, most of the results we discuss are true for a non-prime number of levels and for mixed designs.

(C1) If the levels are 0, 1, 2, 3, 4, then

$$d(x) = x(x-1)(x-2)(x-3)(x-4) = x^5 - 10x^4 + 35x^3 - 50x^2 + 24x .$$

Here, $k = K = \mathbb{Q}$.

(C2) If the levels are −2, −1, 0, 1, 2, then

$$d(x) = x(x^2-1)(x^2-4) = x^5 - 5x^3 + 4x .$$

Here, $k = K = \mathbb{Q}$.

(C3) If the levels are the 5-th roots of unity as in Pistone and Rogantin (2005) and other references therein, then

$$d(x) = x^5 - 1 .$$

Here, $k = \mathbb{Q}$, but $K$ is an extension of $\mathbb{Q}$ containing the field $\mathbb{Q}[x]/\langle x^5-1\rangle$. This extended field is computable, but most computational algebra systems do not implement it. If symbolic computations are not of interest, we can take $K = \mathbb{C}$.

(C4) If the levels are the sines of trigonometric angles, $\sin\left(\frac{2\pi}{5}k\right) = \Im\left(e^{i\frac{2\pi}{5}k}\right)$, $k = 0, 1, 2, 3, 4$, as in Bayley (1983), then a direct computation shows that

$$d(x) = 16x^5 - 20x^3 + 5x .$$

Here $k = \mathbb{Q}$ and the extended field is $K = \mathbb{Q}\left[\sqrt{10-2\sqrt{5}}, \sqrt{10+2\sqrt{5}}\right]$, see below in Section 3.2.
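The design polynomials of (C1), (C2) and (C4) can be checked with a short computation. The following Python sketch is only an illustration: the helper `poly_from_roots` and the variable names are assumptions introduced here, not part of the paper. It expands the product of the factors $(x - a)$ exactly for the rational codings and checks numerically that the sines $\sin(2\pi k/5)$ are roots of the polynomial in (C4).

```python
from fractions import Fraction
import math

def poly_from_roots(roots):
    """Coefficients of prod (x - a) over the given roots, highest degree first."""
    coeffs = [Fraction(1)]
    for a in roots:
        times_x = coeffs + [Fraction(0)]                    # multiply by x
        times_a = [Fraction(0)] + [-a * c for c in coeffs]  # multiply by -a
        coeffs = [p + q for p, q in zip(times_x, times_a)]
    return coeffs

# (C1): levels 0..4 ; (C2): levels -2..2
d_c1 = poly_from_roots([0, 1, 2, 3, 4])
d_c2 = poly_from_roots([-2, -1, 0, 1, 2])

# (C4): check numerically that sin(2*pi*k/5) is a root of 16x^5 - 20x^3 + 5x
def d_c4(x):
    return 16 * x**5 - 20 * x**3 + 5 * x

c4_residuals = [abs(d_c4(math.sin(2 * math.pi * k / 5))) for k in range(5)]

print([int(c) for c in d_c1])
print([int(c) for c in d_c2])
print(max(c4_residuals))
```

Only the coefficients are needed in what follows; the roots enter this sketch solely to confirm the expansions.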

The ring of real valued responses on $\mathcal{D}$ is denoted by $\mathbb{R}(\mathcal{D})$. It is identified with the quotient ring $\mathbb{R}[x_1, \dots, x_m]/I(\mathcal{D})$, which contains $k[x_1, \dots, x_m]/I(\mathcal{D})$ as a sub-ring.

In the following we write $x^\alpha = x_1^{\alpha_1} \cdots x_m^{\alpha_m}$ for a monomial of indeterminates and $X^\alpha = X_1^{\alpha_1} \cdots X_m^{\alpha_m}$ for a monomial function on $\mathcal{D}$. As a finite dimensional $K$-vector space, $K(\mathcal{D})$ is generated by the monomials $X^\alpha$ with $\alpha \in L$, where

$$L = \{0, \dots, n_1 - 1\} \times \cdots \times \{0, \dots, n_m - 1\} .$$

We denote this unique hierarchical monomial basis by $Est_{\mathcal{D}}$. In a very special case, when $d(x) = x^n - 1$, this basis is actually orthonormal. All the identifiable polynomial models are linear combinations of the monomials in $Est_{\mathcal{D}}$. A polynomial $f$ is equal on $\mathcal{D}$ to a unique linear combination of elements of $Est_{\mathcal{D}}$, called the normal form of $f$, $NF_{\mathcal{D}}(f)$. In particular, if $d_j(x_j)$ is monic of degree $n_j$, then

$$d_j(x_j) = x_j^{n_j} - NF_{\mathcal{D}}\left(x_j^{n_j}\right) .$$

The normal form of each $f$ can be computed by repeated applications of the rewriting rules $x_j^{n_j} = NF_{\mathcal{D}}\left(x_j^{n_j}\right)$.

The product $X^\alpha X^\beta$ is computed in the quotient ring $K(\mathcal{D})$ through the linear basis as

$$X^\alpha X^\beta = X^{\alpha+\beta} = \sum_{\gamma \in L} c_{\alpha+\beta,\gamma} X^\gamma$$

where the last expression is the normal form of $X^{\alpha+\beta}$ on $\mathcal{D}$ and the sum of the exponents is computed mod $n_j$. Note that the actual solutions of the equations $d_j = 0$ are not required to compute the array $[c_{\alpha,\beta}]$, where $\alpha \in L + L$, $\beta \in L$ and $c_{\alpha,\beta} \in k$.

There exists a remarkable special case, namely when each of the $d_j$'s is a binomial of the form $x^n - x^h$, with $h = 0, 1$, that is of the form $x(x^{n-1} - 1)$ or $x^n - 1$. In this case the normal form of a monomial is itself a monomial. In particular, in the latter case all the monomials of the basis are invertible in $\mathbb{C}(\mathcal{D})$ and the mapping $L \ni \alpha \mapsto NF(X^\alpha) \in Est_{\mathcal{D}}$ is a homomorphism from the additive group $\mathbb{Z}_{n_1} \times \cdots \times \mathbb{Z}_{n_m}$ to $Est_{\mathcal{D}}$, a sub-group of the multiplicative group of the invertible elements of the ring $\mathbb{C}(\mathcal{D})$.

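Example. In the case (C4), where the levels are coded by the sines of trigonometric angles, the rewriting formulæ up to degree 8 (4+4) are

$$\begin{aligned}
x^5 &= \tfrac{5}{4}x^3 - \tfrac{5}{16}x ; \\
x^6 &= \tfrac{5}{4}x^4 - \tfrac{5}{16}x^2 ; \\
x^7 &= \tfrac{5}{4}x^5 - \tfrac{5}{16}x^3 = \tfrac{5}{4}\left(\tfrac{5}{4}x^3 - \tfrac{5}{16}x\right) - \tfrac{5}{16}x^3 = \tfrac{5}{4}x^3 - \tfrac{25}{64}x ; \\
x^8 &= \tfrac{5}{4}x^4 - \tfrac{25}{64}x^2 .
\end{aligned} \qquad (1)$$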
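The rewriting rules can be applied mechanically. The sketch below, in Python with exact rational arithmetic, reduces a power $x^k$ by repeatedly substituting $x^5 \to \frac54 x^3 - \frac{5}{16}x$; the dictionary representation `{exponent: coefficient}` and the function name `normal_form` are choices made for this illustration only.

```python
from fractions import Fraction

N = 5
# Rewriting rule for coding (C4): x^5 -> (5/4) x^3 - (5/16) x
RULE = {3: Fraction(5, 4), 1: Fraction(-5, 16)}

def normal_form(poly):
    """Reduce a univariate polynomial {exponent: coeff} until all exponents are < N."""
    poly = dict(poly)
    while True:
        top = max(poly, default=0)
        if top < N:
            return {e: c for e, c in poly.items() if c != 0}
        c = poly.pop(top)
        # x^top = x^(top-N) * x^N, then replace x^N by the right-hand side of the rule
        for e, r in RULE.items():
            target = e + top - N
            poly[target] = poly.get(target, Fraction(0)) + c * r

nf = {k: normal_form({k: Fraction(1)}) for k in range(5, 9)}
for k, form in nf.items():
    print(k, {e: str(c) for e, c in sorted(form.items(), reverse=True)})
```

No level value is used: the reduction works on the coefficients of $d$ alone, which is the point made in the text.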

The statistical properties of a design are mostly computed through the moments $\mu_\alpha = E_{\mathcal{D}}(X^\alpha)$. We show how to compute moments from the coefficients of the $d_j$'s.

The coefficients of each $d_j$ are symmetric functions of the level values,

$$d(x) = \prod_{j=1}^{m} (x - a_j) = \sum_{k=0}^{m} (-1)^{m-k} \sigma_{m-k} x^k$$

where, in this paragraph only, $m$ denotes the number of levels of the factor and $\sigma_h$ denotes the $h$-th elementary symmetric polynomial of the $a_j$'s, see (van der Waerden, 1970, Sec. 5.7). E.g. the coefficient of the term of order $m - 1$ gives, up to sign, the sum of the level values. If we denote by $s_h$, $h = 1, \dots, m - 1$, the $h$-moments of the factor with respect to the counting measure on $\mathcal{D}_j$, $s_h = \sum_{j=1}^m a_j^h$, then the Newton-Girard formulæ are

$$s_h - s_{h-1}\sigma_1 + s_{h-2}\sigma_2 - \cdots + (-1)^{h-1} s_1 \sigma_{h-1} + (-1)^h h\,\sigma_h = 0$$

with $h = 1, \dots, m$. This allows the computation of the moments as functions of the coefficients of the polynomial $d(x)$.

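Example. With $n_j = 5$ the Newton-Girard formulæ are solved as

$$\begin{aligned}
s_1 &= \sigma_1 \\
s_2 &= \sigma_1^2 - 2\sigma_2 \\
s_3 &= \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3 \\
s_4 &= \sigma_1^4 - 4\sigma_1^2\sigma_2 + 2\sigma_2^2 + 4\sigma_1\sigma_3 - 4\sigma_4
\end{aligned} \qquad (2)$$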

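and in the case (C4), with monic $d$-polynomial $x^5 - \frac{5}{4}x^3 + \frac{5}{16}x$, we get from Equations (2) and (1):

$$\begin{aligned}
s_1 &= 0 \\
s_2 &= -2\left(-\tfrac{5}{4}\right) = \tfrac{5}{2} \\
s_3 &= 0 \\
s_4 &= 2\left(\tfrac{5}{4}\right)^2 - 4\cdot\tfrac{5}{16} = \tfrac{15}{8} \\
s_5 &= \tfrac{5}{4}s_3 - \tfrac{5}{16}s_1 = 0 \\
s_6 &= \tfrac{5}{4}s_4 - \tfrac{5}{16}s_2 = \tfrac{25}{16} \\
s_7 &= \tfrac{5}{4}s_3 - \tfrac{25}{64}s_1 = 0 \\
s_8 &= \tfrac{5}{4}s_4 - \tfrac{25}{64}s_2 = \tfrac{175}{128} .
\end{aligned} \qquad (3)$$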
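The values in (3) can be reproduced directly from the coefficients of $d(x)$. The following Python sketch applies the Newton-Girard recursion with exact fractions; the function `power_sums` and the list `sigma` are names introduced for this illustration, and the elementary symmetric polynomials are read off the monic polynomial $x^5 - \frac54 x^3 + \frac{5}{16}x$ under the sign convention $d(x) = \sum_i (-1)^i \sigma_i x^{5-i}$.

```python
from fractions import Fraction

def power_sums(sigma, hmax):
    """Power sums s_0..s_hmax from elementary symmetric polynomials.

    sigma[i] is the i-th elementary symmetric polynomial, sigma[0] = 1.
    Uses s_h = sum_{i>=1} (-1)^(i-1) sigma_i s_{h-i}, with the last term
    replaced by (-1)^(h-1) * h * sigma_h when h <= n.
    """
    n = len(sigma) - 1
    s = [Fraction(n)]                     # s_0 is the number of levels
    for h in range(1, hmax + 1):
        total = Fraction(0)
        for i in range(1, min(h, n) + 1):
            factor = h if i == h else s[h - i]
            total += (-1) ** (i - 1) * sigma[i] * factor
        s.append(total)
    return s

# Coding (C4): sigma_1 = 0, sigma_2 = -5/4, sigma_3 = 0, sigma_4 = 5/16, sigma_5 = 0
sigma = [1, 0, Fraction(-5, 4), 0, Fraction(5, 16), 0]
s = power_sums(sigma, 8)
print([str(v) for v in s[1:]])
```

For $h > 5$ the recursion coincides with summing the rewriting rules of Equation (1) over the design points, which is how $s_5, \dots, s_8$ are obtained in the text.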

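Let $\mu_\alpha = E(X^\alpha)$, $\alpha = (\alpha_1, \dots, \alpha_m) \in L$, be the moments on $\mathcal{D}$ with respect to the uniform probability. Then

$$\mu_\alpha = \frac{1}{n} \prod_{j=1}^{m} s_{j,\alpha_j} \quad \text{with } n = n_1 \cdots n_m ,$$

where $s_{j,h}$ is the $h$-moment of the $j$-th factor.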

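Example. (continued) In the case (C4) with $n_j = 5$ and $m = 3$, the nonzero moments with all exponents $\alpha_j \geq 1$ are $\mu_{222} = \frac{1}{8}$, $\mu_{224} = \mu_{242} = \mu_{422} = \frac{3}{32}$, $\mu_{244} = \mu_{424} = \mu_{442} = \frac{9}{128}$ and $\mu_{444} = \frac{27}{512}$. •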
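The product formula for the moments is easy to evaluate exactly. The Python sketch below assumes the single-factor power sums $s_0, \dots, s_4$ of coding (C4) computed above ($s_0 = 5$ is the number of levels); the names `S`, `moment` and `nonzero` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Power sums s_0..s_4 of one 5-level factor in coding (C4)
S = [Fraction(5), Fraction(0), Fraction(5, 2), Fraction(0), Fraction(15, 8)]
m = 3                 # number of factors
n = 5 ** m            # number of design points

def moment(alpha):
    """mu_alpha = (1/n) * prod_j s_{alpha_j} for a design with identical factors."""
    value = Fraction(1, n)
    for a in alpha:
        value *= S[a]
    return value

nonzero = {}
for alpha in product(range(5), repeat=m):
    mu = moment(alpha)
    if mu != 0:
        nonzero[alpha] = mu

print(nonzero[(2, 2, 2)], nonzero[(2, 2, 4)], nonzero[(2, 4, 4)], nonzero[(4, 4, 4)])
```

The dictionary also contains the moments with some exponent equal to zero, such as $\mu_{200} = \frac12$, which are the moments of the lower-dimensional margins.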

2.2 Fractional factorial design

Given a set of generating equations of the fraction, $g_1, \dots, g_k$, the fraction $\mathcal{F}$ consists of all the points of $\mathcal{D}$ on which all the generating polynomials are zero. The ideal of the fraction, $I(\mathcal{F})$, is generated by $d_1, \dots, d_m, g_1, \dots, g_k$ and we can assume that $g_1, \dots, g_k$ are reduced in normal form on $I(\mathcal{D})$. Given the generating equations, it is not obvious how to compute the number of points in the fraction and how to find a hierarchical monomial basis $x^\alpha$, $\alpha \in M$. Note that $\#M = \#\mathcal{F}$.

One possible solution consists of the three following steps, see Pistone et al. (2001) for details.

(1) Order. The first step is to introduce a monomial order, that is a total order $\prec$ on monomials such that $1 \prec x^\alpha$ and $x^\alpha \prec x^\beta$ implies $x^{\alpha+\gamma} \prec x^{\beta+\gamma}$.

(2) G-basis. The basis $d_1, \dots, d_m, g_1, \dots, g_k$ of $I(\mathcal{F})$ is transformed into a special equivalent basis, called Gröbner basis, by an algorithm which is implemented in computer algebra software.

(3) Est. A hierarchical basis consists of all the monomials which are not divisible by the leading term of any polynomial in the Gröbner basis.

In this paper we rely marginally on this technology. Note that all the previous points are trivial in the case of a factorial design. In such a case, the hierarchical monomial basis is unique. Note also that some hierarchical bases cannot be obtained via the Gröbner basis method, as the example in the next section shows.

2.3 Indicator function

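We call indicator function of a fraction $\mathcal{F} \subset \mathcal{D}$ a polynomial $F$ reduced in normal form on $I(\mathcal{D})$, $F = \sum_{\alpha \in L} b_\alpha X^\alpha$, such that

$$F(a) = \begin{cases} 1 & \text{if } a \in \mathcal{F} \\ 0 & \text{if } a \in \mathcal{D} \setminus \mathcal{F} . \end{cases} \qquad (4)$$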

Note that a reduced $F$ is an indicator polynomial if and only if $F^2 - F = 0$ on $\mathcal{D}$. Indicator polynomials for fractional factorial designs were first introduced in Fontana et al. (1997) and Fontana et al. (2000), and used in Cheng et al. (2004). The notion was extended to fractions with replicates in Ye (2003). Tang and Deng (1999) present a basically equivalent methodology. The indicator function could be obtained as the unique reduced interpolator from Equation (4).

When a list of treatment values is available, F can be computed using some form of interpolation formula, as done in the aforementioned papers. Here we follow a new approach based on generating equations.

Theorem 1 Let the ideals of the design and of the fraction be respectively:

$$I(\mathcal{D}) = \langle d_1, \dots, d_m \rangle \quad \text{and} \quad I(\mathcal{F}) = \langle d_1, \dots, d_m, g_1, \dots, g_k \rangle .$$

A $\mathcal{D}$-reduced polynomial $F$ is the indicator of $\mathcal{F}$ if (1) and (2) below are both satisfied:

(1) there exist $h_j \in k[x_1, \dots, x_m]$, $j = 1, \dots, k$, such that $1 - F - \sum_j h_j g_j \in I(\mathcal{D})$;

(2) for all $g_i$, $F g_i \in I(\mathcal{D})$.

Moreover, given (1), statement (2) is equivalent to

(3) for all $g_i$, there exists $k_i \in k[x_1, \dots, x_m]$ such that $g_i - k_i(1 - F) \in I(\mathcal{D})$.

Proof. Condition (1) is equivalent to $F(a) = 1$ for all $a \in \mathcal{F}$. Condition (2) is equivalent to $F(a) = 0$ for all $a \in \mathcal{D} \setminus \mathcal{F}$ because for such an $a$ there exists a $g_j$ such that $g_j(a) \neq 0$.

Moreover, condition (1) is equivalent to $1 - F \in I(\mathcal{F})$; in such a case (2) is equivalent to $\langle d_1, \dots, d_m, 1 - F \rangle = I(\mathcal{F})$. □

Remark. The computation of the indicator polynomial from the generating equations could be reduced to the case of a single generating equation. In fact, if $\mathcal{F}_i$ is the fraction whose generating equation is $g_i$, with indicator polynomial $F_i$, then $\mathcal{F} = \cap_{i=1}^k \mathcal{F}_i$ and $F = NF_{\mathcal{D}}(F_1 \cdots F_k)$.

Example. Let us consider the $3^2$ factorial design with coding $-1, 0, +1$, as in (C2). Then the design equations are $d_1(x) = x^3 - x$ and $d_2(y) = y^3 - y$. We consider the “cross” fraction with generating polynomial $g(x, y) = xy$. The system corresponding to statements (1) and (2) of Theorem 1 is

$$x^3 - x = 0, \quad y^3 - y = 0, \quad 1 - f - hxy = 0, \quad fxy = 0 .$$

From the third equation multiplied by $x^2y^2$ we get $x^2y^2 - fx^2y^2 - hx^3y^3 = 0$. The other equations give $fx^2y^2 = 0$ and $hx^3y^3 = hxy = 1 - f$, so that $x^2y^2 - 1 + f = 0$. The indicator polynomial is $F(x, y) = 1 - x^2y^2$. The equivalent system of equations

$$x^3 - x = 0, \quad y^3 - y = 0, \quad f + x^2y^2 - 1 = 0, \quad hxy + f - 1 = 0$$

is in lower-triangular form with respect to the lexicographic order of monomials with $x \prec y \prec f \prec h$ and has smallest leading terms. In other words, it is a Gröbner basis for the lexicographic order. •
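The result of this example can be confirmed by brute force on the nine design points. The Python sketch below is an illustration only, with names (`design`, `fraction`, `F`) chosen here; it checks that $1 - x^2y^2$ takes the value 1 exactly on the points where $xy = 0$ and that it is idempotent on the design.

```python
from itertools import product

levels = (-1, 0, 1)
design = list(product(levels, repeat=2))          # the 3^2 full factorial design

def g(x, y):
    """Generating polynomial of the cross fraction."""
    return x * y

def F(x, y):
    """Candidate indicator polynomial from the example."""
    return 1 - x**2 * y**2

fraction = [p for p in design if g(*p) == 0]

is_indicator = all(F(*p) == (1 if p in fraction else 0) for p in design)
idempotent = all(F(*p) ** 2 == F(*p) for p in design)

print(sorted(fraction))
print(is_indicator, idempotent)
```

This enumeration is exactly what the algebraic approach avoids for larger designs; here it only serves as an independent check.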

Theorem 2 We consider the ring $k[h_1, \dots, h_k, f, x_1, \dots, x_m]$. Then the lexicographic Gröbner basis of the elimination ideal

$$\langle d_1, \dots, d_m,\ (1 - f) - \sum_j h_j g_j,\ f g_1, \dots, f g_k \rangle \cap k[f, x_1, \dots, x_m]$$

contains a unique polynomial of the form $f - \sum_{\alpha \in L} b_\alpha x^\alpha$. Then the indicator function $F$ is $\sum_{\alpha \in L} b_\alpha X^\alpha$.

Proof. The polynomial $f - \sum_{\alpha \in L} b_\alpha x^\alpha$ belongs to the elimination ideal because of Theorem 1. It has minimum leading term among the polynomials containing the indeterminates $f$ and $x_1, \dots, x_m$. □

If $g$ is a response on the design $\mathcal{D}$, the mapping

$$L(\mathcal{D}) \ni g \mapsto NF_{\mathcal{D}}(gF) \in L(\mathcal{D})$$

sets $g$ to zero outside $\mathcal{F}$, and provides a distinguished representative of the equivalence class of responses which are confounded on the fraction. The computation of $NF_{\mathcal{D}}(X^\alpha F)$, $\alpha \in L$, is a nice way to study the confounding structure of the fraction. Again, the computation is done without actually computing the treatments.

Theorem 3 We fix a term order on the set of exponents $L$. We compute the system of the normal forms on $\mathcal{D}$ of the polynomials $X^\alpha F$, $\alpha \in L$, with the previous order. We denote by $R$ the matrix of the coefficients

$$NF_{\mathcal{D}}(X^\alpha F) = \sum_\beta R_{\alpha\beta} X^\beta .$$

Any vector $k$ in the left kernel of $R$, $k^t R = 0$, is a confounding relation among the monomials $X^\alpha$:

$$\sum_{\alpha \in L} k_\alpha X^\alpha = 0 \quad \text{on } \mathcal{F} .$$

A hierarchical monomial basis of the fraction can be found from the ker-matrix $K$. If $K_{L \setminus M}$ is a non singular sub-matrix of $K$, then $X^\alpha$, with $\alpha \in M$, is a monomial basis of $\mathcal{F}$.

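Proof. We write the ker-matrix in two vertical blocks as $K = [K_{L \setminus M} \mid K_M]$ and the matrix $R$ in the two corresponding horizontal blocks

$$R = \begin{bmatrix} R_{L \setminus M} \\ R_M \end{bmatrix} .$$

If $K_{L \setminus M}$ is non singular, then $K_{L \setminus M}^{-1} K R = 0$ implies $R_{L \setminus M} + K_{L \setminus M}^{-1} K_M R_M = 0$ and $X^\alpha$, with $\alpha \in M$, is a basis of the fraction. □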
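As a small illustration of Theorem 3, the matrix $R$ can be built by hand for the cross fraction of the previous example, where $F = 1 - x^2y^2$ and the rewriting rule is $t^3 \to t$. The Python sketch below is an assumption-laden toy version: the helper names `red`, `row` and `rank` are introduced here, and the rank is computed by plain Gaussian elimination over the rationals rather than by a computer algebra system.

```python
from fractions import Fraction
from itertools import product

L = list(product(range(3), repeat=2))       # exponents (a, b) of the basis x^a y^b

def red(e):
    """Reduce an exponent with the rewriting rule t^3 -> t."""
    while e >= 3:
        e -= 2
    return e

def row(alpha):
    """Coefficients of NF_D(X^alpha * F), F = 1 - x^2 y^2, on the basis indexed by L."""
    a, b = alpha
    coeffs = {beta: Fraction(0) for beta in L}
    coeffs[(red(a), red(b))] += 1
    coeffs[(red(a + 2), red(b + 2))] -= 1
    return [coeffs[beta] for beta in L]

R = [row(alpha) for alpha in L]

def rank(matrix):
    """Rank by Gauss-Jordan elimination with exact fractions."""
    mat = [r[:] for r in matrix]
    rk = 0
    for col in range(len(mat[0])):
        piv = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if piv is None:
            continue
        mat[rk], mat[piv] = mat[piv], mat[rk]
        for i in range(len(mat)):
            if i != rk and mat[i][col] != 0:
                f = mat[i][col] / mat[rk][col]
                mat[i] = [u - f * v for u, v in zip(mat[i], mat[rk])]
        rk += 1
    return rk

zero_rows = [alpha for alpha, r in zip(L, R) if not any(r)]
print(rank(R), zero_rows)
```

The rank equals the number of points of the fraction, and the zero rows give the confounding relations $X^\alpha = 0$ on the cross for every $\alpha$ with both exponents positive; the remaining exponents index the hierarchical basis $1, x, y, x^2, y^2$.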

Example. Composite fraction with 2 factors.

In this section we apply all the preceding developments to the fraction with the following 9 points:

The equations of the full factorial design in the coding (C2) are:

$$d_1 = x_1(x_1^2 - 1)(x_1^2 - 4) \quad \text{and} \quad d_2 = x_2(x_2^2 - 1)(x_2^2 - 4)$$

and the generating equations of the composite design can be taken as

$$g_1 = x_1^3x_2 - x_1x_2, \quad g_2 = x_1x_2^3 - x_1x_2, \quad g_3 = x_1^3 + 3x_1x_2^2 - 4x_1, \quad g_4 = 3x_1^2x_2 + x_2^3 - 4x_2$$

where $g_3$ and $g_4$ can be derived respectively from $x_1(x_1^2 - 4)(x_2^2 - 1)$ and $x_2(x_2^2 - 4)(x_1^2 - 1)$, using $g_1$ and $g_2$.

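As in the previous example (the cross), the application of Theorem 1, its remark and Theorem 2, and the use of the Buchberger algorithm for computing Gröbner bases with the lexicographic order, provide the indicator function for each generating equation as

$$\begin{aligned}
F_1 &= \tfrac{1}{48}x_1^4x_2^4 - \tfrac{5}{48}x_1^4x_2^2 - \tfrac{1}{48}x_1^2x_2^4 + \tfrac{5}{48}x_1^2x_2^2 + 1 \\
F_2 &= \tfrac{1}{48}x_1^4x_2^4 - \tfrac{1}{48}x_1^4x_2^2 - \tfrac{5}{48}x_1^2x_2^4 + \tfrac{5}{48}x_1^2x_2^2 + 1 \\
F_3 &= \tfrac{19}{144}x_1^4x_2^4 - \tfrac{79}{144}x_1^4x_2^2 + \tfrac{1}{3}x_1^4 - \tfrac{67}{144}x_1^2x_2^4 + \tfrac{271}{144}x_1^2x_2^2 - \tfrac{4}{3}x_1^2 + 1 \\
F_4 &= \tfrac{19}{144}x_1^4x_2^4 - \tfrac{67}{144}x_1^4x_2^2 - \tfrac{79}{144}x_1^2x_2^4 + \tfrac{271}{144}x_1^2x_2^2 + \tfrac{1}{3}x_2^4 - \tfrac{4}{3}x_2^2 + 1
\end{aligned}$$

and

$$F = \tfrac{31}{144}x_1^4x_2^4 - \tfrac{127}{144}x_1^4x_2^2 - \tfrac{127}{144}x_1^2x_2^4 + \tfrac{1}{3}x_1^4 + \tfrac{511}{144}x_1^2x_2^2 + \tfrac{1}{3}x_2^4 - \tfrac{4}{3}x_1^2 - \tfrac{4}{3}x_2^2 + 1 .$$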
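The indicator polynomial $F$ of the composite fraction can be cross-checked by evaluating it on the 25 points of the $5^2$ design in coding (C2). The following Python sketch is an illustration under the assumption that the fraction is exactly the common zero set of $g_1, \dots, g_4$ on the design; the data structures and names are chosen here.

```python
from fractions import Fraction as Fr
from itertools import product

levels = (-2, -1, 0, 1, 2)
design = list(product(levels, repeat=2))

# Generating polynomials g1..g4 of the composite fraction
gens = [
    lambda x, y: x**3 * y - x * y,
    lambda x, y: x * y**3 - x * y,
    lambda x, y: x**3 + 3 * x * y**2 - 4 * x,
    lambda x, y: 3 * x**2 * y + y**3 - 4 * y,
]
fraction = [p for p in design if all(g(*p) == 0 for g in gens)]

# Indicator polynomial F stored as {(a, b): coefficient of x1^a x2^b}
F = {
    (4, 4): Fr(31, 144), (4, 2): Fr(-127, 144), (2, 4): Fr(-127, 144),
    (4, 0): Fr(1, 3), (2, 2): Fr(511, 144), (0, 4): Fr(1, 3),
    (2, 0): Fr(-4, 3), (0, 2): Fr(-4, 3), (0, 0): Fr(1),
}

def evaluate(poly, x, y):
    return sum(c * x**a * y**b for (a, b), c in poly.items())

is_indicator = all(evaluate(F, *p) == (1 if p in fraction else 0) for p in design)
print(sorted(fraction))
print(is_indicator)
```

The zero set found this way consists of the origin, the four points $(\pm 1, \pm 1)$ and the four axial points $(\pm 2, 0)$, $(0, \pm 2)$.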

We can compute the normal forms of $X^\alpha F$, $\alpha \in L$, with the DegRevLex order.

A left kernel K of the matrix R, computed by CoCoA, gives a good picture of the confounding patterns:

 00 01 02 03 04 10 11 12 13 14 20 21 22 23 24 30 31 32 33 34 40 41 42 43 44
  0 -4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  0  0  0
  0  0 -4  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  3  0  0
  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0 -1  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  1  0 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0 -1  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0 -1  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0 -1  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0 -1  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0 -1  0  0
  0  0  0  0  0 -4  0  0  0  3  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0 -1  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  0  1  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0 -1  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0 -4  0  0  0  0  0  0  0  0  0  1  0  3  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  0  1  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  0  1
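Each row of a left kernel matrix should be a polynomial relation that vanishes on the fraction. The Python sketch below checks this for sixteen relations written in sparse form, with columns labelled by the exponents $(\alpha_1, \alpha_2)$. Two points are assumptions of this illustration: the nine points are those obtained as the common zeros of $g_1, \dots, g_4$, and the signs of the coefficients are fixed here so that each relation vanishes, since a kernel vector is only defined up to a scalar factor.

```python
# Points of the composite fraction in coding (C2)
points = [(0, 0), (1, 1), (1, -1), (-1, 1), (-1, -1), (2, 0), (-2, 0), (0, 2), (0, -2)]

# Each relation is {(a1, a2): coefficient} for the monomial x1^a1 * x2^a2
relations = [
    {(0, 1): -4, (0, 3): 1, (4, 1): 3},
    {(0, 2): -4, (0, 4): 1, (4, 2): 3},
    {(1, 1): 1, (3, 1): -1},
    {(1, 2): 1, (1, 4): -1},
    {(1, 3): 1, (3, 1): -1},
    {(2, 1): 1, (4, 1): -1},
    {(2, 2): 1, (4, 2): -1},
    {(2, 3): 1, (4, 1): -1},
    {(2, 4): 1, (4, 2): -1},
    {(1, 0): -4, (1, 4): 3, (3, 0): 1},
    {(1, 4): -1, (3, 2): 1},
    {(3, 1): -1, (3, 3): 1},
    {(1, 4): -1, (3, 4): 1},
    {(2, 0): -4, (4, 0): 1, (4, 2): 3},
    {(4, 1): -1, (4, 3): 1},
    {(4, 2): -1, (4, 4): 1},
]

def value(rel, x, y):
    return sum(c * x**a * y**b for (a, b), c in rel.items())

all_vanish = all(value(rel, *p) == 0 for rel in relations for p in points)
full = [rel for rel in relations if len(rel) == 2]      # total confounding
partial = [rel for rel in relations if len(rel) == 3]   # partial confounding
print(all_vanish, len(full), len(partial))
```

The count of sixteen relations matches $25 - 9$, the dimension of the space of polynomials in normal form that vanish on a nine-point fraction.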

We observe that there are 12 rows with 2 non-zero values (−1 and 1) and 23 zeros: these rows give the full confounding of the corresponding monomials. The other 4 rows are cases of partial confounding. The 12 cases of total confounding split into four cycles.

A non singular sub-matrix of the previous kernel of the $R$ matrix corresponds to $M = \{(\alpha_1, \alpha_2) \mid 0 \leq \alpha_i \leq 2\}$; the determinant of the matrix $K_{L \setminus M}$ is 1.

In particular, the response surface with terms of total degree up to 2 is estimable.

Notice that the previous discussion about the confounding, based on the matrix $K$, depends on the very special form of the kernel matrix produced by CoCoA. Actually CoCoA produces a matrix with integer entries such that the number of non zero entries of each row is low. A generic numerical software would have produced some type of orthonormalized matrix unsuitable for a statistical interpretation. •

3 Real linear bases and real levels

While all codings of levels specified as polynomial ideals are compatible with the algebraic theory, special cases have special features.

In particular, the use of the $n$-th roots of unity produces an orthogonal basis of monomials, see Pistone and Rogantin (2005). Thus, not only identifiability can be treated, but also the estimation of effects (i.e. least squares) is easily managed.

However, in the applications we are interested in real effects. Bayley (1983) suggests a way to define a real coding and interpretable real effects, while keeping some of the features of the complex coding. Other relevant references are Caboara and Riccomagno (1998), Edmondson (1994), Kobilinsky (1990), Kobilinsky and Monod (1991).

We shall discuss in detail an example of a fraction of a $5^3$ design.

3.1 Bases for the responses on the design

Coding the levels as in (C3) of the previous section and denoting by $L$ the set of the exponents of the monomial responses, the set of functions $\{X^\alpha, \alpha \in L\}$ is an orthonormal basis of the complex responses on the design $\mathcal{D}$. Then, each response $f$ can be represented as a unique $\mathbb{C}$-linear combination of constant, simple and interaction terms:

$$f = \sum_{\alpha \in L} \theta_\alpha X^\alpha, \quad \theta_\alpha \in \mathbb{C}$$

where the coefficients are uniquely defined by $\theta_\alpha = E_{\mathcal{D}}\left(f\, \overline{X^\alpha}\right)$. A response $f$ is real valued if and only if

$$\overline{\theta_\alpha} = \theta_{[-\alpha]}, \quad \alpha \in L .$$

In the applications, we are interested in the real responses on the design. Note that both the real vector space $\mathbb{R}(\mathcal{D})$ and the complex vector space $\mathbb{C}(\mathcal{D})$ of the responses on the design $\mathcal{D}$ have a real basis, see (Kobilinsky, 1990, Prop. 3.1). In particular, a common real basis of both spaces is described in the following proposition.

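Proposition 1 For each factor $j$ with $n_j$ levels, an orthogonal basis of the real functions $\mathbb{R}(\mathcal{D}_j)$ defined on the design restricted to the $j$-th factor, $\mathcal{D}_j$, is:

$$\begin{aligned}
&1 && \text{constant term} \\
&\Re(X_j^k) \ \text{ for } 1 \leq k \leq n_j/2, && \Re(X_j^k) = \frac{X_j^k + \overline{X_j^k}}{2} \\
&\Im(X_j^k) \ \text{ for } 1 \leq k < n_j/2, && \Im(X_j^k) = -i\,\frac{X_j^k - \overline{X_j^k}}{2} .
\end{aligned}$$

The basis of $\mathbb{R}(\mathcal{D})$ is obtained via the Kronecker product:

$$\mathbb{R}(\mathcal{D}) = \mathbb{R}(\mathcal{D}_1) \otimes \cdots \otimes \mathbb{R}(\mathcal{D}_j) \otimes \cdots \otimes \mathbb{R}(\mathcal{D}_m) .$$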

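Proof. For each real function $f$ defined on the factor, we have $\overline{f} = f$, that is:

$$\sum_{k=0}^{n_j-1} \overline{\theta_k}\, \overline{X_j^k} = \sum_{k=0}^{n_j-1} \theta_k X_j^k .$$

If $\theta_k = a_k + i\, b_k$ then $\overline{\theta_k} = a_k - i\, b_k$ and each real function $f$ can be written as

$$f = \theta_0 + \sum_{k=1}^{n_j-1} a_k \frac{X_j^k + \overline{X_j^k}}{2} + i \sum_{k=1}^{n_j-1} b_k \frac{X_j^k - \overline{X_j^k}}{2} = \theta_0 + \sum_{1 \leq k \leq n_j/2} \alpha_k \Re(X_j^k) + \sum_{1 \leq k < n_j/2} \beta_k \Im(X_j^k)$$

with $\alpha_k = 2\Re(\theta_k)$ for $k < n_j/2$, $\alpha_k = \Re(\theta_k)$ for $k = n_j/2$ (even case), $\beta_k = -2\Im(\theta_k)$ for $k < n_j/2$.

Now we check the orthogonality of the elements of the basis. First, all the vectors are orthogonal to the constant.

Then, let $h$ and $k$ be distinct integers such that $1 \leq k < h < n_j/2$. Writing $n = n_j$,

$$\begin{aligned}
\Im(X_j^h)\, \Im(X_j^k) &= \left(-i\, \frac{X_j^h - X_j^{n-h}}{2}\right) \left(-i\, \frac{X_j^k - X_j^{n-k}}{2}\right) \\
&= -\frac{1}{2} \left( \frac{X_j^{h+k} + X_j^{n-h-k}}{2} - \frac{X_j^{h-k} + X_j^{n-h+k}}{2} \right) \\
&= -\frac{1}{2} \left( \Re(X_j^{h+k}) - \Re(X_j^{h-k}) \right) .
\end{aligned}$$

The mean value of $\Im(X_j^h)\, \Im(X_j^k)$ is 0 because neither $h + k$ nor $h - k$ is 0.

The cases $\Im(X_j^h)$ with $\Re(X_j^k)$ and $\Re(X_j^h)$ with $\Re(X_j^k)$ are similar. □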
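The orthogonality stated in Proposition 1 can be checked numerically for a single factor with $n_j = 5$ levels coded by the fifth roots of unity. The Python sketch below is only a floating-point illustration with names chosen here (`omega`, `basis`, `gram`); it evaluates the five basis functions on the five levels and computes the matrix of mean products.

```python
import cmath

n = 5
# Levels of one factor in the complex coding (C3)
omega = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

def column(kind, k):
    """Values of Re(X^k) or Im(X^k) on the n levels."""
    powers = [w ** k for w in omega]
    return [p.real if kind == "re" else p.imag for p in powers]

# 1, Re(X), Re(X^2), Im(X), Im(X^2): here 1 <= k <= n/2 means k = 1, 2
basis = [[1.0] * n]
basis += [column("re", k) for k in (1, 2)]
basis += [column("im", k) for k in (1, 2)]

def mean_product(u, v):
    return sum(a * b for a, b in zip(u, v)) / n

gram = [[mean_product(u, v) for v in basis] for u in basis]
off_diag = max(abs(gram[i][j]) for i in range(5) for j in range(5) if i != j)
diag = [round(gram[i][i], 12) for i in range(5)]
print(off_diag, diag)
```

The diagonal entries show that the basis is orthogonal but not orthonormal: the non-constant elements have mean square $\frac12$.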

Remark. The previous bases, as functions of the integer lattice coding $0, 1, 2, \dots, n_j - 1$, consist of the usual trigonometric functions, see Riccomagno et al. (1997) and Bates et al. (1998). In fact we are now discussing the algebraic properties of the Fourier regression.

3.2 Real recoding for factors with a prime number of levels

Now we consider the application of the basis of Proposition 1 to ordered and quantitative factors.

If $n_j$ is a prime integer greater than 2, then the map

$$\omega_k \longleftrightarrow \Im(\omega_k)$$

is one-to-one and $\Im(X_j)$ can be considered the linear term of a real basis of the responses defined on the re-coded $\mathcal{D}_j$. We denote this term by $u_{1j}$. In this case the coding is

$$\{0, \Im(\omega_1), \dots, \Im(\omega_{n_j-1})\} = \left\{0, \sin\left(\frac{2\pi}{n_j}\right), \dots, \sin\left(\frac{2\pi}{n_j}(n_j - 1)\right)\right\} .$$

Notice that the new real levels are not equally spaced and their increasing order differs from the numbering $k = 0, 1, \dots, n_j - 1$, see the example below. In fact, we are constructing polynomial models and the trigonometric representation is not appropriate.

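Example. $n_j = 5$

The roots of unity, numbered $k = 0, 1, 2, 3, 4$, and the corresponding real coding, listed in increasing order of the real level, are:

$$\begin{array}{ccl}
k & \text{real level} & \\
4 & -\dfrac{\sqrt{10 + 2\sqrt{5}}}{4} & = \sin\left(\dfrac{2\pi}{5}\, 4\right) \\
3 & -\dfrac{\sqrt{10 - 2\sqrt{5}}}{4} & = \sin\left(\dfrac{2\pi}{5}\, 3\right) \\
0 & 0 & = \sin\left(\dfrac{2\pi}{5}\, 0\right) \\
2 & \dfrac{\sqrt{10 - 2\sqrt{5}}}{4} & = \sin\left(\dfrac{2\pi}{5}\, 2\right) \\
1 & \dfrac{\sqrt{10 + 2\sqrt{5}}}{4} & = \sin\left(\dfrac{2\pi}{5}\, 1\right)
\end{array} \qquad (5)$$

We shall discuss below a better algebraic presentation of such values.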
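The closed forms in (5) and the ordering of the levels can be confirmed numerically. The Python sketch below uses only the standard `math` module; the dictionary names are illustrative.

```python
import math

a = math.sqrt(10 + 2 * math.sqrt(5)) / 4    # sin(2*pi/5)
b = math.sqrt(10 - 2 * math.sqrt(5)) / 4    # sin(4*pi/5)

# Closed-form real level attached to the k-th root of unity
closed_form = {0: 0.0, 1: a, 2: b, 3: -b, 4: -a}
levels = {k: math.sin(2 * math.pi * k / 5) for k in range(5)}

max_err = max(abs(levels[k] - closed_form[k]) for k in range(5))
increasing_order = sorted(range(5), key=lambda k: levels[k])
print(max_err, increasing_order)
```

The sorted list of indices makes explicit that the increasing order of the real levels is 4, 3, 0, 2, 1 rather than 0, 1, 2, 3, 4.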

Now we shall show that the other elements of the basis are representable as polynomials of degrees from 0 up to $n_j - 1$. The degrees shall induce a special ordering on the elements of the basis. We consider the re-ordered basis

$$u_{0j}, u_{1j}, \dots, u_{kj}, \dots, u_{(n_j-1)j}$$

defined as follows:

$$\begin{aligned}
u_{0j} &= 1 \\
u_{kj} &= \Re(X_j^k) \quad \text{for } k \neq 0,\ k \text{ even} \\
u_{kj} &= \Im(X_j^k) \quad \text{for } k \text{ odd} .
\end{aligned} \qquad (6)$$

In fact, if $n_j$ is a prime number greater than 2, it is odd and the conjugate of an even power of $X_j$ is an odd power and vice-versa. Then

$$\left\{\Re(X_j^k),\ k \neq 0,\ k \text{ even},\ k \leq n_j - 1\right\} = \left\{\Re(X_j^k),\ 1 \leq k \leq \frac{n_j}{2}\right\}$$

$$\left\{\Im(X_j^k),\ k \text{ odd},\ k \leq n_j - 1\right\} = \left\{(-1)^{k-1}\, \Im(X_j^k),\ 1 \leq k < \frac{n_j}{2}\right\} .$$

The interest of this new numbering is given by the following proposition.

Proposition 2 Let the number of levels be a prime number greater than 2 and let

$$u_{0j}, u_{1j}, \dots, u_{(n_j-1)j}$$

be the basis defined above in (6).

Then, each element of such a basis is a polynomial in $u_{1j}$, as implied by the following triangular system of equations:

$$u_{1j}^k = \begin{cases} \sum_{s=0}^{k/2} \gamma_{sk}\, u_{(2s)j} & \text{if } k \text{ is even} \\ \sum_{s=0}^{[k/2]} \gamma_{sk}\, u_{(2s+1)j} & \text{if } k \text{ is odd} \end{cases} \qquad (7)$$

where the $\gamma_{sk}$'s are rational coefficients.

Consequently, the basis consists of the orthogonal polynomials on the real coding.

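Proof. On $\mathcal{D}_j$ we have $\overline{X_j} = X_j^{-1}$ and $i\,(X_j - \overline{X_j}) = -2u_{1j}$, so that

$$(-2u_{1j})^k = i^k \left(X_j - \overline{X_j}\right)^k = i^k \sum_{r=0}^{k} \binom{k}{r} (-1)^{k-r} X_j^{2r-k} .$$

If $k$ is odd, pairing the terms with opposite exponents gives

$$i^k \sum_{r=0}^{(k-1)/2} (-1)^r \binom{k}{r} \left(X_j^{k-2r} - \overline{X_j}^{\,k-2r}\right) = i^k \sum_{s=0}^{(k-1)/2} (-1)^{(k-1)/2-s} \binom{k}{\frac{k+1+2s}{2}} \left(X_j^{2s+1} - \overline{X_j}^{\,2s+1}\right) .$$

If $k$ is even, then $i^k = (-1)^{k/2}$ and the same pairing gives

$$\binom{k}{k/2} + (-1)^{k/2} \sum_{r=0}^{k/2-1} (-1)^r \binom{k}{r} \left(X_j^{k-2r} + \overline{X_j}^{\,k-2r}\right) = \binom{k}{k/2} + \sum_{s=1}^{k/2} (-1)^s \binom{k}{k/2+s} \left(X_j^{2s} + \overline{X_j}^{\,2s}\right) .$$

Since $X_j^{2s+1} - \overline{X_j}^{\,2s+1} = 2i\, u_{(2s+1)j}$ and $X_j^{2s} + \overline{X_j}^{\,2s} = 2u_{(2s)j}$, in both cases $u_{1j}^k$ is a combination of the form (7) with rational coefficients. □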

We notice that the last element of the basis, $u_{(n_j-1)j}$, equals $\Re(X_j)$; in fact $u_{(n_j-1)j} = \frac{1}{2}\left(X_j^{n_j-1} + X_j^{n_j-n_j+1}\right) = \Re(X_j)$. From the previous proposition, we can write $u_{(n_j-1)j}$ as an $(n_j - 1)$-degree polynomial of $u_{1j}$. Then, the real part of a complex coding level $\omega$, denoted by $c$, can be written as a polynomial of the corresponding imaginary part, denoted by $s$. The mapping between the real and the complex codings is

$$\begin{cases} \omega = c + i\, s \\ c = u_{n_j-1}(s) . \end{cases}$$

Remark. Having identified a natural degree associated to each term of the basis, we have a consistent definition of a parsimonious hierarchical model.

For example we could define a response surface model with constant, linear terms and interactions of order two and minimum degree.

A better description of the values of the real coding is available through the notion of minimal polynomial. For example, the irrational values appearing in (5) are solutions of the equation of (C4).

The general form of the minimal polynomial for $\sin(2\pi k/p)$, $k = 0, \dots, p - 1$, with $p$ a prime, is given by (see Beslin and de Angelis (2004)):

$$S_p(s) = \sum_{k=0}^{(p-1)/2} (-1)^k \binom{p}{2k+1} (1 - s^2)^{(p-1)/2-k}\, s^{2k+1} .$$

Note that the algebraic complexity has been reduced, because the coefficients of the minimal polynomial are integers.
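The formula for $S_p$ can be expanded with integer arithmetic and compared with the design polynomial of (C4). The Python sketch below represents a polynomial as a list of integer coefficients, lowest degree first; the helpers `poly_mul` and `S` are names introduced for this illustration.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def S(p):
    """Integer coefficients of S_p(s), lowest degree first, for an odd prime p."""
    half = (p - 1) // 2
    total = [0] * (p + 1)
    for k in range(half + 1):
        term = [1]
        for _ in range(half - k):
            term = poly_mul(term, [1, 0, -1])       # multiply by (1 - s^2)
        term = [0] * (2 * k + 1) + term             # multiply by s^(2k+1)
        for i, c in enumerate(term):
            total[i] += (-1) ** k * comb(p, 2 * k + 1) * c
    return total

print(S(3))
print(S(5))
```

For $p = 5$ the expansion gives $5s - 20s^3 + 16s^5$, the polynomial of (C4); for $p = 3$ it gives $3s - 4s^3$, the familiar expression of $\sin 3\theta$ in terms of $s = \sin\theta$.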

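Example. $n_j = 5$ (cont.)

For each factor, the re-ordered real basis $u_0, u_1, u_2, u_3, u_4$ is:

$$\begin{aligned}
u_0 &= 1 \\
u_1 &= \Im(X) = -i\, \frac{X - \overline{X}}{2} \\
u_2 &= \Re(X^2) = \frac{X^2 + \overline{X}^2}{2} \\
u_3 &= \Im(X^3) = -i\, \frac{X^3 - \overline{X}^3}{2} = i\, \frac{X^2 - \overline{X}^2}{2} = -\Im(X^2) \\
u_4 &= \Re(X^4) = \frac{X^4 + \overline{X}^4}{2} = \frac{X + \overline{X}}{2} = \Re(X) .
\end{aligned}$$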

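The triangular system in Equation (7) is:

$$\begin{aligned}
u_1^2 &= -\tfrac{1}{2}u_2 + \tfrac{1}{2} \\
u_1^3 &= -\tfrac{1}{4}u_3 + \tfrac{3}{4}u_1 \\
u_1^4 &= \tfrac{1}{8}u_4 - \tfrac{1}{2}u_2 + \tfrac{3}{8}
\end{aligned} \qquad (8)$$

and the orthogonalized system of the monomials $1, u_1, u_1^2, u_1^3, u_1^4$ on the given points is:

$$\begin{aligned}
&1 \\
&u_1 \\
u_2 &= -2u_1^2 + 1 \\
u_3 &= -4u_1^3 + 3u_1 \\
u_4 &= 8u_1^4 - 8u_1^2 + 1 .
\end{aligned} \qquad (9)$$

In this case the mapping between the real and the complex coding is:

$$\begin{cases} \omega = c + i\, s \\ c = 8s^4 - 8s^2 + 1 . \end{cases}$$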

The mean values of the monomials $1, u_1, u_1^2, u_1^3, u_1^4$ are rational numbers. In fact, from the relations (8), the mean values of the odd powers are zero and the mean values of the even powers are (cf. (3)):

$$E_{\mathcal{D}}(1) = 1, \quad E_{\mathcal{D}}(u_1^2) = \frac{1}{2}, \quad E_{\mathcal{D}}(u_1^4) = \frac{3}{8} .$$
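The relations (9) and the mean values above can be verified on the five levels. The Python sketch below is a floating-point check with names chosen here; it uses the trigonometric reading of the basis, $u_1 = \sin\theta$, $u_2 = \cos 2\theta$, $u_3 = \sin 3\theta$, $u_4 = \cos 4\theta$ with $\theta = 2\pi k/5$.

```python
import math

angles = [2 * math.pi * k / 5 for k in range(5)]
u1 = [math.sin(t) for t in angles]          # real coding (C4)
u2 = [math.cos(2 * t) for t in angles]      # Re(X^2)
u3 = [math.sin(3 * t) for t in angles]      # Im(X^3)
u4 = [math.cos(4 * t) for t in angles]      # Re(X^4)

# Largest deviation from the polynomial expressions of system (9)
err9 = max(
    max(
        abs(u2[i] - (-2 * s**2 + 1)),
        abs(u3[i] - (-4 * s**3 + 3 * s)),
        abs(u4[i] - (8 * s**4 - 8 * s**2 + 1)),
    )
    for i, s in enumerate(u1)
)

mean_u1_sq = sum(s**2 for s in u1) / 5
mean_u1_4 = sum(s**4 for s in u1) / 5
mean_odd = max(abs(sum(s**k for s in u1) / 5) for k in (1, 3))
print(err9, mean_u1_sq, mean_u1_4, mean_odd)
```

The two even means agree with $s_2/5 = \frac12$ and $s_4/5 = \frac38$ obtained from the Newton-Girard formulæ.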

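The space of the responses is linear with basis $u_0, \dots, u_4$. As a ring, it has the following multiplication table, in which every entry is to be multiplied by $\frac{1}{2}$.

$$\begin{array}{c|ccccc}
 & u_0 & u_1 & u_2 & u_3 & u_4 \\ \hline
u_0 & 2u_0 & 2u_1 & 2u_2 & 2u_3 & 2u_4 \\
u_1 & 2u_1 & -u_2 + 1 & -u_1 + u_3 & u_2 - u_4 & -u_3 \\
u_2 & 2u_2 & -u_1 + u_3 & u_4 + 1 & u_1 & u_2 + u_4 \\
u_3 & 2u_3 & u_2 - u_4 & u_1 & -u_4 + 1 & -u_1 - u_3 \\
u_4 & 2u_4 & -u_3 & u_2 + u_4 & -u_1 - u_3 & u_2 + 1
\end{array} \quad \times \frac{1}{2}$$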
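The multiplication table can be checked entry by entry on the five levels. In the Python sketch below, `TABLE[i][j]` stores the coefficients on $u_0, \dots, u_4$ of $2\,u_i u_j$ as read from the table; the encoding and the names are choices made for this illustration.

```python
import math

angles = [2 * math.pi * k / 5 for k in range(5)]
u = [
    [1.0] * 5,                              # u0
    [math.sin(t) for t in angles],          # u1 = Im(X)
    [math.cos(2 * t) for t in angles],      # u2 = Re(X^2)
    [math.sin(3 * t) for t in angles],      # u3 = Im(X^3)
    [math.cos(4 * t) for t in angles],      # u4 = Re(X^4)
]

# TABLE[i][j] = coefficients on (u0, u1, u2, u3, u4) of 2 * u_i * u_j
TABLE = [
    [[2, 0, 0, 0, 0], [0, 2, 0, 0, 0], [0, 0, 2, 0, 0], [0, 0, 0, 2, 0], [0, 0, 0, 0, 2]],
    [[0, 2, 0, 0, 0], [1, 0, -1, 0, 0], [0, -1, 0, 1, 0], [0, 0, 1, 0, -1], [0, 0, 0, -1, 0]],
    [[0, 0, 2, 0, 0], [0, -1, 0, 1, 0], [1, 0, 0, 0, 1], [0, 1, 0, 0, 0], [0, 0, 1, 0, 1]],
    [[0, 0, 0, 2, 0], [0, 0, 1, 0, -1], [0, 1, 0, 0, 0], [1, 0, 0, 0, -1], [0, -1, 0, -1, 0]],
    [[0, 0, 0, 0, 2], [0, 0, 0, -1, 0], [0, 0, 1, 0, 1], [0, -1, 0, -1, 0], [1, 0, 1, 0, 0]],
]

max_err = 0.0
for i in range(5):
    for j in range(5):
        for p in range(5):
            lhs = 2 * u[i][p] * u[j][p]
            rhs = sum(TABLE[i][j][t] * u[t][p] for t in range(5))
            max_err = max(max_err, abs(lhs - rhs))

symmetric = all(TABLE[i][j] == TABLE[j][i] for i in range(5) for j in range(5))
print(max_err, symmetric)
```

Each entry is a product-to-sum trigonometric identity reduced with $5\theta \equiv 0 \pmod{2\pi}$.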

3.3 Recoding of regular fractions

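The regular fractions in complex coding are defined by equations of the form

$$X^\alpha = \text{constant} .$$

The imaginary part of the product of $k$ roots of unity corresponds to the sine of the sum of $k$ angles, which can be computed using the following formulas. If $c^{(k)}$, $s^{(k)}$ denote the cosine and the sine of the sum of $k$ angles, and $c_i$, $s_i$ the cosine and the sine of the $i$-th angle, then

$$\begin{aligned}
c^{(k)} &= \sum_{\substack{h = k, k-2, \dots \\ h \geq 0}} (-1)^{\frac{k-h}{2}} \sum_{\substack{I \subseteq \{1, \dots, k\} \\ \#I = h}} \prod_{i \in I} c_i \prod_{i \notin I} s_i \\
s^{(k)} &= \sum_{\substack{h = k-1, k-3, \dots \\ h \geq 0}} (-1)^{\frac{k-h-1}{2}} \sum_{\substack{I \subseteq \{1, \dots, k\} \\ \#I = h}} \prod_{i \in I} c_i \prod_{i \notin I} s_i .
\end{aligned} \qquad (10)$$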

Using the polynomial formulas of $c$ as a function of $s$ and Equations (10), every set of generating equations of a regular fraction translates into a set of generating equations in the real coding.

Example. $n_j = 5$ (cont.)

The following generating equation for a $5^{3-1}$ fraction, $XYZ = 1$, translates into:

$$\begin{cases} c_i = 8s_i^4 - 8s_i^2 + 1, \quad i = 1, 2, 3 \\ 0 = s_1c_2c_3 + c_1s_2c_3 + c_1c_2s_3 - s_1s_2s_3 . \end{cases}$$

The defining equations for the full factorial design are:

$$16s_i^5 - 20s_i^3 + 5s_i = 0, \quad i = 1, 2, 3 .$$
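The translation of $XYZ = 1$ into the real coding can be checked by enumeration over the $5^3$ design. The Python sketch below is a numerical illustration with names chosen here: it selects the points where the real-coded equation holds up to a small tolerance and compares them with the points whose level indices sum to a multiple of 5, which is the condition $XYZ = 1$ in the complex coding.

```python
import math
from itertools import product

def s(k):
    """Real level (sine) attached to the k-th root of unity."""
    return math.sin(2 * math.pi * k / 5)

def c(k):
    """Cosine written as a polynomial in the sine, as in the real recoding."""
    return 8 * s(k) ** 4 - 8 * s(k) ** 2 + 1

def sin_of_sum(k1, k2, k3):
    """Right-hand side of the real-coded generating equation."""
    s1, s2, s3 = s(k1), s(k2), s(k3)
    c1, c2, c3 = c(k1), c(k2), c(k3)
    return s1 * c2 * c3 + c1 * s2 * c3 + c1 * c2 * s3 - s1 * s2 * s3

points = list(product(range(5), repeat=3))
by_real_equation = {p for p in points if abs(sin_of_sum(*p)) < 1e-9}
by_complex_equation = {p for p in points if sum(p) % 5 == 0}

print(len(by_real_equation), by_real_equation == by_complex_equation)
```

The tolerance is safe here because the nonzero values of $\sin(2\pi m/5)$ are bounded away from zero; an exact treatment would work in the quotient ring, as the CoCoA script in the Appendix does.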

A monomial basis of the fraction, obtained using CoCoA, see Appendix, is:

$$1, z, y, x, z^2, y^2, x^2, z^3, y^3, z^4, y^4, yz, xz, xy, yz^2, xz^2, y^2z, x^2z, xy^2, yz^3, xz^3, y^2z^2, yz^4, xyz, xyz^2 .$$

The monomial terms in the list are linearly independent on the fraction. A possible choice of a symmetric hierarchical model based on this list is

$$1, z, y, x, z^2, y^2, x^2, yz, xz, xy, xyz .$$

We consider the same regular fraction, using the Fourier coding as in the system (9). The basis is square free. A monomial basis of the fraction, obtained using CoCoA, is:

$$1, z_4, z_3, z_2, z_1, y_4, y_3, y_2, y_1, x_4, x_3, x_2, x_1, y_4z_4, y_3z_4, y_2z_4, y_1z_4, x_4z_4, x_3z_4, x_2z_4, x_1z_4, y_4z_3, y_3z_3, y_2z_3, y_1z_3 .$$

It is remarkable that interactions of order 3 appear in the former case and not in the latter. •

4 Conclusions

In this paper we have discussed how fractions defined by polynomial equations in the complex coding translate to polynomial equations in the real coding.

Moreover, Gröbner basis software can produce a list of estimable monomial terms deduced from any type of defining equations, in particular indicator functions.

The computations of interest are exact but, unfortunately, not all the hierarchical bases are produced this way. We expect this to be relevant because inverse problems could be considered, as in Robbiano and Rogantin (1998) and Fontana et al. (2000).

5 Appendix: CoCoA script

Use R::=Q[x,y,z];                -- rational coefficients and 3 indeterminate ring
Dx:=16*x^5-20*x^3+5*x;           -- defining equations of the full design
Dy:=16*y^5-20*y^3+5*y;
Dz:=16*z^5-20*z^3+5*z;
Cx:=8*x^4-8*x^2+1;               -- equations for cosines
Cy:=8*y^4-8*y^2+1;
Cz:=8*z^4-8*z^2+1;
-- addition formulas and generating equation of the fraction
F:=x*Cy*Cz + Cx*y*Cz + Cx*Cy*z - x*y*z;
I:=Ideal(F,Dx,Dy,Dz);            -- ideal generated by equations
Est:=QuotientBasis(I); Sort(Est); Est;   -- set of estimable terms

Use R::=Q[x[1..4],y[1..4],z[1..4]];
PowX:=[x[1]^2+1/2*x[2]-1/2, x[1]^3+1/4*x[3]-3/4*x[1],
       x[1]^4-1/8*x[4]+1/2*x[2]-3/8, x[1]^5-20/16*x[1]^3+5/16*x[1]];
PowY:=[y[1]^2+1/2*y[2]-1/2, y[1]^3+1/4*y[3]-3/4*y[1],
       y[1]^4-1/8*y[4]+1/2*y[2]-3/8, y[1]^5-20/16*y[1]^3+5/16*y[1]];
PowZ:=[z[1]^2+1/2*z[2]-1/2, z[1]^3+1/4*z[3]-3/4*z[1],
       z[1]^4-1/8*z[4]+1/2*z[2]-3/8, z[1]^5-20/16*z[1]^3+5/16*z[1]];
Pow:=Concat(PowX,PowY,PowZ);
F:=[x[1]*y[4]*z[4]+x[4]*y[1]*z[4]+x[4]*y[4]*z[1]-x[1]*y[1]*z[1]];
Frac:=Concat(Pow,F);
I:=Ideal(Frac);
Est:=QuotientBasis(I); Sort(Est); Est;

References

Bates, R. A., Riccomagno, E., Schwabe, R., Wynn, H. P., 1998. Lattices and dual lattices in optimal experimental design for Fourier models. Computational Statistics & Data Analysis 28, 283–296.

Bayley, R. A., 1983. The decomposition of treatment degrees of freedom in quantitative factorial experiments. J. R. Statist. Soc. B 44 (1), 63–70.

Beslin, S., de Angelis, V., 2004. The minimal polynomials of sin(2π/p) and cos(2π/p). Mathematics Magazine 77, 146–149.

Caboara, M., Riccomagno, E., 1998. An algebraic computational approach to the identifiability of Fourier models. Journal of Symbolic Computation 26, 245–260.

Cheng, S.-W., Li, W., Ye, K. Q., 2004. Blocked nonregular two-level factorial designs. Technometrics 46 (3), 269–279.

Cox, D. A., Little, J. B., O’Shea, D., 1997. Ideals, Varieties, and Algorithms, 2nd Edition. Springer-Verlag, New York, 1st ed. 1992.

Edmondson, R. N., 1994. Fractional factorial designs for factors with a prime number of quantitative levels. J. R. Statist. Soc. B 56 (4), 611–622.

Fontana, R., Pistone, G., Rogantin, M.-P., 1997. Algebraic analysis and generation of two-levels designs. Statistica Applicata 9 (1), 15–29.

Fontana, R., Pistone, G., Rogantin, M. P., 2000. Classification of two-level factorial fractions. J. Statist. Plann. Inference 87 (1), 149–172.

Galetto, F., Pistone, G., Rogantin, M. P., 2003. Confounding revisited with commutative computational algebra. J. Statist. Plann. Inference 117 (2), 345–363.

Giglio, B., Riccomagno, E., Wynn, H. P., 2000. Gröbner basis strategies in regression. Journal of Applied Statistics 27 (7), 923–938.

Kobilinsky, A., 1990. Complex linear model and cyclic designs. Linear Algebra and its Applications 127, 227–282.

Kobilinsky, A., Monod, H., 1991. Experimental design generated by group morphism: An introduction. Scand. J. Statist. 18, 119–134.

Kreuzer, M., Robbiano, L., 2000. Computational Commutative Algebra 1. Springer, Berlin-Heidelberg.

Pistone, G., Riccomagno, E., Wynn, H. P., 2001. Algebraic Statistics: Computational Commutative Algebra in Statistics. Chapman & Hall, Boca Raton.

Pistone, G., Rogantin, M., 2005. Indicator function and complex coding for mixed fractional factorial designs. Tech. rep., Dipartimento di Matematica, Politecnico di Torino, submitted to JSPI.

Pistone, G., Wynn, H. P., 1996. Generalised confounding with Gröbner bases. Biometrika 83 (3), 653–666.

Riccomagno, E., Schwabe, R., Wynn, H. P., 1997. Lattice–based optimum design for Fourier regression. The Annals of Statistics 25 (6), 2313–2327.

Robbiano, L., Rogantin, M.-P., 1998. Full factorial designs and distracted fractions. In: Buchberger, B., Winkler, F. (Eds.), Gröbner Bases and Applications (Proc. of the Conf. 33 Years of Gröbner Bases). Vol. 251 of London Mathematical Society Lecture Notes Series. Cambridge University Press, pp. 473–482.

Tang, B., Deng, L. Y., 1999. Minimum G2-aberration for nonregular fractional factorial designs. The Annals of Statistics 27 (6), 1914–1926.

van der Waerden, B. L., 1970. Algebra. Vol 1. Translated by Fred Blum and John R. Schulenberger. Frederick Ungar Publishing Co., New York.

Ye, K. Q., 2003. Indicator function and its application in two-level factorial designs. The Annals of Statistics 31 (3), 984–994.
