U n × n unitary, [U^T e_1]_i ≠ 0 ∀ i,

L(z) = U d(U^T z) d(U^T e_1)^{-1} U^H, z ∈ C^n  (e_1^T L(z) = z^T),  L(x) = L(z)^2,
U^T x = d(U^T e_1)^{-1} d(U^T z) U^T z,  x^T = z^T L(z)  (Gianluca),

L = {L(z) : z ∈ C^n} is a commutative matrix algebra, L(x)L(y) = L(L(x)^T y) = L(y)L(x) = L(L(y)^T x).

Given z^T, the first row of L(z), one can compute the first rows of L(z)^2, L(z)^4, L(z)^8, ..., L(z)^{2^k}. Cost = one U^T transform + kn a.o. + one U transform.
Given z^T, the first row of L(z), the eigenvalues λ of L(z) can be computed by performing a U^T transform, and the eigenvalues of L(z)^s are simply λ^s.
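For instance, for L = C (U = F, the unitary Fourier matrix), the first row z of a circulant determines its eigenvalues through a single transform; a minimal numpy sketch (the values of n, z and the power s = 3 are our choices; n·ifft matches the convention ω_n = e^{i2π/n}):

```python
import numpy as np

n = 8
rng = np.random.default_rng(1)
z = rng.standard_normal(n)          # first row of a circulant matrix

# build the circulant C with first row z: C[i, j] = z[(j - i) mod n]
C = np.array([[z[(j - i) % n] for j in range(n)] for i in range(n)])

# eigenvalues via one transform: lam_m = sum_k z_k w^{mk}, w = e^{2 pi i/n},
# which is n * ifft(z) in numpy's sign convention
lam = n * np.fft.ifft(z)

# the eigenvalues of C^s are simply lam^s (here s = 3)
eig_C = np.linalg.eigvals(C)
eig_C3 = np.linalg.eigvals(np.linalg.matrix_power(C, 3))
```
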
Circulant, τ, η and µ matrix algebras are of type L.

Given A n × n and its first row, say [z_1 z_2 ⋯ z_{n-1} z_n], one can show that A ∈ L, L = τ, η, µ, iff

a_{i,j-1} + a_{i,j+1} = a_{i-1,j} + a_{i+1,j},  1 ≤ i, j ≤ n,

where, for s = 1, ..., n,

a_{s,0} = a_{0,s} = a_{s,n+1} = a_{n+1,s} = 0,                          if L = τ,
a_{s,0} = a_{0,s} = a_{1,n-s+1},  a_{s,n+1} = a_{n+1,s} = a_{1,s},      if L = η,
a_{s,0} = a_{0,s} = -a_{1,n-s+1},  a_{s,n+1} = a_{n+1,s} = -a_{1,s},    if L = µ.
Examples. Let us write the 4 × 4 τ, η, µ matrices with first row [0 1 0 1]:

[0 1 0 1]      [0 1 0 1]      [0 1 0 1]
[1 0 2 0]      [1 0 1 0]      [1 0 3 0]
[0 2 0 1]      [0 1 0 1]      [0 3 0 1]
[1 0 1 0],     [1 0 1 0],     [1 0 1 0].
Let us write the 5 × 5 τ, η, µ matrices with first row [0 1 0 0 1]:

[0 1 0 0 1]      [0 1 0 0 1]      [0 1 0 0 1]
[1 0 1 1 0]      [1 0 1 0 0]      [1 0 1 2 0]
[0 1 1 1 0]      [0 1 0 1 0]      [0 1 2 1 0]
[0 1 1 0 1]      [0 0 1 0 1]      [0 2 1 0 1]
[1 0 0 1 0],     [1 0 0 1 0],     [1 0 0 1 0].
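The cross-sum rule determines the whole matrix from its first row; a small numpy sketch (the function name `from_first_row` is ours) that rebuilds the displayed examples row by row:

```python
import numpy as np

def from_first_row(z, alg):
    """Fill A from the cross-sum rule
    a_{i,j-1} + a_{i,j+1} = a_{i-1,j} + a_{i+1,j},
    with the tau/eta/mu boundary conventions for ghost indices 0, n+1."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    A = np.zeros((n, n))
    A[0] = z
    sgn = {'tau': 0.0, 'eta': 1.0, 'mu': -1.0}[alg]

    def a(i, j):                                   # 1-based, ghost rows/cols allowed
        if 1 <= i <= n and 1 <= j <= n:
            return A[i - 1, j - 1]
        if j == 0:     return sgn * A[0, n - i]    # a_{s,0}   = ± a_{1,n-s+1}
        if i == 0:     return sgn * A[0, n - j]
        if j == n + 1: return sgn * A[0, i - 1]    # a_{s,n+1} = ± a_{1,s}
        return sgn * A[0, j - 1]                   # i == n + 1

    for i in range(1, n):                          # fill row i+1 from rows i, i-1
        for j in range(1, n + 1):
            A[i, j - 1] = a(i, j - 1) + a(i, j + 1) - a(i - 1, j)
    return A
```
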
Stochastic by columns ∩ matrix algebras
3 × 3 stochastic by columns (symmetric; symmetric and persymmetric) matrices:

M = [a c e; b d f; 1-a-b 1-c-d 1-e-f]

( M_S = [a b 1-a-b; b c 1-b-c; 1-a-b 1-b-c -1+a+2b+c],
  M_SP = [a b 1-a-b; b 1-2b b; 1-a-b b a] )

p_{M_S}(λ) = (1-λ)[λ^2 - 2λ(a+b+c-1) + 3ac - a - 3b^2 + 2b - c],
eig = a+b+c-1 ± √(a^2 + 4b^2 + c^2 + 2ab + 2bc - ac - a - 4b - c + 1).

3 × 3 stochastic by columns circulant (symmetric) matrices:
C ∩ M = [a 1-a-b b; b a 1-a-b; 1-a-b b a],
C ∩ M_S = [a (1-a)/2 (1-a)/2; (1-a)/2 a (1-a)/2; (1-a)/2 (1-a)/2 a]

p_{C∩M}(λ) = (1-λ)[λ^2 - λ(3a-1) + a^3 + b^3 - 3ab(1-a-b) + (1-a-b)^3]
           = (1-λ)[λ^2 - λ(3a-1) + 3(a^2 + b^2 - a - b + ab) + 1]

b = 1-a-b  ⇒  p_{C∩M_S}(λ) = (1-λ)[λ^2 - λ(3a-1) + ((3a-1)/2)^2] = (1-λ)(λ - (3a-1)/2)^2

3 × 3 stochastic by columns tau matrices:
τ ∩ M = [a 0 1-a; 0 1 0; 1-a 0 a],  p_{τ∩M}(λ) = (1-λ)[(2a-1-λ)(1-λ)]

3 × 3 stochastic by columns eta matrices:

η ∩ M = [a b 1-a-b; b 1-2b b; 1-a-b b a] = M_SP !  p_{η∩M}(λ) = (1-λ)[λ^2 ...]

3 × 3 stochastic by columns mu matrices:

µ ∩ M = [a 0 1-a; 0 1 0; 1-a 0 a],  p_{µ∩M}(λ) = (1-λ)[(2a-1-λ)(1-λ)]
4 × 4 stochastic by columns symmetric and persymmetric matrices:

M_SP = [a b c 1-a-b-c; b d 1-b-c-d c; c 1-b-c-d d b; 1-a-b-c c b a]
4 × 4 stochastic by columns τ matrices:

M ∩ τ = [a b -b 1-a; b a-b 1-a+b -b; -b 1-a+b a-b b; 1-a -b b a],
(M ∩ τ ≥ 0) = [a 0 0 1-a; 0 a 1-a 0; 0 1-a a 0; 1-a 0 0 a],  λ = 1, 1, 2a-1, 2a-1

M ∩ η = [a b c 1-a-b-c; b a 1-a-b-c c; c 1-a-b-c a b; 1-a-b-c c b a],
(M ∩ η ≥ 0) =        λ =

M ∩ µ = [a (a-d)/2 (d-a)/2 1-a; (a-d)/2 d 1-d (d-a)/2; (d-a)/2 1-d d (a-d)/2; 1-a (d-a)/2 (a-d)/2 a],
(M ∩ µ ≥ 0) =        λ =
2 × 2 stochastic by columns matrices:

M = [a b; 1-a 1-b],  λ = 1, a-b

2 × 2 stochastic by columns circulant matrices:

C ∩ M = [c 1-c; 1-c c],  λ = 1, 2c-1

Assuming a, b, c ∈ R, the Frobenius norm of A - X, A ∈ M, X varying in C ∩ M, is minimum for c = (a+1-b)/2, i.e. when A and X have equal eigenvalues different from 1 (a - b = 2c - 1).
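A quick numerical check of this minimizer (the values a = 0.7, b = 0.2 and the grid are our choices):

```python
import numpy as np

a, b = 0.7, 0.2
A = np.array([[a, b], [1 - a, 1 - b]])

def dist(c):
    # Frobenius distance from A to the 2x2 stochastic circulant [c 1-c; 1-c c]
    X = np.array([[c, 1 - c], [1 - c, c]])
    return np.linalg.norm(A - X)

cs = np.linspace(0, 1, 10001)
c_star = cs[np.argmin([dist(c) for c in cs])]   # numerical minimizer
```
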
Question: fixed A ∈ M, where M is the set of all non negative stochastic by columns n × n matrices, such that |λ_2(A)| < 1, does there exist X ∈ C ∩ M such that

|λ_2(A)| ≤ |λ_2(X)| < 1 ?

Note that the eigenvalues of X are easily computable.
6 × 6 stochastic by columns symmetric and persymmetric matrices:

M_SP =
[ a            b            c            d            e            1-a-b-c-d-e ]
[ b            f            g            h            1-b-f-g-h-e  e           ]
[ c            g            i            1-c-g-i-h-d  h            d           ]
[ d            h            1-c-g-i-h-d  i            g            c           ]
[ e            1-b-f-g-h-e  h            g            f            b           ]
[ 1-a-b-c-d-e  e            d            c            b            a           ]
6 × 6 stochastic by columns η matrices:

M ∩ η =
[ a            b            c            d            e            1-a-b-c-d-e ]
[ b            a+c-e        b            e            1-2b-a-c-e   e           ]
[ c            b            a            1-a-b-c-d-e  e            d           ]
[ d            e            1-a-b-c-d-e  a            b            c           ]
[ e            1-2b-a-c-e   e            b            a+c-e        b           ]
[ 1-a-b-c-d-e  e            d            c            b            a           ]
• 3 × 3 singular stochastic by columns matrices:

[a a a; b b b; 1-a-b 1-a-b 1-a-b]^n = [a a a; b b b; 1-a-b 1-a-b 1-a-b]

[a a c; b b d; 1-a-b 1-a-b 1-c-d]^n = ?
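The first identity holds because a rank-one stochastic by columns matrix is idempotent: each column equals v = (a, b, 1-a-b)^T and Mv = (a + b + 1-a-b)v = v, so M^2 = M and M^n = M. A quick check (the values of a, b and the exponent are ours):

```python
import numpy as np

a, b = 0.3, 0.5
v = np.array([a, b, 1 - a - b])      # common column, entries summing to 1
M = np.column_stack([v, v, v])

# M is idempotent: M @ v = v, hence M^2 = M and M^n = M for every n
Mn = np.linalg.matrix_power(M, 7)
```
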
• 3 × 3 singular stochastic by columns symmetric and persymmetric matrices:

A = [a 1-2a a; 1-2a -1+4a 1-2a; a 1-2a a] ∈ η !  λ = 0, 1, λ_3 ∈ R

A has rank 1 iff a = 1/3 (in such case λ = 0, 1, 0). A ≥ 0 iff 1/4 ≤ a ≤ 1/2. A is positive semidefinite iff a ≥ 1/3. A ∈ τ iff a = 1/2 (in such case λ = 1, 0, 1).

A^n = [a_n b_n a_n; b_n 1-2b_n b_n; a_n b_n a_n],
[a_n; b_n] = [2a 1-2a; 2(1-2a) 4a-1] [a_{n-1}; b_{n-1}].

EXAMPLE. Let A be the following n × n stochastic by columns matrix
A =
[ 0        b_1                                  ]
[ 1        0        b_2                         ]
[          1-b_1    0      ⋱                    ]
[                   1-b_2  ⋱    b_{n-2}         ]
[                          ⋱    0        1      ]
[                         1-b_{n-2}      0      ],   b_i ∈ [0, 1]

(zero main diagonal, upper diagonal (b_1, ..., b_{n-2}, 1), lower diagonal (1, 1-b_1, ..., 1-b_{n-2})).
Note that the eigenvalues of A are real (even if A is not hermitian), and lie in the interval [-1, 1]. They are distinct if b_i ∈ (0, 1). Obviously, 1 is an eigenvalue.
Moreover, -1 is an eigenvalue too (prove it!). The remaining eigenvalues are not known (for generic values of the b_j).
Let C be the space of n × n circulant matrices. Let us compute C_A, the minimizer of ‖A - X‖_F, X ∈ C, with the aim to compare its eigenvalues with those of A. Let {J_1, J_2, ..., J_n} be a basis of C. Then C_A = Σ_{k=1}^n α_k J_k, where Bα = c, B_rs = (J_r, J_s), c_r = (J_r, A), 1 ≤ r, s ≤ n. If J_k = J_2^{k-1} where

J_2 =
[ 0  1         ]
[       ⋱      ]
[            1 ]
[ 1          0 ],

then (J_r, J_s) = nδ_rs, (J_1, A) = (J_r, A) = 0, r = 3, ..., n-1, and (J_2, A) = 1 + Σ b_j, (J_n, A) = n - 1 - Σ b_j. Thus

C_A = p J_2 + q J_n =
[ 0  p        q ]
[ q  0  p       ]
[      ⋱  ⋱     ]
[ p        q  0 ],   p = (1 + Σ b_j)/n,  q = (n - 1 - Σ b_j)/n = 1 - p.
The eigenvalues of C_A can be easily computed. In fact, recalling that

C_A = F d(F C_A^T e_1) d(F e_1)^{-1} F^H,  [F]_ij = (1/√n) ω_n^{(i-1)(j-1)}, 1 ≤ i, j ≤ n,  ω_n = e^{i2π/n},

first write the vector √n F C_A^T e_1:

√n F (0 p 0 ⋯ 0 q)^T = √n (p F e_2 + q F e_n) = √n (p F + q F̄) e_2   (since F e_n = F̄ e_2, F̄ = F^H),

and then observe that the eigenvalues of C_A are its entries:

p ω_n^{i-1} + (1-p) ω_n^{-(i-1)} = p ω_n^{i-1} + (1-p) ω_n^{n-i+1},  i = 1, ..., n.

Note that 1/n ≤ p ≤ (n-1)/n, and that it is sufficient to study the eigenvalues of C_A for 1/2 ≤ p ≤ (n-1)/n.
The case p = (n-1)/n (b_j = 1 ∀ j).
In this case the eigenvalues of A are obviously known: they are -1, 0 with algebraic multiplicity n - 2, and 1. The eigenvalues of C_A are

((n-1)/n) ω_n^{i-1} + (1/n) ω_n^{-(i-1)},  i = 1, ..., n

(draw them!). They are all inside the set {z : |z| ≤ 1}, except 1 (i = 1) and, for even n, -1 (i - 1 = n/2).
The case p = 1/2 (Σ b_j + 1 = n/2).
In this case the eigenvalues of A are not known (?, perhaps they are known if b_j = 1/2 ∀ j, and in other particular cases). The eigenvalues of C_A are ℜ(ω_n^{i-1}) = cos(2π(i-1)/n), i = 1, ..., n. They are all inside the set [-1, 1], except 1 (i = 1) and, for even n, -1 (i - 1 = n/2). ...
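A numerical sketch of this example (n and the b_j are our choices): build the tridiagonal A, its best circulant approximation C_A = pJ_2 + (1-p)J_2^{n-1}, and compare spectra.

```python
import numpy as np

n = 8
rng = np.random.default_rng(0)
b = rng.uniform(0.05, 0.95, n - 2)

# tridiagonal stochastic-by-columns A: zero diagonal,
# superdiagonal (b_1,...,b_{n-2}, 1), subdiagonal (1, 1-b_1,...,1-b_{n-2})
A = np.diag(np.r_[b, 1.0], 1) + np.diag(np.r_[1.0, 1.0 - b], -1)
eigA = np.linalg.eigvals(A)

# D A D^{-1} = -A with D = diag(1,-1,1,...): spectrum symmetric about 0,
# so 1 being an eigenvalue forces -1 to be an eigenvalue as well
D = np.diag((-1.0) ** np.arange(n))

# best circulant approximation C_A = p J_2 + (1-p) J_2^{n-1}
p = (1 + b.sum()) / n
J2 = np.roll(np.eye(n), 1, axis=1)        # cyclic forward shift
CA = p * J2 + (1 - p) * J2.T

# eigenvalues of C_A: p w^k + (1-p) w^{-k}, w = e^{2 pi i/n}
w = np.exp(2j * np.pi / n)
lam = np.array([p * w**k + (1 - p) * w**(-k) for k in range(n)])
```
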
RESULT. Let A ∈ C^{n×n} be a stochastic by columns (or by rows) n × n matrix.
So, 1 is an eigenvalue of A. Let U be a unitary matrix such that U e_i = (1/√n) e^{iθ} e, for some i and θ (i = 1, θ = 0 if U = F), and set L = {U d(z) U^H : z ∈ C^n}.
Note that L is a n-dimensional subspace of C^{n×n}, i.e. there exist J_k ∈ L, k = 1, ..., n, linearly independent such that L = Span{J_k}. Let L_A be the minimizer of ‖A - X‖_F in L,

L_A = U diag((U^H A U)_jj) U^H = U d(U^T z_A^r) d(U^T v)^{-1} U^H,  L_A = Σ_{k=1}^n α_k J_k,  α = B^{-1} c,  B_rs = (J_r, J_s),  c_r = (J_r, A),

where v is chosen such that (U^T v)_j ≠ 0 ∀ j (v = e_1 if U = F).
Then L_A is stochastic by columns and by rows (SEE the second Theorem in the next pages). In particular, 1 is an eigenvalue of L_A. All eigenvalues of L_A are particular points of the convex set {z^H A z / z^H z : z ∈ C^n}. So, when A is normal (hermitian) they are in the minimum polygon (real interval) containing the eigenvalues of A. When alternatively A ≥ 0 (? A^k ≥ 0 for some k ?) they are in the set {z : |z| ≤ 1} whenever L_A ≥ 0, but even in the latter case they can be either inside or outside the minimum polygon containing the eigenvalues of A, SEE the above example (however, if A is also normal, they are inside).
[Question: are there matrices A simultaneously normal, non negative and stochastic by columns (or by rows) which are not real symmetric and not in C ?]

Proposition. If A is a non negative n × n matrix, then its best approximation in C is also non negative. (Proof: we know that C_A = Σ_k α_k J_k with J_k = J_2^{k-1} ≥ 0 and α_k = (1/n)(J_k, A). If A is non negative then also α_k ≥ 0, so C_A ≥ 0.)
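A check of the Proposition on a random non negative A (the name `circ_proj` for the projection is ours):

```python
import numpy as np

def circ_proj(A):
    """Best circulant approximation of A in the Frobenius norm:
    C_A = sum_k alpha_k J_2^k with alpha_k = (1/n)(J_2^k, A)."""
    n = A.shape[0]
    J2 = np.roll(np.eye(n), 1, axis=1)      # cyclic shift; J_2^0 = I
    CA = np.zeros_like(A, dtype=float)
    Jk = np.eye(n)
    for _ in range(n):
        CA += (Jk * A).sum() / n * Jk       # (J_k, A)/n times J_k
        Jk = Jk @ J2
    return CA

rng = np.random.default_rng(3)
A = rng.uniform(0, 1, (6, 6))               # non negative A
CA = circ_proj(A)
```
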
Question: given A non negative, is L_A non negative for other spaces L ? Is τ_A non negative ? Recall that

τ = {U d(z) U^H : z ∈ C^n},  U = [√(2/(n+1)) sin(ijπ/(n+1))],  1 ≤ i, j ≤ n.
Elements of a basis of τ are obtained by choosing J_k ∈ τ such that e_1^T J_k = e_k^T, k = 1, ..., n. They are matrices made up of zeros and ones only. Moreover, for such J_k, we have

τ_A = Σ_k α_k J_k,  α = B^{-1} c,  B^{-1} = (1/(2n+2))(3J_1 - J_3),  c_r = (J_r, A).

[·]. If A is non negative, then c ≥ 0, but B^{-1} c may have negative entries; perhaps τ_A is nonetheless non negative (investigate!).
THREE THEOREMS on s-stochastic matrix algebras L and on the best approximation in L of A (each more general than the previous one):

First theorem

Set L = {U d(x) U^H : x ∈ C^n} where U is a unitary matrix. Choose v such that [U^T v]_j ≠ 0 ∀ j. Note that the choice v = e_1 works for L = C, τ, η, µ, ... but not for all low complexity matrix spaces L [...].
Then L = {L(z) : z ∈ C^n} where we have set L(z) = U d(U^T z) d(U^T v)^{-1} U^H. Note that v^T L(z) = z^T, and that x^T L(z) = z^T L(x), ∀ x, z ∈ C^n. Moreover, L(v) = I.
Observe that if L(e) = w e^T for some w ∈ C^n, then L(z) is z^T w-stochastic by columns, i.e.

e^T L(z) = z^T L(e) = (z^T w) e^T

[we have observed this first for L = η where w = e, v = e_1 (see previous pages)].
When, in general, L(e) = w e^T ? Iff v^T w = 1 and w e^T ∈ L. Assume w such that v^T w = 1, so w is in particular non null. Then w e^T ∈ L iff, for some i,

w e^T = U d((e^T w) e_i) U^H = (e^T w)(U e_i)(U e_i)^H,  e^T w ≠ 0.   (*)

Since U e_i ≠ 0, the equation (*) times e_j, with e_j such that (U e_i)^H e_j ≠ 0, implies w = α U e_i, α ≠ 0, and, since (U^T v)_i ≠ 0, v^T times the equation (*) implies U e_i = β e, β ≠ 0. Thus w = γ e, γ ≠ 0, and, since v^T w = 1, we have γ = 1/(v^T e). So, if L(e) = w e^T, then v^T e must be non zero and w must be equal to e/(v^T e). Now, provided that v^T e ≠ 0, the matrix e e^T/(v^T e) is in L iff

e e^T/(v^T e) = U d((e^T e/(v^T e)) e_i) U^H = (n/(v^T e))(U e_i)(U e_i)^H.   (**)

If U e_i = (1/√n) e^{iθ} e, then v^T e ≠ 0 and (**) holds. So, we have proved the following theorem:
Theorem.
If U e_i = (1/√n) e^{iθ} e, then L(e) = e e^T/(v^T e). It follows that, for any z ∈ C^n, the matrix L(z) is (z^T e/(v^T e))-stochastic by columns; in particular the best approximation of A in L, L_A = L(z_A) = U diag((U^H A U)_jj) U^H, is (z_A^T e/(v^T e))-stochastic by columns, i.e.

e^T L_A = (z_A^T e/(v^T e)) e^T.

Finally note that one of the eigenvalues of L_A, (U^H A U)_ii, is equal to (1/n) e^T A e, and that z_A^T e/(v^T e) = (1/n) e^T A e.
[For example, if L = C, η, ... (where v can be chosen equal to e_1), then e^T L(z) = z^T L(e) = (z^T e) e^T = (Σ z_i) e^T, and e^T L_A = ((1/n) e^T A e) e^T].
In particular, if A is stochastic by columns (e^T A = e^T) or by rows (A e = e), then L_A is stochastic by columns.
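A numerical illustration of the theorem for L = C (U = F, i = 1, θ = 0, v = e_1): the best circulant approximation of a random stochastic by columns A turns out to be doubly stochastic, with (1/n)e^T A e = 1 among its eigenvalues.

```python
import numpy as np

n = 6
rng = np.random.default_rng(2)

# random stochastic-by-columns A (columns sum to 1)
A = rng.uniform(0, 1, (n, n))
A /= A.sum(axis=0)

# unitary Fourier matrix: [U]_jk = w^{jk}/sqrt(n); its first column is e/sqrt(n)
U = np.fft.fft(np.eye(n), axis=0, norm="ortho").conj()

d = np.diag(U.conj().T @ A @ U)             # (U^H A U)_jj
LA = U @ np.diag(d) @ U.conj().T            # best approximation in C
```
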
Second theorem

Let U be a n × n unitary matrix, and set L = {U d(z) U^H : z ∈ C^n}. Choose v ∈ C^n such that (U^T v)_j ≠ 0 ∀ j (v = e_1 if L = C, τ, η, µ, ...; v ∈ R^n whenever possible, f.i. if U ∈ R^{n×n}). Then, if we set

L_r(z) = U d(U^T z) d(U^T v)^{-1} U^H   (v^T L_r(z) = z^T),
L_c(z) = U d(U^H v)^{-1} d(U^H z) U^H   (L_c(z) v = z),

L can be also represented as L = {L_r(z) : z ∈ C^n} = {L_c(z) : z ∈ C^n}. Note that x^T L_r(y) = y^T L_r(x), L_c(y) x = L_c(x) y, L_r(v) = I, L_c(v) = I.

Theorem.
If one of the columns of U has all entries equal each other, i.e. ∃ i and θ such that U e_i = (1/√n) e^{iθ} e, then

(1) L_r(e) = e e^T/(e^T v), L_c(e) = e e^T/(e^T v) (⇒ e e^T ∈ L), and therefore L_r(z) is (z^T e/(e^T v))-stochastic by columns, and L_c(z) is (e^T z/(e^T v))-stochastic by rows; in other words, X ∈ L ⇒ X is s_X-stochastic by rows and by columns for some s_X. [Note that v^T e ≠ 0 because (v^T U)_i ≠ 0 ((v^T U)_j ≠ 0 ∀ j).]

(2) Given A ∈ C^{n×n}, the matrix L_A = L_r(z_A^r) = L_c(z_A^c) = U diag((U^H A U)_jj) U^H, defined as the minimizer on L of ‖A - X‖_F, has (U^H A U)_ii = (1/n) e^T A e as eigenvalue, and is (1/n) e^T A e-stochastic by rows and by columns, i.e. L_A e = (1/n)(e^T A e) e, e^T L_A = (1/n)(e^T A e) e^T.

(3) If A is stochastic by columns or by rows, then L_A is stochastic by rows and by columns.
proof. (1): Note that M_j := U d((e^T e) e_j) U^H ∈ L, ∀ j, M_j = (e^T e)(U e_j)(U e_j)^H. Moreover, by the assumption U e_i = (1/√n) e^{iθ} e, we have M_i = e e^T. So, e e^T ∈ L and, obviously, L_r(e) = e e^T/(v^T e) (v^T L_r(e) = e^T !), L_c(e) = e e^T/(v^T e) (L_c(e) v = e !).
Thus, ∀ z ∈ C^n we have e^T L_r(z) = z^T L_r(e) = (z^T e/(v^T e)) e^T, L_c(z) e = L_c(e) z = (e^T z/(v^T e)) e.
(2): It is enough to observe that (U^H A U)_ii = (1/n) e^T A e and use the formula L_A = U diag((U^H A U)_jj) U^H. However, let us obtain the thesis from (1). As a consequence of (1), the matrix

L_A = U d(U^T z_A^r) d(U^T v)^{-1} U^H = U d(U^H v)^{-1} d(U^H z_A^c) U^H

is both ((z_A^r)^T e/(v^T e))-stochastic by columns and (e^T z_A^c/(v^T e))-stochastic by rows. Let us prove that (z_A^r)^T e/(v^T e) = e^T z_A^c/(v^T e) = (1/n) e^T A e. Since U^H e = √n e^{-iθ} e_i, we have

(z_A^r)^T e = v^T U diag((U^H A U)_jj) U^H e = √n e^{-iθ} v^T U e_i (U^H A U)_ii
            = v^T e (U^H A U)_ii = v^T e (U e_i)^H A (U e_i) = v^T e (1/n) e^T A e,

and, since e^T U = √n e^{iθ} e_i^T, we have

e^T z_A^c = e^T U diag((U^H A U)_jj) U^H v = √n e^{iθ} (U^H A U)_ii e_i^T U^H v
          = e^T v (U^H A U)_ii = e^T v (1/n) e^T A e.
Question: when is L_c(z) = L_r(x) ? Iff U^T x = d(u) U^H z, u = d(U^H v)^{-1} U^T v
[u = e if U ∈ R^{n×n} (v ∈ R^n) or U = F (v = e_1); |u_i| = 1 ∀ i].
Third Theorem

Let U, V be two n × n unitary matrices. Choose v, u ∈ C^n such that (U^T v)_i ≠ 0, (V^T u)_i ≠ 0, ∀ i. Given z ∈ C^n, set

L_r(z) = U d(V^T z) d(U^T v)^{-1} V^H,  L_c(z) = U d(V^H u)^{-1} d(U^H z) V^H.

Note that v^T L_r(z) = z^T, L_c(z) u = z.

Theorem.
If there exists i such that V e_i = (1/√n) e^{iθ} e, U e_i = (1/√n) e^{iϕ} e, then

V d(U^T e) d(U^T v)^{-1} V^H = e e^T/(e^T v),  U d(V^H u)^{-1} d(V^H e) U^H = e e^T/(e^T u),

and therefore

e^T L_r(z) = z^T V d(U^T e) d(U^T v)^{-1} V^H = (z^T e/(e^T v)) e^T,
L_c(z) e = U d(V^H u)^{-1} d(V^H e) U^H z = (e^T z/(e^T u)) e.

In other words, X ∈ L ⇒ X is s_X-stochastic by rows and by columns for some s_X ∈ C. Since, moreover, (U^H A V)_ii = (1/n) e^{i(θ-ϕ)} e^T A e, if

L_A = U d(V^T z_A^r) d(U^T v)^{-1} V^H = U d(V^H u)^{-1} d(U^H z_A^c) V^H = U diag((U^H A V)_jj) V^H

is the best approximation of A in L = {U d(z) V^H : z ∈ C^n}, then we have that

(z_A^r)^T e = v^T U diag((U^H A V)_jj) V^H e = (1/n)(e^T A e) v^T e,
e^T z_A^c = e^T U diag((U^H A V)_jj) V^H u = (1/n)(e^T A e) e^T u,

and therefore e^T L_A = (1/n)(e^T A e) e^T, L_A e = (1/n)(e^T A e) e.
It follows that whenever A ∈ C^{n×n} is stochastic by rows (A e = e) or stochastic by columns (e^T A = e^T), its best approximation L_A in L is stochastic simultaneously by rows and by columns.
proof. V e_i = (1/√n) e^{iθ} e, U e_i = (1/√n) e^{iϕ} e ⇒

V d((e^T e/(e^T v)) e_i) V^H = (e^T e/(e^T v))(V e_i)(V e_i)^H = e e^T/(e^T v)
[ U d((e^T e/(e^T u)) e_i) U^H = (e^T e/(e^T u))(U e_i)(U e_i)^H = e e^T/(e^T u) ].

U e_i = (1/√n) e^{iϕ} e, V e_i = (1/√n) e^{iθ} e ⇒ e_i^T U^H = (1/√n) e^{-iϕ} e^T, e_i^T V^H = (1/√n) e^{-iθ} e^T ⇒

e_i^T (U^H e) = (1/√n) e^{-iϕ} e^T e = (e^T e/(e^T v))(U^H v)_i,   e_j^T (U^H e) = e_j^T U^H e^{-iϕ} √n U e_i = 0, j ≠ i
[ e_i^T (V^H e) = (1/√n) e^{-iθ} e^T e = (e^T e/(e^T u))(V^H u)_i,   e_j^T (V^H e) = e_j^T V^H e^{-iθ} √n V e_i = 0, j ≠ i ].

Thus we have

d(U^H e) d(U^H v)^{-1} = d((e^T e/(e^T v)) e_i),   d(V^H e) d(V^H u)^{-1} = d((e^T e/(e^T u)) e_i),

and therefore V d(U^T e) d(U^T v)^{-1} V^H = e e^T/(e^T v), U d(V^H u)^{-1} d(V^H e) U^H = e e^T/(e^T u).
The equalities V^H e = e^{-iθ} √n e_i, e^T U = e^{iϕ} √n e_i^T, let us easily obtain the assertions on L_A.
REMARK. L = {U d(z) V^H : z ∈ C^n}, U, V unitary, L_A = U diag((U^H A V)_jj) V^H (here e_≤ denotes an arbitrary vector, see below):

V e_i = (1/√n) e^{iθ} e ⇒ L_A e = ((U e_i)^H A e) U e_i,  (U e_i)^H L_A = ((U e_i)^H A e / n) e^T;

V e_i = (1/√n) e^{iθ} e, U e_i = (e^{iϕ}/‖e_≤‖) e_≤, A e = e_≤ ⇒ L_A e = e_≤,  e_≤^H L_A = (‖e_≤‖^2/n) e^T;

U e_i = (1/√n) e^{iϕ} e ⇒ e^T L_A = (e^T A V e_i)(V e_i)^H,  L_A (V e_i) = (e^T A V e_i / n) e;

U e_i = (1/√n) e^{iϕ} e, V e_i = (e^{iθ}/‖e_≤‖) e_≤, e^T A = e_≤^H ⇒ e^T L_A = e_≤^H,  L_A e_≤ = (‖e_≤‖^2/n) e.
Exercise.
Given w ∈ C^n, w ≠ 0, set M = {X ∈ C^{n×n} : X w e^T = w e^T X}. Prove that
(i) M is a matrix algebra;
(ii) M = {X ∈ C^{n×n} : X w = c w & e^T X = c e^T for some c ∈ C}, i.e. X ∈ M implies X is s_X-stochastic by columns;
(iii) if w = e, then M = {X ∈ C^{n×n} : X e = c e & e^T X = c e^T for some c ∈ C}, i.e. X ∈ M implies X is s_X-stochastic by rows and by columns.
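A numerical check of the inclusion in (ii): any X with Xw = cw and e^T X = c e^T commutes with w e^T (the particular construction of such an X below is ours):

```python
import numpy as np

n, c = 5, 0.7
rng = np.random.default_rng(4)
w = rng.uniform(0.5, 1.5, n)            # w positive, so e^T w != 0
e = np.ones(n)

# P satisfies P w = 0 and e^T P = 0
P = np.eye(n) - np.outer(w, e) / (e @ w)

# X with X w = c w and e^T X = c e^T: rank-one part plus a "null" part
X = c * np.outer(w, e) / (e @ w) + P @ rng.standard_normal((n, n)) @ P
```
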
→ Investigate low complexity spaces L of matrices commuting with w e^T, in particular commutative spaces L including w e^T. We have seen examples in the case w = e.

Let U be a n × n unitary matrix. Set L = {U(µ ◦ Z)U^H : Z ∈ C^{n×n}} where µ is a fixed matrix whose entries are 0 or 1 and ◦ is the entry by entry product.
For example

µ = [0 0 1; 1 1 1; 0 0 1],  Z = [z_11 z_12 z_13; z_21 z_22 z_23; z_31 z_32 z_33],  µ ◦ Z = [0 0 z_13; z_21 z_22 z_23; 0 0 z_33].
Note that if µ = I, then L = {U d(z) U^H : z ∈ C^n}.
The space of matrices L is a vector subspace of C^{n×n}, and is a matrix algebra (i.e. products of matrices from L are in L) if the matrix µ satisfies the condition

[µ]_ij = 0 ⇒ [µ^2]_ij = 0

[or [µ^2]_ij ≠ 0 ⇒ [µ]_ij ≠ 0; or µ^2 ≤ αµ for some α > 0 (the pattern of µ^2 is enclosed in the pattern of µ)]. Examples of µ satisfying µ^2 ≤ αµ:
µ = I, µ = e e^T, the 3 × 3 pattern of the example above, and various other small 0-1 patterns, e.g. any full lower (or upper) triangular pattern (the square of a lower triangular matrix is again lower triangular).
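A quick check of the algebra condition (our choice of a full lower triangular µ and random Z's): products of elements of L stay in L because the pattern of µ^2 lies inside that of µ.

```python
import numpy as np

n = 5
rng = np.random.default_rng(5)

mu = np.tril(np.ones((n, n)))            # full lower triangular 0-1 pattern
M2 = mu @ mu                             # pattern of mu^2 is inside that of mu

# closure under products: U(mu o Z1)U^H U(mu o Z2)U^H = U[(mu o Z1)(mu o Z2)]U^H,
# and the product of the two masked factors vanishes where mu is 0
Z1 = rng.standard_normal((n, n))
Z2 = rng.standard_normal((n, n))
P = (mu * Z1) @ (mu * Z2)
```
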
Given A ∈ C^{n×n}, and defined L_A as the minimizer of ‖A - U(µ ◦ Z)U^H‖_F, Z ∈ C^{n×n}, we have

L_A = U(µ ◦ (U^H A U))U^H.

Observe that if µ has a triangular structure, then the eigenvalues of L_A are µ_jj (U^H A U)_jj, j = 1, ..., n, i.e. null or the same as those of U diag((U^H A U)_jj) U^H.
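Checking the eigenvalue claim for a triangular µ (our choices: U = F and an upper triangular pattern; any unitary U works, since L_A is unitarily similar to the triangular matrix µ ◦ (U^H A U)):

```python
import numpy as np

n = 6
rng = np.random.default_rng(6)
A = rng.standard_normal((n, n))

U = np.fft.fft(np.eye(n), axis=0, norm="ortho")   # unitary Fourier matrix
mu = np.triu(np.ones((n, n)))                     # upper triangular 0-1 pattern

B = U.conj().T @ A @ U
LA = U @ (mu * B) @ U.conj().T

# L_A is similar to the triangular mu o B, so its eigenvalues are
# the diagonal entries mu_jj (U^H A U)_jj
lam = np.diag(mu) * np.diag(B)
```
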
In the particular case where U e_1 = (1/√n) e^{iθ} e and µ_11 = 1, the matrix µ ◦ (U^H A U) can be written as follows

µ ◦ (U^H A U) =
[ (1/n) e^T A e                       ·  µ_1j (1/√n) e^{-iθ} e^T A (U e_j)  · ]
[ ·                                                                           ]
[ µ_i1 (1/√n) e^{iθ} (U e_i)^H A e    ·  µ_ij (U e_i)^H A (U e_j)           · ]
[ ·                                                                           ].
Theorem (stoch by rows).
Assume U e_1 = (1/√n) e^{iθ} e and µ_11 = 1. If A e = e, then L_A e = e and

µ ◦ (U^H A U) =
[ 1   ·  µ_1j (1/√n) e^{-iθ} e^T A (U e_j)  · ]
[ 0                                           ]
[ ·   ·  µ_ij (U e_i)^H A (U e_j)           · ]
[ 0                                           ].

If moreover µ_1j = 0, j = 2, ..., n, then

µ ◦ (U^H A U) =
[ 1  0  ·  0                        ]
[ 0                                 ]
[ ·  ·  µ_ij (U e_i)^H A (U e_j)  · ]
[ 0                                 ],   e^T L_A = e^T.

(Thus choose µ ≥ e_1 e_1^T + e_1 e_j^T for some j in order to have e^T L_A ≠ e^T.)
If alternatively A is quasi-stochastic by rows, A e = e_≤, with 0 ≤ e_≤ ≤ e, then L_A e = (e^T e_≤/n) e whenever µ_i1 = 0 ∀ i ≥ 2.
proof: investigate the first column in the equality L_A U = U(µ ◦ (U^H A U)):

L_A e = (1/n)(e^T A e) e + Σ_{i=2}^n µ_i1 ((U e_i)^H A e) U e_i.
Theorem (stoch by columns).
Assume U e_1 = (1/√n) e^{iθ} e and µ_11 = 1. If e^T A = e^T, then e^T L_A = e^T and

µ ◦ (U^H A U) =
[ 1  0  ·  0                                                        ]
[ ·                                                                 ]
[ µ_i1 (1/√n) e^{iθ} (U e_i)^H A e  ·  µ_ij (U e_i)^H A (U e_j)   · ]
[ ·                                                                 ].

If moreover µ_i1 = 0, i = 2, ..., n, then

µ ◦ (U^H A U) =
[ 1  0  ·  0                        ]
[ 0                                 ]
[ ·  ·  µ_ij (U e_i)^H A (U e_j)  · ]
[ 0                                 ],   L_A e = e.

(Thus choose µ ≥ e_1 e_1^T + e_i e_1^T for some i in order to have L_A e ≠ e.)
If alternatively A is quasi-stochastic by columns, e^T A = e_≤^T, with 0 ≤ e_≤ ≤ e, then e^T L_A = (e^T e_≤/n) e^T whenever µ_1j = 0 ∀ j ≥ 2.
proof: investigate the first row in the equality U^H L_A = (µ ◦ (U^H A U)) U^H:

e^T L_A = (1/n)(e^T A e) e^T + Σ_{j=2}^n µ_1j (e^T A (U e_j))(U e_j)^H.
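A numerical check of the two theorems with U = F (θ = 0) and an upper triangular µ (so µ_11 = 1 and µ_i1 = 0 for i ≥ 2, our choice): for a stochastic by columns A both conclusions e^T L_A = e^T and L_A e = e hold.

```python
import numpy as np

n = 6
rng = np.random.default_rng(7)
A = rng.uniform(0, 1, (n, n))
A /= A.sum(axis=0)                                # e^T A = e^T

U = np.fft.fft(np.eye(n), axis=0, norm="ortho")   # U e_1 = e/sqrt(n)
mu = np.triu(np.ones((n, n)))                     # mu_11 = 1, mu_i1 = 0 for i >= 2

B = mu * (U.conj().T @ A @ U)
LA = U @ B @ U.conj().T
```
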
Note. If L_A and its eigenvalues do not fit our requirements, then we could introduce a perturbation of L_A, still in L, in its place, for example the matrix

M = L_A + U(µ ◦ ((U^H A U) ◦ ε))U^H,  ε ∈ R^{n×n}, |ε_ij| small.

Note that

‖A - M‖_F ≤ ‖A - L_A‖_F + √( Σ_{i,j : µ_ij = 1} |ε_ij|^2 |(U^H A U)_ij|^2 ).

But is the Frobenius norm the right norm?
Results on L_A more general than the ones in the REMARK, where L = {U d(z) V^H : z ∈ C^n}, and the ones in Theorem stoch by columns and Theorem stoch by rows, where L = {U(µ ◦ Z)U^H : Z ∈ C^{n×n}}:

Let U, V be n × n unitary matrices. Set L = {U(µ ◦ Z)V^H : Z ∈ C^{n×n}} where µ is a fixed matrix whose entries are 0 or 1 and ◦ is the entry by entry product.
Note that if µ = I, then L = {U d(z) V^H : z ∈ C^n}.
The space of matrices L is a vector subspace of C^{n×n}. In general it is not a matrix algebra.
Given A ∈ C^{n×n}, and defined L_A as the minimizer of ‖A - U(µ ◦ Z)V^H‖_F, Z ∈ C^{n×n}, we have

L_A = U(µ ◦ (U^H A V))V^H.

Observe that if µ has a diagonal structure, then the eigenvalues of L_A L_A^H are µ_jj |(U^H A V)_jj|^2, j = 1, ..., n.
Note that the equalities L_A V = U(µ ◦ (U^H A V)) and U^H L_A = (µ ◦ (U^H A V))V^H imply, respectively,

L_A (V e_1) = µ_11 (U^H A V)_11 U e_1 + Σ_{i=2}^n µ_i1 (U^H A V)_i1 U e_i,
(U e_1)^H L_A = µ_11 (U^H A V)_11 (V e_1)^H + Σ_{j=2}^n µ_1j (U^H A V)_1j (V e_j)^H.

In the following two theorems e_≤ can be an arbitrary vector with complex entries. However, we think to use the stated results for e_≤ = e (stochastic case) or for e_≤ with 0 ≤ [e_≤]_j ≤ 1 (quasi-stochastic case).
Theorem (stochastic by rows). V e_1 = (1/√n) e^{iθ} e & µ_11 = 1 ⇒

L_A e = ((U e_1)^H A e) U e_1 + Σ_{i=2}^n µ_i1 ((U e_i)^H A e) U e_i.

Thus
(i) if A e = e_≤ & U e_1 = (e^{iϕ}/‖e_≤‖) e_≤, then L_A e = e_≤. If, moreover, µ_1j = 0 ∀ j ≠ 1, then e_≤^H L_A = (‖e_≤‖^2/n) e^T.
(ii) if A e = e_≤ & U e_1 = (e^{iϕ}/√n) e, then L_A e = (e^T e_≤/n) e whenever µ_i1 = 0 ∀ i ≠ 1.
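Case (ii) can be checked numerically with U = V = F and µ = I (our choices), for a quasi-stochastic by rows A:

```python
import numpy as np

n = 6
rng = np.random.default_rng(8)
A = rng.uniform(0, 1, (n, n))
A /= A.sum(axis=1, keepdims=True)
A *= rng.uniform(0, 1, (n, 1))          # now A e = e_leq with 0 <= e_leq <= e
e_leq = A.sum(axis=1)

F = np.fft.fft(np.eye(n), axis=0, norm="ortho")   # F e_1 = e/sqrt(n)
mu = np.eye(n)                                    # mu_11 = 1, mu_i1 = 0 for i != 1

LA = F @ (mu * (F.conj().T @ A @ F)) @ F.conj().T
```
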
Theorem (stochastic by columns). U e_1 = (1/√n) e^{iϕ} e & µ_11 = 1 ⇒

e^T L_A = (e^T A (V e_1))(V e_1)^H + Σ_{j=2}^n µ_1j (e^T A (V e_j))(V e_j)^H.

Thus
(i) if e^T A = e_≤^H & V e_1 = (e^{iθ}/‖e_≤‖) e_≤, then e^T L_A = e_≤^H. If, moreover, µ_i1 = 0 ∀ i ≠ 1, then L_A e_≤ = (‖e_≤‖^2/n) e.
(ii) if e^T A = e_≤^H & V e_1 = (e^{iθ}/√n) e, then e^T L_A = (e_≤^H e/n) e^T whenever µ_1j = 0 ∀ j ≠ 1.