
Adaptive Spectral Galerkin Methods with Dynamic Marking

Claudio Canuto*, Ricardo H. Nochetto†, Rob Stevenson‡, Marco Verani§

* Dipartimento di Scienze Matematiche, Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy (claudio.canuto@polito.it)
† Department of Mathematics and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, USA (rhn@math.umd.edu)
‡ Korteweg-de Vries Institute for Mathematics, University of Amsterdam, P.O. Box 94248, 1090 GE Amsterdam, The Netherlands (r.p.stevenson@uva.nl)
§ MOX-Dipartimento di Matematica, Politecnico di Milano, P.zza Leonardo Da Vinci 32, I-20133 Milano, Italy

August 28, 2018

Abstract

The convergence and optimality theory of adaptive Galerkin methods is almost exclusively based on Dörfler marking. This entails a fixed parameter and leads to a contraction constant bounded below away from zero. For spectral Galerkin methods this is a severe limitation which affects performance. We present a dynamic marking strategy that allows for a super-linear relation between consecutive discretization errors, and show exponential convergence with linear computational complexity whenever the solution belongs to a Gevrey approximation class.

1 Introduction

The modern analysis of adaptive discretizations of partial differential equations aims at establishing rigorous results of convergence and optimality. The former results concern the convergence of the approximate solutions produced by the successive iterations of the adaptive algorithm towards the exact solution u, with an estimate of the error decay rate measured in an appropriate norm. On the other hand, optimality results compare the cardinality of the active set of basis functions used to expand the discrete solution to the minimal cardinality needed to approximate the exact solution with similar accuracy; this endeavor borrows ideas from Nonlinear Approximation theory. Confining ourselves in the sequel to second-order elliptic boundary value problems, this kind of analysis has been carried out first for wavelet discretizations [14, 18], then for h-type finite elements [19, 22, 13, 12, 15], with [19] dealing just with convergence, and more recently for spectral-type methods [6, 7, 10]; we refer to the surveys [5, 11, 20, 24]. In contrast, the state of the art for hp-type finite elements is still in evolution; see [16, 3] and the more recent paper [8], which includes optimality estimates.

For all these cases, convergence is proven to be linear, i.e., a certain expression controlling the error (a norm, or a combination of norm and estimator) contracts with some fixed parameter ρ < 1 from one iteration to the next one, e.g., $\|u - u_{k+1}\| \le \rho\, \|u - u_k\|$. This is typically achieved if the adaptation strategy is based on some form of Dörfler marking (or bulk chasing) with fixed parameter θ < 1: assuming that $\sum_{i\in\mathcal{I}} \eta_i^2$ is some additive error estimator at iteration k, one identifies a minimal subset $\mathcal{I}' \subset \mathcal{I}$ such that
$$\sum_{i\in\mathcal{I}'} \eta_i^2 \;\ge\; \theta^2 \sum_{i\in\mathcal{I}} \eta_i^2$$
and utilizes $\mathcal{I}'$ for the construction of the new discretization at iteration k + 1. For wavelet or h-type fem discretizations, optimality is guaranteed by performing cautious successive adaptations, i.e., by choosing a moderate value of θ, say $0 < \theta \le \theta_{\max} < 1$ [22]. This avoids the need of cleaning up the discrete solution from time to time by subjecting it to a coarsening stage. On the other hand, the resulting contraction factor ρ = ρ(θ) turns out to be bounded from below by a positive constant, say 0 < ρ_min ≤ ρ < 1 (related to the 'condition number' of the exact problem), regardless of the choice of θ. This entails a limitation on the speed of convergence for infinite-order methods [6, 7], but is not restrictive for fixed-order methods [22, 13].

It has been shown in [6] that such an obstruction can be avoided if a specific property of the differential operator holds, namely the so-called quasi-sparsity of the inverse of the associated stiffness matrix. Upon exploiting this information, a more aggressive marking strategy can be adopted, which judiciously enlarges the set $\mathcal{I}'$ coming out of Dörfler's stage. The resulting contraction factor ρ can now be made arbitrarily close to 0 by choosing θ arbitrarily close to 1. When a method of spectral type is used, one expects a fast (possibly exponentially fast) decay of the discretization error for smooth solutions. In such a situation, a slow convergence of the iterations of the adaptive algorithm would endanger the overall performance of the method; from this perspective, it is useful to be able to make the contraction factor as close to 0 as desired. Yet, linear convergence of the adaptive iterations is not enough to guarantee the optimality of the method. Let us explain why this occurs, and why a super-linear convergence is preferable, using the following idealized setting.

As customary in Nonlinear Approximation, we consider the best N-term approximation error $E_N(u)$ of the exact solution u, in a suitable norm, using combinations of at most N functions taken from a chosen basis. We prescribe a decay law of $E_N(u)$ as N increases, which classically for fixed-order approximations is algebraic and reads
$$\sup_N \; N^s E_N(u) < \infty, \qquad (1.1)$$
for some positive s. However, for infinite-order methods such as spectral approximations an exponential law is relevant that reads
$$\sup_N \; e^{\eta N^\alpha} E_N(u) < \infty \qquad (1.2)$$
for some η > 0 and α ∈ (0, 1], where α < 1 accommodates the inclusion of $C^\infty$-functions that are not analytic. This defines corresponding algebraic and exponential sparsity classes for the exact solution u. These classes are related to Besov and Gevrey regularity of u, respectively.

We now assume the ideal situation that at each iteration of our adaptive algorithm¹
$$\|u - u_k\| \eqsim N_k^{-s} \qquad\text{or}\qquad \|u - u_k\| \eqsim e^{-\eta N_k^\alpha}, \qquad (1.3)$$
where $N_k$ is the cardinality of the discrete solution $u_k$, i.e., the dimension of the approximation space activated at iteration k. We assume in addition that the error decays linearly from one iteration to the next, i.e., it satisfies precisely
$$\|u - u_{k+1}\| = \rho\, \|u - u_k\|. \qquad (1.4)$$

¹Throughout the paper, we write $A_k \lesssim B_k$ to indicate that $A_k$ can be bounded by a multiple of $B_k$, independently of the iteration counter k and other parameters which $A_k$ and $B_k$ may depend on; $A_k \eqsim B_k$ means $A_k \lesssim B_k$ and $B_k \lesssim A_k$.

If u belongs to a sparsity class of algebraic type, then one easily gets $N_k \eqsim \rho^{-k/s}$, i.e., cardinalities grow exponentially fast and
$$\Delta N_k := N_{k+1} - N_k \;\eqsim\; N_k \;\eqsim\; \|u - u_k\|^{-1/s},$$
i.e., the increment of cardinality between consecutive iterations is proportional to the current cardinality as well as to the error raised to the power −1/s. The important message stemming from this ideal setting is that for a practical adaptive algorithm one should be able to derive the estimates $\|u - u_{k+1}\| \le \rho\, \|u - u_k\|$ and $\Delta N_k \lesssim \|u - u_k\|^{-1/s}$, because they yield
$$N_n = \sum_{k=0}^{n-1} \Delta N_k \;\lesssim\; \sum_{k=0}^{n-1} \|u - u_k\|^{-1/s} \;\le\; \|u - u_n\|^{-1/s} \sum_{k=0}^{n-1} \rho^{(n-k)/s} \;\lesssim\; \|u - u_n\|^{-1/s}.$$
This geometric-series argument is precisely the strategy used in [22, 13] and gives an estimate similar to (1.1). The performance of a practical adaptive algorithm is thus quasi-optimal.
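A minimal numerical sketch of this geometric-series argument in the algebraic case may help fix ideas; the values of ρ, s and the exact-power cardinality law below are illustrative choices under the idealized assumptions (1.3)-(1.4), not quantities taken from the analysis above.

```python
# Idealized check of the geometric-series argument for an algebraic sparsity
# class: with ||u - u_k|| = rho^k and N_k = ||u - u_k||^(-1/s), the total
# cardinality N_n remains proportional to the final error raised to -1/s.
# rho and s are illustrative parameters, not taken from the paper.
rho, s, n_iter = 0.5, 2.0, 20

errors = [rho**k for k in range(n_iter + 1)]
N = [e**(-1.0 / s) for e in errors]              # idealized cardinalities, cf. (1.3)
increments = [N[k + 1] - N[k] for k in range(n_iter)]

total = sum(increments)                          # N_n as the telescoping sum of increments
bound = errors[n_iter]**(-1.0 / s)               # the quasi-optimal benchmark
print(total / bound)                             # stays bounded (here below 1) for any n_iter
```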

If u belongs to a sparsity class of exponential type, instead, the situation changes radically. In fact, assuming (1.3) and (1.4), one has $e^{-\eta N_k^\alpha} \eqsim \rho^k$, and so
$$\lim_{k\to\infty} k^{-1/\alpha} N_k = \big(|\log\rho|/\eta\big)^{1/\alpha},$$
i.e., the cardinality $N_k$ grows polynomially. For a practical adaptive algorithm, proving such a growth is very hard if not impossible. This obstruction has motivated the insertion of a coarsening stage in the adaptive algorithm presented in [6]. Coarsening removes the negligible components of the discrete solution possibly activated by the marking strategy and guarantees that the final cardinality is nearly optimal [14, 6], but it does not account for the workload to create $u_k$.

One of the key points of the present contribution is the observation that if the convergence of the adaptive algorithm is super-linear, then one is back to the simpler case of exponential growth of cardinalities, which is amenable to a sharper performance analysis. To see this, let us assume a super-linear relation between consecutive errors:

$$\|u - u_{k+1}\| = \|u - u_k\|^q \qquad (1.5)$$
for some q > 1. If additionally $u_k$ satisfies (1.3), then one infers that $e^{-\eta N_{k+1}^\alpha} \eqsim e^{-\eta q N_k^\alpha}$, whence
$$\lim_{k\to\infty} \frac{\Delta N_k}{N_k} = q^{1/\alpha} - 1, \qquad \lim_{k\to\infty} \frac{|\log \|u - u_k\||^{1/\alpha}}{N_k} = \eta^{1/\alpha},$$
the latter being just a consequence of (1.3). This suggests that the geometric-series argument may be invoked again in the optimality analysis of the adaptive algorithm.
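The limits above can be observed directly in a tiny idealized computation; the parameters q, η and α below are illustrative (q = 2 mimics the quadratic convergence established later for DYN-GAL), not values coming from the analysis.

```python
# Idealized super-linear setting (1.3)+(1.5): errors are raised to the power q
# at every step and cardinalities follow the exponential law
# N_k = (|log e_k| / eta)^(1/alpha). q, eta, alpha are illustrative parameters.
import math

q, eta, alpha = 2.0, 0.7, 0.5
errors, e = [], 0.5
for _ in range(6):
    errors.append(e)
    e = e**q                                       # super-linear relation (1.5)

N = [(abs(math.log(e)) / eta)**(1.0 / alpha) for e in errors]
for k in range(len(N) - 1):
    print((N[k + 1] - N[k]) / N[k], q**(1.0 / alpha) - 1.0)   # ratio tends to q^(1/alpha) - 1
```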

This ideal setting does not apply directly to our practical adaptive algorithm. We will be able to prove estimates that are consistent with the preceding derivation to some extent, namely
$$\|u - u_{k+1}\| \le \|u - u_k\|^q, \qquad \Delta N_k \le Q\, |\log \|u - u_k\||^{1/\bar\alpha},$$
with constants Q > 0 and $\bar\alpha \in (0, \alpha]$. Invoking $\|u - u_n\| \le \|u - u_k\|^{q^{n-k}}$, we then realize that
$$N_n = \sum_{k=0}^{n-1} \Delta N_k \;\le\; Q \sum_{k=0}^{n-1} \big|\log \|u - u_k\|\big|^{1/\bar\alpha} \;\le\; \frac{Q\, q^{1/\bar\alpha}}{q^{1/\bar\alpha} - 1}\, \big|\log \|u - u_n\|\big|^{1/\bar\alpha}.$$
Setting $\bar\eta := \Big(\frac{Q\, q^{1/\bar\alpha}}{q^{1/\bar\alpha} - 1}\Big)^{-\bar\alpha}$, we deduce the estimate
$$\sup_n \; e^{\bar\eta N_n^{\bar\alpha}}\, \|u - u_n\| \;\le\; 1,$$

which is similar to (1.2), albeit with different class parameters. The most important parameter is $\bar\alpha$. Its possible degradation relative to α is mainly caused by the fact that the residual, the only computable quantity accessible to our practical algorithm, belongs to a sparsity class with a main parameter generally smaller than that of the solution u. This perhaps unexpected property is typical of the exponential class and has been elucidated in [6].

In order for the marking strategy to guarantee super-linear convergence, one needs to adopt a dynamic choice of Dörfler's parameter θ, which pushes its value towards 1 as the iterations proceed. We accomplish this requirement by equating the quantity $1 - \theta_k^2$ to some function of the dual norm of the residual $r_k$, which is monotonically increasing and vanishing at the origin. This defines our dynamic marking strategy. The order of the root at the origin dictates the exponent q in the super-linear convergence estimate of our adaptive algorithm.

The paper is organized as follows. In Sect. 2 we introduce the model elliptic problem and its spectral Galerkin approximation based on either multi-dimensional Fourier or (modified) Legendre expansions. In particular, we highlight properties of the resulting stiffness matrix that will be fundamental in the sequel. We present the adaptive algorithm in Sect. 3, first for the static marking (θ fixed) and later for the dynamic marking (θ tending towards 1); super-linear convergence is proven. With the optimality analysis in mind, we next recall in Sect. 4 the definition and crucial properties of a family of sparsity classes of exponential type, related to Gevrey regularity of the solution, and we investigate how the sparsity class of the Galerkin residual deteriorates relative to that of the exact solution. Finally, in Sect. 5 we relate the cardinality of the adaptive discrete solutions, as well as the workload needed to compute them, to the expected accuracy of the approximation. Our analysis confirms that the proposed dynamic marking strategy avoids any form of coarsening, while providing exponential convergence with linear computational complexity, assuming optimal linear solvers.

2 Model Elliptic Problem and Galerkin Methods

Let d ≥ 1 and consider the following elliptic PDE in a d-dimensional rectangular domain Ω with periodic or homogeneous Dirichlet boundary conditions:
$$Lu = -\nabla\cdot(\nu\nabla u) + \sigma u = f \quad\text{in } \Omega, \qquad (2.1)$$
where ν and σ are sufficiently smooth real coefficients satisfying $0 < \nu_* \le \nu(x) \le \nu^* < \infty$ and $0 < \sigma_* \le \sigma(x) \le \sigma^* < \infty$ in Ω; let us set
$$\alpha_* = \min(\nu_*, \sigma_*) \qquad\text{and}\qquad \alpha^* = \max(\nu^*, \sigma^*).$$
Let V be equal to $H^1_0(\Omega)$ or $H^1_p(\Omega)$ depending on the boundary conditions and denote by $V^*$ its dual space. We formulate (2.1) variationally as
$$u \in V:\quad a(u, v) = \langle f, v\rangle \qquad \forall v \in V, \qquad (2.2)$$
where $a(u, v) = \int_\Omega \nu \nabla u\cdot\nabla\bar v + \int_\Omega \sigma u\bar v$ (bar indicating as usual the complex conjugate). We denote by $|||v||| = \sqrt{a(v, v)}$ the energy norm of any v ∈ V, which satisfies
$$\sqrt{\alpha_*}\, \|v\|_V \;\le\; |||v||| \;\le\; \sqrt{\alpha^*}\, \|v\|_V. \qquad (2.3)$$

2.1 Riesz Basis

We start with an abstract formulation which encompasses the two examples of interest: trigonometric functions and Legendre polynomials. Let $\phi = \{\phi_k : k \in \mathcal{K}\}$ be a Riesz basis of V. Thus, we assume the following relation between a function $v = \sum_{k\in\mathcal{K}} \hat v_k \phi_k \in V$ and its coefficients:
$$\|v\|_V^2 \;\simeq\; \sum_{k\in\mathcal{K}} |\hat v_k|^2 d_k \;=:\; \|v\|_\phi^2 \qquad (2.4)$$
for suitable weights $d_k > 0$. Correspondingly, any element $f \in V^*$ can be expanded along the dual basis $\phi^* = \{\phi_k^*\}$ as $f = \sum_{k\in\mathcal{K}} \hat f_k \phi_k^*$, with $\hat f_k = \langle f, \phi_k\rangle$, yielding the dual norm representation
$$\|f\|_{V^*}^2 \;\simeq\; \sum_{k\in\mathcal{K}} |\hat f_k|^2 d_k^{-1} \;=:\; \|f\|_{\phi^*}^2. \qquad (2.5)$$
For future reference, we introduce the vectors $\mathbf v = (\hat v_k d_k^{1/2})_{k\in\mathcal{K}}$ and $\mathbf f = (\hat f_k d_k^{-1/2})_{k\in\mathcal{K}}$ as well as the constants $\beta_* \le \beta^*$ of the norm equivalence in (2.4):
$$\beta_*\, \|v\|_V \;\le\; \|v\|_\phi = \|\mathbf v\|_{\ell^2} \;\le\; \beta^*\, \|v\|_V \qquad \forall v \in V. \qquad (2.6)$$
This implies
$$\frac{1}{\beta^*}\, \|f\|_{V^*} \;\le\; \|f\|_{\phi^*} = \|\mathbf f\|_{\ell^2} \;\le\; \frac{1}{\beta_*}\, \|f\|_{V^*} \qquad \forall f \in V^*. \qquad (2.7)$$

The two key examples to keep in mind are the trigonometric basis and tensor products of the Babuška-Shen basis. We discuss them briefly below.

Trigonometric basis. Let $\Omega = (0, 2\pi)^d$ and let the trigonometric basis be $\phi_k(x) = \frac{1}{(2\pi)^{d/2}}\, e^{i k\cdot x}$ for any $k \in \mathcal{K} = \mathbb{Z}^d$ and $x \in \Omega$. Any function $v \in L^2(\Omega)$ can be expanded in terms of $\{\phi_k\}_{k\in\mathbb{Z}^d}$ as follows:
$$v = \sum_k \hat v_k \phi_k, \qquad \hat v_k = \langle v, \phi_k\rangle, \qquad \|v\|_{L^2(\Omega)}^2 = \sum_k |\hat v_k|^2. \qquad (2.8)$$
The space $V := H^1_p(\Omega)$ of periodic functions with square integrable weak gradient can now be easily characterized as the subspace of those $v \in L^2(\Omega)$ for which
$$\|v\|_V^2 = \|v\|_{H^1_p(\Omega)}^2 = \sum_k |\hat V_k|^2 < \infty \qquad \text{(where } \hat V_k := \hat v_k d_k^{1/2}, \text{ with } d_k := 1 + |k|^2\text{)}.$$
This induces an isomorphism between $H^1_p(\Omega)$ and $\ell^2(\mathbb{Z}^d)$: for each $v \in H^1_p(\Omega)$ let $\mathbf v = (\hat V_k)_{k\in\mathcal{K}} \in \ell^2(\mathbb{Z}^d)$ and note that $\|v\|_{H^1_p(\Omega)} = \|\mathbf v\|_{\ell^2}$. Likewise, the dual space $H^{-1}_p(\Omega) = (H^1_p(\Omega))'$ is characterized as the space of those functionals f for which
$$\|f\|_{V^*}^2 = \|f\|_{H^{-1}_p(\Omega)}^2 = \sum_k |\hat F_k|^2 \qquad \text{with } \hat F_k := \hat f_k d_k^{-1/2}.$$
We also have an isomorphism between $H^{-1}_p(\Omega)$ and $\ell^2(\mathbb{Z}^d)$ upon setting $\mathbf f = (\hat F_k)_{k\in\mathcal{K}}$ for $f \in H^{-1}_p(\Omega)$ and realizing that $\|f\|_{H^{-1}_p(\Omega)} = \|\mathbf f\|_{\ell^2}$.
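A short numerical sketch of this norm equivalence in one dimension may be helpful; the test function below is an arbitrary illustrative choice, and the coefficients are obtained by the FFT rather than by exact integration.

```python
# Check of the H^1_p norm equivalence for the trigonometric basis
# phi_k = (2*pi)^(-1/2) exp(i k x) in d = 1: the weighted l2 norm of the
# coefficients, with weights d_k = 1 + |k|^2, reproduces ||v||^2 + ||v'||^2.
# The function v below is an illustrative example.
import numpy as np

M = 256
x = 2 * np.pi * np.arange(M) / M
v = np.sin(3 * x) + 0.5 * np.cos(5 * x)

vhat = np.fft.fft(v) * (2 * np.pi / M) / np.sqrt(2 * np.pi)   # coefficients w.r.t. phi_k
k = np.fft.fftfreq(M, d=1.0 / M)                              # integer wave numbers
norm_sq_from_coeffs = np.sum((1 + k**2) * np.abs(vhat)**2)

exact = 16.5 * np.pi       # ||v||_{L2}^2 + ||v'||_{L2}^2 computed by hand for this v
print(norm_sq_from_coeffs, exact)                             # the two values agree
```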

Babuška-Shen basis. Let us start with the one-dimensional case d = 1. Set $I = (-1, 1)$, $V := H^1_0(I)$, and let $L_k(x)$, $k \ge 0$, stand for the k-th Legendre orthogonal polynomial on I, which satisfies $\deg L_k = k$, $L_k(1) = 1$ and
$$\int_I L_k(x) L_m(x)\, dx = \frac{2}{2k+1}\, \delta_{km}, \qquad m \ge 0. \qquad (2.9)$$
The natural modal basis in $H^1_0(I)$ is the Babuška-Shen basis (BS basis), whose elements are defined as
$$\eta_k(x) = \sqrt{\frac{2k-1}{2}} \int_x^1 L_{k-1}(s)\, ds = \frac{1}{\sqrt{4k-2}}\big(L_{k-2}(x) - L_k(x)\big), \qquad k \ge 2. \qquad (2.10)$$
The basis elements satisfy $\deg \eta_k = k$ and
$$(\eta_k, \eta_m)_{H^1_0(I)} = \int_I \eta_k'(x)\, \eta_m'(x)\, dx = \delta_{km}, \qquad k, m \ge 2, \qquad (2.11)$$
i.e., they form an orthonormal basis for the $H^1_0(I)$-inner product. Equivalently, the (semi-infinite) stiffness matrix $S_\eta$ of the Babuška-Shen basis with respect to this inner product is the identity matrix.
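The orthonormality (2.11) is easy to verify numerically from the Legendre representation of the derivatives; the sketch below uses Gauss-Legendre quadrature and a small range of indices chosen only for illustration.

```python
# Verify (2.11): the Babuska-Shen functions are H^1_0(I)-orthonormal.
# Since eta_k'(x) = -sqrt((2k-1)/2) * L_{k-1}(x), the entries of S_eta are
# integrals of products of scaled Legendre polynomials; quadrature with 30
# nodes integrates them exactly for the small indices used here.
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

def eta_prime(k):
    return lambda x: -np.sqrt((2 * k - 1) / 2.0) * Legendre.basis(k - 1)(x)

nodes, weights = leggauss(30)
ks = list(range(2, 8))                      # illustrative index range
S_eta = np.array([[np.sum(weights * eta_prime(k)(nodes) * eta_prime(m)(nodes))
                   for m in ks] for k in ks])
print(np.allclose(S_eta, np.eye(len(ks))))  # True: S_eta is the identity
```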

We now consider, for simplicity, the two-dimensional case d = 2, since the case d > 2 is similar. Let $\Omega = (-1, 1)^2$, $V = H^1_0(\Omega)$, and consider the tensorized Babuška-Shen basis, whose elements are defined as
$$\eta_k(x) = \eta_{k_1}(x_1)\, \eta_{k_2}(x_2), \qquad k_1, k_2 \ge 2, \qquad (2.12)$$
where we set $k = (k_1, k_2)$ and $x = (x_1, x_2)$; indices vary in the set $\mathcal{K} = \{k \in \mathbb{N}^2 : k_i \ge 2 \text{ for } i = 1, 2\}$, which is ordered 'à la Cantor' by increasing total degree $k_{\rm tot} = k_1 + k_2$ and, for the same total degree, by increasing $k_1$. The tensorized BS basis is no longer orthogonal, since
$$(\eta_k, \eta_m)_{H^1_0(\Omega)} = (\eta_{k_1}, \eta_{m_1})_{H^1_0(I)} (\eta_{k_2}, \eta_{m_2})_{L^2(I)} + (\eta_{k_1}, \eta_{m_1})_{L^2(I)} (\eta_{k_2}, \eta_{m_2})_{H^1_0(I)}, \qquad (2.13)$$
whence $(\eta_k, \eta_m)_{H^1_0(\Omega)} \ne 0$ if and only if $k_1 = m_1$ and $k_2 - m_2 \in \{-2, 0, 2\}$, or $k_2 = m_2$ and $k_1 - m_1 \in \{-2, 0, 2\}$. Obviously, we cannot have a Parseval representation of the $H^1_0(\Omega)$-norm of $v = \sum_{k\in\mathcal{K}} \hat v_k \eta_k$ in terms of the coefficients $\hat v_k$. With the aim of getting (2.4), we follow [10] and we first perform the orthonormalization of the BS basis via a Gram-Schmidt procedure. This allows us to build a sequence of functions
$$\Phi_k = \sum_{m \le k} g_{mk}\, \eta_m, \qquad (2.14)$$
such that $g_{kk} \ne 0$ and
$$(\Phi_k, \Phi_m)_{H^1_0(\Omega)} = \delta_{km} \qquad \forall\, k, m \in \mathcal{K}.$$
We will refer to the collection $\Phi := \{\Phi_k : k \in \mathcal{K}\}$ as the orthonormal Babuška-Shen basis (OBS basis), for which the associated stiffness matrix $S_\Phi$ with respect to the $H^1_0(\Omega)$-inner product is the identity matrix. Equivalently, if $G = (g_{mk})$ is the upper triangular matrix which collects the coefficients generated by the Gram-Schmidt algorithm above, one has
$$G^T S_\eta G = S_\Phi = I, \qquad (2.15)$$
that is, the validity of (2.6) with $d_k = 1$. However, unlike $S_\eta$, which is very sparse, the upper triangular matrix G is full; in view of this, we next apply a thresholding procedure to wipe out a significant portion of the non-zero entries sitting in the leftmost columns of G. This leads to a modified basis whose computational efficiency is quantitatively improved, without significantly deteriorating the properties of the OBS basis. To be more precise, we use the following notation: $G_t$ indicates the matrix obtained from G by setting to zero a certain finite set of off-diagonal entries, so that in particular $\mathrm{diag}(G_t) = \mathrm{diag}(G)$; correspondingly, $E := G_t - G$ is the matrix measuring the truncation quality, for which $\mathrm{diag}(E) = 0$. Finally, we introduce the matrix
$$S_\phi = G_t^T S_\eta G_t \qquad (2.16)$$
which we interpret as the stiffness matrix associated to the modified BS basis defined in analogy to (2.14) as
$$\phi_k = \sum_{m \in \mathcal{M}_t(k)} g_{mk}\, \eta_m \qquad (2.17)$$
where $\mathcal{M}_t(k) = \{m \le k : E_{mk} = 0\}$. This forms a new basis in $H^1_0(\Omega)$ (because $k \in \mathcal{M}_t(k)$ and $g_{kk} \ne 0$). We will term it a nearly-orthonormal Babuška-Shen basis (NOBS basis). Note that only the basis functions $\phi_k$ having total degree not exceeding a certain value, say p, may be affected by the compression, while all the others coincide with the corresponding orthonormal basis functions $\Phi_k$ defined in (2.14).

If $D_\phi = \mathrm{diag}\, S_\phi$, then for any value of p there are strategies to build $G_t$ (depending on p) such that the eigenvalues λ of
$$S_\phi x = \lambda D_\phi x \qquad (2.18)$$
are close to one and bounded from above and away from 0, independently of p [10]. This guarantees the validity for the NOBS basis of (2.4), with $d_k$ equal to the diagonal elements of the matrix $D_\phi$, and of (2.6) for a suitable choice of constants $\beta_*$, $\beta^*$ depending on the eigenvalues of (2.18) (see [10] for more details).
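The algebra behind (2.14)-(2.18) can be sketched with plain matrix factorizations: the Gram-Schmidt coefficients form the inverse of the upper triangular Cholesky factor of $S_\eta$. The matrix and the drop rule below are illustrative stand-ins, not the construction analyzed in [10].

```python
# Sketch of the OBS/NOBS construction at the matrix level: with S_eta = R^T R
# (R upper triangular), G = R^{-1} satisfies G^T S_eta G = I (the OBS basis);
# thresholding G gives G_t, S_phi = G_t^T S_eta G_t, and the generalized
# eigenvalues of S_phi x = lambda D_phi x should remain close to 1.
# S_eta here is a random SPD stand-in and the drop tolerance is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 12
B = rng.standard_normal((n, n))
S_eta = B @ B.T + n * np.eye(n)

R = np.linalg.cholesky(S_eta).T              # S_eta = R^T R, R upper triangular
G = np.linalg.inv(R)                         # exact orthonormalization: G^T S_eta G = I

G_t = np.where(np.abs(G) > 1e-3 * np.abs(G).max(), G, 0.0)
np.fill_diagonal(G_t, np.diag(G))            # keep diag(G_t) = diag(G)
S_phi = G_t.T @ S_eta @ G_t
D_half_inv = np.diag(1.0 / np.sqrt(np.diag(S_phi)))

lams = np.linalg.eigvalsh(D_half_inv @ S_phi @ D_half_inv)   # eigenvalues of (2.18)
print(lams.min(), lams.max())                # close to 1 and bounded away from 0
```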

2.2 Infinite Dimensional Algebraic Problem

Let us identify the solution $u = \sum_k \hat u_k \phi_k$ of Problem (2.1) with the vector $\mathbf u = (\hat u_k)_{k\in\mathcal{K}}$ of its coefficients with respect to the basis $\{\phi_k\}_{k\in\mathcal{K}}$. Similarly, let us identify the right-hand side f with the vector $\mathbf f = (\hat f_\ell)_{\ell\in\mathcal{K}}$ of its dual coefficients. Finally, let us introduce the bi-infinite, symmetric and positive-definite stiffness matrix
$$\mathbf A = (a_{\ell,k})_{\ell,k\in\mathcal{K}} \qquad\text{with}\qquad a_{\ell,k} = a(\phi_k, \phi_\ell). \qquad (2.19)$$
Then, Problem (2.1) can be equivalently written as
$$\mathbf A \mathbf u = \mathbf f, \qquad (2.20)$$
where, thanks to (2.3) and the norm equivalences (2.6)-(2.7), $\mathbf A$ defines a bounded invertible operator in $\ell^2(\mathcal{K})$.

Decay Properties of A and A⁻¹. The decay of the entries of $\mathbf A$ away from the diagonal depends on the regularity of the coefficients ν and σ of L. If ν and σ are real analytic in a neighborhood of Ω, then $a_{k,m}$ decays exponentially away from the diagonal [6, 7, 10]: there exist parameters $c_L, \eta_L > 0$ such that
$$|a_{k,m}| \le c_L \exp(-\eta_L |k - m|) \qquad \forall k, m \in \mathcal{K}; \qquad (2.21)$$
we then say that $\mathbf A$ belongs to the exponential class $\mathcal{D}_e(\eta_L)$; in particular $\mathbf A$ is quasi-sparse. This justifies the symmetric truncation $\mathbf A_J$ of $\mathbf A$ with parameter J, defined as $(\mathbf A_J)_{\ell,k} = a_{\ell,k}$ if $|\ell - k| \le J$ and $(\mathbf A_J)_{\ell,k} = 0$ otherwise, which satisfies [6, 7, 10]
$$\|\mathbf A - \mathbf A_J\| \le C_A (J + 1)^{d-1} e^{-\eta_L J} \qquad (2.22)$$
for some $C_A > 0$ depending only on $c_L$.

Most notably, the inverse matrix $\mathbf A^{-1}$ is also quasi-sparse [6, 7, 10]. Precisely, $\mathbf A^{-1} \in \mathcal{D}_e(\bar\eta_L)$ for some $\bar\eta_L \in (0, \eta_L]$ and $\bar c_L$ only dependent on $c_L$ and $\eta_L$. Thus, there exists an explicit constant $C_{A^{-1}}$ (depending only on $c_L$ and $\eta_L$) such that the symmetric truncation $(\mathbf A^{-1})_J$ of $\mathbf A^{-1}$ satisfies
$$\|\mathbf A^{-1} - (\mathbf A^{-1})_J\| \le C_{A^{-1}} (J + 1)^{d-1} e^{-\bar\eta_L J} \le C_{A^{-1}} e^{-\tilde\eta_L J}, \qquad (2.23)$$
for a suitable exponent $\tilde\eta_L < \bar\eta_L$.
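The effect of the symmetric truncation in (2.22) is easy to visualize on a synthetic matrix with the prescribed exponential decay; the matrix below is an illustrative member of the class $\mathcal{D}_e(\eta_L)$, not an actual stiffness matrix.

```python
# Symmetric truncation A_J of an exponentially decaying matrix: entries with
# |l - k| > J are dropped and the spectral norm of the error decays like
# exp(-eta_L * J), in the spirit of (2.22). Synthetic data for illustration.
import numpy as np

n, eta_L = 200, 0.8
idx = np.arange(n)
dist = np.abs(idx[:, None] - idx[None, :])
A = np.exp(-eta_L * dist)                    # |a_{l,k}| = exp(-eta_L |l - k|), d = 1

for J in (5, 10, 20):
    A_J = np.where(dist <= J, A, 0.0)
    print(J, np.linalg.norm(A - A_J, 2), np.exp(-eta_L * J))
```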

Galerkin Method. Given any finite index set $\Lambda \subset \mathcal{K}$, we define the subspace $V_\Lambda = \mathrm{span}\{\phi_k \mid k \in \Lambda\}$ of V; we set $|\Lambda| = \mathrm{card}\,\Lambda$, so that $\dim V_\Lambda = |\Lambda|$. If $v \in V$ admits the expansion $v = \sum_{k\in\mathcal{K}} \hat v_k \phi_k$, then we define its projection $P_\Lambda v$ upon $V_\Lambda$ by setting $P_\Lambda v := \sum_{k\in\Lambda} \hat v_k \phi_k$. Similarly, we define the subspace $V^*_\Lambda = \mathrm{span}\{\phi^*_k \mid k \in \Lambda\}$ of $V^*$. If f admits an expansion $f = \sum_{k\in\mathcal{K}} \hat f_k \phi^*_k$, then we define its projection $P^*_\Lambda f$ onto $V^*_\Lambda$ upon setting $P^*_\Lambda f := \sum_{k\in\Lambda} \hat f_k \phi^*_k$.

Given any finite $\Lambda \subset \mathcal{K}$, the Galerkin approximation of (2.1) is defined as
$$u_\Lambda \in V_\Lambda:\quad a(u_\Lambda, v_\Lambda) = \langle f, v_\Lambda\rangle \qquad \forall v_\Lambda \in V_\Lambda. \qquad (2.24)$$
Let $\mathbf u_\Lambda$ be the vector collecting the coefficients of $u_\Lambda$ indexed in Λ; let $\mathbf f_\Lambda$ be the analogous restriction for the vector of the coefficients of f. Finally, denote by $\mathbf R_\Lambda$ the matrix that restricts a vector indexed in $\mathcal{K}$ to the portion indexed in Λ, so that $\mathbf R_\Lambda^H$ is the corresponding extension matrix. If
$$\mathbf A_\Lambda := \mathbf R_\Lambda \mathbf A \mathbf R_\Lambda^H, \qquad (2.25)$$
then problem (2.24) can be equivalently written as
$$\mathbf A_\Lambda \mathbf u_\Lambda = \mathbf f_\Lambda. \qquad (2.26)$$
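In matrix terms the restriction (2.25)-(2.26) is just sub-indexing; the sketch below uses a random SPD stand-in for the stiffness matrix and an arbitrary index set, and also checks that the residual of the Galerkin solution vanishes on Λ (the property $P^*_\Lambda r(u_\Lambda) = 0$ used right below).

```python
# Galerkin system on a finite index set Lambda: A_Lambda = R_Lambda A R_Lambda^H
# is a plain submatrix, and the residual of its solution vanishes on Lambda.
# A, f and Lambda are synthetic illustrative data.
import numpy as np

rng = np.random.default_rng(1)
n = 50
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)                  # SPD stand-in for the stiffness matrix
f = rng.standard_normal(n)

Lam = np.array([0, 3, 4, 7, 12])             # illustrative active set Lambda
A_Lam = A[np.ix_(Lam, Lam)]                  # A_Lambda = R_Lambda A R_Lambda^H
u_Lam = np.linalg.solve(A_Lam, f[Lam])       # Galerkin coefficients, (2.26)

r = f - A[:, Lam] @ u_Lam                    # residual coefficients of the Galerkin solution
print(np.allclose(r[Lam], 0.0))              # True: the residual vanishes on Lambda
```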

For any $w \in V_\Lambda$, we define the residual $r(w) \in V^*$ as
$$r(w) = f - Lw = \sum_{k\in\mathcal{K}} \hat r_k(w)\, \phi^*_k, \qquad\text{where}\quad \hat r_k(w) = \langle f - Lw, \phi_k\rangle = \langle f, \phi_k\rangle - a(w, \phi_k).$$
The definition (2.24) of $u_\Lambda$ is equivalent to the condition $P^*_\Lambda r(u_\Lambda) = 0$, i.e., $\hat r_k(u_\Lambda) = 0$ for every $k \in \Lambda$. By the continuity and coercivity of the bilinear form, one has
$$\frac{1}{\alpha^*}\, \|r(u_\Lambda)\|_{V^*} \;\le\; \|u - u_\Lambda\|_V \;\le\; \frac{1}{\alpha_*}\, \|r(u_\Lambda)\|_{V^*}, \qquad (2.27)$$
which in view of (2.3) and (2.7) can be rephrased as
$$\frac{\beta_*}{\sqrt{\alpha^*}}\, \|r(u_\Lambda)\|_{\phi^*} \;\le\; |||u - u_\Lambda||| \;\le\; \frac{\beta^*}{\sqrt{\alpha_*}}\, \|r(u_\Lambda)\|_{\phi^*}. \qquad (2.28)$$
Therefore, if $(\hat r_k(u_\Lambda))_{k\in\mathcal{K}}$ are the coefficients of $r(u_\Lambda)$ with respect to the dual basis $\phi^*$, the quantity
$$\|r(u_\Lambda)\|_{\phi^*} = \Big(\sum_{k\notin\Lambda} |\hat R_k(u_\Lambda)|^2\Big)^{1/2} \qquad\text{with}\quad \hat R_k(u_\Lambda) = \hat r_k(u_\Lambda)\, d_k^{-1/2}$$
is an error estimator from above and from below. However, this quantity is not computable because it involves infinitely many terms. We discuss feasible versions in [6, 7, 10] but not here.

Equivalent Formulation of the Galerkin Problem. For future reference, we now rewrite the Galerkin problem (2.24) in an equivalent (infinite-dimensional) manner. Let

$P_\Lambda : \ell^2(\mathcal{K}) \to \ell^2(\mathcal{K})$ be the projector operator defined as
$$(P_\Lambda \mathbf v)_\lambda := \begin{cases} v_\lambda & \text{if } \lambda \in \Lambda, \\ 0 & \text{if } \lambda \notin \Lambda. \end{cases}$$
Note that $P_\Lambda$ can be represented as a diagonal bi-infinite matrix whose diagonal elements are 1 for indices belonging to Λ, and zero otherwise. We set $Q_\Lambda := I - P_\Lambda$ and introduce the bi-infinite matrix $\hat{\mathbf A}_\Lambda := P_\Lambda \mathbf A P_\Lambda + Q_\Lambda$, which is equal to $\mathbf A_\Lambda$ for indices in Λ and to the identity matrix otherwise. The definitions of the projectors $P_\Lambda$ and $Q_\Lambda$ yield the following property:

Furthermore, the constants $C_{\hat A_\Lambda}$ and $C_{(\hat A_\Lambda)^{-1}}$ which appear in the inequalities (2.22) and (2.23) for $\hat{\mathbf A}_\Lambda$ can be bounded uniformly in Λ, since in turn they can be bounded in terms of $\eta_L$ and $c_L$, respectively.

Now, let us consider the following extended Galerkin problem: find $\hat{\mathbf u} \in \ell^2$ such that
$$\hat{\mathbf A}_\Lambda \hat{\mathbf u} = P_\Lambda \mathbf f. \qquad (2.30)$$
Let $\mathbf u_\Lambda$ be the Galerkin solution to (2.26); then, it is easy to check that $\hat{\mathbf u} = \mathbf R_\Lambda^H \mathbf u_\Lambda$.

3 Adaptive Spectral Galerkin Method

In this section we present our adaptive spectral Galerkin method, named DYN-GAL, which is based on a new marking strategy, namely dynamic marking. In Section 3.1 we recall the enriched Dörfler marking strategy, introduced in [6], which represents an enhancement of the classic Dörfler marking strategy. In Section 3.2 we introduce the dynamic marking strategy, present DYN-GAL and prove its quadratic convergence.

3.1 Static Dörfler Marking

Fix any θ ∈ (0, 1) and set $\Lambda_0 = \emptyset$, $u_{\Lambda_0} = 0$. For n = 0, 1, ..., assume that $\Lambda_n$ and $u_n := u_{\Lambda_n} \in V_{\Lambda_n}$ and $r_n := r(u_n) = f - Lu_n$ are already computed, and choose $\Lambda_{n+1} := \Lambda_n \cup \partial\Lambda_n$, where the set $\partial\Lambda_n$ is built by a two-step procedure that we call E-DÖRFLER, for enriched Dörfler:

$\partial\Lambda_n$ = E-DÖRFLER($\Lambda_n$, θ):
    $\widetilde{\partial\Lambda}_n$ = DÖRFLER($r_n$, θ)
    $\partial\Lambda_n$ = ENRICH($\widetilde{\partial\Lambda}_n$, J)

The first step is the usual Dörfler marking with parameter θ:
$$\|P^*_{\widetilde{\partial\Lambda}_n} r_n\|_{\phi^*} = \|P^*_{\tilde\Lambda_{n+1}} r_n\|_{\phi^*} \ge \theta\, \|r_n\|_{\phi^*} \qquad\text{or}\qquad \sum_{k\in\widetilde{\partial\Lambda}_n} |\hat R_k(u_n)|^2 \ge \theta^2 \sum_{k\in\mathcal{K}} |\hat R_k(u_n)|^2, \qquad (3.1)$$
with $\tilde\Lambda_{n+1} = \Lambda_n \cup \widetilde{\partial\Lambda}_n$. This also reads
$$\|r_n - P^*_{\tilde\Lambda_{n+1}} r_n\|_{\phi^*} \le \sqrt{1 - \theta^2}\, \|r_n\|_{\phi^*} \qquad (3.2)$$
and can be implemented by rearranging the coefficients $\hat R_k(u_n)$ in decreasing order of modulus and picking the largest ones (greedy approach). However, this is only an idealized algorithm, because the number of coefficients $\hat R_k(u_n)$ is infinite. This marking is known to yield a contraction property between $u_n$ and the Galerkin solution $\tilde u_{n+1} \in V_{\tilde\Lambda_{n+1}}$ of the form
$$|||u - \tilde u_{n+1}||| \le \rho(\theta)\, |||u - u_n|||,$$
with $\rho(\theta) = \sqrt{1 - \frac{\alpha_*}{\alpha^*}\theta^2}$ [6, 7]. When $\alpha_* < \alpha^*$ we see that, in contrast to (3.2), ρ(θ) is bounded below away from 0 by $\sqrt{1 - \frac{\alpha_*}{\alpha^*}}$.
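On a finite coefficient vector the (idealized) Dörfler step is just a greedy selection; the sketch below operates on an explicit small vector chosen for illustration.

```python
# Idealized Dorfler marking (3.1) on a finite residual coefficient vector:
# sort by modulus and keep the smallest set capturing a fraction theta^2
# of the total squared (dual) norm.
import numpy as np

def dorfler(R, theta):
    """Indices of a minimal set with sum |R_k|^2 >= theta^2 * sum_k |R_k|^2."""
    order = np.argsort(np.abs(R))[::-1]              # largest coefficients first
    cumulative = np.cumsum(np.abs(R[order])**2)
    m = np.searchsorted(cumulative, theta**2 * cumulative[-1]) + 1
    return order[:m]

R = np.array([3.0, 0.1, -2.0, 0.5, 0.05, 1.0])       # illustrative coefficients R_k(u_n)
print(dorfler(R, theta=0.9))                         # -> indices of the marked set
```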

The second step of E-DÖRFLER is meant to remedy this situation and hinges on the a priori structure of $\mathbf A^{-1}$ already alluded to in §2.2. The goal is to augment the set $\widetilde{\partial\Lambda}_n$ to $\partial\Lambda_n$ judiciously. This is contained in the following proposition (see [6]), whose proof is reported here for completeness.

Proposition 3.1 (enrichment). Let $\widetilde{\partial\Lambda}_n$ = DÖRFLER($r_n$, θ), and let J = J(θ) > 0 satisfy
$$C_{A^{-1}}\, e^{-\tilde\eta_L J} \;\le\; \sqrt{\frac{1 - \theta^2}{\alpha_*\, \alpha^*}}, \qquad (3.3)$$
where $C_{A^{-1}}$ and $\tilde\eta_L$ are defined in (2.23). Let $\partial\Lambda_n$ = ENRICH($\widetilde{\partial\Lambda}_n$, J) be built as follows:
$$\partial\Lambda_n := \big\{ k \in \mathcal{K} : \text{there exists } \ell \in \widetilde{\partial\Lambda}_n \text{ such that } |k - \ell| \le J \big\}.$$
Then, for $\Lambda_{n+1} = \Lambda_n \cup \partial\Lambda_n$, the Galerkin solution $u_{n+1} \in V_{\Lambda_{n+1}}$ satisfies
$$|||u - u_{n+1}||| \le \bar\rho(\theta)\, |||u - u_n||| \qquad (3.4)$$
with
$$\bar\rho(\theta) = 2\, \frac{\beta^* \sqrt{\alpha^*}}{\beta_* \sqrt{\alpha_*}}\, \sqrt{1 - \theta^2}. \qquad (3.5)$$

Proof. Let $g_n := P^*_{\widetilde{\partial\Lambda}_n} r_n = P^*_{\tilde\Lambda_{n+1}} r_n$ which, according to (3.2), satisfies
$$\|r_n - g_n\|_{\phi^*} \le \sqrt{1 - \theta^2}\, \|r_n\|_{\phi^*}.$$
Let $w_n \in V$ be the solution of $L w_n = g_n$, which in general will have infinitely many components, and let us split it as
$$w_n = P_{\Lambda_{n+1}} w_n + P_{\Lambda^c_{n+1}} w_n =: y_n + z_n \in V_{\Lambda_{n+1}} \oplus V_{\Lambda^c_{n+1}}.$$
The minimality property in the energy norm of the Galerkin solution $u_{n+1}$ over the set $\Lambda_{n+1}$ yet to be defined, in conjunction with (2.3) and (2.28), implies
$$|||u - u_{n+1}||| \le |||u - (u_n + y_n)||| \le |||u - u_n - w_n + z_n||| \le \frac{1}{\sqrt{\alpha_*}}\, \|L(u - u_n - w_n)\|_{V^*} + \sqrt{\alpha^*}\, \|z_n\|_V = \frac{\beta^*}{\sqrt{\alpha_*}}\, \|r_n - g_n\|_{\phi^*} + \sqrt{\alpha^*}\, \|z_n\|_V,$$
whence
$$|||u - u_{n+1}||| \le \frac{\beta^*}{\sqrt{\alpha_*}}\, \sqrt{1 - \theta^2}\, \|r_n\|_{\phi^*} + \sqrt{\alpha^*}\, \|z_n\|_V.$$
Since $z_n = P_{\Lambda^c_{n+1}}\big(L^{-1} P^*_{\widetilde{\partial\Lambda}_n}\big) r_n$, we now construct $\Lambda^c_{n+1}$ to control $\|z_n\|$. If
$$k \in \Lambda^c_{n+1} \text{ and } \ell \in \widetilde{\partial\Lambda}_n \;\Rightarrow\; |k - \ell| > J,$$
then we have
$$\big\|P_{\Lambda^c_{n+1}} L^{-1} P^*_{\widetilde{\partial\Lambda}_n}\big\| \le \|\mathbf A^{-1} - (\mathbf A^{-1})_J\| \le C_{A^{-1}}\, e^{-\tilde\eta_L J},$$
where we have used (2.23). We now choose J = J(θ) > 0 to satisfy (3.3), and we exploit that $\|z_n\| \le C_{A^{-1}} e^{-\tilde\eta_L J} \|r_n\|$ to obtain
$$|||u - u_{n+1}||| \le 2\, \frac{\beta^*}{\sqrt{\alpha_*}}\, \sqrt{1 - \theta^2}\, \|r_n\|_{\phi^*} \le 2\, \frac{\beta^* \sqrt{\alpha^*}}{\beta_* \sqrt{\alpha_*}}\, \sqrt{1 - \theta^2}\, |||u - u_n|||, \qquad (3.6)$$
as asserted.

We observe that, as desired, the new error reduction rate
$$\bar\rho(\theta) = 2\, \frac{\beta^* \sqrt{\alpha^*}}{\beta_* \sqrt{\alpha_*}}\, \sqrt{1 - \theta^2} \qquad (3.7)$$
can be made arbitrarily small by choosing θ suitably close to 1. This observation was already made in [6, 7], but we improve on it in Section 3.2 by choosing θ dynamically.

Remark 3.2 (cardinality of $\partial\Lambda_n$). Since we add a ball of radius J around each point of $\widetilde{\partial\Lambda}_n$, we get the crude estimate
$$|\partial\Lambda_n| \le |B_d(0, J) \cap \mathbb{Z}^d|\; |\widetilde{\partial\Lambda}_n| \approx \omega_d J^d\, |\widetilde{\partial\Lambda}_n|, \qquad (3.8)$$
where $\omega_d$ is the measure of the d-dimensional Euclidean unit ball $B_d(0, 1)$ centered at the origin.
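The ENRICH step and the counting estimate (3.8) are straightforward to realize on lattice index sets; the sketch below uses d = 2 and an arbitrary pair of marked indices for illustration.

```python
# ENRICH step of Proposition 3.1 for d = 2: add every index within Euclidean
# distance J of a marked index. The enriched set has at most
# |B_2(0,J) intersect Z^2| * |marked| elements, as in the crude estimate (3.8).
from itertools import product
import math

def enrich(marked, J):
    """marked: set of 2-tuples in Z^2; returns the J-neighborhood as a set."""
    offsets = [(a, b) for a, b in product(range(-J, J + 1), repeat=2)
               if math.hypot(a, b) <= J]
    return {(k1 + a, k2 + b) for (k1, k2) in marked for (a, b) in offsets}

marked = {(3, 4), (10, 2)}                       # illustrative marked indices
print(len(enrich(marked, J=3)), len(marked))     # growth by at most ~ omega_2 * J^2
```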

3.2 Dynamic Dörfler Marking and Adaptive Spectral Algorithm

In this section we improve on the above marking strategy by making the choice of θ dynamic. At each iteration n, let us select the Dörfler parameter $\theta_n$ such that
$$\sqrt{1 - \theta_n^2} = C_0\, \frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}} \qquad (3.9)$$
for a proper choice of the positive constant $C_0$ that will be made precise later. This implies
$$J(\theta_n) = -\frac{1}{\tilde\eta_L}\, \log\frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}} + K_1 \qquad (3.10)$$
according to (3.3), where $K_1 := -\frac{1}{\tilde\eta_L}\log\Big(\frac{1}{\sqrt{\alpha_*\alpha^*}}\, \frac{C_0}{C_{A^{-1}}}\Big) + \delta_n$ and $\delta_n \in [0, 1)$.

We thus have the following adaptive spectral Galerkin method with dynamic choice (3.9) of the marking parameter $\theta_n = (1 - C_0^2 \|r_n\|_{\phi^*}^2 / \|r_0\|_{\phi^*}^2)^{1/2}$:

DYN-GAL(ε)
    set $r_0 := f$, $\Lambda_0 := \emptyset$, $n = -1$
    do
        $n \leftarrow n + 1$
        $\partial\Lambda_n$ := E-DÖRFLER($\Lambda_n$, $(1 - C_0^2 \|r_n\|_{\phi^*}^2 / \|r_0\|_{\phi^*}^2)^{1/2}$)
        $\Lambda_{n+1} := \Lambda_n \cup \partial\Lambda_n$
        $u_{n+1}$ := GAL($\Lambda_{n+1}$)
        $r_{n+1}$ := RES($u_{n+1}$)
    while $\|r_{n+1}\|_{\phi^*} > \varepsilon\, \|r_0\|_{\phi^*}$

where GAL computes the Galerkin solution and RES the residual. The following result shows the quadratic convergence of DYN-GAL.
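The overall loop is compact enough to write down schematically; in the sketch below E_DORFLER, GAL, RES and dual_norm are assumed callables standing for the marking, Galerkin-solve, residual and $\|\cdot\|_{\phi^*}$ routines, so this is pseudocode-level and not the feasible algorithm discussed in [6, 7, 10].

```python
# Schematic DYN-GAL loop with the dynamic marking parameter (3.9).
# E_DORFLER, GAL, RES and dual_norm are assumed (user-supplied) callables.
def dyn_gal(f, eps, C0, E_DORFLER, GAL, RES, dual_norm):
    r, Lam = f, set()                        # r_0 := f, Lambda_0 := empty
    r0_norm = dual_norm(f)
    while True:
        theta = (1.0 - (C0 * dual_norm(r) / r0_norm) ** 2) ** 0.5   # dynamic theta_n
        Lam = Lam | E_DORFLER(Lam, r, theta) # Lambda_{n+1} = Lambda_n U dLambda_n
        u = GAL(Lam)                         # Galerkin solution on V_Lambda
        r = RES(u)                           # residual r(u) = f - L u
        if dual_norm(r) <= eps * r0_norm:    # termination test of DYN-GAL
            return u, Lam
```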

Theorem 3.3 (quadratic convergence). Let the constant $C_0$ of (3.9) satisfy $C_0 \le \frac14 \sqrt{\frac{\alpha_*}{\alpha^*}}\, \frac{\beta_*}{\beta^*}$ and set $C_1 := \frac{\sqrt{\alpha^*}}{2\beta_*\, \|f\|_{\phi^*}}$. Then the residual $r_n$ of DYN-GAL satisfies
$$\frac{\|r_{n+1}\|_{\phi^*}}{2\|r_0\|_{\phi^*}} \;\le\; \left(\frac{\|r_n\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\right)^2 \qquad \forall n \ge 0, \qquad (3.11)$$
and the algorithm terminates in finitely many steps for any tolerance ε. In addition, two consecutive solutions of DYN-GAL satisfy
$$|||u - u_{n+1}||| \le C_1\, |||u - u_n|||^2. \qquad (3.12)$$

Proof. Invoke (3.6) and (3.9) to figure out that
$$\frac{\|r_{n+1}\|_{\phi^*}}{\|r_0\|_{\phi^*}} \le \frac{\sqrt{\alpha^*}}{\beta_*}\, \frac{|||u - u_{n+1}|||}{\|r_0\|_{\phi^*}} \le 2\sqrt{\frac{\alpha^*}{\alpha_*}}\, \frac{\beta^*}{\beta_*}\, \sqrt{1 - \theta_n^2}\, \frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}} \le 2 C_0 \sqrt{\frac{\alpha^*}{\alpha_*}}\, \frac{\beta^*}{\beta_*} \left(\frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}}\right)^2 \le \frac12 \left(\frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}}\right)^2, \qquad (3.13)$$
which implies (3.11). We thus realize that DYN-GAL converges quadratically and terminates in finitely many steps for any tolerance ε. Finally, combining (2.28) with (3.13), we readily obtain (3.12) upon using $r_0 = f$.

Remark 3.4 (super-linear rate). If the dynamic marking parameter $\theta_n$ is chosen so that
$$\sqrt{1 - \theta_n^2} = C_0 \left(\frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}}\right)^\sigma$$
for some σ > 0, then we arrive at the rate $|||u - u_{n+1}||| \le C_1\, |||u - u_n|||^{1+\sigma}$.

It seems to us that the quadratic rate (3.12) is the first one in adaptivity theory. The relation (3.11) reads equivalently
$$\frac{\|r_{n+1}\|_{\phi^*}}{2\|r_0\|_{\phi^*}} \;\le\; \left(\frac{\|r_{n+1-k}\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\right)^{2^k}, \qquad 0 \le k \le n + 1, \qquad (3.14)$$
and implies that $\|r_n\|_{\phi^*}/\|r_0\|_{\phi^*}$ is within machine precision in about n = 6 iterations. This fast decay is consistent with spectral methods. Upon termination, we obtain the relative error
$$|||u - u_n||| \le \frac{\beta^* \sqrt{\alpha^*}}{\beta_* \sqrt{\alpha_*}}\, |||u|||\, \varepsilon, \qquad\text{because}\quad \|f\|_{\phi^*} \le \frac{\sqrt{\alpha^*}}{\beta_*}\, |||u|||.$$
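The 'six iterations' claim is a direct consequence of iterating (3.11) from $\delta_0 = 1$, which gives $\delta_n \le 2\,(1/2)^{2^n}$; a one-line check:

```python
# Iterating (3.11) from delta_0 = ||r_0||/||r_0|| = 1 gives delta_n <= 2*(1/2)^(2^n);
# by n = 6 this is already below double-precision roundoff (~2.2e-16).
for n in range(1, 8):
    print(n, 2.0 * 0.5 ** (2 ** n))
```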

The algorithm DYN-GAL entails exact computation of the residual $r_n$, which in general has infinitely many terms. We do not dwell here on inexact or feasible versions of DYN-GAL and refer to [6, 7, 10] for a full discussion, which extends to our present setting.

4 Nonlinear Approximation and Gevrey Sparsity Classes

Given any v ∈ V we define its best N-term approximation error as
$$E_N(v) = \inf_{\Lambda\subset\mathcal{K},\, |\Lambda| = N} \|v - P_\Lambda v\|_\phi.$$
We are interested in classifying functions v according to the decay law of $E_N(v)$ as $N \to \infty$, i.e., according to the "sparsity" of their expansions in terms of the basis $\{\phi_k\}_{k\in\mathcal{K}}$. Of special interest to us is the following exponential Gevrey class.

Definition 4.1 (exponential class of functions). For η > 0 and 0 < t ≤ d, we denote by $\mathcal{A}^{\eta,t}_G$ the subset of V defined as
$$\mathcal{A}^{\eta,t}_G := \Big\{ v \in V :\ \|v\|_{\mathcal{A}^{\eta,t}_G} := \sup_{N\ge 0}\, \big( E_N(v)\, \exp\!\big(\eta\, \omega_d^{-t/d} N^{t/d}\big) \big) < +\infty \Big\},$$
where $\omega_d$ is the measure of the d-dimensional Euclidean unit ball $B_d(0, 1)$ centered at the origin.

Definition 4.2 (exponential class of sequences). Let $\ell^{\eta,t}_G(\mathcal{K})$ be the subset of sequences $\mathbf v \in \ell^2(\mathcal{K})$ such that
$$\|\mathbf v\|_{\ell^{\eta,t}_G(\mathcal{K})} := \sup_{n\ge 1}\, \big( n^{(1-t/d)/2}\, \exp\!\big(\eta\, \omega_d^{-t/d} n^{t/d}\big)\, |v_n^*| \big) < +\infty,$$
where $\mathbf v^* = (v_n^*)_{n\ge 1}$ denotes the non-increasing rearrangement of $\mathbf v$.

The relationship between $\mathcal{A}^{\eta,t}_G$ and $\ell^{\eta,t}_G(\mathcal{K})$ is stated in the following [6, Proposition 4.2].

Proposition 4.3 (equivalence of exponential classes). Given a function v ∈ V and the sequence $\mathbf v = (\hat v_k \sqrt{d_k})_{k\in\mathcal{K}}$ of its coefficients, one has $v \in \mathcal{A}^{\eta,t}_G$ if and only if $\mathbf v \in \ell^{\eta,t}_G(\mathcal{K})$, with
$$\|v\|_{\mathcal{A}^{\eta,t}_G} \lesssim \|\mathbf v\|_{\ell^{\eta,t}_G(\mathcal{K})} \lesssim \|v\|_{\mathcal{A}^{\eta,t}_G}.$$

For functions v in $\mathcal{A}^{\eta,t}_G$ one can estimate the minimal cardinality of a set Λ such that $\|v - P_\Lambda v\|_\phi \le \varepsilon$ as follows: since $\|v - P_{\tilde\Lambda} v\|_\phi > \varepsilon$ for any set $\tilde\Lambda$ with cardinality $|\tilde\Lambda| = |\Lambda| - 1$, we deduce
$$|\Lambda| \le \omega_d \left(\frac{1}{\eta}\, \log\frac{\|v\|_{\mathcal{A}^{\eta,t}_G}}{\varepsilon}\right)^{d/t} + 1. \qquad (4.1)$$

For the analysis of the optimality of our algorithm it is important to investigate the sparsity class of the image Lv of the operator L defined in (2.1), when the function v belongs to the sparsity class $\mathcal{A}^{\eta,t}_G$. Sparsity classes of exponential type for functionals $f \in V^*$ can be defined analogously as above, using now the best N-term approximation error in $V^*$,
$$E^*_N(f) = \inf_{\Lambda\subset\mathcal{K},\, |\Lambda| = N} \|f - P^*_\Lambda f\|_{\phi^*}.$$
The following result is based on [6, Proposition 5.2].

Proposition 4.4 (continuity of L in $\mathcal{A}^{\eta,t}_G$). Let L be such that the associated stiffness matrix $\mathbf A$ satisfies the decay condition (2.21). Given η > 0 and t ∈ (0, d], there exist $\bar\eta > 0$, $\bar t \in (0, t]$ and a constant $C_L \ge 1$ such that
$$\|Lv\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \le C_L\, \|v\|_{\mathcal{A}^{\eta,t}_G} \qquad \forall v \in \mathcal{A}^{\eta,t}_G. \qquad (4.2)$$

Proof. Let $\mathbf A$ be the stiffness matrix associated with the operator L. In [6] it is proven that if $\mathbf A$ is banded with 2p + 1 non-zero diagonals, then the result holds with $\bar\eta = \frac{\eta}{(2p+1)^{t/d}}$ and $\bar t = t$; on the other hand, if $\mathbf A \in \mathcal{D}_e(\eta_L)$ is dense, but the coefficients $\eta_L$ and η satisfy the inequality $\eta < \eta_L\, \omega_d^{t/d}$, then the result holds with $\bar\eta = \zeta(t)\, \eta$ and $\bar t = \frac{t}{1+t}$, where $\zeta(t) = \Big(\frac{1+t}{2^d \omega_d^{1+t}}\Big)^{\frac{t}{d(1+t)}}$. Finally, if $\eta \ge \eta_L\, \omega_d^{t/d}$, we introduce an arbitrary $\hat\eta > 0$ satisfying $\hat\eta < \eta_L\, \omega_d^{t/d}$; then the result holds with $\bar\eta = \zeta(t)\, \hat\eta$ and $\bar t = \frac{t}{1+t}$, since $\|v\|_{\mathcal{A}^{\hat\eta,t}_G} \le \|v\|_{\mathcal{A}^{\eta,t}_G}$.

Taking into account that ζ(t) ≤ 1 for 1 ≤ d ≤ 10 (see again [6]), this result indicates that the residual is expected to belong to a less favorable sparsity class than the one of the solution. Counterexamples in [6] show that (4.2) cannot be improved.

Finally, we discuss the sparsity class of the residual $r = r(u_\Lambda)$ for any Galerkin solution $u_\Lambda$.

Proposition 4.5 (sparsity class of the residual). Let $\mathbf A \in \mathcal{D}_e(\eta_L)$ and $\mathbf A^{-1} \in \mathcal{D}_e(\bar\eta_L)$, for constants $\eta_L > 0$ and $\bar\eta_L \in (0, \eta_L]$ so that (2.22) and (2.23) hold. If $u \in \mathcal{A}^{\eta,t}_G$ for some η > 0 and t ∈ (0, d], then there exist suitable positive constants $\tilde\eta \le \eta$ and $\tilde t \le t$ such that $r(u_\Lambda) \in \mathcal{A}^{\tilde\eta,\tilde t}_G$ for any index set Λ, and
$$\|r(u_\Lambda)\|_{\mathcal{A}^{\tilde\eta,\tilde t}_G} \lesssim \|u\|_{\mathcal{A}^{\eta,t}_G}.$$

Proof. Proposition 4.4 yields the existence of $\bar\eta > 0$ and $\bar t \in (0, t]$ such that
$$\|f\|_{\mathcal{A}^{\bar\eta,\bar t}_G} = \|Lu\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \lesssim \|u\|_{\mathcal{A}^{\eta,t}_G}. \qquad (4.3)$$
In order to bound $\|r(u_\Lambda)\|_{\mathcal{A}^{\tilde\eta,\tilde t}_G}$ in terms of $\|f\|_{\mathcal{A}^{\bar\eta,\bar t}_G}$, let us write $\mathbf r_\Lambda := \mathbf f - \mathbf A \hat{\mathbf u}$, then use $\hat{\mathbf u} = (\hat{\mathbf A}_\Lambda)^{-1}(P_\Lambda \mathbf f)$ from (2.30) to get
$$\mathbf r_\Lambda = \mathbf f - \mathbf A (\hat{\mathbf A}_\Lambda)^{-1}(P_\Lambda \mathbf f).$$
Now, assuming just for simplicity that the indices in Λ come first (this can be realized by a permutation), we have
$$\mathbf A = \begin{pmatrix} \mathbf A_\Lambda & \mathbf B \\ \mathbf B^T & \mathbf C \end{pmatrix} \qquad\text{and}\qquad \hat{\mathbf A}_\Lambda = \begin{pmatrix} \mathbf A_\Lambda & \mathbf O \\ \mathbf O^T & \mathbf I \end{pmatrix}, \qquad\text{whence}\qquad (\hat{\mathbf A}_\Lambda)^{-1} = \begin{pmatrix} (\mathbf A_\Lambda)^{-1} & \mathbf O \\ \mathbf O^T & \mathbf I \end{pmatrix}.$$
Setting $\mathbf f = (\mathbf f_\Lambda\ \ \mathbf f_{\Lambda^c})^T$, so that $P_\Lambda \mathbf f = (\mathbf f_\Lambda\ \ \mathbf 0)^T$, we have
$$\mathbf A (\hat{\mathbf A}_\Lambda)^{-1}(P_\Lambda \mathbf f) = \begin{pmatrix} \mathbf A_\Lambda & \mathbf B \\ \mathbf B^T & \mathbf C \end{pmatrix}\begin{pmatrix} (\mathbf A_\Lambda)^{-1} & \mathbf O \\ \mathbf O^T & \mathbf I \end{pmatrix}\begin{pmatrix} \mathbf f_\Lambda \\ \mathbf 0 \end{pmatrix} = \begin{pmatrix} \mathbf A_\Lambda & \mathbf B \\ \mathbf B^T & \mathbf C \end{pmatrix}\begin{pmatrix} (\mathbf A_\Lambda)^{-1}\mathbf f_\Lambda \\ \mathbf 0 \end{pmatrix} = \begin{pmatrix} \mathbf f_\Lambda \\ \mathbf B^T (\mathbf A_\Lambda)^{-1}\mathbf f_\Lambda \end{pmatrix}.$$
Then,
$$\mathbf r_\Lambda = \begin{pmatrix} \mathbf 0 \\ \mathbf f_{\Lambda^c} - \mathbf B^T (\mathbf A_\Lambda)^{-1}\mathbf f_\Lambda \end{pmatrix} = \begin{pmatrix} \mathbf O & \mathbf O \\ -\mathbf B^T (\mathbf A_\Lambda)^{-1} & \mathbf I \end{pmatrix}\begin{pmatrix} \mathbf f_\Lambda \\ \mathbf f_{\Lambda^c} \end{pmatrix} =: \mathbf R \mathbf f.$$
Now, since $\mathbf A \in \mathcal{D}_e(\eta_L)$ and $(\mathbf A_\Lambda)^{-1} \in \mathcal{D}_e(\bar\eta_L)$, it is easily seen that $\mathbf R \in \mathcal{D}_e(\tilde\eta_L)$ with $\tilde\eta_L = \bar\eta_L$ if $\bar\eta_L < \eta_L$, or $\tilde\eta_L < \eta_L$ arbitrary if $\bar\eta_L = \eta_L$.

Finally, we apply Proposition 4.4 to the operator R defined by the matrix $\mathbf R$, obtaining the existence of constants $\tilde\eta > 0$ and $\tilde t \in (0, \bar t]$ such that
$$\|r(u_\Lambda)\|_{\mathcal{A}^{\tilde\eta,\tilde t}_G} \lesssim \|f\|_{\mathcal{A}^{\bar\eta,\bar t}_G},$$
whence the result.
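The block identity for $\mathbf r_\Lambda$ is easy to confirm numerically on a finite matrix; A, f and Λ below are synthetic illustrative data.

```python
# Check that the residual of the extended Galerkin solution has the block form
# (0, f_{Lambda^c} - B^T A_Lambda^{-1} f_Lambda), with Lambda taken as the
# leading indices. A and f are random stand-ins.
import numpy as np

rng = np.random.default_rng(2)
n, m = 30, 8                                 # total size and |Lambda|
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                  # SPD stand-in, Lambda-first ordering
f = rng.standard_normal(n)

A_L, B = A[:m, :m], A[:m, m:]                # blocks A_Lambda and B
u_L = np.linalg.solve(A_L, f[:m])            # coefficients of the Galerkin solution
r = f - A[:, :m] @ u_L                       # r_Lambda = f - A * (u_L, 0)

print(np.allclose(r[:m], 0.0))                       # True: zero block on Lambda
print(np.allclose(r[m:], f[m:] - B.T @ u_L))         # True: matches R f
```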

5 Optimality Properties of DYN-GAL

In this section we derive an exponential rate of convergence for $|||u - u_n|||$ in terms of the number of degrees of freedom $\Lambda_n$ activated by DYN-GAL, and assess the computational work necessary to achieve this rate. This is made precise in the following theorem.

Theorem 5.1 (exponential convergence rate). Let $u \in \mathcal{A}^{\eta,t}_G$, where the Gevrey class $\mathcal{A}^{\eta,t}_G$ is introduced in Definition 4.1. Upon termination of DYN-GAL, the iterate $u_{n+1} \in V_{\Lambda_{n+1}}$ and the set of active coefficients $\Lambda_{n+1}$ satisfy
$$|||u - u_{n+1}||| \le \frac{\beta^*}{\sqrt{\alpha_*}}\, \|f\|_{\phi^*}\, \varepsilon \qquad\text{and}\qquad |\Lambda_{n+1}| \le \omega_d \left(\frac{1}{\eta_*}\, \log\frac{C_*\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}\, \varepsilon}\right)^{d/t_*}, \qquad (5.1)$$
with parameters $C_* > 0$, $\eta_* < \eta$ and $t_* < t$. Moreover, if the number of arithmetic operations needed to solve a linear system scales linearly with its dimension, then the workload $W_\varepsilon$ of DYN-GAL upon completion satisfies
$$W_\varepsilon \le \omega_d \left(\frac{1}{\eta_{**}}\, \log\frac{C_{**}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}\, \varepsilon^{4|\log|\log\varepsilon||^{-1}}}\right)^{d/t_*}, \qquad (5.2)$$
where the constants $\eta_{**} < \eta_*$ and $C_{**} \ge C_*$ are made precise in the proof.

Proof. We proceed in several steps.

1. Expression of $J(\theta_k)$. Our first task is to simplify the expression (3.10) for $J(\theta_k)$, namely to absorb the term $K_1$: there is $C_2 > 1$ such that
$$J(\theta_k) \le \frac{C_2}{\tilde\eta_L}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|. \qquad (5.3)$$
In fact, if $C_2$ is given by $C_2 = 1 + \frac{\tilde\eta_L \max(0, K_1)}{\log 2}$, then
$$K_1 \le \frac{C_2 - 1}{\tilde\eta_L}\, \log 2 \le \frac{C_2 - 1}{\tilde\eta_L}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|,$$
because $\|r_k\|_{\phi^*} \le \|r_0\|_{\phi^*}$ for all k ≥ 0 according to (3.13). This in turn implies (5.3):
$$J(\theta_k) \le \frac{1}{\tilde\eta_L}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big| + \frac{C_2 - 1}{\tilde\eta_L}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big| = \frac{C_2}{\tilde\eta_L}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|.$$

2. Active set $\Lambda_k$. We now examine the output $\Lambda_k$ of E-DÖRFLER. Employing the minimality of Dörfler marking and (3.2), we deduce
$$E^*_{|\widetilde{\partial\Lambda}_k|}(r_k) = \|r_k - P^*_{\widetilde{\partial\Lambda}_k} r_k\|_{\phi^*} \le \sqrt{1 - \theta_k^2}\, \|r_k\|_{\phi^*}, \qquad (5.4)$$
which clearly implies $E^*_{|\widetilde{\partial\Lambda}_k| - 1}(r_k) > \sqrt{1 - \theta_k^2}\, \|r_k\|_{\phi^*}$. The latter inequality, together with Definition 4.2 of $\|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G}$, yields
$$\|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G} > \sqrt{1 - \theta_k^2}\, \|r_k\|_{\phi^*}\, \exp\!\big(\bar\eta\, \omega_d^{-\bar t/d} (|\widetilde{\partial\Lambda}_k| - 1)^{\bar t/d}\big).$$
We note that this, along with $|\widetilde{\partial\Lambda}_k| \ge 1$, ensures $\frac{\|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G}}{\sqrt{1-\theta_k^2}\,\|r_k\|_{\phi^*}} > 1$, whence
$$|\widetilde{\partial\Lambda}_k| \le \omega_d \left(\frac{1}{\bar\eta}\, \log\frac{\|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G}}{\sqrt{1 - \theta_k^2}\, \|r_k\|_{\phi^*}}\right)^{d/\bar t} + 1.$$
We now recall the membership of the residual $r_k := r(u_k)$ in the Gevrey class $\mathcal{A}^{\bar\eta,\bar t}_G$ for all k ≥ 0, established in Proposition 4.5: there exists $C_3 > 0$ independent of k and u such that
$$\|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \le C_3\, \|u\|_{\mathcal{A}^{\eta,t}_G}.$$
Combining this with the dynamic marking (3.9) implies
$$\frac{\|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G}}{\sqrt{1 - \theta_k^2}\, \|r_k\|_{\phi^*}} = \frac{\|f\|_{\phi^*}\, \|r_k\|_{\mathcal{A}^{\bar\eta,\bar t}_G}}{C_0\, \|r_k\|_{\phi^*}^2} \le \frac{C_4\, \|f\|_{\phi^*}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_k\|_{\phi^*}^2}, \qquad C_4 = C_3/C_0,$$
whence
$$|\widetilde{\partial\Lambda}_k| \le \omega_d \left(\frac{1}{\bar\eta}\, \log\frac{C_4\, \|f\|_{\phi^*}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_k\|_{\phi^*}^2}\right)^{d/\bar t} + 1.$$
We let $C_5$ satisfy $1 = \omega_d \big(\frac{1}{\bar\eta}\log C_5\big)^{d/\bar t}$, and use that $\bar t < t \le d$, to obtain the simpler expression
$$|\widetilde{\partial\Lambda}_k| \le \omega_d \left(\frac{1}{\bar\eta}\, \log\frac{C_4 C_5\, \|f\|_{\phi^*}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_k\|_{\phi^*}^2}\right)^{d/\bar t}. \qquad (5.5)$$
On the other hand, in view of (5.3), the enrichment step (3.8) of E-DÖRFLER yields
$$|\partial\Lambda_k| \le C_6\, \omega_d\, J(\theta_k)^d\, |\widetilde{\partial\Lambda}_k| \le \frac{C_6 C_2^d\, \omega_d^2}{\tilde\eta_L^d\, \bar\eta^{d/\bar t}}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|^d \left(\log\frac{C_4 C_5\, \|f\|_{\phi^*}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_k\|_{\phi^*}^2}\right)^{d/\bar t}. \qquad (5.6)$$
We exploit the quadratic convergence (3.11) to write, for k ≤ n,
$$\Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|^d \le 2^{d(k-n)}\, \Big|\log\frac{\|r_n\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|^d = 2^{d(k-n)} \left(\log\frac{2\|f\|_{\phi^*}}{\|r_n\|_{\phi^*}}\right)^d. \qquad (5.7)$$
Using the bound $\|f\|_{\phi^*} \le \|f\|_{\mathcal{A}^{\bar\eta,\bar t}_G} = \|r_0\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \le C_3\, \|u\|_{\mathcal{A}^{\eta,t}_G}$ in the two previous inequalities yields
$$\log\frac{C_4 C_5\, \|f\|_{\phi^*}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_k\|_{\phi^*}^2} \le 2\, \log\frac{C_7\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_k\|_{\phi^*}} \qquad\text{and}\qquad \log\frac{2\|f\|_{\phi^*}}{\|r_n\|_{\phi^*}} \le \log\frac{2 C_3\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_n\|_{\phi^*}},$$
with $C_7 = (C_3 C_4 C_5)^{1/2}$. Introducing the constants
$$C_8 = \max(2 C_3, C_7), \qquad \frac{1}{\hat\eta} = \frac{2^{d/\bar t}\, C_6\, C_2^d\, \omega_d}{\tilde\eta_L^d\, \bar\eta^{d/\bar t}}, \qquad t_* = \frac{\bar t}{1 + \bar t},$$
the derived upper bound for $|\partial\Lambda_k|$ can be simplified as follows:
$$|\partial\Lambda_k| \le 2^{d(k-n)}\, \frac{\omega_d}{\hat\eta} \left(\log\frac{C_8\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_n\|_{\phi^*}}\right)^{d/t_*}.$$
Recalling now that $|\Lambda_0| = 0$ and, for n ≥ 0,
$$|\Lambda_{n+1}| = \sum_{k=0}^n |\partial\Lambda_k|,$$
we have
$$|\Lambda_{n+1}| \le \frac{\omega_d}{\hat\eta} \left(\sum_{k=0}^n 2^{d(k-n)}\right) \left(\log\frac{C_8\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_n\|_{\phi^*}}\right)^{d/t_*}.$$
This can be written equivalently as
$$|\Lambda_{n+1}| \le \omega_d \left(\frac{1}{\eta_*}\, \log\frac{C_8\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|r_n\|_{\phi^*}}\right)^{d/t_*} \qquad (5.8)$$
with $\eta_* = \big(\hat\eta \,/\, \sum_{k=0}^\infty 2^{-dk}\big)^{t_*/d}$. At last, we make use of $\|r_n\|_{\phi^*} > \varepsilon\, \|f\|_{\phi^*}$ to get the desired estimate (5.1) with $C_* = C_8$.

3. Computational work. Let us finally discuss the total computational work $W_\varepsilon$ of DYN-GAL. We start with some useful notation. We set $\delta_n := \frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}}$ for n ≥ 0 and $\varepsilon_{\ell+1} = \varepsilon_\ell^2$ for ℓ ≥ 1 with $\varepsilon_1 = \frac12$. We note that there exists an integer L > 0 such that $\varepsilon_{L+1} < \varepsilon \le \varepsilon_L$, with ε being the tolerance of DYN-GAL. In addition, we observe that for every iteration n > 0 of DYN-GAL there exists ℓ > 0 such that $\delta_n \in I_\ell := (\varepsilon_{\ell+1}, \varepsilon_\ell]$, and that for each interval $I_\ell$ there exists at most one $\delta_n \in I_\ell$ because $\delta_{n+1} \le \frac12 \delta_n^2$ according to (3.11). Finally, to each interval $I_\ell$ we associate the following computational work $W_\ell$ to find and store $u_{n+1}$:
$$W_\ell = \begin{cases} C_\#\, |\Lambda_{n+1}| & \text{if there exists } n \text{ such that } \delta_n \in I_\ell, \\ 0 & \text{otherwise.} \end{cases}$$
This assumes that the number of arithmetic operations needed to solve the linear system for $u_n$ scales linearly with its dimension; $C_\#$ is an absolute constant that may depend on the specific solver. The total computational work of DYN-GAL is bounded by
$$W_\varepsilon = \sum_{\ell=1}^L W_\ell.$$
We now get a bound for $W_\varepsilon$. In view of (5.1) we have
$$W_\ell \le C_\#\, \omega_d \left(\frac{1}{\eta_*}\, \log\frac{C_*\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}\, \varepsilon_{\ell+1}}\right)^{d/t_*} = C_\#\, \omega_d \left(\frac{1}{\eta_*}\, \log\frac{C_*\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}\, \varepsilon_1^{2^\ell}}\right)^{d/t_*}.$$
Therefore, upon adding over ℓ and using that $d/t_* \ge 1$, we obtain
$$W_\varepsilon \le \frac{C_\#\, \omega_d}{\eta_*^{d/t_*}} \left(\sum_{\ell=1}^L \log\frac{C_*\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}} + \sum_{\ell=1}^L \log \varepsilon_1^{-2^\ell}\right)^{d/t_*} \le \frac{C_\#\, \omega_d}{\eta_*^{d/t_*}} \left(L\, \log\frac{C_*\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}\, \varepsilon_1^{2^{L+1}}}\right)^{d/t_*}.$$
Since $\varepsilon \le \varepsilon_L = \varepsilon_1^{2^{L-1}}$, we deduce $\varepsilon^4 \le \varepsilon_1^{2^{L+1}}$ and $L \le \frac{\log(|\log\varepsilon|/\log 2)}{\log 2} + 1 \le C_9\, \log|\log\varepsilon|$. Inserting this bound in the preceding expression yields
$$W_\varepsilon \le \omega_d \left(\frac{1}{\eta_{**}}\, \log\frac{C_{**}\, \|u\|_{\mathcal{A}^{\eta,t}_G}}{\|f\|_{\phi^*}\, \varepsilon^{4|\log|\log\varepsilon||^{-1}}}\right)^{d/t_*} \qquad (5.9)$$
with $\eta_{**} = \frac{\eta_*}{C_\#^{t_*/d}}$ and $C_{**} = C_9\, C_*$, which is the asserted estimate (5.2). The proof is thus complete.

Remark 5.2. Note that the bound on the workload, given in (5.2), is at most an absolute multiple of the bound, given in (5.1), on the number of active coefficients.

Remark 5.3 (super-linear convergence). If $\sqrt{1 - \theta_n^2} = C_0 \Big(\frac{\|r_n\|_{\phi^*}}{\|r_0\|_{\phi^*}}\Big)^\sigma$ with σ > 0, then (5.1) still holds with the same parameters $\eta_*$, $t_*$.

Remark 5.4 (algebraic class). Let us consider the case when u belongs to the algebraic class
$$\mathcal{A}^s_B := \Big\{ v \in V :\ \|v\|_{\mathcal{A}^s_B} := \sup_{N\ge 0}\, E_N(v)\, (N + 1)^{s/d} < +\infty \Big\},$$
which is related to Besov regularity. We can distinguish two cases:

1. $\mathbf A$ belongs to an exponential class but the residuals belong to an algebraic class;

2. $\mathbf A$ belongs to an algebraic class $\mathcal{D}_a(\eta_L)$, i.e., there exists a constant $c_L > 0$ such that its elements satisfy $|a_{\ell,k}| \le c_L (1 + |\ell - k|)^{-\eta_L}$, and the residual belongs to an algebraic class.

We now study the optimality properties of DYN-GAL for these two cases. Let us first observe that whenever the residuals belong to the algebraic class $\mathcal{A}^s_B$, the bound (5.5) becomes
$$|\widetilde{\partial\Lambda}_k| \lesssim \|u - u_k\|^{-2d/s}. \qquad (5.10)$$
This results from the dynamic marking (3.9) together with (5.4) in the algebraic case.

Let us start with Case 1. Using (5.10) and (5.3), which is still valid here as we assume that $\mathbf A$ belongs to an exponential class, the bound (5.6) is replaced by
$$|\partial\Lambda_k| \lesssim \|u - u_k\|^{-2d/s}\, \Big|\log\frac{\|r_k\|_{\phi^*}}{2\|r_0\|_{\phi^*}}\Big|^d \lesssim \|u - u_k\|^{-2d/s}\, 2^{d(k-n)} \left(\log\frac{2\|f\|_{\phi^*}}{\|r_n\|_{\phi^*}}\right)^d, \qquad (5.11)$$
where in the last inequality we have employed (5.7). Hence, we have
$$|\Lambda_{n+1}| = \sum_{k=0}^n |\partial\Lambda_k| \lesssim \left(\log\frac{2\|f\|_{\phi^*}}{\|r_n\|_{\phi^*}}\right)^d \sum_{k=0}^n \|u - u_n\|^{-2\frac ds 2^{k-n}}\, 2^{d(k-n)} \lesssim \left(\log\frac{2\|f\|_{\phi^*}}{\|r_n\|_{\phi^*}}\right)^d \|u - u_n\|^{-2\frac ds} \sum_{k=0}^n 2^{d(k-n)} \lesssim \left(\log\frac{2\|f\|_{\phi^*}}{\|r_n\|_{\phi^*}}\right)^d \|u - u_n\|^{-2\frac ds} \lesssim \left(\log\frac{2\|f\|_{\phi^*}}{\|u - u_n\|}\right)^d \|u - u_{n+1}\|^{-\frac ds},$$
where in the last inequality we have employed the quadratic convergence of DYN-GAL. The above result implies that DYN-GAL is optimal for Case 1 (up to a logarithmic factor).

Let us now consider Case 2. Since $\mathbf A$ belongs to an algebraic class, (5.3) is replaced by $J(\theta_k) \eqsim \|u - u_k\|^{-1/s}$. Thus, the bound (5.6) is replaced by
$$|\partial\Lambda_k| \lesssim \|u - u_k\|^{-3d/s},$$
which implies
$$|\Lambda_{n+1}| = \sum_{k=0}^n |\partial\Lambda_k| \lesssim \|u - u_n\|^{-3\frac ds} \lesssim \|u - u_{n+1}\|^{-\frac32\frac ds},$$
where in the last inequality we have again employed the quadratic convergence of DYN-GAL. This result is not optimal for Case 2, due to the factor 2/3 multiplying s in the exponent. We recall that the algorithm FA-ADFOUR of [6] is similar to DYN-GAL but with a static marking parameter θ. The theory of FA-ADFOUR requires neither a restriction on θ nor coarsening, and is proven to be optimal in the algebraic case; see [6, Theorem 7.2].

Acknowledgements. The first and the fourth authors are partially supported by the Italian research grant Prin 2012 2012HBLYE4 "Metodologie innovative nella modellistica differenziale numerica". The first author is also partially supported by Progetto INdAM-GNCS 2015 "Tecniche di riduzione computazionale per problemi di fluidodinamica e interazione fluido-struttura". The second author is partially supported by NSF grants DMS-1109325 and DMS-1411808. The fourth author is also partially supported by Progetto INdAM-GNCS 2015 "Non-standard numerical methods for geophysics".

References

[1] P. Binev, W. Dahmen, and R. DeVore. Adaptive finite element methods with convergence rates. Numer. Math., 97(2):219-268, 2004.

[2] P. Binev and R. DeVore. Fast computation in adaptive tree approximation. Numer. Math., 97(2):193-217, 2004.

[3] M. Bürg and W. Dörfler. Convergence of an adaptive hp finite element strategy in higher space-dimensions. Appl. Numer. Math., 61(11):1132-1146, 2011.

[4] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang. Spectral Methods. Fundamentals in Single Domains. Scientific Computation. Springer-Verlag, Berlin, 2006.

[5] C. Canuto, R. H. Nochetto, R. Stevenson, and M. Verani. High-order adaptive Galerkin methods. Proceedings of ICOSAHOM 2014 (to appear).

[6] C. Canuto, R. H. Nochetto, and M. Verani. Adaptive Fourier-Galerkin methods. Math. Comp., 83(288):1645-1687, 2014.

[7] C. Canuto, R. H. Nochetto, and M. Verani. Contraction and optimality properties of adaptive Legendre-Galerkin methods: the one-dimensional case. Comput. Math. Appl., 67(4):752-770, 2014.

[8] C. Canuto, R. H. Nochetto, R. Stevenson, and M. Verani. Convergence and optimality of hp-AFEM. arXiv:1503.03996.

[9] C. Canuto, V. Simoncini, and M. Verani. On the decay of the inverse of matrices that are sum of Kronecker products. Linear Algebra Appl., 452:21-39, 2014.

[10] C. Canuto, V. Simoncini, and M. Verani. Contraction and optimality properties of an adaptive Legendre-Galerkin method: the multi-dimensional case. J. Sci. Comput., doi:10.1007/s10915-014-9912-3.

[11] C. Canuto and M. Verani. On the numerical analysis of adaptive spectral/hp methods for elliptic problems. In Analysis and Numerics of Partial Differential Equations, volume 4 of Springer INdAM Ser., pages 165-192. Springer, Milan, 2013.

[12] C. Carstensen, M. Feischl, M. Page, and D. Praetorius. Axioms of adaptivity. Comput. Math. Appl., 67(6):1195-1253, 2014.

[13] J. M. Cascón, C. Kreuzer, R. H. Nochetto, and K. G. Siebert. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal., 46(5):2524-2550, 2008.

[14] A. Cohen, W. Dahmen, and R. DeVore. Adaptive wavelet methods for elliptic operator equations: convergence rates. Math. Comp., 70:27-75, 2001.

[15] L. Diening, Ch. Kreuzer, and R. Stevenson. Instance optimality of the adaptive maximum strategy. Found. Comput. Math., 1-36, 2015.

[16] W. Dörfler and V. Heuveline. Convergence of an adaptive hp finite element strategy in one space dimension. Appl. Numer. Math., 57(10):1108-1124, 2007.

[17] G. B. Folland. Real Analysis, second ed. John Wiley & Sons Inc., New York, 1999.

[18] T. Gantumur, H. Harbrecht, and R. Stevenson. An optimal adaptive wavelet method without coarsening of the iterands. Math. Comp., 76(258):615-629, 2007.

[19] P. Morin, R. H. Nochetto, and K. G. Siebert. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal., 38(2):466-488 (electronic), 2000.

[20] R. H. Nochetto, K. G. Siebert, and A. Veeser. Theory of adaptive finite element methods: an introduction. In Multiscale, Nonlinear and Adaptive Approximation, pages 409-542. Springer, Berlin, 2009.

[21] Ch. Schwab. p- and hp-Finite Element Methods. Theory and Applications in Solid and Fluid Mechanics. Numerical Mathematics and Scientific Computation. The Clarendon Press, Oxford University Press, New York, 1998.

[22] R. Stevenson. Optimality of a standard adaptive finite element method. Found. Comput. Math., 7(2):245-269, 2007.

[23] R. Stevenson. The completion of locally refined simplicial partitions created by bisection. Math. Comp., 77:227-241, 2008.

[24] R. Stevenson. Adaptive wavelet methods for solving operator equations: an overview. In Multiscale, Nonlinear and Adaptive Approximation, pages 543-597. Springer, Berlin, 2009.
