Notes on theory and numerical methods for hyperbolic conservation laws
Mario Putti
Department of Mathematics – University of Padua, Italy e-mail: [email protected]
January 19, 2017
Contents
1 Partial differential equations 2
1.1 Mathematical preliminaries and notations . 3
1.1.1 The chain rule of differentation . . 6
1.1.2 Integration by parts . . . . 6
1.2 Classification of the PDEs . . . . 7
1.2.1 Standard classification of linear PDEs 9 1.2.2 Simple examples and solutions . . . 11
1.2.3 Conservation laws . . . . 15
1.3 Well posedness and continuous dependence of solutions on the data . . . . 16
1.3.1 Ill-conditioning and instability . . . 18
2 Hyperbolic Equations 21 2.1 Some examples . . . . 21
2.1.1 The transport equation . . . . 22
2.1.2 The second-order wave equation . . 23
2.1.3 Simple finite difference discretization 24 2.2 The method of characteristics . . . . 32
2.3 Initial-Boundary value problems . . . . 39
2.4 Classification of PDEs and propagation of discontinuities . . . . 46
2.4.1 Propagation of singularities . . . . 47
2.5 Linear second order equations . . . . 48
2.6 Systems of first-order equations . . . . 50
2.6.1 The linear case . . . . 50
2.6.2 The quasi-linear case . . . . 54
2.7 Weak solutions for conservation laws . . . . 55
2.7.1 Jump conditions . . . . 58
2.7.2 Admissibility conditions . . . . 61
2.8 The Riemann Problem. . . . 65
2.8.1 Shock and rarefaction curves . . . . 69
3 Numerical Solution of Hyperbolic Conservation Laws 73 3.1 The Differential Equation . . . . 74
3.2 The Finite Volume Method . . . . 74
3.2.1 Preliminaries . . . . 74
3.2.2 Notations . . . . 77
3.3 The numerical flux. . . . 78
3.3.1 One spatial dimension: first order . 78 3.3.2 One spatial dimension: second or- der gradient reconstruction . . . . . 82
A Finite Difference Approximations of first and second derivatives in R1. 85 B Fortran program: Godunov method for Burgers equation 87 C Memorization of 2D Computational Meshes. 100 D Review of linear algebra 101 D.1 Vector spaces, linear independence, basis . 105 D.1.1 Orthogonality . . . 107
D.2 Projection Operators. . . 108
D.3 Eigenvalues and Eigenvectors . . . 112
D.4 Norms of Vectors and Matrices . . . 115
D.5 Quadratic Forms . . . 118
D.6 Symmetric caseA = AT . . . 123
These notes are a free elaboration from [7, 1, 2] and other sources.
1 Partial differential equations
In this notes we will look at the numerical solution for partial differential equations. We will be mainly concerned with differential models stemming from conservation laws, such as those arising from force conservations i.e., second Newton’s law F = ma, such as de Saint-Venant equations, governing the equilibrium of a solid, or the Navier-Stokes equations, governing the dynamics of fluid flow. These equations are also called “equations in divergence form”, to identify the fact that the divergence of a vector translates in mathematical terms the conservation of the flux represented by that vector field.
As an example, let us consider the advection-diffusion equation (ADE), that governs the conservation of mass of a solute moving within the flow of the containing solvent. A typical application is the transport of a contaminant by a water body moving with laminar flow. The flow of the solvent is given by the vector (velocity) field λ, and the solute is undergoing chemical (Fickian) diffusion with a diffusion field D(x). We remark that if the density of the solvent is constant, then mass conserva- tion is equivalent to volume concentration, and thus density does not appear in the equations. The mathematical model is then given by:
∂u
∂t = div (D∇ u) − div(λu) + f in Ω∈ Rd (1.1)
where the equation is defined on a subspace of the d-dimensional Euclidean space Rd(d = 1, 2, or 3), the function u(x, t) : Ω× [0 : T ] 7→ R represents the concentration of the solute (mass/volume of solute per unit mass/volume of solvent), t is time, div = P
i∂/∂xi is the divergence operator, D is the diffusion coefficient, possibly a second order tensor, and∇ = {∂/∂xi} is the gradient operator. A problem is mathematically well posed if i) it has a solution; ii) the solution is unique; iii) the solution depends continuously on the data of the problem. For a PDE problem to be well-posed we need initial and boundary conditions. So we let the domain boundary Γ = ∂Ω be the union of three non overlapping sub-boundaries such that Ω = ΓD∪ ΓN ∪ ΓC, so that we:
u(x, 0) = uo(x) x∈ Ω, t = 0 (initial conditions) (1.2a) u(x, t) = gD(x) x∈ ΓD, t > 0 (Dirichlet BCs) (1.2b)
−D ∇ u(x, t) · ν = qN(x) x∈ ΓN, t > 0 (Neumann BCs) (1.2c) λu− D ∇ u(x, t) · ν = qC(x) x∈ ΓC, t > 0 (Cauchy BCs) (1.2d) where ν is the outward unit normal defined on Γ. Formally, under some regularity assumption and the hypothesis that D never vanishes, this is called a “parabolic” equation. The term “parabolic” is used to classify partial differential equations (PDEs) on the basis of certain qualitative properties of the solution. This can be done relatively easily with linear PDEs, and becomes more complicated for nonlinear PDEs.
1.1 Mathematical preliminaries and notations
We start this discussion by giving some general definitions. All our objects or equations will be defined in a domain Ω ⊂ Rd, and we will indicate with x ∈ Ω a point of the d-dimensional space (d=1, 2, or 3) as a vector with d coordinates x = (x1, x2, . . . , xd)T with respect to a fixed Cartesian (orthonormal) reference system. 1 Note that, for simplicity, we assume vectors to be 1-dimensional collections of numbers ordered columnwise, and will use the transpose operation as above to indicate a row vector (see appendix D). If d = 2 we will often identify x = x1 and y = x2. A function u(x) is a spatially-variable real-valued scalar function taking values in Ω ⊂ Rd and returning real values in R. A vector-valued function q(x) is a collection of functions ordered in a column vector. Thus we have:
u : Ω7→ R q : Ω7→ Rm; q(x) =
q1(x)
... qm(x)
.
Typically, physical quantities such as fluxes, velocities, etc. are defined in Rd, i.e., in the above we have m = d.
Definition 1.1 (Derivatives). 1. first order partial derivative:
uxi = ∂u(x)
∂xi = lim
h→0
u(x + hei)− u(x) h
where ei is the i-th coordinate vector, i.e., the vector with the i-th component equal to one and all the other ones equal to zero.
2. second order partial derivatives:
uxixj = ∂2u(x)
∂xi∂xj,
with obvious extensions for higher order.
3. Multiindex Notation.
(a) Given a vector α = (α1, . . . , αd)T, where αi ≥ 0 (the multi-index of order | α |= α1+ . . . + αd), the α-th derivative is:
∇αu(x) := ∂|α|u(x)
∂xα11· · · ∂xαdd
= ∂xα11· · · ∂xαdd.
1We would like to remark that formallyΩ is an open subset of Rd, and its closure is indicated with ¯Ω = Ω d Γ, where Γ = ∂Ω is the boundary of Ω, which forms a subset of Rd of co-dimension 1, i.e.,d− 1 (for d = 2 the boundary is a curve, ford = 3 it is a surface). We will not use or make reference to these technicalities, as empirical intuition can be sufficiently supported without them.
(b) if k is a nonnegative integer, then the set of all partial derivatives of order k is given by:
∇ku(x) :={∇αu(x) :| α |= k} .
Giving an order to the partial derivatives of the set above, we can think of∇ku(x) as a point (vector) in Rdk, and we can define the length of this point as:
| ∇ku(x)|=
X
|α|=k
| ∇αu|2
1 2
.
(c) Special cases. If k = 1 we order the elements of ∇ u in a vector thus leading to the definition of the “gradient” operator:
∇ u :=
ux1
... uxd
Thus we can regard∇ u as a point of Rd.
4. The divergence of a vector function q(x) : Ω⊂ Rd7→ Rdas:
div q(x) :=∇ ·q(x) = ∇T q(x) = ∂q1(x)
∂xi + . . .∂qd(x)
∂xd For example, for d = 2 we have:
∇ u(x) = ux1 ux2
=
∂u
∂x1
∂u
∂x2
div q(x) =∇ ·q(x) = ∂q1(x)
∂x1 + ∂q2(x)
∂x2
We would like to stress here that we will generally see the gradient as an operator acting on a scalar-valued function and the divergence as an operator acting on a vector-valued function.
Note that the divergence and the gradient operators are intimately related by the divergence (or Gauss’ or Stokes) theorem, which we will discuss later on.
5. For k = 2 we organize the elements of∇2u(x) in a matrix, called the Hessian matrix:
∇2u(x) =∇ ∇T u(x) =
ux1x1 · · · ux1xd
. ..
uxdx1 · · · uxdxd
Unfortunately, the symbol∇2 should not be confused with the Laplacian operator, often indi- cated by∇2.
6. Laplacian operator: The Laplacian operator is given by:
∆u(x) = div∇ u(x) = ∇T ∇ u(x) = Tr ∇2u(x) =
d
X
i=1
uxixi.
where the operator Tr (A) is the trace of a matrix A (see appendix D).
Definition 1.2 (Partial differential equation). A k-th order partial differential equation (PDE) can be written as:
F (x, u(x),∇ u(x), ∇2u(x), . . . ,∇ku(x)) = 0, x∈ Ω. (1.3) where formally F : Ω× R × Rd. . .× Rdk−1× Rdk 7→ R. For example, for k = 2 and d = 3 we have:
F (x, y, z, u, ux, uy, uz, uxx, uxy, uyy, uxz, uzz, uyz) = 0.
The solution of this equation is a function u(x) : Ω ⊂ Rd 7→ R that satisfies eq. (1.3) and possibly some other boundary conditions.
The problem we are concerned with is finding solutions (either approximate or exact) to eq. (1.3). If F is a linear function of u and its derivatives, then the equation is called linear, and, assuming d = 2, it can be written as:
a(x, y) + b(x, y)u + c(x, y)ux+ d(x, y)uy+ e(x, y)uxx+ f (x, y)uyy+ g(x, y)uxy = 0. (1.4) The order of a PDE is the order of the derivative of maximum degree that appears in the equation.
Thus, in the previous case the order is 2. Typical examples are:
∆u = ∂2u
∂x2 + ∂2u
∂y2 = 0 2oorder (Laplace equation)
∂u
∂t + ∂u
∂x = 0 1oorder (transport or convection equation)
∂u
∂t − ∂2u
∂x2 = 0 2oorder (diffusion equation).
Note that we often identify one of the variables as time, in which case, setting xd+1 = t, the solution u has domain given by Ω× [0, T ]:
u : Ω× [0, T ] 7→ R.
Obviously, we consider time as a positive real number.
1.1.1 The chain rule of differentation
We say that a function u(x) is smooth when it is infinitely differentiable, i.e., u(x)∈ C∞(Ω). We will be using often the chain (product) rule of differentiation and its derived property, i.e., integration by parts. For d = 1, using u0 to indicate differentation, we can write:
(uv)0 = u0v + uv0 (1.5)
which is valid for all u and v inC∞. By extension, we can think that the derivative of a scalar valued function in Ω as the gradient operator, i.e., the collection of all its partial derivatives. The chain rule is then:
∇(uv) = u ∇ v + v ∇ u. (1.6)
If we have a scalar function u : Ω 7→ R and a vector-valued function q : Ω 7→ Rd, we may form the product qu = (q1u, . . . , qdu). Applying eqs. (1.5) and (1.6) component by component, we obtain:
div(qu) = q· ∇ u + u div q. (1.7)
Note that this is a straight forward extension if we think that the gradient acts on scalar functions, and the divergence acts on vector functions.
1.1.2 Integration by parts
Recall the rule of integration by parts. It can be derived quickly from the chain rule. For d = 1, in fact, using eq. (1.5), we can write:
Z b a
u0(x)v(x) dx = Z b
a
u(x)v(x)0
dx− Z b
a
u(x)v0(x) dx
=
u(b)v(b)− u(a)v(a)
− Z b
a
u(x)v0(x) dx (1.8a)
In words, integration by parts states that the integral of the product of two functions (here u0(x) and v(x)) is equal to the product of the primitive of the first function multiplied by the other function evaluated at the extremes, minus the integral of the primitive times the derivative.
These results are easily extended in the multidimensional case. In this case, it is easy2 to remember all these results by using the above definitions of gradient and divergence as vector operators. Recall again that the gradient acts on scalars and the divergence acts on vectors. We recall here the important Gauss’ (or Divergence) Theorem that states a conservation principle:
Z
Ω
div q(x) dx = Z
Γ
q(x)· ν(x) dx,
2at least for me
where the vector ν(x) is the outward unit normal defined on the boundary Γ of Ω. We can now state the theorem of multidimensional integration by parts (or second Green’s Lemma) by recasting the one-dimensional integration by parts in the proper setting. Thus Green’s Lemma states that:
Z
Ω
u(x) div q(x) dx = Z
Γ
u(x)q(x)· ν(x) dx − Z
Ω
∇ u(x) · q(x) dx. (1.9)
To make the parallelism with the one-dimensional analogue, which helps remembering the theorem, we first look at eq. (1.8a). We observe that the term u(b)v(b)− u(a)v(a) can be interpreted as a boundary integral once we note that the “normal” in x = a is ν(a) =−1 and the “normal” in x = b is ν(b) = +1, and the “zero”-dimensional integral is just the evaluation of the integrand at the extremes of integration. Looking at eq. (1.9) we can see that q(x) can be interpreted as the primitive of div q(x) and∇ u(x) as the (multiindex) first derivative of u(x). Then we can put Green’s Lemma in words:
the integral of the product of two functions is given by the d− 1-dimensional integral of the product between the primitive of one of the two functions times the other one, everything projected along the outward unit normal to the d− 1-dimensional boundary of the domain, minus the integral over the domain of the product of the primitive function times the derivative of the other function.
We would like to stress here that we have assumed that the functions are infinitely differentiable. This is too strong an assumption, and we could have assumed that first derivatives of the functions exist and are finite for all x. This means that u and qimust be continuous functions of x (must be inC0(Ω)).
In the case they are not, we need to be careful in applying the chain rule. Moreover, numerically the chain rule is troublesome, and should be avoided. We will discuss this in later sections.
1.2 Classification of the PDEs
We first start our classification by looking at the “linearity” of the function F in eq. (1.3). We say that a PDE is linear if all the derivatives appear as linear combinations, i.e. if it has the form:
a0(x)u + a1(x)∇ u + . . . + ak(x)∇ku = X
|α|≤k
aα(x)∇αu = f (x)
If f = 0 the PDE is called homogeneous. A PDE is semilinear if only derivatives of order strictly smaller than k appear as nonlinear terms:
X
|α|=k
aα(x)∇αu + a0
∇k−1, . . . ,∇ku, u, x
= 0.
It is called quasilinear if the highest order derivatives are multiplied by functions of strictly lower order derivatives:
X
|α|=k
aα
∇k−1, . . . ,∇ku, u, x
∇αu + a0
∇k−1, . . . ,∇ku, u, x
= 0.
and it is fully nonlinear if the highest order derivatives appear nonlinearly in the equations.
Examples.
Laplace Equation
div∇ u = ∆u = uxx+ uyy = 0
This is an ubiquitous linear equation whose solution u is called potential function or harmonic function. In two dimensions, we can associate to a harmonic function u(x, y) a “conjugate”
harmonic function v(x, y) such that the Cauchy-Riemann equations are satisfied:
ux = vy uy =−vx
We can interpret the vector field (u(x, y),−v(x, y) as the velocity of an irrotational and incom- pressible fluid.
Wave equation
utt = λ2∆u
Again, this is a ubiquitous linear equation that governs the vibration of a string or the propaga- tion of waves in an incompressible medium (e.g., fluid waves, sound).
Maxwell equation
Et= curl H; µHt=− curl E; div E = div H = 0
where E = (E1, E2, E3) and H = (H1, H2, H3) are the electric and magnetic vector fields, respectively, in vacuum. Note that each component, Ej and Hj of E and H satisfy a wave equation with λ2 = 1/µ.
Linear Elastic waves
ρutt = µ∆u + (λ + µ)∇(div u),
where ρ is the density of the medium, u = {ui(x, y, z, t)} represents the displacement vector, and λ and µ are the Lam´e constants. Each component ui of u satisfies a fourth order linear equation:
∂2
∂t2 − λ + 2µ
ρ ∆ ∂2
∂t2 − µ ρ∆
ui = 0
It is easy to see that in the case of equilibrium (ut= 0) this equation reduces to the bi-harmonic equation:
∆2u = 0.
σ
FIGURE 1.1: Curve γ and local reference system
Minimal surface We want to find a surface z = u(x, y) with minimal area for a given contour. The function u satisfies the nonlinear equation:
(1 + u2y)uxx− 2uxuyuxy + (1 + u2x)uyy = 0.
Navier-Stokes
ut+ u· ∇ u = −1
p∇ p + ν∆u div u = 0
where u(x, y, z, t) is the velocity, p(x, y, z, t) is the pressure, ρ is the density and ν is the kinematic viscosity of the fluid. These nonlinear equations govern the movement of an incom- pressible viscous fluid.
1.2.1 Standard classification of linear PDEs
To start in our task of classification, assume for simplicity a 2-dimensional domain d = 2, and a constant coefficient second order PDE:
auxx+ buxy + cuyy+ e = 0. (1.10)
We look for a curve γ : R2 7→ R that is sufficiently regular and such that when we write the PDE along this curve it turns into and Ordinary Differential Equation (ODE). We write this curve in parametric form as γ(σ) (fig. 1.1) as follows:
γ = x = x(σ) y = y(σ)
Writing the above equations on a local reference system, we obtain:
dux
dσ = ∂ux
∂x dx
dσ + ∂ux
∂y dy dσ = uxx
dx dσ + uxy
dy dσ duy
dσ = ∂uy
∂x dx
dσ +∂uy
∂y dy
dσ = uxydx
dσ + uyydy dσ.
Writing uxx from the previous system and substituting it in eq. (1.10), we have:
uxy
"
a dy dx
2
− bdy dx+ c
#
−
adux
dx dy
dx + cduy
dx + edy dx
= 0.
This equation is a re-definition of the PDE on the curve γ(σ), or, in other words, the equation is satisfied on γ. Now we can choose γ so that the first term in square brackets is zero, obtaining an equation for uxand uy where only ordinary derivatives appear:
a dy dx
2
− bdy
dx+ c = 0.
We note that dy/dx is the slope of γ, which can then be obtained by solving the ODE:
dy
dx = b±√
b2− 4ac
2a . (1.11)
The solution of this ODE yields families of curves, that are called characteristic curves. Different families arise depending on the sign of the discriminant ∆ = b2 − 4ac. We then call the equations depending on this sign, obtaining the following classification:
• b2− 4ac < 0: two complex solutions: the equation is “elliptic”;
• b2− 4ac = 0: one real solution: the equation is “parabolic”;
• b2− 4ac > 0: two real solutions: the equation is “hyperbolic”.
Examples
Laplace equation
∆u = ∂2u
∂x2 +∂2u
∂y2 = 0
a = c = 1 b = 07→ b2 − 4ac < 0 is an elliptic equation;
Wave equation
∂2u
∂t2 −∂2u
∂x2 = 0 (1.12)
a = 1 b = 0 c =−1 7→ b2− 4ac > 0 is a hyperbolic equation;
diffusion or heat equation
∂u
∂t −∂2u
∂x2 = 0
a =−1 b = c = 0 7→ b2− 4ac = 0 is a parabolic equation.
1.2.2 Simple examples and solutions
We show in this paragraph some simple but clarifying examples of PDEs and their exact analytical solution. From these solutions we will extrapolate some typical characteristics of the solutions of PDEs.
Example 1.3. Find u : [0, 1]7→ R such that:
−u00= 0 x∈ [0, 1]
u(0) = 1;
u(1) = 0.
This is a elliptic equation. In this simple case the solution is obtained directly by integration between x = 0 and x = 1. We have:
u(x) = 1− x.
Example 1.4. Find u : [0, 1]7→ R such that:
−(a(x)u0)0 = 0 x∈ [0, 1] (1.13)
u(0) = 1;
u(1) = 0;
where the diffusion coefficient a(x) assumes the values:
a(x) =
(a1 if 0≤ x < 0.5 a2 if 0.5 < x≤ 1
Since a(x) > 0 for each x ∈ [0, 1] is is an elliptic equation. In this case the solution can be obtained by first subdividing the domain interval in two halves and integrating the equation in each subinterval:
u(x) =
(u1(x) = c11x + c12 x∈ [0, 0.5) u2(x) = c21x + c22 x∈ (0.5, 1].
We can see that the solutions are defined in terms of four constants. We need thus four equations. Two are given by the boundary conditions, but the other two are still missing. One natural condition is the request that u(x) be continuous (at least C0([0, 1])) in the domain [0, 1]. The second condition can be determined by looking at the left-hand-side of eq. (1.13) and looking for existence requirement of this term. Before we discuss this requirement we note that we can define the “flux” of u(x) as q(x) =−a(x)u0(x). Hence, the requirement for the existence of the left-hand-side (as long as we do not use the product rule for the derivative of the flux) is that q(x) must be continuous for all x∈ [0, 1]
x u(x)
FIGURE 1.2: Solution of example 1.4 for a1 = 1 and a2 = 10.
(again the requirement here is q(x)∈ C0([0, 1]). This observation suggests the sought condition, that yield the following system of equations for the constants ci:
u1(0) = 0 u2(1) = 0;
u1(0.5) = u2(0.5) q1(0.5) = q2(0.5),
−a1(0.5)u01(0.5) =−a2(0.5)u02(0.5).
We note that the last condition physically means that the flux of the quantity u(x) that exits from the subdomain on the left of x = 0.5 enters the subdomain on the right of x = 0.5. It is a conservation statement. Solving the system, the solution becomes:
u(x) =
(1− a12a+a22x x∈ [0, 0.5]
2a1
a1+a2 − a12a+a12x x∈ [0.5, 1], shown in fig. 1.2 in the case a1 = 1 and a2 = 10.
Remark 1.5. The previous example shows that the differential equation with discontinuous coeffi- cients has a solution that is continuous but not differentiable: the gradient is discontinuous. On the other hand the flux is continuous, and thus more regular. We will use this fact in to properly define our numerical solution. This property, that can be also shown theoretically, is very important in ap- plications, and characterizes “conservation laws”. In other words, the partial differential eq. (1.13) represents the balance of the quantity u(x). This quantity can be thought of as mass, then the equation is a mass-balance equation, a temperature, in which case the equation is an energy conservation equa- tion, a fluid velocity, and then the equation is a force balance equation (first Newton law), etcetera.
The determination of the conservation properties of numerical discretization schemes is an active and important field of research in the case of highly variable diffusion coefficients.
We would like to remark that in the case of jumps in the diffusion coefficient we cannot use the product rule to expand the left-hand-side of eq. (1.13). In fact we cannot write the following:
−a(x)u00(x)− a0(x)u0(x) = 0
because both u00(x) and a0(x) do not exist for x = 0.5. However, the solution u(x) exists and is intuitively sound, i.e., without any singularity, although it does not possess a second derivative. Hence, the equation must be written exclusively as in eq. (1.13). In general, using the chain rule for derivative is numerically counterproductive even if the regularity of the mathematical objects allows it.
Example 1.6 (Poisson equation). Find u : [0, 1]7→ R such that:
−u00= f (x) x∈ [0, 1] (1.14)
u(0) = u(1) = 0 (1.15)
with
f (x) =
(1 if x = 0.5, 0 otherwise..
This is an elliptic equation. The solution if this problem can be found by means of Green’s functions and is given by:
u(x) = (1
4(1− x) if 0 ≤ x ≤ 0.5,
1
4(x− 1) if 0.5≤ x ≤ 1. (1.16)
This solution is continuous but it has a piecewise constant first derivative with a jump in x = 0.5.
Hence the second derivative u00(x) does not exists in the midpoint. This seems a contradiction as in this case the left-hand-side of eq. (1.14) does not exists for all x ∈ [0, 1]. However, the solution u(x) given in eq. (1.16) in terms of the integral of the Green’s function is mathematically sound.
Thus we need to define a more “forgiving” formulation, whose solution can have discontinuous first derivatives. This is the role of the so called “weak” formulation to be seen in the next sections.
Example 1.7. Transport equation.
Given a vector field of constant velocity λ > 0, find the function u = u(x, t) such that:
ut+ λux = 0, (1.17a)
u(x, 0) = f (x). (1.17b)
-
x t
x− λt = ξ
ξ
−λx
6
t = t1
- b
b b
b b
bb b
b b
b b
bb
u
x
x x + λt
t = 0
6
FIGURE 1.3: Left panel: characteristic lines for eq. (1.17a) in the (x, t) plane. Right panel: graph of the solutionu(x, t) at t = 0 and t = t1 > 0 in the (u, x) plane. The solution is a wave with shape given byf (x) (a line in this case) that propagates towards the right with speed λ.
The characteristic curve is a line in the plane (x, t) given by:
x− λt = const = ξ.
Along this line the original eq. (1.17a) becomes:
du dt = ∂
∂tu(ξ + λt, t) = λux+ ut = 0.
Hence, the solution u is constant along a characteristic curve and this constant is determined by the initial conditions eq. (1.17b):
u(x, t) = f (ξ) = f (x− λt). (1.18)
At a fixed time t1 the solution is given by the rigid translation of the initial condition (f (x)) by a quantity λt1, as shown in fig. 1.3 (right panel).
Example 1.8. Advection (or convection) and diffusion equation (ADE).
Find the function u(x, t) : [0, T ]× R 7→ R such that:
∂u
∂t = D∂u2
∂x2 − v∂u
∂c, (1.19a)
u(x, t) = 1 for x = 0, (1.19b)
u(x, t) = 0 for x7→ ∞, (1.19c)
u(x, 0) = 0 for t = 0 and x > 0, (1.19d)
u(x, 0) = 1 for t = 0 and x = 0. (1.19e)
The solution is given by [3]:
u(x, t) = 1 2
erfc x− vt 2√
Dt
+ exp
vx Dt
erfc x + vt 2√
Dt
, where the function erfc is the complementary error function.
1.2.3 Conservation laws
From the physical point of view, the problems that we are facing are related to the principle of con- servation. For example the equilibrium of an elastic string fixed at the end points and subject to a distributed load is governed by an equation that determines the vertical displacement u(x) of the points ox of the string and its tension stresses σ(x), once the load g(x) and the elastic characteristics of the string E (Young’s modulus) are specified. The problem is written as:
σ(x) = Eu0(x) Hook’s law;
−σ0(x) = g(x) Elastic equilibrium;
u(0) = u(1) = 0 Boundary conditions.
Another interpretation of the same problem can be thought of as u(x) being the temperature of a rod subjected to a heat source g(x). In this case the symbol k is generally used in place of E to identify the thermal conductivity of the rod material and q(x) is the heat flux. The model thus is written as:
q(x) =−ku0(x) Fourier’s law; (1.20a)
q0(x) = g(x) Energy conservation; (1.20b)
u(0) = u(1) = 0 Boundary conditions. (1.20c)
The same equation can be thought as governing the diffusion of a substance dissolved in a fluid. In this case we talk about Fick’s law, concentration u(x). diffusion coefficient k, solute mass flux q(x).
Yet another interpretation of the same equation is the flow of water in a porous material. We talk then about Darcy’s law. More in general, we can say that all these equations represent a conservation principle. In fact, eq. (1.20a) represents the conservation of momentum deriving from Newton second law (F = ma), while eq. (1.20b) states the conservation of the energy of the system.
All these problems are equations written in “divergence form” or in conservative form. For example, consider the advection-diffusion equation (eq. (1.19a)). From the physical point of view, our solution function u represents the density of the conserved quantity. Thus we can introduced the density flux of this quantity as:
~
q =−D ∇ u + ~vu,
where the first term on the right-hand-side represents the diffusive flux and the second term represents the advective flux (the quantity u is transported by the velocity ~v and at the same time is diffused).
Equation (1.19a) can then be re-written as:
∂u
∂t + div ~q = f (x).
The first term represents the variation in time of the mass of this quantity. The second term represents the variation in space. Integrating the above equation in a subset U ⊂ Ω of the domain we have:
Z
U
∂u
∂t + div ~q dx = Z
U
f (x),
and assuming the boundary of U to be smooth, we can apply the divergence theorem:
Z
U
∂u
∂t dx + Z
∂U
~
q· ν ds = Z
U
f (x).
We recognize the classical mass conservation principle:
rate of change = inflow-outflow
Note that from a purely mathematical point of view, writing the equation in divergence form has no formal advantage with respect to any other alternative formulation. However, this is not true for the numerical formulation, in which the divergence form is always to be preferred.
1.3 Well posedness and continuous dependence of solutions on the data
The question of finding a solution to a PDE rests on the definition of solution. We can state that a solution is a function that satisfies the equation and has certain regularity properties3. However, the answer to the question what is a solution can be tricky. For a clear account and several interesting examples see [4]. We report here a few remarks that are useful for the developments and analysis of numerical methods.
We talk about a “classical” solution of a k-th order PDE to indicate a function that satisfies the PDE and the auxiliary conditions and that is k times differentiable. This is an intuitive requirement so that the derivatives that appear in the expression of the PDE can be formally calculated without
3This last request has to be made to avoid trivial and non interesting solutions.
worrying about singularities. However, this notion is often too restrictive, and there may be functions that are less regular that indeed satisfy the PDE and the auxiliary conditions. Moreover, by this strong regularity requirements, we may restrict the search of solutions only to cases that have enough regularity of the auxiliary conditions and of the data of the problem (e.g. the coefficients of the PDE). Thus we usually resort to a less restrictive definition of a solution, which is called a “weak”
solution. Thus we need to change the formulation of the PDE to accommodate this lower regularity requirement, maintaining at the same time the physical notion of the process that lead to the PDE.
Remark 1.9. Example 1.4 gives an instance of the application of this concept: the global solution, i.e., the solution over the entire domain I = [0, 1], is continuous but its derivative is not. Thus a classical solution to the problem does not exist, but a “weak” solution can be defined by appropriately relaxing the continuity conditions of the search space (the space of functions that are candidate solutions).
In any case, it is intuitive to look for solutions that are unique. There is certainly no hope to be able to find numerically a solution that is not unique. In fact, any computational algorithm in this case would never converge and would oscillate continuously among the several solutions of the problem. But uniqueness is not sufficient. We also require the notion of “continuous dependence of the solution form the data of the problem”. In essence, we require that, if for example some coefficients are changed slightly, then the solution changes slightly. This notion is useful for two important reasons.
First, in a computer implementation of any algorithm there is no hope to be able to specify a coefficient (which is a real number) with infinite accuracy. Next, and probably more importantly, uncertainties in physical constants or functions are intrinsically present in any model of a physical process. This uncertainty results in values of, e.g., boundary conditions or forcing functions that are not known precisely. But it is highly desirable that our mathematical model governing the physics be relatively insensitive to these uncertainties. This is reflected within the concept of well-posedness that we can make a little bit more formal by stating the following:
Definition 1.10 (Well posedness). Given a problem governed by a k-th order PDE:
F (x, u, ∂u, . . . , ∂ku, Σ) = 0
where Σ denotes the set of the data defining the problem, we say that this problem is well posed if:
1. the solution u exists;
2. the solution u is unique;
3. the solution u depends continuously on the data, i.e., if one element σ ∈ Σ is perturbed by a quantity δ, the corresponding solution ˜u to the perturbed problem is such thatk˜u − uk ≤ L kδk.
Note that this is not a very precise statement, as we need to specify what we mean with the symbol k·k. But this definition depends on the functions with which we are dealing, and thus it must be analyzed and specified for each problem.
These three requirements are obvious, especially in the context of numerical solution of a mathemat- ical problem. The first requirement states the incontrovertible fact that we cannot possibly think of finding a numerical approximation of the solution if this does not exist. The second statement asserts that the solution is unique as a function of the data (for a given set of data). No numerical scheme can approximate simultaneously two different solutions to the same problem, so we can only hope to find one of them, but in general we get a numerical solution that oscillates between the two and does not make sense. If there are more than two solutions the situations is obviously worse. Finally, the last condition is the most important one. In practical applications we are not so much concerned with the PDE itself but mainly with its solution, which generally gives the so called “state” of the system.
As such, a solution of the PDE is “physically” meaningful if small changes in the data (the parame- ters) of the problem cause small changes to the solution. In other words, a mathematical model of a physical phenomenon in general will not be useful if small errors in the measured data would lead to drastically different solutions.
As an example, consider the PDE of equation eq. (1.1) with the auxiliary conditions eq. (1.2). It is a well posed problem as it satisfies all the three conditions above. However, assume that we are at steady state, i.e., ∂u/∂t = 0, there are no source/sink, i.e., f = 0, and that only Neumann BCs are imposed, i.e. the only auxiliary condition present in our problem is eq. (1.2c). Then, any constant function u(x, t) = C is a solution of the problem, independently of the value of C. Thus the solution is not unique, and we cannot possibly hope to find it numerically.
In practice, the question of whether a PDE problem is well posed can depend also on the auxiliary conditions that we impose. Generally, if these auxiliary conditions are too many then a solution may not exist. If they are too few, then the solution may not be unique. And they must be “consistent”
with the PDE or the solution may not depend continuously on the data. Thus the question of well- posedness can be difficult to assess.
1.3.1 Ill-conditioning and instability
Two more concepts that are related to well-posedness need to be clearly stated when we move from the continuous setting to the discrete (numerical) setting. The first we would like to discuss is “ill- conditioning”. The “condition” of a problem is a property of the mathematical problem (not of the numerical scheme used to solve it) and can be stated intuitively as follows:
Definition 1.11 (Ill-conditioning). A mathematical problem is said to be ill-conditioned if small per- turbations on the data cause large variations of the solution.
The definition is problem specific, but a simple example related to linear algebra can be illuminating.
Example 1.12. Consider the following 2× 2 system of linear equations:
3x + 2y = 2 (1.21)
2x + 6y = −8. (1.22)
-7 -5 -3 -1 1 3 5 x
-4 -2 0 2 4 y
-6 -4 -2 0 2 4 6
-4 -2 0 2 4 y=-3x/2+1
y=-x/3-4/3
δ ε∼δ
-2 -1 0 1 2 3 4
x
-6 -4 -2 0 2 4 y
-2 -1 0 1 2 3 4
-6 -4 -2 0 2
4 1 2
ε>>δ
δ
FIGURE 1.4: Geometric interpretation of a “well-conditioned” (left) and an “ill-conditioned” linear system (right).
The mathematical problem can be stated as follows:
Problem 1.13. find the pair of real values (x, y) such that examples 1.12 and 1.12 are satisfied simul- taneously.
The solution to this problem is evidently P = (x, y) = (2,−2). We can rewrite the linear system as:
y = −3
2x + 1 (1.23)
y = −1 3x− 4
3. (1.24)
This reformulation, allows to change the problem into an equivalent formulation:
Problem 1.14. find the point P = (x, y) ∈ R2 that represents the intersection between the two lines identified by examples 1.12 and 1.12 (see fig. 1.4).
Now we want to analyze the conditioning of this problem. To do this we specify a small perturbation to the data of our problem and look at how its solution changes. In our case we can, for example, change the right hand side of the second equation by a quantity δ, yielding a downward translation of the line (fig. 1.4, left). The point of intersection between the two lines has now moved by a quantity
≈ δ. This problem is well-condition and the ratio /δ measures somehow the conditioning of our problem.
Now, if the two lines have almost equal slopes, the situation is different (fig. 1.4, right). A small perturbation δ to one if the right hand side values yield a large movement of the solution (the point of
intersection), by a quantity δ. The conditioning is measured again by the quantity /δ which is now much larger than one. The problem is thus “ill-conditioned”.
We note that both problems are actually “well-posed” as they admit a unique solution which is con- tinuously dependent upon the data. But the numerical solution may loose accuracy.
The second concept is called stability. Unlike conditioning, stability is a property of the numerical scheme used to solve a mathematical problem.
Definition 1.15 (Stability). A numerical scheme is stable if errors in initial data remain bounded as the algorithm progresses.
As an example, consider the following numerical algorithm given by the linear recursion:
u(k)= Au(k−1), k = 1, 2, . . .
where u(k) ∈ Rn, A is a constant n× n matrix, and the recursion is initiated with a given (possibly arbitrary) initial guess u(0). The representation u(0)h of the values of u(0) in the computer is not exact, so the actual algorithm involves the numerical approximation u(k)h :
u(k)h = Au(k−1)h , k = 1, 2, . . . (1.25)
Stability of the algorithm requires that the errors with which we represent u(0)h are not magnified by the algorithm process. More formally, we define the error as e(k) = u(k)− u(k)h , k = 1, 2, . . .. From this last equation we have that u(k) = u(k)h + e(k), and after substitution in eq. (1.25) we obtain the error propagation equation:
e(k) = Ae(k−1).
Stability of the scheme is achieved if the norm of the error remains bounded as k increases, i.e. (using compatible norms):
e(k) ≤
Ae(k−1)
≤ kAk
e(k−1)
≤ kAkk e(0)
which implieskAk ≤ 1.
Finally we would like to mention the following famous and completely general result known as the Lax-Richtmeyer equivalence theorem:
Theorem 1.1 (Lax-Richtmeyer equivalence theorem). A consistent numerical scheme for the solution of a well posed problem is convergent if and only it is stable.
2 Hyperbolic Equations
An intuitve, although empirical, definition of a hyperbolic equation is as follows. Given x ∈ Rd, the PDE
F (x, u,∇ u) = 0
is hyperbolic at the point x if it is possible to transform it into an Ordinary Differential Equa- tion (ODE) with respect to one of the d variables (say xd) as a function of the remaining ones x1, x2, . . . , xd−1.
2.1 Some examples
Notice that in the above statement one of the components of the vector x is take as time (we will do this often in the sequel). Thus, for example, we can set t := x1 and x := x2, and consider the simple one-dimensional equation:
ut+ ux = 0.
This equation can be transformed into an ODE if we write it along the lines (called “characteristics”):
ξ = x− t.
To see this, we make a simple change of variable x = ξ + t, so that we can write u(x, t) = u(x(t), t) as:
u(x, t) = u(ξ + t, t); x = ξ + t; dx
dt = 1; dx dξ = 1, so that:
d
dtu(ξ + t, t) = ∂u
∂x dx dt + ∂u
∂t = ut+ ux= 0.
From this we see that our original equation is an ODE with respect to time t as a function of space, x (i.e. along the characteristic lines of equation x = ξ + t). Thus, we are mainly interested in looking at solutions of a Cauchy problem, although we will be looking at the effects of boundary conditions as well. However, it is intuitive to think that any auxiliary condition on the boundary can be transformed into an auxiliary condition at t = 0, and vice-versa. Another picture for this statement can be envisioned by stating that we can revert the sign of the time variable, and go backwards in time, i.e., we can find the solution at an earlier time given the solution at a later time. We will see that indeed this is the case, contrary to e.g. a parabolic PDE. In other words, a hyperbolic equation governs a reversible phenomenon, while a parabolic equation governs a irreversible (or dissipative) phenomenon.
t
x
−x λ
ξ
x− λt = ξ
y
x∗ x t = 0
x∗+ λt1
t = t1
FIGURE 2.1: Characteristic line for equation (2.1) in the (x, t) plane (left). Graph of the solution u(x, t) at t = 0 and t = t1 in the (u, x) plane. A wave of form f (x) (here f (x) = e−a(x−b)2) propagates to the right with speedλ without changing shape.
To study the behavior of a hyperbolic equation, we need to study the geometry of the solution function in the space of its dependent variables, i.e., z = u(t, x), which represents a surface in Rd+1, if x∈ Rd. This will help us acquiring an understanding of the type of solutions we may expect from a hyperbolic partial differential equation. This is what we are trying to do in the next few sections We start this task with some simple examples.
2.1.1 The transport equation
We look at the simplest PDE that one can devise, i.e., the transport equation. Let us consider the following problem (the same problem already seen in example 1.7):
Problem 2.1. Given a constant velocity vector field q(x)(q1(x), . . . , qd(x)) in a domain Ω⊂ Rd, find a function u = u(x, t) such that
ut+ q· ∇ u = 0, in Rd× (0, ∞) (2.1a)
u(x, 0) = f (x), (2.1b)
Physically, this model represents the motion of particles (solutes, sediments in a fluid) moving with speed q. It is a conservation law. Indeed:
div(u(t, x)) = ∂u
∂t + div(qu) = ∂u
∂t + q· ∇ u (if div q = 0, i.e., q is a conservative field).
In the one-dimensional case d = 1, the velocity field is one-dimensional λ = q1. In this case the characteristic curve is a line in the plane (x, t) given by:
x− λt = const = ξ. (2.2)
Along this line the equation takes the form:
du dt = ∂
∂tu(ξ + λt, t) = λux+ ut = 0.
Thus our solution u is constant along the characteristic line. The constant is determined by the initial condition (2.1b):
u(x, t) = f (ξ) = f (x− λt). (2.3)
We want to see how this general solution can be visualized geometrically. First note that equation (2.3) provides the function u(x, t) in terms of the initial condition (2.1b). The solution at any point (x, t) depends only on the value of f at ξ = x− λt, the intersection of the characteristic line with the x-axis (Figure 2.1, left). We remark that the x-axis is the initial line, i.e., the line at t = 0. We say that the “domain of dependence” of u(x, t) on the initial values f (x) is given by the single point ξ. The initial values completely determine the solution. They influence the solution only at the points of the characteristic line.
We can look at the graph of the solution at fixed times. Thus, given the initial condition u(x, 0) = f (x), how does the solution at t = t1, i.e., u(x, t1), look like? It is easy to verify that:
u(x, 0) = u(x + λt1, t1) = f (x).
This implies that the solution at every fixed time time t is just the rigid translation of the initial condition along x by the quantity λt. Hence, u(x, t) represents a wave with spatial form given by f (x) that moves to the right (towards increasing x) with speed λ (Figure 2.1, right).
2.1.2 The second-order wave equation
As a classical example of a hyperbolic equation, we consider the second order (one-dimensional) wave equation (1.12). To better understand the geometrical behavior of the solution, we transform this equation into a system of first order partial differential equations. To this aim, let v = ∂u/∂t and w = ∂u/∂x. Then we have:
∂v
∂t = ∂w
∂x
∂w
∂t = ∂v
∂x.
We can write the previous system in matrix form letting our unknown function u be the vector u = (v, w):
∂
∂t
v w
= 0 1 1 0
∂
∂x
v w
,
kt 1t 0t mt m−1tk−1tk+1
xj
x1
x0 x
x n
x n−1
j−1 x
j+1
t
FIGURE 2.2: Subdivision of the plane (x, t) in a finite difference grid.
or, using an abbreviated notation for the derivative ua = ∂u/∂a:
ut= Aux. (2.4)
We note that matrix A is real, symmetric but not positive definite. It has two distinct real eigenvalues λ1 =−1 and λ2 = 1 and corresponding normalized eigenvectors e1 = √1
2[1,−1]T and e2 = √1
2[1, 1]T. In all generality, equation (2.4) represents a general system of first order PDEs. We say that such a system is hyperbolic if matrix A is diagonalizable and has real eigenvalues. This is obviously always true if A is also symmetric. If the real eigenvalues are distinct then the PDE is “strictly” hyperbolic.
2.1.3 Simple finite difference discretization
We now proceed with the simplest numerical approach for solving (2.1) in the case d = 1, in which case we write q = λ, which is assumed positive. Hence, our model problem is:
ut+ λux = 0 u(x, 0) = u0(x) (λ > 0)
We use the finite difference method, that is we use various incremental ratios to approximate the time and space derivatives. To this aim, we subdivide the temporal and spatial intervals uniformly with subintervals of dimensions ∆t and ∆x, respectively, whereby xj = x0 + j∆x, j = 0, . . . , n and
tk = t0 + k∆t, k = 0, . . . , m and denote by ukh,j = u(xj, tk) the numerical approximation of the solution at the grid point (j, k) (Figure 2.2). We can approximate the temporal and spatial derivatives using simple incremental fractions with size ∆x or ∆t or even 2∆x or 2∆t. We obtain the following:
ut(xj, tk)≈ uk+1h,j − ukh,j
∆t ut(xj, tk)≈ uk+1h,j − uk−1h,j
2∆t ux(xj, tk)≈ ukh,j+1− ukh,j
∆x ux(xj, tk)≈ ukh,j − ukh,j−1
∆x ux(xj, tk)≈ ukh,j+1− ukh,j−1
2∆x
We can then use first forward discretizations of the first derivatives both in space and time to write the following difference equation:
uk+1h,j − ukh,j
∆t =−λukh,j+1− ukh,j
∆x
which can be written by setting µ = λ∆t/∆x as:
uk+1h,j = ukh,j − µλ ukh,j+1− ukh,j = (1 + µλ)ukh,j − µλukh,j+1 (2.5) It is easy to see that this method is consistent, i.e., substitution of the real solution u in place of the numerical solution uh,yields a residual that goes to zero as ∆t and ∆x tend to zero. This is readily obtained using the following Taylor series expansions:
u(xj+ ∆x, tk) = u(xj, tk) + ux(xj, tk)∆x + uxx(xj, tk)∆x2
2 +O ∆x3 u(xj, tk+ ∆t) = u(xj, tk) + ut(xj, tk)∆t + utt(xj, tk)∆t2
2 +O ∆t3
and substituting the real solution u(xj, tk) in place of ukh,j into eq. (2.5) we can calculate the residual (local truncation error) as:
τh(xj, tk) =u(xj, tk) + ut(xj, tk)∆t + utt(xj, tk)∆t2
2 +O ∆t3
− (1 + µλ)u(xj, tk) + µλ
u(xj, tk) + ux(xj, tk)∆x + uxx(xj, tk)∆x2
2 +O ∆x3
=utt(xj, tk)∆t2
2 +O ∆t3 + µλ
uxx(xj, tk)∆x2
2 +O ∆x3
, (2.6)
which shows that the local error tends to zero with the discretization size ∆t and ∆x. Since this is a local error (betweeen two mesh nodes and two time steps), the global error is one order less, i.e.,
O (∆t + ∆x). Other schemes can be devised and analyzed. We can write the following:
uk+1h,j =ukh,j − µλ ukh,j+1− ukh,j
forward Euler - downwind
uk+1h,j =ukh,j − µλ ukh,j− ukh,j−1
forward Euler - upwind
uk+1h,j =ukh,j − µλ
2 ukh,j+1− ukh,j−1
forward Euler - central
uk+1h,j =uk−1h,j − µλ ukh,j+1− ukh,j−1
leap-frog
uk+1h,j =ukh,j+1+ ukh,j−1
2 − µλ
2 ukh,j+1− ukh,j−1
Lax-Friedrichs
However, convergence towards the real solution is not guaranteed by consistency alone, as stated by the Lax-Richtmeyer equivalence theorem (theorem 1.1), but requires that the scheme be stable. We need to define stability in the case of our FD approximations, but also we need to properly define con- vergence, although the latter one is a more intuitive concept. We will not dwelve into these issues, at the heart of numerical analysis, but give only an intuitive and practical explanation of these concepts.
Example 2.2. Solve the transport problem:
ut+ ux =0 u(x, 0) =
(1− | x | if | x |≤ 1,
0 otherwise.
using the above schemes.
Algorithm 1 Forward-Backward finite difference method for one-dimensional transport equation with constant velocity. We use periodic boundary conditions and assume the velocity of propagation is positiv λ > 0. This implies that periodic BC must be implemented for the first node of the FD mesh and nothing is imposed on the last node.
1: Set λ > 0; a, b (interval boundaries); ∆x; T (final time); ∆t;
2: N = (b− a)/∆x; M = T/∆t; µ = λ∆t/Dx.
3: for j = 1, 2, . . . , N do
4: uh,j = u0(a + (j− 1) ∗ ∆x);
5: end for
6: for k = 1, . . . , M do
7: uh,1 = (1− λµ)uh,1+ λµuh,N +1
8: for j = 2, . . . , N do
9: uh,j = (1− λµ)uh,j+ λµuh,j−1
10: end for
11: end for
We report in fig. 2.3 (top row) the results obtained at t = 0.1 and t = 0.2 using the leap-frog (left panel) and the forward-forward (right panel) schemes. We see immediately that the leap-frog shows