Notes on theory and numerical methods for hyperbolic conservation laws

(1)

Notes on theory and numerical methods for hyperbolic conservation laws

Mario Putti

Department of Mathematics – University of Padua, Italy e-mail: [email protected]

January 19, 2017

1 Partial differential equations

In this notes we will look at the numerical solution for partial differential equations. We will be mainly concerned with differential models stemming from conservation laws, such as those arising from force conservations i.e., second Newton’s law F = ma, such as de Saint-Venant equations, governing the equilibrium of a solid, or the Navier-Stokes equations, governing the dynamics of fluid flow. These equations are also called “equations in divergence form”, to identify the fact that the divergence of a vector translates in mathematical terms the conservation of the flux represented by that vector field.

As an example, let us consider the advection-diffusion equation (ADE), that governs the conservation of mass of a solute moving within the flow of the containing solvent. A typical application is the transport of a contaminant by a water body moving with laminar flow. The flow of the solvent is given by the vector (velocity) field λ, and the solute is undergoing chemical (Fickian) diffusion with a diffusion field D(x). We remark that if the density of the solvent is constant, then mass conservation is equivalent to volume concentration, and thus density does not appear in the equations. The mathematical model is then given by:

∂u

∂t = div (D∇ u) − div(λu) + f in Ω∈ R^d (1.1)

where the equation is defined on a subspace of the d-dimensional Euclidean space R^d(d = 1, 2, or 3), the function u(x, t) : Ω× [0 : T ] 7→ R represents the concentration of the solute (mass/volume of solute per unit mass/volume of solvent), t is time, div = P

i∂/∂x_i is the divergence operator, D is the diffusion coefficient, possibly a second order tensor, and∇ = {∂/∂xⁱ} is the gradient operator. A problem is mathematically well posed if i) it has a solution; ii) the solution is unique; iii) the solution depends continuously on the data of the problem. For a PDE problem to be well-posed we need initial and boundary conditions. So we let the domain boundary Γ = ∂Ω be the union of three non overlapping sub-boundaries such that Ω = ΓD∪ Γ^N ∪ Γ^C, so that we:

u(x, 0) = u_o(x) x∈ Ω, t = 0 (initial conditions) (1.2a) u(x, t) = g_D(x) x∈ Γ^D, t > 0 (Dirichlet BCs) (1.2b)

−D ∇ u(x, t) · ν = q^N(x) x∈ Γ^N, t > 0 (Neumann BCs) (1.2c) λu− D ∇ u(x, t) · ν = q^C(x) x∈ Γ^C, t > 0 (Cauchy BCs) (1.2d) where ν is the outward unit normal defined on Γ. Formally, under some regularity assumption and the hypothesis that D never vanishes, this is called a “parabolic” equation. The term “parabolic” is used to classify partial differential equations (PDEs) on the basis of certain qualitative properties of the solution. This can be done relatively easily with linear PDEs, and becomes more complicated for nonlinear PDEs.

(3)

1.1 Mathematical preliminaries and notations

We start this discussion by giving some general definitions. All our objects or equations will be defined in a domain Ω ⊂ R^d, and we will indicate with x ∈ Ω a point of the d-dimensional space (d=1, 2, or 3) as a vector with d coordinates x = (x₁, x₂, . . . , x_d)^T with respect to a fixed Cartesian (orthonormal) reference system. ¹ Note that, for simplicity, we assume vectors to be 1-dimensional collections of numbers ordered columnwise, and will use the transpose operation as above to indicate a row vector (see appendix D). If d = 2 we will often identify x = x₁ and y = x₂. A function u(x) is a spatially-variable real-valued scalar function taking values in Ω ⊂ R^d and returning real values in R. A vector-valued function q(x) is a collection of functions ordered in a column vector. Thus we have:

u : Ω7→ R q : Ω7→ R^m; q(x) =





 q₁(x)

... q_m(x)





.

Typically, physical quantities such as fluxes, velocities, etc. are defined in R^d, i.e., in the above we have m = d.

Definition 1.1 (Derivatives). 1. first order partial derivative:

u_x_i = ∂u(x)

∂x_i = lim

h→0

u(x + he_i)− u(x) h

where ei is the i-th coordinate vector, i.e., the vector with the i-th component equal to one and all the other ones equal to zero.

2. second order partial derivatives:

u_x_i_x_j = ∂²u(x)

∂x_i∂x_j,

with obvious extensions for higher order.

3. Multiindex Notation.

(a) Given a vector α = (α₁, . . . , α_d)^T, where α_i ≥ 0 (the multi-index of order | α |= α¹+ . . . + α_d), the α-th derivative is:

∇^αu(x) := ∂^|α|u(x)

∂x^α₁¹· · · ∂x^αd^d

= ∂_x^α₁¹· · · ∂x^αd^d.

1We would like to remark that formallyΩ is an open subset of R^d, and its closure is indicated with ¯Ω = Ω d Γ, where Γ = ∂Ω is the boundary of Ω, which forms a subset of R^d of co-dimension 1, i.e.,d− 1 (for d = 2 the boundary is a curve, ford = 3 it is a surface). We will not use or make reference to these technicalities, as empirical intuition can be sufficiently supported without them.

(4)

(b) if k is a nonnegative integer, then the set of all partial derivatives of order k is given by:

∇^ku(x) :={∇^αu(x) :| α |= k} .

Giving an order to the partial derivatives of the set above, we can think of∇^ku(x) as a point (vector) in R^d^k, and we can define the length of this point as:

| ∇^ku(x)|=



 X

|α|=k

| ∇^αu|²





1 2

.

(c) Special cases. If k = 1 we order the elements of ∇ u in a vector thus leading to the definition of the “gradient” operator:

∇ u :=





 u_x₁

... ux_d







Thus we can regard∇ u as a point of R^d.

4. The divergence of a vector function q(x) : Ω⊂ R^d7→ R^das:

div q(x) :=∇ ·q(x) = ∇^T q(x) = ∂q₁(x)

∂x_i + . . .∂q_d(x)

∂x_d For example, for d = 2 we have:

∇ u(x) = u_x₁ u_x₂

=

∂u

∂x1

∂u

∂x2

div q(x) =∇ ·q(x) = ∂q₁(x)

∂x₁ + ∂q₂(x)

∂x₂

We would like to stress here that we will generally see the gradient as an operator acting on a scalar-valued function and the divergence as an operator acting on a vector-valued function.

Note that the divergence and the gradient operators are intimately related by the divergence (or Gauss’ or Stokes) theorem, which we will discuss later on.

5. For k = 2 we organize the elements of∇²u(x) in a matrix, called the Hessian matrix:

∇²u(x) =∇ ∇^T u(x) =







u_x₁_x₁ · · · ux1xd

. ..

u_x_d_x₁ · · · uxdxd







Unfortunately, the symbol∇² should not be confused with the Laplacian operator, often indicated by∇².

(5)

6. Laplacian operator: The Laplacian operator is given by:

∆u(x) = div∇ u(x) = ∇^T ∇ u(x) = Tr ∇²u(x) =

d

X

i=1

u_x_i_x_i.

where the operator Tr (A) is the trace of a matrix A (see appendix D).

Definition 1.2 (Partial differential equation). A k-th order partial differential equation (PDE) can be written as:

F (x, u(x),∇ u(x), ∇²u(x), . . . ,∇^ku(x)) = 0, x∈ Ω. (1.3) where formally F : Ω× R × R^d. . .× R^d^k−1× R^d^k 7→ R. For example, for k = 2 and d = 3 we have:

F (x, y, z, u, u_x, u_y, u_z, u_xx, u_xy, u_yy, u_xz, u_zz, u_yz) = 0.

The solution of this equation is a function u(x) : Ω ⊂ R^d 7→ R that satisfies eq. (1.3) and possibly some other boundary conditions.

The problem we are concerned with is finding solutions (either approximate or exact) to eq. (1.3). If F is a linear function of u and its derivatives, then the equation is called linear, and, assuming d = 2, it can be written as:

a(x, y) + b(x, y)u + c(x, y)u_x+ d(x, y)u_y+ e(x, y)u_xx+ f (x, y)u_yy+ g(x, y)u_xy = 0. (1.4) The order of a PDE is the order of the derivative of maximum degree that appears in the equation.

Thus, in the previous case the order is 2. Typical examples are:

∆u = ∂²u

∂x² + ∂²u

∂y² = 0 2^oorder (Laplace equation)

∂u

∂t + ∂u

∂x = 0 1^oorder (transport or convection equation)

∂u

∂t − ∂²u

∂x² = 0 2^oorder (diffusion equation).

Note that we often identify one of the variables as time, in which case, setting x_d+1 = t, the solution u has domain given by Ω× [0, T ]:

u : Ω× [0, T ] 7→ R.

Obviously, we consider time as a positive real number.

(6)

1.1.1 The chain rule of differentation

We say that a function u(x) is smooth when it is infinitely differentiable, i.e., u(x)∈ C^∞(Ω). We will be using often the chain (product) rule of differentiation and its derived property, i.e., integration by parts. For d = 1, using u⁰ to indicate differentation, we can write:

(uv)⁰ = u⁰v + uv⁰ (1.5)

which is valid for all u and v inC^∞. By extension, we can think that the derivative of a scalar valued function in Ω as the gradient operator, i.e., the collection of all its partial derivatives. The chain rule is then:

∇(uv) = u ∇ v + v ∇ u. (1.6)

If we have a scalar function u : Ω 7→ R and a vector-valued function q : Ω 7→ R^d, we may form the product qu = (q1u, . . . , qdu). Applying eqs. (1.5) and (1.6) component by component, we obtain:

div(qu) = q· ∇ u + u div q. (1.7)

Note that this is a straight forward extension if we think that the gradient acts on scalar functions, and the divergence acts on vector functions.

1.1.2 Integration by parts

Recall the rule of integration by parts. It can be derived quickly from the chain rule. For d = 1, in fact, using eq. (1.5), we can write:

Z b a

u⁰(x)v(x) dx = Z b

a

u(x)v(x)0

dx− Z b

a

u(x)v⁰(x) dx

=

u(b)v(b)− u(a)v(a)

− Z b

a

u(x)v⁰(x) dx (1.8a)

In words, integration by parts states that the integral of the product of two functions (here u⁰(x) and v(x)) is equal to the product of the primitive of the first function multiplied by the other function evaluated at the extremes, minus the integral of the primitive times the derivative.

These results are easily extended in the multidimensional case. In this case, it is easy² to remember all these results by using the above definitions of gradient and divergence as vector operators. Recall again that the gradient acts on scalars and the divergence acts on vectors. We recall here the important Gauss’ (or Divergence) Theorem that states a conservation principle:

Z

Ω

div q(x) dx = Z

Γ

q(x)· ν(x) dx,

2at least for me

(7)

where the vector ν(x) is the outward unit normal defined on the boundary Γ of Ω. We can now state the theorem of multidimensional integration by parts (or second Green’s Lemma) by recasting the one-dimensional integration by parts in the proper setting. Thus Green’s Lemma states that:

Z

Ω

u(x) div q(x) dx = Z

Γ

u(x)q(x)· ν(x) dx − Z

Ω

∇ u(x) · q(x) dx. (1.9)

To make the parallelism with the one-dimensional analogue, which helps remembering the theorem, we first look at eq. (1.8a). We observe that the term u(b)v(b)− u(a)v(a) can be interpreted as a boundary integral once we note that the “normal” in x = a is ν(a) =−1 and the “normal” in x = b is ν(b) = +1, and the “zero”-dimensional integral is just the evaluation of the integrand at the extremes of integration. Looking at eq. (1.9) we can see that q(x) can be interpreted as the primitive of div q(x) and∇ u(x) as the (multiindex) first derivative of u(x). Then we can put Green’s Lemma in words:

the integral of the product of two functions is given by the d− 1-dimensional integral of the product between the primitive of one of the two functions times the other one, everything projected along the outward unit normal to the d− 1-dimensional boundary of the domain, minus the integral over the domain of the product of the primitive function times the derivative of the other function.

We would like to stress here that we have assumed that the functions are infinitely differentiable. This is too strong an assumption, and we could have assumed that first derivatives of the functions exist and are finite for all x. This means that u and q_imust be continuous functions of x (must be inC⁰(Ω)).

In the case they are not, we need to be careful in applying the chain rule. Moreover, numerically the chain rule is troublesome, and should be avoided. We will discuss this in later sections.

1.2 Classification of the PDEs

We first start our classification by looking at the “linearity” of the function F in eq. (1.3). We say that a PDE is linear if all the derivatives appear as linear combinations, i.e. if it has the form:

a₀(x)u + a₁(x)∇ u + . . . + ak(x)∇^ku = X

|α|≤k

a_α(x)∇^αu = f (x)

If f = 0 the PDE is called homogeneous. A PDE is semilinear if only derivatives of order strictly smaller than k appear as nonlinear terms:

X

|α|=k

a_α(x)∇^αu + a₀

∇^k−1, . . . ,∇^ku, u, x

= 0.

It is called quasilinear if the highest order derivatives are multiplied by functions of strictly lower order derivatives:

X

|α|=k

a_α

∇^k−1, . . . ,∇^ku, u, x

∇^αu + a₀

∇^k−1, . . . ,∇^ku, u, x

= 0.

and it is fully nonlinear if the highest order derivatives appear nonlinearly in the equations.

(8)

Examples.

Laplace Equation

div∇ u = ∆u = u^xx+ u_yy = 0

This is an ubiquitous linear equation whose solution u is called potential function or harmonic function. In two dimensions, we can associate to a harmonic function u(x, y) a “conjugate”

harmonic function v(x, y) such that the Cauchy-Riemann equations are satisfied:

u_x = v_y u_y =−v^x

We can interpret the vector field (u(x, y),−v(x, y) as the velocity of an irrotational and incompressible fluid.

Wave equation

u_tt = λ²∆u

Again, this is a ubiquitous linear equation that governs the vibration of a string or the propagation of waves in an incompressible medium (e.g., fluid waves, sound).

Maxwell equation

E_t= curl H; µH_t=− curl E; div E = div H = 0

where E = (E1, E₂, E₃) and H = (H₁, H₂, H₃) are the electric and magnetic vector fields, respectively, in vacuum. Note that each component, E_j and H_j of E and H satisfy a wave equation with λ² = 1/µ.

Linear Elastic waves

ρu_tt = µ∆u + (λ + µ)∇(div u),

where ρ is the density of the medium, u = {uⁱ(x, y, z, t)} represents the displacement vector, and λ and µ are the Lam´e constants. Each component u_i of u satisfies a fourth order linear equation:

∂²

∂t² − λ + 2µ

ρ ∆ ∂²

∂t² − µ ρ∆

u_i = 0

It is easy to see that in the case of equilibrium (ut= 0) this equation reduces to the bi-harmonic equation:

∆²u = 0.

(9)

σ

FIGURE 1.1: Curve γ and local reference system

Minimal surface We want to find a surface z = u(x, y) with minimal area for a given contour. The function u satisfies the nonlinear equation:

(1 + u²_y)u_xx− 2u^xu_yu_xy + (1 + u²_x)u_yy = 0.

Navier-Stokes

u_t+ u· ∇ u = −1

p∇ p + ν∆u div u = 0

where u(x, y, z, t) is the velocity, p(x, y, z, t) is the pressure, ρ is the density and ν is the kinematic viscosity of the fluid. These nonlinear equations govern the movement of an incompressible viscous fluid.

1.2.1 Standard classification of linear PDEs

To start in our task of classification, assume for simplicity a 2-dimensional domain d = 2, and a constant coefficient second order PDE:

auxx+ buxy + cuyy+ e = 0. (1.10)

We look for a curve γ : R² 7→ R that is sufficiently regular and such that when we write the PDE along this curve it turns into and Ordinary Differential Equation (ODE). We write this curve in parametric form as γ(σ) (fig. 1.1) as follows:

γ = x = x(σ) y = y(σ)

Writing the above equations on a local reference system, we obtain:

dux

dσ = ∂ux

∂x dx

dσ + ∂ux

∂y dy dσ = uxx

dx dσ + uxy

dy dσ du_y

dσ = ∂u_y

∂x dx

dσ +∂u_y

∂y dy

dσ = u_xydx

dσ + u_yydy dσ.

Writing uxx from the previous system and substituting it in eq. (1.10), we have:

uxy

"

a dy dx

2

− bdy dx+ c

#

−

adu_x

dx dy

dx + cdu_y

dx + edy dx

= 0.

(10)

This equation is a re-definition of the PDE on the curve γ(σ), or, in other words, the equation is satisfied on γ. Now we can choose γ so that the first term in square brackets is zero, obtaining an equation for uxand uy where only ordinary derivatives appear:

a dy dx

2

− bdy

dx+ c = 0.

We note that dy/dx is the slope of γ, which can then be obtained by solving the ODE:

dy

dx = b±√

b²− 4ac

2a . (1.11)

The solution of this ODE yields families of curves, that are called characteristic curves. Different families arise depending on the sign of the discriminant ∆ = b² − 4ac. We then call the equations depending on this sign, obtaining the following classification:

• b²− 4ac < 0: two complex solutions: the equation is “elliptic”;

• b²− 4ac = 0: one real solution: the equation is “parabolic”;

• b²− 4ac > 0: two real solutions: the equation is “hyperbolic”.

Examples

Laplace equation

∆u = ∂²u

∂x² +∂²u

∂y² = 0

a = c = 1 b = 07→ b² − 4ac < 0 is an elliptic equation;

Wave equation

∂²u

∂t² −∂²u

∂x² = 0 (1.12)

a = 1 b = 0 c =−1 7→ b²− 4ac > 0 is a hyperbolic equation;

diffusion or heat equation

∂u

∂t −∂²u

∂x² = 0

a =−1 b = c = 0 7→ b²− 4ac = 0 is a parabolic equation.

(11)

1.2.2 Simple examples and solutions

We show in this paragraph some simple but clarifying examples of PDEs and their exact analytical solution. From these solutions we will extrapolate some typical characteristics of the solutions of PDEs.

Example 1.3. Find u : [0, 1]7→ R such that:

−u⁰⁰= 0 x∈ [0, 1]

u(0) = 1;

u(1) = 0.

This is a elliptic equation. In this simple case the solution is obtained directly by integration between x = 0 and x = 1. We have:

u(x) = 1− x.

Example 1.4. Find u : [0, 1]7→ R such that:

−(a(x)u⁰)⁰ = 0 x∈ [0, 1] (1.13)

u(0) = 1;

u(1) = 0;

where the diffusion coefficient a(x) assumes the values:

a(x) =

(a₁ if 0≤ x < 0.5 a₂ if 0.5 < x≤ 1

Since a(x) > 0 for each x ∈ [0, 1] is is an elliptic equation. In this case the solution can be obtained by first subdividing the domain interval in two halves and integrating the equation in each subinterval:

u(x) =

(u₁(x) = c¹₁x + c¹₂ x∈ [0, 0.5) u₂(x) = c²₁x + c²₂ x∈ (0.5, 1].

We can see that the solutions are defined in terms of four constants. We need thus four equations. Two are given by the boundary conditions, but the other two are still missing. One natural condition is the request that u(x) be continuous (at least C⁰([0, 1])) in the domain [0, 1]. The second condition can be determined by looking at the left-hand-side of eq. (1.13) and looking for existence requirement of this term. Before we discuss this requirement we note that we can define the “flux” of u(x) as q(x) =−a(x)u⁰(x). Hence, the requirement for the existence of the left-hand-side (as long as we do not use the product rule for the derivative of the flux) is that q(x) must be continuous for all x∈ [0, 1]

(12)

x u(x)

FIGURE 1.2: Solution of example 1.4 for a₁ = 1 and a₂ = 10.

(again the requirement here is q(x)∈ C⁰([0, 1]). This observation suggests the sought condition, that yield the following system of equations for the constants c_i:

u₁(0) = 0 u₂(1) = 0;

u₁(0.5) = u₂(0.5) q₁(0.5) = q₂(0.5),

−a¹(0.5)u⁰₁(0.5) =−a²(0.5)u⁰₂(0.5).

We note that the last condition physically means that the flux of the quantity u(x) that exits from the subdomain on the left of x = 0.5 enters the subdomain on the right of x = 0.5. It is a conservation statement. Solving the system, the solution becomes:

u(x) =

(1− _a₁^2a_+a²₂x x∈ [0, 0.5]

2a1

a1+a2 − a1^2a+a¹2x x∈ [0.5, 1], shown in fig. 1.2 in the case a1 = 1 and a₂ = 10.

Remark 1.5. The previous example shows that the differential equation with discontinuous coefficients has a solution that is continuous but not differentiable: the gradient is discontinuous. On the other hand the flux is continuous, and thus more regular. We will use this fact in to properly define our numerical solution. This property, that can be also shown theoretically, is very important in applications, and characterizes “conservation laws”. In other words, the partial differential eq. (1.13) represents the balance of the quantity u(x). This quantity can be thought of as mass, then the equation is a mass-balance equation, a temperature, in which case the equation is an energy conservation equation, a fluid velocity, and then the equation is a force balance equation (first Newton law), etcetera.

(13)

The determination of the conservation properties of numerical discretization schemes is an active and important field of research in the case of highly variable diffusion coefficients.

We would like to remark that in the case of jumps in the diffusion coefficient we cannot use the product rule to expand the left-hand-side of eq. (1.13). In fact we cannot write the following:

−a(x)u⁰⁰(x)− a⁰(x)u⁰(x) = 0

because both u⁰⁰(x) and a⁰(x) do not exist for x = 0.5. However, the solution u(x) exists and is intuitively sound, i.e., without any singularity, although it does not possess a second derivative. Hence, the equation must be written exclusively as in eq. (1.13). In general, using the chain rule for derivative is numerically counterproductive even if the regularity of the mathematical objects allows it.

Example 1.6 (Poisson equation). Find u : [0, 1]7→ R such that:

−u⁰⁰= f (x) x∈ [0, 1] (1.14)

u(0) = u(1) = 0 (1.15)

with

f (x) =

(1 if x = 0.5, 0 otherwise..

This is an elliptic equation. The solution if this problem can be found by means of Green’s functions and is given by:

u(x) = (1

4(1− x) if 0 ≤ x ≤ 0.5,

1

4(x− 1) if 0.5≤ x ≤ 1. (1.16)

This solution is continuous but it has a piecewise constant first derivative with a jump in x = 0.5.

Hence the second derivative u⁰⁰(x) does not exists in the midpoint. This seems a contradiction as in this case the left-hand-side of eq. (1.14) does not exists for all x ∈ [0, 1]. However, the solution u(x) given in eq. (1.16) in terms of the integral of the Green’s function is mathematically sound.

Thus we need to define a more “forgiving” formulation, whose solution can have discontinuous first derivatives. This is the role of the so called “weak” formulation to be seen in the next sections.

Example 1.7. Transport equation.

Given a vector field of constant velocity λ > 0, find the function u = u(x, t) such that:

u_t+ λu_x = 0, (1.17a)

u(x, 0) = f (x). (1.17b)

(14)

-

x t

x− λt = ξ

ξ

−_λ^x

6

t = t₁

- b

b b

bb b

b b

bb

u

x

x x + λt

t = 0

6

FIGURE 1.3: Left panel: characteristic lines for eq. (1.17a) in the (x, t) plane. Right panel: graph of the solutionu(x, t) at t = 0 and t = t₁ > 0 in the (u, x) plane. The solution is a wave with shape given byf (x) (a line in this case) that propagates towards the right with speed λ.

The characteristic curve is a line in the plane (x, t) given by:

x− λt = const = ξ.

Along this line the original eq. (1.17a) becomes:

du dt = ∂

∂tu(ξ + λt, t) = λu_x+ u_t = 0.

Hence, the solution u is constant along a characteristic curve and this constant is determined by the initial conditions eq. (1.17b):

u(x, t) = f (ξ) = f (x− λt). (1.18)

At a fixed time t₁ the solution is given by the rigid translation of the initial condition (f (x)) by a quantity λt1, as shown in fig. 1.3 (right panel).

Example 1.8. Advection (or convection) and diffusion equation (ADE).

(15)

Find the function u(x, t) : [0, T ]× R 7→ R such that:

∂u

∂t = D∂u²

∂x² − v∂u

∂c, (1.19a)

u(x, t) = 1 for x = 0, (1.19b)

u(x, t) = 0 for x7→ ∞, (1.19c)

u(x, 0) = 0 for t = 0 and x > 0, (1.19d)

u(x, 0) = 1 for t = 0 and x = 0. (1.19e)

The solution is given by [3]:

u(x, t) = 1 2

erfc x− vt 2√

Dt

+ exp

vx Dt

erfc x + vt 2√

Dt

, where the function erfc is the complementary error function.

1.2.3 Conservation laws

From the physical point of view, the problems that we are facing are related to the principle of conservation. For example the equilibrium of an elastic string fixed at the end points and subject to a distributed load is governed by an equation that determines the vertical displacement u(x) of the points ox of the string and its tension stresses σ(x), once the load g(x) and the elastic characteristics of the string E (Young’s modulus) are specified. The problem is written as:

σ(x) = Eu⁰(x) Hook’s law;

−σ⁰(x) = g(x) Elastic equilibrium;

u(0) = u(1) = 0 Boundary conditions.

Another interpretation of the same problem can be thought of as u(x) being the temperature of a rod subjected to a heat source g(x). In this case the symbol k is generally used in place of E to identify the thermal conductivity of the rod material and q(x) is the heat flux. The model thus is written as:

q(x) =−ku⁰(x) Fourier’s law; (1.20a)

q⁰(x) = g(x) Energy conservation; (1.20b)

u(0) = u(1) = 0 Boundary conditions. (1.20c)

The same equation can be thought as governing the diffusion of a substance dissolved in a fluid. In this case we talk about Fick’s law, concentration u(x). diffusion coefficient k, solute mass flux q(x).

Yet another interpretation of the same equation is the flow of water in a porous material. We talk then about Darcy’s law. More in general, we can say that all these equations represent a conservation principle. In fact, eq. (1.20a) represents the conservation of momentum deriving from Newton second law (F = ma), while eq. (1.20b) states the conservation of the energy of the system.

(16)

All these problems are equations written in “divergence form” or in conservative form. For example, consider the advection-diffusion equation (eq. (1.19a)). From the physical point of view, our solution function u represents the density of the conserved quantity. Thus we can introduced the density flux of this quantity as:

~

q =−D ∇ u + ~vu,

where the first term on the right-hand-side represents the diffusive flux and the second term represents the advective flux (the quantity u is transported by the velocity ~v and at the same time is diffused).

Equation (1.19a) can then be re-written as:

∂u

∂t + div ~q = f (x).

The first term represents the variation in time of the mass of this quantity. The second term represents the variation in space. Integrating the above equation in a subset U ⊂ Ω of the domain we have:

Z

U

∂u

∂t + div ~q dx = Z

U

f (x),

and assuming the boundary of U to be smooth, we can apply the divergence theorem:

Z

U

∂u

∂t dx + Z

∂U

~

q· ν ds = Z

U

f (x).

We recognize the classical mass conservation principle:

rate of change = inflow-outflow

Note that from a purely mathematical point of view, writing the equation in divergence form has no formal advantage with respect to any other alternative formulation. However, this is not true for the numerical formulation, in which the divergence form is always to be preferred.

1.3 Well posedness and continuous dependence of solutions on the data

The question of finding a solution to a PDE rests on the definition of solution. We can state that a solution is a function that satisfies the equation and has certain regularity properties³. However, the answer to the question what is a solution can be tricky. For a clear account and several interesting examples see [4]. We report here a few remarks that are useful for the developments and analysis of numerical methods.

We talk about a “classical” solution of a k-th order PDE to indicate a function that satisfies the PDE and the auxiliary conditions and that is k times differentiable. This is an intuitive requirement so that the derivatives that appear in the expression of the PDE can be formally calculated without

3This last request has to be made to avoid trivial and non interesting solutions.

(17)

worrying about singularities. However, this notion is often too restrictive, and there may be functions that are less regular that indeed satisfy the PDE and the auxiliary conditions. Moreover, by this strong regularity requirements, we may restrict the search of solutions only to cases that have enough regularity of the auxiliary conditions and of the data of the problem (e.g. the coefficients of the PDE). Thus we usually resort to a less restrictive definition of a solution, which is called a “weak”

solution. Thus we need to change the formulation of the PDE to accommodate this lower regularity requirement, maintaining at the same time the physical notion of the process that lead to the PDE.

Remark 1.9. Example 1.4 gives an instance of the application of this concept: the global solution, i.e., the solution over the entire domain I = [0, 1], is continuous but its derivative is not. Thus a classical solution to the problem does not exist, but a “weak” solution can be defined by appropriately relaxing the continuity conditions of the search space (the space of functions that are candidate solutions).

In any case, it is intuitive to look for solutions that are unique. There is certainly no hope to be able to find numerically a solution that is not unique. In fact, any computational algorithm in this case would never converge and would oscillate continuously among the several solutions of the problem. But uniqueness is not sufficient. We also require the notion of “continuous dependence of the solution form the data of the problem”. In essence, we require that, if for example some coefficients are changed slightly, then the solution changes slightly. This notion is useful for two important reasons.

First, in a computer implementation of any algorithm there is no hope to be able to specify a coefficient (which is a real number) with infinite accuracy. Next, and probably more importantly, uncertainties in physical constants or functions are intrinsically present in any model of a physical process. This uncertainty results in values of, e.g., boundary conditions or forcing functions that are not known precisely. But it is highly desirable that our mathematical model governing the physics be relatively insensitive to these uncertainties. This is reflected within the concept of well-posedness that we can make a little bit more formal by stating the following:

Definition 1.10 (Well posedness). Given a problem governed by a k-th order PDE:

F (x, u, ∂u, . . . , ∂^ku, Σ) = 0

where Σ denotes the set of the data defining the problem, we say that this problem is well posed if:

1. the solution u exists;

2. the solution u is unique;

3. the solution u depends continuously on the data, i.e., if one element σ ∈ Σ is perturbed by a quantity δ, the corresponding solution ˜u to the perturbed problem is such thatk˜u − uk ≤ L kδk.

Note that this is not a very precise statement, as we need to specify what we mean with the symbol k·k. But this definition depends on the functions with which we are dealing, and thus it must be analyzed and specified for each problem.

(18)

These three requirements are obvious, especially in the context of numerical solution of a mathematical problem. The first requirement states the incontrovertible fact that we cannot possibly think of finding a numerical approximation of the solution if this does not exist. The second statement asserts that the solution is unique as a function of the data (for a given set of data). No numerical scheme can approximate simultaneously two different solutions to the same problem, so we can only hope to find one of them, but in general we get a numerical solution that oscillates between the two and does not make sense. If there are more than two solutions the situations is obviously worse. Finally, the last condition is the most important one. In practical applications we are not so much concerned with the PDE itself but mainly with its solution, which generally gives the so called “state” of the system.

As such, a solution of the PDE is “physically” meaningful if small changes in the data (the parame- ters) of the problem cause small changes to the solution. In other words, a mathematical model of a physical phenomenon in general will not be useful if small errors in the measured data would lead to drastically different solutions.

As an example, consider the PDE of equation eq. (1.1) with the auxiliary conditions eq. (1.2). It is a well posed problem as it satisfies all the three conditions above. However, assume that we are at steady state, i.e., ∂u/∂t = 0, there are no source/sink, i.e., f = 0, and that only Neumann BCs are imposed, i.e. the only auxiliary condition present in our problem is eq. (1.2c). Then, any constant function u(x, t) = C is a solution of the problem, independently of the value of C. Thus the solution is not unique, and we cannot possibly hope to find it numerically.

In practice, the question of whether a PDE problem is well posed can depend also on the auxiliary conditions that we impose. Generally, if these auxiliary conditions are too many then a solution may not exist. If they are too few, then the solution may not be unique. And they must be “consistent”

with the PDE or the solution may not depend continuously on the data. Thus the question of well- posedness can be difficult to assess.

1.3.1 Ill-conditioning and instability

Two more concepts that are related to well-posedness need to be clearly stated when we move from the continuous setting to the discrete (numerical) setting. The first we would like to discuss is “ill- conditioning”. The “condition” of a problem is a property of the mathematical problem (not of the numerical scheme used to solve it) and can be stated intuitively as follows:

Definition 1.11 (Ill-conditioning). A mathematical problem is said to be ill-conditioned if small per- turbations on the data cause large variations of the solution.

The definition is problem specific, but a simple example related to linear algebra can be illuminating.

Example 1.12. Consider the following 2× 2 system of linear equations:

3x + 2y = 2 (1.21)

2x + 6y = −8. (1.22)

(19)

-7 -5 -3 -1 1 3 5 x

-4 -2 0 2 4 y

-6 -4 -2 0 2 4 6

-4 -2 0 2 4 y=-3x/2+1

y=-x/3-4/3

δ ε∼δ

-2 -1 0 1 2 3 4

x

-6 -4 -2 0 2 4 y

-2 -1 0 1 2 3 4

-6 -4 -2 0 2

4 1 2

ε>>δ

δ

FIGURE 1.4: Geometric interpretation of a “well-conditioned” (left) and an “ill-conditioned” linear system (right).

The mathematical problem can be stated as follows:

Problem 1.13. find the pair of real values (x, y) such that examples 1.12 and 1.12 are satisfied simultaneously.

The solution to this problem is evidently P = (x, y) = (2,−2). We can rewrite the linear system as:

y = −3

2x + 1 (1.23)

y = −1 3x− 4

3. (1.24)

This reformulation, allows to change the problem into an equivalent formulation:

Problem 1.14. find the point P = (x, y) ∈ R² that represents the intersection between the two lines identified by examples 1.12 and 1.12 (see fig. 1.4).

Now we want to analyze the conditioning of this problem. To do this we specify a small perturbation to the data of our problem and look at how its solution changes. In our case we can, for example, change the right hand side of the second equation by a quantity δ, yielding a downward translation of the line (fig. 1.4, left). The point of intersection between the two lines has now moved by a quantity

≈ δ. This problem is well-condition and the ratio /δ measures somehow the conditioning of our problem.

Now, if the two lines have almost equal slopes, the situation is different (fig. 1.4, right). A small perturbation δ to one if the right hand side values yield a large movement of the solution (the point of

(20)

intersection), by a quantity δ. The conditioning is measured again by the quantity /δ which is now much larger than one. The problem is thus “ill-conditioned”.

We note that both problems are actually “well-posed” as they admit a unique solution which is continuously dependent upon the data. But the numerical solution may loose accuracy.

The second concept is called stability. Unlike conditioning, stability is a property of the numerical scheme used to solve a mathematical problem.

Definition 1.15 (Stability). A numerical scheme is stable if errors in initial data remain bounded as the algorithm progresses.

As an example, consider the following numerical algorithm given by the linear recursion:

u^(k)= Au^(k−1), k = 1, 2, . . .

where u^(k) ∈ Rⁿ, A is a constant n× n matrix, and the recursion is initiated with a given (possibly arbitrary) initial guess u⁽⁰⁾. The representation u⁽⁰⁾_h of the values of u⁽⁰⁾ in the computer is not exact, so the actual algorithm involves the numerical approximation u^(k)_h :

u^(k)_h = Au^(k−1)_h , k = 1, 2, . . . (1.25)

Stability of the algorithm requires that the errors with which we represent u⁽⁰⁾_h are not magnified by the algorithm process. More formally, we define the error as e^(k) = u^(k)− u^(k)h , k = 1, 2, . . .. From this last equation we have that u^(k) = u^(k)_h + e^(k), and after substitution in eq. (1.25) we obtain the error propagation equation:

e^(k) = Ae^(k−1).

Stability of the scheme is achieved if the norm of the error remains bounded as k increases, i.e. (using compatible norms):

e^(k) ≤

Ae^(k−1)

≤ kAk

e^(k−1)

≤ kAk^k e⁽⁰⁾

which implieskAk ≤ 1.

Finally we would like to mention the following famous and completely general result known as the Lax-Richtmeyer equivalence theorem:

Theorem 1.1 (Lax-Richtmeyer equivalence theorem). A consistent numerical scheme for the solution of a well posed problem is convergent if and only it is stable.

(21)

2 Hyperbolic Equations

An intuitve, although empirical, definition of a hyperbolic equation is as follows. Given x ∈ R^d, the PDE

F (x, u,∇ u) = 0

is hyperbolic at the point x if it is possible to transform it into an Ordinary Differential Equa- tion (ODE) with respect to one of the d variables (say x_d) as a function of the remaining ones x₁, x₂, . . . , x_d−1.

2.1 Some examples

Notice that in the above statement one of the components of the vector x is take as time (we will do this often in the sequel). Thus, for example, we can set t := x1 and x := x2, and consider the simple one-dimensional equation:

u_t+ u_x = 0.

This equation can be transformed into an ODE if we write it along the lines (called “characteristics”):

ξ = x− t.

To see this, we make a simple change of variable x = ξ + t, so that we can write u(x, t) = u(x(t), t) as:

u(x, t) = u(ξ + t, t); x = ξ + t; dx

dt = 1; dx dξ = 1, so that:

d

dtu(ξ + t, t) = ∂u

∂x dx dt + ∂u

∂t = u_t+ u_x= 0.

From this we see that our original equation is an ODE with respect to time t as a function of space, x (i.e. along the characteristic lines of equation x = ξ + t). Thus, we are mainly interested in looking at solutions of a Cauchy problem, although we will be looking at the effects of boundary conditions as well. However, it is intuitive to think that any auxiliary condition on the boundary can be transformed into an auxiliary condition at t = 0, and vice-versa. Another picture for this statement can be envisioned by stating that we can revert the sign of the time variable, and go backwards in time, i.e., we can find the solution at an earlier time given the solution at a later time. We will see that indeed this is the case, contrary to e.g. a parabolic PDE. In other words, a hyperbolic equation governs a reversible phenomenon, while a parabolic equation governs a irreversible (or dissipative) phenomenon.

(22)

t

x

−x λ

ξ

x− λt = ξ

y

x^∗ x t = 0

x^∗+ λt1

t = t1

FIGURE 2.1: Characteristic line for equation (2.1) in the (x, t) plane (left). Graph of the solution u(x, t) at t = 0 and t = t₁ in the (u, x) plane. A wave of form f (x) (here f (x) = e^−a(x−b)²) propagates to the right with speedλ without changing shape.

To study the behavior of a hyperbolic equation, we need to study the geometry of the solution function in the space of its dependent variables, i.e., z = u(t, x), which represents a surface in R^d+1, if x∈ R^d. This will help us acquiring an understanding of the type of solutions we may expect from a hyperbolic partial differential equation. This is what we are trying to do in the next few sections We start this task with some simple examples.

2.1.1 The transport equation

We look at the simplest PDE that one can devise, i.e., the transport equation. Let us consider the following problem (the same problem already seen in example 1.7):

Problem 2.1. Given a constant velocity vector field q(x)(q₁(x), . . . , q_d(x)) in a domain Ω⊂ R^d, find a function u = u(x, t) such that

u_t+ q· ∇ u = 0, in R^d× (0, ∞) (2.1a)

u(x, 0) = f (x), (2.1b)

Physically, this model represents the motion of particles (solutes, sediments in a fluid) moving with speed q. It is a conservation law. Indeed:

div(u(t, x)) = ∂u

∂t + div(qu) = ∂u

∂t + q· ∇ u (if div q = 0, i.e., q is a conservative field).

In the one-dimensional case d = 1, the velocity field is one-dimensional λ = q1. In this case the characteristic curve is a line in the plane (x, t) given by:

x− λt = const = ξ. (2.2)

(23)

Along this line the equation takes the form:

du dt = ∂

∂tu(ξ + λt, t) = λu_x+ u_t = 0.

Thus our solution u is constant along the characteristic line. The constant is determined by the initial condition (2.1b):

u(x, t) = f (ξ) = f (x− λt). (2.3)

We want to see how this general solution can be visualized geometrically. First note that equation (2.3) provides the function u(x, t) in terms of the initial condition (2.1b). The solution at any point (x, t) depends only on the value of f at ξ = x− λt, the intersection of the characteristic line with the x-axis (Figure 2.1, left). We remark that the x-axis is the initial line, i.e., the line at t = 0. We say that the “domain of dependence” of u(x, t) on the initial values f (x) is given by the single point ξ. The initial values completely determine the solution. They influence the solution only at the points of the characteristic line.

We can look at the graph of the solution at fixed times. Thus, given the initial condition u(x, 0) = f (x), how does the solution at t = t₁, i.e., u(x, t₁), look like? It is easy to verify that:

u(x, 0) = u(x + λt₁, t₁) = f (x).

This implies that the solution at every fixed time time t is just the rigid translation of the initial condition along x by the quantity λt. Hence, u(x, t) represents a wave with spatial form given by f (x) that moves to the right (towards increasing x) with speed λ (Figure 2.1, right).

2.1.2 The second-order wave equation

As a classical example of a hyperbolic equation, we consider the second order (one-dimensional) wave equation (1.12). To better understand the geometrical behavior of the solution, we transform this equation into a system of first order partial differential equations. To this aim, let v = ∂u/∂t and w = ∂u/∂x. Then we have:

∂v

∂t = ∂w

∂x

∂w

∂t = ∂v

∂x.

We can write the previous system in matrix form letting our unknown function u be the vector u = (v, w):

∂

∂t

v w

= 0 1 1 0

∂

∂x

v w

,

(24)

kt 1t 0t mt m−1tk−1tk+1

xj

x1

x0 x

x n

x n−1

j−1 x

j+1

t

FIGURE 2.2: Subdivision of the plane (x, t) in a finite difference grid.

or, using an abbreviated notation for the derivative u_a = ∂u/∂a:

ut= Aux. (2.4)

We note that matrix A is real, symmetric but not positive definite. It has two distinct real eigenvalues λ₁ =−1 and λ2 = 1 and corresponding normalized eigenvectors e₁ = ^√¹

2[1,−1]^T and e₂ = ^√¹

2[1, 1]^T. In all generality, equation (2.4) represents a general system of first order PDEs. We say that such a system is hyperbolic if matrix A is diagonalizable and has real eigenvalues. This is obviously always true if A is also symmetric. If the real eigenvalues are distinct then the PDE is “strictly” hyperbolic.

2.1.3 Simple finite difference discretization

We now proceed with the simplest numerical approach for solving (2.1) in the case d = 1, in which case we write q = λ, which is assumed positive. Hence, our model problem is:

u_t+ λu_x = 0 u(x, 0) = u₀(x) (λ > 0)

We use the finite difference method, that is we use various incremental ratios to approximate the time and space derivatives. To this aim, we subdivide the temporal and spatial intervals uniformly with subintervals of dimensions ∆t and ∆x, respectively, whereby x_j = x₀ + j∆x, j = 0, . . . , n and

(25)

t_k = t₀ + k∆t, k = 0, . . . , m and denote by u^k_h,j = u(x_j, t_k) the numerical approximation of the solution at the grid point (j, k) (Figure 2.2). We can approximate the temporal and spatial derivatives using simple incremental fractions with size ∆x or ∆t or even 2∆x or 2∆t. We obtain the following:

u_t(x_j, t_k)≈ u^k+1_h,j − u^kh,j

∆t u_t(x_j, t_k)≈ u^k+1_h,j − u^k−1h,j

2∆t u_x(x_j, t_k)≈ u^k_h,j+1− u^kh,j

∆x u_x(x_j, t_k)≈ u^k_h,j − u^kh,j−1

∆x u_x(x_j, t_k)≈ u^k_h,j+1− u^kh,j−1

2∆x

We can then use first forward discretizations of the first derivatives both in space and time to write the following difference equation:

u^k+1_h,j − u^kh,j

∆t =−λu^k_h,j+1− u^kh,j

∆x

which can be written by setting µ = λ∆t/∆x as:

u^k+1_h,j = u^k_h,j − µλ u^kh,j+1− u^kh,j = (1 + µλ)u^k_h,j − µλu^kh,j+1 (2.5) It is easy to see that this method is consistent, i.e., substitution of the real solution u in place of the numerical solution u_h,yields a residual that goes to zero as ∆t and ∆x tend to zero. This is readily obtained using the following Taylor series expansions:

u(x_j+ ∆x, t_k) = u(x_j, t_k) + u_x(x_j, t_k)∆x + u_xx(x_j, t_k)∆x²

2 +O ∆x³ u(x_j, t_k+ ∆t) = u(x_j, t_k) + u_t(x_j, t_k)∆t + u_tt(x_j, t_k)∆t²

2 +O ∆t³

and substituting the real solution u(xj, t_k) in place of u^k_h,j into eq. (2.5) we can calculate the residual (local truncation error) as:

τ_h(x_j, t_k) =u(x_j, t_k) + u_t(x_j, t_k)∆t + u_tt(x_j, t_k)∆t²

2 +O ∆t³

− (1 + µλ)u(x^j, tk) + µλ

u(xj, tk) + ux(xj, tk)∆x + uxx(xj, tk)∆x²

2 +O ∆x³

=u_tt(x_j, t_k)∆t²

2 +O ∆t³ + µλ

u_xx(x_j, t_k)∆x²

2 +O ∆x³

, (2.6)

which shows that the local error tends to zero with the discretization size ∆t and ∆x. Since this is a local error (betweeen two mesh nodes and two time steps), the global error is one order less, i.e.,

(26)

O (∆t + ∆x). Other schemes can be devised and analyzed. We can write the following:

u^k+1_h,j =u^k_h,j − µλ u^kh,j+1− u^kh,j

forward Euler - downwind

u^k+1_h,j =u^k_h,j − µλ u^kh,j− u^kh,j−1

forward Euler - upwind

u^k+1_h,j =u^k_h,j − µλ

2 u^k_h,j+1− u^kh,j−1

forward Euler - central

u^k+1_h,j =u^k−1_h,j − µλ u^kh,j+1− u^kh,j−1

leap-frog

u^k+1_h,j =u^k_h,j+1+ u^k_h,j−1

2 − µλ

2 u^k_h,j+1− u^kh,j−1

Lax-Friedrichs

However, convergence towards the real solution is not guaranteed by consistency alone, as stated by the Lax-Richtmeyer equivalence theorem (theorem 1.1), but requires that the scheme be stable. We need to define stability in the case of our FD approximations, but also we need to properly define convergence, although the latter one is a more intuitive concept. We will not dwelve into these issues, at the heart of numerical analysis, but give only an intuitive and practical explanation of these concepts.

Example 2.2. Solve the transport problem:

u_t+ u_x =0 u(x, 0) =

(1− | x | if | x |≤ 1,

0 otherwise.

using the above schemes.

Algorithm 1 Forward-Backward finite difference method for one-dimensional transport equation with constant velocity. We use periodic boundary conditions and assume the velocity of propagation is positiv λ > 0. This implies that periodic BC must be implemented for the first node of the FD mesh and nothing is imposed on the last node.

1: Set λ > 0; a, b (interval boundaries); ∆x; T (final time); ∆t;

2: N = (b− a)/∆x; M = T/∆t; µ = λ∆t/Dx.

3: for j = 1, 2, . . . , N do

4: u_h,j = u₀(a + (j− 1) ∗ ∆x);

5: end for

6: for k = 1, . . . , M do

7: u_h,1 = (1− λµ)u^h,1+ λµu_{h,N +1}

8: for j = 2, . . . , N do

9: u_h,j = (1− λµ)uh,j+ λµu_h,j−1

10: end for

11: end for

We report in fig. 2.3 (top row) the results obtained at t = 0.1 and t = 0.2 using the leap-frog (left panel) and the forward-forward (right panel) schemes. We see immediately that the leap-frog shows

Notes on theory and numerical methods for hyperbolic conservation laws