Parallel, grid-adaptive computations
Rony Keppens
Centre for mathematical Plasma Astrophysics
KU Leuven
• MPI-AMRVAC philosophy
⇒ available physics modules with example applications
⇒ shock-capturing discretization strategies
• Adaptive Mesh Refinement criteria
• data structures for parallel execution and I/O
• Example space weather, multi-physics applications
1 MPI-AMRVAC
2 Discretizations
3 Adaptive Mesh Refinement
4 Space Weather modeling
Philosophy
• targets any set of equations of generic type
∂_t U + ∇ · F(U) = S(U, ∂_i U, ∂_i ∂_j U, x, t)
⇒ conserved variables U, fluxes F, sources S
⇒ (near-) conservation laws: hyperbolic PDEs
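The finite-volume update behind such (near-)conservation laws can be sketched in a few lines; a minimal 1D periodic Python illustration (not MPI-AMRVAC code; the generic flux function and the local Lax-Friedrichs dissipation are assumptions for the sketch):

```python
import numpy as np

def fv_step(U, flux, cmax, dx, dt):
    """One explicit finite-volume step for U_t + F(U)_x = 0 (1D, periodic).
    Interface fluxes use a local Lax-Friedrichs form with wave-speed bound cmax."""
    Ul, Ur = U, np.roll(U, -1)                      # states left/right of face i+1/2
    Fhalf = 0.5 * (flux(Ul) + flux(Ur)) - 0.5 * cmax * (Ur - Ul)
    return U - dt / dx * (Fhalf - np.roll(Fhalf, 1))

# advect a bump at unit speed: F(u) = u, so cmax = 1
n = 200
x = np.arange(n) / n
u = np.exp(-100 * (x - 0.5) ** 2)
for _ in range(100):
    u = fv_step(u, lambda q: q, cmax=1.0, dx=1.0 / n, dt=0.5 / n)
```

By construction the total of U over the periodic domain is preserved to machine precision, which is the defining property of the finite-volume (conservative) form.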
• emphasis on shock-governed dynamics
⇒ shock-capturing schemes
• any-dimensionality: Perl preprocessor to Fortran 90
⇒ can handle Cartesian, cylindrical, spherical geometries
• several physics modules pre-implemented
• user defines initial conditions, can select predefined or add custom boundary conditions, may add source terms, and can optionally extend I/O
⇒ many example setups provided in repository version
⇒ code obtained via subversion through
svn co https://svn.esat.kuleuven.be/amrvac/trunk --username myusername mpiamrvac
⇒ myusername studCMFA09 with password cpa9amrvac
• full HTML documentation info at
http://homes.esat.kuleuven.be/~keppens
• Test module: pure advection
⇒ U = ρ, F = ρv with uniform velocity v
⇒ testing novel functionality in discretization or adaptivity
⇒ demonstrating convergence, order of accuracy, . . .
• Discontinuity dominated 2D profile: VAC logo
⇒ advected diagonally on unit square
[Figure: advected ρ profiles on the unit square, comparing limiter/time-integrator combinations: Minmod+RK3, van Leer+RK3, Koren+RK2, Koren+RK3, Čada+RK3]
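Each limiter in that comparison defines a function φ(r) of the consecutive-slope ratio r; a minimal Python sketch of two of them (minmod and van Leer only; Koren and Čada follow the same pattern with third-order accurate φ expressions), used to build a limited linear slope:

```python
import numpy as np

# Classic slope limiters in ratio form phi(r), with r = (u_i - u_{i-1}) / (u_{i+1} - u_i).

def minmod(r):
    return np.maximum(0.0, np.minimum(1.0, r))

def van_leer(r):
    return (r + np.abs(r)) / (1.0 + np.abs(r))

def limited_slope(u, phi):
    """Limited linear slope per cell from neighbour differences (periodic)."""
    dl = u - np.roll(u, 1)          # u_i - u_{i-1}
    dr = np.roll(u, -1) - u         # u_{i+1} - u_i
    r = np.where(dr != 0, dl / np.where(dr != 0, dr, 1.0), 0.0)
    return phi(r) * dr

slopes = limited_slope(np.array([0.0, 1.0, 3.0, 3.0]), minmod)
```

Near extrema and discontinuities r ≤ 0, so φ vanishes and the reconstruction drops to first order, which is what keeps the schemes oscillation-free.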
• 3D advection test: constant ρ in sphere, different ρ out
⇒ movie shows isosurface, slice plane
• Nonlinear Scalar equation: amrvacphys.t.nonlinear
⇒ eqpar(fluxtype_) switch for different flux expressions
⇒ inviscid Burgers (case 1), nonconvex equation (case 2)
∂_t ρ + ∇ · (ρ²e/2) = 0
∂_t ρ + ∇ · (ρ³e) = 0
⇒ in any dimensionality, with e ≡ Σ_{i=1}^{D} ê_i
• Burgers for 2D: ‘advection’ of Gaussian bell profile
⇒ smooth initial condition steepens, shock formation
⇒ compare Burgers to nonconvex case
• adiabatic HD set, U = (ρ, ρv) and F = (ρv, ρvv + c_ad ρ^γ)
⇒ parameters c_ad, γ for isothermal or polytropic scenarios
• can do dust dynamics where pressure vanishes (c_ad = 0)
⇒ ultimate challenge for grid adaptivity: δ waves!
⇒ reproduces analytic result from Leveque (2004)
• adiabatic hydro encompasses ‘shallow water’ equations
⇒ interpret ρ = h as water height, γ = 2 and gravity g = 2c_ad
⇒ follow water waves in a square or round pool
• Euler equations, with or without sources
⇒ external gravity (planar or point)
⇒ optically thin radiative losses (various cooling curves)
• Example with uniform gravity: Rayleigh-Taylor evolution
• 2D HD example useful for symmetry-preservation test
⇒ Liska test: 2D ‘shock tube’ in closed box
• 2D HD example with spatio-temporal BC
⇒ Woodward & Colella shock reflection
⇒ (old) movie compares patch-based with hybrid block AMR
• modern applications: circumstellar/binary environments
⇒ Euler with optically thin radiative losses
⇒ follow 3D evolution of orbiting stars
Magnetohydrodynamics or MHD
• IDEAL MHD: conservation laws
⇒ MASS, MOMENTUM, ENERGY, MAGNETIC FLUX
⇒ 8 nonlinear PDEs for density ρ, velocity v, energy e, and B
⇒ add ∇ · B = 0 ⇒ no magnetic monopoles
• 7 wave speeds: entropy, ± slow, ± Alfvén, ± fast [anisotropic!]
⇒ complicated shocks, Steady & Unsteady
Resistive MHD
• current J = ∇ × B: dissipation through resistivity
⇒ from ideal to resistive (non-ideal) MHD
• spatio-temporal resistivity profile η(x, t) introduces
⇒ Ohmic heating term in energy equation S_e = ∇ · (B × ηJ)
⇒ diffusion term in induction equation S_B = −∇ × (ηJ)
⇒ uniform resistivity: η(J² + B · ∇²B) and η∇²B
• ideal (η = 0) versus resistive MHD
⇒ topological constraint on B alleviated
⇒ field lines can reconnect in regions of strong currents
Petschek reconnection
• Petschek model (1964) for fast magnetic field annihilation
⇒ two regions containing oppositely directed field lines
⇒ realize steady-state with X-type magnetic neutral point
• steady state contains pair of stationary slow shocks
⇒ where B bends towards shock front normal
• at X-point: flow controlled by diffusion
• within region bounded by slow shocks: purely B x , ‘constant’ ρ
⇒ fronts have fixed opening angle (away from neutral point)
⇒ fluid moves to boundary layer and is ejected along it
• stationary configuration [sketch: diffusion region of size δ about the X-point, axes x and y]
⇒ use symmetry to simulate corner region [0, 1] × [0, 4] only
• solve resistive MHD equations incorporating resistivity profile
η(x, y) = η₀ exp[−(x/l_x)² − (y/l_y)²]
⇒ anomalous η centered on origin
⇒ parameters η₀ = 0.0001, l_x = 0.05, l_y = 0.1
• initial field B = (0, tanh(10x)): current sheet
⇒ γ = 5/3, p(x) = 1.25 − B_y²(x)/2 and ρ(x) = 2p(x)/β₁
⇒ isothermal initial condition with β₁ = β(x = 1) = 1.5
• fix Alfvén Mach number of inflow at x = 1: v_x(x = 1) = −0.04
consistently evolves to Petschek reconnection configuration
• field lines, velocity field, current density evolution
⇒ checks with theoretical opening angle in steady-state!
⇒ from Tóth et al, A&A, 332, 1159 (1998)
• Ideal and resistive MHD, with or without sources
⇒ external gravity; optically thin radiative losses
• many standard (shock-governed) tests, different strategies for
∇ · B = 0, splitting off strong background fields, . . .
⇒ 2D resistive MHD, GEM Challenge, η = 0.001
• Harris Sheet evolution (tanh magnetic profile)
⇒ reconnection at η = 0.005, η = 0.001, η = 0.0001
• run on a MacBook Pro at effective 1920 × 1920 resolution, taking several days . . .
⇒ current evolution for η = 0.001
⇒ schlieren plot evolution for η = 0.001
Shear flow instabilities
• 2D planar Kelvin-Helmholtz with aligned B
⇒ two parameters: β and Mach number (field/flow strength)
⇒ single billow evolution: exclude subharmonic modes
⇒ wish to study vortex disruption and mode interactions
• extend domain size
⇒ cover multiple wavelengths of most unstable mode
⇒ perform grid-adaptive MPI-AMRVAC studies
• ‘strong’ B: M = 1, Alfvén Mach M_A = 7 (β = 58.8)
⇒ reconnection disrupts single billows
⇒ trend to large scale by pairing/merging
⇒ 22 wavelengths of most unstable mode
• single layer: M = 1, M_A = 7 or β = 58.8
3D jet configurations
• [astrophysical] jet of radius R jet :
⇒ 3D Kelvin-Helmholtz case study
⇒ cylindrical cross-section
⇒ shear flow (width 2a) across its circumference
⇒ t = 0 parallel uniform B, V₀ = 0.645, a = 0.05, B₀ = 0.129
⇒ reference parameters M_s = 0.5, M_A = 5, β = 120
• 3D perturbation wavenumber n along jet, m about jet axis
⇒ sideways ‘kink’ perturbation m = 1 = n
• Nonlinear evolution for varying initial 3D perturbation
⇒ field lines, density evolution
• Special relativistic HD and MHD
• extreme contrasts, positive p, ρ, τ , v < 1, Γ ≥ 1, solenoidal B
⇒ stringent demands on numerics and accuracy: AMR vital
⇒ different EOS implemented for relativistic modules
• Relativistic hydro refraction
⇒ 7 levels, effective 1536 × 7680 in day(s) on local desktop!
Relativistic hydro refraction snapshots at t = 0.5, 1, 2, 3, 4, 5, 6
RMHD Orszag-Tang test
• relativistic analogue of 2D MHD Orszag-Tang test
⇒ double periodic, supersonic relativistic vortex rotation
⇒ initial field configuration: double island structure
⇒ current sheets, shock interactions, reconnections
• AMR captures small-scale reconnection effects
MPI-AMRVAC and HPC-Europa2
• excellent scaling: domain decomposition and multi-level AMR
⇒ 2D MHD at ≈ 400², 1000 ∆t in < 5 seconds (including I/O)
⇒ 10-level AMR special relativistic HD sustained 80% efficiency on 2000 CPUs!
1 MPI-AMRVAC
2 Discretizations
3 Adaptive Mesh Refinement
4 Space Weather modeling
Discretizations
• emphasis shock-capturing schemes, finite volume approach
⇒ from TVDLF (readily used for any system) to successively more advanced schemes, incorporating better approximations to the local Riemann fans at cell interfaces
[Figure: approximate Riemann fans in the (x, t) plane for the HLL, HLLC, HLLD, and Roe solvers]
• limited linear reconstruction (2nd to 3rd order); PPM (parabolic)
⇒ different discretization possible per level!
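TVDLF, the most robust of these, needs nothing beyond a bound on the fastest local wave speed; a Python sketch of the interface flux (the isothermal-hydro flux and the Sod-like states are illustrative assumptions, not an MPI-AMRVAC routine):

```python
import numpy as np

def tvdlf_flux(UL, UR, flux, cmax):
    """TVDLF (Rusanov-type) interface flux for any conservation law:
    F = 0.5*(F(UL) + F(UR)) - 0.5*cmax*(UR - UL),
    with cmax the largest local characteristic speed."""
    return 0.5 * (flux(UL) + flux(UR)) - 0.5 * cmax * (UR - UL)

# example: isothermal hydro, U = (rho, rho*v), F = (rho*v, rho*v^2 + c^2*rho)
c_iso = 1.0
def iso_flux(U):
    rho, mom = U
    v = mom / rho
    return np.array([mom, mom * v + c_iso**2 * rho])

UL = np.array([1.0, 0.0])          # left state at rest
UR = np.array([0.125, 0.0])        # right state: density jump (Sod-like)
cmax = c_iso                       # |v| + c_iso for both states here
F = tvdlf_flux(UL, UR, iso_flux, cmax)
```

HLL, HLLC, HLLD and Roe refine the same interface problem by resolving more waves of the local Riemann fan, at increasing cost and decreasing robustness.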
Boundary conditions
• (stencil-dependent number of) ghost cell layers
⇒ physical domain boundaries: user set or preselect (a)symmetry for reflective walls;
continuous extrapolation for open boundaries;
limited inflow conditions, allowing turbulent flow structures to cross boundaries unhampered;
periodic boundaries;
π-periodicity at ‘poles’ (cylindrical/spherical)
• user can set spatio-temporally varying combinations of
primitive/conservative variables in boundary regions
1 MPI-AMRVAC
2 Discretizations
3 Adaptive Mesh Refinement
4 Space Weather modeling
AMR as an h-refinement method
• Convergence of numerical scheme to physical solution, but
⇒ both global and local, small-scale effects matter!
• dynamic regridding methods: minimize computational cost at no accuracy losses
⇒ h-refinement: refine grid locally (adding grid points)
⇒ r-refinement: relocate a grid (moving grid points)
⇒ p-refinement: change order of accuracy locally
• adaptive mesh refinement or AMR: h-refinement approach
⇒ overlay part of coarse grid with finer grid patch
⇒ repeat up to maximal allowed finest grid level
⇒ each grid patch: structured grid, time-advance with
(hyperbolic) solver
• restriction: from fine to coarse: use conservative average
[Figure: fine-to-coarse restriction, ρ_ij from Σ_mn ρ_mn, updates the underlying coarse cell by conservative averaging; fine-face fluxes fix the flux for coarse cell ij; ∆x_{1,l−1} = r_l ∆x_{1,l}]
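On a uniform Cartesian grid with refinement ratio 2, where all fine volumes are equal, the conservative restriction reduces to a 2×2 block average; a Python sketch of the operation:

```python
import numpy as np

def restrict2d(fine):
    """Conservative fine-to-coarse restriction for refinement ratio 2 on a
    uniform Cartesian grid: each coarse cell is the volume average of its
    2x2 fine children (equal volumes -> plain mean)."""
    return 0.25 * (fine[0::2, 0::2] + fine[1::2, 0::2]
                   + fine[0::2, 1::2] + fine[1::2, 1::2])

fine = np.arange(16.0).reshape(4, 4)
coarse = restrict2d(fine)          # 2x2 array of block means
```

Since each coarse cell carries 4× the fine-cell volume, the total conserved quantity Σ ρ ∆V is identical on both levels.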
• prolongation: from coarse to fine
⇒ use limited linear reconstruction, preserve conservation!
• flux fixing at grid level changes
⇒ original (2D) formula
ρ_ij^(n+1) = ρ_ij^(n) − (∆t/∆x_{i,l−1}) [F_{i+1/2,j}^(n+1/2) − F_{i−1/2,j}^(n+1/2)]
                      − (∆t/∆x_{j,l−1}) [F_{i,j+1/2}^(n+1/2) − F_{i,j−1/2}^(n+1/2)]
⇒ replace face-valued, time-centered numerical flux F_{i+1/2,j}^(n+1/2) by
F_{i+1/2,j}^(n+1/2) ⇒ (1/r_l²) [F_{m−1/2,n}^(n+1/4) + F_{m−1/2,n+1}^(n+1/4) + F_{m−1/2,n}^(n+3/4) + F_{m−1/2,n+1}^(n+3/4)]
⇒ general formula where r_l ≡ ∆x_{l−1}/∆x_l gives the refinement change; for r_l = 2 the fine grid takes 2 timesteps for each coarse ∆t (simplifies when keeping ∆t level independent)
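The flux fix itself is just an average of the fine-face, fine-substep fluxes through the shared coarse face; a Python sketch assuming r_l = 2 in 2D (2 fine faces × 2 fine substeps, with the flux values purely illustrative):

```python
import numpy as np

def fixed_coarse_flux(fine_face_fluxes):
    """Replace a coarse time-centered interface flux by the fine fluxes through
    the same face, averaged with weight 1/r_l^2 (r_l = 2 in 2D: 2 fine faces
    times 2 fine substeps = 4 contributions)."""
    rl = 2
    return np.sum(fine_face_fluxes) / rl**2

# the 4 fine contributions: F^{n+1/4} and F^{n+3/4} at faces (m-1/2, n), (m-1/2, n+1)
F_fine = np.array([1.0, 1.2, 0.9, 1.1])
F_coarse = fixed_coarse_flux(F_fine)
```

Using the same accumulated fine flux on both sides of the resolution jump is what keeps the scheme globally conservative across level changes.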
Patch, block, hybrid block AMR
• diagonal ‘VAC’ advection: 3 approaches to nested grids
⇒ patch (left), quadtree, hybrid block-based (right)
⇒ 233 sec; 196 sec; 159 sec
⇒ van der Holst & Keppens, JCP 2007, 226, 925
• Current MPI-AMRVAC only pure block-based (quadtree/octree)
⇒ easier parallelization, load-balance
• further takes fixed r l = 2, and ∆t identical for all AMR levels
⇒ avoids many complications: internal ghost cell boundary
filling no longer needs spatio-temporal interpolations!
Generic AMR code skeleton
timeloop : do
    exit time loop by user-defined criteria
    compute timestep constraint for all grids
    conditionally save data
    advance all grids for one time step dt
        distinguish source and flux addition strategies
        select numerical scheme, geometric sources, ...
        collect fluxes at fine-coarse boundaries
        after every parallel update of grids on all processors:
            fill all ghost cells for all grid blocks
        at final temporal update, fix for conservation
    regrid
        quantify the error estimator on all grids
        create new grid structure, ensure proper nesting
        load balance the new grid structure
    update time counters
end do timeloop
Handling curvilinear coordinates
• restriction (fine to coarse) & prolongation (coarse to fine)
⇒ 2nd order formulae for smooth solutions
⇒ conservative: ρ_ij ∆V_ij = Σ_m Σ_n ρ_mn ∆V_mn
• general prolongation strategy for curvilinear coordinates
⇒ when Jacobian separable: J = J(x₁, x₂) = J₁(x₁)J₂(x₂)
⇒ introduce new coordinates dy_i = J_i(x_i) dx_i
⇒ arithmetic x_i center divides cell, but Taylor expand about the arithmetic y_i center
[Figure: coarse cell (i, j) with fine subcell volumes ∆V_{±1/4}; coordinate transform y₁(x₁) with dy₁ = J₁(x₁) dx₁ maps the cell to the y coordinates]
• end result for computing new fine cell center values
ρ_{i+1/4, j+1/4} = ρ_ij + (1/2) ∆₁ρ_ij ∆V_{i−1/4, j+1/4} / (∆V_{i−1/4, j+1/4} + ∆V_{i+1/4, j+1/4})
                        + (1/2) ∆₂ρ_ij ∆V_{i+1/4, j−1/4} / (∆V_{i+1/4, j−1/4} + ∆V_{i+1/4, j+1/4})
⇒ involves volumes & (limited) slopes per direction ∆_i ρ
⇒ applicable to polar, cylindrical, spherical grids
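A Python sketch of this volume-weighted prolongation for a single coarse cell (2D, refinement ratio 2; the slope and volume inputs are illustrative placeholders):

```python
import numpy as np

def prolong_cell(rho_c, slope1, slope2, dV):
    """Volume-weighted prolongation of one coarse value to its 4 fine children
    (2D sketch). dV[s1, s2] are fine subcell volumes with s = 0 (-1/4) or
    s = 1 (+1/4); slope1, slope2 are the limited slopes per direction.
    Each correction is weighted by the OPPOSITE subcell volume over the pair
    sum, which makes the operation exactly conservative."""
    fine = np.empty((2, 2))
    for s1 in range(2):
        for s2 in range(2):
            sgn1, sgn2 = 2 * s1 - 1, 2 * s2 - 1     # offsets -1/4 or +1/4
            w1 = dV[1 - s1, s2] / (dV[0, s2] + dV[1, s2])
            w2 = dV[s1, 1 - s2] / (dV[s1, 0] + dV[s1, 1])
            fine[s1, s2] = rho_c + 0.5 * sgn1 * slope1 * w1 + 0.5 * sgn2 * slope2 * w2
    return fine

fine_vals = prolong_cell(1.0, 0.4, 0.8, np.ones((2, 2)))   # uniform-volume case
```

For unequal subcell volumes, as on polar or spherical grids, the opposite-volume weighting guarantees Σ ρ_fine ∆V_fine = ρ_c Σ ∆V regardless of the slopes.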
• planar HD shock tube solved in polar coordinates
Quadtree-Octree AMR
• Octree (3D) or Quadtree (2D) AMR
⇒ use blocks of user-set size (fixed at compile time) to divide the domain
⇒ use same CFL-limited ∆t for all grids
• parallelization using MPI
⇒ 1 AMR level = Domain Decomposition strategy
Quadtree-Octree AMR
• example 2D domain covered by 8 = 4 × 2 base level grid blocks
⇒ hierarchically nested AMR levels, proper nesting
⇒ fixed factor 2 refinement
[Figure: example quadtree grid hierarchy, blocks numbered 1–17 along the Morton curve]
• Space-filling Morton (Z-order) curve
⇒ for N_block = 17 grid blocks
⇒ load-balancing: N_p CPUs each get N_block/N_p ‘adjacent’ blocks
⇒ N_p = 4: P₀: 1 → 5, P₁: 6 → 9, P₂: 10 → 13, P₃: 14 → 17
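Both the Z-order key and the contiguous block assignment are short to sketch in Python (`morton2d` and `load_balance` are hypothetical helper names; the 4 × 2 block list mirrors the base-grid example above):

```python
def morton2d(i, j, bits=16):
    """Interleave the bits of block indices (i, j) into a Morton (Z-order) key."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (2 * b) | ((j >> b) & 1) << (2 * b + 1)
    return key

def load_balance(blocks, nproc):
    """Sort blocks along the Morton curve, hand each CPU a contiguous chunk."""
    ordered = sorted(blocks, key=lambda ij: morton2d(*ij))
    chunk = -(-len(ordered) // nproc)           # ceil division
    return [ordered[p * chunk:(p + 1) * chunk] for p in range(nproc)]

blocks = [(i, j) for i in range(4) for j in range(2)]   # 4 x 2 base grid
parts = load_balance(blocks, nproc=4)                   # 2 adjacent blocks per CPU
```

Because the Morton curve keeps spatial neighbours close in the 1D ordering, contiguous chunks tend to minimize inter-processor ghost-cell traffic.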
⇒ every timestep: full grid-tree re-evaluated
AMR criteria
• automated block-based regridding procedure: 3 steps
⇒ consider all blocks at level 1 < l < l_max
⇒ quantify local error E_i at each gridpoint x_i in a grid block
⇒ if ANY point has E_i > Tol_l: refine block (ensure nesting)
⇒ if ALL points have E_i ≤ f_{Tol,l} Tol_l: coarsen block
• involves (user) parameters:
⇒ error tolerance per level Tol_l
⇒ coarsen fraction f_{Tol,l} per level
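The per-block decision logic reads, in a Python sketch (the tolerance and fraction values are placeholders, not AMRVAC defaults):

```python
def regrid_decision(E, level, lmax, tol, ftol):
    """Refine a block if ANY point error exceeds tol[level]; coarsen only if
    ALL points fall below ftol[level] * tol[level]; otherwise keep it."""
    if level < lmax and max(E) > tol[level]:
        return "refine"
    if level > 1 and max(E) <= ftol[level] * tol[level]:
        return "coarsen"
    return "keep"

tol = {1: 0.1, 2: 0.1}            # error tolerance Tol_l per level (placeholder)
ftol = {1: 0.125, 2: 0.125}       # coarsen fraction per level (placeholder)
decision = regrid_decision([0.02, 0.2], level=2, lmax=3, tol=tol, ftol=ftol)
```

The gap between the refine threshold and the smaller coarsen threshold provides hysteresis, preventing blocks from flip-flopping between levels every step.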
Error estimation
• choice between 3 different local error E_i estimators
⇒ Richardson-based: quantify error at t^{n+1}, use w^{n−1}, w^n
⇒ local comparison between w^{n−1}, w^n
⇒ Löhner (& FLASH3) estimator: use w^n, normalized 2nd derivatives
• all use a user-selection of (conserved or auxiliary) variables
E_i = Σ_{iw} σ_{iw} E_{iw}^{Rel}
⇒ local relative variable errors E_{iw}^{Rel}
⇒ weights obey Σ_{iw} σ_{iw} = 1
• estimators augment with user-coded (de)refinement
Richardson estimator
• procedure: compute 2 future solutions w^{n+1}, using 3 time levels
⇒ start from w^{n−1}, coarsen to 2∆x, integrate with 2∆t
⇒ Coarsened-Integrated solution w_CI
⇒ start from w^n, integrate with ∆t, coarsen to 2∆x
⇒ Integrated-Coarsened solution w_IC
⇒ local relative variable errors E_{iw}^{Rel} from
E_{iw}^{Rel} = |w_{iw}^{CI} − w_{iw}^{IC}| / (Σ_{iw} σ_{iw} |w_{iw}^{IC}|)
• integrate 1st order D-unsplit scheme, only unsplit source terms
Local comparison
• local comparison: employ 2 time levels w^{n−1}, w^n
⇒ local relative variable errors E_{iw}^{Rel} from
E_{iw}^{Rel} = |w_{iw}^{n−1} − w_{iw}^{n}| / |w_{iw}^{n−1}|
• Richardson/Local may need an added buffer zone to refine
⇒ additional user parameter n_buff
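A Python sketch of the local-comparison estimator for a block of selected variables (the weights and the two-variable states are illustrative assumptions):

```python
import numpy as np

def local_comparison_error(w_prev, w_now, sigma):
    """Local error from two time levels: weighted sum over selected variables
    of |w^{n-1} - w^n| / |w^{n-1}|, with weights sigma summing to 1."""
    rel = np.abs(w_prev - w_now) / np.abs(w_prev)
    return np.tensordot(sigma, rel, axes=1)      # sum_w sigma_w * E^Rel_w per point

sigma = np.array([0.7, 0.3])                     # e.g. weight density and pressure
w_prev = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])   # shape (nvar, npoints)
w_now  = np.array([[1.0, 1.1, 1.2], [2.0, 2.0, 2.4]])
E = local_comparison_error(w_prev, w_now, sigma)
```

The estimator is cheap, since both time levels are already stored, but it flags temporal change rather than spatial structure, hence the optional buffer zone around flagged points.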
Löhner estimator I
• Löhner (1987) as PARAMESH/FLASH3 from instantaneous w n
⇒ quantifies normalized 2nd derivatives
⇒ local relative variable errors E iw Rel from
E_{iw}^{Rel} = √( N_{iw} / max(D_{iw}, ε) )
⇒ numerator N_{iw} = Σ_{i₁} Σ_{i₂} [∆_{i₁}(∆_{i₂} w_{iw})]²
⇒ denominator D_{iw} = Σ_{i₁} Σ_{i₂} [ |L_{i₁} w_{iw}| + |R_{i₁} w_{iw}| + f_l S_{i₂}(S_{i₁} |w_{iw}|) ]²
⇒ discrete central, left and right shifts ∆_i, L_i, R_i, as well as sum operator S_i for direction i
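A 1D analogue of the estimator in Python (the full estimator is multi-dimensional with the cross-direction sums above; the ε floor and the f_l value are assumptions):

```python
import numpy as np

def lohner_1d(w, fl=0.01, eps=1e-12):
    """1D Lohner-type estimator: normalized second difference per interior point.
    Numerator: (w[i+1] - 2 w[i] + w[i-1])^2; denominator: squared sum of the
    one-sided first differences plus a wave-filter term fl * sum of |w|."""
    wm, w0, wp = w[:-2], w[1:-1], w[2:]
    num = (wp - 2.0 * w0 + wm) ** 2
    den = (np.abs(wp - w0) + np.abs(w0 - wm)
           + fl * (np.abs(wp) + 2.0 * np.abs(w0) + np.abs(wm))) ** 2
    return np.sqrt(num / np.maximum(den, eps))

x = np.linspace(0.0, 1.0, 101)
w = np.where(x < 0.5, 1.0, 0.1)        # step profile
E = lohner_1d(w)                       # near 1 at the jump, 0 elsewhere
```

The normalization makes E dimensionless and bounded, which is why a single tolerance of order 0.1 works across very different variables, and the f_l term filters out small-amplitude ripples.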
Löhner estimator II
• estimator quantifies a weighted 2nd derivative in grid point i as in
[ Σ_{i₁} Σ_{i₂} (∆x_{i₁} ∆x_{i₂} ∂²w/∂x_{i₁}∂x_{i₂})² / Σ_{i₁} Σ_{i₂} ( |∆x_{i₁} ∂w/∂x_{i₁}|_{i−1} + |∆x_{i₁} ∂w/∂x_{i₁}|_{i+1} + f_l |w_m| )² ]^{1/2}
• (level dependent) ‘wavefilter’ parameter f_l, of order 10⁻²
⇒ can also use logarithm for (positive) variables
• Note: tolerance Tol_l of O(0.1), smaller for Richardson/Local
Parallel I/O
• using full MPI-2 functionality
⇒ all processors write simultaneously to single file, with MPI_FILE_IWRITE_AT constructions, and offset from
Morton-ordered space filling curve
• pure master-slave data file dump option as well
• Files used for restarts
⇒ (increasing AMR levels, different platforms)
• postprocess conversions to unstructured VTK format, for
use with Paraview or Visit, or data formats native to Idl,
Tecplot or OpenDX
Data structures
• grid blocks indexed in variety of ways
⇒ global grid indices, identifying block location
⇒ boolean tree, identifying block presence/absence
⇒ linked list, with pointers to same grid level blocks
[Figure: level 1 grids (8), level 2 grids (15), level 3 grids (4) — global (i, j) grid indices per level, boolean presence/absence tree (T/F entries), and linked lists per level]
⇒ Morton ordered space curve index
• can allow different discretizations per AMR grid level
• the code’s modular structure allows one to
⇒ add physics module, separate from discretizations
⇒ add other discretizations (e.g. higher order FD)
⇒ easily add algorithmic improvements (novel limiters, . . . )
1 MPI-AMRVAC
2 Discretizations
3 Adaptive Mesh Refinement
4 Space Weather modeling
Space weather modeling
• Modern MHD simulations: trigger + evolution of CMEs, e.g.
– Chen et al., NJU: CME by flux emergence
– Mikic et al., SAIC San Diego: CME by flux cancellation
– van der Holst et al.: breakout scenario, followed by Soenen et al., CPA: CME salvo in breakout scenario
• Jacobs, Poedts et al. KULeuven: 3D CME in solar wind
Virtual coronagraph
• solar wind as MHD steady-state
⇒ [Keppens & Goedbloed, Ap. J. 530, 1036 (2000)]
⇒ super(magneto)sonic impacts on Earth’s magnetosphere
Solar wind powering auroral displays
Space weather effects: Northern Lights
• CME scenarios with BATS-R-US AMR code
⇒ CME initiation studies
• compute impact of CME on Earth’s magnetosphere
⇒ faster than real time
– compute challenge (few days), significant range of scales
– Gombosi, Tóth et al., Univ. of Michigan: Centre for Space Environment movie
Current Space weather models: frameworks
• http://csem.engin.umich.edu/
⇒ Center for Space Environment Modeling
⇒ coupled simulations CME to planetary (B) influence:
Centre for Space Environment movie 2
• Earth magnetosphere: bow shock due to impinging solar wind – magnetopause (contact discontinuity)
– inner magnetosphere,
– night-side magnetotail with equatorial current sheet
• size estimate from magnetic pressure of B field vs. ram pressure of wind:
∼ 10 R_E for Earth, ∼ 60 R_J for Jupiter
• Space weather affects all planets! Near-alignment of Earth, Jupiter, Saturn (2000)
⇒ Series of CMEs (seen by SOHO) leading to interplanetary
shock (overtaking and merging shocks), detected as auroral
storms on Earth (Polar orbiter), observed in Jovian radio
activity as measured by Cassini (fly-by on its way to Saturn),
seen by Hubble as auroral activity on Saturn.
• first observation of a CME event traced from Sun to Saturn: Prangé et al., Nature, 432 (4), 78 (2004)
Right: compare MHD simulations, input from WIND spacecraft
[Figure panels: SOHO LASCO/EIT; Earth: POLAR aurora image; Jupiter: Cassini radio signal; Saturn: HST aurora image]