Inference of cosmological parameters from gravitational wave observations

Academic year: 2021

DIPARTIMENTO DI FISICA

CORSO DI LAUREA MAGISTRALE IN FISICA

INFERENCE OF COSMOLOGICAL PARAMETERS FROM GRAVITATIONAL WAVE OBSERVATIONS

CANDIDATE: Stefano Rinaldi
SUPERVISOR: Prof. Walter Del Pozzo


Contents

Introduction

1 Gravitational waves as standard sirens
  1.1 Derivation from Einstein equations
  1.2 Gravitational waves from compact binary coalescences
      1.2.1 Radiation in presence of matter
      1.2.2 Binary black hole inspiral
  1.3 A brief review of cosmology
      1.3.1 Friedmann equations and ΛCDM model
      1.3.2 Redshift versus distance: the Hubble law
  1.4 Standard sirens

2 Bayesian statistics
  2.1 Desiderata and quantitative rules
  2.2 Probability assignment
  2.3 Parameter estimation and model selection
      2.3.1 Likelihood for a deterministic model
      2.3.2 Model selection

3 The state of the art for H0: the Hubble tension
  3.1 Planck
  3.2 SH0ES
  3.3 GW170817
  3.4 The galaxy-catalog method

4 An alternative approach to the galaxy-catalog method
  4.1 Model
      4.1.1 Catalog
      4.1.2 Background information
  4.2 Simulations
      4.2.1 Completeness
  4.3 GW170817
      4.3.1 GW170817 without electromagnetic counterpart
      4.3.2 Completeness
      4.3.3 GW170817 with NGC4993 as host

5 Upside down: physical properties of gravitational wave hosts
  5.1 Model
      5.1.1 Single event likelihood
  5.2 Simulations
      5.2.1 Gaussian distribution with fixed variance - mth = 18
      5.2.2 Gaussian distribution with fixed variance - mth = 99
      5.2.3 Schechter distribution with a sharp cutoff magnitude

6 Conclusions and future prospects

Acknowledgements

Introduction

One of the greatest limitations in astrophysics and cosmology is the measurement of distances. Before the first gravitational wave (GW) detection, the only solution to this problem was the cosmic distance ladder, a combination of several distance indicators. A gravitational signal, on the other hand, carries information on the luminosity distance, giving access to a direct distance measurement on cosmologically relevant scales.

These events can be used as standard sirens to calibrate the relation between luminosity distance and redshift, the generalized Hubble law. Once the luminosity distance and the host galaxy redshift are known, one can apply the Hubble law to measure cosmological parameters such as H0 and ΩM.

Unfortunately, our inability to pinpoint the source makes a direct, precise and solely GW-based determination of the cosmological parameters impossible, since we are completely unaware of the host's redshift. To date, the only exception is GW170817, the first gravitational event with an associated electromagnetic counterpart [1].

A possible solution could be an improvement in the determination of luminosity distance and sky position, which can be achieved with the new Japanese interferometer KAGRA, which joined the LIGO-Virgo Collaboration just before the end of the third LIGO-Virgo observing run, O3. This enhanced interferometer network will be able, hopefully during O4, to tighten the constraint on the position of the gravitational wave host, helping the subsequent follow-up campaigns.

In order to detect the associated electromagnetic signal, the burst has to point toward us, which is quite unlikely since the beam is believed to be strongly collimated. In addition, not every compact binary coalescence is expected to produce a burst: according to the available models, binary black hole coalescences should not produce any electromagnetic signal, making the task of spotting the source even more difficult.

In a realistic scenario, where no electromagnetic counterpart can be identified and no redshift can be measured, two different paths can be followed. The first, described by Li, Del Pozzo and Messenger [13], relies on the deformability of neutron stars. This method, even though its applicability is restricted to binary neutron star (BNS) events, could lead to a completely electromagnetic-independent inference of the cosmological parameters.

The second, more general, path makes use of Bayesian statistics to infer the host galaxy, and was first outlined by Schutz [18]: he proposed a statistical approach to the problem, taking into account all the galaxies which live within the volume reconstructed via the GW posteriors; hence the name galaxy-catalog method. The effectiveness of this method, however, relies on the completeness of the considered catalog: the more complete the catalog is, the more likely it is that the host is included in the list. When working at very short distances, the incompleteness of the catalog can be neglected, but in a more realistic scenario - for instance in binary black hole mergers - the number of undetected galaxies can be comparable to, or even exceed, the number of objects in the catalog: the statistical description of the phenomenon must properly account for this feature.

The method was demonstrated by Del Pozzo [8] and applied in Abbott et al. [19] as described in Gray et al. [11]. With the events detected during the second observing run, the standard siren measurement of the Hubble constant is $H_0 = 67^{+13}_{-7}\ \mathrm{km\ s^{-1}\ Mpc^{-1}}$.

The standard approach used in [19] requires marginalizing over the cases where the host is, and is not, in the catalog, weighting each case with the appropriate probability: this leads to the necessity of computing a selection function.

The aim of this thesis is to present a different approach to the galaxy-catalog method: making use of Bayesian statistics, we show that it is possible, by giving a more general definition of galaxy catalog and by rigorously applying the rules of probability theory, to explicitly derive the posterior probability distribution of the cosmological parameters given a galaxy catalog and a gravitational signal.

We apply the derived method to a set of simulated signals to demonstrate that, at least in a simplified scenario, the recovered posterior distribution is compatible with the injected value. Furthermore, we analyze GW170817 with and without its electromagnetic counterpart, comparing our results with Abbott et al. [2] and Fishbach et al. [10].

Finally, we show how the formalism we developed for the inference of the cosmological parameters, which relies on the assumption of perfect knowledge of the galaxy population, can be inverted: assuming a fiducial cosmology, we demonstrate that, in principle, it is possible to infer the properties of gravitational wave hosts.


Chapter 1

Gravitational waves as standard sirens

1.1 Derivation from Einstein equations

In this chapter we want to show how Einstein’s General Relativity predicts the existence of gravitational waves and how it is possible to use such waves to measure luminosity distances, making them a great example of standard candles or, following the now common analogy with sound, standard sirens.

In doing so we will follow both the derivation and the notation of Maggiore's textbook [14]. The flat-space metric is
$$\eta_{\mu\nu} = \mathrm{diag}(-,+,+,+) \,,$$
while the curved-space metric is denoted by $g_{\mu\nu}(x)$. The Christoffel symbol is
$$\Gamma^\rho_{\mu\nu} = \frac{1}{2}\, g^{\rho\sigma}\left(\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right) \,.$$

We also give the definition of the Riemann curvature tensor,
$$R^\mu_{\ \nu\rho\sigma} = \partial_\rho \Gamma^\mu_{\nu\sigma} - \partial_\sigma \Gamma^\mu_{\nu\rho} + \Gamma^\mu_{\alpha\rho}\, \Gamma^\alpha_{\nu\sigma} - \Gamma^\mu_{\alpha\sigma}\, \Gamma^\alpha_{\nu\rho} \,,$$

and its contractions, the Ricci tensor

$$R_{\mu\nu} = R^\alpha_{\ \mu\alpha\nu} \,,$$

and the Ricci scalar

$$R = g^{\mu\nu} R_{\mu\nu} \,.$$

In this notation, the Einstein equations read
$$R_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} R = \frac{8\pi G}{c^4}\, T_{\mu\nu} \,, \qquad (1.1)$$
where $T_{\mu\nu}$ is the energy-momentum tensor. We wish to study the Einstein equations expanding the curvature tensor around the flat-space metric to linear order:
$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} \,, \qquad |h_{\mu\nu}| \ll 1 \,. \qquad (1.2)$$

First of all, we want to show that $h_{\mu\nu}$ is a tensor and that its smallness is preserved under certain coordinate transformations. A global Lorentz transformation can be written as
$$x^\mu \to \Lambda^\mu_{\ \nu}\, x^\nu \,. \qquad (1.3)$$
From the definition of Lorentz transformation, $\Lambda^\mu_{\ \nu}$ satisfies
$$\Lambda^\rho_{\ \mu}\, \Lambda^\sigma_{\ \nu}\, \eta_{\rho\sigma} = \eta_{\mu\nu} \,, \qquad (1.4)$$
so we can transform the general metric tensor $g_{\mu\nu}$:
$$g_{\mu\nu}(x) \to g'_{\mu\nu}(x') = \Lambda^\rho_{\ \mu}\, \Lambda^\sigma_{\ \nu}\, g_{\rho\sigma}(x) = \Lambda^\rho_{\ \mu}\, \Lambda^\sigma_{\ \nu}\left(\eta_{\rho\sigma} + h_{\rho\sigma}(x)\right) = \eta_{\mu\nu} + \Lambda^\rho_{\ \mu}\, \Lambda^\sigma_{\ \nu}\, h_{\rho\sigma}(x) \,. \qquad (1.5)$$
Under Lorentz transformations, the metric therefore becomes
$$g'_{\mu\nu}(x') = \eta_{\mu\nu} + h'_{\mu\nu}(x') \,, \qquad (1.6)$$
where $h'_{\mu\nu}(x') = \Lambda^\rho_{\ \mu}\, \Lambda^\sigma_{\ \nu}\, h_{\rho\sigma}(x)$: here we see that $h_{\mu\nu}$ transforms as a tensor. Furthermore, we need to preserve the smallness of $h_{\mu\nu}$: any rotation satisfies the required condition, while for boosts we must restrict ourselves to those which do not spoil (1.2).

Hence, the Riemann tensor becomes
$$R_{\mu\nu\rho\sigma} = \frac{1}{2}\left(\partial_\nu\partial_\rho h_{\mu\sigma} + \partial_\mu\partial_\sigma h_{\nu\rho} - \partial_\mu\partial_\rho h_{\nu\sigma} - \partial_\nu\partial_\sigma h_{\mu\rho}\right) \,, \qquad (1.7)$$
where we neglected all terms beyond first order in $h_{\mu\nu}$. Note that in this linearized theory all we need to raise and lower indices is the flat-space metric $\eta_{\mu\nu}$.

The trace of the perturbation is
$$h = \eta^{\mu\nu} h_{\mu\nu} \,. \qquad (1.8)$$
Now, if we define the flat-space d'Alembertian $\Box = \eta^{\mu\nu}\partial_\mu\partial_\nu = \partial^\mu\partial_\mu$, the linearized Ricci tensor and scalar are
$$R_{\mu\nu} = \frac{1}{2}\left(\partial_\alpha\partial_\nu h^\alpha_{\ \mu} + \partial_\alpha\partial_\mu h^\alpha_{\ \nu} - \partial_\mu\partial_\nu h - \Box h_{\mu\nu}\right) \,, \qquad (1.9)$$
$$R = \partial_\alpha\partial_\mu h^{\alpha\mu} - \Box h \,. \qquad (1.10)$$


It is useful, as a shorthand notation, to define
$$\bar h_{\mu\nu} = h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, h \,. \qquad (1.11)$$
Since the trace of this new tensor is
$$\bar h = \eta^{\mu\nu} \bar h_{\mu\nu} = h - 2h = -h \,, \qquad (1.12)$$
one can write the inverse relation, $h_{\mu\nu}$ in terms of $\bar h_{\mu\nu}$:
$$h_{\mu\nu} = \bar h_{\mu\nu} - \frac{1}{2}\, \eta_{\mu\nu}\, \bar h \,. \qquad (1.13)$$

Substituting $\bar h_{\mu\nu}$ in (1.9) and (1.10), one obtains the linearized Einstein equations,
$$\Box \bar h_{\mu\nu} + \eta_{\mu\nu}\, \partial^\rho\partial^\sigma \bar h_{\rho\sigma} - \partial^\rho\partial_\mu \bar h_{\nu\rho} - \partial^\rho\partial_\nu \bar h_{\mu\rho} = -\frac{16\pi G}{c^4}\, T_{\mu\nu} \,. \qquad (1.14)$$

The gauge freedom previously mentioned allows us to choose what is called the harmonic gauge,
$$\partial^\nu \bar h_{\mu\nu} = 0 \,. \qquad (1.15)$$
In this gauge, the last three terms on the left-hand side of (1.14) vanish, leaving a simple wave equation:
$$\Box \bar h_{\mu\nu} = -\frac{16\pi G}{c^4}\, T_{\mu\nu} \,. \qquad (1.16)$$

This set of ten equations is reduced to six independent components by the gauge choice. Outside the source, the energy-momentum tensor vanishes, $T_{\mu\nu} = 0$, giving
$$\Box \bar h_{\mu\nu} = 0 \,. \qquad (1.17)$$
These equations are solved by a plane wave travelling at the speed of light. It is interesting to note that our previous gauge choice does not uniquely determine the metric. In fact, a coordinate transformation $x^\mu \to x^\mu + \xi^\mu$ with
$$\Box \xi^\mu = 0 \qquad (1.18)$$
allows the harmonic gauge to hold. We use this residual gauge freedom to fix some useful properties of $\bar h_{\mu\nu}$. First of all, we use $\xi^0$ to set $\bar h = 0$; in this particular case, $\bar h_{\mu\nu} = h_{\mu\nu}$. The three spatial components $\xi^i$ are chosen so that $h_{0i} = 0$. These four conditions, along with the four Lorentz conditions, set
$$h_{0\mu} = 0\,, \qquad h^i_{\ i} = 0\,, \qquad \partial^j h_{ij} = 0 \,, \qquad (1.19)$$
which are the conditions for the transverse-traceless (TT) gauge. A plane wave solution of (1.17), travelling along the $z$ axis, is
$$h^{TT}_{ij}(z,t) = \begin{pmatrix} h_+ & h_\times \\ h_\times & -h_+ \end{pmatrix}_{ij} \cos\!\left(\omega\,(t - z/c)\right) \,. \qquad (1.20)$$


1.2 Gravitational waves from compact binary coalescences

Up to now, we derived gravitational waves in vacuum, demonstrating the possibility of their existence regardless of how these waves can be generated. In the presence of matter the energy-momentum tensor is not zero, hence Eq. (1.20) is not a solution of
$$\Box \bar h_{\mu\nu} = -\frac{16\pi G}{c^4}\, T_{\mu\nu} \,. \qquad (1.21)$$

In this section we want to present a more general solution to the linearized Einstein equations in order to show that gravitational waves can be radiated by every system with a varying quadrupole moment. Furthermore, we will focus on a precise kind of source, the compact binary system, which is of specific interest for this thesis.

1.2.1 Radiation in presence of matter

The exact solution of Eq. (1.21) is
$$\bar h_{\mu\nu}(t, \vec r) = -\frac{4G}{c^4} \int \frac{T_{\mu\nu}\!\left(t - |\vec r - \vec r\,'|/c,\ \vec r\,'\right)}{|\vec r - \vec r\,'|}\, d^3 r' \,, \qquad (1.22)$$
where $\vec r$ is the position of the source with respect to the observer and $\vec r\,'$ runs over the whole spatial extension of the source itself.

The signals we expect to detect in the LIGO and Virgo interferometers are emitted by sources which are far away from us. Since each source has a typical length $L$ - for example the orbital radius for binary systems, or the object's radius if one considers a spinning star - the requirement of great distance reads
$$r \gg L \,, \qquad (1.23)$$
thus we can expand
$$\frac{1}{|\vec r - \vec r\,'|} \simeq \frac{1}{r}\left(1 + \frac{\vec r\,' \cdot \hat r}{r}\right) \,. \qquad (1.24)$$

During the early stages of life of a binary system, it is reasonable to expect the radiation wavelength to be much greater than $L$. Using the fact that the gravitational wave has the same frequency as the system, one obtains
$$\lambda \gg L \quad \Longleftrightarrow \quad \frac{v}{c} \ll 1 \,, \qquad (1.25)$$
which is called the low-velocity approximation. This allows us to evaluate the right-hand side of Eq. (1.22) at the retarded time. With these approximations, (1.22) becomes
$$\bar h_{\mu\nu}(t, \vec r) = -\frac{4G}{c^4}\, \frac{1}{r} \int T_{\mu\nu}\!\left(t - \frac{r}{c},\ \vec r\,'\right) d^3 r' \,. \qquad (1.26)$$


It can be shown that, using the gauge conditions, symmetry arguments and lowest-order energy conservation - which, we recall, can be written as
$$\partial_\mu T^{\mu\nu} = 0 \quad \text{instead of} \quad D_\mu T^{\mu\nu} = 0 \,, \qquad (1.27)$$
where $D_\mu$ is the proper covariant derivative - the general solution of the linearized Einstein equation is
$$\bar h_{ij} = -\frac{2G}{c^4}\, \frac{1}{r}\, \partial_{00} Q_{ij} \,, \qquad (1.28)$$
where $Q_{ij}$ is the mass quadrupole moment of the system,
$$Q_{ij} = \int x'_i\, x'_j\, \frac{T^{00}}{c^2}\, d^3 x' \,. \qquad (1.29)$$

Since we want a transverse-traceless solution, we need to project onto the required subspace. Consider the transverse projector for a vector,
$$P_{ij}(\hat n) = \delta_{ij} - n_i n_j \,, \qquad (1.30)$$
which is symmetric, transverse and has trace $P_{ii} = 2$. Using this, one can construct
$$\Lambda_{ij,kl} = P_{ik} P_{jl} - \frac{1}{2}\, P_{ij} P_{kl} \,, \qquad (1.31)$$
which is a projector for a matrix:
$$\Lambda_{ij,kl}\, \Lambda_{kl,mn} = \Lambda_{ij,mn} \,. \qquad (1.32)$$
This projector is traceless with respect to each pair of indices,
$$\Lambda_{ii,kl} = \Lambda_{ij,kk} = 0 \,, \qquad (1.33)$$
so we can use $\Lambda_{ij,kl}$ to project the solution onto the TT subspace for every line-of-sight vector $\hat n$. The final expression for the first-order metric perturbation is
$$h^{TT}_{ij}(t, \vec r) = -\frac{2G}{c^4}\, \frac{1}{r}\, \Lambda_{ij,kl}(\hat r)\, \partial_{00} Q_{kl}\!\left(t - \frac{r}{c}\right) \,. \qquad (1.34)$$

We remind once more that this solution is just an approximation. Linearizing Einstein equations means that we are neglecting the effect of the perturbed space-time on the source itself, so there is no back-reaction - the source is not affected by the presence of the gravitational waves. An exact solution is required to account for this. What we are ignoring while imposing Eq. (1.27) in its approximate form is the energy and momentum loss due to the gravitational radiation. This feature will be properly treated in the following section, since it is crucial in understanding the evolution of the system.


1.2.2 Binary black hole inspiral

The general solution derived in the previous section is now applied to a specific case, the compact binary system. This source is of both didactic and practical interest: it is a relatively simple object to deal with, at least in its early stages, and to date it is the only kind of astrophysical source detected by ground-based interferometers. A binary system is composed of two objects orbiting around their common barycenter. We will consider two black holes of masses $m_1$ and $m_2$ and positions $\vec r_1$, $\vec r_2$, treated as point masses in circular orbit. This Newtonian description holds as long as the objects are non-relativistic and their spatial separation is large enough, which is the case during the inspiral phase. In the following, note that time must always be understood as retarded time, even when this is not explicitly stated in every equation.

The reference frame is chosen so that the orbit lies in the $x,y$ plane and the origin coincides with the center of mass of the system. As in classical dynamics, it is useful to define the relative coordinate $\vec x = \vec r_1 - \vec r_2$, the total mass $M = m_1 + m_2$ and the reduced mass $\mu = m_1 m_2 / M$. This reduces the problem to an equivalent two-body system in which the heaviest mass $M$ is fixed in the center of mass with an orbiting particle $\mu$ at distance $R$, with coordinates
$$x(t) = R\cos\!\left(\omega_s t + \frac{\pi}{2}\right)\,, \qquad y(t) = R\sin\!\left(\omega_s t + \frac{\pi}{2}\right)\,, \qquad z(t) = 0 \,,$$
where $\omega_s$ denotes the orbital angular velocity. If the system is isolated, the center of mass does not accelerate, hence the mass fixed there does not produce gravitational waves. The second mass moment² is
$$M_{ij} = \mu\, x_i(t)\, x_j(t) \,, \qquad (1.35)$$
whose non-vanishing components are
$$M_{11} = \mu R^2\, \frac{1-\cos(2\omega_s t)}{2} \,, \qquad \ddot M_{11} = 2\mu R^2 \omega_s^2 \cos(2\omega_s t) \,, \qquad (1.36)$$
$$M_{22} = \mu R^2\, \frac{1+\cos(2\omega_s t)}{2} \,, \qquad \ddot M_{22} = -2\mu R^2 \omega_s^2 \cos(2\omega_s t) \,, \qquad (1.37)$$
$$M_{12} = -\frac{1}{2}\,\mu R^2 \sin(2\omega_s t) \,, \qquad \ddot M_{12} = 2\mu R^2 \omega_s^2 \sin(2\omega_s t) \,. \qquad (1.38)$$

²Eq. (1.34) is written in terms of $Q_{ij}$ instead of $M_{ij}$: for our purposes, however, the two quantities are equivalent, and the latter is slightly more practical to compute.
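As a quick sanity check on the algebra, the analytic second derivatives in Eqs. (1.36)-(1.38) can be compared against a direct numerical differentiation of the mass moments; the sketch below does this for the $M_{11}$ component, using arbitrary illustrative values for µ, R and ω_s (not physical ones).

```python
import math

# Illustrative, dimensionless values for the reduced mass, orbital radius
# and orbital angular velocity.
mu, R, omega_s = 1.0, 1.0, 2.0

def M11(t):
    # mu * x(t)**2 with x(t) = R*cos(omega_s*t + pi/2), as in Eq. (1.35)
    return mu * (R * math.cos(omega_s * t + math.pi / 2)) ** 2

def M11_ddot_analytic(t):
    # Eq. (1.36): 2 mu R^2 omega_s^2 cos(2 omega_s t)
    return 2 * mu * R**2 * omega_s**2 * math.cos(2 * omega_s * t)

def second_derivative(f, t, h=1e-4):
    # central finite-difference approximation of f''(t)
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

t = 0.3
assert abs(second_derivative(M11, t) - M11_ddot_analytic(t)) < 1e-5
print("OK: numerical and analytic second derivatives agree")
```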


Putting those quantities in Eq. (1.34) with the explicit form of $\Lambda_{ij,kl}$, we get the wave amplitude
$$h_+ = \frac{1}{r}\, \frac{4 G \mu\, \omega_s^2 R^2}{c^4}\, \frac{1+\cos^2\iota}{2}\, \cos(2\omega_s t) \,, \qquad (1.39)$$
$$h_\times = \frac{1}{r}\, \frac{4 G \mu\, \omega_s^2 R^2}{c^4}\, \cos\iota\, \sin(2\omega_s t) \,, \qquad (1.40)$$
where $\iota$ is the angle between the normal to the orbital plane and the line of sight, named inclination. The orbital radius can be removed with Kepler's law in such a way that, introducing the mass combination
$$M_c = \frac{(m_1 m_2)^{3/5}}{(m_1 + m_2)^{1/5}} \qquad (1.41)$$
and $\omega_{GW} = 2\omega_s$, the amplitude becomes
$$h_+ = \frac{4}{r}\left(\frac{G M_c}{c^2}\right)^{5/3}\left(\frac{\omega_{GW}}{2c}\right)^{2/3} \frac{1+\cos^2\iota}{2}\, \cos(\omega_{GW} t) \,, \qquad (1.42)$$
$$h_\times = \frac{4}{r}\left(\frac{G M_c}{c^2}\right)^{5/3}\left(\frac{\omega_{GW}}{2c}\right)^{2/3} \cos\iota\, \sin(\omega_{GW} t) \,. \qquad (1.43)$$

In the quadrupole approximation one can compute the total power radiated by the system,
$$P = \frac{32}{5}\, \frac{c^5}{G}\left(\frac{G M_c\, \omega_{GW}}{2 c^3}\right)^{10/3} \,. \qquad (1.44)$$
Since we are assuming circular orbits, as long as our approximations hold we can compute the system's total energy at a given angular velocity,
$$E = -\left(\frac{G^2 M_c^5\, \omega_{GW}^2}{32}\right)^{1/3} \,. \qquad (1.45)$$
The total radiated power must equal the rate at which the orbital energy is lost, $P = -\frac{dE}{dt}$, which yields
$$\dot\omega_{GW} = \frac{12}{5}\, 2^{1/3}\left(\frac{G M_c}{c^3}\right)^{5/3} \omega_{GW}^{11/3} \,. \qquad (1.46)$$

The solution of this differential equation diverges at a definite time, named the coalescence time $t_c$. This divergence tells us that the approximations made above are no longer valid, hence we are no longer allowed to describe the system as two point masses in circular orbit: at this time, the two masses are plunging into each other.

The analytic solution for $\omega_{GW}(t)$ is
$$\omega_{GW}(t) = 2\left(\frac{5}{256}\, \frac{1}{t_c - t}\right)^{3/8}\left(\frac{G M_c}{c^3}\right)^{-5/8} \,, \qquad (1.47)$$
which gives the signal its characteristic chirp behaviour. The signal and frequency evolution of a mock system are plotted in Figure 1.1.
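A minimal sketch of the chirp of Eq. (1.47): the gravitational wave frequency grows, and formally diverges, as $t \to t_c$. The chirp mass below (roughly that of a 1.4+1.4 solar-mass binary) is an illustrative choice, not tied to any specific event.

```python
import math

G, c, M_sun = 6.674e-11, 2.998e8, 1.989e30
Mc = 1.22 * M_sun                    # ~chirp mass of a 1.4+1.4 M_sun binary

def omega_gw(tau):
    """Eq. (1.47) with tau = t_c - t (seconds before coalescence)."""
    return 2 * (5 / 256 / tau) ** (3 / 8) * (G * Mc / c**3) ** (-5 / 8)

# the frequency sweeps upward as coalescence approaches
for tau in (100.0, 1.0, 0.01):
    f = omega_gw(tau) / (2 * math.pi)
    print(f"{tau:7.2f} s before merger: f_GW = {f:8.1f} Hz")
```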

Figure 1.1: Plus polarization (upper panel) and frequency (lower panel) of a face-on (ι = 0) system with $M_c = 200\,M_\odot$ at $r = 40$ Mpc, computed using Eqs. (1.42) and (1.47).


1.3 A brief review of cosmology

Humanity believed for centuries that Earth was the center of the Universe, and everything else moved around it. Nowadays we know that this is not the case: there is no reason to think that Earth and the Solar System occupy a privileged place in the Universe. The founding principle of cosmology, the Cosmological Principle, states that

Viewed on a sufficiently large scale, the properties of the Universe are the same for all observers.

In other words, we make the hypothesis that the Universe is spatially homogeneous and isotropic. Under these assumptions it is possible to choose a set of coordinates $t, r, \theta, \phi$, in a maximally symmetric space, for which the metric takes the form
$$ds^2 = dt^2 - a^2(t)\left(\frac{dr^2}{1 - kr^2} + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2\right) \,, \qquad (1.48)$$

where $a(t)$ acts as a scale factor, an unknown function of time alone, and $k$ is a constant which expresses the curvature of the maximally symmetric space:

• k = −1: negative curvature, open space (hyperbolic)
• k = 0: flat space (Euclidean)
• k = +1: positive curvature, closed space (spherical)

These coordinates are called comoving coordinates. Using them, one can describe the motion of the galaxies as that of free-falling objects with fixed spatial coordinates $r, \theta, \phi$. However, in an expanding (or contracting) Universe, the temporal evolution of the scale factor $a(t)$ results in a radial motion of the galaxies with respect to each observer, each of whom could equally well regard themselves as the center of the Universe. We will refer to this motion as the Hubble flow.

The radial coordinate $r$ alone is not an expression of the physical spatial separation between the observer and an object with coordinate $r_1$. The notion of proper distance is defined by
$$d_{prop}(t) = a(t) \int_0^{r_1} \frac{dr}{\sqrt{1 - kr^2}} \,. \qquad (1.49)$$

The fact that the Universe is expanding³ affects not only bodies like galaxies but also light. It can be proven that a photon emitted at a certain time $t_e$ and detected at another time $t_0$ experiences a redshift $z$:
$$z = \frac{a(t_0)}{a(t_e)} - 1 \,. \qquad (1.50)$$

³In principle, everything remains valid in the opposite case, in which the Universe is contracting; but since we now know that this is not the case, we will consider only the expansion case.

It will be useful in the following to define the Hubble parameter,
$$H(t) = \frac{\dot a(t)}{a(t)} \,, \qquad (1.51)$$
and the Hubble constant,
$$H_0 = H(t_0) \,, \qquad (1.52)$$
where the subscript 0 denotes the present time.

It is interesting to note that, if galaxies are neither created nor destroyed, the galaxy current $J^\mu_G$,
$$J^\mu_G = n_G\, U^\mu \,, \qquad (1.53)$$
where
$$U^t = 1\,, \qquad U^i = 0 \,, \qquad (1.54)$$
obeys the conservation equation
$$0 = D_\mu J^\mu_G = g^{-1/2}\, \frac{\partial}{\partial x^\mu}\left(g^{1/2}\, J^\mu_G\right) = g^{-1/2}\, \frac{\partial}{\partial t}\left(g^{1/2}\, n_G\right) \,, \qquad (1.55)$$
where $-g$ is the determinant of the metric (1.48),
$$g = a^6(t)\, r^4\, (1 - kr^2)^{-1} \sin^2\theta \,. \qquad (1.56)$$
The conservation of galaxies hence reads
$$n_G(t)\, a^3(t) = \mathrm{constant} \,, \qquad (1.57)$$
so the galaxy number density per unit comoving volume remains constant.

The scale factor $a(t)$ is the main ingredient of every cosmological calculation one can do: it is clear that the determination of the function $a(t)$ is of the greatest importance.

1.3.1 Friedmann equations and ΛCDM model

The Universe is a closed system: this means that its evolution depends only on its content. Combining the Einstein equations, the conservation of energy and the equation of state of the perfect fluid that composes the Universe, one ends up with the Friedmann equations,
$$H^2 = \frac{8\pi G}{3}\,\rho - \frac{k c^2}{a^2} \quad \Longleftrightarrow \quad \frac{H^2}{H_0^2} = \frac{\rho}{\rho_{cr}} + \frac{\rho_k}{\rho_{cr}} \,, \qquad (1.58)$$


$$\dot H + H^2 = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) \,, \qquad (1.59)$$

where $\rho$ is the total density of the Universe components, $\rho_{cr} = \frac{3 H_0^2}{8\pi G}$ is the critical density, defined as the density required to have a flat Universe, and $\rho_k = -\frac{3 k c^2}{8\pi G\, a^2}$ is a spatial curvature density. Once this composition is known, the dynamics can be predicted using the Friedmann equations. As far as we know, the Universe contains three different kinds of energy, plus curvature:

• Non-relativistic matter (baryonic matter and dark matter): ρM ∝ a⁻³.

• Relativistic matter (photons and neutrinos): ρR ∝ a⁻⁴, due to redshift.

• Dark energy: ρΛ = constant.

• Curvature: ρk ∝ a⁻².

It is common practice in cosmology to define the density parameter $\Omega_i$ as the ratio between each energy density and the critical density,
$$\Omega_i = \frac{\rho_i}{\rho_{cr}} \,, \qquad (1.60)$$
where the index $i = M, R, \Lambda, k$. A simple analytic solution of the Friedmann equations exists only in simplified scenarios, for example when one of the densities dominates over the others. Since each density has a different dependency on the scale factor $a(t)$, different eras during the growth of the Universe were dominated by different energy components. From observations, we now know that we live in a Λ-dominated era, with a significant amount of energy carried by cold (mostly dark) matter, while radiation and curvature energy densities are negligible: hence the name ΛCDM model.
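In this regime, the first Friedmann equation (1.58) reduces to $H(z) = H_0\sqrt{\Omega_M (1+z)^3 + \Omega_\Lambda}$ for a flat Universe. A minimal sketch, with illustrative round values for $H_0$ and $\Omega_M$ (not a measurement):

```python
import math

H0      = 70.0            # Hubble constant, km/s/Mpc (illustrative value)
Omega_M = 0.3             # matter density parameter (illustrative value)
Omega_L = 1.0 - Omega_M   # flatness: the densities sum to the critical one

def H(z):
    # flat LambdaCDM, neglecting radiation and curvature
    return H0 * math.sqrt(Omega_M * (1 + z) ** 3 + Omega_L)

print(H(0.0))   # = H0
print(H(1.0))   # matter term dominates: ~123 km/s/Mpc
```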

1.3.2 Redshift versus distance: the Hubble law

An alternative way to measure distances in cosmology - since direct measurement is impossible - is the so-called luminosity distance measurement. If one identifies a source of known absolute luminosity $L = dE/dt$ and measured flux $F$, the luminosity distance $D_L$ is defined as
$$D_L = \sqrt{\frac{L}{4\pi F}} \,. \qquad (1.61)$$
A major issue of this approach is that we are forced to know the intrinsic luminosity of the source, which is not trivial in most cases. We will show in the next section that gravitational waves give direct access to this quantity. If the source we are considering is redshifted due to the expansion of the Universe, two main effects must be considered:

• The energy is redshifted: E0 = Ee/(1 + z)

• The time is dilated: dt0 = (1 + z)dte

The physical surface of a sphere with comoving radius $r$ is independent of curvature:
$$A = 4\pi\, a^2(t_0)\, r^2 \,, \qquad (1.62)$$

hence the flux for a redshifted source is
$$F = \frac{L}{4\pi\, a^2(t_0)\, r^2\, (1+z)^2} \,. \qquad (1.63)$$
Comparing this equation with (1.61), we obtain
$$D_L = (1+z)\, a(t_0)\, r \,. \qquad (1.64)$$

If we consider a galaxy located at $(r, \theta, \phi)$, the physical distance between us and this galaxy is
$$r_{phys}(t) = a(t) \int_0^{r} \frac{dr'}{\sqrt{1 - kr'^2}} \,. \qquad (1.65)$$

A photon (or, equivalently, a gravitational wave) emitted by this galaxy at a certain time $t_e$ is detected by an observer at a time $t_0$ obtained by imposing $ds^2 = 0$. Thus we can write
$$\int_{t_e}^{t_0} \frac{c\, dt}{a(t)} = \int_0^{r} \frac{dr'}{\sqrt{1 - kr'^2}} \,. \qquad (1.66)$$
By means of this relation, and differentiating the redshift definition (1.50),
$$\frac{dt}{a(t)} = -\frac{1}{a(t_0)}\, \frac{dz}{H(z)} \,, \qquad (1.67)$$
where $H(z)$ is given by (1.58), one can write (assuming a flat Universe, so that the right-hand-side integral equals $r$):
$$a(t_0)\, r = c \int_0^{z} \frac{dz'}{H(z')} \,. \qquad (1.68)$$

Using (1.64), we obtain the general expression of the Hubble law,
$$D_L(z) = c\,(1+z) \int_0^{z} \frac{dz'}{H(z')} \,. \qquad (1.69)$$
The knowledge of $D_L(z)$ is therefore crucial, encoding the whole expansion history of the Universe.
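Eq. (1.69) can be evaluated numerically for a flat ΛCDM Universe. The sketch below uses illustrative round values for $H_0$ and $\Omega_M$ and a simple trapezoidal rule - a minimal demonstration, not a production integrator.

```python
import math

c_km_s  = 2.998e5        # speed of light, km/s
H0      = 70.0           # Hubble constant, km/s/Mpc (illustrative)
Omega_M = 0.3            # matter density parameter (illustrative)

def H(z):
    # flat LambdaCDM expansion rate, Eq. (1.58) without radiation/curvature
    return H0 * math.sqrt(Omega_M * (1 + z) ** 3 + 1 - Omega_M)

def D_L(z, n=10_000):
    """Luminosity distance in Mpc: Eq. (1.69), trapezoidal integration."""
    dz = z / n
    grid = [i * dz for i in range(n + 1)]
    integral = sum((1 / H(a) + 1 / H(b)) / 2 * dz
                   for a, b in zip(grid, grid[1:]))
    return c_km_s * (1 + z) * integral

print(f"D_L(z=0.01) = {D_L(0.01):7.1f} Mpc")   # ~ c z / H0 ~ 43 Mpc
print(f"D_L(z=1.00) = {D_L(1.00):7.1f} Mpc")   # ~ 6600 Mpc
```

At low redshift this reduces to the familiar linear Hubble law, $D_L \simeq c z / H_0$, which is why nearby standard sirens constrain $H_0$ directly.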


1.4 Standard sirens

The solution we derived in Section 1.2 lives in a flat, static Universe. We now want to discuss how a gravitational wave propagates across cosmological distances in a Friedmann-Robertson-Walker-Lemaître Universe, as described in the previous section.

It is useful to define what is called the local wave zone as the region which is far enough from the source for the gravitational field to exhibit the $1/r$ wave behaviour, but still close enough that the expansion of the Universe can be neglected. Assuming that the inspiral time is small compared to the expansion time scale $1/H(t)$,⁴ the physical distance the wave travelled before being detected is
$$r_{phys} = r\, a(t_0) \,, \qquad (1.70)$$

where $r$ is the comoving distance of the emitting galaxy. In the following, it will be useful to distinguish between quantities as seen by the source, denoted by $s$, and quantities measured by an observer at cosmological distance, denoted by $obs$. The two polarizations (1.42) and (1.43) become
$$h_+(t_s) = h_c(t_s^{ret})\, \frac{1+\cos^2\iota}{2}\, \cos\!\left(2\pi\!\int^{t_s^{ret}}\! f_{GW}^{(s)}(t'_s)\, dt'_s\right) \,, \qquad (1.71)$$
$$h_\times(t_s) = h_c(t_s^{ret})\, \cos\iota\, \sin\!\left(2\pi\!\int^{t_s^{ret}}\! f_{GW}^{(s)}(t'_s)\, dt'_s\right) \,, \qquad (1.72)$$
with amplitude
$$h_c(t_s^{ret}) = \frac{4}{a(t_0)\, r}\left(\frac{G M_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{GW}^{(s)}(t_s^{ret})}{c}\right)^{2/3} \qquad (1.73)$$
and frequency
$$f_{GW}^{(s)}(\tau_s) = \frac{1}{\pi}\left(\frac{5}{256}\, \frac{1}{\tau_s}\right)^{3/8}\left(\frac{G M_c}{c^3}\right)^{-5/8} \,, \qquad (1.74)$$

since the calculation was carried out in the source's reference frame.

It is however more convenient to express these quantities in terms of the observed time $t_{obs} = (1+z)\, t_s$ and the observed frequency $f_{GW}^{(obs)} = f_{GW}^{(s)}/(1+z)$. The phase factor in Equations (1.71) and (1.72) is not affected by the change of reference frame, since the effects on time and frequency balance exactly. Remembering the relation between physical distance and luminosity distance (1.64), and properly transforming frequencies, the amplitude (1.73) becomes
$$h_c(t_{obs}^{ret}) = \frac{4}{D_L}\,(1+z)^{5/3}\left(\frac{G M_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{GW}^{(obs)}(t_{obs}^{ret})}{c}\right)^{2/3} \,. \qquad (1.75)$$

4It can be proved that this assumption is satisfied as long as 2πf


Now, if we define the redshifted chirp mass
$$\mathcal{M}_c = (1+z)\, M_c \,, \qquad (1.76)$$
the amplitude reads
$$h_c(t_{obs}^{ret}) = \frac{4}{D_L}\left(\frac{G \mathcal{M}_c}{c^2}\right)^{5/3}\left(\frac{\pi f_{GW}^{(obs)}(t_{obs}^{ret})}{c}\right)^{2/3} \,, \qquad (1.77)$$
as does the differential equation for the frequency,
$$\dot f_{GW}^{(obs)} = \frac{96}{5}\, \pi^{8/3}\left(\frac{G \mathcal{M}_c}{c^3}\right)^{5/3}\left(f_{GW}^{(obs)}\right)^{11/3} \,. \qquad (1.78)$$

These results are of the greatest importance: we have shown that the problem has only three free parameters, $\iota$, $\mathcal{M}_c$ and $D_L$. Since we are able to measure the amplitudes of both polarizations and $\dot f_{GW}^{(obs)}$, the parameters of the problem can be determined.

Since this kind of signal carries direct information on the luminosity distance, gravitational waves can be used as standard candles or, as proposed by Holz and Hughes, standard sirens.
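The standard-siren idea can be sketched numerically: Eq. (1.78) gives the redshifted chirp mass from a measurement of frequency and frequency derivative, and Eq. (1.77) then turns the measured amplitude into a luminosity distance. The "measurement" below is synthetic, generated from the same two equations with invented parameter values.

```python
import math

G, c, M_sun, Mpc = 6.674e-11, 2.998e8, 1.989e30, 3.086e22

# --- synthetic "measurement", generated from Eqs. (1.77)-(1.78) ---
Mz_true, DL_true = 30 * M_sun, 1000 * Mpc   # invented injection values
f = 50.0                                    # observed GW frequency, Hz
fdot = 96 / 5 * math.pi ** (8 / 3) * (G * Mz_true / c**3) ** (5 / 3) * f ** (11 / 3)
hc   = 4 / DL_true * (G * Mz_true / c**2) ** (5 / 3) * (math.pi * f / c) ** (2 / 3)

# --- inversion: chirp mass from (1.78), then distance from (1.77) ---
Mz = c**3 / G * (5 / 96 * fdot * math.pi ** (-8 / 3) * f ** (-11 / 3)) ** (3 / 5)
DL = 4 / hc * (G * Mz / c**2) ** (5 / 3) * (math.pi * f / c) ** (2 / 3)

print(f"recovered chirp mass: {Mz / M_sun:.1f} M_sun")   # 30.0
print(f"recovered distance:   {DL / Mpc:.0f} Mpc")       # 1000
```

Note that only the redshifted combination $(1+z)M_c$ is measurable: the distance comes for free, but the redshift does not, which is exactly the problem the rest of this thesis addresses.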


Chapter 2

Bayesian statistics

I see a man running with a bag. Is that man a thief? With this as the only information at hand, answering with certainty is impossible.

A more correct way to ask the question would be: how probable is it that the man has stolen the bag? Even if better posed than before, this question still has no definite answer.

Imagine that he is carrying a woman’s bag, with someone in the crowd crying ”My bag!”. We would all agree that it is very likely that the man is a thief. On the other hand, if the bag is a backpack and he’s running towards the bus stop, we would conclude that our alleged thief is just late.

This simple example is meant to highlight that there is no such thing as an absolute probability assignment: every assignment is conditioned on some other piece of information we have, even if this is the complete lack of information.

In this chapter we describe the basics of Bayesian statistics, which gives the rules to assign probabilities quantitatively. This will be the main tool used to infer the cosmological parameters we are interested in.

Bayesian statistics is about logical propositions: these will be denoted via letters and symbols, to be defined each time a new statement is presented. For example:

A: I drew a red ball from an urn
B: The urn contains 10 red balls and 10 white balls
C: The sky is cloudy

The notation we are going to use is the following:


p(A|B) = probability that A is true given that B is true, (2.2)

along with the Boolean algebra operations:

A + B   logical disjunction (OR)
A · B   logical conjunction (AND)
Ā       logical negation (NOT)

Since every probability assignment is conditioned on some kind of background information we have, we will reserve, in the following and throughout the whole thesis, the letter I for the set of statements that describes our state of knowledge of the phenomena we are investigating.

It is important to keep in mind that Bayesian statistics is based on plausibility, not probability. However, probability can be defined as a function of plausibility. Since the aim of this chapter is to give just a brief overview of this approach to statistics, we are not going to give a formal proof of the connection between plausibility and probability.

2.1 Desiderata and quantitative rules

The Bayesian approach to statistics is based on three founding principles, which describe what we desire from our probability theory. These statements, called the desiderata, are:

1. Plausibility is represented by real numbers:
$$(A|C) = x \,. \qquad (2.3)$$
If we have more information supporting $A$, say $C'$, the plausibility of our statement must be larger:
$$(A|C') = y \,, \qquad y > x \,. \qquad (2.4)$$
Finally, given two different statements conditioned on the same information $C$, if statement $A$ is more plausible than statement $B$ then we must have
$$(A|C) > (B|C) \,. \qquad (2.5)$$

2. Our theory must be in qualitative accordance with common sense. If we have two statements $A$ and $B$ that satisfy
$$(A|C') > (A|C) \,, \qquad (B|AC') = (B|AC) \,, \qquad (2.6)$$
meaning that statement $B$ is not affected by the replacement of $C$ with $C'$ while statement $A$ is, we must get
$$(AB|C') > (AB|C) \,. \qquad (2.7)$$

3. Our theory has to be consistent, meaning that:

(a) If a result can be obtained in more than one way, we must obtain the same result;

(b) Every available piece of information must be used;

(c) Equivalent sets of information must always be assigned the same plausibility value.

From these desiderata it is possible to build a probability theory which allows us to assign probability distributions.

In particular, as demonstrated by Cox [6], one can derive the sum rule for probability,
$$p(A+B|C) = p(A|C) + p(B|C) - p(AB|C) \,, \qquad (2.8)$$
and the product rule, or chain rule,
$$p(B|AC)\, p(A|C) = p(AB|C) = p(A|BC)\, p(B|C) \,, \qquad (2.9)$$
where the two factorizations are equal because the logical product is commutative. It is interesting to note that, once the chain rule is proven, Bayes theorem follows immediately:

p(A|B) = p(B|A)p(A)

p(B) . (2.10)
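A toy numerical illustration of the chain of rules above (all numbers are invented for illustration): a rare condition A with prior p(A) = 0.01 is probed by a test B with p(B|A) = 0.95 and false-alarm rate p(B|Ā) = 0.05.

```python
# Toy numerical check of Bayes' theorem; all numbers are invented for
# illustration purposes only.
p_A = 0.01            # prior p(A)
p_B_given_A = 0.95    # p(B|A)
p_B_given_notA = 0.05 # p(B|not A), false-alarm rate

# Marginalization: p(B) = p(B|A)p(A) + p(B|not A)p(not A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1.0 - p_A)

# Bayes' theorem (2.10): p(A|B) = p(B|A)p(A)/p(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(round(p_A_given_B, 3))  # 0.161: a positive test is still likely a false alarm
```

Even with an accurate test, the small prior dominates the posterior: this is the quantitative content of Bayes' theorem.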

A set of propositions A1 . . . AN is said to be mutually exclusive if

p(AiAj|C) = 0 ∀ i ≠ j , (2.11)

so the sum rule reads, for such a set,

p(Σi Ai|C) = Σi p(Ai|C) . (2.12)

A set of propositions that satisfies

Σi p(Ai|C) = 1 (2.13)

is said to be exhaustive. If we imagine having a set of propositions (or hypotheses, or models) Hi, mutually exclusive and exhaustive, that describes some kind of data D we have, thanks to Bayes' theorem we can write

1 = Σi p(Hi|DI) = [Σi p(D|HiI)p(Hi|I)] / p(D|I) , (2.14)

which leads to the marginalization rule

p(D|I) = Σi p(D|HiI)p(Hi|I) , (2.15)

which is often useful whenever one has to deal with nuisance parameters or with variables whose measurement is not possible for any reason. So far we have considered the case in which Hi is discrete. However, when dealing with real problems, it is not unlikely to have a model M describing a physical phenomenon with some unknown continuous parameter θ - e.g. the problem of fitting a line over a set of data. The generalization of what has been presented before to a continuous parameter space is done by introducing the probability density function

p(θ|M I)dθ , (2.16)

which is the probability that θ ∈ [θ, θ + dθ]. Using this formalism, there is no distinction between discrete and continuous probability distributions. Equation (2.15) becomes

p(D|M I) = ∫Θ p(D|θM I)p(θ|M I)dθ , (2.17)

where Θ is the parameter space.
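The continuous marginalization of Eq. (2.17) can be sketched numerically. Below is a minimal grid-integration example under invented assumptions: n Gaussian measurements of an unknown mean θ with known σ, and a uniform prior on θ in [0, 10].

```python
import numpy as np

# Hypothetical example (all numbers invented): n Gaussian measurements of an
# unknown mean theta with known sigma, uniform prior p(theta|MI) on [0, 10].
# The evidence p(D|MI) of Eq. (2.17) is approximated on a grid.
rng = np.random.default_rng(0)
sigma, theta_true, n = 1.0, 4.2, 20
data = rng.normal(theta_true, sigma, size=n)

theta, dtheta = np.linspace(0.0, 10.0, 10_001, retstep=True)
prior = np.full_like(theta, 1.0 / 10.0)             # uniform prior density
log_like = (-0.5 * (data[:, None] - theta) ** 2 / sigma**2
            - 0.5 * np.log(2 * np.pi * sigma**2)).sum(axis=0)

evidence = np.sum(np.exp(log_like) * prior) * dtheta  # grid version of Eq. (2.17)
posterior = np.exp(log_like) * prior / evidence       # normalized posterior density
print(np.sum(posterior) * dtheta)                     # integrates to 1 by construction
```

The same grid gives the posterior for free once the evidence is known, which is the pattern used throughout the rest of this chapter.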

2.2 Probability assignment

So far we have only given rules to deal with probabilities once distributions are known. Probability assignment is hence crucial in statistical inference, but it is also one of the most debated tasks, since there is no unique way to decide which probability distribution best fits our state of knowledge of the problem we are dealing with.

However, some principles can be used as guidelines to define probability distributions for certain classes of problems. In the following, we will use the words prior distribution to mean a probability assignment which is not conditioned on any data or piece of information not already contained in the background information I. In a certain way, this reflects our state of knowledge of the problem before performing any experiment, a priori.


• Principle of Indifference: stated by Laplace as If among the possible outcomes there is no reason to prefer any of them over any other, then all outcomes should be equally probable. The resulting probability distribution is

p(Ai|I) = 1/N (2.18)

for N possible discrete outcomes and

p(θ|I) = 1/(θmax − θmin) (2.19)

for a continuous parameter supposed to lie in [θmin, θmax]. It is important to underline that we are not conditioning on the information I just for the sake of formalism. The name uninformative prior is somewhat misleading: it does not mean that we have no information at all, since we know at least that the number of possible outcomes is finite or that the parameter space is limited. A more appropriate name might be weakly informative prior.

• Invariance principle: if the problem shows some degree of symmetry, a natural choice is a probability distribution that reproduces the symmetry properties of the described system. This is the case of an overall scale factor, under whose transformation the physics of the system is invariant. The probability distribution that arises from this principle in the scale-factor case is the Jeffreys prior

p(σ|I) ∝ 1/σ . (2.20)

• Maximum entropy principle: proposed by Jaynes in 1957 [12], it is useful to quantify the information encoded in a probability distribution. It states that the probability distribution which best represents the current state of knowledge is the one with the largest entropy. In the discrete case, assuming that pi is the probability associated to the i−th possible outcome out of N, the entropy S is defined as

S(p1 . . . pN) = Σi=1..N −pi log pi (2.21)

or, dealing with continuous variables,

S[p(x)] = ∫ −p(x) log p(x)dx . (2.22)
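The discrete entropy (2.21) is simple to evaluate, and a quick computation illustrates the principle: among distributions over N outcomes, the uniform assignment attains the maximum S = log N (the numbers below are invented for illustration).

```python
import numpy as np

def entropy(p):
    """Shannon entropy S = -sum_i p_i log p_i; zero-probability terms contribute 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

N = 4
uniform = np.full(N, 1.0 / N)
skewed = np.array([0.7, 0.1, 0.1, 0.1])  # an arbitrary, more informative assignment

print(entropy(uniform))  # log(4) ~ 1.386: the uniform assignment maximizes S
print(entropy(skewed))   # lower entropy: this assignment encodes extra information
```

Any deviation from uniformity lowers the entropy, i.e. injects information we do not actually possess, which is exactly what the maximum entropy principle forbids.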

These are not the sole available guiding principles. Nothing prevents us from assigning different prior distributions according to our state of knowledge of the system.


From a naive point of view, this arbitrariness in probability assignment might suggest that one can obtain different posterior distributions with different a priori choices. This is only partially true: the result will always be affected by the prior assignment, but this dependence is strongly suppressed if the data we collect from experiments are sufficiently informative. In the following section we give the basics of parameter estimation and model selection.

2.3 Parameter estimation and model selection

Parameter estimation is a common problem when dealing with the outcome of an experiment. Imagine that we have a certain dataset D produced by an experiment, which we believe is well described by a model M requiring a set of parameters θ. For now, we are not questioning how much we believe in this model, postponing this question to the end of the section. What is the most probable value for θ, given that we measured the data D and that we have a certain background information I? In a Bayesian framework, we can write

p(θ|DM I) = p(D|θM I)p(θ|M I) / p(D|M I) . (2.23)

These four terms have specific names, reflecting their role in Bayesian inference.

• p(θ|MI) is the prior probability, which encodes our estimate of the parameter θ before the acquisition of the data D;

• p(D|θMI) is called the likelihood, and it represents the probability of observing the data D given the model M and the parameters θ;

• p(D|MI) is the evidence, being the probability of observing D regardless of θ. It acts as a normalization constant;

• p(θ|DMI) is the posterior probability, since it describes the updated probability for θ once we have observed the data D.

It is interesting to note that all the information we are looking for (best estimate, credible intervals, etc.) is encoded in the posterior probability distribution, while the connection between data and model is fully described by the likelihood function.
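The four ingredients of Eq. (2.23) can be made concrete with a hypothetical textbook example (not taken from the thesis): estimating the bias θ of a coin after observing k heads in n tosses, with a uniform prior and everything evaluated on a grid.

```python
import numpy as np

# Hypothetical parameter-estimation example: posterior for a coin bias theta
# after k heads in n tosses, evaluating Eq. (2.23) on a grid with a uniform prior.
n, k = 20, 14
theta, dtheta = np.linspace(0.0, 1.0, 2001, retstep=True)

prior = np.ones_like(theta)                     # p(theta|MI), uniform on [0, 1]
likelihood = theta**k * (1.0 - theta)**(n - k)  # p(D|theta M I), binomial kernel
evidence = np.sum(likelihood * prior) * dtheta  # p(D|MI), normalization constant
posterior = likelihood * prior / evidence       # p(theta|D M I)

print(theta[np.argmax(posterior)])  # 0.7, i.e. k/n: the posterior peaks there
```

With a uniform prior the posterior peak coincides with the maximum-likelihood estimate k/n; an informative prior would shift it, but less and less as n grows, as argued above.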

2.3.1 Likelihood for a deterministic model

We now want to compute the likelihood for a deterministic model in the general case in which we have errors on both the dependent and the independent stochastic measurements. We define the data D as a collection of single data points Di = (xi, yi).

We also have a theoretical model M with its parameter vector θ, which allows us, given a certain value x̄, to make a prediction on ȳ:

ȳ = f(x̄; θ) . (2.24)

Furthermore, our background information I contains our knowledge of, or our belief about, the error distributions, say g(x) and h(y). We will assume that every measurement Di is independent of the others.

Since our model prediction depends on the true value of the independent variable, which is unknown since we are not neglecting errors on it, it is useful to introduce a fictitious variable ηi, which denotes the true value of x associated to the i−th measurement. Following a formal approach we can write

p(D|θM I) = Πi p(Di|θM I) = Πi p(xi, yi|θM I) = Πi ∫ p(xi, yi|ηi θM I)p(ηi|θM I)dηi . (2.25)

Now, applying the product rule on a single data point, we get

p(xi, yi|ηi θM I) = p(yi|ηi xi θM I)p(xi|ηi θM I) . (2.26)

In principle, once the error distribution is known, xi is completely independent of θ and M. The probability of measuring yi given that we know ηi is the error function evaluated on the residual yi − f(ηi; θ). In general this reads

p(yi|ηi θM I) = h(yi − f(ηi; θ)) . (2.27)

It can be proved, thanks to the maximum entropy principle, that the best guess for an error distribution with known mean and variance is the Gaussian distribution:

p(yi|ηi θM I) = [1/(√(2π)σy,i)] exp(−(yi − f(ηi; θ))² / 2σ²y,i) . (2.28)

The probability of measuring xi is

p(xi|ηi I) = g(xi − ηi) (2.29)

and, assuming errors to be Gaussian distributed,

p(xi|ηi I) = [1/(√(2π)σx,i)] exp(−(xi − ηi)² / 2σ²x,i) . (2.30)

Since no information on η is provided, the prior is assumed uniform: p(ηi|θM I) = N.

In the general case, the likelihood (2.25) can be written as

p(D|θM I) = Πi N ∫ h(yi − f(ηi; θ)) g(xi − ηi) dηi , (2.31)

while in the particular case of Gaussian distributions for both the x and y errors it becomes

p(D|θM I) = Πi [N/(2πσx,i σy,i)] ∫ exp(−½[(yi − f(ηi; θ))²/σ²y,i + (xi − ηi)²/σ²x,i]) dηi . (2.32)

The most probable value for θ can be computed by maximizing the likelihood function. It is interesting to note that, if we assume Gaussian errors and an exact measurement for xi¹, the commonly used χ² minimization corresponds to a maximization of the likelihood.
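A numerical sketch of the errors-in-both-variables likelihood (2.32) for the line f(x) = a x + b, under invented simulated data: for each point the latent true abscissa ηi is marginalized on a grid, exactly as in the integral above.

```python
import numpy as np

# Sketch of Eq. (2.32) for a straight line f(x) = a*x + b with Gaussian errors
# on both coordinates. All data below are simulated for illustration.
rng = np.random.default_rng(1)
a_true, b_true = 2.0, 1.0
sx, sy = 0.3, 0.5
eta_true = np.linspace(0.0, 5.0, 15)
x = rng.normal(eta_true, sx)                      # noisy abscissae
y = rng.normal(a_true * eta_true + b_true, sy)    # noisy ordinates

def log_likelihood(a, b):
    logL = 0.0
    for xi, yi in zip(x, y):
        # grid over the latent eta_i, wide enough to cover the x error
        eta, deta = np.linspace(xi - 5 * sx, xi + 5 * sx, 400, retstep=True)
        integrand = np.exp(-0.5 * ((yi - (a * eta + b)) ** 2 / sy**2
                                   + (xi - eta) ** 2 / sx**2))
        logL += np.log(np.sum(integrand) * deta)  # marginalize over eta_i
    return logL

# The true parameters should be favored over a clearly wrong model:
print(log_likelihood(a_true, b_true) > log_likelihood(0.0, 0.0))  # True
```

Taking the limit σx,i → 0 collapses the integral onto ηi = xi and the log-likelihood reduces, up to constants, to −χ²/2, recovering the remark in the text.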

2.3.2 Model selection

We now want to consider the case in which more than one model is available for the description of the same phenomenon. Let us assume that we have two different models, say M1 and M2, with associated sets of parameters θ1 and θ2. The question we want to answer is the following:

Given the data D, which model is to be preferred between M1 and M2?

This problem is called model selection. Taking the odds ratio of the two models as the ratio between posterior distributions,

O12 = p(M1|DI)/p(M2|DI) = [p(D|M1I)p(M1|I)/p(D|I)] · [p(D|I)/(p(D|M2I)p(M2|I))] = [p(D|M1I)/p(D|M2I)] · [p(M1|I)/p(M2|I)] , (2.33)

we can define the Bayes factor B12 as

B12 = p(D|M1I) / p(D|M2I) . (2.34)

The second ratio on the right-hand side of (2.33) encodes the prior probability we assign to each model. Since in general there is no reason to prefer one model over the others, it is common practice to set p(M1|I) = p(M2|I)². Under this assumption, the odds ratio is equal to the Bayes factor. This factor is the ratio of the models' likelihoods regardless of the parameter values, which have to be marginalized over:

p(D|Mi I) = ∫Θ p(D|θi Mi I)p(θi|Mi I)dθi . (2.35)

¹ Which means taking the limit σx,i → 0, i.e. g(xi − ηi) = δ(xi − ηi).

² This is not the only possible assumption, obviously. As we pointed out several times, every probability assignment is conditioned on the information I, hence even here there is room for different prior choices - for example, a probability assignment dictated by Occam's razor, which prefers models with fewer free parameters. However, in the following we will assume that every model has the same prior probability.


This quantity is called the evidence. If the Bayes factor is much greater than unity, the first model is to be preferred, while if it is much smaller than 1 the second model is favored. In the unlucky case in which B12 ∼ 1, no selection can be made on the basis of the available data.
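The Bayes factor of Eqs. (2.34)-(2.35) can be sketched with a hypothetical one-parameter comparison (simulated data, invented numbers): a line y = a x versus a constant y = c, each evidence computed by grid marginalization over a uniform prior.

```python
import numpy as np

# Hypothetical model-selection example (simulated data): line y = a*x versus
# constant y = c, both with one parameter uniform in [-5, 5]. Each evidence is
# the grid marginalization of Eq. (2.35); the Bayes factor (2.34) is the ratio.
rng = np.random.default_rng(2)
sigma = 0.2
x = np.linspace(0.0, 1.0, 30)
y = 2.0 * x + rng.normal(0.0, sigma, x.size)      # data drawn from the line model

grid, dg = np.linspace(-5.0, 5.0, 4001, retstep=True)

def log_evidence(predict):
    """log of Eq. (2.35) for a one-parameter model with Gaussian likelihood."""
    logL = np.array([-0.5 * np.sum((y - predict(p)) ** 2) / sigma**2
                     for p in grid])
    logL -= x.size * 0.5 * np.log(2.0 * np.pi * sigma**2)
    m = logL.max()                                 # log-sum-exp for stability
    return m + np.log(np.sum(np.exp(logL - m)) * dg / 10.0)  # prior = 1/10

log_B12 = log_evidence(lambda a: a * x) - log_evidence(lambda c: np.full_like(x, c))
print(log_B12 > 0)  # True: the line model is preferred for these data
```

Since the data are generated by the line model, the log Bayes factor comes out large and positive, i.e. a decisive preference in the sense discussed above.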

Chapter 3

The state-of-art for H0: the Hubble tension

Before the first binary neutron star coalescence, which happened on August 17th, 2017, there were two main competing measurements of the Hubble constant: the Planck experiment uses the data collected by the homonymous satellite on the CMB anisotropy - hence the name early Universe technique - while the local Universe technique [17] relies on the observation of type 1a supernovae to reconstruct the present value of the Hubble parameter. This second method makes extensive use of the cosmic distance ladder.

The measured H0 values, however, are not in mutual accordance: the early Universe measurement [16] gives H0 = (67.4 ± 0.4) km s−1 Mpc−1, while the local Universe experiment ended up with H0 = (73.24 ± 1.74) km s−1 Mpc−1.

The discrepancy between the two measurements reaches today the 3.5σ level, and a discussion is open among scientists on whether one of the two measurements is affected by some kind of systematic error or whether some new physics is hidden behind the curtain. This debate goes under the name of Hubble tension. A third, independent measurement is hence required to at least strengthen one of the two hypotheses.

The cosmic distance ladder-independent luminosity distance measurement provided by gravitational wave observations is the best candidate to provide a third, impartial H0 value. The GW170817 detection and the subsequent association of GRB170817A as its electromagnetic counterpart gave H0 = 67+13−7 km s−1 Mpc−1, which is in accordance with both values.

Please note that the purpose of the present section is to give an overview of the currently available measurements, hence no data will be reported except those which are relevant for the present thesis - in general, only H0 values. A precise explanation of each technique can be found in the references.

Figure 3.1: 2018 Planck map of the temperature anisotropy of the CMB. Courtesy of ESA and the Planck Collaboration.

3.1 Planck

The Planck satellite was conceived to measure the anisotropy of the Cosmic Microwave Background radiation with a higher-than-ever precision. During its four years of activity, it orbited the Sun at the L2 Lagrangian point and mapped the whole sky, ending up with what is shown in Figure 3.1.

The temperature fluctuation map can be expanded in spherical harmonics. In the following, we will use P. Peter and J.-P. Uzan's [15] notation: the average ⟨·⟩ is taken over the sky position, the direction versor is denoted ê, the angular separation between ê and ê′ is called θ, and Θ(ê) = (T(ê) − ⟨T⟩)/⟨T⟩ is the normalized deviation from the average temperature.

Θobs(ê) = Σℓm aobsℓm Yℓm(ê) . (3.1)

Since the Universe is isotropic, the angular two-point correlation function can be obtained from an average over the sky:

Cobs(θ) = ⟨Θobs(ê)Θobs(ê′)⟩ . (3.2)


Figure 3.2: Planck’s CMB power spectrum from [16]. Courtesy of ESA and the Planck Collaboration.

This correlation function can be expanded in terms of Legendre polynomials:

Cobs(θ) = (1/4π) Σℓ (2ℓ + 1) Cobsℓ Pℓ(cos θ) = (1/4π) Σℓ (2ℓ + 1) [1/(2ℓ + 1) Σm |aobsℓm|²] Pℓ(cos θ) . (3.3)

The knowledge of the correlation function allows us to write the power spectrum as a function of the multipole moment ℓ:

DTTℓ = ℓ(ℓ + 1)Cℓ / 2π . (3.4)

The spectrum in Figure 3.2 is computed using Planck data. Three different peaks are visible in the 30 < ℓ < 1000 region. These peaks, named acoustic peaks, are mainly affected by the baryonic density ΩB h². This is due to photon-baryon plasma oscillations before last scattering and to the Sachs-Wolfe effect. We are not going into the calculation, since it is not relevant for the goals of this thesis, but one can prove that the peaks' positions and heights depend mainly on the redshift and radius of the last scattering surface.
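The Legendre expansion (3.3) is easy to evaluate numerically. The following sketch uses an invented toy spectrum, not Planck data, to show how a set of Cℓ translates into an angular correlation function:

```python
import numpy as np
from numpy.polynomial.legendre import legval

# Toy evaluation of Eq. (3.3): C(theta) = (1/4pi) sum_ell (2l+1) C_l P_l(cos theta).
# The spectrum below is invented (roughly flat in l(l+1)C_l), NOT Planck data.
ells = np.arange(2, 200)
C_ell = 1.0 / (ells * (ells + 1.0))

# pack the (2l+1)C_l/4pi factors as Legendre-series coefficients
coeffs = np.zeros(ells.max() + 1)
coeffs[ells] = (2 * ells + 1) * C_ell / (4.0 * np.pi)

theta = np.deg2rad(np.linspace(0.1, 10.0, 100))
C_theta = legval(np.cos(theta), coeffs)   # Legendre series sum_l c_l P_l(cos theta)

print(C_theta[0] > C_theta[-1])  # True: the correlation falls off away from theta = 0
```

High multipoles decorrelate quickly with angle, which is why the fine structure of the acoustic peaks shows up only at small angular separations.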


Figure 3.3: Cosmic distance ladder. Here Cepheid variables are used to calibrate the intrinsic luminosity of type 1a supernovae. Credits to Ms. Tabitha Dillinger.

Since the acoustic peaks alone are not enough to break the h − ΩB degeneracy, Planck's H0 value relies on Baryon Acoustic Oscillation measurements from galaxy redshift surveys. The overall result of this effort is H0 = (67.4 ± 0.4) km s−1 Mpc−1.

3.2 SH0ES

On the other hand, late Universe experiments use the cosmic distance ladder to associate a luminosity distance to objects with known redshift.

The cosmic distance ladder is a collection of several distance indicators, each of them useful over a certain distance range. The far end of each rung is used to calibrate the next one. Since many of these indicators rely on an intrinsic luminosity measurement, the target objects are known as standard candles. In Figure 3.3 the different methods and their applicability ranges are represented. The SH0ES experiment makes use of Hubble Space Telescope observations of Cepheid variables in 11 host galaxies of recent type 1a supernovae to increase the number of Cepheid-calibrated objects and hence reduce the uncertainties on the distance of cosmologically relevant supernova host galaxies.

The composition of several distance measurements along this ladder leads to a Hubble constant value H0 = (73.24 ± 1.74) km s−1 Mpc−1 [17].

3.3 GW170817

On August 17th, 2017, the two LIGO instruments detected a long-lasting signal in both interferometers, with an SNR of 32.4 and a total mass of 2.74+0.04−0.01 M⊙ [2]. Due to the very low masses of the objects, the event has been labeled as the very first binary neutron star inspiral ever detected.

The lack of signal in Virgo¹, due to the lower instrument sensitivity and to the different antenna pattern function, shrank the source localization to a 28 deg² region. This, along with the luminosity distance of 40+8−14 Mpc, makes this event the closest and the best localized so far.

1.7 seconds after the coalescence time, Fermi-GBM detected a short Gamma Ray Burst coming from inside the BNS coalescence credible region. The presence of both temporal and spatial coincidence led astronomers to identify NGC4993 as the host of GW170817's electromagnetic counterpart. The probability of a misidentification has been computed to be 0.004%. This also strengthened the hypothesis of neutron star mergers as short GRB progenitors.

A posterior for H0 can be derived in a Bayesian framework given a perfect knowledge of the Hubble flow velocity. For the sake of coherence with other parts of this thesis, we will use a notation different from the one used in [1], denoting with D the observed data. The only relevant parameters for this analysis are the luminosity distance DL and the inclination cos ι, the latter because of a degeneracy between these two quantities. The likelihood for the observed data is obtained by marginalizing over every other parameter, here λ̄:

p(D|DL, cos ι) = ∫ p(D|DL, cos ι, λ̄)p(λ̄)dλ̄ . (3.5)

Thanks to Bayes' theorem, the probability for the data D can be turned into a probability distribution for DL and cos ι:

p(DL, cos ι|D) = p(D|DL, cos ι)p(DL, cos ι) / p(D) . (3.6)

Introducing now the Hubble law v = H0 DL, the posterior for DL becomes a posterior distribution for H0:

p(H0, cos ι|D) ∝ (v/H0²) p(D|v/H0, cos ι)p(v/H0)p(cos ι) . (3.7)

This model, however, does not correspond to reality, since the measured velocity of NGC4993 is not entirely due to the Hubble flow. The peculiar motion of the galaxy needs to be taken into account, introducing an uncertainty on v. This uncertainty is described as Gaussian, centered on vp + H0 D, where vp is the peculiar velocity of the host galaxy. Once again, the real NGC4993 peculiar velocity is unknown, and the probability distribution for this quantity must be taken into account. This probability, too, is assumed to be Gaussian, with mean ⟨vp⟩, the average peculiar velocity over the local group.

¹ Actually, after the detection, a sub-threshold signal with an SNR of 2.0 has been found in Virgo data.

Figure 3.4: Posterior distribution for H0 after GW170817, along with the Planck and SH0ES measurements. Courtesy of the LIGO-Virgo Collaboration.

Finally, the posterior distribution for all the relevant quantities is

p(H0, DL, cos ι, vp|D, vr, ⟨vp⟩) ∝ p(D|DL, cos ι)p(vr|DL, vp, H0)p(⟨vp⟩|vp) × p(DL)p(vp)p(cos ι)p(H0) , (3.8)

which needs to be marginalized over DL, cos ι and vp in order to obtain the posterior distribution for H0:

p(H0|D, vr, ⟨vp⟩) ∝ p(H0) ∫ p(D|DL, cos ι)p(vr|DL, vp, H0)p(⟨vp⟩|vp) × p(DL)p(vp)p(cos ι) dDL d cos ι dvp . (3.9)
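A toy Monte Carlo version of the marginalization in Eq. (3.9) can be sketched in a few lines. The GW distance likelihood and the velocity measurements are modeled as Gaussians; all numbers below are invented (only loosely inspired by the GW170817 analysis), and prior effects on DL are ignored for simplicity.

```python
import numpy as np

# Toy Monte Carlo marginalization over D_L and v_p, in the spirit of Eq. (3.9).
# All numbers are invented for illustration; this is NOT the published analysis.
rng = np.random.default_rng(3)
n = 200_000

D_L = rng.normal(40.0, 7.0, n)             # Mpc, GW luminosity distance draws
v_r = rng.normal(3327.0, 72.0, n)          # km/s, measured recession velocity
v_p = rng.normal(310.0, 150.0, n)          # km/s, peculiar velocity of the host

keep = D_L > 0                             # discard unphysical negative distances
H0 = (v_r[keep] - v_p[keep]) / D_L[keep]   # Hubble-flow velocity over distance

print(np.percentile(H0, [16, 50, 84]))     # median and 68% interval, km/s/Mpc
```

The asymmetric posterior obtained this way mirrors the published one qualitatively: the ~17% distance uncertainty dominates the width of the H0 interval.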

The computed posterior distribution is plotted in Figure 3.4, with a measured value H0 = 70+12−8 km s−1 Mpc−1. Unfortunately, the obtained posterior is too broad to discriminate between the Planck and SH0ES values.

3.4 The galaxy-catalog method

Since the detection of an electromagnetic counterpart and the subsequent identification of the host galaxy of a gravitational wave is something very rare, hoping for a Hubble constant measurement based on EM-associated events alone is way too optimistic. A way to extract information from every detection at our disposal was outlined by Schutz [18], whose idea relies on a statistical basis.

The reconstruction via parameter estimation of both the sky coordinates and the luminosity distance allows us to define a credible volume in which the host is supposed to live. Assuming that we are able to see every object in that particular portion of the sky, we can list all the possible hosts using a galaxy catalog. For each galaxy in the catalog one can compute the probability distribution for H0 assuming that the considered galaxy is the host, weighting this hypothesis with the probability of the galaxy actually being the host:

p(H0|{Gi}, D) = Σi p(H0|Gi, D)p(Gi|D) . (3.10)

Please note that this is just a heuristic derivation, far from being a complete proof of the outlined method. A precise and formal proof can be found in the following chapter or in [19].

What one should end up with is a probability distribution with several peaks of different heights, one for each galaxy in the catalog. Among these peaks, only one corresponds to the correct H0 value, while the others are just random noise. The multiple-peak structure, however, is not visible when dealing with real data, because of the low precision of the available redshift measurements as well as the high number of objects in the given volume.

Taking into account several probability distributions from different detections will increase the height of the correct peak, while the noise will be flattened out. Actually, the method presented in the previous section is just the particular realization of the general case in which the catalog contains only one galaxy, the real host.
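The peak-reinforcement mechanism of Eq. (3.10) can be sketched with an entirely invented toy simulation: every catalog galaxy contributes a peak at H0 = c zi/Dobs, and multiplying several event posteriors reinforces the peak at the true H0 while the spurious ones average out.

```python
import numpy as np

# Toy sketch of Schutz's statistical method, Eq. (3.10). All catalog values
# are invented; equal host weights p(G_i|D) are assumed for simplicity.
rng = np.random.default_rng(4)
c_light, H0_true, sD = 3.0e5, 70.0, 4.0    # km/s, km/s/Mpc, Mpc (toy numbers)
H0_grid, dH = np.linspace(40.0, 120.0, 801, retstep=True)

posterior = np.ones_like(H0_grid)
for _ in range(10):                        # 10 simulated GW events
    z_host = rng.uniform(0.01, 0.05)
    D_obs = rng.normal(c_light * z_host / H0_true, sD)      # noisy GW distance
    z_gals = np.append(rng.uniform(0.01, 0.05, 9), z_host)  # 9 interlopers + host
    event = np.zeros_like(H0_grid)
    for z in z_gals:                       # sum over possible hosts, Eq. (3.10)
        event += np.exp(-0.5 * (c_light * z / H0_grid - D_obs) ** 2 / sD**2)
    posterior *= event / (np.sum(event) * dH)

posterior /= np.sum(posterior) * dH
print(H0_grid[np.argmax(posterior)])       # the surviving peak should sit near H0_true
```

With realistic redshift errors and thousands of galaxies per volume the individual peaks blend together, which is exactly why the multi-peak structure is invisible in real data.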

The whole discussion presented here relies on the hypothesis that our catalog is complete, because we need to assume that we know the parameters of the host galaxy, even if we are not able to tell which one it is. Our catalogs, unfortunately, are far from being complete if one takes into consideration the horizon luminosity distance for LIGO. According to [19], where the group computed the probability for the host galaxy of being included in the catalog (which is closely linked to completeness), the probability quickly falls below 0.5 for redshifts ≳ 0.1 for all-sky catalogs like GLADE. If one considers deep-sky surveys, the catalog can be considered almost complete up to higher redshifts, keeping in mind that the surveyed area is usually limited for these kinds of missions.

Figure 3.5: Completeness of three different galaxy catalogs (GLADE, GWENS, DES-Y1) from [19]. Vertical lines correspond to the median redshift for each event - solid and thick up to the intercept with the galaxy catalog they are used with, thin and dashed above. Only the events with p(G) greater than a certain threshold are used. The upper grey dashed horizontal line shows the lowest p(G) value among the used events; the lower line displays the highest p(G) among the rejected ones. Courtesy of the LVC Cosmology Group.

This feature must be properly encoded in the statistical description. The way in which the LVC deals with this is to compute the probability for the host galaxy of being or not being in the catalog, say p(G|H0) or p(Ḡ|H0), following the same formalism as [19].

Rewriting Eq. (3.10) in this fashion, it becomes

p(H0|D) = p(D|G, H0)p(G|H0) + p(D|Ḡ, H0)p(Ḡ|H0) , (3.11)

where p(Ḡ|H0) = 1 − p(G|H0) and the first term is just Eq. (3.10). The second term, however, requires a marginalization over the unknown parameters of the dark host galaxy.

The probability for the host of being in the catalog is computed as in [11] using a simulation. Since most of the useful information comes from the galaxies we actually see, not every detected event has been taken into account in the calculation: the events with very low catalog support have been discarded². The completeness plot from the paper is reported in Figure 3.5, while the posterior probability for H0 is displayed in Figure 3.6. As expected, most of the information comes from GW170817, while the other events, which are also displayed on their own, give only a small but visible contribution in sharpening the distribution. Both the Planck and SH0ES H0 values are still inside the 68% confidence region, hence to date nothing can be said, using gravitational wave inference, about the possibility that one of the two measurements is wrong or biased.

² However, since one of the founding principles of the Bayesian analysis, the desiderata, states that all the available information must be used, a more general framework must also consider these events.

However, given a certain detection, computing the probability for the host galaxy of being included in our catalog is not an easy task. In the next chapter we present a different approach which does not explicitly perform the marginalization over the probability of seeing the host, allowing us to relax the requirement of high catalog support.


Chapter 4

An alternative approach to the galaxy-catalog method

4.1 Model

In order to define the notation, we are going to call D the observed data from a set of GW detectors. From these data, via parameter estimation, it is possible to reconstruct a posterior probability distribution for the parameters of the observed GW signal.

A single galaxy, denoted Gi, carries several pieces of information: sky position, redshift, apparent magnitude. We will use gi to indicate the measured quantities αi, δi, zi, mi.

Dealing with (in)completeness, we take into account the possibility that we are missing some galaxies by including in Gi also an indicator variable Si, which marks whether the galaxy has been seen (Si = 1) or not (Si = 0). For the sake of clarity, we will call the former case a bright galaxy and the latter a dark galaxy.

In principle, every galaxy can be the host of a gravitational wave. However, GWs are rare events, and it is reasonable to assume that their emission probability is proportional to the total mass. Under this assumption, we can ignore sufficiently small galaxies. We can assign to each galaxy a probability εi of being a potential emitting source.

However, building an accurate model to account for this fact is no easy task, further complicated by the fact that the quantity we measure, the magnitude, does not depend on the mass alone, but also on other parameters like the age and metallicity of the galaxy. We simplified the problem by setting a sharp cutoff magnitude: every galaxy with a lower absolute magnitude is a potential emitter, hence ε = 1, while all the objects which are fainter than Mcutoff have ε = 0. This is another way of stating that dwarf galaxies are excluded as potential emitters. Obviously, the results will strongly depend on the chosen cutoff magnitude.
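The sharp-cutoff prescription can be sketched in a few lines. The cutoff value below is hypothetical; the conversion from apparent to absolute magnitude uses the standard distance modulus M = m − 5 log10(DL/Mpc) − 25.

```python
import numpy as np

# Sketch of the sharp-cutoff assignment of epsilon. M_cutoff is a hypothetical
# value chosen for illustration, not the one used in the thesis.
M_cutoff = -17.0

def epsilon(m, D_L_mpc):
    """epsilon_i = 1 for galaxies brighter than the cutoff, 0 otherwise."""
    M = m - 5.0 * np.log10(D_L_mpc) - 25.0   # distance modulus, D_L in Mpc
    return np.where(M < M_cutoff, 1, 0)      # brighter means smaller M

m = np.array([12.0, 19.5])     # apparent magnitudes of two toy galaxies
D_L = np.array([40.0, 40.0])   # both at 40 Mpc
print(epsilon(m, D_L))         # [1 0]: the faint dwarf at the same distance is excluded
```

At 40 Mpc the two galaxies have M ≈ −21.0 and M ≈ −13.5 respectively, so only the first passes the cutoff, illustrating the sensitivity to the chosen Mcutoff.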

The logical proposition Gi reads: The galaxy Gi exists, has coordinates αi, δi, zi, magnitude mi, observation status Si and emission possibility probability εi.

Gi = gi Si εi . (4.1)

It is useful, for our purposes, to define a new logical statement regarding the fact that a galaxy has or has not emitted a particular gravitational wave. The statement Ei is

Ei : The galaxy Gi emitted a gravitational wave (4.2)

and its logical negation Ēi is

Ēi : The galaxy Gi did not emit a gravitational wave . (4.3)

Finally, Ω is the set of cosmological parameters we are considering.

4.1.1 Catalog

The use of the word catalog, in this context, might be a bit misleading. The idea behind this kind of approach is that we assume an almost perfect knowledge of the local Universe. This is necessary because we want to somehow enumerate all the objects that lie within the boundaries of our reconstructed volume. In this way, a catalog is not a simple list of objects that are visible from Earth: instead, it should be a collection of all the existing objects, even if we know nothing about their properties. We might refer to this as a wide-sense catalog, while a collection of objects that can be seen from Earth would be a strict-sense catalog.

A wide-sense catalog C, assuming that we know the total number Nt of objects within a certain volume, is the logical product of the Gi:

C = Πi Gi . (4.4)

Since for each galaxy Ei + Ēi = 1, it is true that

Πj (Ej + Ēj) = 1 , (4.5)

so we can write the catalog C as

C = Πi Gi Πj (Ej + Ēj) . (4.6)

Tuttavia, il prezzo predatorio rappresenta un dilemma che ha tradizionalmente sollecitato l’attenzione della comunità antitrust: da un lato la storia e la teoria economica