Study of gamma + jets data with the CMS detector in pp collisions at sqrt(s) = 8 TeV


Chapter 3 Reconstruction and Identification of Physics Objects

The distinctive feature of the channel under investigation is the presence, in the final state, of a photon with high transverse momentum in association with jets, as shown in the event display in Figure 3.1. This chapter describes the algorithms used for photon and jet reconstruction, and their calibration using collected data samples and Monte Carlo simulation.

(a) 3D view. (b) ρ − ϕ view. Labels in both panels: photon with pT = 128 GeV/c, η = 0.5, ϕ = 2.0 rad; anti-kT 0.5 PFJet with pT = 139 GeV/c, η = −0.6, ϕ = −1.1 rad.

Figure 3.1: Event display of event No. 59867556 from run No. 139365, recorded during the 7 TeV data collection era. A photon candidate with a transverse momentum of 128 GeV/c is back-to-back with a jet candidate with a transverse momentum of 139 GeV/c.

3.1 Photon Reconstruction

In this section the reconstruction of high-pT prompt photons for single-photon studies is described. Because of the very high background originating from hadronic jets, the reconstruction of prompt photons starts from triggered photons, and the criteria differ from those used to reconstruct the low-pT, non-triggering neutral energy deposits that enter energy-flow jet reconstruction algorithms. For the offline reconstruction of prompt photons, only photons that exceed a selected HLT transverse energy threshold and that have passed the selected isolation criteria of the Level 1 and HLT triggers are considered.


3.1.1 Clusterization Algorithms

Photon reconstruction begins with the clustering of the energy deposits in the ECAL: every photon or electron generates an electromagnetic shower by interacting with the crystals of the calorimeter. A collection of adjacent ECAL crystals, used to reconstruct the energy and the direction of a particle, is commonly referred to as a cluster. Approximately 94% of the incident energy of a single electron or photon is contained within a cluster of 3 × 3 crystals, and 97% within a cluster of 5 × 5 crystals [119]. Summing the energy measured in such fixed arrays gives the best performance for unconverted photons and electrons in the test beam [126,127]. However, the energy measurement in the calorimeter is hampered by the Tracker material that particles traverse before reaching the ECAL and by the CMS magnetic field, which is aligned with the beam. Electrons that cross the Tracker silicon layers radiate bremsstrahlung photons, which propagate along the tangent to the electron trajectory while the electron bends in the r − ϕ plane. The energy deposited in the ECAL is therefore widely spread in the ϕ direction. The assembly of clusters and, in particular, of superclusters (clusters of clusters) has been optimized to account for all these effects. The cluster assembly starts with the identification of the seed: among all the crystals where the deposited energy exceeds a threshold value, the one with the highest transverse energy is selected. Starting from each seed, the adjacent crystals in which some transverse energy has been deposited in the event are grouped using two different algorithms [128,129]: the Hybrid algorithm, used in the barrel, and the Multi 5 × 5 algorithm, employed in the endcaps. Two different algorithms are needed because of the different geometrical arrangement of the crystals and the different mapping of the magnetic field in the two regions.

The Hybrid Algorithm

The clustering in the barrel region is obtained by using the Hybrid algorithm [128], which takes advantage of the η − ϕ projective geometry of the crystals to perform the collection of both the energy in individual showers and the set of showers compatible with a bremsstrahlung emission. This is done by collecting energy within a rectangular window extended in the ϕ direction.

The Hybrid algorithm operates as follows. First, a list of seed crystals with ET > 1 GeV is constructed. Starting from one of these seed crystals, a cluster is defined as an ensemble of ϕ-contiguous “dominos”, each of which has collected an energy larger than 0.1 GeV. Each domino consists of 5 crystals with the same ϕ value (which corresponds to a domino width of 0.087 in η). “Valleys”, where less than 0.1 GeV is collected in a domino, separate different clusters.

The dominos are then clustered in ϕ in order to form superclusters. Each distinct cluster of dominos grouped in the supercluster is required to have a seed domino with energy greater than 0.35 GeV. The ϕ roads are allowed to extend up to ±17 crystals around the seed, which corresponds to approximately ±0.3 rad (17 crystals of width 0.0174 rad each). The hybrid supercluster is made up of a series of showers at constant η but spread in the ϕ direction. In Figure 3.2 the domino and the supercluster construction steps are shown. Each energy deposit can be well contained in a 5 × 5 crystal window.


(a) Domino construction. (b) Supercluster construction.

Figure 3.2: Individual steps of the Hybrid algorithm [130].
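The domino grouping described above can be sketched in a few lines of Python. This is only an illustrative simplification, not the CMS implementation: it assumes the domino energies along ϕ are already available as a flat list, ignores the ϕ wrap-around and the maximum road length, and the function name `split_dominos` is hypothetical. Only the two thresholds (0.1 GeV and 0.35 GeV) are taken from the text.

```python
from typing import List

DOMINO_MIN_E = 0.1        # GeV, valley threshold quoted in the text
SEED_DOMINO_MIN_E = 0.35  # GeV, minimum seed-domino energy quoted in the text


def split_dominos(domino_energies: List[float]) -> List[List[int]]:
    """Group phi-contiguous dominos into clusters separated by valleys.

    Hypothetical helper: returns the indices of the dominos in each
    cluster, keeping only clusters whose most energetic domino exceeds
    the seed-domino threshold.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for idx, energy in enumerate(domino_energies):
        if energy > DOMINO_MIN_E:
            current.append(idx)       # domino belongs to the open cluster
        elif current:
            clusters.append(current)  # a valley closes the open cluster
            current = []
    if current:
        clusters.append(current)
    return [
        c for c in clusters
        if max(domino_energies[i] for i in c) > SEED_DOMINO_MIN_E
    ]
```

For example, a ϕ road with two energetic deposits and one weak deposit between them yields two clusters, since the weak one has no domino above 0.35 GeV.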

The Multi 5 × 5 Algorithm

Since the crystals in the endcaps are not arranged in an η − ϕ projective geometry as in the barrel, the Hybrid algorithm cannot be applied there. The same idea of collecting energy deposits within a window in η and ϕ must be therefore implemented differently. This is achieved by using the Multi 5 × 5 algorithm [129], which operates as follows.

The Multi 5 × 5 algorithm starts from a seed crystal with ET > 180 MeV, and checks whether it is a local maximum in energy by comparing its energy to that of its four side-adjacent neighbours, in a Swiss Cross pattern. If the crystal is not a local maximum, the algorithm continues by searching for other seeds. Around each seed it constructs a 5 × 5 matrix of crystals, including only crystals that do not already belong to a cluster.

To allow closely overlapping showers (due to bremsstrahlung) to be collected, the outer 16 crystals of the 5 × 5 matrix may seed a new matrix. In case of overlaps, the overlapping crystals are associated with the cluster with the largest seed ET. An example of the result of this process is shown in Figure 3.3.

Figure 3.3: An illustration of two overlapping Multi 5 × 5 clusters. Crystals indicated in yellow are eligible to seed further Multi 5 × 5 clusters, provided they are local maxima in energy [128].
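The Swiss Cross local-maximum test used by the Multi 5 × 5 algorithm can be illustrated as follows. This is a minimal sketch under stated assumptions: the endcap crystals are represented as a rectangular grid of energies (a list of rows), crystals at the grid edge simply have fewer neighbours, and the function name `is_local_maximum` is hypothetical.

```python
from typing import List, Sequence


def is_local_maximum(energy: Sequence[Sequence[float]], ix: int, iy: int) -> bool:
    """Return True if crystal (ix, iy) is more energetic than each of its
    four side-adjacent neighbours (Swiss Cross pattern)."""
    n_rows, n_cols = len(energy), len(energy[0])
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        jx, jy = ix + dx, iy + dy
        inside = 0 <= jx < n_rows and 0 <= jy < n_cols
        if inside and energy[jx][jy] >= energy[ix][iy]:
            return False
    return True


def local_maxima(energy: Sequence[Sequence[float]]) -> List[tuple]:
    """List the grid positions of all Swiss Cross local maxima."""
    return [
        (ix, iy)
        for ix in range(len(energy))
        for iy in range(len(energy[0]))
        if is_local_maximum(energy, ix, iy)
    ]
```

Note that diagonal neighbours are not compared, so two crystals touching only at a corner can both be local maxima, which is what allows closely overlapping showers to seed separate matrices.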

To produce the final supercluster by recovering bremsstrahlung, a rectangular window in η and ϕ, extended in the ϕ direction, is created around energy deposits with transverse energy above 1.0 GeV. Other energy deposits falling within the window are added to form the supercluster. This procedure is performed in descending order of ET of the energy deposits, and an energy deposit may belong to only one cluster.

In the endcaps, the region 1.6 < |η| < 2.6 is covered by the preshower detector. Electrons and photons reconstructed in this region will deposit some fraction of their energy in the preshower, so this energy must be measured and added to each cluster. This is done by summing the energy measured by the preshower strips intersected at the position extrapolated between each energy deposit in the calorimeter and the primary vertex. This energy is added to each endcap supercluster before any energy scale corrections are applied.

3.1.2 Photon Energy Reconstruction and Calibration

Photon energy is computed starting from the raw energy deposited in the ECAL crystals. A cluster shape parameter, R9, is also defined in order to distinguish photons that convert upstream of the ECAL from those entering the ECAL unconverted. It is defined as the ratio of E3×3, the energy contained within the 3 × 3 array of crystals centered on the crystal with the maximum energy deposit, to the total energy of the supercluster, ESC:

\[ R_9 = \frac{E_{3\times 3}}{E_{SC}} . \qquad (3.1) \]

Showers from photons that interact with the Tracker material spread out in the magnetic field, reducing the value of R9. A value of 0.94 has been chosen to distinguish between photons that convert in the Tracker material (R9 < 0.94) and unconverted photons (R9 ≥ 0.94). According to MC studies, about 70% of the photons with R9 ≥ 0.94 in the barrel are unconverted [131].
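As a small illustration of Equation 3.1 and of the 0.94 working point, the classification can be written as below. The function names are hypothetical, and the label "unconverted-like" is deliberate: as stated above, the cut enriches the sample in unconverted photons but does not guarantee it.

```python
R9_THRESHOLD = 0.94  # working point quoted in the text


def r9(e_3x3: float, e_sc: float) -> float:
    """Ratio of the 3x3 energy around the most energetic crystal to the
    supercluster energy (Equation 3.1)."""
    if e_sc <= 0.0:
        raise ValueError("supercluster energy must be positive")
    return e_3x3 / e_sc


def is_unconverted_like(e_3x3: float, e_sc: float) -> bool:
    """True if the candidate falls in the high-R9 category."""
    return r9(e_3x3, e_sc) >= R9_THRESHOLD
```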

Optimal performance in measurements with electrons or photons requires high resolution in the electromagnetic calorimeter. In particular, the emission of bremsstrahlung radiation and the conversion of photons into electron pairs degrade the energy resolution of the ECAL and lead to an underestimation of the measured energy of electrons or photons. The role of the energy correction is to compensate for these losses. The energy Ee,γ released in the electromagnetic calorimeter is calculated using the equation [112]:

\[ E_{e,\gamma} = F_{e,\gamma} \cdot \left[ G \cdot \sum_i S_i(t) \cdot C_i \cdot A_i + E_{ES} \right] , \qquad (3.2) \]

where the sum is over the crystals i belonging to the supercluster. The energy deposited in each crystal is given by the pulse amplitude (Ai), in ADC counts, multiplied by the ADC-to-GeV conversion factor (G), measured separately for the ECAL barrel and endcaps, by the intercalibration coefficient (Ci) of the corresponding channel, and by Si(t), a correction term due to radiation-induced channel response changes as a function of time t. The term Fe,γ represents the energy correction applied to the superclusters to take into account the η- and ϕ-dependent geometry and material effects, as well as the fact that electrons and photons shower slightly differently. The term EES is the preshower energy. This factorization of the various contributions to the electromagnetic energy determination enables stability and intercalibration to be studied separately from material and geometry effects.
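The factorized structure of Equation 3.2 maps directly onto a short function. The sketch below is illustrative only: the function and argument names are hypothetical, and any numerical values passed to it (including the ADC-to-GeV factor) are placeholders rather than CMS calibration constants.

```python
from typing import Sequence


def supercluster_energy(
    amplitudes: Sequence[float],     # A_i, pulse amplitudes in ADC counts
    intercalib: Sequence[float],     # C_i, intercalibration coefficients
    response_corr: Sequence[float],  # S_i(t), response-change corrections
    adc_to_gev: float,               # G, ADC-to-GeV conversion factor
    f_corr: float,                   # F_{e,gamma}, supercluster correction
    e_preshower: float = 0.0,        # E_ES, preshower energy (endcaps only)
) -> float:
    """Evaluate Equation 3.2 for one supercluster."""
    if not (len(amplitudes) == len(intercalib) == len(response_corr)):
        raise ValueError("per-crystal inputs must have the same length")
    raw_sum = sum(
        s * c * a for s, c, a in zip(response_corr, intercalib, amplitudes)
    )
    return f_corr * (adc_to_gev * raw_sum + e_preshower)
```

Keeping the per-crystal terms (S, C, A) separate from the per-supercluster terms (F, E_ES) mirrors the factorization described in the text.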


Single-Channel Intercalibration

The single ECAL channel intercalibration, i.e. the relative calibration Ci between one channel and another, is obtained in situ mainly by equalizing the response to low-mass di-photon resonances (π0, η) across the detector [112]. Supplementary information and cross checks are provided by studying the approximate ϕ-invariance of the energy flow in minimum bias data [132], and the ratio of the energy measured in the ECAL to the track momentum measured in the Silicon Tracker (E/p) for isolated electrons from W- and Z-boson decays [133].

In Figure 3.4a the effects of the individual channel calibration are shown for Z → e+e− events from data collected at a center-of-mass energy of 8 TeV with an integrated luminosity of 19.6 fb−1 [134].

Corrections for Changes in Response

The ECAL response varies under irradiation due to the formation of colour centers that reduce the transparency of the lead tungstate. The crystal transparency recovers in the periods without irradiation through spontaneous annealing [135]. A Light Monitoring (LM) system, based on the injection into each crystal of laser light at 440 nm, close to the emission peak of scintillation light from PbWO4, is used to track and correct for response changes during LHC operation [136,137]. Additional laser (LED in the endcaps) light sources provide ancillary information on the system stability. The laser light is injected through optical fibers into each barrel and endcap crystal, through the front and rear face respectively. The spectral composition and the path for the collection of laser light at the photodetector are different from those for scintillation light. A conversion factor is therefore required to relate the changes in the ECAL response to laser light to the changes in the scintillation signal. The relationship is described by a power law:

\[ \frac{S(t)}{S_0} = \left( \frac{R(t)}{R_0} \right)^{\alpha} , \qquad (3.3) \]

where S(t) is the channel response to scintillation light at a particular time t, S0 is the initial response, and R(t) and R0 are the corresponding responses to laser light. The exponent α is independent of the loss for small transparency losses and has been measured in a beam test for a limited set of crystals under irradiation.
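The power law of Equation 3.3 can be evaluated as in the sketch below. The function names are hypothetical, and no value of α is assumed here since the text does not quote one; the caller must supply it. The second function, which inverts the ratio to obtain a multiplicative correction, is an assumption about how such a ratio would typically be used and is not taken from the text.

```python
def scintillation_response_ratio(r_t: float, r_0: float, alpha: float) -> float:
    """S(t)/S0 predicted from the laser response ratio R(t)/R0 (Equation 3.3)."""
    if r_t <= 0.0 or r_0 <= 0.0:
        raise ValueError("laser responses must be positive")
    return (r_t / r_0) ** alpha


def response_correction(r_t: float, r_0: float, alpha: float) -> float:
    """Multiplicative factor that would restore the initial response.

    Assumption: the correction is taken as the inverse of S(t)/S0.
    """
    return 1.0 / scintillation_response_ratio(r_t, r_0, alpha)
```

With α > 1, a given relative loss in the laser signal translates into a larger relative loss in the scintillation signal, and therefore into a larger correction.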

During the 8 TeV data collection campaign, the response change observed in the ECAL channels is up to 6% in the barrel and it reaches up to 30% at η ∼ 2.5, the limit of the Tracker acceptance [134].

The response corrections were tuned and validated using the energy of electrons from W-boson decays, the reconstructed mass from η decays to two photons, and the energy resolution measured with Z → e+e− events.

In Figure 3.4b the effects of the corrections for changes in response are shown for Z → e+e− events from data collected at a center-of-mass energy of 8 TeV with an integrated luminosity of 19.6 fb−1 [134].

Superclusters Energy Corrections

Superclusters are used to reconstruct the energies of photons and electrons, and to form seeds for electron track reconstruction. A correction function, Fe,γ, derived from MC simulation, is applied to the supercluster energy to account for energy containment effects, including both shower containment in the calorimeter and energy containment in the supercluster for particles that shower in the material in front of the ECAL. The energy corrections have been tuned for electrons and photons separately to account for the differences in the way they interact with the material in front of the ECAL.

(a) Calibration in ECAL barrel. (b) Calibration in ECAL endcaps. Axes: Mee (GeV/c2) versus Events / 0.5 GeV. Curves: no corrections, intercalibrations (IC), IC + LM corrections.

Figure 3.4: Reconstructed invariant mass from Z → e+e− decays, for single-channel corrections set to unity (no correction), for final intercalibration (IC), and for both final intercalibration and light monitoring (LM) corrections (IC + LM), in the barrel (a) and endcaps (b). Analyzed data correspond to an integrated luminosity of 19.6 fb−1 collected at a center-of-mass energy of 8 TeV [134].

The corrections for photons have been optimized using a multivariate regression technique based on a Boosted Decision Tree (BDT) implementation [138]. The regression has been trained on prompt photons (from γ + jets MC samples) using the ratio of generator-level photon energy to the supercluster energy, including the energy in the preshower EES for the endcaps, as the target variable. The input variables are the η and ϕ coordinates of the supercluster in CMS, a collection of shower shape variables, and a set of local cluster coordinates to measure the distance of the clusters from ECAL boundaries. The local coordinates provide information on the amount of energy which is likely to be lost in crystal and module gaps and cracks, and drive the level of local containment corrections predicted by the regression. The other variables provide information on the likelihood and location of a photon conversion and the degree of showering in the material. They are correlated with the global η and ϕ position of the supercluster. These variables drive the degree of global containment correction predicted by the regression. The global and local containment corrections address different effects. However, these corrections are allowed to be correlated in the regression to account for the fact that a photon converted before reaching the ECAL is not incident at a single point on the calorimeter face, and is then relatively less affected by local containment. This approach leads to better energy resolution than factorized parametric corrections of the different effects. The number of primary vertices is also included as input to the BDT, in order to correct for the dependence of the shower energy on spurious energy deposits due to pile-up events.

The primary validation tool for the regression is the comparison of data and MC simulation performance for electrons in Z- and W-boson decays. A BDT with identical training settings and input variables to those described above has been trained on a MC sample of electrons from Z-boson decays. The related corrections are different from the ones used for the electron reconstruction in CMS, where Tracker information is included in the energy measurement. However, they enable a direct comparison of the ECAL calibration and resolution in data and MC simulation.

Figure 3.5 shows the di-electron invariant mass distribution for Z-boson events, reconstructed with a fixed-matrix clustering of 5 × 5 crystals, with the supercluster reconstruction that recovers radiated energy, and with the supercluster reconstruction after applying the energy corrections. For the endcaps, the effect of adding the preshower energy is also shown. The improvement in the Z-boson mass resolution is particularly evident for the supercluster reconstruction, which efficiently recovers the radiated energy and reduces the low-energy tails of the distributions relative to the 5 × 5 fixed-matrix clustering.

(a) Calibration in ECAL barrel. (b) Calibration in ECAL endcaps. Axes: Mee (GeV/c2) versus Events / 0.5 GeV. Curves: E 5x5 crystals, E uncorrected SuperCluster, E uncorrected SuperCluster + ES (endcaps only), E corrected SuperCluster.

Figure 3.5: Reconstructed di-electron invariant mass for electrons from Z → e+e− events applying a fixed-matrix clustering of 5 × 5 crystals, the supercluster reconstruction to recover radiated energy, and the supercluster energy corrections for barrel (a) and endcaps (b). For the endcaps the effect of adding the preshower detector energy is shown. Analyzed data correspond to an integrated luminosity of 19.6 fb−1 collected at a center-of-mass energy of 8 TeV [134].

Absolute Energy Calibration

In CMS, the absolute energy calibration (G) is computed in a reference region of the ECAL where the effects of upstream material and uncertainties in the energy corrections are minimal. The reference region in the barrel is defined as the central 150 crystals in the first module of each supermodule (|η| < 0.35), requiring a minimum distance of 5 crystals from the border of each module in both η and ϕ. This region is chosen because the material budget in front of the first module is small, the geometry of these crystals is very similar, and the centrality of the crystals in the module ensures that there is no energy leakage due to the gaps between modules or supermodules. In the endcaps, the reference region is defined as the central region of each endcap (1.7 < |η| < 2.1), to which the crystals exposed to the beam test belong. The absolute energy calibration in the MC simulation is computed using 50 GeV unconverted photons. It is defined such that the energy reconstructed in a 5 × 5 crystal matrix is equal to the true energy of the photon in the reference region. Decays of Z-bosons into two electrons are used to set the overall energy scale in barrel and endcaps in data relative to the MC simulation, and to validate the energy correction function Fe for electrons, using the Z-boson mass constraint.

An analytic fit to the Z invariant mass peak, built with supercluster energies (including the energy corrections derived from Monte Carlo simulations), is performed using a convolution of a Breit-Wigner function with a Crystal Ball (CB) function [112]. In the fit, the Breit-Wigner parameters are fixed to the PDG [18] values (mZ = 91.188 GeV/c2 and ΓZ = 2.495 GeV/c2), while the CB parameters accounting for showering electrons, whose energy is not fully recovered by the clustering algorithms, are constrained from MC simulation studies.

The ADC-to-GeV conversion factor G of Equation 3.2 for data is adjusted such that the fitted Z → e+e− peak agrees with that of the Monte Carlo simulation separately for the barrel and endcap calorimeters.

Decays of Z-bosons into two muons where one muon radiates a photon, Z → µµγ, are used to cross-check the energy calibration of photons [112].

3.1.3 Photon Energy Resolution

The energy resolution for electrons is measured using Z → e+e− events. The electron energies are reconstructed from the ECAL energy deposits with the calibrations and corrections described in the previous sections. The di-electron invariant mass resolution (which is dominated by the electron energy resolution) is related to the single-electron energy resolution by an approximate scaling factor of √2, verified using MC simulations. The intrinsic detector resolution is estimated by the Gaussian width parameter of the Crystal Ball function introduced in Section 3.1.2. Similarly, the energy resolution for photons has been studied from the line shape of Z → µµγ events, in an ET range slightly lower than, but comparable to, that of Z → e+e− events.
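The √2 scaling factor quoted above can be motivated by simple error propagation. The short derivation below is only a sketch under explicit assumptions that are not stated in the text: massless electrons, a negligible uncertainty on the opening angle θ, and uncorrelated energy measurements with the same relative resolution σE/E for both electrons.

\[ m_{ee}^2 = 2 E_1 E_2 (1 - \cos\theta) \quad\Rightarrow\quad \frac{\sigma_m}{m_{ee}} = \frac{1}{2} \sqrt{ \left(\frac{\sigma_{E_1}}{E_1}\right)^2 + \left(\frac{\sigma_{E_2}}{E_2}\right)^2 } = \frac{1}{\sqrt{2}} \, \frac{\sigma_E}{E} \]

Under these assumptions the single-electron relative resolution is √2 times the relative mass resolution, σE/E ≈ √2 · σm/mee.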

The resolution σE/E is extracted from an unbinned likelihood fit to Z → e+e− events, using a Voigtian (a Breit-Wigner convolved with a Gaussian) distribution as the signal model [112]. The resolution is plotted separately for data and MC events.

For both the electrons from Z-boson decays and the photons from Z → µµγ, the energy resolution in the data is not correctly described by the MC simulation. These differences are accommodated in CMS analyses by applying additional Gaussian smearing, in bins of η and R9, to the electron and photon energies in MC simulation.

Figure 3.6 shows the energy resolution extracted using this method for both data and MC simulation. The average resolution σE/E for electrons from Z-boson decays is plotted as a function of η in the barrel and endcaps, and is shown separately for electrons with R9 ≥ 0.94 and in the inclusive sample. The resolution in the barrel depends on the amount of material in front of the ECAL and is degraded in the vicinity of the ECAL module boundaries, as indicated by vertical lines in the plots. The resolution in the endcaps shows an η dependence that is correlated with the amount of material in front of the ECAL, up to |η| ≈ 2. At larger pseudorapidity, single-channel response variations, not fully modeled in simulation, also contribute to the difference between data and MC simulation.

To accommodate the mismatch in the energy resolution between data and simulation, an additional smearing term is extracted, which is the quadratic difference between the electron resolution in data and in MC simulation shown in Figure 3.6. This term is added in quadrature as a constant Gaussian smearing to the electron and photon energy in the MC events, assuming the same degradation in resolution between data and MC events for photons and electrons. The consistency of this method was checked by comparing the mass resolution in Z → e+e− and Z → µµγ events in data and in a Monte Carlo sample with this smearing term applied.
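The quadratic difference and the Gaussian smearing described above can be sketched as follows. This is a toy illustration, not the CMS procedure: the function names are hypothetical, the resolutions are relative values (σE/E) for a single bin of η and R9, and clamping the smearing to zero when the simulated resolution is already worse than the data is an assumption made here for robustness.

```python
import math
import random


def extra_smearing(sigma_data: float, sigma_mc: float) -> float:
    """Relative smearing to add in quadrature to the MC resolution so that
    it matches the data: sqrt(sigma_data^2 - sigma_mc^2)."""
    diff = sigma_data ** 2 - sigma_mc ** 2
    # Assumption: no smearing is applied if MC is already worse than data.
    return math.sqrt(diff) if diff > 0.0 else 0.0


def smear_energy(energy: float, rel_smear: float, rng: random.Random) -> float:
    """Scale an MC energy by a Gaussian factor of mean 1 and width rel_smear."""
    return energy * rng.gauss(1.0, rel_smear)
```

In use, one would compute `extra_smearing` once per bin and then call `smear_energy` for every simulated electron or photon falling in that bin, with a seeded `random.Random` instance for reproducibility.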

(a) Electrons with R9 ≥ 0.94. (b) Electrons inclusive sample. Axes: supercluster |η| versus σE/E. Curves: prompt reconstruction, Winter2013 re-reconstruction, MC.

Figure 3.6: Relative electron (ECAL) energy resolution in data and Monte Carlo unfolded in bins of pseudorapidity η for the barrel and the endcaps, using electrons from Z → e+e− decays. The resolution is shown separately for low-bremsstrahlung electrons (R9 ≥ 0.94) (a) and for the inclusive sample (b). The resolution, especially in the endcaps, improves significantly after a dedicated calibration using the full 2012 CMS dataset with respect to the prompt calibration from early 2012 CMS data. The η cracks between ECAL modules are indicated by the vertical lines in the plot [134].

3.2 Photon Identification

As already described in Section 3.1, photons are reconstructed in CMS as isolated clusters of energy in the ECAL. The largest background to prompt photons arises from jets: they can fragment into light mesons, like the π0, that carry most of the energy of the jet and that primarily decay into two photons (the branching fraction of π0 → γγ is (98.823± 0.034)% [18]). If the meson has enough energy, the photons may be boosted enough to create a narrow deposit of energy in the ECAL, and thus resemble the deposit of a single prompt photon.
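The statement that the two photons from an energetic π0 can merge into a single narrow deposit can be made quantitative with a back-of-the-envelope estimate. The sketch below uses the standard two-body kinematic result that the minimum opening angle of a symmetric π0 → γγ decay satisfies sin(θ/2) = m/E. The π0 mass value and the comparison with a crystal angular width of about 0.0174 (the domino width of 0.087 in η divided by 5 crystals) are inputs chosen here for illustration; the comparison is only meaningful at central pseudorapidity and ignores the lateral size of the showers themselves.

```python
import math

PI0_MASS_GEV = 0.13498              # neutral pion mass in GeV/c^2
CRYSTAL_ANGULAR_WIDTH = 0.087 / 5   # about 0.0174, from the domino width in the text


def min_opening_angle(energy_gev: float, mass_gev: float = PI0_MASS_GEV) -> float:
    """Minimum opening angle (rad) between the two photons of a pi0 decay."""
    if energy_gev <= mass_gev:
        raise ValueError("energy must exceed the meson mass")
    return 2.0 * math.asin(mass_gev / energy_gev)


def photons_within_one_crystal(energy_gev: float) -> bool:
    """Rough check: is the minimum opening angle below one crystal width?"""
    return min_opening_angle(energy_gev) < CRYSTAL_ANGULAR_WIDTH
```

For a π0 of 100 GeV the minimum opening angle is about 0.0027 rad, well below one crystal width, whereas at 5 GeV it is about 0.054 rad, so the two photons are typically resolved.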

A series of photon identification (photon ID) criteria is defined to discriminate prompt photons from electrons and jets, by applying requirements on variables that fall into three main categories: conversion safe electron veto, shower shape and isolation.

3.2.1 Conversion Safe Electron Veto

To avoid misidentifying an electron as a photon, a conversion safe electron veto is applied. Isolated electrons behave very similarly to photons in the ECAL and can pass the photon ID and isolation selections. An explicit electron veto based on the presence of a track can avoid this issue. With such a veto, an efficiency loss for converted photons is experienced when a conversion happens in the first two pixel layers. The conversion safe electron veto removes the photon candidate if its supercluster is matched to an electron with no missing hits on the innermost tracker layers and it is not matched to a reconstructed conversion.
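The veto logic in the last sentence above is a simple boolean condition, sketched below. The function and argument names are hypothetical and do not correspond to any CMS software interface; the matching itself (supercluster to electron, supercluster to conversion) is assumed to have been done upstream.

```python
def passes_conversion_safe_electron_veto(
    matched_to_electron: bool,
    electron_missing_inner_hits: int,
    matched_to_conversion: bool,
) -> bool:
    """Return True if the photon candidate survives the veto.

    The candidate is removed only when all three conditions hold:
    its supercluster is matched to an electron, that electron has no
    missing hits on the innermost tracker layers, and the supercluster
    is not matched to a reconstructed conversion.
    """
    removed = (
        matched_to_electron
        and electron_missing_inner_hits == 0
        and not matched_to_conversion
    )
    return not removed
```

The third condition is what makes the veto "conversion safe": a candidate matched to a reconstructed conversion is kept even if an electron track points to it.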

The event yields passing and failing the electron veto in Z → µµγ process (which is ∼ 99.6% a pure source of photons) for data and simulation can be seen in Figure 3.7 for ECAL barrel and endcaps [139].

(a) Conversion Safe Electron Veto in ECAL barrel. (b) Conversion Safe Electron Veto in ECAL endcaps. Axes: electron veto outcome (Passed Veto, Failed Veto) versus Entries. Samples: data Z → µµγ and MC Z → µµγ.

Figure 3.7: Event yields passing and failing the electron veto in the ECAL barrel (a) and endcaps (b) for data and Monte Carlo simulation of Z → µµγ process. The hatched band is the uncertainty on the simulation prediction. The residual data to Monte Carlo disagreement is likely to be due to improper simulation of the detector materials [139].

3.2.2 Shower Shape

The shower shape has different features in its longitudinal and lateral development, and both have to be studied.

Longitudinal Shape

Photons and electrons deposit most of their energy in the ECAL, whereas hadrons deposit most of their energy in the HCAL. A variable that is useful for identification of photons is the hadronic to electromagnetic fraction (H/E). In this variable, the numerator H is defined as the energy of the HCAL tower located behind the seed crystal of the seed basic cluster of the supercluster, while the denominator E is the energy of the electromagnetic supercluster.

Due to the 25.8 (barrel) and 24.7 (endcaps) radiation length depth of the ECAL crystals, isolated photons deposit almost all of their energy in the ECAL itself, and the variable H/E has a value close to or equal to zero. A significant amount of energy in the HCAL directly behind the ECAL is evidence of the presence of other particles in association with the candidate photon: for jets, which carry both hadronic and electromagnetic energy, the variable H/E will therefore generally have a higher value than for photons.


By minimizing the number of HCAL towers that contribute to the variable, its pile-up dependence is also minimized for the high pile-up conditions of the 2012 data-taking period. The single-tower H/E distribution of photons in the Z → µµγ process for data and simulation is shown in Figure 3.8 for the ECAL barrel and endcaps [139].

(a) H/E in ECAL barrel. (b) H/E in ECAL endcaps. Axes: Hadronic to EM Energy Ratio versus Entries / 0.005.

Figure 3.8: Distribution of the ratio of hadronic energy behind the supercluster to the photon supercluster energy in the ECAL barrel (a) and endcaps (b) for data and Monte Carlo simulation of the Z → µµγ process. The hatched band is the uncertainty on the simulation prediction [139].

Lateral Shape

The variable used to study the lateral development of the shower is σiηiη. It is the η − η element of the η − ϕ covariance matrix and it represents the width of the ECAL cluster along the η direction. It is computed with logarithmic weights and is defined as [140]:

\[ \sigma_{i\eta i\eta}^2 = \frac{\sum_{i}^{5\times 5} w_i \, (\eta_i - \bar{\eta}_{5\times 5})^2}{\sum_{i}^{5\times 5} w_i} , \qquad (3.4) \]

where the index i runs over all the crystals in a 5 × 5 block of crystals centered on the seed crystal, ηi is the pseudorapidity position of the i-th crystal, and η̄5×5 is the energy-weighted η mean of the 5 × 5 block of crystals, defined as

\[ \bar{\eta}_{5\times 5} = \frac{\sum_{i}^{5\times 5} w_i \, \eta_i}{\sum_{i}^{5\times 5} w_i} . \qquad (3.5) \]

In these equations wi is the weight of the i-th crystal, defined as

\[ w_i = \max\left( 0, \; 4.7 + \ln\frac{E_i}{E_{5\times 5}} \right) . \qquad (3.6) \]
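Equations 3.4 to 3.6 can be combined into a short numerical sketch. The code below is illustrative only: the function names are hypothetical, the crystals are passed as flat lists rather than a 5 × 5 array, and E5×5 is taken to be the summed energy of the crystals provided, which is an assumption about the definition of that symbol.

```python
import math
from typing import List, Sequence

W0 = 4.7  # constant of the logarithmic weight in Equation 3.6


def log_weights(energies: Sequence[float]) -> List[float]:
    """Logarithmic weights w_i = max(0, 4.7 + ln(E_i / E_5x5))."""
    e_total = sum(energies)  # assumption: E_5x5 is the sum over the block
    return [
        max(0.0, W0 + math.log(e / e_total)) if e > 0.0 else 0.0
        for e in energies
    ]


def sigma_ieta_ieta(energies: Sequence[float], etas: Sequence[float]) -> float:
    """Weighted width of the cluster along eta (Equations 3.4 and 3.5)."""
    weights = log_weights(energies)
    w_sum = sum(weights)
    if w_sum == 0.0:
        raise ValueError("all crystal weights are zero")
    eta_mean = sum(w * eta for w, eta in zip(weights, etas)) / w_sum
    variance = sum(w * (eta - eta_mean) ** 2 for w, eta in zip(weights, etas)) / w_sum
    return math.sqrt(variance)
```

The logarithmic weight suppresses crystals carrying less than roughly 1% of the block energy (e^−4.7 ≈ 0.009), so low-energy noise far from the shower core does not inflate the measured width.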


The value of σ²iηiη tends to be small for single isolated photons, directly produced in the hard scattering, that are not affected by the magnetic field and leave isolated deposits of energy in the ECAL. The value is also small in a shower produced by a converted photon (γ → e+e−): the CMS magnetic field bends charged particles along the ϕ direction, so even in that case the shower profile along the η direction is expected to be narrow. In the case of deposits of energy in the ECAL which arise from jets, consisting of multiple π0s each decaying to two photons, the σ²iηiη value is expected to be larger, with a wider shower profile along the η direction.

The σiηiη distribution of photons in the Z → µµγ process for data and simulation is shown in Figure 3.9 for the ECAL barrel and endcaps [139].

(a) σiηiη in ECAL barrel. (b) σiηiη in ECAL endcaps. Axes: σiηiη versus Entries / 0.0005 (barrel) and Entries / 0.001 (endcaps).

Figure 3.9: Distribution of the covariance σiηiη of the photon supercluster in the ECAL barrel (a) and endcaps (b) for data and Monte Carlo simulation of the Z → µµγ process. The hatched band is the uncertainty on the simulation prediction [139].

3.2.3 Isolation

Isolation is a powerful tool to reject the non-prompt background due to electromagnetic showers originating in jets, mainly from single and multiple π0s. The isolation variables make use of the Particle Flow (PF) algorithm [141–143], which combines the information from all CMS subdetectors to identify and reconstruct all particles in the event. It has some advantages with respect to a detector-based isolation: it avoids double counting of information across subdetectors, and the photon footprint removal (i.e. removal of the energy deposits due to the photon) is an intrinsic feature of the algorithm. A series of geometrical vetoes is still needed in cases where the signal photon is not identified by the Particle Flow algorithm in the reconstruction.

The computation of the isolation is split into two parts:

• the computation of the measurable quantities (i.e. direction, pT) around the photon direction for three PF objects: charged hadrons, neutral hadrons and photons. Any veto is applied at this step;

• the computation of the sum of the pT of the above candidates in a cone with ∆R = √((∆η)² + (∆ϕ)²) = 0.3 centered around the photon seed: IsoChHad for charged hadrons, IsoNeHad for neutral hadrons and IsoPho for photons. The sums are computed with different vetoes depending on whether the particle was identified or not by the PF algorithm.

For each one of the isolations defined above, the energy deposited within the isolation cone is contaminated by energy from pile-up and from the underlying event. Since the contamination increases with the number of pile-up vertices, the efficiency of the isolation cut decreases with increasing pile-up. In order to maintain high efficiency under high pile-up conditions, the contribution to the isolation from pile-up and the underlying event is estimated using the ρ variable of the FastJet algorithm [144–146], i.e. the average transverse energy per unit area of the detector, computed on an event-by-event basis. The jets used to measure it are detected only in the active Tracker volume of the detector, |η| < 2.5. The ρ value has units of ET/area: in order to derive an isolation correction, it must be multiplied by the geometrical area of the isolation cone, but this computation is complicated by the detector geometry. An effective area Aeff is therefore defined as the slope of a linear fit of the average isolation of photon objects versus ρ, excluding veto regions. A detailed description of that method will be provided in Section 3.3.3. For each one of the isolations IsoX defined above (where X = ChHad, NeHad, Pho), the pile-up corrected isolation

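sum $\mathrm{Iso}_X^{\rho\,\mathrm{corr}}$ is then given by:

$$\mathrm{Iso}_X^{\rho\,\mathrm{corr}} = \max\left(0,\ \mathrm{Iso}_X - \rho \cdot A_{\mathrm{eff}}\right) . \tag{3.7}$$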

The effective areas Aeff used for the ρ-based corrections to the three PF isolation sums used in this analysis are reported in Table 3.1. These effective areas have been determined using γ + jets Monte Carlo simulations.

Table 3.1: Calculated effective areas Aeff for the ρ-based corrections to the charged hadron, neutral hadron and photon isolation.

                        Aeff (unit of area)
η range                 Charged Hadrons   Neutral Hadrons   Photons
|η| < 1.0               0.012             0.030             0.148
1.0 < |η| < 1.479       0.010             0.057             0.130
1.479 < |η| < 2.0       0.014             0.039             0.112
2.0 < |η| < 2.2         0.012             0.015             0.216
2.2 < |η| < 2.3         0.016             0.024             0.262
2.3 < |η| < 2.4         0.020             0.039             0.260
|η| > 2.4               0.012             0.072             0.266
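As an illustration, Equation 3.7 combined with the effective areas of Table 3.1 can be sketched in a few lines of Python. This is only a minimal sketch and not the analysis code: the function names `effective_area` and `corrected_isolation` are hypothetical, and the assignment of a photon lying exactly on a bin boundary to the upper η bin is an assumption, since the table does not specify it.

```python
# Effective areas from Table 3.1.
# Each row: (upper |eta| edge, A_eff charged hadrons, A_eff neutral hadrons, A_eff photons).
EFFECTIVE_AREAS = [
    (1.0,          0.012, 0.030, 0.148),
    (1.479,        0.010, 0.057, 0.130),
    (2.0,          0.014, 0.039, 0.112),
    (2.2,          0.012, 0.015, 0.216),
    (2.3,          0.016, 0.024, 0.262),
    (2.4,          0.020, 0.039, 0.260),
    (float("inf"), 0.012, 0.072, 0.266),
]
COLUMN = {"ChHad": 1, "NeHad": 2, "Pho": 3}


def effective_area(eta, kind):
    """Return A_eff for a photon at pseudorapidity eta and isolation type kind.

    Assumption: a value exactly on a bin edge is assigned to the upper bin.
    """
    abs_eta = abs(eta)
    for row in EFFECTIVE_AREAS:
        if abs_eta < row[0]:
            return row[COLUMN[kind]]
    raise ValueError("eta is not a finite number")


def corrected_isolation(iso, rho, eta, kind):
    """Pile-up corrected isolation sum, Equation 3.7: max(0, Iso_X - rho * A_eff)."""
    return max(0.0, iso - rho * effective_area(eta, kind))


# Hypothetical photon: raw photon isolation of 2 GeV in an event with rho = 10 GeV per unit area.
print(corrected_isolation(iso=2.0, rho=10.0, eta=0.5, kind="Pho"))
```

The clipping at zero prevents the corrected sum from becoming negative in events where the estimated pile-up contribution exceeds the measured isolation.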

The isolation distributions of photons in the Z → µµγ process, corrected for the presence of additional proton-proton collisions, are shown for data and simulation in Figure 3.10 for charged hadrons, in Figure 3.11 for neutral hadrons and in Figure 3.12 for photons, for the ECAL barrel and endcaps [139].

[Figure 3.10 panels: entries versus charged hadron isolation sum (GeV), Z → µµγ data and MC, CMS Preliminary 2012, √s = 8 TeV, ∫L dt = 19.6 fb−1. (a) $\mathrm{Iso}_{\mathrm{ChHad}}^{\rho\,\mathrm{corr}}$ in ECAL barrel (|ηγ| < 1.4442). (b) $\mathrm{Iso}_{\mathrm{ChHad}}^{\rho\,\mathrm{corr}}$ in ECAL endcaps (1.566 < |ηγ| < 2.5).]

Figure 3.10: Distribution of selected photons charged hadron isolation corrected for the presence of additional proton-proton collisions, in the ECAL barrel (a) and endcaps (b) for data and Monte Carlo simulation in the Z → µµγ process. The hatched band is the uncertainty on the simulation prediction [139].

3.3 Jet Reconstruction

Jets are the experimental signature of quarks and gluons produced in high energy physics processes and they play a predominant role at the LHC. Due to their large production cross section, the jets detected by the CMS experiment allow the study of new kinematic regimes, the test of perturbative QCD predictions and the probing of physics processes within and beyond the Standard Model. A correct reconstruction and calibration of jets is therefore crucial to perform an analysis with jets in the final state. In this section, the jet reconstruction adopted in CMS and in this analysis is described.

3.3.1 Standard Jet Clustering Algorithms

Jet clustering algorithms are among the most important tools for the analysis of data from hadronic collisions. Their widespread use in experiments at the Tevatron and LEP and the extreme complexity of the final states in the experiments at the LHC have produced a wide-ranging debate on the different algorithms available.

From a “theoretical standpoint”, the following features are desirable to build an ideal algorithm [147]:

• infrared safety: the algorithm should not only be infrared safe, in the sense that any infrared singularities must not appear in the perturbative calculations, but should also find solutions that are insensitive to soft radiation in the event;

• collinear safety: the algorithm should not only be collinear safe, in the sense that collinear singularities must not appear in the perturbative calculations, but should also find jets that are insensitive to any collinear radiation in the event;

[Figure 3.11 panels: entries versus neutral hadron isolation sum (GeV), Z → µµγ data and MC, CMS Preliminary 2012, √s = 8 TeV, ∫L dt = 19.6 fb−1. (a) $\mathrm{Iso}_{\mathrm{NeHad}}^{\rho\,\mathrm{corr}}$ in ECAL barrel. (b) $\mathrm{Iso}_{\mathrm{NeHad}}^{\rho\,\mathrm{corr}}$ in ECAL endcaps (1.566 < |ηγ| < 2.5).]

Figure 3.11: Distribution of the selected photons neutral hadron isolation corrected for the presence of additional proton-proton collisions, in the ECAL barrel (a) and endcaps (b) for data and Monte Carlo simulation in the Z → µµγ process. The hatched band is the uncertainty on the simulation prediction [139].

• invariance under boosts: the algorithm should find the same solutions independently of boosts in the longitudinal direction. This is particularly important for pp collisions, where the center-of-mass of the individual parton-parton collisions is typically boosted with respect to the pp center-of-mass;

• boundary stability: it is desirable that the kinematic variables used to describe the jets exhibit kinematic boundaries that are insensitive to the details of the final state;

• order independence: the algorithm should find the same jets at parton, particle, and detector level;

• straightforward implementation: the algorithm should be straightforward to implement in perturbative calculations.

From an “experimental standpoint”, a desirable jet algorithm obeys the following criteria [147]:

• detector independence: the performance of the algorithm should be as independent as possible of the detector that provides the data. For example, the algorithm should not be strongly dependent on detector segmentation, energy response, or resolution;

• minimization of resolution smearing and angle biases: the algorithm should not amplify the inevitable effects of resolution smearing and angle biases;

• stability with luminosity: jet finding should not be strongly affected by multiple hard scatterings at high beam luminosities;

• efficient use of computer resources: the jet algorithm should provide jet identification with a minimum of computer time consumption;

[Figure 3.12 panels: entries versus neutral EM isolation sum (GeV), Z → µµγ data and MC, CMS Preliminary 2012, √s = 8 TeV, ∫L dt = 19.6 fb−1. (a) $\mathrm{Iso}_{\mathrm{Pho}}^{\rho\,\mathrm{corr}}$ in ECAL barrel. (b) $\mathrm{Iso}_{\mathrm{Pho}}^{\rho\,\mathrm{corr}}$ in ECAL endcaps (1.566 < |ηγ| < 2.5).]

Figure 3.12: Distribution of the selected photons neutral electromagnetic isolation corrected for the presence of additional proton-proton collisions, in the ECAL barrel (a) and endcaps (b) for data and Monte Carlo simulation in the Z → µµγ process. The hatched band is the uncertainty on the simulation prediction [139].

• maximal reconstruction efficiency: the jet algorithm should efficiently identify all physically interesting jets;

• ease of calibration: the algorithm should not obstruct the reliable calibration of the final kinematic properties of the jet;

• ease of use: the algorithm should be straightforward to implement with typical experimental detectors and data;

• fully specified: the algorithm must include specifications for clustering, energy and angle definition, and all details of jet splitting and merging.

Since a parton is not a well-defined object, a jet definition is not unique and several approaches are therefore available for jet clustering, each of them with different characteristics in order to satisfy the previous requirements. Two broad classes of jet clustering algorithms have been developed.

The first one consists of the conical recombination [147], where jets are defined as dominant directions of energy flow. One introduces the concept of a stable cone as a circle of fixed radius R in the η − ϕ plane such that the sum of all the momenta of the particles within the cone points in the same direction as the center of the circle. Cone algorithms attempt to identify all the stable cones. Most implementations use a seeded approach to do so: starting from one seed for the center of the cone, one iterates until the cone is found to be stable. The set of seeds can be taken as the set of initial particles (sometimes above a pT threshold) or as the midpoints between previously-found stable cones.

The second class is called sequential recombination and it works by defining a distance between pairs of particles, performing subsequent recombinations of the pair of closest particles and stopping when all resulting objects are too far apart. Algorithms within that class differ in the definition of the distance.

As described in Section 1.6.1, the main requirement for a jet clustering algorithm is that it is InfraRed and Collinear safe (IRC-safe), i.e. the algorithm must be independent of infrared and collinear corrections in perturbative QCD. The infrared corrections concern the emission of a gluon with an infinitely low energy, a phenomenon which happens with infinite probability in perturbative QCD. As shown in Figure 3.13, if the algorithm is IR-safe (InfraRed safe) this soft emission must not affect the number of reconstructed jets. At the same time, the collinear corrections concern the splitting of a hard particle into two collinear particles, a process that also has an infinite probability in perturbative QCD. The algorithm is C-safe (Collinear safe) if this splitting does not change the number of reconstructed jets, as shown in Figure 3.14. An IRC-safe algorithm also has the advantage of being less sensitive to the calorimeter noise (small energy deposits do not affect the number of reconstructed jets).

Figure 3.13: Stable cones found by an algorithm which is not IR-safe for a 3-particle event (a) and for the same event with an additional infinitely soft gluon (b). The algorithm is not IR-safe as it finds a different number of stable cones in the two cases [148].

Figure 3.14: Jets found by an algorithm which is not C-safe for a 3-particle event (a) and for the same event with a collinear splitting (b). The algorithm is not C-safe as it finds a different number of stable cones in the two cases [148].

The standard IRC-safe algorithms adopted by CMS are SISCone [148–150], in the conical recombination class, and the kt and anti-kt [151] algorithms, in the sequential recombination class. In the following, the different characteristics of the three algorithms are described.

The SISCone Algorithm

The Seedless Infrared Safe Cone (SISCone) [148–150] is an IRC-safe algorithm based on a seedless approach to stable cones identification. The cone search follows these steps:

• for each particle i (which could also be the result of merging a group of collinear particles), all the particles j = 1, · · · , N with a distance from i lower than 2R are considered, where the distance is measured as √((∆η)² + (∆ϕ)²) and R is a fixed parameter. If there are no such particles, i forms a stable cone of its own;

• for each pair of particles i, j, the two circles of radius R which have the two particles on their circumference are constructed;

• for each circle, the pT-weighted centroid of all the particles which lie within the circle is computed;

• the cone centered on the centroid of the nth circle is declared stable only if it contains all the particles of the initial circle;

• the cones with overlaps are split/joined if the scalar sum of the pT of the common particles is lower/higher than a fraction f of the energy of the cone with higher momentum.

Default values used for the R and f parameters are R = 0.5 and f = 0.75.

The kt and anti-kt Algorithms

The kt and anti-kt are IRC-safe algorithms which belong to the sequential recombination class. They introduce two definitions of distance: dij, the distance between the objects (particles or pseudojets) i and j, and diB, the distance between the object i and the beam. These distances are defined as follows:

$$d_{ij} = \min\left(k_{ti}^{2p},\, k_{tj}^{2p}\right) \frac{\Delta_{ij}^{2}}{R^{2}} , \tag{3.8}$$

$$d_{iB} = k_{ti}^{2p} , \tag{3.9}$$

where $\Delta_{ij}^{2} = (y_i - y_j)^2 + (\phi_i - \phi_j)^2$ and kti, yi and φi are, respectively, the transverse momentum, rapidity and azimuth of particle i. In the two expressions, R is the radius parameter and p is a parameter which governs the relative power of the energy scale versus the geometrical scale ∆ij. The algorithm therefore proceeds as follows:

• the distances dij and diB are calculated for all objects, and the smallest among all the dij and diB is identified;

• if the smallest distance is a dij, the objects i and j are recombined into a single object; if it is a diB, the object i is declared a jet and removed from the list;

• the distances are recalculated and the procedure is repeated until no objects are left.

By changing the value of the p parameter in the distance definition, different jet clustering algorithms are obtained:

• the value p = 1 defines the kt algorithm [152]. The general behaviour of the algorithm for p > 0 with respect to soft radiation is rather similar to the one observed for the kt algorithm, because what matters is the ordering between particles, and for finite ∆ij this is maintained for all positive values of p;

• the value p = 0 defines the inclusive Cambridge/Aachen algorithm [153];

• the value p = −1 finally defines the so-called anti-kt algorithm [151], which has been used for this work.
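The sequential recombination procedure described above can be sketched as follows. This is a deliberately naive illustration (all distances are recomputed at every iteration) and is not the FastJet implementation used by CMS. The helper names `from_pt_y_phi`, `kinematics` and `cluster`, the representation of particles as (px, py, pz, E) tuples and the recombination by four-momentum sum are assumptions made for this sketch; every input is assumed to have a non-zero transverse momentum.

```python
import math


def from_pt_y_phi(pt, y, phi):
    """Build a massless four-vector (px, py, pz, E) from pt, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y), pt * math.cosh(y))


def kinematics(p4):
    """Return (kt, y, phi) of a four-vector; assumes kt > 0."""
    px, py, pz, e = p4
    kt = math.hypot(px, py)
    y = 0.5 * math.log((e + pz) / (e - pz))
    return kt, y, math.atan2(py, px)


def cluster(particles, R=0.5, p=-1):
    """Inclusive sequential recombination with the distances of Equations 3.8 and 3.9.

    p = 1 gives kt, p = 0 Cambridge/Aachen, p = -1 anti-kt.
    """
    objs = [tuple(q) for q in particles]
    jets = []
    while objs:
        kin = [kinematics(o) for o in objs]
        best = (float("inf"), None, None)  # (distance, i, j); j is None for a beam distance
        for i, (kti, yi, phii) in enumerate(kin):
            d_ib = kti ** (2 * p)
            if d_ib < best[0]:
                best = (d_ib, i, None)
            for j in range(i + 1, len(objs)):
                ktj, yj, phij = kin[j]
                dphi = abs(phii - phij)
                if dphi > math.pi:  # wrap the azimuthal difference into [0, pi]
                    dphi = 2.0 * math.pi - dphi
                delta2 = (yi - yj) ** 2 + dphi ** 2
                d_ij = min(kti ** (2 * p), ktj ** (2 * p)) * delta2 / R ** 2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))  # smallest distance is d_iB: object i becomes a jet
        else:
            merged = tuple(a + b for a, b in zip(objs[i], objs[j]))
            objs.pop(j)  # j > i, so index i is still valid after the removal
            objs[i] = merged
    return jets


# Hypothetical toy event: two hard particles and one soft particle close to the first one.
event = [from_pt_y_phi(100.0, 0.0, 0.0),
         from_pt_y_phi(1.0, 0.2, 0.1),
         from_pt_y_phi(50.0, 0.0, 2.5)]
for jet in cluster(event, R=0.5, p=-1):
    print(kinematics(jet))
```

Changing only the argument `p` switches between the three algorithms listed above, which makes the role of the exponent in the distance definition explicit.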

The functionality of the IRC-safe anti-kt algorithm can be understood by considering an event with a few well separated hard particles with transverse momenta kt1, kt2, · · · and many soft particles. The distance $d_{1i} = \min(1/k_{t1}^{2},\, 1/k_{ti}^{2})\,\Delta_{1i}^{2}/R^{2}$ between a hard particle 1 and a soft particle i is exclusively determined by the transverse momentum of the hard particle and by the ∆1i separation. The dij between similarly separated soft particles will instead be much larger. Therefore the algorithm will tend to cluster soft particles with the hard ones instead of among themselves. If a hard particle has no hard neighbours within a distance 2R, then it will simply accumulate all the soft particles within a circle of radius R, resulting in a perfectly conical jet. If another hard particle 2 is present such that R < ∆12 < 2R, then there will be two hard jets and it will not be possible for both to be perfectly conical. If kt1 ≫ kt2, jet 1 will be conical, while jet 2 will be only partly conical, since it will miss the part overlapping with jet 1. If instead kt1 = kt2, neither jet will be conical and the overlapping part will simply be divided equally between the two by a straight line; in general, if kt1 ∼ kt2, both cones will be clipped. In the case where ∆12 < R, the two hard particles will cluster to form a single jet, which will be conical and centered on k1 if kt1 ≫ kt2, and will instead have a more complex shape if kt1 ∼ kt2. The key feature is that the soft particles do not modify the shape of the jet, while hard particles do. This results in a more regular shape of the anti-kt clustered jets with respect to the other algorithms, as shown in Figure 3.15: anti-kt generates a circular hard jet, which clips a lens-shaped region out of the softer one. The anti-kt is therefore more robust than the other algorithms as far as non-perturbative effects like hadronization and underlying event contamination are concerned, improving in this way the momentum resolution and the calorimeter performance. Its more regular shape also makes it easier to subtract the contribution of pile-up, as explained below. Finally, another feature that justifies the choice of this algorithm is its operating speed. For all these reasons, the anti-kt has been used as the main clustering algorithm by the CMS experiment and in the present work. The distance parameter was fixed at the value R = 0.5.
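The ordering of distances that drives this behaviour can be checked with a short numerical example based on Equations 3.8 and 3.9 with p = −1. The transverse momenta and the separation used below are hypothetical values chosen only for illustration, and `antikt_dij` is a name introduced for this sketch.

```python
def antikt_dij(kt_i, kt_j, delta_ij, R=0.5):
    """Anti-kt pairwise distance: Equation 3.8 with p = -1."""
    return min(kt_i ** -2, kt_j ** -2) * delta_ij ** 2 / R ** 2


hard_kt, soft_kt = 100.0, 1.0   # GeV/c, hypothetical values
separation = 0.3                # same separation for both pairs

d_hard_soft = antikt_dij(hard_kt, soft_kt, separation)
d_soft_soft = antikt_dij(soft_kt, soft_kt, separation)
d_soft_beam = soft_kt ** -2     # Equation 3.9 with p = -1

# The hard-soft distance is set by the hard particle alone and is much smaller
# than the soft-soft one, so a soft particle is merged with its hard neighbour first.
print(d_hard_soft, d_soft_soft, d_soft_beam)
```

For equal separations, the ratio of the two pairwise distances is (soft kt / hard kt)², which is why soft particles attach to the nearest hard particle long before they can cluster among themselves.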

3.3.2 Jet Reconstruction in CMS

Figure 3.15: A MC sample at parton level, together with many random soft “ghosts”, clustered with four different jet algorithms, illustrating the “active” catchment areas of the resulting hard jets. For kt and Cam/Aachen the detailed shapes are in part determined by the specific set of ghosts used and change when the ghosts are modified. The anti-kt jet shapes are instead more regular and not modified by soft particles [151].

The CMS subdetectors involved in the jet reconstruction are the hadronic calorimeter HCAL, the electromagnetic calorimeter ECAL, which identifies photons and electrons belonging to the jet (in particular the ones deriving from π0 decays), and the Silicon Tracker, which adds track information in order to improve the pT response and resolution of calorimeter jets. Three different methods of jet reconstruction are employed by CMS, characterized by the way the subdetector inputs are used during the jet finding procedure: Calorimeter jets (Calo jets), Jet-Plus-Tracks jets (JPT jets) and Particle Flow jets (PF jets). In particular:

• Calorimeter jets [154] are reconstructed using energy deposits in the ECAL and HCAL cells, combined into calorimeter towers as inputs. A calorimeter tower consists of one or more HCAL cells and the geometrically corresponding ECAL crystals. In the barrel region of the calorimeters (|η| < 1.4), the unweighted sum of one single HCAL cell and a 5 × 5 array of ECAL crystals forms a projective calorimeter tower, whose energy is calculated as the sum of the two energy deposits. The association between HCAL cells and ECAL crystals is more complex in the endcap regions of the ECAL (1.4 < |η| < 3.0), because of the different geometry of the detector. Beyond the coverage of the ECAL (|η| > 3.0), each calorimeter tower corresponds to one HCAL cell. To suppress noise contributions originating from the readout electronics of the calorimeters, thresholds are applied to the energies of the cells during the construction of the calorimeter towers. In addition, to suppress the contribution from pile-up, the calorimeter towers with transverse energy ET < 0.3 GeV are not used in the reconstruction of jets;

• JPT jets are built with the Jet-Plus-Tracks (JPT) algorithm [155], which corrects the energy and the direction of a calorimeter jet. It exploits the excellent performance of the CMS tracking detectors to improve the pT response and resolution of calorimeter jets (tracking coverage extends up to |η| ∼ 2.4). Charged particle tracks are associated with each calorimeter jet based on the spatial separation in η − ϕ between the jet axis and the track momentum measured at the interaction vertex. The associated tracks are classified as in-cone tracks if their projection onto the surface of the ECAL falls within the jet cone. Conversely, if they are bent outside the cone by the magnetic field, they are called out-of-cone tracks. The momenta of both in-cone and out-of-cone tracks are then added to the energy of the associated calorimeter jet. For in-cone tracks the expected average energy deposition in the calorimeters is subtracted, based on the track momentum and the hypothesis that the track originates from a charged pion;

• PF jets are built with the Particle Flow (PF) algorithm that aims to reconstruct, identify and calibrate each individual particle in the event by combining the information from all CMS subdetector systems.

Particle Flow jets are used for this work, since they have a better jet pT resolution. The jets were clustered using the anti-kt algorithm described in Section 3.3.1.

Particle Flow Jets (PF Jets)

The Particle Flow algorithm [141–143] combines the information from all CMS subdetectors to identify and reconstruct all particles in the event, namely muons, electrons, photons, charged hadrons and neutral hadrons, as shown in Figure 3.16. Charged hadrons, in particular, are reconstructed from tracks in the central Tracker. Photons and neutral hadrons are reconstructed from the energy clusters in the electromagnetic and hadron calorimeters. A cluster which does not have a corresponding associated track is a clear signature of a neutral particle. A neutral particle overlapping with charged particles in the calorimeters can be detected as a calorimetric energy excess with respect to the sum of the associated track momenta. PF jets are then reconstructed from the resulting list of particles in the event. The jet momentum and spatial resolutions are thus improved with respect to calorimeter jets, since the use of the tracking detectors and the excellent granularity of the ECAL allow charged hadrons and photons inside jets, which constitute about 90% of the jet energy, to be resolved and precisely measured.

Figure 3.16: The Particle Flow algorithm. Particles in the CMS detector are seen as tracks and energy deposits. The PF algorithm attempts to fully reconstruct an event by combining information from all CMS subdetectors.

3.3.3 Jet Energy Corrections and pT Resolution

A detailed understanding of the energy calibration and resolution of jets is of crucial importance and is a leading source of systematic uncertainty for many analyses with jets in the final state.

The jet energy measured in the detector is typically different from the corresponding particle jet energy. The latter is obtained in the simulation by clustering, with the same jet algorithm, the stable generated particles produced during the hadronization process that follows the hard interaction. The main cause of this energy mismatch is the non-uniform and non-linear response of the CMS calorimeters to the jet showers. Furthermore, electronics noise and the pile-up originating from the additional pp interactions in the same bunch crossing can lead to extra unwanted energy. The purpose of the jet energy correction is to relate, on average, the energy measured in the detector to the energy of the corresponding particle jet. CMS has developed a factorized multi-step procedure for the Jet Energy Calibration (JEC) [156,157]. The corrected jet four-momentum vector $p^{\mathrm{cor}}$ is obtained by applying a multiplicative factor C to each component of the raw jet four-momentum vector $p^{\mathrm{raw}}$ (components are indexed by µ in the following):

$$p_{\mu}^{\mathrm{cor}} = C \cdot p_{\mu}^{\mathrm{raw}} . \tag{3.10}$$

The correction factor C is composed of the offset correction Coffset, the MC calibration factor CMC, and the residual calibrations Crel and Cabs for the relative and absolute energy scales, respectively. The offset correction removes the extra energy due to noise and pile-up, and the MC correction removes the bulk of the non-uniformity in η and the non-linearity in pT. Finally, the residual corrections account for the small differences between data and simulation. The various components are applied in sequence as described by the equation below:

$$C = C_{\mathrm{offset}}(p_T^{\mathrm{raw}}) \cdot C_{\mathrm{MC}}(p_T', \eta) \cdot C_{\mathrm{rel}}(\eta) \cdot C_{\mathrm{abs}}(p_T'') , \tag{3.11}$$

where $p_T'$ is the transverse momentum of the jet after applying the offset correction and $p_T''$ is the pT of the jet after all previous corrections. In the following sections, each component of the jet energy calibration will be discussed separately.

Offset Correction

The offset correction is the first step in the chain of the factorized corrections. Its purpose is to estimate and subtract the energy not associated with the high-pT scattering. The excess energy includes contributions from electronics noise and pile-up. Recent developments in the jet reconstruction algorithms have allowed a novel approach for the treatment of pile-up [144,158]: for each event, an average pT density ρ per unit area is estimated, which characterizes the soft jet activity and is a combination of the underlying event, the electronics noise, and the pile-up. The two latter components contaminate the hard jet energy measurement and need to be corrected for with the offset correction. The key element for this approach is the jet area Aj. A very large number of infinitely soft four-momentum vectors (soft enough not to change the properties of the true jets) are artificially added to the event and clustered by the jet algorithm together with the true jet components. The extent of the region in the y − ϕ space occupied by the soft particles clustered in each jet defines the active jet area. The other important quantity for the pile-up subtraction is the pT density ρ, which is calculated with the kt jet clustering algorithm with a distance parameter R = 0.6. The kt algorithm naturally clusters a large number of soft jets in each event, which effectively cover the entire y − ϕ space and can be used to estimate an average pT density. The quantity ρ is defined on an event-by-event basis as the median of the distribution of the variable pTj/Aj, where j runs over all jets in the event across the full detector acceptance (|η| < 5), and is not sensitive to the presence of hard jets. At the detector level, the measured density ρ is the convolution of the true particle-level activity (underlying event, pile-up) with the detector response to the various particle types. Based on the knowledge of the jet area and the event density ρ, an event-by-event and jet-by-jet pile-up correction factor can be defined as:

$$C_{\mathrm{offset}}(p_T^{\mathrm{raw}}) = 1 - \frac{\left(\rho - \langle\rho_{\mathrm{UE}}\rangle\right) \cdot A_j}{p_T^{\mathrm{raw}}} . \tag{3.12}$$

In the formula above, ⟨ρUE⟩ is the pT-density component due to the underlying event and electronics noise, and is measured in events with exactly one reconstructed primary vertex (no pile-up). In Figure 3.17a the offset transverse momentum corrections for the CMS reconstructed jets are shown as a function of jet pseudorapidity in data and Monte Carlo for different intervals of the number of reconstructed primary vertices (NPV). Values are based on 11 fb−1 of 8 TeV data [159].
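The median-based density and the offset factor of Equation 3.12 can be illustrated with the following minimal sketch. The function names `event_rho` and `offset_correction` and all numerical inputs are hypothetical; in the real procedure the jet areas come from the ghost-based active area computation described above, which is not reproduced here.

```python
from statistics import median


def event_rho(jet_pts, jet_areas):
    """Median of pT_j / A_j over the kt jets of one event."""
    return median(pt / area for pt, area in zip(jet_pts, jet_areas))


def offset_correction(pt_raw, jet_area, rho, rho_ue):
    """Offset correction factor of Equation 3.12."""
    return 1.0 - (rho - rho_ue) * jet_area / pt_raw


# Hypothetical event: four soft kt jets and one hard jet, all with unit area.
pts = [2.0, 3.0, 2.5, 120.0, 1.5]
areas = [1.0, 1.0, 1.0, 1.0, 1.0]
rho = event_rho(pts, areas)  # the single hard jet does not shift the median

# Hypothetical anti-kt jet with raw pT of 50 GeV/c and area 0.8.
print(rho, offset_correction(pt_raw=50.0, jet_area=0.8, rho=10.0, rho_ue=2.0))
```

Using the median rather than the mean is what makes ρ insensitive to the few hard jets in the event, as stated in the text.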

MC Calibration Correction

After the first collision data at √s = 7 TeV, it was realized that the jet reconstruction in data and in MC did not agree perfectly: small differences, up to 10%, were observed depending on η. The MC calibration is based on the simulation and it corrects the energy of the reconstructed jets such that it is equal on average to the energy of the generated MC particle jets. The jet reconstruction in MC is identical to the one applied to the data. Each reconstructed jet is spatially matched in the η − φ space with a MC particle jet by requiring ∆R < 0.25. In each bin of the MC particle transverse momentum $p_T^{\mathrm{gen}}$, the response variable $R = p_T^{\mathrm{reco}}/p_T^{\mathrm{gen}}$ and the detector jet $p_T^{\mathrm{reco}}$ are recorded. The average correction in each bin is defined as:

$$C_{\mathrm{MC}}(p_T^{\mathrm{reco}}) = \frac{1}{\langle R \rangle} , \tag{3.13}$$

and is expressed as a function of the average detector jet pT, $\langle p_T^{\mathrm{reco}} \rangle$.
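The binned procedure behind Equation 3.13 can be sketched as follows, assuming that the matching between reconstructed and particle jets has already been performed. The function name `mc_calibration`, the bin edges and the jet values are hypothetical and serve only to illustrate the averaging.

```python
import bisect
from collections import defaultdict


def mc_calibration(matched_jets, gen_pt_edges):
    """Return a list of (<pT reco>, C_MC = 1 / <R>) pairs, one per populated pT gen bin.

    matched_jets: iterable of (pt_gen, pt_reco) pairs for matched jets.
    gen_pt_edges: increasing bin edges in pT gen; jets outside the range are ignored.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # bin -> [sum of R, sum of pT reco, count]
    for pt_gen, pt_reco in matched_jets:
        b = bisect.bisect_right(gen_pt_edges, pt_gen) - 1
        if b < 0 or b >= len(gen_pt_edges) - 1:
            continue
        acc = sums[b]
        acc[0] += pt_reco / pt_gen
        acc[1] += pt_reco
        acc[2] += 1
    result = []
    for b in sorted(sums):
        response_sum, reco_sum, n = sums[b]
        result.append((reco_sum / n, n / response_sum))  # n / sum(R) equals 1 / <R>
    return result


# Hypothetical matched jets (pt_gen, pt_reco) in GeV/c and two pT gen bins.
jets = [(100.0, 90.0), (100.0, 92.0), (50.0, 40.0)]
for mean_reco, correction in mc_calibration(jets, gen_pt_edges=[30.0, 70.0, 150.0]):
    print(mean_reco, correction)
```

Binning in the generated pT while quoting the result against the average reconstructed pT mirrors the text: the correction has to be applied to jets for which only the reconstructed pT is known.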

In Figure 3.17b the MC calibration corrections for the CMS reconstructed jets are shown as a function of jet pseudorapidity for three reference transverse momentum values (30 GeV/c, 100 GeV/c and 300 GeV/c), on MC samples at 8 TeV [159].

Relative Jet Energy Scale

The goal of the relative jet energy scale correction is to make the jet response flat versus η. The uniformity in pseudorapidity is achieved by employing a data-driven method, the di-jet pT-balance technique. Starting from an event with at least two jets in the final state, one jet (barrel jet) is required to lie in the central region of the detector (|η| < 1.3) and the other jet (probe jet) at arbitrary η. The central region is chosen as a reference because of the uniformity of the detector, the small variation of the jet energy response, and because it provides the highest jet pT reach. It is also the easiest region to calibrate in absolute terms, using γ + jet and Z + jet events. The two leading jets in the event must be azimuthally separated by ∆ϕ > 2.7 rad.

Figure 3.17c shows the residual relative corrections Crel(η) as a function of jet pseudorapidity for a class of PF jets built with the Charged Hadron Subtraction (CHS) algorithm. The Jet Energy Scale (JES) uncertainty, taken at the representative transverse momentum value of 100 GeV/c, is shown together with the statistical uncertainty. Values are based on 11 fb−1 of 8 TeV data [159]. The CHS algorithm uses the excellent tracking capabilities of the CMS detector to identify and remove jet constituents (charged hadrons) which are known to have originated from pile-up vertices. The algorithm, which uses a particle-by-particle pile-up subtraction technique, does not remove unassociated tracks, as they may have originated from the high-pT vertex.

Absolute Jet Energy Scale

The goal of the absolute jet energy scale correction is to make the jet response flat versus pT. Once a jet has been corrected for the η dependence, it is corrected back to particle level. The absolute jet energy response is measured in the reference region |η| < 1.3 with the Missing Transverse Energy Projection Fraction (MPF) method [160] using γ/Z + jets events.

It is based on the fact that the γ/Z + jets events have no intrinsic missing transverse energy $\vec{E}_T^{\mathrm{miss}}$ and that, at parton level, the γ or Z is perfectly balanced by the hadronic recoil in the transverse plane:

$$\vec{p}_T^{\,\gamma,Z} + \vec{p}_T^{\,\mathrm{recoil}} = 0 . \tag{3.14}$$

For reconstructed objects, this equation can be rewritten as:

$$R_{\gamma,Z} \cdot \vec{p}_T^{\,\gamma,Z} + R_{\mathrm{recoil}} \cdot \vec{p}_T^{\,\mathrm{recoil}} = -\vec{E}_T^{\mathrm{miss}} , \tag{3.15}$$

where $R_{\gamma,Z}$ and $R_{\mathrm{recoil}}$ are the detector responses to the γ or Z and to the hadronic recoil, respectively. Solving the two above equations for $R_{\mathrm{recoil}}$ gives:

$$R_{\mathrm{recoil}} = R_{\gamma,Z} + \frac{\vec{E}_T^{\mathrm{miss}} \cdot \vec{p}_T^{\,\gamma,Z}}{\left(p_T^{\gamma,Z}\right)^{2}} \equiv R_{\mathrm{MPF}} . \tag{3.16}$$

This equation provides the definition of the MPF response RMPF. The additional step needed is to extract the jet energy response from the measured MPF response. In general, the recoil consists of additional jets, beyond the leading one, soft particles and unclustered energy. The relation Rleadjet = Rrecoil holds to a good approximation if the particles that are not clustered into the leading jet have a response similar to the ones inside the jet, or if these particles are in a direction perpendicular to the photon axis. The γ or the Z are used as reference objects because their energy is accurately measured in the ECAL (photon, Z → e+e−) or in the Tracker and muon detectors (Z → µ+µ−).
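Equation 3.16 can be verified on a toy configuration in which the recoil response is known by construction. The sketch below is illustrative only: `mpf_response` is a hypothetical helper, transverse vectors are represented as (x, y) tuples, and the missing transverse energy is built directly from Equation 3.15 rather than from reconstructed particles.

```python
def mpf_response(pt_ref, met, r_ref=1.0):
    """MPF response of Equation 3.16.

    pt_ref: (x, y) transverse momentum of the reference object (photon or Z).
    met:    (x, y) missing transverse energy vector.
    r_ref:  detector response to the reference object.
    """
    dot = met[0] * pt_ref[0] + met[1] * pt_ref[1]
    pt_ref_squared = pt_ref[0] ** 2 + pt_ref[1] ** 2
    return r_ref + dot / pt_ref_squared


# Toy event: a 100 GeV/c photon balanced by a recoil measured with response 0.9.
photon = (100.0, 0.0)
true_recoil = (-photon[0], -photon[1])        # Equation 3.14
r_photon, r_recoil = 1.0, 0.9                 # hypothetical responses
met = tuple(-(r_photon * g + r_recoil * h)    # Equation 3.15
            for g, h in zip(photon, true_recoil))

print(mpf_response(photon, met, r_ref=r_photon))
```

In this closed toy the function returns the recoil response that was injected, which is the consistency expected between Equations 3.14, 3.15 and 3.16.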

In Figure 3.17d the residual absolute corrections Cabs(pT) in Z → µµ events are shown for the CHS PF jets reconstructed in the center of the barrel (|η| < 1.3) and corrected with the full chain of corrections (offset, MC calibration, relative and absolute residual corrections). The extrapolation as a function of the second jet relative transverse momentum is shown for the MPF method and for the pT balancing method used as control. Values are based on 11 fb−1 of 8 TeV data [159].

Figure 3.17: Jet energy corrections for data and MC based on 11 fb−1 of 8 TeV data collected by CMS [159]: (a) offset corrections, (b) MC calibration corrections, (c) relative residual calibration corrections, (d) absolute residual calibration corrections.
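The factorized chain of Equation 3.11, applied to the four-momentum as in Equation 3.10, can be sketched as follows. The function `apply_jec` and the constant correction functions in the usage example are hypothetical placeholders; the real corrections are the pT- and η-dependent functions discussed in the previous sections.

```python
import math


def apply_jec(p4_raw, c_offset, c_mc, c_rel, c_abs):
    """Apply the corrections in sequence and return (corrected four-vector, total factor C).

    p4_raw is (px, py, pz, E); each c_* argument is a callable returning a factor.
    """
    px, py, pz, _ = p4_raw
    pt_raw = math.hypot(px, py)
    eta = math.asinh(pz / pt_raw)        # pseudorapidity from sinh(eta) = pz / pT

    c1 = c_offset(pt_raw)
    pt_prime = c1 * pt_raw               # pT after the offset correction
    c2 = c_mc(pt_prime, eta)
    c3 = c_rel(eta)
    pt_second = c1 * c2 * c3 * pt_raw    # pT after all previous corrections
    c4 = c_abs(pt_second)

    c_total = c1 * c2 * c3 * c4
    return tuple(c_total * component for component in p4_raw), c_total


# Hypothetical constant corrections, used only to show the call sequence.
p4_cor, c = apply_jec(
    (100.0, 0.0, 0.0, 100.0),
    c_offset=lambda pt: 0.90,
    c_mc=lambda pt, eta: 1.10,
    c_rel=lambda eta: 1.02,
    c_abs=lambda pt: 1.01,
)
print(c, p4_cor)
```

Each step receives the transverse momentum as corrected by the steps before it, which is the meaning of the primed quantities in Equation 3.11.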

Jet Energy Uncertainties at CMS

Each type of correction has uncertainties arising from many different sources. These sources can be categorized as follows [161]:

• physics modeling in MC such as showering, modeling of underlying event, etc.;

• MC modeling of true detector response and properties such as noise, zero suppression, etc.;


At CMS, more than 16 such sources of uncertainty have been identified. Several are related and can be combined into groups associated with the absolute scale, the relative scale, the extrapolation in pT, pile-up, jet flavor and time stability. Figure 3.18 shows the contribution of the combined sources to the jet energy correction uncertainty as a function of the jet pT and of the jet η. The total uncertainty on the jet energy correction is computed as the sum in quadrature of the individual contributions.
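The quadrature sum can be sketched as follows. The group names mirror the categories listed in the text, while the numerical values and the function name `total_uncertainty` are purely illustrative assumptions and are not CMS results. Summing in quadrature treats the contributions as uncorrelated.

```python
import math


def total_uncertainty(contributions):
    """Combine relative uncertainties in quadrature: sqrt(sum of squares)."""
    return math.sqrt(sum(u * u for u in contributions.values()))


# Hypothetical relative uncertainties for one (pT, eta) point; illustrative only.
example_sources = {
    "absolute scale": 0.003,
    "relative scale": 0.004,
    "pT extrapolation": 0.000,
    "pile-up": 0.000,
    "jet flavor": 0.000,
    "time stability": 0.000,
}

total = total_uncertainty(example_sources)
print(f"total relative uncertainty = {total:.4f}")
```

Because the squares are added, the largest contributions dominate the total, and small sources change it only marginally.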

Figure 3.18: Jet energy correction uncertainties as a function of pT for CMS PF jets reconstructed around |η| = 0 (a) and as a function of jet η for jets with pT = 100 GeV/c (b), based on 11 fb−1 of 8 TeV data collected by CMS [159]. Different contributions are shown with markers of different colors, and the total uncertainty is shown as a gray band. All jets are reconstructed with the anti-kT algorithm with R = 0.5.

3.4 Jet Identification

The 2012 LHC run at 8 TeV had an average pile-up rate of 21 additional collisions, with some events exhibiting well over 40 pile-up collisions, as shown in Figure 2.6.

In the current CMS detector, some of the subdetectors also read data in an extended window around the time of the current collision. This allows pile-up from both previous and following proton bunches to affect the reconstructed event. This effect is known as out-of-time pile-up (as opposed to in-time pile-up). The influence of out-of-time pile-up on the event is much smaller and both effects are combined and referred to generically as pile-up.

Because pile-up jets originate primarily from the overlap of jets produced in pile-up interactions, they exhibit two characteristic features: they are diffuse and, where charged particle identification is possible, some fraction of their charged particles does not point to the primary vertex. These characteristics allow pile-up jets to be identified both in regions where charged particle tracking is available and in regions where only jet shape identification is possible, by studying a set of variables used for jet identification (jet ID) [162]:


• vertexing related variables: charged PF candidates, with associated tracks, contribute roughly half of the total pile-up; the other half originates from neutral candidates. Inside or near the Tracker volume, the discrimination against pile-up is distinctly enhanced by exploiting the compatibility of the jet tracks with the primary vertex (PV). To perform the jet identification, a variable β∗ is defined as the sum of the pT of all charged PF candidates associated to another PV divided by the sum of the pT of all charged candidates in the jet:

\[ \beta^{*} = \frac{\sum_{i \in \mathrm{other\,PV}} p_{T,i}}{\sum_{i} p_{T,i}} ; \qquad (3.17) \]

• shape related variables: outside the Tracker volume the use of vertexing is not possible; thus jet shower shapes are the only handle to distinguish pile-up jets. Since overlapping pile-up jets characteristically tend to result in wider jets, shape related variables are designed precisely to target the diffuseness of a jet. To perform the jet identification, a single radial variable is defined as:

\[ \langle \Delta R^{2} \rangle = \frac{\sum_{i} \Delta R_{i}^{2} \, p_{T,i}^{2}}{\sum_{i} p_{T,i}^{2}} , \qquad (3.18) \]

where the sum runs over all PF candidates inside the jet and $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ is the distance of the PF candidate with respect to the jet axis.
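The two jet ID variables of Eqs. (3.17) and (3.18) can be sketched for a toy jet as follows. The `PFCandidate` class, its `from_other_pv` flag and the candidate values are hypothetical and introduced only for this illustration; they do not correspond to the CMS event data model. The azimuthal difference is wrapped into [−π, π), which is the usual convention when computing ∆R.

```python
import math
from dataclasses import dataclass


@dataclass
class PFCandidate:
    pt: float                    # transverse momentum in GeV/c
    eta: float
    phi: float                   # azimuthal angle in rad
    charged: bool
    from_other_pv: bool = False  # hypothetical flag: track associated to another PV


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def beta_star(candidates):
    """Eq. (3.17): pT fraction of charged candidates coming from other PVs."""
    charged = [c for c in candidates if c.charged]
    total = sum(c.pt for c in charged)
    if total == 0.0:
        return None  # undefined for jets without charged candidates
    return sum(c.pt for c in charged if c.from_other_pv) / total


def mean_delta_r2(candidates, jet_eta, jet_phi):
    """Eq. (3.18): pT^2-weighted mean squared distance from the jet axis."""
    num = 0.0
    den = 0.0
    for c in candidates:
        dr2 = (c.eta - jet_eta) ** 2 + delta_phi(c.phi, jet_phi) ** 2
        num += dr2 * c.pt ** 2
        den += c.pt ** 2
    return num / den if den > 0.0 else None


# Toy jet with axis at (eta, phi) = (0, 0); values are illustrative only.
jet = [
    PFCandidate(pt=30.0, eta=0.1, phi=0.0, charged=True),
    PFCandidate(pt=10.0, eta=0.0, phi=0.2, charged=True, from_other_pv=True),
    PFCandidate(pt=20.0, eta=0.0, phi=0.0, charged=False),
]

print("beta* =", beta_star(jet))
print("<dR^2> =", mean_delta_r2(jet, 0.0, 0.0))
```

Neutral candidates enter only the shape variable, which reflects the statement that vertexing information is available for charged candidates alone.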

In Figure 3.19 the behaviour of the jet ID vertexing related variable β∗ and of the jet ID shape related variable ⟨∆R2⟩ is shown for a study performed on Z(→ µµ) + jets events for jets with pT > 25 GeV/c in the Tracker acceptance |η| < 2.5. Values are based on 20 fb−1 of data collected at a center-of-mass energy √s = 8 TeV [162].

[Figure 3.19 panels: distributions of β∗ (a) and of ⟨∆R2⟩ (b), with events per bin on the vertical axis; legend entries: Data, All, Gluon, Quark, PU, Real Jet; CMS Preliminary, √s = 8 TeV, L = 20 fb−1.]

Figure 3.19: Jet ID vertexing related variable β∗ (a) and jet ID shape related variable ⟨∆R2⟩ (b) in Z(→ µµ) + jets events for jets with pT > 25 GeV/c in the Tracker acceptance |η| < 2.5. Values are based on 20 fb−1 of 8 TeV data samples [162].

As shown in the figure, PU jets tend to be at high values of β∗, close to 1, and they have smaller values and a flatter profile in the ⟨∆R2⟩ distribution: therefore, they can be well separated from QCD jets.


Chapter 4 Datasets and Monte Carlo Simulation Samples

In this chapter, the data and Monte Carlo simulation samples used in the analysis are described, specifying their general features and the specific requirements applied to perform the analysis, including the trigger. At the end of the chapter, a series of preselection schemes on photons and jets, reconstructed as discussed in Chapter 3, are described before the application of the selections in the analysis.

4.1 Datasets

In this section, the characteristics of the data samples used in the analysis are described in detail.

4.1.1 Integrated Luminosity

As discussed in Section 2.1, the instantaneous luminosity $\mathcal{L}$ measures the mean number of collisions per second, whereas the integrated luminosity $L = \int \mathcal{L}\,dt$ is its integral over time and is representative of the amount of data collected. The delivered luminosity count starts from the declaration of stable beams and ends when the LHC operators request the CMS operators to turn off the sensitive detectors to allow a beam dump or beam studies.

The luminosity is determined from the counting rates measured by the luminosity detectors. In CMS two detectors are exploited for the luminosity measurement: the forward hadronic calorimeter HF, featuring a dedicated high rate acquisition system independent from the central data acquisition system and capable of estimating the luminosity per bunch [163], and the silicon pixel detector [164], characterized by very low occupancy and excellent stability over time. Thanks to its very small dependence on experimental conditions, the counting of the pixel clusters gives a precise offline luminosity measurement. The HF measurement benefits from a smaller statistical uncertainty and is useful for cross checks or systematic studies. The luminosity detectors have been calibrated with the van der Meer beam-separation method [165], in which the two beams are scanned against each other in the horizontal and vertical planes to measure their overlap function.