A search for the standard model Higgs boson decaying to charm quarks


JHEP03(2020)131

Published for SISSA by Springer

Received: December 3, 2019 Accepted: February 25, 2020 Published: March 24, 2020


The CMS collaboration

Abstract: A direct search for the standard model Higgs boson, H, produced in association with a vector boson, V (W or Z), and decaying to a charm quark pair is presented. The search uses a data set of proton-proton collisions corresponding to an integrated luminosity of 35.9 fb⁻¹, collected by the CMS experiment at the LHC in 2016, at a centre-of-mass energy of 13 TeV. The search is carried out in mutually exclusive channels targeting specific decays of the vector bosons: W → ℓν, Z → ℓℓ, and Z → νν, where ℓ is an electron or a muon. To fully exploit the topology of the H boson decay, two strategies are followed. In the first one, targeting lower vector boson transverse momentum, the H boson candidate is reconstructed via two resolved jets arising from the two charm quarks from the H boson decay. A second strategy identifies the case where the two charm quark jets from the H boson decay merge to form a single jet, which generally only occurs when the vector boson has higher transverse momentum. Both strategies make use of novel methods for charm jet identification, while jet substructure techniques are also exploited to suppress the background in the merged-jet topology. The two analyses are combined to yield a 95% confidence level observed (expected) upper limit on the cross section σ(VH) B(H → cc̄) of 4.5 (2.4^{+1.0}_{-0.7}) pb, corresponding to 70 (37) times the standard model prediction.

Keywords: Hadron-Hadron scattering (experiments), Higgs physics, Charm physics


Contents

1 Introduction

2 The CMS detector

3 Simulated event samples

4 Event reconstruction and selection

4.1 Baseline selection

5 Resolved-jet topology analysis

5.1 Higgs boson reconstruction

5.2 Signal extraction

6 Merged-jet topology analysis

6.1 Higgs boson reconstruction

6.2 Signal extraction

7 Systematic uncertainties

8 Results

8.1 Resolved-jet topology

8.2 Merged-jet topology

8.3 Combination

9 Summary

The CMS collaboration

1 Introduction

The discovery of a Higgs boson, H, by both the ATLAS [1] and CMS [2,3] experiments in 2012, using the CERN LHC data collected in 2010–2012, represented a major step toward the characterisation of the electroweak symmetry breaking mechanism [4–6]. The mass of this particle is measured to be m_H ∼ 125 GeV [7–9] and its decays in the γγ, ZZ, WW, and ττ modes have been observed [10–20]. All properties measured so far [7,8,21–29] indicate that, within the measurement uncertainties, this new particle is consistent with the expectations of the standard model (SM). Nevertheless, there remains much to be learned about the properties of this new particle. One of the highest priorities of the LHC physics program is the measurement of the couplings of the H boson to other SM particles. Recently both the ATLAS and CMS Collaborations reported the first direct measurements of the H boson couplings to third-generation quarks (t and b) [30–33] and found them to be also compatible with the SM prediction. A measurement of the couplings of the H boson to second-generation leptons [34,35] and quarks is the next target.

In this paper, we focus on the search for H bosons decaying to cc̄, a charm quark-antiquark pair. The H boson to charm quark Yukawa coupling y_c can be significantly modified by physics beyond the SM [36–40]. In the absence of an observation of Higgs decays to charm quarks, one can place a bound on the charm quark Yukawa coupling. The first direct bound on κ_c ≡ y_c/y_c^SM of 234 at 95% confidence level (CL) has been obtained by recasting the ATLAS and CMS 8 TeV H → bb̄ searches [41] in a model-independent way. Indirect constraints on y_c obtained from a global fit to existing H boson data result in an upper bound on κ_c of 6.2 [41] at 95% CL, assuming the absence of non-SM production mechanisms. A direct measurement of this process is extremely challenging at a hadron collider. The branching fraction of this process according to SM computations, B(H → cc̄) = 0.0288^{+0.0016}_{-0.0006} [42], is a factor of 20 smaller than that of H → bb̄, and there is a very large background from SM processes composed uniquely of jets produced through the strong interaction, referred to as quantum chromodynamics (QCD) multijet events. Results from direct searches for H → cc̄ at the LHC in the ZH (Z → ℓℓ, ℓ = e or µ) channel were previously reported by the ATLAS Collaboration using a data sample of proton-proton (pp) collisions at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 36.1 fb⁻¹ [43]. The observed (expected) exclusion limit on the signal strength µ (defined as the product of the measured H boson production cross section and the H → cc̄ branching fraction divided by the same quantity as predicted by the SM) at 95% CL was found to be 110 (150).
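The signal strength definition above amounts to a simple ratio, which can be sketched as follows. The function name is illustrative, and the SM reference value used here is an assumption: it is only inferred from the abstract's statement that a limit of 4.5 pb corresponds to about 70 times the SM prediction, and is not an official cross section.

```python
def signal_strength(measured_xsec_times_bf_pb, sm_xsec_times_bf_pb):
    """Return mu = (sigma x B)_measured / (sigma x B)_SM."""
    if sm_xsec_times_bf_pb <= 0:
        raise ValueError("SM prediction must be positive")
    return measured_xsec_times_bf_pb / sm_xsec_times_bf_pb


# Assumption: SM value of sigma(VH) x B(H -> cc) inferred from "4.5 pb ~ 70 x SM"
# in the abstract; this is an illustrative number, not an official prediction.
SM_VH_CC_PB = 4.5 / 70.0

observed_limit_mu = signal_strength(4.5, SM_VH_CC_PB)  # observed limit in units of SM
expected_limit_mu = signal_strength(2.4, SM_VH_CC_PB)  # expected limit in units of SM
```

With these inputs the expected limit comes out close to the value of 37 quoted in the abstract, which serves as a rough consistency check of the rounding.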

This paper presents the first direct search for the H → cc̄ decay carried out by the CMS Collaboration. It uses pp collision data corresponding to an integrated luminosity of 35.9 fb⁻¹, collected with the CMS experiment at the LHC in 2016 at a centre-of-mass energy of 13 TeV. The search targets H bosons produced in association with a W or Z boson, which we collectively refer to as vector (V) bosons. The presence of a V boson greatly suppresses backgrounds stemming from otherwise overwhelming QCD multijet processes, and its leptonic decays provide a crucial handle to collect the events efficiently. The most significant remaining backgrounds arise from V+jets (processes that account for one or more jets recoiling against a vector boson), tt̄, and VH(H → bb̄) processes. To fully explore the H → cc̄ decay mode, the analysis is split into two separate searches involving different topologies: the “resolved-jet” topology, in which the H boson candidate is reconstructed from two well-separated and individually resolved charm quark jets, and the “merged-jet” topology, in which the hadronisation products of the two charm quarks are reconstructed as a single jet. The former focuses on H boson candidates with lower transverse momentum, p_T, while the latter performs better for H boson candidates with high p_T. In practice, the two topologies can have significant overlap and so, for the final result, the two are made distinct by defining them in reference to whether the V boson in the event has p_T(V) below or above a threshold value.

The central feature of this search is the identification of charm quark jets. In both topologies, novel tools based upon advanced machine learning (ML) techniques are used for charm quark jet identification [44,45]. In addition, the merged-jet topology makes use of jet substructure information to further suppress the backgrounds.

2 The CMS detector

The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are detected in gas-ionisation chambers embedded in the steel flux-return yoke outside the solenoid.

In the barrel section of the ECAL, an energy resolution of about 1% is achieved for unconverted or late-converting photons that have energies in the range of tens of GeV. The remaining barrel photons have a resolution of about 1.3% up to |η| = 1, rising to about 2.5% at |η| = 1.4. In the endcaps, the resolution of unconverted or late-converting photons is about 2.5%, while other endcap photons have a resolution between 3 and 4% [46].

In the region |η| < 1.74, the HCAL cells have widths of 0.087 in η and 0.087 rad in azimuth (φ). In the η-φ plane, and for |η| < 1.48, the HCAL cells map on to 5×5 arrays of ECAL crystals to form calorimeter towers projecting radially outwards from close to the nominal interaction point. For |η| > 1.74, the coverage of the towers increases progressively to a maximum of 0.174 in ∆η and ∆φ. Within each tower, the energy deposits in ECAL and HCAL cells are summed to define the calorimeter tower energies.

Muons are measured in the range |η| < 2.4, with detection planes made using three technologies: drift tubes, cathode strip chambers, and resistive-plate chambers. The efficiency to reconstruct and identify muons is greater than 96%. Matching muons to tracks measured in the silicon tracker results in a relative p_T resolution, for muons with p_T up to 100 GeV, of 1% in the barrel and 3% in the endcaps. The p_T resolution in the barrel is better than 7% for muons with p_T up to 1 TeV [47].

Events of interest are selected using a two-tiered trigger system [48]. The first level, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select events at a rate of around 100 kHz within a fixed time interval of less than 4 µs. The second level, known as the high-level trigger (HLT), consists of a farm of processors running a version of the full event reconstruction software optimised for fast processing, and reduces the event rate to around 1 kHz before data storage.

A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in ref. [49].


3 Simulated event samples

Signal and background processes are simulated using various event generators, while the CMS detector response is modelled with Geant4 [50]. The quark-induced ZH and WH signal processes are generated at next-to-leading order (NLO) accuracy in QCD using the powheg v2 [51–53] event generator extended with the Multi-scale improved NLO (MiNLO) procedure [54,55], while the gluon-induced ZH process is generated at leading order (LO) accuracy with powheg v2. The H boson mass is set to 125 GeV for all signal samples. The production cross sections of the signal processes [42] are corrected as a function of p_T(V) to next-to-next-to-leading order (NNLO) QCD + NLO electroweak (EW) accuracy combining the vhnnlo [56–59], vh@nnlo [60, 61], and hawk v2.0 [62] generators as described in ref. [42].

The V+jets events are generated with MadGraph5_amc@nlo v2.4.2 [63] at NLO with up to two additional partons, and at LO accuracy with up to four additional partons. The production cross sections for the V+jets samples are scaled to the NNLO cross sections obtained using fewz 3.1 [64]. Events in both LO and NLO samples are reweighted to account for NLO EW corrections to p_T(V), which reach up to 10% for p_T(V) ≈ 400 GeV. In addition, an LO-to-NLO correction is applied to LO samples as a function of the separation in η between the two leading jets in the event [65]. The p_T(V) spectrum in simulation after the aforementioned corrections is observed to be harder than in data, as expected due to missing higher-order EW and QCD contributions to the V+jets processes [66]. A residual reweighting of p_T(V), obtained via a fit to the data-to-simulation ratio in the control regions (detailed in section 5) of the W(ℓν)H(cc̄) and Z(ℓℓ)H(cc̄) channels in the resolved analysis, is applied.

Diboson (WW, WZ and ZZ) background events are generated with MadGraph5_amc@nlo v2.4.2 [63] at NLO with up to two additional partons in the matrix element calculations. The same generator is used at LO accuracy to generate a sample of QCD multijet events. The tt̄ [67] and single top production processes in the tW- and t-channels [68,69] are generated to NLO accuracy with powheg v2, while the s-channel [70] single top process is generated with MadGraph5_amc@nlo v2.4.2. The production cross sections for the tt̄ samples are scaled to the NNLO prediction with the next-to-next-to-leading-log result obtained from Top++ v2.0 [71]. The tt̄ samples are reweighted as a function of top quark p_T to account for the known differences between data and simulation [72].

The parton distribution functions (PDF) used to produce all samples are the NNLO NNPDF3.1 set [73]. For parton showering and hadronisation, including the H → cc̄ decay, the matrix element generators are interfaced with pythia v8.230 [74] with the CUETP8M1 [75] underlying event tune. The matching of jets from matrix element calculations and those from parton shower is done with the FxFx [76] (MLM [77]) prescription for NLO (LO) samples. For all samples, simulated additional pp interactions in the same or adjacent bunch crossings (pileup) are added to the hard-scattering process. The events are then reweighted to match the pileup profile observed in the collected data.
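The pileup reweighting mentioned above can be illustrated with a minimal sketch. It assumes that both profiles are histograms binned in the number of pileup interactions and that the weight in each bin is the ratio of normalised data to normalised simulation; the function name and toy numbers are hypothetical, and the actual CMS procedure is not specified in the text.

```python
def pileup_weights(data_profile, mc_profile):
    """Per-bin weights: normalised data fraction divided by normalised MC fraction."""
    if len(data_profile) != len(mc_profile):
        raise ValueError("profiles must share the same binning")
    data_total = sum(data_profile)
    mc_total = sum(mc_profile)
    weights = []
    for d, m in zip(data_profile, mc_profile):
        # Bins without simulated events cannot be reweighted; assign zero weight.
        weights.append((d / data_total) / (m / mc_total) if m > 0 else 0.0)
    return weights


# Toy profiles (assumed numbers, four pileup bins).
data = [10.0, 30.0, 40.0, 20.0]
mc = [25.0, 25.0, 25.0, 25.0]
weights = pileup_weights(data, mc)

# After reweighting, the simulated profile has the same shape as the data profile.
reweighted_mc = [w * m for w, m in zip(weights, mc)]
```

By construction the reweighting changes the shape of the simulated profile but not its overall normalisation.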


4 Event reconstruction and selection

Events are reconstructed using the CMS particle-flow (PF) algorithm [78], which seeks to reconstruct and identify the individual particles in the event via an optimal combination of all information in the CMS detector. The reconstructed particles are identified as charged or neutral hadrons, electrons, muons, or photons, and constitute a list of PF candidate physics objects. At least one reconstructed vertex is required. In the case of multiple collision vertices from pileup interactions, the candidate vertex with the largest value of summed physics-object p_T^2 is taken to be the primary pp interaction vertex (PV). The physics objects are the jets, clustered using the jet finding algorithm [79,80] with the tracks assigned to candidate vertices as inputs, and the associated missing transverse momentum, taken as the negative vector sum of the p_T of those jets. Events affected by reconstruction failures, detector malfunctions, or noncollision backgrounds are identified and rejected by dedicated filters [81].
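The primary vertex choice described above can be sketched as a maximisation of the summed p_T^2 over candidate vertices. The data layout (a list of dictionaries with an `object_pts` field) and the toy values are assumptions made for illustration only.

```python
def select_primary_vertex(vertices):
    """Return the vertex with the largest sum of squared physics-object pT."""
    if not vertices:
        raise ValueError("at least one reconstructed vertex is required")
    return max(vertices, key=lambda v: sum(pt ** 2 for pt in v["object_pts"]))


# Toy event with three candidate vertices (pT values in GeV, assumed numbers).
vertices = [
    {"id": 0, "object_pts": [5.0, 4.0, 3.0]},     # sum pT^2 = 50
    {"id": 1, "object_pts": [40.0, 35.0, 10.0]},  # sum pT^2 = 2925
    {"id": 2, "object_pts": [30.0, 30.0, 30.0]},  # sum pT^2 = 2700
]
primary_vertex = select_primary_vertex(vertices)
```

Squaring the p_T favours vertices with a few hard objects over vertices with many soft ones, which is why vertex 1 is preferred over vertex 2 in the toy event.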

Electrons are reconstructed by combining information from the tracker and energy deposits in the ECAL [82]. Muons are reconstructed by combining information from the tracker and the muon system [47]. Only tracks originating from the PV can be associated with the electrons or muons, and quality criteria [47,82] are further imposed to obtain a purer identification without substantial loss of efficiency. To suppress leptons stemming from b and c decays, while retaining leptons from V decays, isolation is required from jet activity within a cone of radius ∆R = \sqrt{(∆η)^2 + (∆φ)^2} = 0.3. The isolation is defined as the scalar p_T sum of the PF candidates within the cone divided by the lepton p_T. The upper threshold applied on the relative isolation is 0.06 for electrons and muons in the W(ℓν)H(cc̄) channel and 0.15 and 0.25 for electrons and muons, respectively, in the Z(ℓℓ)H(cc̄) channel. Charged PF candidates not originating from the PV, as well as PF candidates identified as electrons or muons, are not considered in the sum [83]. The isolation of electrons and muons is also corrected for the estimated energy that is contributed to the isolation region by neutral particles originating from pileup. In the case of electrons, the latter is estimated by an effective jet area from the measured neutral energy density [82], while for muons, the ∆β-correction method [47] is applied.
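The relative isolation defined above can be sketched as follows. This is a simplified illustration: the candidate list is assumed to be already cleaned of pileup charged candidates and of identified electrons and muons, the pileup correction for neutral particles is omitted, and all names and toy values are hypothetical.

```python
import math


def delta_phi(phi1, phi2):
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance Delta R = sqrt(Delta eta^2 + Delta phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def relative_isolation(lepton, candidates, cone=0.3):
    """Scalar pT sum of candidates inside the cone, divided by the lepton pT."""
    sum_pt = sum(
        c["pt"]
        for c in candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )
    return sum_pt / lepton["pt"]


# Upper thresholds quoted in the text, keyed by channel and lepton flavour.
ISO_MAX = {"1L": {"e": 0.06, "mu": 0.06}, "2L": {"e": 0.15, "mu": 0.25}}

# Toy muon close to the phi = +-pi boundary, to exercise the wrapping (assumed numbers).
muon = {"pt": 50.0, "eta": 0.0, "phi": 3.1}
pf_candidates = [
    {"pt": 2.0, "eta": 0.1, "phi": -3.1},  # inside the cone after phi wrapping
    {"pt": 5.0, "eta": 1.0, "phi": 3.0},   # outside the cone
    {"pt": 0.5, "eta": -0.2, "phi": 3.0},  # inside the cone
]
iso = relative_isolation(muon, pf_candidates)
passes_1l = iso < ISO_MAX["1L"]["mu"]
```

The wrapping of ∆φ matters for leptons near φ = ±π, where a naive difference would place a nearby candidate far outside the cone.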

Jets are reconstructed by clustering the PF candidates with the anti-k_T algorithm [79,80] using a distance parameter R. The jet momentum is determined as the vectorial sum of all PF candidate momenta in the jet, and is found in simulation to be within 5 to 10% of the true momentum over the full detector acceptance and range of p_T considered in this analysis. The raw jet energies are then corrected to establish a uniform response of the calorimeter in η and a calibrated absolute response in p_T. Additional corrections to account for any residual differences between the jet energy scale in data and simulation are extracted and applied based on comparison of data and simulated samples in relevant control regions [84]. The jet energy resolution typically amounts to 15–20% at 30 GeV, about 10% at 100 GeV, and 5% at 1 TeV [84]. Corrections extracted from data control regions are applied to account for the difference between the jet energy resolution in data and simulation. Additional selection criteria are applied to each jet to remove those that are potentially dominated by instrumental or reconstruction failures [85].


Two collections of jets reconstructed with the anti-k_T algorithm are used in the search. The first consists of jets clustered with R = 0.4, and will be referred to as “small-R jets”. The charged hadron subtraction algorithm [86] is used to eliminate PF candidates from the jet constituents associated with vertices from pileup interactions. The neutral component of the energy arising from pileup interactions is estimated with the effective area method [85]. The small-R jets are required to have p_T > 20 GeV and to be within the tracker acceptance, |η| < 2.4. Any small-R jets that overlap with preselected electrons and muons, as defined by ∆R(j, ℓ) < 0.4, are discarded.

The second jet collection is based on jets reconstructed using R = 1.5. This collection will be referred to as “large-R jets” in what follows. In this case, the PUPPI algorithm [87] is used to correct the jet energy for contributions coming from pileup. Additional information on jet substructure is obtained by reclustering the constituents of these jets via the Cambridge-Aachen algorithm [88]. The “modified mass drop tagger” algorithm [89,90], also known as the “soft-drop” (SD) algorithm, with angular exponent β = 0, soft cutoff threshold z_cut = 0.1, and characteristic radius R_0 = 1.5 [91], is applied to remove soft, wide-angle radiation from the jet. In the default configuration, the SD algorithm identifies two hard subjets within the large-R jet by reversing the Cambridge-Aachen clustering history. The kinematic variables of the two subjets are used to calculate the 4-momentum of the large-R jet. The large-R jets are required to have |η| < 2.4 and a soft drop mass of 50 < m_SD < 200 GeV. Large-R jets that overlap with preselected electrons and muons, as defined by ∆R(j, ℓ) < 1.5, are discarded.
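The soft-drop parameters quoted above enter through a single condition tested at each declustering step. The sketch below shows only that condition, in its commonly used form min(p_T1, p_T2)/(p_T1 + p_T2) > z_cut (∆R_12/R_0)^β, and not the full iterative declustering; the function name and toy subjet values are assumptions.

```python
def passes_soft_drop(pt1, pt2, delta_r12, z_cut=0.1, beta=0.0, r0=1.5):
    """Soft-drop condition for one declustering step of two subjets."""
    z = min(pt1, pt2) / (pt1 + pt2)  # momentum fraction carried by the softer subjet
    return z > z_cut * (delta_r12 / r0) ** beta


# With beta = 0 the angular factor is 1, so the test reduces to z > z_cut.
balanced_split = passes_soft_drop(100.0, 20.0, 0.8)  # z = 1/6
soft_emission = passes_soft_drop(100.0, 5.0, 0.8)    # z = 1/21
```

In the full algorithm, a step that fails the condition has its softer branch dropped and the declustering continues on the harder branch, while the first step that passes defines the two subjets.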

The missing transverse momentum vector \vec{p}_T^miss is computed as the negative vector p_T sum of all the PF candidates in an event, and its magnitude is denoted p_T^miss [81]. The magnitude and direction of \vec{p}_T^miss are modified to account for corrections to the energy scale of the reconstructed jets in the event.
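The definition of the missing transverse momentum as a negative vector sum can be illustrated with a short sketch. The jet energy scale propagation mentioned in the text is not included, and the candidate layout and toy values are assumptions.

```python
import math


def missing_pt(candidates):
    """Return (magnitude, phi) of the negative vector pT sum of the candidates."""
    px = -sum(c["pt"] * math.cos(c["phi"]) for c in candidates)
    py = -sum(c["pt"] * math.sin(c["phi"]) for c in candidates)
    return math.hypot(px, py), math.atan2(py, px)


# Toy event: 100 GeV along phi = 0 and 60 GeV in the opposite direction (assumed numbers).
pf_candidates = [{"pt": 100.0, "phi": 0.0}, {"pt": 60.0, "phi": math.pi}]
met, met_phi = missing_pt(pf_candidates)
```

The 40 GeV imbalance points opposite to the harder candidate, so the resulting azimuthal angle has magnitude π.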

One of the most challenging tasks of this analysis is the discrimination of jets that are the result of the hadronisation of c quarks from all other jet flavours. Tagging c jets is more difficult than tagging b jets because they are less distinct from light-flavour quark or gluon jets (udsg) in regard to mass, decay length of charmed hadrons produced in the hadronisation process, and multiplicity of tracks inside the jet. The resolved- and merged-jet topology analyses use different strategies for tagging c jets. More details on c tagging are presented below in sections 5 and 6.

4.1 Baseline selection

The search uses the leptonic decays of the vector bosons to define three mutually exclusive channels based on the charged-lepton multiplicity in the final state: the “0L” channel, targeting the Z(νν)H(cc̄) signal process; the “1L” channel, targeting the W(ℓν)H(cc̄) signal process; and the “2L” channel, targeting the Z(ℓℓ)H(cc̄) signal process. The 1L and 2L channels are further subdivided based on lepton flavour. Only electrons and muons are considered in this search.

Events in the 0L channel are collected with a trigger requiring either p_T^miss above 170 GeV, or p_T^miss above 110 GeV together with an additional threshold of 110 GeV on the missing hadronic transverse energy. Events in the 1L channel are obtained with a trigger requiring the presence of an isolated electron or muon with p_T above 27 and 24 GeV, respectively. Events in the 2L channel of the resolved-jet topology analysis are selected by triggers that require the presence of a pair of leptons with p_T larger than 23 and 12 GeV for electrons, and 17 and 8 GeV for muons. The same dielectron trigger has been used in the 2L Z(ee) channel of the merged-jet topology analysis, while events in the Z(µµ) channel are selected by the single-muon trigger described above, which provides high efficiency for muons produced in the decays of high-p_T bosons.

The collected events are required to pass additional offline criteria. In the 0L channel corresponding to Z boson decays to neutrinos, p_T^miss > 170 GeV is required and events with identified isolated leptons are rejected. The \vec{p}_T^miss is taken to correspond to \vec{p}_T(V) in this case. Events with a single electron (muon) with p_T > 30 (25) GeV pass the 1L selection. The leptonically decaying W boson is approximately reconstructed as the vectorial sum of the lepton momentum and \vec{p}_T^miss. The event topology is required to be compatible with the leptonic decay of a Lorentz-boosted W boson by requiring ∆φ(\vec{p}_T^miss, ℓ) < 2.0 (1.5) in the resolved-jet (merged-jet) topology analysis. Finally, for the 2L selection, the two highest p_T leptons are required to be of the same flavour, opposite electric charge, and to have a p_T above 20 GeV. The Z boson candidates are then reconstructed as the sum of the four-momenta of these two leptons, and the invariant mass of the candidates is required to be compatible with the Z boson mass (75 < m_ℓℓ < 105 GeV).
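The 2L requirements above (same flavour, opposite charge, p_T above 20 GeV, and a dilepton mass within the Z window) can be sketched as follows. Lepton masses are neglected, which is an assumption of this sketch, and the function names and toy leptons are hypothetical.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Convert (pT, eta, phi, m) to (E, px, py, pz); the mass defaults to zero."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def z_candidate_mass(lep1, lep2, pt_min=20.0, m_low=75.0, m_high=105.0):
    """Return the dilepton mass if the pair passes the 2L selection, else None."""
    if lep1["flavour"] != lep2["flavour"]:
        return None
    if lep1["charge"] * lep2["charge"] >= 0:
        return None
    if min(lep1["pt"], lep2["pt"]) <= pt_min:
        return None
    mass = invariant_mass(
        four_vector(lep1["pt"], lep1["eta"], lep1["phi"]),
        four_vector(lep2["pt"], lep2["eta"], lep2["phi"]),
    )
    return mass if m_low < mass < m_high else None


# Toy back-to-back muon pair at eta = 0 (assumed numbers).
mu_plus = {"flavour": "mu", "charge": +1, "pt": 45.6, "eta": 0.0, "phi": 0.0}
mu_minus = {"flavour": "mu", "charge": -1, "pt": 45.6, "eta": 0.0, "phi": math.pi}
m_ll = z_candidate_mass(mu_plus, mu_minus)
same_sign = z_candidate_mass(mu_plus, dict(mu_minus, charge=+1))
```

For massless back-to-back leptons at η = 0 the invariant mass is simply twice the lepton p_T, which makes the toy pair easy to check by hand.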

A typical VH(H → cc̄) event has the signature of a vector boson recoiling against a H boson with little additional activity. The event selection is designed to retain such events while suppressing background processes as much as possible. In addition to the requirement of a high-p_T vector boson, the QCD multijet background is reduced to negligible levels by demanding that \vec{p}_T^miss not be aligned with any jet in the event and by requiring the azimuthal angular separation ∆φ(\vec{p}_{T,trk}^miss, \vec{p}_T^miss) < 0.5, where \vec{p}_{T,trk}^miss is calculated solely from charged particles. This latter selection reduces the contribution of QCD multijet events that arise from the presence of “fake” \vec{p}_T^miss coming from jet energy mismeasurement in the calorimeters. A significant fraction of the tt̄ background is suppressed by rejecting events with N^aj_small-R > 1 in the 0L and 1L channels, and N^aj_small-R > 2 in the 2L channel of the merged-jet topology analysis, where N^aj_small-R represents the additional small-R jet multiplicity. This requirement is not needed in the 2L channel of the resolved-jet topology analysis where the top quark background is negligible.

The dominant background that remains after the application of the event selection described above is V+jets. The contribution from this background is suppressed by requiring the dijet invariant mass m_jj (calculated using two small-R jets) in the resolved-jet topology analysis and m_SD in the merged-jet topology analysis to satisfy 50 < m_jj (m_SD) < 200 GeV. Contributions from tt̄ and single top processes remain significant in the 0L and 1L channels because of the presence of at least one W boson and because b quarks are often misidentified as c quarks by the c tagging algorithms. Contributions from diboson processes are typically small as a result of their small production cross sections. The background originating from H bosons decaying into b quarks presents kinematic properties similar to those of the signal, with the exception of a higher average energy of neutrinos in b jets than in c jets. This background is reduced by exploiting dedicated jet flavour taggers, as described in sections 5 and 6.


The details of the resolved-jet topology and merged-jet topology analyses are described in sections 5 and 6, respectively. Section 7 is dedicated to the treatment of the systematic uncertainties and section 8 presents the results of the two analyses and of their combination. Section 8 also presents the strategy that is used to make the two analyses mutually exclusive in order to facilitate their combination for the final results.

5 Resolved-jet topology analysis

Approximately 95% of the VH events produced at √s = 13 TeV have a vector boson with p_T lower than 200 GeV, corresponding to the phase space region where the H boson decay products generally give rise to two distinctly reconstructed small-R jets in the CMS detector. The resolved-jet analysis aims to exploit a large fraction of this phase space which, however, contains a sizeable background contamination. The requirement of a moderate boost of the vector boson is then found to be crucial to the reduction of V+jets and tt̄ backgrounds. Dedicated charm taggers based on ML are used to order the jets in the event by their likelihood of being c jets when selecting those used to reconstruct the H boson candidate. Backgrounds arise from the production of W and Z bosons in association with one or more jets, single and pair-produced top quarks, and diboson events. A small residual QCD background is present in the 0L and 1L channels. High-purity control regions for the V+udsg and tt̄ backgrounds are identified in data and used to estimate expected yields of these backgrounds in the signal region. Samples of events in regions that are disjoint from the signal region in c tagging probability and dijet mass but which are enhanced in V+bb̄ and V+cc̄ production are used to provide data-driven constraints on the V+bb̄ and V+cc̄ backgrounds, respectively. Finally, a binned maximum likelihood fit is carried out simultaneously in the signal region and in the control regions for all channels.

5.1 Higgs boson reconstruction

The H candidate is reconstructed as two distinct small-R jets. The identification of c jets among those arising from other flavours of quarks or gluons is achieved with the Deep Combined Secondary Vertex (DeepCSV) algorithm [44]. This algorithm encodes a multi-classifier based on advanced ML techniques and provides three output weights p(b), p(c), and p(udsg) which can be interpreted as the probabilities for a given jet to have originated from a bottom quark, a charm quark, or a gluon or light-flavour quark, respectively. By combining the various DeepCSV outputs, it is possible to define two discriminators for c tagging. The inputs to the DeepCSV algorithm are variables constructed from observables associated with the reconstructed primary and secondary vertices, tracks, and jets. The discrimination between c jets and light-flavour quark or gluon jets is achieved via the probability ratio defined as CvsL = p(c)/[p(c) + p(udsg)]. In the same way, discrimination between c jets and b jets makes use of the probability ratio defined as CvsB = p(c)/[p(c) + p(b)]. The two discriminator ratio values for each jet define a two-dimensional distribution. The resulting c tagging efficiency as a function of the b jet and light-flavour quark or gluon jet efficiencies is shown in figure 1. To account for residual O(10%) differences in the distributions of CvsL and CvsB found in the comparison of data and simulation, reshaping scale factors have been extracted using an iterative fit to distributions in control regions enriched in Drell-Yan+jets, semileptonic tt̄+jets, and W+c events that provide data samples with large fractions of light-flavour quark or gluon jets, b jets, and c jets, respectively. The corresponding uncertainties, evaluated on a per jet basis as a function of the jet flavour, range from 2% for bottom, gluon, and light-flavoured quark jets to 5% for c jets.

[Figure 1 image: CMS Simulation, DeepCSV, 2016 (13 TeV). Axes: b jet efficiency and light-flavour or gluon jet efficiency, both from 10^-2 to 10^0; colour scale: c jet efficiency; marker: VH(H → cc̄) working point.]

Figure 1. Efficiency to tag a c jet as a function of the b jet and light-flavour quark or gluon jet mistag rates. The working point adopted in the resolved-jet topology analysis to select the leading CvsL jets is shown with a white cross. The white lines correspond to c jet iso-efficiency curves. The plot makes use of jets with p_T > 20 GeV that have been clustered with the AK4 algorithm in a simulated tt̄+jets sample before application of data-to-simulation reshaping scale factors.
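The two probability ratios CvsL and CvsB defined in this section can be computed directly from the three tagger outputs. The sketch below is illustrative: the function name, the handling of a vanishing denominator, and the toy probabilities are assumptions and are not taken from the DeepCSV implementation.

```python
def c_tag_discriminators(p_b, p_c, p_udsg):
    """Return (CvsL, CvsB) from the three flavour probabilities of a jet."""
    denom_l = p_c + p_udsg
    denom_b = p_c + p_b
    # Assumption: a jet with a vanishing denominator is given a discriminator of 0.
    cvsl = p_c / denom_l if denom_l > 0 else 0.0
    cvsb = p_c / denom_b if denom_b > 0 else 0.0
    return cvsl, cvsb


# Toy jet with a dominant charm probability (assumed numbers).
cvsl, cvsb = c_tag_discriminators(p_b=0.1, p_c=0.6, p_udsg=0.3)
```

Both ratios lie between 0 and 1 and approach 1 for charm-like jets, so each jet corresponds to a point in the unit square of the two-dimensional (CvsL, CvsB) plane.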

The probability ratios CvsL and CvsB are used to discriminate candidates that are consistent with the c jet hypothesis from jets originating from light-flavour quarks or gluons, and b quarks, respectively. The two jets with the highest score of CvsL in the event are chosen to build the H candidate four-vector. Events are required to have the leading jet in CvsL score passing the c tagger working point requirements (CvsL > 0.4, CvsB > 0.2). This working point has been chosen such that the efficiency to identify a c jet is approximately 28%, while the misidentification rate is 4% for light-flavour quark or gluon jets and 15% for b jets. The misidentification rate of τ leptons as c jets is larger than the misidentification rate of b jets as c jets. However, the kinematic properties of the τ decays are exploited by the BDTs, as described in section 5.2, to discriminate signal c jet pairs from misidentified jet pairs coming from τ leptons. The contribution of the VH(H → ττ) process in the high signal purity bins of the BDT distributions has a negligible impact on the final results. To account for jets originating from final-state radiation (FSR), additional jets with p_T > 30 GeV and |η| < 3.0 are included in the calculation of the components of the H candidate four-vector if they lie in a cone of ∆R < 0.8 centred on the direction of one of the two leading jets.
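The H candidate construction described in this section (ranking by CvsL, the working point on the leading jet, and the FSR recovery) can be sketched as follows. The jet representation, function names, and toy values are assumptions; the sketch returns the selected jets rather than the summed four-vector.

```python
import math


def delta_r(j1, j2):
    """Angular distance between two jets, with the phi difference wrapped."""
    dphi = (j1["phi"] - j2["phi"] + math.pi) % (2 * math.pi) - math.pi
    return math.hypot(j1["eta"] - j2["eta"], dphi)


def build_higgs_candidate(jets, cvsl_min=0.4, cvsb_min=0.2):
    """Pick the two highest-CvsL jets and any FSR jets near them, or return None."""
    if len(jets) < 2:
        return None
    ranked = sorted(jets, key=lambda j: j["cvsl"], reverse=True)
    lead, sublead = ranked[0], ranked[1]
    # The leading-CvsL jet must pass the c tagger working point.
    if not (lead["cvsl"] > cvsl_min and lead["cvsb"] > cvsb_min):
        return None
    # FSR recovery: extra jets with pT > 30 GeV, |eta| < 3.0, within dR < 0.8.
    fsr = [
        j for j in ranked[2:]
        if j["pt"] > 30.0 and abs(j["eta"]) < 3.0
        and min(delta_r(j, lead), delta_r(j, sublead)) < 0.8
    ]
    return {"jets": [lead, sublead], "fsr": fsr}


# Toy event (assumed numbers).
jets = [
    {"name": "a", "pt": 80.0, "eta": 0.2, "phi": 0.1, "cvsl": 0.9, "cvsb": 0.5},
    {"name": "b", "pt": 60.0, "eta": -0.3, "phi": 2.9, "cvsl": 0.7, "cvsb": 0.3},
    {"name": "c", "pt": 35.0, "eta": 0.5, "phi": 0.6, "cvsl": 0.2, "cvsb": 0.1},
    {"name": "d", "pt": 25.0, "eta": 0.1, "phi": 0.3, "cvsl": 0.1, "cvsb": 0.1},
]
candidate = build_higgs_candidate(jets)
```

In the toy event, jet "c" is close enough to jet "a" to be recovered as FSR, while jet "d" is equally close but fails the 30 GeV threshold.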

5.2 Signal extraction

In addition to the selections reported in sections 4.1 and 5.1, in the 1L and 0L channels of the resolved-jet topology analysis, where larger backgrounds are expected, the V candidates are required to have p_T of at least 100 and 170 GeV, respectively, while, for the same channels, the H candidates are required to have p_T of at least 100 and 120 GeV. In the 2L channel, where the background from tt̄ production is much smaller and the effective signal cross section is also lower, two regions are considered: a low-p_T(V) region defined by 50 < p_T(V) < 150 GeV and a high-p_T(V) region with p_T(V) > 150 GeV (no upper cut is applied on p_T(V)). In VH(H → cc̄) signal events, the vector boson is typically produced in the azimuthal direction opposite to that of the H boson. Therefore, an additional requirement on the difference in azimuthal angle between the reconstructed V and H candidate, ∆φ(V, H) > 2.5 (> 2.0 in the 0L channel), is applied.
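The kinematic requirements of the resolved-jet signal regions listed above can be collected in a small categorisation sketch. The function and category names are hypothetical, and only the p_T and ∆φ(V, H) requirements from this paragraph are encoded; all other selections are assumed to have been applied already.

```python
import math


def delta_phi(phi1, phi2):
    """Absolute azimuthal separation in [0, pi]."""
    return abs((phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi)


def resolved_category(channel, pt_v, pt_h, phi_v, phi_h):
    """Return the signal-region label for an event, or None if it fails."""
    dphi_min = 2.0 if channel == "0L" else 2.5
    if delta_phi(phi_v, phi_h) <= dphi_min:
        return None
    if channel == "0L":
        return "0L" if pt_v >= 170.0 and pt_h >= 120.0 else None
    if channel == "1L":
        return "1L" if pt_v >= 100.0 and pt_h >= 100.0 else None
    if channel == "2L":
        if 50.0 < pt_v < 150.0:
            return "2L_low"
        if pt_v > 150.0:
            return "2L_high"
        # pT(V) of exactly 150 GeV is not assigned by the text and is left out here.
        return None
    raise ValueError("unknown channel: " + channel)


# Toy events (pT in GeV, angles in radians; assumed numbers).
cat_0l = resolved_category("0L", 200.0, 130.0, 0.0, 3.0)
cat_1l = resolved_category("1L", 120.0, 90.0, 0.0, 3.0)
cat_2l = resolved_category("2L", 100.0, 60.0, 0.0, 3.0)
cat_2l_dphi = resolved_category("2L", 200.0, 150.0, 0.0, 2.2)
```

Splitting the 2L channel at 150 GeV keeps the more abundant low-p_T(V) events separate from the purer high-p_T(V) events.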

In the signal regions defined by the application of the selection criteria mentioned above, a boosted decision tree (BDT) with gradient boost [92] has been trained to enhance the signal separation from background. The same simulated samples, each normalised to the cross section of the relevant physics process, have been split into two independent subsets used for training and testing the BDTs. All the data-to-simulation scale factors relative to trigger efficiency, lepton identification and isolation efficiency, and c tagging have been applied to the simulated samples. Separate BDTs have been trained for 0L, 1L, and 2L (low-p_T(V) and high-p_T(V)) channels. The muon and electron samples were combined to train the BDTs to benefit from a higher number of simulated events. The distributions of all variables used to construct the BDT discriminator and hence the BDT distribution itself are taken from simulation after the application of the corrections detailed in section 3. Table 1 lists the input variables considered in each channel. As expected, the most discriminating variables are found to be the H candidate invariant mass and the CvsL_max.

The remaining background contribution is estimated from a combination of simulated events and data. While the normalisations of QCD, single-top, diboson, and VH(H → bb) processes are estimated via simulation, the normalisations of the V+jets and tt+jets backgrounds are determined from fits to data in dedicated control regions in order to avoid potential mismodelling of the flavour composition of these samples. Four control regions per channel are designed to constrain the most important background processes: a region dominated by tt+jets events (TT), a region targeting V+jets with at least one jet originating from light-flavour quarks or gluons (LF), a region enriched in V+jets events with one b jet and one b or c jet (HF), and a region enriched with V+cc events (CC). The definitions of the different control regions are based mainly on the inversion of the criteria on the charm tagger discriminator values of the CvsL-leading jet applied to define the signal regions. To define the LF control region, the selection (CvsL < 0.4, CvsB > 0.2) is used, while both the HF and TT control regions are defined by applying the selection (CvsL > 0.4, CvsB < 0.2). In order to differentiate the TT from the HF control regions, further requirements are applied, such as a veto on the reconstructed Z boson mass in the


Variable | Description | 0L | 1L | 2L
m(H) | H mass | X | X | X
p_T(H) | H transverse momentum | X | X | X
p_T(V) | vector boson transverse momentum | X | X | X
m(V) | vector boson mass | - | - | X
m_T(V) | vector boson transverse mass | - | X | -
p_T^miss | missing transverse momentum | X | X | -
p_T(V)/p_T(H) | ratio between vector boson and H transverse momenta | X | X | X
CvsL_max | CvsL value of the leading CvsL jet | X | X | X
CvsB_max | CvsB value of the leading CvsL jet | X | X | X
CvsL_min | CvsL value of the subleading CvsL jet | X | X | X
CvsB_min | CvsB value of the subleading CvsL jet | X | X | X
p_T^max | p_T of the leading CvsL jet | X | X | X
p_T^min | p_T of the subleading CvsL jet | X | X | X
∆φ(V, H) | azimuthal angle between vector boson and H | X | X | X
∆R(j1, j2) | ∆R between leading and subleading CvsL jets | - | X | X
∆φ(j1, j2) | azimuthal angle between leading and subleading CvsL jets | X | X | -
∆η(j1, j2) | difference in pseudorapidity between leading and subleading CvsL jets | X | X | X
∆φ(ℓ1, ℓ2) | azimuthal angle between leading and subleading p_T leptons | - | - | X
∆η(ℓ1, ℓ2) | difference in pseudorapidity between leading and subleading p_T leptons | - | - | X
∆φ(ℓ1, j1) | azimuthal angle between leading p_T lepton and leading CvsL jet | - | X | -
∆φ(ℓ2, j1) | azimuthal angle between subleading p_T lepton and leading CvsL jet | - | - | X
∆φ(ℓ2, j2) | azimuthal angle between subleading p_T lepton and subleading CvsL jet | - | - | X
∆φ(ℓ1, p_T^miss) | azimuthal angle between leading p_T lepton and missing transverse momentum | - | X | -
Naj_small-R | number of small-R jets minus the number of FSR jets | X | X | X
Nsoft_5 | multiplicity of soft track-based jets with p_T > 5 GeV | X | X | X

Table 1. Variables employed in the training of the BDT used for each channel of the resolved-jet topology analysis. The 2L case has separate training for the low- and high-p_T(V) channels, but exploits the same input variables.

2L channel, m_ℓℓ ∉ [75, 120] GeV, and the requirement Naj_small-R ≥ 2. The CC control region is defined identically to the signal region, except for inverting the requirement on the H mass. The simulated V+jets backgrounds are similarly split into four classes depending on the flavour(s) of the additional jet(s) present in the processes: V+2 light-flavour quark or gluon jets, V+udsg and 1b or 1c, V+bb or bc, and V+cc jets.
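A minimal sketch of how the tagger-based control regions could be assigned is given below. Only the CvsL and CvsB thresholds, the [75, 120] GeV dilepton mass window, and the Naj_small-R ≥ 2 requirement come from the text. The function name, the reading of the Z mass veto as "outside the window", the assumption that the TT region requires all of the listed conditions, and the treatment of HF as the complement of TT are assumptions made for illustration. The CC region is omitted because it is defined relative to the signal region.

```python
def tagger_control_region(cvsl, cvsb, channel, n_aj, m_ll=None):
    """Classify an event as 'LF', 'TT', 'HF', or None.

    cvsl, cvsb: charm tagger discriminants of the CvsL-leading jet.
    n_aj:       number of additional small-R jets.
    m_ll:       dilepton invariant mass in GeV (needed for the 2L channel).
    """
    if cvsl < 0.4 and cvsb > 0.2:
        return "LF"
    if cvsl > 0.4 and cvsb < 0.2:
        # Assumption: TT needs every listed extra requirement to hold.
        tt_like = n_aj >= 2
        if channel == "2L":
            # Assumption: the Z mass veto keeps events outside the window.
            tt_like = tt_like and not (75.0 <= m_ll <= 120.0)
        # Assumption: HF is whatever remains after the TT selection.
        return "TT" if tt_like else "HF"
    return None  # not in a tagger-inverted control region
```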

Separate normalisation scale factors are used to constrain Z+jets processes in the 0L and 2L channels, while the normalisation scale factors related to W+jets processes are shared between the 0L and 1L analysis channels. To constrain the tt +jets process, on the other hand, each channel relies on its own independent normalisation scale factors. The normalisation scale factors are measured together with the signal strength modifier through a simultaneous fit to data in all control and signal regions for all of the analysis channels. The simulated diboson background is split according to the presence or absence


of a Z boson decaying to a pair of charm quarks, labelling them as VZ(Z → cc) if such a Z boson decay is present and VV+other otherwise. Whereas in the signal regions the BDT discriminator is used for the final signal extraction, in the control regions the shape of the CvsB distribution is used in the TT, HF, and CC regions, while that of CvsL is used in the LF region. The reason for this choice is that CvsB provides the best discrimination between b and c jets and is thus used in the control regions with an enhanced presence of b jets, while CvsL is more efficient in separating light-flavour quark or gluon jets from c jets and is hence preferred in the LF control region.

Figure 2 shows the distributions of the CvsB discriminant for the subleading CvsL jet for the HF and CC control regions in the 2L (Z(µµ)) low-p_T(V), 2L (Z(ee)) high-p_T(V), 1L (W(µν)), and 0L channels. The post-fit distributions (for fit details, see section 8) in figure 2 show good agreement between the data and the simulation in these two most significant control regions. Moreover, the use of the full distribution of the CvsB score provides a good separation between the V+bb and V+cc processes, which makes it possible to constrain these two backgrounds. The corresponding distributions for the other channels are not shown but behave similarly.

6 Merged-jet topology analysis

For the case of a Lorentz-boosted H boson, as flagged by a V boson with p_T(V) ≳ 200 GeV, a merged-jet topology is considered. The dominant backgrounds after the baseline selection presented in section 4.1 come from V+jets and tt processes. The V bosons in the signal process have on average larger p_T than those from the V+jets background. The analysis focuses on the reconstruction of moderately to highly Lorentz-boosted H bosons where the decay products are contained in a single large-R jet. Dedicated object reconstruction tools based on large-R jets and advanced ML techniques were developed to identify and reconstruct Lorentz-boosted H bosons decaying to charm quarks.

6.1 Higgs boson reconstruction

The cornerstone of the merged-jet topology analysis is the reconstruction of the H → cc candidate in a single, large-R jet, which has the potential to provide a better signal purity because the signal tends to be more boosted than the dominant V+jets and tt backgrounds, as noted above. In view of this, the high-p_T regime with p_T(V) ≳ 200 GeV, though representing no more than approximately 5% of the total phase space, can provide a significant contribution to the search. Moreover, the merged-jet approach has important advantages over the resolved-jet approach at high p_T. The possibility for both c quarks to reside in a single large-R jet enhances the signal acceptance, improves the identification of the correct pair of jets to use in reconstructing the H boson, and similarly facilitates the task of taking into account any FSR that may have been emitted by the quarks. A more detailed discussion of the potential advantages of this approach can be found in refs. [90,93]. Given the small fraction of signal events that survive a selection with p_T(V) ≳ 200 GeV, it is critical to carefully choose the R parameter of the jet clustering algorithm. In general, the angular separation between the decay products of a Lorentz-boosted particle such as the


[Figure 2 panels (resolved-jet, 35.9 fb^-1, 13 TeV): events and Obs/Exp ratio versus CvsB_min in the Z+CC and Z+HF control regions for 2L (µµ) low-p_T(V), 2L (ee) high-p_T(V), and 0L, and in the W+CC and W+HF control regions for 1L (µ). Legend: Data, VZ(Z→cc), VV+other, single top, tt, W/Z+cc, W/Z+bb/bc, W/Z+b/c, W/Z+udsg, VH(H→cc), VH(H→bb), MC uncertainty.]

Figure 2. Post-fit CvsB_min distributions in the CC (left panel) and HF (right panel) control regions


H boson is approximately given by ∆R ∼ 2m_H/p_T(H). For a p_T(H) ∼ p_T(V) of 200 GeV, this gives ∆R ≈ 1.25. For good signal purity and acceptance, we have thus chosen to use large-R jets clustered by the anti-k_T algorithm with a distance parameter of R = 1.5.
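The rule of thumb ∆R ∼ 2m/p_T used above can be evaluated with a few lines of code. This is a sketch of the approximation only; the function names are illustrative, and the p_T values other than 200 GeV are arbitrary examples rather than values taken from the analysis.

```python
def approx_opening_angle(mass_gev, pt_gev):
    """Approximate angular separation of the two decay products, 2m/p_T."""
    return 2.0 * mass_gev / pt_gev


def fits_in_jet(mass_gev, pt_gev, jet_radius):
    """True if the approximate opening angle is below the jet distance parameter."""
    return approx_opening_angle(mass_gev, pt_gev) < jet_radius


M_H = 125.0  # H boson mass in GeV, as assumed in the text

for pt in (200.0, 300.0, 500.0):  # illustrative p_T(H) values in GeV
    dr = approx_opening_angle(M_H, pt)
    print(f"p_T(H) = {pt:.0f} GeV: dR ~ {dr:.2f}, within R = 1.5: {fits_in_jet(M_H, pt, 1.5)}")
```

At p_T(H) = 200 GeV the estimate gives ∆R ≈ 1.25, which is below the chosen distance parameter R = 1.5, whereas a smaller distance parameter such as R = 0.8 would not contain both decay products at this p_T.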

As for the resolved-jet topology analysis, one of the biggest challenges is the efficient reconstruction of the pair of c quarks from the H boson decay, while also achieving significant rejection of both light-flavour quarks and gluons, as well as b quarks that contribute backgrounds to this search. To this end, a novel algorithm, DeepAK15 [45], has been used for the identification of jet substructure to tag W, Z, and H bosons, as well as top quarks. In addition, DeepAK15 is designed to discriminate between decay modes with different flavour content (e.g. H → bb, H → cc, H → qqqq). The algorithm deploys ML methods on the PF candidates and secondary vertices, which are used as inputs. DeepAK15 is designed to exploit information related to jet substructure, flavour, and pileup simultaneously, yielding substantial gains with respect to other approaches [45]. With the use of the adversarial training procedure [94], the algorithm is largely decorrelated from the jet mass, while preserving most of the method's discriminating power.

The performance in terms of the receiver operating characteristic (ROC) curve of the cc discriminant for identifying a pair of c quarks from H boson decay versus quarks from the V+jets process for large-R jets with p_T > 200 GeV is shown in figure 3 (left). Three working points (WPs) are defined on the cc tagging discriminant distribution with approximately 1, 2.5, and 5% misidentification rates, and the corresponding efficiencies for identifying a cc pair are approximately 23, 35, and 46%. Another important parameter is the misidentification of b jets as signal c jets. The corresponding ROC curve is displayed in figure 3 (right). For the three WPs defined above, the corresponding b jet misidentification rates are approximately 9, 17, and 27%. To improve the sensitivity of the analysis, three mutually exclusive cc-enriched categories, the "low-purity" (LP), "medium-purity" (MP), and "high-purity" (HP) categories, are defined based on the three WPs. The misidentification rate of τ leptons as signal c jets is larger than the misidentification rate of b jets, but typically a factor of two smaller than the c jet efficiency. However, due to kinematic and mass selection requirements that will be detailed in section 6.2, the VH(H → ττ) contribution is much smaller than the SM VH(H → cc) signal and hence has a negligible impact on the final result. The merged-jet algorithm is calibrated using data and MC simulated samples. The p_T-dependent data-to-simulation scale factors typically range from 0.85 to 1.30, and the corresponding uncertainties range between 20 and 40%.
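To illustrate how three working points can yield three mutually exclusive categories, the sketch below derives per-category rates from the approximate cumulative numbers quoted above. It assumes that the working points are nested (a jet passing a tighter point also passes the looser ones) and that the tightest point corresponds to the HP category; neither assumption is stated explicitly in the text, and the dictionary layout and function name are illustrative.

```python
# Approximate cumulative rates in percent, tightest working point first.
# The HP/MP/LP assignment is an assumption for illustration.
WORKING_POINTS = [
    {"name": "HP", "cc_eff": 23.0, "vjets_mistag": 1.0, "bb_mistag": 9.0},
    {"name": "MP", "cc_eff": 35.0, "vjets_mistag": 2.5, "bb_mistag": 17.0},
    {"name": "LP", "cc_eff": 46.0, "vjets_mistag": 5.0, "bb_mistag": 27.0},
]


def exclusive_rates(working_points, key):
    """Rate of jets passing each working point but failing all tighter ones."""
    rates = {}
    previous = 0.0
    for wp in working_points:
        rates[wp["name"]] = wp[key] - previous
        previous = wp[key]
    return rates


for key in ("cc_eff", "vjets_mistag", "bb_mistag"):
    print(key, exclusive_rates(WORKING_POINTS, key))
```

Under these assumptions the HP category would retain about 23% of H → cc jets, while the MP and LP categories would add roughly 12% and 11%, respectively.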

6.2 Signal extraction

In the merged-jet topology analysis, events are required to have at least one large-R jet with p_T > 200 GeV, with the highest p_T large-R jet selected as the H boson candidate. In VH(H → cc) signal events, the vector boson and the H boson are typically emitted back-to-back in φ. Therefore, the difference in azimuthal angle between the reconstructed vector boson and H candidate, ∆φ(V, H), is required to be at least 2.5 rad. To avoid double-counting, small-R jets are removed from the event if they overlap with the H candidate, i.e. if ∆R(small-R jet, H) < 1.5.
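The overlap removal described above can be sketched as follows. The representation of jets as dictionaries with `eta` and `phi` keys and the function names are assumptions made for illustration; only the ∆R < 1.5 criterion comes from the text.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(d_eta^2 + d_phi^2) with d_phi wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(eta1 - eta2, dphi)


def remove_overlapping_jets(small_r_jets, h_candidate, min_dr=1.5):
    """Keep only small-R jets at least min_dr away from the H candidate."""
    return [
        jet
        for jet in small_r_jets
        if delta_r(jet["eta"], jet["phi"], h_candidate["eta"], h_candidate["phi"]) >= min_dr
    ]


# Hypothetical event: one jet inside the large-R jet cone, two outside.
h_candidate = {"eta": 0.0, "phi": 0.0}
jets = [
    {"eta": 0.5, "phi": 0.5},
    {"eta": 2.0, "phi": 0.0},
    {"eta": 0.0, "phi": 3.0},
]
kept = remove_overlapping_jets(jets, h_candidate)
```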


[Figure 3 panels (CMS Simulation, 13 TeV): left, V+jets efficiency (logarithmic scale) versus H → cc efficiency; right, H → bb efficiency (logarithmic scale) versus H → cc efficiency.]

Figure 3. The performance of the cc discriminant to identify a cc pair in terms of receiver operating characteristic curves, for large-R jets with p_T > 200 GeV, before the application of data-to-simulation scale factors. Left: the efficiency to correctly identify a pair of c quarks from H boson decay vs. the efficiency of misidentifying quarks from the V+jets process. Right: the efficiency to correctly identify a pair of c quarks from H boson decay vs. the efficiency of misidentifying a pair of b quarks from H boson decay. The gray stars and crosses on the ROC curves represent the three working points used in the merged-jet topology analysis.

To further distinguish the VH signal process from the main backgrounds, a separate BDT is developed for each channel. The goal is to define a discriminant that improves the separation between VH signal and the main backgrounds, while remaining largely independent of the cc tagging discriminant and the H mass. The BDT only makes use of kinematic information from the event, without including intrinsic properties of the H candidate such as the flavour content and the mass of the large-R jet, which will be used in a fit to the data for the signal extraction. For the signal process, the VH(H → bb) sample was used instead of the VH(H → cc) sample, and only events with even event numbers were used to train the BDT, while those with odd event numbers were used for the main analysis. As the BDT is designed to be insensitive to the flavour content of the Higgs candidate, training with the VH(H → bb) signal sample results in no loss of performance. For the background process, only the main backgrounds are used, namely Z+jets in the 2L channel, and V+jets and tt+jets in the 0L and 1L channels. Table 2 summarises the kinematic variables used as input to the BDT for each of the three channels.
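The even/odd partition of simulated events mentioned above can be sketched in a few lines. The event representation as a dictionary with an `event_number` key and the function name are assumptions for illustration; the actual bookkeeping in the analysis software is not described in the text.

```python
def split_by_event_number(events):
    """Split events into a training set (even numbers) and an analysis set (odd numbers)."""
    training = [e for e in events if e["event_number"] % 2 == 0]
    analysis = [e for e in events if e["event_number"] % 2 == 1]
    return training, analysis


# Hypothetical simulated events, identified only by their event number.
events = [{"event_number": n} for n in range(1, 7)]
training, analysis = split_by_event_number(events)
```

Because the two subsets are disjoint by construction, events used to evaluate the discriminant in the analysis never enter the training, which avoids biasing the result through overtraining.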

The BDT distributions of the three channels for events passing the above selection are shown in figure 4 (left) for the VH(H → cc) signal and the background processes. The discrimination power of the BDT depends on the channel. An improved discrimination power is obtained in the 2L and 1L channels compared to the 0L channel. In particular, in the 1L channel, improvement is achieved thanks to the presence of the charged lepton and p_T^miss, which are then used for the training of the BDT to provide additional handles to suppress the background. For all channels, events with BDT values greater than 0.5 define the signal region. The value of 0.5 was obtained in an optimisation for the 1L


Variable | Description | 0L | 1L | 2L
p_T(V) | vector boson transverse momentum | X | X | X
p_T(H) | H transverse momentum | X | X | X
|η(H)| | absolute value of the H pseudorapidity | X | - | -
∆φ(V, H) | azimuthal angle between vector boson and H | X | X | X
p_T^miss | missing transverse momentum | - | X | -
∆η(H, ℓ) | difference in pseudorapidity between H and the lepton | - | X | -
∆η(V, H) | difference in pseudorapidity between the vector boson and H | - | - | X
∆η(H, j) | min. difference in pseudorapidity between H and small-R jets | X | X | X
∆η(ℓ, j) | min. difference in pseudorapidity between the lepton and small-R jets | - | X | -
∆η(V, j) | min. difference in pseudorapidity between vector boson and small-R jets | - | - | X
∆φ(\vec{p}_T^miss, j) | azimuthal angle between \vec{p}_T^miss and closest small-R jet | X | - | -
∆φ(\vec{p}_T^miss, ℓ) | azimuthal angle between \vec{p}_T^miss and lepton | - | X | -
m_T | transverse mass of lepton \vec{p}_T + \vec{p}_T^miss | - | X | -
Naj_small-R | number of additional small-R jets | X | X | X

Table 2. Variables used in the kinematic BDT training for each channel of the merged-jet topology analysis.

channel; however, further tuning of this value for the 0L and 2L channels has only a small impact on the sensitivity. Figure 4 (right) shows the distributions of the cc discriminant in the three channels in the signal region for the VH signal and the background processes. Good separation is observed between signal and background. The performance of the cc discriminant degrades in the presence of b quarks, as is the case for tt events, for instance. The signal regions of the merged-jet topology analysis are finally defined by requiring the large-R jet to pass one of the three working points of the cc discriminant mentioned above. Dedicated control regions, each enriched in a specific background process, are defined to aid the background estimation in each channel. Two types of control regions are defined: the "low-BDT" control region, consisting of events with BDT value < 0.5, which is enriched in V+jets background, and the high-Naj_small-R control region, defined by inverting the selection on the number of small-R jets to yield a high-purity tt sample. The latter is not used for the 2L channel since the tt contribution is small in this channel. In both control regions, events are required to satisfy the same cc tagging discriminant criteria as applied in the signal regions in order to probe events with a similar flavour composition. This allows the efficiency of the above-mentioned selection to be estimated directly from the data without further corrections being required, as verified in studies with simulated events and events in data validation regions orthogonal to the control and signal regions.

The low-BDT and the high-Naj_small-R control regions, together with the signal regions, are included in the maximum likelihood fit to correct for any difference between data and simulation in the production rate of the V+jets and tt processes in the phase space selected by this analysis. Parameters used to separately scale the overall normalisation of the W+jets, Z+jets, and tt background processes are allowed to float freely in the fit. These parameters scale the background rate in the same way in both the control and the signal


[Figure 4 panels (CMS Simulation, 13 TeV, merged-jet): events versus kinematic BDT output (left) and events versus cc-tagging discriminant for BDT > 0.5 (right), for the 0L, 1L, and 2L channels. Legend: VZ(Z→cc), VV(other), single top, tt, W+udsg, W+b/bb, W+c, W+cc, Z+udsg, Z+b/bb, Z+c, Z+cc (Z(ℓℓ)+jets in 1L), VH(H→bb), VH(H→cc), background uncertainty; VH scaled to total background.]

Figure 4. The VH(H → cc) signal and background distributions of the kinematic BDT output (left), and the cc discriminant in events with BDT values greater than 0.5 (right), in the 0L (upper), 1L (middle) and 2L (lower) channels. The VH(H → cc) signal is normalised to the sum of all backgrounds. The VH(H → bb) contribution, similarly normalised, is also shown.


regions. The parameters are defined separately for each channel, with the exception that the same scale factor is assumed for the W+jets process in the 0L and 1L channels. Any potential difference in the cc tagging efficiency between data and simulation is also taken into account in the measured simulation-to-data scale factors.

The merged-jet topology analysis has been validated in two data samples that are completely independent of the control and signal regions. The first validation sample consists of events with cc tagging discriminant values below those used in the control and signal regions, and the second validation sample consists of events with lower p_T(H). Both validation regions contain many more events than the signal regions and hence can be helpful in identifying potential issues in the analysis. The outcome of these studies is in good agreement with the SM expectation.

7 Systematic uncertainties

Systematic effects can impact the shape of the BDT discriminant distribution for both analysis topologies, as well as the H candidate mass in the merged-jet topology and the distributions of the charm tagger variables in the resolved-jet topology analysis. The dominant uncertainties are associated with the normalisation of the background from the data control regions and the limited size of the dataset. The values of the rate parameters associated with the background normalisations range from ∼0.65 to ∼2.4 with uncertainties in the range of 10% to 35%. Additional systematic effects are related to the jet energy scale and resolution, which are treated as correlated between the large- and small-R jets. The efficiency to reconstruct and identify electrons and muons is measured in events with Z bosons decaying to electrons and muons via the tag-and-probe method [95]. The uncertainties on these efficiencies range from 2 to 4%. The efficiency for events to pass the combination of triggers based on the presence of one or two leptons is measured using Z/γ∗ → ee and Z/γ∗ → µµ events via the tag-and-probe method [95]. The uncertainties associated with this measurement typically range between 1 and 3%. The efficiency of the p_T^miss trigger is parametrised using a single-muon dataset and the uncertainties are typically below 2%. Theoretical uncertainties related to the cross sections and the p_T spectra of the signal and backgrounds are also considered. In the resolved-jet analysis the systematic uncertainties in the PDFs, and in the renormalisation and factorisation scales, are treated as uncorrelated among the four flavour classes considered in the V+jets processes, as described in section 5. The cross section uncertainties are set to their theoretical predictions, while the systematic uncertainties associated with the renormalisation and factorisation scales are obtained by varying the parameters related to these scales to 0.5 and 2.0 times their nominal values and estimating the effect. Lastly, the uncertainties in the c quark identification are also considered. The full list of systematic uncertainties is provided in table 3.

In table 4 the uncertainty sources are grouped into categories and their impact on the fitted signal strength resulting from the combination of the resolved-jet and merged-jet topology analyses is provided (see section 8 for more details). The uncertainty breakdown shows that the search for the VH(H → cc) process is mainly limited by the size of the available dataset: the related uncertainty accounts for more than 85% of the total uncertainty


Source | Type | 0-lepton | 1-lepton | 2-lepton
Simulated sample event count | shape | X | X | X
c tagging efficiency | shape | X | X | X
Top quark p_T reweighting | shape | X | X | X
p_T(V) reweighting | shape | X | X | X
VH: p_T(V) NLO EW correction | shape | X | X | X
Jet energy scale | shape | X | X | X
Jet energy resolution | shape | X | X | X
p_T^miss trigger efficiency | rate | 2% | - | -
Lepton trigger efficiency | shape (rate) | - | X | X
Lepton identification efficiency | shape (rate) | - | X | X
p_T^miss unclustered energy | shape | X | X | -
Pileup reweighting | shape | X | X | X
Integrated luminosity | rate | 2.5% | 2.5% | 2.5%
PDF | shape | X | X | X
Renormalisation and factorisation scales | shape | X | X | X
Single top cross section | rate | 15% | 15% | 15%
Diboson cross section | rate | 10% | 10% | 10%
VH: cross section (PDF) | rate | 1.6%–2.4% | 1.9% | 1.6%–2.4%
VH: cross section (scale) | rate | 19%–25% | 0.5%–0.7% | 19%–25%

Table 3. Summary of the systematic uncertainties for each channel. The shape uncertainties refer to systematic uncertainties that affect both the shape of the distribution being fitted as well as their normalisation. Uncertainties in the lepton identification and trigger efficiencies are treated as a normalisation uncertainty in the resolved-jet topology analysis and as a shape uncertainty in the merged-jet topology analysis.

in the fitted µ. The statistical uncertainties include contributions from the limited number of events in the available dataset and the background normalisations extracted from the control regions. The main sources of systematic uncertainties come from the charm tagging efficiencies and the modelling of the simulated physics processes, representing ∼ 28% and ∼ 25% of the total uncertainty, respectively. The uncertainties in the theory prediction, which include uncertainties in the cross sections, p_T spectra, PDFs, renormalisation and factorisation scales, also play a considerable role and represent approximately 30% of the total uncertainty in µ.
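The fractions quoted above can be cross-checked against the values listed in table 4. The sketch below assumes that the three top-level groups (statistical, experimental, theory) are combined in quadrature to form the total; this combination rule is an assumption rather than something stated in the text, although it reproduces the quoted totals at the quoted precision. The variable and function names are illustrative.

```python
import math


def quadrature_sum(values):
    """Combine independent uncertainty components in quadrature."""
    return math.sqrt(sum(v * v for v in values))


# Upward and downward impacts on mu for the top-level groups of table 4.
UP = {"statistical": 17.3, "experimental": 7.6, "theory": 6.5}
DOWN = {"statistical": 17.1, "experimental": 8.2, "theory": 4.6}

total_up = quadrature_sum(UP.values())
total_down = quadrature_sum(DOWN.values())

# Relative size of selected upward components with respect to the total.
stat_fraction = UP["statistical"] / total_up
charm_tag_fraction = 5.6 / total_up  # charm tagging efficiencies, table 4
theory_fraction = UP["theory"] / total_up

print(f"total: +{total_up:.1f} / -{total_down:.1f}")
print(f"statistical: {stat_fraction:.0%}, charm tagging: {charm_tag_fraction:.0%}, theory: {theory_fraction:.0%}")
```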

8 Results

The signal extraction strategy is based on a binned likelihood fit to the data, with the signal and control regions fitted simultaneously. Electron and muon events in the 1L and 2L channels are fitted separately because muons have a significantly smaller misreconstruction rate and greater signal sensitivity. In addition, the trigger, identification, and isolation scale factors are different because electrons and muons are reconstructed differently with the CMS detector. The upper limit (UL) on the signal strength µ for


Uncertainty source | ∆µ (µ = 37)
Statistical | +17.3 −17.1
  Background normalisations | +10.1 −10.2
Experimental | +7.6 −8.2
  Charm tagging efficiencies | +5.6 −4.8
  Simulation modeling | +4.2 −5.1
  Jet energy scale and resolution | +2.4 −2.8
  Lepton identification efficiencies | +0.4 −1.8
  Luminosity | +1.6 −1.7
  Statistics of the simulated samples | +0.5 −1.9
Theory | +6.5 −4.6
  Signal | +5.0 −2.5
  Backgrounds | +4.3 −3.9
Total | +20.0 −19.5

Table 4. Summary of the impact of the statistical and systematic uncertainties on the signal strength modifier for the combined analysis of the resolved-jet and merged-jet topologies.

SM production of VH(H → cc) is extracted at 95% CL based on a modified frequentist approach [96,97] under the asymptotic approximation for the profile likelihood test statistic [98,99]. Both analyses are validated by measuring the product of the VZ production cross section and the branching fraction of the Z boson to a charm quark-antiquark pair, B(Z → cc). The systematic uncertainties are incorporated in the fit as constrained parameters of the likelihood function. The cross section of the VH(H → bb) process is set to its SM prediction for the H boson mass of 125 GeV. The result has only a weak dependence on the assumed VH(H → bb) rate. Indeed, on average, the energy carried by neutrinos is higher for the VH(H → bb) than for the VH(H → cc) process. This leads to distinguishably different contributions to the final fitted distribution. Varying the VH(H → bb) rate by 100% results in less than a 5% change in the expected sensitivity.

The results obtained in the resolved-jet and merged-jet topology analyses independently, i.e., exploiting larger regions of the full phase space prior to defining disjoint data samples for the combination of results, are described in sections 8.1 and 8.2. As described in sections 5 and 6, in the merged-jet topology analysis the phase space considered is bounded from below by p_T(V) > 200 GeV, while for the resolved-jet topology analysis the lower bound is set by the p_T(V) thresholds of 50, 100, and 170 GeV in the 2L, 1L, and 0L channels, respectively. Neither of the analyses has an upper limit on p_T(V). The two analyses are then combined for the final result, presented in section 8.3, after making them statistically independent via a selection on p_T(V) that sets an upper bound for the resolved-jet topology analysis and, at the same value, the lower bound for the merged-jet topology analysis.
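The way a single p_T(V) boundary makes the two analyses statistically independent can be sketched as follows. The boundary value is not given in this passage, so it is left as a required argument, `pt_v_boundary`, which is a hypothetical name; the function name and the handling of events exactly at a threshold are likewise assumptions.

```python
# Lower p_T(V) thresholds of the resolved-jet analysis in GeV, from the text.
RESOLVED_PT_V_MIN = {"2L": 50.0, "1L": 100.0, "0L": 170.0}
MERGED_PT_V_MIN = 200.0  # lower bound of the standalone merged-jet analysis


def assign_topology(channel, pt_v, pt_v_boundary):
    """Return 'merged', 'resolved', or None for an event with the given p_T(V).

    pt_v_boundary is the (unspecified here) value separating the two
    analyses in the combination; it cannot lie below the merged-jet bound.
    """
    if pt_v_boundary < MERGED_PT_V_MIN:
        raise ValueError("boundary must not be below the merged-jet lower bound")
    if pt_v >= pt_v_boundary:
        return "merged"
    # Simplification: an event exactly at the lower threshold is accepted.
    if pt_v >= RESOLVED_PT_V_MIN[channel]:
        return "resolved"
    return None
```

Because every event is assigned to at most one topology, no event can enter both likelihoods, which is what allows the two analyses to be combined as independent measurements.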


8.1 Resolved-jet topology

In the resolved-jet topology analysis, the VH(H → cc) signal is extracted via a binned likelihood fit to the BDT output distributions, which is carried out simultaneously with fits to the backgrounds in control regions. In the LF control regions the fits are performed on the CvsL_min distributions, while in the TT, HF, and CC control regions they are performed on the CvsB_min distributions, as detailed in section 5.2.

The analysis is first validated by measuring the product of the VZ production cross section and B(Z → cc) normalised to the SM prediction. A separate BDT is trained for each channel, using VZ(Z → cc) as signal and VH(H → cc) as a contribution to the background with cross section fixed to the SM prediction. The measured signal strength for the VZ(Z → cc) process is µ_{VZ(Z→cc)} = 1.35^{+0.94}_{−0.95} with an observed (expected) significance of 1.5 (1.2) standard deviations (σ). The results are consistent within uncertainties with the SM expectation.

A dedicated BDT is trained for each channel to distinguish the VH(H → cc) signal from the backgrounds. Figure 5 displays the BDT distributions in all search channels after the fit to the data. In all plots, the value of each nuisance parameter has been fixed to its best fit value. In general, the BDT distributions in data agree well with the background predictions. The largest excess in the data occurs at large BDT values in the high-p_T(V) 2L (Z(ee)) channel with an observed local signal significance of 2.1 σ.

The observed (expected) UL at 95% CL on µ for SM VH(H → cc) production is 75 (38^{+16}_{−11}), and the measured signal strength is µ_{VH(H→cc)} = 41^{+20}_{−20}. The uncertainties in the expected UL correspond to a variation of ±1 σ in the expected event yields under the background-only hypothesis. The results are consistent with the SM expectations within two standard deviations. This modest deviation is mostly due to the small excess mentioned above. The results for each channel and their combination are shown in table 5. The most sensitive channel is 2L, whereas the 0L and 1L channels have similar sensitivity.

8.2 Merged-jet topology

In the merged-jet topology analysis, the VH(H → cc̄) signal is extracted via a binned maximum likelihood fit to the soft-drop mass m_SD of H, with the signal regions and the control regions from all three purity categories included in the fit simultaneously. In total, 15 bins are used in the fit for each region, with a bin width of 10 GeV corresponding roughly to the m_SD resolution. The m_SD distributions of the VH(H → cc̄) and background processes in all three channels in the high-purity category are shown in figure 6. The background prediction is in good agreement with the observed data, within uncertainties.

Similar to the resolved-jet topology analysis, the full procedure of the merged-jet topology analysis is validated by measuring the product of the VZ production cross section and B(Z → cc̄), normalised to the SM prediction. The event selection, including the kinematic BDT, the cc̄ tagging discriminant criteria, and the signal extraction procedure, remain unchanged. In place of VH(H → cc̄), the VZ(Z → cc̄) process is considered to be the signal, and VH(H → cc̄) contributes to the background with its cross section fixed to the SM prediction. The measured signal strength is µ_{VZ(Z→cc̄)} = 0.69^{+0.89}_{−0.75} with an observed


[Figure 5 panels: Events (logarithmic scale) versus BDT output (0 to 1), each with an Obs/Exp ratio panel, for the resolved-jet signal regions 2L (ee) low-p_T(V), 2L (µµ) low-p_T(V), 2L (ee) high-p_T(V), 2L (µµ) high-p_T(V), 1L (e), 1L (µ), and 0L; CMS, 35.9 fb−1 (13 TeV). Legend: Data, VZ(Z→cc̄), VV+other, single top, tt̄, W+jets and Z+jets split by jet flavour, QCD (0L only), VH(H→bb̄), VH(H→cc̄) with µ=41, VH(H→cc̄)×100, S+B uncertainty.]

Figure 5. Post-fit distributions of the BDT score in the signal region of the resolved-jet topology analysis for the 2L low-p_T(V), 2L high-p_T(V), 1L, and 0L channels. The plain red histograms represent the signal contribution normalized by the post-fit value of µ_{VH(H→cc̄)}, while the red line histograms show the expected signal contribution multiplied by a factor of 100.


[Figure 6 panels: Events versus Higgs candidate mass [GeV] (axis ticks from 60 to 200), each with an N_obs/N_exp ratio panel, for the merged-jet high-purity signal regions 2L (ee), 2L (µµ), 1L (e), 1L (µ), and 0L; CMS, 35.9 fb−1 (13 TeV). Legend: Observed, VH(H→bb̄), VZ(Z→cc̄), VV(other), single top, tt̄, W+jets (1L and 0L panels), Z+jets, VH(H→cc̄) with µ=21, VH(H→cc̄)×100, S+B uncertainty.]

Figure 6. The m_SD distribution of H in data and simulation in the merged-jet topology analysis signal regions after the maximum likelihood fit, for events in the high-purity category. Upper row: 2L channel, electrons (left) and muons (right); middle row: 1L channel, electron (left) and muon (right); lower row: 0L channel. The plain red histograms represent the signal contribution normalized by the post-fit value of µ_{VH(H→cc̄)}, while the red line histograms show the expected signal contribution multiplied by a factor of 100.