Model-Independent and Quasi-Model-Independent Search For New Physics at CDF
  S. Pashapour,33 J. Patrick,17 G. Pauletta,53 M. Paulini,12 C. Paus,32 D.E. Pellett,7 A. Penzo,53 T.J. Phillips,16
   G. Piacentino,45 J. Piedra,43 L. Pinera,18 K. Pitts,24 C. Plager,8 L. Pondrom,58 O. Poukhov,15 N. Pounder,41
      F. Prakoshyn,15 A. Pronko,17 J. Proudfoot,2 F. Ptohos,h,17 G. Punzi,45 J. Pursley,58 J. Rademacker,c,41
     A. Rahaman,46 V. Ramakrishnan,58 N. Ranjan,47 I. Redondo,31 B. Reisert,17 V. Rekovic,36 P. Renton,41
   M. Rescigno,50 S. Richter,26 F. Rimondi,5 L. Ristori,45 A. Robson,21 T. Rodrigo,11 E. Rogers,24 R. Roser,17
M. Rossi,53 R. Rossin,10 P. Roy,33 A. Ruiz,11 J. Russ,12 V. Rusu,17 H. Saarikko,23 A. Safonov,52 W.K. Sakumoto,48
G. Salamanna,50 L. Santi,53 S. Sarkar,50 L. Sartori,45 K. Sato,17 A. Savoy-Navarro,43 T. Scheidle,26 P. Schlabach,17
     E.E. Schmidt,17 M.P. Schmidt,59 M. Schmitt,37 T. Schwarz,7 L. Scodellaro,11 A.L. Scott,10 A. Scribano,45
 F. Scuri,45 A. Sedov,47 S. Seidel,36 Y. Seiya,40 A. Semenov,15 L. Sexton-Kennedy,17 A. Sfyrla,20 S.Z. Shalhout,57
T. Shears,29 P.F. Shepard,46 D. Sherman,22 M. Shimojima,n,54 M. Shochet,13 Y. Shon,58 I. Shreyber,20 A. Sidoti,45
       A. Sisakyan,15 A.J. Slaughter,17 J. Slaunwhite,38 K. Sliwa,55 J.R. Smith,7 F.D. Snider,17 R. Snihur,33
M. Soderberg,34 A. Soha,7 S. Somalwar,51 V. Sorin,35 J. Spalding,17 F. Spinella,45 T. Spreitzer,33 P. Squillacioti,45
M. Stanitzki,59 R. St. Denis,21 B. Stelzer,8 O. Stelzer-Chilton,41 D. Stentz,37 J. Strologas,36 D. Stuart,10 J.S. Suh,27
  A. Sukhanov,18 H. Sun,55 I. Suslov,15 T. Suzuki,54 A. Taffard,e,24 R. Takashima,39 Y. Takeuchi,54 R. Tanaka,39
    M. Tecchio,34 P.K. Teng,1 K. Terashi,49 J. Thom,g,17 A.S. Thompson,21 G.A. Thompson,24 E. Thomson,44
    P. Tipton,59 V. Tiwari,12 S. Tkaczyk,17 D. Toback,52 S. Tokar,14 K. Tollefson,35 T. Tomura,54 D. Tonelli,17
    S. Torre,19 D. Torretta,17 S. Tourneur,43 Y. Tu,44 N. Turini,45 F. Ukegawa,54 S. Uozumi,54 S. Vallecorsa,20
    N. van Remortel,23 A. Varganov,34 E. Vataga,36 F. Vázquez,l,18 G. Velev,17 C. Vellidis,a,45 V. Veszpremi,47
   M. Vidal,31 R. Vidal,17 I. Vila,11 R. Vilar,11 T. Vine,30 M. Vogel,36 G. Volpi,45 F. Würthwein,9 P. Wagner,44
R.G. Wagner,2 R.L. Wagner,17 J. Wagner,26 W. Wagner,26 R. Wallny,8 S.M. Wang,1 A. Warburton,33 D. Waters,30
     M. Weinberger,52 W.C. Wester III,17 B. Whitehouse,55 D. Whiteson,e,44 A.B. Wicklund,2 E. Wicklund,17
 G. Williams,33 H.H. Williams,44 P. Wilson,17 B.L. Winer,38 P. Wittich,g,17 S. Wolbers,17 C. Wolfe,13 T. Wright,34
     X. Wu,20 S.M. Wynne,29 S. Xie,32 A. Yagil,9 K. Yamamoto,40 J. Yamaoka,51 T. Yamashita,39 C. Yang,59
  U.K. Yang,m,13 Y.C. Yang,27 W.M. Yao,28 G.P. Yeh,17 J. Yoh,17 K. Yorita,13 T. Yoshida,40 G.B. Yu,48 I. Yu,27
       S.S. Yu,17 J.C. Yun,17 L. Zanello,50 A. Zanetti,53 I. Zaw,22 X. Zhang,24 Y. Zheng,b,8 and S. Zucchelli,5
                                                (CDF Collaboration∗)
1 Institute of Physics, Academia Sinica, Taipei, Taiwan 11529, Republic of China
2 Argonne National Laboratory, Argonne, Illinois 60439
3 Institut de Fisica d’Altes Energies, Universitat Autonoma de Barcelona, E-08193, Bellaterra (Barcelona), Spain
4 Baylor University, Waco, Texas 76798
5 Istituto Nazionale di Fisica Nucleare, University of Bologna, I-40127 Bologna, Italy
6 Brandeis University, Waltham, Massachusetts 02254
7 University of California, Davis, Davis, California 95616
8 University of California, Los Angeles, Los Angeles, California 90024
9 University of California, San Diego, La Jolla, California 92093
10 University of California, Santa Barbara, Santa Barbara, California 93106
11 Instituto de Fisica de Cantabria, CSIC-University of Cantabria, 39005 Santander, Spain
12 Carnegie Mellon University, Pittsburgh, PA 15213
13 Enrico Fermi Institute, University of Chicago, Chicago, Illinois 60637
14 Comenius University, 842 48 Bratislava, Slovakia; Institute of Experimental Physics, 040 01 Kosice, Slovakia
15 Joint Institute for Nuclear Research, RU-141980 Dubna, Russia
16 Duke University, Durham, North Carolina 27708
17 Fermi National Accelerator Laboratory, Batavia, Illinois 60510
18 University of Florida, Gainesville, Florida 32611
19 Laboratori Nazionali di Frascati, Istituto Nazionale di Fisica Nucleare, I-00044 Frascati, Italy
20 University of Geneva, CH-1211 Geneva 4, Switzerland
21 Glasgow University, Glasgow G12 8QQ, United Kingdom
22 Harvard University, Cambridge, Massachusetts 02138
23 Division of High Energy Physics, Department of Physics, University of Helsinki and Helsinki Institute of Physics, FIN-00014, Helsinki, Finland
24 University of Illinois, Urbana, Illinois 61801
25 The Johns Hopkins University, Baltimore, Maryland 21218
26 Institut für Experimentelle Kernphysik, Universität Karlsruhe, 76128 Karlsruhe, Germany
27 Center for High Energy Physics: Kyungpook National University, Taegu 702-701, Korea; Seoul National University, Seoul 151-742, Korea; SungKyunKwan University, Suwon 440-746, Korea; Korea Institute of Science and Technology Information, Daejeon, 305-806, Korea; Chonnam National University, Gwangju, 500-757, Korea
28 Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, California 94720
29 University of Liverpool, Liverpool L69 7ZE, United Kingdom
30 University College London, London WC1E 6BT, United Kingdom
31 Centro de Investigaciones Energeticas Medioambientales y Tecnologicas, E-28040 Madrid, Spain
32 Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
33 Institute of Particle Physics: McGill University, Montréal, Canada H3A 2T8; and University of Toronto, Toronto, Canada M5S 1A7
34 University of Michigan, Ann Arbor, Michigan 48109
35 Michigan State University, East Lansing, Michigan 48824
36 University of New Mexico, Albuquerque, New Mexico 87131
37 Northwestern University, Evanston, Illinois 60208
38 The Ohio State University, Columbus, Ohio 43210
39 Okayama University, Okayama 700-8530, Japan
40 Osaka City University, Osaka 588, Japan
41 University of Oxford, Oxford OX1 3RH, United Kingdom
42 University of Padova, Istituto Nazionale di Fisica Nucleare, Sezione di Padova-Trento, I-35131 Padova, Italy
43 LPNHE, Universite Pierre et Marie Curie/IN2P3-CNRS, UMR7585, Paris, F-75252 France
44 University of Pennsylvania, Philadelphia, Pennsylvania 19104
45 Istituto Nazionale di Fisica Nucleare Pisa, Universities of Pisa, Siena and Scuola Normale Superiore, I-56127 Pisa, Italy
46 University of Pittsburgh, Pittsburgh, Pennsylvania 15260
47 Purdue University, West Lafayette, Indiana 47907
48 University of Rochester, Rochester, New York 14627
49 The Rockefeller University, New York, New York 10021
50 Istituto Nazionale di Fisica Nucleare, Sezione di Roma 1, University of Rome “La Sapienza,” I-00185 Roma, Italy
51 Rutgers University, Piscataway, New Jersey 08855
52 Texas A&M University, College Station, Texas 77843
53 Istituto Nazionale di Fisica Nucleare, University of Trieste/ Udine, Italy
54 University of Tsukuba, Tsukuba, Ibaraki 305, Japan
55 Tufts University, Medford, Massachusetts 02155
56 Waseda University, Tokyo 169, Japan
57 Wayne State University, Detroit, Michigan 48201
58 University of Wisconsin, Madison, Wisconsin 53706
59 Yale University, New Haven, Connecticut 06520
                Data collected in Run II of the Fermilab Tevatron are searched for indications of new electroweak
              scale physics. Rather than focusing on particular new physics scenarios, CDF data are analyzed
              for discrepancies with respect to the standard model prediction. A model-independent approach
              (Vista) considers the gross features of the data, and is sensitive to new large cross section physics.
              A quasi-model-independent approach (Sleuth) searches for a significant excess of events with large
              summed transverse momentum, and is particularly sensitive to new electroweak scale physics that
              appears predominantly in one final state. This global search for new physics in over three hundred
              exclusive final states in 927 pb−1 of pp̄ collisions at $\sqrt{s} = 1.96$ TeV reveals no such significant
              indication of physics beyond the standard model.
   2. is not due to a mismodeling of the detector response, and

   3. is not due to an inadequate implementation of the standard model prediction,

and therefore must be due to new underlying physics. Any observed discrepancy is subject to
scrutiny, and explanations are sought in terms of the above points.
   The Vista and Sleuth algorithms provide a means for making the above three arguments, with a
high threshold placed on the statistical significance of a discrepancy in order to minimize the
chance of a false discovery claim. As described later, this threshold is the requirement that the
false discovery rate is less than 0.001, after taking into account the total number of final
states, distributions, or regions being examined.
   This analysis employs a correction model implementing specific hypotheses to account for
mismodeling of detector response and imperfect implementation of the standard model prediction.
Achieving this on the entire high-pT dataset requires a framework for quickly implementing and
testing modifications to the correction model. The specific details of the correction model are
intentionally kept as simple as possible in the interest of transparency in the event of a possible
new physics claim. Vista’s toolkit includes a global comparison of data to the standard model
prediction, with a check of thousands of kinematic distributions and an easily adjusted correction
model allowing a quick fit for values of associated correction factors.
   The traditional notions of signal and control regions are modified. Without prejudice as to
where new physics may appear, all regions of the data are treated as both signal and control. This
analysis is not blind, but rather seeks to identify and understand discrepancies between data and
the standard model prediction. With the goal of discovery, emphasis is placed on examining
discrepancies, focusing on outliers rather than global goodness of fit. Individual discrepancies
that are not statistically significant are generally not pursued.
   Vista and Sleuth are employed simultaneously, rather than sequentially. An effect highlighted
by Sleuth prompts additional investigation of the discrepancy, usually resulting in a specific
hypothesis explaining the discrepancy in terms of a detector effect or adjustment to the standard
model prediction that is then fed back and tested for global consistency using Vista.
   Forming hypotheses for the cause of specific discrepancies, implementing those hypotheses to
assess their wider consequences, and testing global agreement after the implementation are
emphasized as the crucial activities for the investigator throughout the process of data analysis.
This process is constrained by the requirement that all adjustments be physically motivated. The
investigation and resolution of discrepancies highlighted by the algorithms is the defining
characteristic of this global analysis [52].
   This search for new physics terminates when one of two conditions is satisfied: either a
compelling case for new physics is made, or there remain no statistically significant discrepancies
on which a new physics case can be made. In the former case, to quantitatively assess the
significance of the potential discovery, a full treatment of systematic uncertainties must be
implemented. In the latter case, it is sufficient to demonstrate that all observed effects are not
in significant disagreement with an appropriate global standard model description.

                               III.   VISTA

   This section describes Vista: object identification, event selection, estimation of standard
model backgrounds, simulation of the CDF detector response, development of a correction model, and
results.

                            A.   CDF II detector

   CDF II is a general purpose detector [7, 8] designed to detect particles produced in pp̄
collisions. The detector has a cylindrical layout centered on the accelerator beamline.
   CDF uses a cylindrical coordinate system with the z-axis along the axis of the colliding beams.
The variable θ is the polar angle relative to the incoming proton beam, and the variable φ is the
azimuthal angle about the beam axis. The pseudorapidity of a particle trajectory is defined as
η = − ln(tan(θ/2)). It is also useful to define detector pseudorapidity ηdet, denoting a particle’s
pseudorapidity in a coordinate system in which the origin lies at the center of the CDF detector
rather than at the event vertex. The transverse momentum pT is the component of the momentum
projected on a plane perpendicular to the beam axis.
   Charged particle tracks are reconstructed in a 3.1 m long open cell drift chamber that performs
up to 96 measurements of the track position in the radial region from 0.4 m to 1.4 m. Between the
beam pipe and this tracking chamber are multiple layers of silicon microstrip detectors, enabling
high precision determination of the impact parameter of a track relative to the primary event
vertex. The tracking detectors are immersed in a uniform 1.4 T solenoidal magnetic field.
   Outside the solenoid, calorimeter modules are arranged in a projective tower geometry to provide
energy measurements for both charged and neutral particles. Proportional chambers are embedded in
the electromagnetic calorimeters to measure the transverse profile of electromagnetic showers at a
depth corresponding to the shower maximum for electrons. The outermost part of the detector
consists of a series of drift chambers used to detect and identify muons, minimum ionizing
particles that typically pass through the calorimeter.
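The coordinate definitions given in the detector description, η = − ln(tan(θ/2)) and pT as the momentum component perpendicular to the beam axis, can be illustrated with a minimal sketch. The function names and the choice of radians and Cartesian momentum components (px, py) are assumptions made for illustration only; they are not taken from CDF software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln(tan(theta/2)) for a polar angle theta in radians.

    theta is measured relative to the incoming proton beam, so it must lie
    strictly between 0 and pi (eta diverges along the beam axis).
    """
    if not 0.0 < theta < math.pi:
        raise ValueError("theta must lie strictly between 0 and pi")
    return -math.log(math.tan(theta / 2.0))


def transverse_momentum(px: float, py: float) -> float:
    """Return pT, the momentum projected on the plane perpendicular to the z (beam) axis."""
    return math.hypot(px, py)


if __name__ == "__main__":
    # A particle emitted at 90 degrees to the beam has eta close to zero;
    # smaller polar angles give larger positive eta.
    for degrees in (90.0, 45.0, 10.0):
        print(degrees, pseudorapidity(math.radians(degrees)))
    print(transverse_momentum(3.0, 4.0))
```

Detector pseudorapidity ηdet would use the same formula, with θ measured from the center of the detector rather than from the event vertex.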
   A set of forward gas Čerenkov detectors is used to measure the average number of inelastic pp̄
collisions per Tevatron bunch crossing, and hence determine the luminosity acquired. A three level
trigger and data acquisition system selects the most interesting collisions for offline analysis.
   Here and below the word “central” is used to describe objects with |ηdet| < 1.0; “plug” is used
to describe objects with 1.0 < |ηdet| < 2.5.

                          B.   Object identification

   Energetic and isolated electrons, muons, taus, photons, jets, and b-tagged jets with
|ηdet| < 2.5 and pT > 17 GeV are identified according to standard criteria. The same criteria are
used for all events. The isolation criteria employed vary according to object, but roughly require
less than 2 GeV of extra energy flow within a cone of
$\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2} = 0.4$ in η–φ space around each object.
   Standard CDF criteria [9] are used to identify electrons (e±) in the central and plug regions
of the CDF detector. Electrons are characterized by a narrow shower in the central or plug
electromagnetic calorimeter and a matching isolated track in the central gas tracking chamber or a
matching plug track in the silicon detector.
   Standard CDF muons (µ±) are identified using three separate subdetectors in the regions
|ηdet| < 0.6, 0.6 < |ηdet| < 1.0, and 1.0 < |ηdet| < 1.5 [9]. Muons are characterized by a track in
the central tracking chamber matched to a track segment in the central muon detectors, with energy
consistent with minimum ionizing deposition in the electromagnetic and hadronic calorimeters along
the muon trajectory.
   Narrow central jets with a single charged track are identified as tau leptons (τ±) that have
decayed hadronically [10]. Taus are distinguished from electrons by requiring a substantial
fraction of their energy to be deposited in the hadron calorimeter; taus are distinguished from
muons by requiring no track segment in the muon detector coinciding with the extrapolated track of
the tau. Track and calorimeter isolation requirements are imposed.
   Standard CDF criteria requiring the presence of a narrow electromagnetic cluster with no
associated tracks are used to identify photons (γ) in the central and plug regions of the CDF
detector [11].
   Jets (j) are reconstructed using the JetClu [12] clustering algorithm with a cone of size
∆R = 0.4, unless the event contains one or more jets with pT > 200 GeV and no leptons or photons,
in which case cones of ∆R = 0.7 are used. Jet energies are appropriately corrected to the parton
level [13]. Since uncertainties in the standard model prediction grow with increasing jet
multiplicity, up to the four largest pT jets are used to characterize the event; any reconstructed
jets with pT-ordered ranking of five or greater are neglected, except in final states with small
summed scalar transverse momentum containing only jets.
   A secondary vertex b-tagging algorithm is used to identify jets likely resulting from the
fragmentation of a bottom quark (b) produced in the hard scattering [14].
   Momentum visible in the detector but not clustered into an electron, muon, tau, photon, jet, or
b-tagged jet is referred to as unclustered momentum (uncl).
   Missing momentum (/p) is calculated as the negative vector sum of the 4-vectors of all
identified objects and unclustered momentum. An event is said to contain a /p object if the
transverse momentum of this object exceeds 17 GeV, and if additional quality criteria
discriminating against fake missing momentum due to jet mismeasurement are satisfied [53].

                             C.   Event selection

   Events containing an energetic and isolated electron, muon, tau, photon, or jet are selected. A
set of three level online triggers requires:

   • a central electron candidate with pT > 18 GeV passing level 3, with an associated track
     having pT > 8 GeV and an electromagnetic energy cluster with pT > 16 GeV at levels 1 and 2; or

   • a central muon candidate with pT > 18 GeV passing level 3, with an associated track having
     pT > 15 GeV and muon chamber track segments at levels 1 and 2; or

   • a central or plug photon candidate with pT > 25 GeV passing level 3, with hadronic to
     electromagnetic energy less than 1:8 and with energy surrounding the photon to the photon’s
     energy less than 1:7 at levels 1 and 2; or

   • a central or plug jet with pT > 20 GeV passing level 3, with 15 GeV of transverse momentum
     required at levels 1 and 2, with corresponding prescales of 50 and 25, respectively; or

   • a central or plug jet with pT > 100 GeV passing level 3, with energy clusters of 20 GeV and
     90 GeV required at levels 1 and 2; or

   • a central electron candidate with pT > 4 GeV and a central muon candidate with pT > 4 GeV
     passing level 3, with a muon segment, electromagnetic cluster, and two tracks with pT > 4 GeV
     required at levels 1 and 2; or

   • a central electron or muon candidate with pT > 4 GeV and a plug electron candidate with
     pT > 8 GeV, requiring a central muon segment and track or central electromagnetic energy
     cluster and track at levels 1 and 2, together with an isolated plug electromagnetic energy
     cluster; or
   • two central or plug electromagnetic clusters with pT > 18 GeV passing level 3, with hadronic
     to electromagnetic energy less than 1:8 at levels 1 and 2; or

   • two central tau candidates with pT > 10 GeV passing level 3, each with an associated track
     having pT > 10 GeV and a calorimeter cluster with pT > 5 GeV at levels 1 and 2.

   Events satisfying one or more of these online triggers are recorded for further study. Offline
event selection for this analysis uses a variety of further filters. Single object requirements
keep events containing:

possibly with other objects present. Explicit online triggers feeding this offline selection are
required. The pT thresholds for these criteria are chosen to be sufficiently above the online
trigger turn-on curves that trigger efficiencies can be treated as roughly independent of object
pT.
   Good run criteria are imposed, requiring the operation of all major subdetectors. To reduce
contributions from cosmic rays and events from beam halo, standard CDF cosmic ray and beam halo
filters are applied [15].
   These selections result in a sample of roughly two million high-pT data events in an integrated
luminosity of 927 pb−1.
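The rough isolation requirement quoted in the object identification section (less than 2 GeV of extra energy flow within a cone of ∆R = 0.4 in η–φ space) can be sketched as follows. This is an illustration under stated assumptions, not the CDF implementation: the tuple layout (eta, phi, energy in GeV), the function names, and the assumption that the deposit list already excludes the object's own energy are all hypothetical, and the actual criteria vary by object type.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Return the azimuthal separation wrapped into the interval (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    if d > math.pi:
        d -= 2.0 * math.pi
    return d


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Return Delta R = sqrt(Delta eta^2 + Delta phi^2) between two directions."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def is_isolated(obj, deposits, cone=0.4, max_extra_gev=2.0):
    """Check the rough isolation criterion for one object.

    obj      -- (eta, phi) of the candidate object
    deposits -- iterable of (eta, phi, energy_gev) for energy NOT belonging
                to the object itself (an assumption of this sketch)
    """
    extra = sum(
        energy
        for eta, phi, energy in deposits
        if delta_r(obj[0], obj[1], eta, phi) < cone
    )
    return extra < max_extra_gev


if __name__ == "__main__":
    candidate = (0.0, 0.0)
    nearby = [(0.1, 0.1, 1.5), (1.0, 1.0, 10.0)]  # only the first lies inside the cone
    print(is_isolated(candidate, nearby))
```

Wrapping ∆φ matters because two directions on either side of φ = 0 are close in azimuth even though their raw φ values differ by nearly 2π.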
TABLE I: The 44 correction factors introduced in the Vista correction model. The leftmost column (Code) shows correction
factor codes. The second column (Category) shows correction factor categories. The third column (Explanation) provides a
short description. The correction factor best fit value (Value) is given in the fourth column. The correction factor error (Error)
resulting from the fit is shown in the fifth column. The fractional error (Error(%)) is listed in the sixth column. All values
are dimensionless with the exception of code 0001 (luminosity), which has units of pb−1 . The values and uncertainties of these
correction factors are valid within the context of this correction model.
can be chosen so that the standard model prediction is in agreement with the CDF high-pT data.
   The correction model is obtained by an iterative procedure informed by observed inadequacies in
modeling. The process of correction model improvement, motivated by observed discrepancies, may
allow a real signal to be artificially suppressed. If adjusting correction factor values within
allowed bounds removes a signal, then the case for the signal disappears, since it can be explained
in terms of known physics. This is true in any analysis. The stronger the constraints on the
correction model, the more difficult it is to artificially suppress a real signal. By requiring a
consistent interpretation of hundreds of final states, Vista is less likely to mistakenly explain
away new physics than if it had more limited scope.
   The 44 correction factors currently included in the Vista correction model are shown in
Table I. These factors can be classified into two categories: theoretical and experimental. A more
detailed description of each individual correction factor is provided in Appendix A 4.
   Theoretical correction factors reflect the practical difficulty of calculating accurately
within the framework of the standard model. These factors take the form of k-factors, so-called
“knowledge factors,” representing the ratio of the unavailable all order cross section to the
calculable leading order cross section. Twenty-three k-factors are used for standard model
processes including QCD multijet production, W+jets, Z+jets, and (di)photon+jets production.
   Experimental correction factors include the integrated luminosity of the data, efficiencies
associated with triggering on electrons and muons, efficiencies associated with the correct
identification of physics objects, and fake rates associated with the mistaken identification of
physics objects. Obtaining an adequate description of object misidentification has required an
understanding of the underlying physical mechanisms by which objects are misreconstructed, as
described in Appendix A 1.
   In the interest of simplicity, correction factors representing k-factors, efficiencies, and
fake rates are generally taken to be constants, independent of kinematic quantities such as object
pT, with only five exceptions. The pT dependence of three fake rates is too large to be treated as
approximately constant: the jet faking electron rate p(j → e) in the plug region of the CDF
detector; the jet faking b-tagged jet rate p(j → b), which increases steadily

details relegated to Appendix A 3.
   Events are first partitioned into final states according to the number and types of objects
present. Each final state is then subdivided into bins according to each object’s detector
pseudorapidity (ηdet) and transverse momentum (pT), as described in Appendix A 3 a.
   Generated Monte Carlo events, adjusted by the correction model, provide the standard model
prediction for each bin. The standard model prediction in each bin is therefore a function of the
correction factor values. A figure of merit is defined to quantify global agreement between the
data and the standard model prediction, and correction factor values are chosen to maximize this
agreement, consistent with external experimental constraints.
   Letting $\vec{s}$ represent a vector of correction factors, for the kth bin

   \chi^2_k(\vec{s}) = \frac{(\mathrm{Data}[k] - \mathrm{SM}[k])^2}{\sqrt{\mathrm{SM}[k]}^{\,2} + \delta\mathrm{SM}[k]^2},   (1)

where Data[k] is the number of data events observed in the kth bin, SM[k] is the number of events
predicted by the standard model in the kth bin, δSM[k] is the Monte Carlo statistical uncertainty
on the standard model prediction in the kth bin [54], and $\sqrt{\mathrm{SM}[k]}$ is the
statistical uncertainty on the expected data in the kth bin. The standard model prediction SM[k] in
the kth bin is a function of $\vec{s}$.
   Relevant information external to the Vista high-pT data sample provides additional constraints
in this global fit. The CDF luminosity counters measure the integrated luminosity of the sample
described in this article to be 902 pb−1 ± 6% by measuring the fraction of bunch crossings in which
zero inelastic collisions occur [27]. The integrated luminosity of the sample measured by the
luminosity counters enters in the form of a Gaussian constraint on the luminosity correction
factor. Higher order theoretical calculations exist for some standard model processes, providing
constraints on corresponding k-factors, and some CDF experimental correction factors are also
constrained from external information. In total, 26 of the 44 correction factors are constrained.
The specific constraints employed are provided in Appendix A 3 b.
   The overall function to be minimized takes the form
with increasing pT ; and the jet faking tau rate p(j → τ ),                            X
                                                                            χ2 (~s) =       χ2k (~s) + χ2constraints (~s), (2)
which decreases steadily with increasing pT . Two other
                                                                                      k∈bins
fake rates possess geometrical features in η–φ due to the
construction of the CDF detector: the jet faking electron         where the sum in the first term is over bins in the CDF
rate p(j → e) in the central region, because of the fidu-         high-pT data sample with χ2k (~s) defined in Eq. 1, and the
cial tower geometry of the electromagnetic calorimeter;           second term is the contribution from explicit constraints.
and the jet faking muon rate p(j → µ), due to the non-              Minimization of χ2 (~s) in Eq. 2 as a function of the
trivial fiducial geometry of the muon chambers. After             vector of correction factors ~s results in a set of correction
determining appropriate functional forms, a single over-          factor values ~s0 providing the best global agreement be-
all multiplicative correction factor is used.                     tween the data and the standard model prediction. The
   Correction factor values are obtained from a global fit        best fit correction factor values are shown in Table I, to-
to the data. The procedure is outlined here, with further         gether with absolute and fractional uncertainties. The
rate p(j → γ), photon efficiency p(γ → γ), and k-factors for prompt photon production and prompt diphoton production are dominated by the γj, γjj, and γγ final states. Additional knowledge incorporated in the determination of fake rates is described in Appendix A 1.
   The global fit χ² per number of bins is 288.1/133 + 27.9, where the last term is the contribution to the χ² from the imposed constraints. A χ² per degree of freedom larger than unity is expected, since the limited set of correction factors in this correction model is not expected to provide a complete description of all features of the data. Emphasis is placed on individual outlying discrepancies that may motivate a new physics claim, rather than overall goodness of fit.
   Corrections to object identification efficiencies are typically less than 10%; fake rates are consistent with an understanding of the underlying physical mechanisms responsible; k-factors range from slightly less than unity to greater than two for some processes with multiple jets. All values obtained are physically reasonable. Further analysis is provided in Appendix A 4.
   With the details of the correction model in place, the complete standard model prediction can be obtained. For each Monte Carlo event after detector simulation, the event weight is multiplied by the value of the luminosity correction factor and the k-factor for the relevant standard model process. The single Monte Carlo event can be misreconstructed in a number of ways, producing a set of Monte Carlo events derived from the original, with weights multiplied by the probability of each misreconstruction. The weight of each resulting event is multiplied by the probability the event satisfies trigger criteria. The resulting standard model prediction, corrected as just described, is referred to as “the standard model prediction” throughout the rest of this paper, with “corrected” implied in all cases.

[Figure 1: “Vista Distributions.” Horizontal axis: σ, from −10 to 10, with an overflow bin.]
FIG. 1: Distribution of observed discrepancy between data and the standard model prediction, measured in units of standard deviation (σ), shown as the solid (green) histogram, before accounting for the trials factor. The upper pane shows the distribution of discrepancies between the total number of events observed and predicted in the 344 populated final states considered. Negative values on the horizontal axis correspond to a deficit of data compared to standard model prediction; positive values indicate an excess of data compared to standard model prediction. The lower pane shows the distribution of discrepancies between the observed and predicted shapes in 16,486 kinematic distributions. Distributions in which the shapes of data and standard model prediction are in relative disagreement correspond to large positive σ. The solid (black) curves indicate expected distributions, if the data were truly drawn from the standard model background. Interest is focused on the entries in the tails of the upper distribution and the high tail of the lower distribution. The final state entering the upper histogram at −4.03σ is the Vista 3jτ final state, which heads Table II. Most of the distributions entering the lower histogram with > 4σ derive from the 3j ∆R(j2, j3) discrepancy, discussed in the text.
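The global fit of Eqs. 1 and 2 can be illustrated with a small sketch. This is not the Vista implementation: the function names, the toy bin, and the grid scan are assumptions made for illustration, and only the 6% width of the luminosity constraint is taken from the text. A real fit over 44 correction factors would use a proper numerical minimizer.

```python
def chi2_bin(data, sm, delta_sm):
    """Eq. 1: the denominator adds the Poisson variance (sqrt(SM))^2 = SM
    and the Monte Carlo statistical variance delta_sm^2."""
    return (data - sm) ** 2 / (sm + delta_sm ** 2)


def chi2_constraints(s, constraints):
    """Gaussian constraint terms; each entry is (index into s, central value, sigma)."""
    return sum(((s[i] - central) / sigma) ** 2 for i, central, sigma in constraints)


def chi2_total(s, bins, constraints):
    """Eq. 2: sum of per-bin terms plus the constraint term.
    `bins` holds (data, predict) pairs, where predict(s) -> (SM[k], deltaSM[k])."""
    total = 0.0
    for data, predict in bins:
        sm, delta_sm = predict(s)
        total += chi2_bin(data, sm, delta_sm)
    return total + chi2_constraints(s, constraints)


# Toy setup (invented numbers): one bin whose prediction scales with a single
# luminosity-like correction factor s[0], constrained to 1.0 +/- 0.06.
toy_bins = [(110.0, lambda s: (100.0 * s[0], 2.0 * s[0]))]
toy_constraints = [(0, 1.0, 0.06)]


def scan_minimum(lo=0.9, hi=1.2, steps=300):
    """Crude one-dimensional grid scan standing in for a real minimizer."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda x: chi2_total([x], toy_bins, toy_constraints))


best_s0 = scan_minimum()
```

In this toy, the data excess pulls the correction factor above 1 while the Gaussian constraint pulls it back, which is the same tension that Eq. 2 resolves across all bins and constraints simultaneously.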
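The construction of the corrected standard model prediction from Monte Carlo event weights, as described in the text, can be sketched as follows. The function name, the outcome labels, and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow; the sketch assumes each misreconstruction outcome carries its own probability and its own trigger probability.

```python
def expand_event(mc_weight, lumi_factor, k_factor, reco_outcomes, trigger_prob):
    """Turn one simulated event into a set of weighted derived events.

    mc_weight     : weight of the Monte Carlo event after detector simulation
    lumi_factor   : luminosity correction factor
    k_factor      : k-factor of the standard model process that produced the event
    reco_outcomes : list of (final_state_label, probability) pairs, one per way
                    the event can be (mis)reconstructed
    trigger_prob  : mapping from final_state_label to trigger probability
    """
    base = mc_weight * lumi_factor * k_factor
    return [
        (label, base * prob * trigger_prob[label])
        for label, prob in reco_outcomes
    ]


# Toy example (all values invented): a dijet event that is usually
# reconstructed as two jets, but occasionally has one jet faking an electron.
derived = expand_event(
    mc_weight=2.0,
    lumi_factor=0.5,
    k_factor=2.0,
    reco_outcomes=[("2j", 0.75), ("e j", 0.25)],
    trigger_prob={"2j": 0.5, "e j": 1.0},
)
```

Summing the derived weights over all Monte Carlo events, final state by final state and bin by bin, would then give the quantity SM[k] that enters the comparison with data.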
Final State            Data      Background             σ
3jτ±                     71      113.7 ± 3.6          −2.3
5j                     1661      1902.9 ± 50.8        −1.7
2jτ±                    233      296.5 ± 5.6          −1.6
2j2τ±                     6      27 ± 4.6             −1.4
be±j                   2207      2015.4 ± 28.7        +1.4
3j, high ΣpT          35436      37294.6 ± 524.3      −1.1
e±3jp̸                  1954      1751.6 ± 42          +1.1
be±2j                   798      695.3 ± 13.3         +1.1
3jp̸, low ΣpT            811      967.5 ± 38.4         −0.8
e±µ±                     26      11.6 ± 1.5           +0.8
e±γ                     636      551.2 ± 11.2         +0.7
e±3j                  28656      27281.5 ± 405.2      +0.6
b5j                     131      95 ± 4.7             +0.5
j2τ±                     50      85.6 ± 8.2           −0.4
jτ±τ∓                    74      125 ± 13.6           −0.4
bp̸, low ΣpT              10      29.5 ± 4.6           −0.4
e±jγ                    286      369.4 ± 21.1         −0.3
e±jp̸τ∓                   29      14.2 ± 1.8           +0.2
2j, high ΣpT          96502      92437.3 ± 1354.5     +0.1
be±3j                   356      298.6 ± 7.7          +0.1
----------------------------------------------------------
8j                       11      6.1 ± 2.5
7j                       57      35.6 ± 4.9
6j                      335      298.4 ± 14.7
4j, low ΣpT           39665      40898.8 ± 649.2
4j, high ΣpT           8241      8403.7 ± 144.7
4j2γ                     38      57.5 ± 11
4jτ±                     20      36.9 ± 2.4
4jp̸, low ΣpT            516      525.2 ± 34.5
4jγp̸                     28      53.8 ± 11
4jγ                    3693      3827.2 ± 112.1
4jµ±                    576      568.2 ± 26.1
4jµ±p̸                   232      224.7 ± 8.5
4jµ±µ∓                   17      20.1 ± 2.5
3γ                       13      24.2 ± 3
3j, low ΣpT           75894      75939.2 ± 1043.9
3j2γ                    145      178.1 ± 7.4
3jp̸, high ΣpT            20      30.9 ± 14.4
3jγτ±                    13      11 ± 2
3jγp̸                     83      102.9 ± 11.1
3jγ                   11424      11506.4 ± 190.6
3jµ±p̸                  1114      1118.7 ± 27.1
3jµ±µ∓                   61      84.5 ± 9.2
3jµ±                   2132      2168.7 ± 64.2
3bj, low ΣpT             14      9.3 ± 1.9
2τ±                     316      290.8 ± 24.2
2γp̸                     161      176 ± 9.1
2γ                     8482      8349.1 ± 84.1
2j, low ΣpT           93408      92789.5 ± 1138.2
2j2γ                    645      612.6 ± 18.8
2jτ±τ∓                   15      25 ± 3.5
2jp̸, low ΣpT             74      106 ± 7.8
2jp̸, high ΣpT            43      37.7 ± 100.2
2jγ                   33684      33259.9 ± 397.6
2jγτ±                    48      41.4 ± 3.4
2jγp̸                    403      425.2 ± 29.7
2jµ±p̸                  7287      7320.5 ± 118.9
2jµ±γp̸                   13      12.6 ± 2.7
2jµ±γ                    41      35.7 ± 6.1
2jµ±µ∓                  374      394.2 ± 24.8
2jµ±                   9513      9362.3 ± 166.8
2e±j                     13      9.8 ± 2.2
2e±e∓                    12      4.8 ± 1.2
2e±                      23      36.1 ± 3.8
2b, low ΣpT             327      335.8 ± 7
2b, high ΣpT            187      173.1 ± 7.1
2b3j, high ΣpT           28      33.5 ± 5.5
2b2j, low ΣpT           355      326.3 ± 8.4
2b2j, high ΣpT           56      80.2 ± 5
2b2jγ                    16      15.4 ± 3.6
2bγ                      37      31.7 ± 4.8
2bj, low ΣpT            415      393.8 ± 9.1
2bj, high ΣpT           161      195.8 ± 8.3
2bjp̸, low ΣpT            28      23.2 ± 2.6
2bjγ                     25      24.7 ± 4.3
2be±2jp̸                  15      12.3 ± 1.6
2be±2j                   30      30.5 ± 2.5
2be±j                    28      29.1 ± 2.8
2be±                     48      45.2 ± 3.7
τ±τ∓                    498      428.5 ± 22.7
γτ±                     177      204.4 ± 5.4
γp̸                     1952      1945.8 ± 77.1
µ±τ±                     18      19.8 ± 2.3
µ±τ∓                    151      179.1 ± 4.7
µ±p̸                  321351      320500 ± 3475.5
µ±p̸τ∓                    22      25.8 ± 2.7
µ±γ                     269      285.5 ± 5.9
µ±γp̸                    269      282.2 ± 6.6
µ±µ∓p̸                    49      61.4 ± 3.5
µ±µ∓γ                    32      29.9 ± 2.6
µ±µ∓                  10648      10845.6 ± 96
j2γ                    2196      2200.3 ± 35.2
j2γp̸                     38      27.3 ± 3.2
jτ±                     563      585.7 ± 10.2
jp̸, low ΣpT            4183      4209.1 ± 56.1
jγ                    49052      48743 ± 546.3
jγτ±                    106      104 ± 4.1
jγp̸                     913      965.2 ± 41.5
jµ±                   33462      34026.7 ± 510.1
jµ±τ∓                    29      37.5 ± 4.5
jµ±p̸τ∓                   10      9.6 ± 2.1
jµ±p̸                  45728      46316.4 ± 568.2
jµ±γp̸                    78      69.8 ± 9.9
jµ±γ                     70      98.4 ± 12.1
jµ±µ∓                  1977      2093.3 ± 74.7
e±4j                   7144      6661.9 ± 147.2
e±4jp̸                   403      363 ± 9.9
e±3jτ∓                   11      7.6 ± 1.6
e±3jγ                    27      21.7 ± 3.4
e±2γ                     47      74.5 ± 5
e±2j                 126665      122457 ± 1672.6
e±2jτ∓                   53      37.3 ± 3.9
e±2jτ±                   20      24.7 ± 2.3
e±2jp̸                 12451      12130.1 ± 159.4
e±2jγ                   101      88.9 ± 6.1
e±τ∓                    609      555.9 ± 10.2
e±τ±                    225      211.2 ± 4.7
e±p̸                  476424      479572 ± 5361.2
e±p̸τ∓                    48      35 ± 2.7
e±p̸τ±                    20      18.7 ± 1.9
e±γp̸                    141      144.2 ± 6
e±µ∓p̸                    54      42.6 ± 2.7
e±µ±p̸                    13      10.9 ± 1.3
e±µ∓                    153      127.6 ± 4.2
e±j                  386880      392614 ± 5031.8
e±j2γ                    14      15.9 ± 2.9
e±jτ±                    79      79.3 ± 2.9
e±jτ∓                   162      148.8 ± 7.6
e±jp̸                  58648      57391.7 ± 661.6
e±jγp̸                    52      76.2 ± 9
e±jµ∓p̸                   22      13.1 ± 1.7
e±jµ∓                    28      26.8 ± 2.3
e±e∓4j                  103      113.5 ± 5.9
e±e∓3j                  456      473 ± 14.6
e±e∓2jp̸                  30      39 ± 4.6
e±e∓2j                 2149      2152 ± 40.1
e±e∓τ±                   14      11.1 ± 2
e±e∓p̸                   491      487.9 ± 12
e±e∓γ                   127      132.3 ± 4.2
e±e∓j                 10726      10669.3 ± 123.5
e±e∓jp̸                  157      144 ± 11.2
e±e∓jγ                   26      45.6 ± 4.7
e±e∓                  58344      58575.6 ± 603.9
b6j                      24      15.5 ± 2.3
b4j, low ΣpT             13      9.2 ± 1.8
b4j, high ΣpT           464      499.2 ± 12.4
b3j, low ΣpT           5354      5285 ± 72.4
b3j, high ΣpT          1639      1558.9 ± 24.1
b3jp̸, low ΣpT           111      116.8 ± 11.2
b3jγ                    182      194.1 ± 8.8
b3jµ±p̸                   37      34.1 ± 2
b3jµ±                    47      52.2 ± 3
b2γ                      15      14.6 ± 2.1
b2j, low ΣpT           8812      8576.2 ± 97.9
b2j, high ΣpT          4691      4646.2 ± 57.7
b2jp̸, low ΣpT           198      209.2 ± 8.3
b2jγ                    429      425.1 ± 13.1
b2jµ±p̸                   46      40.1 ± 2.7
b2jµ±                    56      60.6 ± 3.4
bτ±                      19      19.9 ± 2.2
bγ                      976      1034.8 ± 15.6
bγp̸                      18      16.7 ± 3.1
bµ±                     303      263.5 ± 7.9
bµ±p̸                    204      218.1 ± 6.4
bj, low ΣpT            9060      9275.7 ± 87.8
bj, high ΣpT           7236      7030.8 ± 74
bj2γ                     13      17.6 ± 3.3
bjτ±                     13      12.9 ± 1.8
bjp̸, low ΣpT             53      60.4 ± 19.9
bjγ                     937      989.4 ± 20.6
bjγp̸                     34      30.5 ± 4
bjµ±p̸                   104      112.6 ± 4.4
bjµ±                    173      141.4 ± 4.8
be±3jp̸                   68      52.2 ± 2.2
be±2jp̸                   87      65 ± 3.3
be±p̸                    330      347.2 ± 6.9
be±jp̸                   211      176.6 ± 5
be±e∓j                   22      34.6 ± 2.6
TABLE II: A subset of the Vista comparison between Tevatron Run II data and standard model prediction, showing the twenty most discrepant final states and all final states populated with ten or more data events. Events are partitioned into exclusive final states based on standard CDF object identification criteria. Final states are labeled in this table according to the number and types of objects present, and whether (high ΣpT) or not (low ΣpT) the summed scalar transverse momentum of all objects in the events exceeds 400 GeV, for final states not containing leptons or photons. Final states are ordered according to decreasing discrepancy between the total number of events expected, taking into account the error from Monte Carlo statistics, and the total number observed in the data. Final states exhibiting mild discrepancies are shown together with the significance of the discrepancy in units of standard deviations (σ) after accounting for a trials factor corresponding to the number of final states considered. Final states that do not exhibit even mild discrepancies are listed below the horizontal line in inverted alphabetical order. Only Monte Carlo statistical uncertainties on the background prediction are included.
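The effect of the trials factor on a quoted significance can be illustrated with a short sketch. It assumes, as an illustration rather than as the documented Vista procedure, that the trials factor is applied as p_trials = 1 − (1 − p)^N with N the number of final states considered, and that significances are converted to and from one-sided Gaussian tail probabilities. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sigma_after_trials(sigma_before, n_trials):
    """Convert a per-final-state significance into a significance after
    accounting for n_trials independent looks (assumed form of the trials factor).
    The sign (deficit or excess) of the input is preserved."""
    norm = NormalDist()
    p = norm.cdf(-abs(sigma_before))            # one-sided tail probability
    p_trials = 1.0 - (1.0 - p) ** n_trials      # chance of at least one such fluctuation
    if p_trials >= 1.0:
        return -math.copysign(float("inf"), sigma_before)
    magnitude = norm.inv_cdf(1.0 - p_trials)
    return math.copysign(magnitude, sigma_before)


# The 3jτ final state enters Fig. 1 at -4.03σ before the trials factor,
# with 344 populated final states considered.
sigma_3jtau = sigma_after_trials(-4.03, 344)
```

Under these assumptions the −4.03σ deficit before the trials factor becomes roughly −2.3σ afterwards, in line with the first row of Table II.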
FIG. 3: A shape discrepancy highlighted by Vista in the final state consisting of exactly three reconstructed jets with |η| < 2.5 and pT > 17 GeV, and with one of the jets satisfying |η| < 1 and pT > 40 GeV. This distribution illustrates the effect underlying most of the Vista shape discrepancies. Filled (black) circles show CDF data, with the shaded (red) histogram showing the prediction of Pythia. The discrepancy is clearly statistically significant, with statistical error bars smaller than the size of the data points. The vertical axis shows the number of events per bin, with the horizontal axis showing the angular separation ($\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}$) between the second and third jets, where the jets are ordered according to decreasing transverse momentum. In the region ∆R(j2, j3) ≳ 2, populated primarily by initial state radiation, the standard model prediction can to some extent be adjusted. The region ∆R(j2, j3) ≲ 2 is dominated by final state radiation, the description of which is constrained by data from LEP 1.

[Figure 4: bj, ΣpT > 400 GeV. Horizontal axis: M(j) (GeV); vertical axis: Number of Events. Legend: CDF Run II Data; Other; MadEvent W(→eν) jj: 0%; Pythia γj: 0.2%; Pythia bj: 15.8%; Pythia jj: 83.9%.]
FIG. 4: The jet mass distribution in the bj final state with ΣpT > 400 GeV. The 3j ∆R(j2, j3) discrepancy illustrated in Fig. 3 manifests itself also by producing jets more massive in data than predicted by Pythia’s showering algorithm. The mass of a jet is determined by treating energy deposited in each calorimeter tower as a massless 4-vector, summing the 4-vectors of all towers within the jet, and computing the mass of the resulting (massive) 4-vector.
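The two kinematic quantities used in Figs. 3 and 4 can be sketched in a few lines. The tower representation as (energy, η, φ) tuples and the function names are assumptions made for illustration; the CDF reconstruction software is not reproduced here.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation ΔR = sqrt(Δη² + Δφ²), with Δφ wrapped into [-π, π]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def jet_mass(towers):
    """Mass of a jet built from calorimeter towers.

    Each tower is an (energy, eta, phi) tuple treated as a massless 4-vector,
    so its momentum magnitude equals its energy. The 4-vectors are summed and
    the invariant mass of the sum is returned.
    """
    e_sum = px = py = pz = 0.0
    for energy, eta, phi in towers:
        pt = energy / math.cosh(eta)   # transverse momentum of a massless tower
        e_sum += energy
        px += pt * math.cos(phi)
        py += pt * math.sin(phi)
        pz += pt * math.sinh(eta)
    m2 = e_sum ** 2 - (px ** 2 + py ** 2 + pz ** 2)
    return math.sqrt(max(m2, 0.0))     # guard against tiny negative rounding


# Toy jet (invented values): two equal towers separated by 0.2 rad in φ.
toy_towers = [(50.0, 0.0, 0.1), (50.0, 0.0, -0.1)]
toy_mass = jet_mass(toy_towers)
```

A single tower gives zero mass, while two separated towers give a nonzero mass that grows with their opening angle, which is why two nearby jets and one massive jet can look alike in the calorimeter.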
3j ∆R(j2, j3) and jet mass discrepancies appear to be two different views of a single underlying
discrepancy, noting that two sufficiently nearby distinct jets correspond to a pattern of
calorimetric energy deposits similar to a single massive jet. The underlying 3j ∆R(j2, j3)
discrepancy is manifest in many other final states. The final state b e j, arising primarily from
QCD production of three jets with one misreconstructed as an electron, shows a similar discrepancy
in ∆R(j, b) in Fig. 5.
   While these discrepancies are clearly statistically significant, basing a new physics claim
around them is difficult. In the kinematic regime of the discrepancy, different algorithms to match
exact leading order calculations with a parton shower lead to different predictions [31]. Newer
predictions have not been systematically compared to LEP 1 data, which provide constraints on parton
showering reflected in Pythia’s tuning. Further investigation into obtaining an adequate QCD-based
description of this discrepancy continues.
   An additional 59 discrepant distributions reflect an inadequate modeling of the overall
transverse boost of the system. The overall transverse boost of the primary physics objects in the
event is attributed to two sources: the intrinsic Fermi motion of the colliding partons within the
proton, and soft or collinear radiation of the colliding partons as they approach collision.
Together these effects are here referred to as “intrinsic kT,” representing an overall momentum kick
to the hard scattering. Further discussion appears in Appendix A 2 c.
   The remaining 13 discrepant distributions are seen to be due to the coarseness of the Vista
correction model. Most of these discrepancies, which are at the level of 10% or less when expressed
as (data − theory)/theory, arise from modeling most fake rates as independent of transverse
momentum.
   In summary, this global analysis of the bulk features of the high-pT data has not yielded a
discrepancy motivating a new physics claim. There are no statistically significant population
discrepancies in the 344 populated final states considered, and although there are several
statistically significant discrepancies among the 16,486 kinematic distributions investigated, the
nature of these discrepancies makes it difficult to use them to support a new physics claim.
   This global analysis of course cannot conclude with certainty that there is no new physics hiding
in the CDF data. The Vista population and shape statistics may be insensitive to a small excess of
events appearing at large ∑pT in a highly populated final state. For such signals another algorithm
is required.


                      IV.   SLEUTH

  Taking a broad view of all proposed models that might extend the standard model, a profound
commonality is noted: nearly all predict an excess of events at high pT, concentrated in a
particular final state. The second stage of this research program involves the systematic search
for such physics using an algorithm called Sleuth [32]. Sleuth is quasi model independent, where
“quasi” refers to the assumption that the first sign of new physics will appear as an excess of
events in some final state at large summed scalar transverse momentum (∑pT).
   The Sleuth algorithm used by CDF in Tevatron Run II is essentially that developed by DØ in
Tevatron Run I [33, 34, 35], and subsequently improved by H1 in HERA Run I [36], with small
modifications.
   Sleuth’s definition of interest relies on the following assumptions.

   1. The data can be categorized into exclusive final states in such a way that any signature of
      new physics is apt to appear predominantly in one of these final states.

   2. New physics will appear with objects at high summed transverse momentum (∑pT) relative to
      standard model and instrumental background.

   3. New physics will appear as an excess of data over standard model and instrumental background.


                      A.    Algorithm

  The Sleuth algorithm consists of three steps, following the above three assumptions.


                      1.   Final states

   In the first step of the algorithm, all events are placed into exclusive final states as in
Vista, with the following modifications.

   • Jets are identified as pairs, rather than individually, to reduce the total number of final
     states and to keep signal events with one additional radiated gluon within the same final
     state. Final state names include “n jj” if n jet pairs are identified, with possibly one
     unpaired jet assumed to have originated from a radiated gluon.

   • The present understanding of quark flavor suggests that b quarks should be produced in pairs.
     Bottom quarks are identified as pairs, rather than individually, to increase the robustness of
     identification and to reduce the total number of final states. Final state names include
     “n bb” if n b pairs are identified.

   • Final states related through global charge conjugation are considered to be equivalent. Thus
     e+ e− γ is a different final state than e+ e+ γ, but e+ e+ γ and e− e− γ together make up a
     single Sleuth final state.
   • Final states related through global interchange of the first and second generation are
     considered to be equivalent. Thus e+ p̸ γ and µ+ p̸ γ together make up a single Sleuth final
     state. The decision to consider third generation objects (b quarks and τ leptons) differently
     from first and second generation objects reflects theoretical prejudice that the third
     generation may be special, and the experimental ability (in the case of b quarks) and
     experimental challenge (in the case of τ leptons) in the identification of third generation
     objects.

   The symbol ℓ is used to denote electron or muon. The symbol W is used in naming final states
containing one electron or muon, significant missing momentum, and perhaps other non-leptonic
objects. Thus the final states e+ p̸ γ, e− p̸ γ, µ+ p̸ γ, and µ− p̸ γ are combined into the Sleuth
final state W γ. A table showing the relationship between Vista and Sleuth final states is provided
in Appendix B 1.


                        2.   Variable

  The second step of the algorithm considers a single variable in each exclusive final state: the
summed scalar transverse momentum of all objects in the event (∑pT). Assuming momentum conservation
in the plane transverse to the axis of the colliding beams,

      $\sum_i \vec{p}_i + \overrightarrow{\mathrm{uncl}} + \vec{\not{p}} = \vec{0}$,          (3)

where the sum over i represents a sum over all identified objects in the event, the ith object has
momentum $\vec{p}_i$, $\overrightarrow{\mathrm{uncl}}$ denotes the vector sum of all momentum
visible in the detector but not clustered into an identified object, $\vec{\not{p}}$ denotes the
missing momentum, and the equation is a two-component vector equality for the components of the
momentum along the two spatial directions transverse to the axis of the colliding beams. The Sleuth
variable ∑pT is then defined by

      $\sum p_T \equiv \sum_i |\vec{p}_i| + |\overrightarrow{\mathrm{uncl}}| + |\vec{\not{p}}|$,     (4)

where only the momentum components transverse to the axis of the colliding beams are considered when
computing magnitudes.


                        3.   Regions

  The algorithm’s third step involves searching for regions in which more events are seen in the
data than expected from standard model and instrumental background. This search is performed in the
variable space defined in the second step of the algorithm, for each of the exclusive final states
defined in the first step.
  The steps of the search can be sketched as follows.

   • In each final state, the regions considered are the one dimensional intervals in ∑pT extending
     from each data point up to infinity. A region is required to contain at least three data
     events, as described in Appendix B.

   • In a particular final state, the data point with the dth largest value of ∑pT defines an
     interval in the variable ∑pT extending from this data point up to infinity. This semi-infinite
     interval contains d data events. The standard model prediction in this interval, estimated
     from the Vista comparison described above, integrates to b predicted events. In this final
     state, the interest of the dth region is defined as the Poisson probability
     $p_d = \sum_{i=d}^{\infty} \frac{b^i}{i!} e^{-b}$ that the standard model background b would
     fluctuate up to or above the observed number of data events d in this region. The most
     interesting region in this final state is the one with smallest Poisson probability.

   • For this final state, pseudo experiments are generated, with pseudo data pulled from the
     standard model background. For each pseudo experiment, the interest of the most interesting
     region is calculated. An ensemble of pseudo experiments determines the fraction P of pseudo
     experiments in this final state in which the most interesting region is more interesting than
     the most interesting region in this final state observed in the data. If there is no new
     physics in this final state, P is expected to be a random number pulled from a uniform
     distribution in the unit interval. If there is new physics in this final state, P is expected
     to be small.

   • Looping over all final states, P is computed for each final state. The minimum of these values
     is denoted Pmin. The most interesting region in the final state with smallest P is denoted R.

   • The interest of the most interesting region R in the most interesting final state is defined
     by $\tilde{P} = 1 - \prod_a (1 - \hat{p}_a)$, where the product is over all Sleuth final
     states a, and p̂a is the lesser of Pmin and the probability for the total number of events
     predicted by the standard model in the final state a to fluctuate up to or above three data
     events. The quantity P̃ represents the fraction of hypothetical similar CDF experiments that
     would produce a final state with P < Pmin. The range of P̃ is the unit interval. If the data
     are distributed according to standard model prediction, P̃ is expected to be a random number
     pulled from a uniform distribution in the unit interval. If new physics is present, P̃ is
     expected to be small.
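
The search steps listed above can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the CDF implementation: the background is represented as a hypothetical list of weighted Monte Carlo events (∑pT, weight), pseudo data are drawn from that list, and all function names are invented for this example.

```python
import math
import random

def poisson_tail(d, b):
    """p_d = sum_{i>=d} b^i e^{-b} / i!, the chance that background b yields >= d events."""
    if d <= 0:
        return 1.0
    if b <= 0.0:
        return 0.0
    if d <= b:                              # complement of the lower sum
        term, below = math.exp(-b), 0.0
        for i in range(d):
            below += term
            term *= b / (i + 1)
        return max(0.0, 1.0 - below)
    term = math.exp(d * math.log(b) - b - math.lgamma(d + 1))
    total, i = 0.0, d                       # direct upper-tail sum, accurate for tiny p_d
    while term > 1e-17 * total:
        total += term
        i += 1
        term *= b / i
    return min(total, 1.0)

def most_interesting_region(data, background, min_events=3):
    """Scan semi-infinite regions [cut, inf); return (smallest p_d, its lower edge)."""
    ordered = sorted(data, reverse=True)
    best = (1.0, None)
    for d in range(min_events, len(ordered) + 1):
        cut = ordered[d - 1]                # d-th largest sum-pT data point
        b = sum(w for x, w in background if x >= cut)
        p = poisson_tail(d, b)
        if p < best[0]:
            best = (p, cut)
    return best

def poisson_sample(rng, mean):
    """Knuth's method; adequate for the modest means used in this sketch."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def final_state_P(data, background, n_pseudo=1000, seed=1):
    """Fraction of pseudo experiments with a more interesting region than the data."""
    rng = random.Random(seed)
    p_data, _ = most_interesting_region(data, background)
    values = [x for x, _ in background]
    weights = [w for _, w in background]
    more_interesting = 0
    for _ in range(n_pseudo):
        n = poisson_sample(rng, sum(weights))
        pseudo = rng.choices(values, weights=weights, k=n)
        if most_interesting_region(pseudo, background)[0] < p_data:
            more_interesting += 1
    return more_interesting / n_pseudo

def p_tilde(p_min, predicted_totals):
    """P-tilde = 1 - prod_a (1 - p_hat_a), p_hat_a = min(Pmin, P(>= 3 events | b_a))."""
    product = 1.0
    for b_a in predicted_totals:
        product *= 1.0 - min(p_min, poisson_tail(3, b_a))
    return 1.0 - product
```

With a finite number of pseudo experiments, `final_state_P` can only resolve P down to roughly 1/`n_pseudo`; the sketch returns 0 below that resolution rather than an upper bound.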
                          4.    Output

   The output of the algorithm is the most interesting region R observed in the data, and a number
P̃ quantifying the interest of R. A reasonable threshold for discovery is P̃ ≲ 0.001, which
corresponds loosely to a local 5σ effect after the trials factor is accounted for.
   Although no integration over systematic errors is performed in computing P̃, systematic
uncertainties do affect the final Sleuth result. If Sleuth highlights a discrepancy in a particular
final state, explanations in terms of a correction to the background estimate are considered. This
process necessarily requires physics judgement. A reasonable explanation of a Sleuth discrepancy in
terms of an inadequacy in the modeling of the detector response or standard model prediction that is
consistent with external information is fed back into the Vista correction model and tested for
global consistency. In this way, plausible explanations for discrepancies observed by Sleuth are
incorporated into the Vista correction model. This iteration continues until either all reasonable
explanations for a significant Sleuth discrepancy are exhausted, resulting in a possible new physics
claim, or no significant Sleuth discrepancy remains.


                     B.        Sensitivity

  Two important questions must be asked:

   • Will Sleuth find nothing if there is nothing to be found?

   • Will Sleuth find something if there is something to be found?

   If there is nothing to be found, Sleuth will find nothing 999 times out of 1000, given a uniform
distribution of P̃ and a discovery threshold of P̃ ≲ 0.001. The uniform distribution of P̃ in the
absence of new physics is illustrated in Fig. 6, using values of P̃ obtained in pseudo experiments
with pseudo data generated from the standard model prediction. Sleuth will of course return spurious
signals if provided improperly modeled backgrounds. The algorithm directly addresses the issue of
whether an observed hint is due to a statistical fluctuation. Sleuth itself is unable to address
systematic mismeasurement or incorrect modeling, but is quite useful in bringing these to attention.
   The answer to the second question depends on the degree to which the new physics satisfies the
three assumptions on which Sleuth is based: new physics will appear predominantly in one final
state, at high summed scalar transverse momentum, and as an excess of data over standard model
prediction. Sleuth’s sensitivity to any particular new phenomenon depends on the extent to which
this new phenomenon satisfies these assumptions.

[Plots: number of pseudo experiments versus P̃ (upper) and versus P̃ in units of σ (lower).]
FIG. 6: Distribution of 10³ P̃ values from 10³ CDF pseudo experiments, in which pseudo data are
pulled from the standard model prediction. The distribution of P̃ is shown in the unit interval
(upper), with one entry for each of the CDF pseudo experiments. The distribution of P̃ translated
into units of standard deviations is also shown (lower). The distribution of P̃ from pseudo
experiments is consistent with flat (upper), and consistent with a Gaussian when translated into
units of standard deviations (lower), as expected.


                     1.   Known standard model processes

   Consideration of specific standard model processes can provide intuition for Sleuth’s
sensitivity to new physics. This section tests Sleuth’s sensitivity to the production of top quark
pairs, W boson pairs, single top, and the Higgs boson.
   a. Top quark pairs. Top quark pair production results in two b jets and two W bosons, each of
which may decay leptonically or hadronically. The W branching ratios are such that this signal
predominantly populates the Sleuth final state W bb̄jj, where “W ” denotes an electron or muon and
significant missing momentum. Although the final states ℓ+ℓ−p̸ bb̄ were important in verifying the
top quark pair production hypothesis in the initial observation by CDF [5] and DØ [6] in 1995, most
of the
FIG. 7: (Top left) The Sleuth final state bb̄ℓ+ℓ′−p̸, consisting of events with one electron and one muon of opposite sign,
missing momentum, and two or three jets, one or two of which are b-tagged. Data corresponding to 927 pb⁻¹ are shown as
filled (black) circles; the standard model prediction is shown as the (red) shaded histogram. (Top right) The same final state
with tt̄ subtracted from the standard model prediction. (Bottom row) The Sleuth final state W bb̄jj, with the standard model
tt̄ contribution included (lower left) and removed (lower right). Significant discrepancies far surpassing Sleuth’s discovery
threshold are observed in these final states with tt̄ removed from the standard model background estimate. If the top quark
had not been predicted, Sleuth would have discovered it.
statistical power came from the final state W bb̄jj. The all hadronic decay final state bb̄ 4j has
only convincingly been seen after integrating substantial Run II luminosity [37]. Sleuth’s first
assumption that new physics will appear predominantly in one final state is thus reasonably well
satisfied. Since the top quark has a mass of 170.9 ± 1.8 GeV [38], the production of two such
objects leads to a signal at large ∑pT relative to the standard model background of W bosons
produced in association with jets, satisfying Sleuth’s second and third assumptions. Sleuth is
expected to perform reasonably well on this example.
   To quantitatively test Sleuth’s sensitivity to top quark pair production, this process is
removed from the standard model prediction, and the values of the Vista correction factors are
re-obtained from a global fit assuming ignorance of tt̄ production. Sleuth easily discovers tt̄
production in 927 pb⁻¹ in the final states bb̄ℓ+ℓ′−p̸ and W bb̄jj, shown in Fig. 7. Sleuth finds
P < 1.5 × 10⁻⁸ in bb̄ℓ+ℓ′−p̸ and P < 8.3 × 10⁻⁷ in W bb̄jj, far surpassing the discovery threshold
of P̃ ≲ 0.001.
   The test is repeated as a function of assumed integrated luminosity, and Sleuth is found to
highlight the top quark signal at an integrated luminosity of roughly 80 ± 60 pb⁻¹, where the large
variation arises from statistical fluctuations in the tt̄ signal events. Weaker constraints on the
Vista correction factors at lower integrated luminosity marginally increase the integrated
luminosity required to claim a discovery.

[Plots: number of events versus ∑pT (GeV), CDF Run II data compared with the standard model
prediction, in two panels. Left panel: SM = 56, d = 77; leading contributions Pythia WW 45%, Pythia
Z(→ττ) 27%, MadEvent W(→µν)γ 6.6%, MadEvent W(→µν)j 4.1%. Right panel: SM = 30, d = 77; leading
contributions Pythia Z(→ττ) 51%, MadEvent W(→µν)γ 13%, MadEvent W(→µν)j 7.6%, Pythia Z(→µµ) 6.2%.]

   b. W boson pairs. The sensitivity to standard model W W production is tested by removing this
process from the standard model background prediction and allowing the Vista correction factors to
be re-fit. In 927 pb⁻¹ of Tevatron Run II data, Sleuth identifies an excess in the final state
ℓ+ℓ′−p̸, consisting of an electron and muon of opposite sign and missing momentum. This excess
corresponds to P̃ < 2 × 10⁻⁴, sufficient for the discovery of W W , as shown in Fig. 8.
   c. Single top. Single top quarks are produced weakly, and predominantly decay to populate the
Sleuth final state W bb̄, satisfying Sleuth’s first assumption. Single top production will appear
as an excess of events, satisfying Sleuth’s third assumption. Sleuth’s second assumption is not
well satisfied for this example, since single top production does not lie at large ∑pT relative to
other standard model processes. Sleuth is thus expected to be outperformed by a targeted search in
this example.
   d. Higgs boson. Assuming a standard model Higgs boson of mass mh = 115 GeV, the dominant
observable production mechanisms are pp̄ → W h and pp̄ → Zh, populating the final states W bb̄,
ℓ+ℓ−bb̄, and p̸ bb̄. The signal is thus spread over three Sleuth final states. Events in the last of
these (p̸ bb̄) do not pass the Vista event selection, which does not use p̸ as a trigger object.
Sleuth’s first assumption is thus poorly satisfied for this example. The standard model Higgs boson
signal will appear as an excess, but as in the case of single top production it does not appear at
particularly large ∑pT relative to other standard model processes. Since the standard model Higgs
boson poorly satisfies Sleuth’s first and second assumptions, a targeted search for this specific
signal is expected to outperform Sleuth.


                     2.    Specific models of new physics

   To build intuition for Sleuth’s sensitivity to new physics signals, several sensitivity tests
are conducted for a variety of new physics possibilities. Some of the new physics models chosen have
already been considered by more specialized analyses within CDF, making possible a comparison
between Sleuth’s sensitivity and the sensitivity of these previous analyses.
   Sleuth’s sensitivity can be compared to that of a dedicated search by determining the minimum
new physics cross section σmin required for a discovery by each. The discovery for Sleuth occurs
when P̃ < 0.001. In most Sleuth regions satisfying the discovery threshold of P̃ < 0.001, the
probability for the predicted number of events to fluctuate up to or above the number of events
observed corresponds to greater than 5σ. The discovery for the dedicated search occurs when the
observed excess of data corresponds to a 5σ effect. Smaller σmin corresponds to greater sensitivity.
   The sensitivity tests are performed by first generating pseudo data from the standard model
background prediction. Signal events for the new physics model are generated, passed through the
chain of CDF detector simulation and event reconstruction, and consecutively added
TABLE III: Summary of Sleuth’s sensitivity to several new physics models, expressed in terms of the minimum production
cross section needed for discovery with 927 pb⁻¹. Where available, a comparison is made to the sensitivity of a dedicated
search for this model. The solid (red) box represents Sleuth’s sensitivity, and the open (white) box represents the sensitivity
of the dedicated analysis. Systematic uncertainties are not included in the sensitivity calculation. The width of each box shows
typical variation under fluctuation of data statistics. In Models 3 and 4, there is no targeted analysis available for comparison.
Sleuth is seen to perform comparably to the targeted analyses on models satisfying the assumptions on which Sleuth is
based.
to the pseudo data until Sleuth finds P̃ < 0.001. The number of signal events needed to trigger discovery is
used to calculate σmin.
   For each dedicated analysis to which Sleuth is compared, the number of standard model events expected
in 927 pb⁻¹ within the region targeted is used to calculate the number of signal events required in that region
to produce a discrepancy corresponding to 5σ. Using the signal efficiency determined in the dedicated analysis,
σmin is calculated. The effect of systematic uncertainties is removed from the dedicated analyses, and is not
included for Sleuth. The inclusion of systematic uncertainties will reduce the sensitivity of both Sleuth and
the dedicated analysis to the extent that the systematic parameters are allowed to vary. Vista and Sleuth have
the advantage of using a large data set to constrain them.
   The results of five such sensitivity tests are summarized in Table III. Sleuth is seen to perform comparably
to targeted analyses on models satisfying the assumptions on which Sleuth is based. For models in which
Sleuth’s simple use of ∑pT can be improved upon by optimizing for a specific feature, a targeted search may
be expected to achieve greater sensitivity. One of the important features of Sleuth is that it not only performs
reasonably well, but that it does so broadly. In Model 1, a search for a particular model point in a gauge
mediated supersymmetry breaking (GMSB) scenario, Sleuth gains an advantage by exploiting a final state not
considered in the targeted analysis [39]. In Model 2, a search for a Z′ decaying to lepton pairs, the targeted
analysis [40] exploits the narrow resonance in the e+e− invariant mass. In Models 3 and 4, which are searches
for a hadronically decaying Z′ of different masses, there is no targeted analysis against which to compare. In
Model 5, a search for a Z′ → tt̄ resonance, the signal appears at large summed scalar transverse momentum in a
particular final state, resulting in comparable sensitivity between Sleuth and the targeted analysis [41].
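The dedicated-analysis side of this comparison can be sketched as a single-bin counting experiment. The sketch below is illustrative only: it assumes a pure Poisson background with no systematic uncertainty, a one-sided Gaussian convention for converting 5σ to a tail probability, and hypothetical function names and example inputs (`poisson_tail`, `min_signal_events`, `sigma_min`, the background of 12.5 events, and the efficiency of 0.08 are not taken from the text). The Sleuth side of the comparison, which injects signal into pseudo data until P̃ < 0.001, is not reproduced here.

```python
import math


def poisson_tail(n_obs: int, mean: float) -> float:
    """Return P(N >= n_obs) for N ~ Poisson(mean), with mean > 0."""
    if n_obs <= 0:
        return 1.0
    total = 0.0
    k = n_obs
    while True:
        # Poisson probability of exactly k events, evaluated in log space.
        term = math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
        total += term
        # Past the mean the terms only shrink, so stop once they are negligible.
        if k > mean and term <= 1e-17 * total:
            return total
        k += 1


def min_signal_events(background: float, z_sigma: float = 5.0) -> float:
    """Smallest excess over `background` whose Poisson tail is below z_sigma."""
    # One-sided Gaussian tail probability (an assumed convention).
    p_threshold = 0.5 * math.erfc(z_sigma / math.sqrt(2.0))
    n_obs = max(1, math.ceil(background))
    while poisson_tail(n_obs, background) > p_threshold:
        n_obs += 1
    return n_obs - background


def sigma_min(background: float, efficiency: float, luminosity_pb: float) -> float:
    """Cross section (pb) giving the required signal yield in the target region."""
    return min_signal_events(background) / (efficiency * luminosity_pb)


if __name__ == "__main__":
    # Hypothetical inputs: 12.5 expected background events, 8% signal efficiency.
    print(sigma_min(background=12.5, efficiency=0.08, luminosity_pb=927.0))
```

Dividing the required signal yield by efficiency times integrated luminosity gives σmin in pb when the luminosity is expressed in pb⁻¹, as it is for the 927 pb⁻¹ quoted above.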
[Figure 9: histogram of the number of final states versus P on the unit interval; 72 entries.]
FIG. 9: The distribution of P in the data, with one entry for each final state considered by Sleuth.

                                       C.     Results

   The distribution of P for the final states considered by Sleuth in the data is shown in Fig. 9. The
concavity of this distribution reflects the degree to which the correction model described in Sec. III F has
been tuned. A crude correction model tends to produce a distribution that is concave upwards, as seen in this
figure, while an overly tuned correction model produces a distribution that is concave downwards, with more
final states than expected having P near the midpoint of the unit interval.
   The most interesting final states identified by Sleuth are shown in Fig. 10, together with a quantitative
measure (P) of the interest of the most interesting region in each final state, determined as described in
Sec. IV A 3. The legends of Fig. 10 show the primary contributing standard model processes in each of these
final states, together with the fractional contribution of each. The top six final states, which correspond to
entries in the leftmost bin in Fig. 9, span a range of populations, relevant physics objects, and important
background contributions.
   The final state bb̄, consisting of two or three reconstructed jets, one or two of which are b-tagged, heads
the list. These events enter the analysis by satisfying the Vista offline selection requiring one or more jets or
b-jets with pT > 200 GeV. The definition of Sleuth’s ∑pT variable is such that all events in this final state
consequently have ∑pT > 400 GeV. Sleuth chooses the region ∑pT > 469 GeV, which includes nearly 10⁴ data
events. The standard model prediction in this region is sensitive to the b-tagging efficiency p(b → b) and the
fake rate p(j → b), which have few strong constraints on their values for jets with pT > 200 GeV other than those
imposed by other Vista kinematic distributions within this and a few other related final states. For this region
Sleuth finds Pbb̄ = 0.0055, which is unfortunately not statistically significant after accounting for the trials
factor associated with looking in many different final states, as discussed below.
   The final state j /p, consisting of events with one reconstructed jet and significant missing transverse
momentum, is the second final state identified by Sleuth. The primary background is due to non-collision
processes, including cosmic rays and beam halo backgrounds, whose estimation is discussed in Appendix A 2 a.
Since the hadronic energy is not required to be deposited in time with the beam crossing, Sleuth’s analysis of
this final state is sensitive to particles with a lifetime between 1 ns and 1 µs that lodge temporarily in the
hadronic calorimeter, complementing Ref. [42].
   The final states ℓ+ℓ′+ /pjj, ℓ+ℓ′+ /p, and ℓ+ℓ′+ all contain an electron (ℓ) and muon (ℓ′) with identical
reconstructed charge (either both positive or both negative). The final states with and without missing
transverse momentum are qualitatively different in terms of the standard model processes contributing to the
background estimate, with the final state ℓ+ℓ′+ composed mostly of dijets where one jet is misreconstructed as
an electron and a second jet is misreconstructed as a muon; Z → τ+τ−, where one tau decays to a muon and the
other to a leading π0, one of the two photons from which converts while traveling through the silicon support
structure to result in an electron reconstructed with the same sign as the muon, as described in Appendix A 1;
and Z → µ+µ−, in which a photon is produced, converts, and is misreconstructed as an electron. The final states
containing missing transverse momentum are dominated by the production of W(→ µν) in association with one or
more jets, with one of the jets misreconstructed as an electron. The muon is significantly more likely than the
electron to have been produced in the hard interaction, since the fake rate p(j → µ) is roughly an order of
magnitude smaller than the fake rate p(j → e), as observed in Table I. The final state ℓ+ℓ′+ /pjj, which contains
two or three reconstructed jets in addition to the electron, muon, and missing transverse momentum, also has
some contribution from WZ and top quark pair production.
   The final state τ /p contains one reconstructed tau, significant missing transverse momentum, and one
reconstructed jet with pT > 200 GeV. This final state in principle also contains events with one reconstructed
tau, significant missing transverse momentum, and zero reconstructed jets, but such events do not satisfy the
offline selection criteria described in Sec. III C. Roughly half of the background is non-collision, in which two
different cosmic ray muons (presumably from the same cosmic ray shower) leave two distinct energy deposits in
the CDF hadronic calorimeter, one with pT > 200 GeV, and one with a single associated track from a pp̄ collision
occurring during the same bunch crossing. Less than a single event is predicted from this non-collision source
(using techniques described in Appendix A 2 a) over the past five years of Tevatron running.
   In these CDF data, Sleuth finds P̃ = 0.46. The fraction of hypothetical similar CDF experiments (assuming
a fixed standard model prediction, detector simulation,
[Figure 10: six panels, each showing number of events versus ∑pT (GeV), with an inset enlarging the region
selected by Sleuth. Values recovered from the panels:
  bb̄: P = 0.0055; region starts at 469 GeV; SM = 9540, d = 9900. Legend: Pythia jj 82%, Pythia bj 18%,
      Pythia γj 0.2%, Herwig tt̄ 0.11%, Other.
  j /p: P = 0.0092; region starts at 679 GeV; SM = 1030, d = 1150. Legend: Non-collision 88%, Pythia jj 9.3%,
      MadEvent Z(→νν) j 0.89%, MadEvent W(→µν) j 0.76%, Other.
  ℓ+ℓ′+ /pjj: P = 0.011; region starts at 168 GeV; SM = 0.71, d = 4. Legend: MadEvent W(→µν) jjj 23%,
      MadEvent W(→µν) jj 21%, Pythia WZ 14%, Herwig tt̄ 9.9%, Other.
  ℓ+ℓ′+ /p: P = 0.016; region starts at 117 GeV; SM = 6.9, d = 16. Legend: MadEvent W(→µν) γ 27%,
      MadEvent W(→µν) j 14%, Pythia Z(→µµ) 13%, MadEvent W(→µν) jj 9.5%, Other.
  τ /p: P = 0.016; region starts at 416 GeV.
  ℓ+ℓ′+: P = 0.021; region starts at 57 GeV.]
FIG. 10: The most interesting final states identified by Sleuth. The region chosen by Sleuth, extending up to infinity, is
shown by the (blue) arrow just below the horizontal axis. Data are shown as filled (black) circles, and the standard model
prediction is shown as the shaded (red) histogram. The Sleuth final state is labeled in the upper left corner of each panel, with
ℓ denoting e or µ, and ℓ+ ℓ′+ denoting an electron and muon with the same electric charge. The number at upper right in each
panel shows P, the fraction of hypothetical similar experiments in which something at least as interesting as the region shown
would be seen in this final state. The inset in each panel shows an enlargement of the region selected by Sleuth, together with
the number of events (SM) predicted by the standard model in this region, and the number of data events (d) observed in this
region.
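As a rough cross-check of the inset numbers, the naive local Poisson probability for the prediction to fluctuate up to the observed count can be computed directly. The sketch below uses the rounded SM and d values printed in four of the Fig. 10 insets; it ignores the uncertainty on the prediction and the freedom Sleuth has in choosing the region, so it is expected to give smaller probabilities than the P shown in each panel. The helper names and the one-sided Gaussian conversion to a number of standard deviations are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def poisson_tail(n_obs: int, mean: float) -> float:
    """Return P(N >= n_obs) for N ~ Poisson(mean), with mean > 0."""
    if n_obs <= 0:
        return 1.0
    total = 0.0
    k = n_obs
    while True:
        # Poisson probability of exactly k events, evaluated in log space.
        term = math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
        total += term
        # Past the mean the terms only shrink, so stop once they are negligible.
        if k > mean and term <= 1e-17 * total:
            return total
        k += 1


def local_significance(p_value: float) -> float:
    """Convert a one-sided tail probability (0 < p < 1) to Gaussian sigmas."""
    return NormalDist().inv_cdf(1.0 - p_value)


# (SM prediction, observed data) as printed in the Fig. 10 insets.
insets = {
    "bb": (9540.0, 9900),
    "j /p": (1030.0, 1150),
    "l+l'+ /p jj": (0.71, 4),
    "l+l'+ /p": (6.9, 16),
}

if __name__ == "__main__":
    for name, (sm, d) in insets.items():
        p_local = poisson_tail(d, sm)
        print(f"{name:12s} local p = {p_local:.2e} "
              f"({local_significance(p_local):.1f} sigma)")
```

The gap between these local values and the per-panel P illustrates why the trials factor matters before any excess is interpreted.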
and correction model) that would exhibit a final state with P smaller than the smallest P observed in the CDF
Run II data is approximately 46%. The actual value obtained for P̃ is not of particular interest, except to note
that this value is significantly greater than the threshold of ≲ 0.001 required to claim an effect of statistical
significance. Sleuth has not revealed a discrepancy of sufficient statistical significance to justify a new physics
claim.
   Systematics are incorporated into Sleuth in the form of the flexibility in the Vista correction model, as
described previously. This flexibility is significantly more important in practice than the uncertainties on
particular correction factor values obtained from the fit, although the latter are easier to discuss. The relative
importance of correction factor value uncertainties on Sleuth’s result depends on the number of predicted
standard model events (b) in Sleuth’s high ∑pT tail. The uncertainties on the correction factors of Table I are
such that the appropriate addition in quadrature gives a typical uncertainty of ≈ 10% on the total background
prediction in each final state. Using σsys ≈ 10% × b and σstat ≈ √b, the relative importance of systematic
uncertainty and statistical uncertainty is estimated to be σsys/σstat = 10% × b/√b = 0.1√b. Systematic and
statistical uncertainties are thus comparable in importance for high ∑pT tails containing b ∼ 100 predicted
events. The effect of systematic uncertainties is provided in this approximation rather than through a rigorous
integration over these uncertainties as nuisance parameters due to the high computational cost of performing
the integration. This estimate of systematic uncertainty is valid only within the particular correction model
resulting in the list of correction factors shown in Table I; additional changes to the correction model may
result in larger variation. The inclusion of additional systematic uncertainties does not qualitatively change the
conclusion that Sleuth has not revealed a discrepancy of sufficient statistical significance to justify a new
physics claim.
   Due to the large number of final states considered, there are regions (such as those shown in Fig. 10) in which
the probability for the standard model prediction to fluctuate up to or above the number of events observed in
the data corresponds to a significance exceeding 3σ if the appropriate trials factor is not accounted for. A
doubling of data may therefore result in discovery. In particular, although the excesses in Fig. 10 are currently
consistent with simple statistical fluctuations, if any of them are genuinely due to new physics, Sleuth will find
they pass the discovery threshold of P̃ < 0.001 with roughly a doubling of data.

                  V.   CONCLUSIONS

   A broad search for new physics (Vista) has been performed in 927 pb⁻¹ of CDF Run II data. A complete
standard model background estimate has been obtained and compared with data in 344 populated exclusive final
states and 16,486 relevant kinematic distributions, most of which have not been previously considered.
Consideration of exclusive final state populations yields no statistically significant (> 3σ) discrepancy after
the trials factor is accounted for. Quantifying the difference in shape of kinematic distributions using the
Kolmogorov-Smirnov statistic, significant discrepancies are observed between data and standard model
prediction. These discrepancies are believed to arise from mismodeling of the parton shower and intrinsic kT,
and represent observables for which a QCD-based understanding is highly motivated. None of the shape
discrepancies highlighted motivates a new physics claim.
   A further systematic search (Sleuth) for regions of excess on the high-∑pT tails of exclusive final states has
been performed, representing a quasi-model-independent search for new electroweak scale physics. Most of the
exclusive final states searched with Sleuth have not been considered by previous Tevatron analyses. A measure
of interest rigorously accounting for the trials factor associated with looking in many regions with few events is
defined, and used to quantify the most interesting region observed in the CDF Run II data. No region of excess
on the high-∑pT tail of any of the Sleuth exclusive final states surpasses the discovery threshold.
   Although this global analysis of course cannot prove that no new physics is hiding in these data, this broad
search of the Tevatron Run II data represents one of the single most encompassing tests of the particle physics
standard model at the energy frontier.

                  Acknowledgments

   Tim Stelzer and Fabio Maltoni have provided particularly valuable assistance in the estimation of standard
model backgrounds. Sergey Alekhin provided helpful correspondence in understanding the theoretical
uncertainties on W and Z production imposed as constraints on the Vista correction factor fit. Sascha Caron
provided insight gained from the general search at H1. Torbjörn Sjöstrand assisted in the understanding of
Pythia’s treatment of parton showering applied to the Vista 3j ∆R(j2, j3) discrepancy.
   We thank the Fermilab staff and the technical staffs of the participating institutions for their vital
contributions. This work was supported by the U.S. Department of Energy and National Science Foundation; the
Italian Istituto Nazionale di Fisica Nucleare; the Ministry of Education, Culture, Sports, Science and Technology
of Japan; the Natural Sciences and Engineering Research Council of Canada; the National Science Council of the
Republic of China; the Swiss National Science Foundation; the A.P. Sloan Foundation; the Bundesministerium
für Bildung und Forschung, Germany; the Korean Science and Engineering Foundation and the Korean Research
Foundation; the Science and Technology Facilities
\[
p(j \to e^+) = p(q \to \gamma)\,p(\gamma \to e^+) + p(q \to \pi^0)\,p(\pi^0 \to e^+)
             + p(q \to \pi^+)\,p(\pi^+ \to e^+) + p(q \to K^+)\,p(K^+ \to e^+).
\tag{A1}
\]
FIG. 12: A few of the most discrepant distributions in the final states ej and jµ, which are greatly affected by the fake
rates p(j → e) and p(j → µ), respectively. These distributions are among the 13 significantly discrepant distributions identified
as resulting from coarseness of the correction model employed. The vertical axis shows the number of events; the horizontal
axes show the transverse momentum and pseudorapidity of the lepton. Filled (black) circles show CDF data, and the shaded
(red) histogram shows the standard model prediction. Events enter the ej final state either on a central electron trigger with
pT > 25 GeV, or on a plug electron trigger with pT > 40 GeV. The fake rate p(j → e) is significantly larger in the plug region
than in the central region of the CDF detector. Muons are identified with separate detectors covering the regions |η| < 0.6 and
0.6 < |η| < 1.0.
   The probability for a jet to be misreconstructed as a tau lepton can be written
\[
p(j \to \tau^+) = p(j \to \tau_1^+) + p(j \to \tau_3^+),
\tag{A7}
\]
where p(j → τ1+) denotes the probability for a jet to fake a 1-prong tau, and p(j → τ3+) denotes the probability
for a jet to fake a 3-prong tau. For 1-prong taus,
\[
p(j \to \tau_1^+) = p(q \to \pi^+)\,p(\pi^+ \to \tau^+) + p(q \to K^+)\,p(K^+ \to \tau^+).
\tag{A8}
\]
Similar equations hold for negatively charged taus.
   Figure 14 shows the probability for a quark (or gluon) to fake a one-prong tau, as a function of transverse
momentum. Using fragmentation functions tuned on LEP 1 data, Pythia predicts the probability for a quark jet to
fake a one-prong tau to be roughly four times the probability for a gluon jet to fake a one-prong tau. This
difference in fragmentation is incorporated into Vista’s treatment of jets faking electrons, muons, taus, and
photons. The Vista correction model includes such correction factors as the probability for a jet with a parent
quark to fake an electron (0033 and 0034) and the probability for a jet with a parent quark to fake a muon (0035);
the probability for a jet with a parent gluon to fake an electron or muon is then obtained by dividing the values
of these fitted correction factors by four.
   The physical mechanism underlying the process whereby an incident photon or neutral pion is misreconstructed
as an electron is a conversion in the material
FIG. 13: A few of the most discrepant distributions in the final states jτ and jγ, which are greatly affected by the fake rates
p(j → τ ) and p(j → γ), respectively. The vertical axis shows the number of events; the horizontal axes show the transverse
momentum and pseudorapidity of the tau lepton and photon. Filled (black) circles show CDF data, and the shaded (red)
histogram shows the standard model prediction. The distributions in the jγ final state are among the 13 significantly discrepant
distributions identified as resulting from coarseness of the correction model employed.
serving as the support structure of the silicon vertex detector. This process produces exactly as
many e+ as e−, leading to

\[ \begin{aligned}
\tfrac{1}{2}\,p(\gamma \to e) &= p(\gamma \to e^{+}) = p(\gamma \to e^{-}) \\
\tfrac{1}{2}\,p(\pi^{0} \to e) &= p(\pi^{0} \to e^{+}) = p(\pi^{0} \to e^{-}),
\end{aligned} \tag{A9} \]

where e is an electron or positron.
   From Fig. 11, the average pT of electrons reconstructed from 25 GeV incident photons is
23.9 ± 1.4 GeV. The average pT of electrons reconstructed from incident 25 GeV neutral pions is
23.7 ± 1.3 GeV.
   The charge asymmetry between p(K+ → e+) and p(K− → e−) in Table IV arises because K− can capture
on a nucleon, producing a single hyperon. Conservation of baryon number and strangeness prevents K+
from capturing on a nucleon, reducing the K+ cross section relative to the K− cross section by
roughly a factor of two.
   The physical process primarily responsible for π± → e± is inelastic charge exchange

\[ \begin{aligned}
\pi^{-} p &\to \pi^{0} n \\
\pi^{+} n &\to \pi^{0} p
\end{aligned} \tag{A10} \]

occurring within the electromagnetic calorimeter. The charged pion leaves the “electron’s” track in
the CDF tracking chamber, and the π0 produces the “electron’s” electromagnetic shower. No true
electron appears at all in this process, except as secondaries in the electromagnetic shower
originating from the π0.
   The average pT of reconstructed “electrons” originating from a single charged pion is
18.8 ± 2.2 GeV, indicating that the misreconstructed “electron” in this case is measured to have on
average only 75% of the total energy of the parent quark or gluon. This is expected, since
and is related to previously defined quantities by

\[ p(\gamma \to \tau) = p(\gamma \to e)\,\frac{1}{p(e \to e)}\,p(e \to \tau), \tag{A18} \]

where p(γ → e) denotes the fraction of produced photons that are reconstructed as electrons,
p(e → e) denotes the fraction of produced electrons that are reconstructed as electrons, and hence
p(γ → e)/p(e → e) is the fraction of produced photons that pair produce a single leading electron.
   Note p(e → γ) ≈ p(γ → e) from Table IV, as expected, with a value of ≈ 0.03 determined by the
amount of material in the inner detectors and the tightness of isolation criteria. A hard
bremsstrahlung followed by a conversion is responsible for electrons being reconstructed with
opposite sign; hence

\[ p(e^{\pm} \to e^{\mp}) = p(e^{+} \to e^{-}) = p(e^{-} \to e^{+}) \approx \tfrac{1}{2}\,p(e^{\pm} \to \gamma)\,p(\gamma \to e^{\mp}), \tag{A19} \]

where the factor of 1/2 comes because the material already traversed by the e± will not be
traversed again by the γ. In particular, track curvature mismeasurement is not responsible for
erroneous sign determination in the central region of the CDF detector.
   From knowledge of the underlying physical mechanisms by which jets fake electrons, muons, taus,
and photons, the simple use of a reconstructed jet as a lepton or photon with an appropriate fake
rate applied to the weight of the event needs slight modification to correctly handle the fact that
a jet that has faked a lepton or photon generally is measured more accurately than a hadronic jet.
Rather than using the momentum of the reconstructed jet, the momentum of the parent quark or gluon
is determined by adding up all Monte Carlo particle level objects within a cone of ∆R = 0.4 about
the reconstructed jet. In misreconstructing a jet in an event, the momentum of the corresponding
parent quark or gluon is used rather than the momentum of the reconstructed jet. A jet that fakes a
photon then has momentum equal to the momentum of the parent quark or gluon plus a fractional
correction equal to 0.01 × (parent pT − 25 GeV)/(25 GeV) to account for leakage out of the cone of
∆R = 0.4, and a further smearing of 0.2 √GeV × √(parent pT), reflecting the electromagnetic
resolution of the CDF detector. The momenta of jets that fake photons are multiplied by an overall
factor of 1.12, and jets that fake electrons, muons, or taus are multiplied by an overall factor of
0.95. These numbers are determined by the ℓ /p, ℓj, and γj final states. The distributions most
sensitive to these numbers are the missing energy and the jet pT.
   A b quark fragmenting into a leading b hadron that then decays leptonically or semileptonically
results in an electron or muon that shares the pT of the parent b quark with the associated
neutrino. If all hadronic decay products are soft, the distribution of the momentum fraction carried
by the charged lepton can be obtained by considering the decay of a scalar to two massless fermions.
Isolated and energetic electrons and muons arising from parent b quarks in this way are modeled as
having pT equal to the parent b quark pT, multiplied by a random number uniformly distributed
between 0 and 1.

                 2.    Additional background sources

   This appendix provides additional details on the estimation of the standard model prediction.

                 a.   Cosmic ray and beam halo muons

   There are four dominant categories of events caused by cosmic ray muons penetrating the detector:
µ /p, µ+µ−, γ /p, and j /p. There is negligible contribution from cosmic ray secondaries of any
particle type other than muons.
   A cosmic ray muon penetrating the CDF detector whose trajectory passes within 1 mm of the beam
line and within −60 < z < 60 cm of the origin may be reconstructed as two outgoing muons. In this
case the cosmic ray event is partitioned into the final state µ+µ−. If one of the tracks is missed,
the cosmic ray event is partitioned into the final state µ /p. The standard CDF cosmic ray filter,
which makes use of drift time information in the central tracking chamber, is used to reduce these
two categories of cosmic ray events.
   CDF data events with exactly one track (corresponding to one muon) and events with exactly two
tracks (corresponding to two muons) are used to estimate the cosmic ray muon contribution to the
final states µ /p and µ+µ− after the cosmic ray filter. This sample of events is used as the
standard model background process cosmic µ. The cosmic µ sample does not contribute to the events
passing the analysis offline trigger, whose cleanup cuts require the presence of three or more
tracks. Roughly 100 events are expected from cosmic ray muons in the categories µ+ /p and µ+µ−. The
sample cosmic µ is omitted from the background estimate, since there is no discrepancy that demands
its inclusion.
   The remaining two categories are γ /p and j /p, resulting from a cosmic ray muon that penetrates
the CDF electromagnetic or hadronic calorimeter and undergoes a hard bremsstrahlung in one
calorimeter cell. Such an interaction can mimic a single photon or a single jet, respectively. The
reconstruction algorithm infers the presence of significant missing energy balancing the “photon”
or “jet.” If this cosmic ray interaction occurs during a bunch crossing in which there is a pp̄
interaction producing three or more tracks, the event will be partitioned into the final state γ /p
or j /p.
   CDF data events with fewer than three tracks are used to estimate the cosmic ray muon
contribution to the final states γ /p and j /p. These samples of events are used as standard model
background processes cosmic γ and cosmic j for the modeling of this background, corresponding to
offline triggers requiring a photon with
[Figure 15 panels: photon pT (GeV) and photon φ (radians) in the γ /p final state; jet pT (GeV) and jet φ (radians) in the j /p final state, panel label ΣpT > 400 GeV. Vertical axes: Number of Events. Legend, photon panels: CDF Run II Data; Other; Pythia W(→eν): 1.8%; MadEvent Z(→νν) γ: 2.3%; Pythia jj: 10.4%; Non-collision: 81.7%. Legend, jet panels: CDF Run II Data; Other; MadEvent W(→µν) j: 0.7%; MadEvent Z(→νν) j: 0.8%; Pythia jj: 8.9%; Non-collision: 88.5%.]
FIG. 15: The distribution of transverse momentum and azimuthal angle for photons and jets in the γ /p and j /p final states,
dominated by cosmic ray and beam halo muons. The vertical axis shows the number of events in each bin. Data are shown
as filled (black) circles; the standard model prediction is shown as the shaded (red) histogram. Here the “standard model”
prediction includes contributions from cosmic ray and beam halo muons, estimated using events containing fewer than three
reconstructed tracks. The contribution from cosmic ray muons is flat in φ, while the contribution from beam halo is localized
to φ = 0. The only degrees of freedom for the background to these final states are the cosmic γ and cosmic j correction factors,
whose values are determined from the global Vista fit and provided in Table I.
basic improvement to our understanding of this physics.

                       3.   Global fit

   This section describes the construction of the global χ² used in the Vista global fit.

                            a.   χ²_k

   The bins in the CDF high-pT data sample are labeled by the index k = (k1, k2), where each value
of k1 represents a phrase such as “this bin contains events with three objects: one with
17 < pT < 25 GeV and |η| < 0.6, one with 40 < pT < 60 GeV and 0.6 < |η| < 1.0, and one with
25 < pT < 40 GeV and 1.0 < |η|,” and each value of k2 represents a phrase such as “this bin contains
events with three objects: an electron, muon, and jet, respectively.” The reason for splitting k
into k1 and k2 is that a jet can fake an electron (mixing the contents of k2), but an object with
|η| < 0.6 cannot fake an object with 0.6 < |η| < 1.0 (no mixing of k1). The term corresponding to
the kth bin takes the form of Eq. 1, where Data[k] is the number of data events observed in the kth
bin, SM[k] is the number of events predicted by the standard model in the kth bin, δSM[k] is the
Monte Carlo statistical uncertainty on the standard model prediction in the kth bin, and √SM[k] is
the statistical uncertainty on the prediction in the kth bin. To legitimize the use of Gaussian
errors, only bins containing eight or more data events are considered. The standard model
prediction SM[k] for the kth bin can be written in terms of the introduced correction factors as

\[ \begin{aligned}
\mathrm{SM}[k] = \mathrm{SM}[(k_1,k_2)] = \sum_{k_2' \in \mathrm{objectLists}} \; \sum_{l \in \mathrm{processes}}
 & \left(\int \mathcal{L}\,dt\right) \cdot (\mathrm{kFactor}[l]) \cdot (\mathrm{SM}_0[(k_1,k_2')][l]) \\
 & \cdot (\mathrm{probabilityToBeSoMisreconstructed}[(k_1,k_2')][k_2]) \\
 & \cdot (\mathrm{probabilityPassesTrigger}[(k_1,k_2)]),
\end{aligned} \tag{A25} \]

where SM[k] is the standard model prediction for the kth bin; the index k is the Cartesian product
of the two indices k1 and k2 introduced above, labeling the regions of the detector in which there
are energy clusters and the identified objects corresponding to those clusters, respectively; the
index k2′ is a dummy summation index; the index l labels standard model background processes, such
as dijet production or W+1 jet production; SM0[(k1, k2′)][l] is the initial number of standard model
events predicted in bin (k1, k2′) from the process labeled by the index l;
probabilityToBeSoMisreconstructed[(k1, k2′)][k2] is the probability that an event produced with
energy clusters in the detector regions labeled by k1 that are identified as objects labeled by k2′
would be mistaken as having objects labeled by k2; and probabilityPassesTrigger[(k1, k2)] represents
the probability that an event produced with energy clusters in the detector regions labeled by k1
that are identified as objects labeled by k2 would pass the trigger.
   The quantity SM0[(k1, k2′)][l] is obtained by generating some number nl (say 10⁴) of Monte Carlo
events corresponding to the process l. The event generator provides a cross section σl for this
process l. The weight of each of these Monte Carlo events is equal to σl/nl. Passing these events
through the CDF simulation and reconstruction, the sum of the weights of these events falling into
the bin (k1, k2′) is SM0[(k1, k2′)][l].

                            b.   χ²_constraints

   The term χ²_constraints(s⃗) in Eq. 2 reflects constraints on the values of the correction factors
determined by data other than those in the global high-pT sample. These constraints include
k-factors taken from theoretical calculations and numbers from the CDF literature when use is made
of CDF data external to the Vista high-pT sample. The constraints imposed are:
     • The luminosity (0001) is constrained to be within 6% of the value measured by the CDF
       Čerenkov luminosity counters.
     • The fake rate p(q → γ) (0039) is constrained to be 2.6 × 10⁻⁴ ± 1.5 × 10⁻⁵, from the single
       particle gun study of Appendix A 1.
     • The fake rate p(e → γ) (0032) plus the efficiency p(e → e) (0026) for electrons in the plug
       is constrained to be within 1% of unity.
     • Noting p(q → γ) corresponds to correction factor 0039, p(q → π±) = 2 p(q → π0), and
       p(q → π0) = p(q → γ)/p(π0 → γ), and taking p(π0 → γ) = 0.6 and p(π± → τ) = 0.415 from the
       single particle gun study of Appendix A 1, the fake rate p(q → τ) (0038) is constrained to
       p(q → τ) = p(q → π±) p(π± → τ) ± 10%.
     • The k-factors for dijet production (0018 and 0019) are constrained to 1.10 ± 0.05 and
       1.33 ± 0.05 in the kinematic regions p̂T < 150 GeV and p̂T > 150 GeV, respectively, where p̂T
       is the transverse momentum of the scattered partons in the 2 → 2 process in the colliding
       parton center of momentum frame.
     • The inclusive k-factor for γ + N jets (0004–0007) is constrained to 1.25 ± 0.15 [43, 44].
     • The inclusive k-factor for γγ + N jets (0008–0010) is constrained to 2.0 ± 0.15 [45].
     • The inclusive k-factors for W and Z production (0011–0014 and 0015–0017) are subject to a
       2-dimensional Gaussian constraint, with mean at the
       NNLO/LO theoretical values [46], and a covariance matrix that encapsulates the highly
       correlated theoretical uncertainties, as discussed in Appendix A 4.
     • Trigger efficiency correction factors are constrained to be less than unity.
     • All correction factors are constrained to be positive.

                         c.   Covariance matrix

   This section describes the correction factor covariance matrix Σ. The inverse of the covariance
matrix is obtained from

\[ \Sigma^{-1}_{ij} = \frac{1}{2} \left. \frac{\partial^{2}\chi^{2}(\vec{s})}{\partial s_i\,\partial s_j} \right|_{\vec{s}_0}, \tag{A26} \]

where χ²(s⃗) is defined by Eq. 2 as a function of the correction factor vector s⃗, vector elements
si and sj are the ith and jth correction factors, and s⃗0 is the vector of correction factors that
minimizes χ²(s⃗). Numerical estimation of the right hand side of Eq. A26 is achieved by calculating
χ² at s⃗0 and at positions slightly displaced from s⃗0 in the direction of the ith and jth
correction factors, denoted by the unit vectors î and ĵ. Approximating the second partial derivative

model prediction to data. The correction factors considered are numbers that can in principle be
calculated a priori, but whose calculation is in practice not feasible. These correction factors
divide naturally into two classes, the first of which reflects the difficulty of calculating the
standard model prediction to all orders, and the second of which reflects the difficulty of
understanding from first principles the response of the experimental apparatus.
   The theoretical correction factors considered are of two types. The difficulty of calculating
the standard model prediction for many processes to all orders in perturbation theory is handled
through the introduction of k-factors, representing the ratio of the true all orders prediction to
the prediction at lowest order in perturbation theory. Uncertainties in the distribution of partons
inside the colliding proton and anti-proton as a function of parton momentum are in principle
handled through the introduction of correction factors associated with parton distribution
functions, but there are currently no discrepancies to motivate this.
   Experimental correction factors correspond to numbers describing the response of the CDF
detector that are precisely calculable in principle, but that are in practice best constrained by
the high-pT data themselves. These correction factors take the form of the integrated luminosity,
object identification efficiencies, object misidentification probabilities, trigger efficiencies,
and energy scales.
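The numerical estimate of Eq. A26 can be illustrated with a minimal sketch. The central-difference stencil, the step size `eps`, and the quadratic `toy_chi2` below are assumptions chosen for illustration; they are not taken from the Vista implementation, whose exact stencil is not reproduced here.

```python
def inverse_covariance(chi2, s0, eps=1e-3):
    """Estimate Sigma^{-1}_{ij} = (1/2) d^2 chi2 / ds_i ds_j at s0 (Eq. A26).

    Uses a generic central-difference stencil (an assumption, not
    necessarily the stencil used in the analysis).
    """
    n = len(s0)

    def shifted(i, a, j, b):
        # chi2 evaluated at s0 displaced by a*eps along i and b*eps along j
        s = list(s0)
        s[i] += a * eps
        s[j] += b * eps
        return chi2(s)

    inv_cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            second = (shifted(i, +1, j, +1) - shifted(i, +1, j, -1)
                      - shifted(i, -1, j, +1) + shifted(i, -1, j, -1)) / (4 * eps**2)
            inv_cov[i][j] = 0.5 * second
    return inv_cov


def toy_chi2(s):
    # Hypothetical chi2 with minimum at (1.0, 1.2); its inverse covariance
    # matrix is [[30, -4], [-4, 12]] by construction.
    d0, d1 = s[0] - 1.0, s[1] - 1.2
    return 30.0 * d0 * d0 + 2 * (-4.0) * d0 * d1 + 12.0 * d1 * d1


inv_cov = inverse_covariance(toy_chi2, [1.0, 1.2])
```

For a quadratic χ² the stencil is exact up to rounding, so `inv_cov` recovers the coefficients built into `toy_chi2`; inverting it gives the covariance matrix Σ.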
TABLE V: Correction factor correlation matrix. The top row and left column show correction factor codes. Each element of the matrix shows the correlation between
the correction factors corresponding to the column and row. Each matrix element is dimensionless; the elements along the diagonal are unity; the matrix is symmetric;
positive elements indicate positive correlation, and negative elements anti-correlation.
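As a small illustration of the properties listed in the caption, the sketch below converts a covariance matrix into a correlation matrix using the standard definition ρ_ij = Σ_ij / √(Σ_ii Σ_jj). The function name and the 2 × 2 example numbers are hypothetical and are not values from Table V.

```python
import math


def correlation_matrix(cov):
    """Return rho_ij = cov_ij / sqrt(cov_ii * cov_jj) for a square matrix."""
    n = len(cov)
    sigma = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sigma[i] * sigma[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical covariance matrix for two correction factors.
example_cov = [[0.04, 0.01],
               [0.01, 0.09]]
corr = correlation_matrix(example_cov)
```

The result is dimensionless, has unit diagonal, and is symmetric whenever the input covariance matrix is symmetric.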
[Figure panels: jet pT distribution, showing FULL and LO curves versus Pjet (GeV/c); K-factor as a function of jet pT.]
   0041. The central electron trigger inefficiency is dom-         This section expands on a subtle point in the definition
inated by not correctly reconstructing the electron’s track     of the Sleuth algorithm: for purely practical considera-
at the first online trigger level.                              tions, only final states in which three or more events are
                                                                observed in the data are considered.
   0042. The plug electron trigger inefficiency is due to
                                                                   Suppose Pe+ e− bb̄ = 10−6 ; then in computing P̃ all final
inefficiencies in clustering at the second online trigger
                                                                states with b > 10−6 must be considered and accounted
level.
                                                                for. (A final state with b = 10−7 , on the other hand,
   0043, 0044. The muon trigger inefficiencies in the           counts as only ≈ 0.1 final states, since the fraction of
regions |η| < 0.6 and 0.6 < |η| < 1.0 derive partly from        hypothetical similar experiments in which P < 10−6 in
tracking inefficiency, and partly from an inefficiency in       this final state is equal to the fraction of hypothetical
reconstructing muon stubs in the CDF muon chambers.             similar experiments in which one or more events is seen
                                                                in this final state, which is 10−7 .) This is a large practical
The value of these corrections factors are consis-              problem, since it requires that all final states with b >
tent with other trigger efficiency measurements made            10−6 be enumerated and estimated, and it is difficult to
using additional information [50].                              do this believably.
                                                                   To solve this problem, let Sleuth consider only final
                                                                states with at least dmin events observed in the data. The
                                                                goal is to be able to find P̃ < 10−3 . There will be some
                       e.   Energy scales

   The Vista infrastructure also allows the jet energy scale to be treated as a correction factor. At present this correction factor is not used, since there is no discrepancy requiring it.
   To understand the effect of introducing such a correction factor, a jet energy scale correction factor is added and constrained to 1 ± 0.03, reflecting the jet energy scale determination at CDF [13]. The fit returns a value with a very small error, since this correction factor is highly constrained by the low-pT 2j, 3j, e j, and e 2j final states. Assuming perfectly correct modeling of jets faking electrons, as described in Appendix A 1, this is a correct energy scale error. The inclusion of additional correction factor degrees of freedom to reflect possible imperfections in this modeling of jets faking electrons increases the energy scale error. The interesting conclusion is that the jet energy scale (considered as a lone free parameter) is very well constrained by the large number of dijet events; adjustment to the jet energy scale must be accompanied by simultaneous adjustment of other correction factors (such as the dijet k-factor) in order to retain agreement with data.

number $N_{fs}(b_{min})$ of final states with expected number of events $b > b_{min}$, writing $N_{fs}$ explicitly as a function of $b_{min}$; thus $b_{min}$ must be chosen to be sufficiently large that all of these $N_{fs}(b_{min})$ final states can be enumerated and estimated. The time cost of simulating events is such that the integrated luminosity of Monte Carlo events is at most 100 times the integrated luminosity of the data; this practical constraint restricts $b_{min} > 0.01$. The number of Sleuth Tevatron Run II final states with $b > 0.01$ is $N_{fs}(b_{min} = 0.01) \approx 10^{3}$.
   For small $P_{min}$, keeping the first term in a binomial expansion yields $\tilde{P} = P_{min} N_{fs}(b_{min})$, where $P_{min}$ is the smallest P found in any final state. From the discussion above, the computation of $\tilde{P}$ from $P_{min}$ can only be justified if $P_{min} > (b_{min})^{d_{min}}$; if otherwise, final states with $b < b_{min}$ will need to be accounted for. Thus $\tilde{P}$ can be confidently computed only if $\tilde{P} > (b_{min})^{d_{min}} N_{fs}(b_{min})$.
   Solving this inequality for $d_{min}$ and inserting values from above,

\[
d_{min} \geq \frac{\log_{10}\tilde{P} - \log_{10} N_{fs}(b_{min})}{\log_{10} b_{min}} \approx \frac{-3 - 3}{-2} = 3. \tag{B1}
\]

A believable trials factor can be computed if $d_{min} \geq 3$.
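The arithmetic behind the trials factor and Eq. (B1) can be checked with a short sketch. The function names are hypothetical and not part of Sleuth. The unexpanded form 1 − (1 − Pmin)^Nfs is an assumption (it treats the final states as independent) that is consistent with the first-order binomial term quoted in the text. The value P̃ = 10^-3 is read off from the numerator of Eq. (B1).

```python
import math


def trials_corrected_p(p_min, n_fs):
    """Return (unexpanded, first_order) estimates of P-tilde.

    The unexpanded form 1 - (1 - p_min)**n_fs is an assumption; the
    first-order term p_min * n_fs is the expression quoted in the text.
    """
    unexpanded = 1.0 - (1.0 - p_min) ** n_fs
    first_order = p_min * n_fs
    return unexpanded, first_order


def min_dmin(p_tilde, n_fs, b_min):
    """Right-hand side of Eq. (B1): the smallest d_min for which
    p_tilde > b_min**d_min * n_fs still holds (log10(b_min) < 0 flips
    the inequality when dividing)."""
    return (math.log10(p_tilde) - math.log10(n_fs)) / math.log10(b_min)


if __name__ == "__main__":
    # Values taken from the text: P-tilde = 1e-3, N_fs = 1e3, b_min = 1e-2.
    print(min_dmin(1e-3, 1e3, 1e-2))
    # For small P_min the two estimates agree closely.
    print(trials_corrected_p(1e-6, 1000))
```

With the values from the text, `min_dmin` evaluates to approximately 3, matching Eq. (B1).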
             Sleuth final state   Vista final states
             ------------------   ------------------
             ℓ+ ℓ− 2j /p          e+ e− 2j /p, e+ e− 3j /p, µ+ µ− 2j /p, µ+ µ− 3j /p
             ℓ+ ℓ− τ+ 2j /p       e+ e− τ+ 2j
             ℓ+ ℓ− ℓ′ γ /p        e+ µ+ µ− γj
             ℓ+ ℓ− ℓ′ /p          e+ µ+ µ−, e+ e− µ+, e+ e− µ+ /p
             ℓ+ ℓ− γ              e+ e− γ, µ+ µ− γ, e+ e− γj, µ+ µ− γj
             ℓ+ ℓ− γ /p           e+ e− γj /p, e+ e− γ /p
             ℓ+ ℓ− γ τ+ /p        e+ e− τ+ γ
             ℓ+ ℓ− /p             e+ e− /p, e+ e− j /p, µ+ µ− /p, µ+ µ− j /p, e+ e− b /p
             ℓ+ ℓ− /p τ+          e+ e− τ+, e+ e− τ+ j, µ+ µ− τ+
             ℓ+ ℓ− 4j             e+ e− 4j, µ+ µ− 4j
             ℓ+ ℓ− 4j /p          e+ e− 4j /p
             ℓ+ ℓ− τ+ 4j /p       e+ e− τ+ 4j
             ℓ+ ℓ′+ jj            e+ µ+ 3j
             ℓ+ ℓ′+ /p jj         e+ µ+ 2j /p
             ℓ+ ℓ′− jj            e+ µ− 2j
             ℓ+ ℓ′− /p jj         e+ µ− 3j /p, e+ µ− 2j /p
             W γjj                µ+ γ2j /p, e+ γ2j /p, µ+ γ3j /p, e+ γ3j /p
             W jj                 e+ 2j /p, µ+ 2j /p, e+ 3j /p, µ+ 3j /p
             ℓ+ τ+ /p jj          µ+ τ+ 3j /p
             ℓ+ τ− /p jj          e+ τ− 2j /p, e+ τ− 3j /p, µ+ τ− 3j /p, µ+ τ− 2j /p
             τ /p                 τ+ j /p, τ+ b /p
             bb̄bb̄                 3bj
             W bb̄bb̄               e+ 3bj /p
             ℓ+ ℓ+                2e+, 2e+ j, 2µ+
             ℓ+ ℓ− ℓ+ jj /p       2e+ e− 2j, 2e+ e− 3j
             ℓ+ ℓ− ℓ+ /p          2e+ e−, 2e+ e− j, 2e+ e− /p
             ℓ+ ℓ+ jj             2e+ 2j
             ℓ+ ℓ+ ℓ′− /p         e+ 2µ− /p
             ℓ+ ℓ+ γ              2e+ γ
             ℓ+ ℓ+ γ /p           2e+ γ /p
             ℓ+ ℓ+ /p             2e+ /p, 2e+ j /p
             ℓ+ ℓ+ 4j             2e+ 4j
             4j                   4j
             γ4j                  γ4j
             γ4j /p               γ4j /p
             4j /p                4j /p
             τ+ /p 4j             τ+ 4j /p
             γγ4j                 2γ4j
             γγ                   2γ, 2γj, 2γb
             γγ /p                2γ /p, 2γj /p
             3γ                   3γ, 3γj
TABLE VI: Correspondence between Sleuth and Vista final states. The first column shows the Sleuth final state formed
by merging the populated Vista final states in the second column. Charge conjugates of each Vista final state are implied.
   At the other end of the scale, computational strength limits the maximum number of events Sleuth is able to consider to $\lesssim 10^{4}$. Excesses in which the number of events exceeds $10^{4}$ are expected to be identified by Vista’s normalization statistic.
   For each final state, pseudo experiments are run until P is determined to within a fractional precision of 5% or a time limit is exceeded. If the time limit is exceeded before P is determined to within the desired fractional precision of 5%, Sleuth returns an upper bound on P, and indicates explicitly that only an upper bound has been determined. For the data described in this article, the desired precision is obtained.
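The stopping rule for pseudo experiments can be illustrated with a minimal sketch. It is not Sleuth's implementation. The names `estimate_p`, `max_trials`, and `min_trials` are hypothetical, a fixed trial budget stands in for the wall-clock time limit so that the example is deterministic, the fractional error is the simple binomial estimate, and the upper bound returned when the budget runs out is an illustrative placeholder rather than the bound Sleuth actually reports.

```python
import math
import random


def estimate_p(pseudo_experiment, target_precision=0.05,
               max_trials=100_000, min_trials=100):
    """Estimate P as the fraction of pseudo experiments that return True.

    Returns (value, is_upper_bound). `max_trials` is an assumed stand-in
    for the time limit described in the text.
    """
    hits = 0
    for n in range(1, max_trials + 1):
        if pseudo_experiment():
            hits += 1
        if n >= min_trials and hits > 0:
            p_hat = hits / n
            # Binomial fractional error: sqrt(p(1-p)/n) / p = sqrt((1-p)/hits).
            frac_err = math.sqrt((1.0 - p_hat) / hits)
            if frac_err <= target_precision:
                return p_hat, False
    # Budget exhausted: report an illustrative upper bound (assumption).
    upper = (hits + 1 + 2.0 * math.sqrt(hits + 1)) / max_trials
    return min(1.0, upper), True


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy pseudo experiment that is "more interesting than data" 30% of the time.
    print(estimate_p(lambda: rng.random() < 0.3))
    # A pseudo experiment that never fires only yields an upper bound.
    print(estimate_p(lambda: False, max_trials=1000))
```

Reaching 5% fractional precision requires roughly 400(1 − P) successful pseudo experiments, so small values of P need many trials, which is why a time limit and an upper-bound fallback are useful.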
 [1] G. Arnison et al. (UA1 Collaboration), Phys. Lett. B122, 103 (1983).
 [2] M. Banner et al. (UA2 Collaboration), Phys. Lett. B122, 476 (1983).
     strum. Meth. A290, 469 (1990).
[27] D. Acosta et al., Nucl. Instrum. Meth. A494, 57 (2002).
[28] F. Abe et al. (CDF Collaboration), Phys. Rev. D54, 4221 (1996).
[29] F. Abe et al. (CDF Collaboration), Phys. Rev. D56, 2532 (1997).
[30] S. Abachi et al. (DØ Collaboration), Phys. Lett. B357, 500 (1995).
[31] J. Alwall et al. (2007), arXiv:0706.2569 [hep-ph].
[32] B. Knuteson, Ph.D. thesis, University of California, Berkeley (2000).
[33] B. Abbott et al. (DØ Collaboration), Phys. Rev. D62, 092004 (2000).
[34] V. Abazov et al. (DØ Collaboration), Phys. Rev. D64, 012004 (2001).
[35] B. Abbott et al. (DØ Collaboration), Phys. Rev. Lett. 86, 3712 (2001).
[36] A. Aktas et al. (H1 Collaboration), Phys. Lett. B602, 14
     hadronic energy clusters in the event, and the hadronic energy resolution of the CDF detector has been approximated as $100\%\,\sqrt{p_{T i}}$, expressed in GeV. An event is said to contain missing transverse momentum if /pT > 17 GeV and /pT′ > 10 GeV.
[54] Given a set of Monte Carlo events with individual weights $w_j$, so that the total standard model prediction from these Monte Carlo events is $SM = \sum_j w_j$ events, the “effective weight” $w_{eff}$ of these events can be taken to be the weighted average of the weights: $w_{eff} = \sum_j w_j w_j / \sum_j w_j$. The “effective number of Monte Carlo events” is $N_{eff} = SM / w_{eff}$, and the error on the standard model prediction is $\delta SM = SM / \sqrt{N_{eff}}$.
[55] Final states for which p > 0.5 after accounting for the trials factor are not even mildly interesting, and the corresponding σ after accounting for the trials factor is not quoted. For the mildly interesting final states with p < 0.5 after accounting for the trials factor, σ is quoted as positive if the number of observed data events exceeds the standard model prediction, and negative if the number of observed data events is less than the standard model prediction.
[56] The electron and muon efficiencies shown in this table are different from the correction factors 0025 and 0027 in Table I, which show the ratio of the object efficiencies in the data to the object identification efficiencies in CdfSim.
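The effective-weight bookkeeping of note [54] can be written out as a short sketch. The function name `effective_mc_stats` and the example weights are hypothetical; only the three formulas come from the note.

```python
import math


def effective_mc_stats(weights):
    """Return (SM, w_eff, N_eff, delta_SM) for a list of Monte Carlo weights.

    SM       = sum_j w_j
    w_eff    = sum_j w_j * w_j / sum_j w_j   (weighted average of the weights)
    N_eff    = SM / w_eff
    delta_SM = SM / sqrt(N_eff)
    """
    sm = sum(weights)
    w_eff = sum(w * w for w in weights) / sm
    n_eff = sm / w_eff
    delta_sm = sm / math.sqrt(n_eff)
    return sm, w_eff, n_eff, delta_sm


if __name__ == "__main__":
    # Hypothetical sample: eight Monte Carlo events, each with weight 0.5.
    print(effective_mc_stats([0.5] * 8))
```

Substituting the definitions shows that δSM equals the square root of the sum of squared weights, so a few heavily weighted events dominate the uncertainty on the prediction.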