The Logic of Explanatory Power

Source: Philosophy of Science, Vol. 78, No. 1 (January 2011), pp. 105–127

Published by: The University of Chicago Press on behalf of the Philosophy of Science Association

Stable URL: https://www.jstor.org/stable/10.1086/658111


Philosophy of Science, 78 (January 2011) pp. 105–127. 0031-8248/2011/7801-0006$10.00 Copyright 2011 by the Philosophy of Science Association. All rights reserved.

Jonah N. Schupbach and Jan Sprenger†‡

This article introduces and defends a probabilistic measure of the explanatory power that a particular explanans has over its explanandum. To this end, we propose several intuitive, formal conditions of adequacy for an account of explanatory power. Then, we show that these conditions are uniquely satisfied by one particular probabilistic function. We proceed to strengthen the case for this measure of explanatory power by proving several theorems, all of which show that this measure neatly corresponds to our explanatory intuitions. Finally, we briefly describe some promising future projects inspired by our account.

1. Explanation and Explanatory Power. Since the publication of Hempel and Oppenheim’s (1948) classic investigation into “the logic of explanation,” philosophers of science have earnestly been seeking an analysis of the nature of explanation. Necessity (Glymour 1980), statistical relevance (Salmon 1971), inference and reason (Hempel and Oppenheim 1948; Hempel 1965), familiarity (Friedman 1974), unification (Friedman 1974; Kitcher 1989), causation (Salmon 1984; Woodward 2003), and mechanism (Machamer, Darden, and Craver 2000) are only some of the most popular concepts that such philosophers draw on in the attempt to describe necessary and sufficient conditions under which a theory explains some proposition.1

* Received April 2010; revised June 2010.

†To contact the authors, please write to: Jonah N. Schupbach, Department of History and Philosophy of Science, University of Pittsburgh, 1017 Cathedral of Learning, Pittsburgh, PA 15260; e-mail: jonah@schupbach.org.

‡For helpful correspondence on earlier versions of this article, we would like to thank David Atkinson, Vincenzo Crupi, John Earman, Theo Kuipers, Edouard Machery, John Norton, Jeanne Peijnenburg, Jan-Willem Romeijn, Tomoji Shogenji, audiences at PROGIC 09 (Groningen), the ESF Workshop (Woudschouten), and FEW 2010 (Konstanz), and especially Stephan Hartmann. Jan Sprenger would like to thank the Netherlands Organization for Scientific Research, which supported his work on this article with Veni grant 016.104.079. Jonah N. Schupbach would like to thank Tilburg University’s Center for Logic and Philosophy of Science, which supported him with a research fellowship during the time that he worked on this article.

A related project that is, on the other hand, much less often pursued by philosophers today is the attempt to analyze the strength of an explanation—that is, the degree of explanatory power that a particular explanans has over its explanandum. Such an analysis would clarify the conditions under which hypotheses are judged to provide strong versus weak explanations of some proposition, and it would also clarify the meaning of comparative explanatory judgments such as “hypothesis A provides a better explanation of this fact than does hypothesis B.”

Given the nature of these two projects, the fact that the first receives so much more philosophical attention than the second can hardly be explained by appeal to any substantial difference in their relative philosophical imports. Certainly, the first project has great philosophical significance; after all, humans on the individual and social levels are constantly seeking and formulating explanations. Given the ubiquity of explanation in human cognition and action, it is both surprising that this concept turns out to be so analytically impenetrable and critical that philosophers continue to strive for an understanding of this notion.2 The second project is, however, also immensely philosophically important. Humans regularly make judgments of explanatory power and then use these judgments to develop preferences for hypotheses or even to infer outright to the truth of certain hypotheses. Much of human reasoning—again, on individual and social levels—makes use of judgments of explanatory power. Ultimately then, in order to understand and evaluate human reasoning generally, philosophers need to come to a better understanding of explanatory power.

The relative imbalance in the amount of philosophical attention that these two projects receive is more likely due to the prima facie plausible but ultimately unfounded assumption that one must have an analysis of explanation before seeking an analysis of explanatory power. This assumption is made compelling by the fact that in order to analyze the strength of something, one must have some clarity about what that thing is. However, it is shown to be much less tenable in light of the fact that humans do generally have some fairly clear intuitions concerning explanation. The fact that there is no consensus among philosophers today over the precise necessary and sufficient conditions for explanation does not imply that humans do not generally have a firm semantic grasp on the concept of explanation. Just how firm a semantic grasp on this concept humans actually have is an interesting question.

1. See Woodward (2009) for a recent survey of this literature.

2. Lipton (2004, 23) refers to this fact that humans can be so good at doing explanation while simultaneously being so bad at describing what it is they are doing as the “gap between doing and describing.”

One claim of this article will be that our grasp on the notion of explanation is at least sufficiently strong to ground a precise formal analysis of explanatory power, even if it is not strong enough to determine a general account of the nature of explanation.

This article attempts to bring more attention to the second project above by formulating a Bayesian analysis of explanatory power. Moreover, the account given here does this without committing to any particular theory of the nature of explanation. Instead of assuming the correctness of a theory of explanation and then attempting to build a measure of explanatory power derivatively from this theory, we begin by laying out several more primitive adequacy conditions that, we argue, an analysis of explanatory power should satisfy. We then show that these intuitive adequacy conditions are sufficient to define for us a unique probabilistic analysis and measure of explanatory power.

Before proceeding, two more important clarifications are necessary. First, we take no position on whether our analysis captures the notion of explanatory power generally; it is consistent with our account that there be other concepts that go by this name but which do not fit our measure.3 What we do claim, however, is that our account captures at least one familiar and epistemically compelling sense of explanatory power that is common to human reasoning.

Second, because our explicandum is the strength or power of an explanation, we restrict ourselves in presenting our conditions of adequacy to speaking of theories that do in fact provide explanations of the explanandum in question.4 This account thus is not intended to reveal the conditions under which a theory is explanatory of some proposition (that is, after all, the aim of an account of explanation rather than an account of explanatory power); rather, its goal is to reveal, for any theory already known to provide such an explanation, just how strong that explanation is. Ultimately then, this article offers a probabilistic logic of explanation that tells us the explanatory power of a theory (explanans) relative to some proposition (explanandum), given that that theory constitutes an explanation of that proposition.

3. As a possible example, Jeffrey (1969) and Salmon (1971) both argue that there is a sense in which a hypothesis may be said to have positive explanatory power over some explanandum so long as that hypothesis and explanandum are statistically relevant, regardless of whether they are negatively or positively statistically relevant. As will become clear in this article, insofar as there truly is such a notion of explanatory power, it must be distinct from the one that we have in mind.

4. To be more precise, the theory only needs to provide a potential explanation of the explanandum, where a theory offers a potential explanation of some explanandum just in case, if it were true, then it would be an actual explanation of that explanandum. In other words, this account may be used to measure the strength of any potential explanation, regardless of whether the explanans involved is actually true.

In this way, this article forestalls the objection that two statements may stand in the probabilistic relation described while not simultaneously constituting an explanation.

2. The Measure of Explanatory Power E. The sense of explanatory power that this article seeks to analyze has to do with a hypothesis’s ability to decrease the degree to which we find the explanandum surprising (i.e., its ability to increase the degree to which we expect the explanandum). More specifically, a hypothesis offers a powerful explanation of a proposition, in this sense, to the extent that it makes that proposition less surprising. This sense of explanatory power dominates statistical reasoning in which scientists are “explaining away” surprise in the data by means of assuming a specific statistical model (e.g., in the omnipresent linear regression procedures). But the explaining hypotheses need not be probabilistic; for example, a geologist will accept a prehistoric earthquake as explanatory of certain observed deformations in layers of bedrock to the extent that deformations of that particular character, in that particular layer of bedrock, and so on, would be less surprising given the occurrence of such an earthquake.

This notion finds precedence in many classic discussions of explanation. Perhaps its clearest historical expression occurs when Peirce (1931–35, 5.189) identifies the explanatoriness of a hypothesis with its ability to render an otherwise “surprising fact” as “a matter of course.”5

This sense of explanatory power may also be seen as underlying many of the most popular accounts of explanation. Most obviously, Deductive-Nomological and Inductive-Statistical accounts (Hempel 1965) and necessity accounts (Glymour 1980) explicitly analyze explanation in such a way that a theory that is judged to be explanatory of some explanandum will necessarily increase the degree to which we expect that explanandum.

Our formal analysis of this concept proceeds in two stages: in the first stage, a parsimonious set of adequacy conditions is used to determine a measure of explanatory power up to ordinal equivalence.

5. This quote might suggest that explanation is tied essentially to necessity for Peirce. However, elsewhere, Peirce clarifies and weakens this criterion: “to explain a fact is to show that it is a necessary or, at least, a probable result from another fact, known or supposed” (1935, 6.606; emphasis mine). See also Peirce (1958, 7.220). There are two senses in which our notion of explanatory power is more general than Peirce’s notion of explanatoriness: first, a hypothesis may provide a powerful explanation of a surprising proposition, in our sense, and still not render it a matter of course; that is, a hypothesis may make a proposition much less surprising while still not making it unsurprising. Second, our sense of explanatory power does not suggest that a proposition must be surprising in order to be explained; a hypothesis may make a proposition much less surprising (or more expected), even if the latter is not very surprising to begin with.

In other words, we show that, for all pairs of functions f and f′ that satisfy these adequacy conditions, f(e, h) > (=, <) f(e′, h′) if and only if f′(e, h) > (=, <) f′(e′, h′); all such measures thus impose the same ordinal relations on judgments of explanatory power. This is already a substantial achievement. In the second stage, we introduce more adequacy conditions in order to determine a unique measure of explanatory power (from among the class of ordinally equivalent measures).

In the remainder of the article, we make the assumption that the probability distribution is regular (i.e., only tautologies and contradictions are awarded rational degrees of belief of 1 and 0). This is not strictly required to derive the results below, but it makes the calculations and motivations much more elegant.

2.1. Uniqueness Up to Ordinal Equivalence. The first adequacy condition is, we suggest, rather uncontentious. It is a purely formal condition intended to specify the probabilistic nature and limits of our explication (which we denote E):

CA1 (Formal Structure): For any probability space and regular probability measure (Ω, A, Pr(·)), E is a measurable function from two propositions e, h ∈ A to a real number E(e, h) ∈ [−1, 1]. This function is defined on all pairs of contingent propositions (i.e., cases such as Pr(e) = 0, etc., are not in the domain of E).6 This implies by Bayes’s Theorem that we can represent E as a function of Pr(e), Pr(h|e), and Pr(h|¬e), and we demand that any such function be analytic.7

The next adequacy condition specifies, in probabilistic terms, the general notion of explanatory power that we are interested in analyzing. As mentioned, a hypothesis offers a powerful explanation of a proposition, in the sense that we have in mind, to the extent that it makes that proposition less surprising. In order to state this probabilistically, the key interpretive move is to formalize a decrease in surprise (or increase in expectedness) as an increase in probability. This move may seem dubious, depending on one’s interpretation of probability. Given a physical interpretation (e.g., a relative frequency or propensity interpretation), it would be difficult indeed to saddle such a psychological concept as surprise with a probabilistic account.

6. The background knowledge term k always belongs to the right of the solidus “|” in Bayesian formalizations. Nonetheless, here and in the remainder of this article, we choose for the sake of transparency and simplicity in exposition to leave k implicit in all formalizations.

7. A real-valued function f is analytic if we can represent it as the Taylor expansion around a point in its domain. This requirement ensures that our measure will not be composed in an arbitrary or ad hoc way.

However, when probabilities are themselves given a more psychological interpretation (whether in terms of simple degrees of belief or the more normative degrees of rational belief), this move makes sense. In this case, probabilities map neatly onto degrees of expectedness.8 Accordingly, insofar as surprise is inversely related to expectedness (the more surprising a proposition, the less one expects it to be true), it is straightforwardly related to probabilities. Thus, if h decreases the degree to which e is surprising, we represent this with the inequality Pr(e) < Pr(e|h). The strength of this inequality corresponds to the degree of statistical relevance between e and h, giving us:

CA2 (Positive Relevance): Ceteris paribus, the greater the degree of statistical relevance between e and h, the greater the value of E(e, h).

The third condition of adequacy defines a point at which explanatory power is unaffected. If h2 does nothing to increase or decrease the degree to which e, h1, or any logical combination of e and h1 are surprising, then h1 ∧ h2 will not make e any more or less surprising than h1 by itself already does. In this case, tacking h2 on to our hypothesis has no effect on the degree to which that hypothesis alleviates our surprise over e. Given that explanatory power has to do with a hypothesis’s ability to render its explanandum less surprising, we can state this in other words: if h2 has no explanatory power whatever relative to e, h1, or any logical combination of e and h1, then explanandum e will be explained no better or worse by conjoining h2 to our explanans h1. Making use of the above probabilistic interpretation of a decrease in surprise, this can be stated more formally as follows:

CA3 (Irrelevant Conjunction): If Pr(e ∧ h2) = Pr(e) × Pr(h2) and Pr(h1 ∧ h2) = Pr(h1) × Pr(h2) and Pr(e ∧ h1 ∧ h2) = Pr(e ∧ h1) × Pr(h2), then E(e, h1 ∧ h2) = E(e, h1).

The fourth adequacy condition postulates that the measure of explanatory power should, if the negation of the hypothesis entails the explanandum, not depend on the prior plausibility of the explanans. This is because the extent to which an explanatory hypothesis alleviates the surprising nature of some explanandum does not depend on considerations of how likely that hypothesis is in and of itself, independent of its relation to the evidence.

8. This is true by definition for the first, personalist interpretation; in terms of the more normative interpretation, probabilities still map neatly onto degrees of expectedness, although these are more specifically interpreted as rational degrees of expectedness.

More precisely, it would strike us as odd if we could, given that ¬h entails e, rewrite E(e, h) as a function of Pr(h) plus either Pr(h|e) or Pr(e). In that case, the plausibility of the hypothesis in itself would affect the degree of explanatory power that h lends to e. Since we find such a feature unattractive, we require what follows:

CA4 (Irrelevance of Priors): If ¬h entails e, then the values of E(e, h) do not depend on the values of Pr(h).9

We acknowledge that intuitions might not be strong enough to make a conclusive case for CA4. However, affirming that condition is certainly more plausible than denying it. Moreover, it only applies to a very specific case (¬h ⊨ e), while making no restrictions on the behavior of the explanatory power measure in more general circumstances.

These four conditions allow us to derive the following theorem (proof in app. A):

Theorem 1. All measures of explanatory power satisfying CA1–CA4 are monotonically increasing functions of the posterior ratio Pr(h|e)/Pr(h|¬e).

From this theorem, two important corollaries follow. First, we can derive a result specifying the conditions under which E takes its maximal and minimal values (proof in app. A):

Corollary 1. Measure E(e, h) takes maximal value if and only if h entails e and minimal value if and only if h implies ¬e.

Note that this result fits well with the concept of explanatory power that we are analyzing, according to which a hypothesis explains some proposition to the extent that it renders that proposition less surprising (more expected). Given this notion, any h ought to be maximally explanatorily powerful regarding some e when it renders e maximally unsurprising (expected), and this occurs whenever h guarantees the truth of e (Pr(e|h) = 1). Similarly, h should be minimally explanatory of e if e is maximally surprising in the light of h, and this occurs whenever h implies the falsity of e (Pr(e|h) = 0).

The second corollary constitutes our desired ordinal equivalence result:

Corollary 2. All measures of explanatory power satisfying CA1–CA4 are ordinally equivalent.

9. More precisely, we demand that there exists a function f : [0, 1] → ℝ so that, if ¬h implies e, either E(e, h) = f(Pr(h|e)) or E(e, h) = f(Pr(e)). Note that in this case, Pr(h|¬e) = 1, so Pr(h) = Pr(h|e) Pr(e) + 1 − Pr(e). If a function f with the above properties did not exist, we could not speak of a way in which E(e, h) would be independent of considerations of prior plausibility of h.

To see why the corollary follows from the theorem, let r be the posterior ratio of the pair (e, h), and let r′ be the posterior ratio of the pair (e′, h′). Without loss of generality, assume r > r′. Then, for any functions f and f′ that satisfy CA1–CA4, we obtain the following inequalities:

$$f(e, h) = g(r) > g(r') = f(e', h'),$$

$$f'(e, h) = g'(r) > g'(r') = f'(e', h'),$$

where the inequalities are immediate consequences of theorem 1. So any f and f′ satisfying CA1–CA4 always impose the same ordinal judgments, completing the first stage of our analysis.

2.2. Uniqueness of E. This section pursues the second task of choosing a specific and suitably normalized measure of explanatory power out of the class of ordinally equivalent measures determined by CA1–CA4. To begin, we introduce an additional, purely formal requirement of our measure:

CA5 (Normality and Form): Measure E is the ratio of two functions of Pr(e ∧ h), Pr(¬e ∧ h), Pr(e ∧ ¬h), and Pr(¬e ∧ ¬h), each of which are homogeneous in their arguments to the least possible degree k ≥ 1.10

Representing E as the ratio of two functions serves the purpose of normalization. The terms Pr(e ∧ h), Pr(¬e ∧ h), Pr(e ∧ ¬h), and Pr(¬e ∧ ¬h) fully determine the probability distribution over the truth-functional compounds of e and h, so it is appropriate to represent E as a function of them. Additionally, the requirement that our two functions be “homogeneous in their arguments to the least possible degree k ≥ 1” reflects a minimal and well-defined simplicity assumption akin to those advocated by Carnap (1950) and Kemeny and Oppenheim (1952, 315). This assumption effectively limits our search for a unique measure of explanatory power to those that are the most cognitively accessible and useful.

Of course, larger values of E indicate greater explanatory power of h with respect to e. Measure E’s maximal value, E(e, h) = 1, indicates the point at which explanans h fully explains its explanandum e, and E(e, h) = −1 (E’s minimal value) indicates the minimal explanatory power for h relative to e (where h provides a full explanation for e being false). The neutral point at which h lacks any explanatory power whatever relative to e is represented by E(e, h) = 0.

While we have provided an informal description of the point at which E should take on its neutral value 0 (when h lacks any explanatory power whatever relative to e), it is still left to us to define this point formally.

10. A function is homogeneous in its arguments to degree k if its terms all have the same total degree k.
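As a brief illustration of this definition (our example, not the authors’): the polynomial

$$xt - yz$$

is homogeneous of degree k = 2, since both of its terms have total degree 2, whereas a polynomial such as xt + z is not homogeneous, its terms having total degrees 2 and 1.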

Given our notion of explanatory power, a complete lack of explanatory power is straightforwardly identified with the scenario in which h does nothing to increase or decrease the degree to which e is surprising. Probabilistically, in such cases, h and e are statistically irrelevant to (independent of) one another:

CA6 (Neutrality): For explanatory hypothesis h, E(e, h) = 0 if and only if Pr(h ∧ e) = Pr(h) × Pr(e).

The final adequacy condition requires that the more h explains e, the less it explains its negation. This requirement is appropriate given that the less surprising (more expected) the truth of e is in light of a hypothesis, the more surprising (less expected) is e’s falsity. Corollary 1 and neutrality provide a further rationale for this condition. Corollary 1 tells us that E(e, h) should be maximal only if Pr(e|h) = 1. Importantly, in such a case, Pr(¬e|h) = 0, and this value corresponds to the point at which this same corollary demands E(¬e, h) to be minimal. In other words, given corollary 1, we see that E(e, h) takes its maximal value precisely when E(¬e, h) takes its minimal value and vice versa. Also, we know that E(e, h) and E(¬e, h) should always equal zero at the same point given that Pr(h ∧ e) = Pr(h) × Pr(e) if and only if Pr(h ∧ ¬e) = Pr(h) × Pr(¬e). The formal condition of adequacy that most naturally sums up all of these points is as follows.

CA7 (Symmetry): E(e, h) = −E(¬e, h).

These three conditions of adequacy, when added to CA1–CA4, conjointly determine a unique measure of explanatory power as stated in the following theorem (proof in app. B).11

Theorem 2. The only measure that satisfies CA1–CA7 is

$$E(e, h) = \frac{\Pr(h \mid e) - \Pr(h \mid \neg e)}{\Pr(h \mid e) + \Pr(h \mid \neg e)}.$$

Remark. Since, for Pr(h|¬e) ≠ 0,

$$\frac{\Pr(h \mid e) - \Pr(h \mid \neg e)}{\Pr(h \mid e) + \Pr(h \mid \neg e)} = \frac{\Pr(h \mid e)/\Pr(h \mid \neg e) - 1}{\Pr(h \mid e)/\Pr(h \mid \neg e) + 1},$$

it is easy to see that E is indeed an increasing function of the posterior ratio.
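To make the measure concrete, here is a minimal computational sketch (our illustration; the function name and the toy probabilities are ours, not the authors’). It computes E from the four joint probabilities over e and h and spot-checks CA6 (neutrality) and CA7 (symmetry):

```python
def explanatory_power(p_eh, p_enh, p_neh, p_nenh):
    """E(e, h) = [Pr(h|e) - Pr(h|not-e)] / [Pr(h|e) + Pr(h|not-e)],
    computed from Pr(e,h), Pr(e,not-h), Pr(not-e,h), Pr(not-e,not-h)."""
    p_h_given_e = p_eh / (p_eh + p_enh)
    p_h_given_ne = p_neh / (p_neh + p_nenh)
    return (p_h_given_e - p_h_given_ne) / (p_h_given_e + p_h_given_ne)

# CA6 (neutrality): if e and h are statistically independent, E(e, h) = 0.
print(explanatory_power(0.2, 0.2, 0.3, 0.3))                  # 0.0

# CA7 (symmetry): E(e, h) = -E(not-e, h); the second call feeds the same
# joint distribution with the e-cells and the not-e-cells exchanged.
print(explanatory_power(0.5, 0.1, 0.1, 0.3),
      -explanatory_power(0.1, 0.3, 0.5, 0.1))                 # equal values
```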

11. There is another attractive uniqueness theorem for E. It can be proven that E is also the only measure that satisfies CA3, CA5, CA6, CA7, and corollary 1, although we do not include this separate proof in this article.


Thus, these conditions provide us with an intuitively grounded, unique measure of explanatory power.12

3. Theorems of E. We have proposed the above seven conditions of adequacy as intuitively plausible constraints on a measure of explanatory power. Accordingly, the fact that these conditions are sufficient to determine E already constitutes a strong argument in this measure’s favor. Nonetheless, we proceed in this section to strengthen the case for E by highlighting some important theorems that follow from adopting this measure. Ultimately, the point of this section is to defend further our assertion that E is well behaved in the sense that it gives results that match our clear intuitions about the concept of explanatory power, even in one case where other proposed measures fail to do so.13

3.1. Addition of Irrelevant Evidence. Good (1960) and, more recently, McGrew (2003) both explicate h’s degree of explanatory power relative to e in terms of the amount of information concerning h provided by e. This results in the following intuitive and simple measure of explanatory power:14

$$I(e, h) = \ln\left[\frac{\Pr(e \mid h)}{\Pr(e)}\right].$$

According to this measure, the explanatory power of explanans h must remain constant whenever we add an irrelevant proposition e′ to explanandum e (where proposition e′ is irrelevant in the sense that it is statistically independent of h in the light of e):

12. Measure E is closely related to Kemeny and Oppenheim’s (1952) measure of “factual support” F. In fact, these two measures are structurally equivalent; however, regarding the interpretation of the measure, E(e, h) is F(h, e) with h and e reversed (h is replaced by e, and e is replaced by h).

13. Each of the theorems presented in this section can and should be thought of as further conditions of adequacy on any measure of explanatory power. Nonetheless, we choose to present these theorems as separate from the conditions of adequacy presented in section 2 in order to make explicit which conditions do the work in giving us a unique measure.

14. Good’s measure is meant to improve on the following measure of explanatory power defined by Popper (1959):[Pr (eFh)⫺ Pr (e)]/[Pr (eFh) ⫹ Pr (e)]. It should be noted that Popper’s measure is ordinally equivalent to Good’s in the same sense thatEis ordinally equivalent to the posterior ratioPr (hFe)/ Pr (hF¬e). Thus, the problem we present here for Good’s measure is also a problem for Popper’s.

$$I(e \wedge e', h) = \ln\left[\frac{\Pr(e \wedge e' \mid h)}{\Pr(e \wedge e')}\right] = \ln\left[\frac{\Pr(e' \mid e \wedge h)\Pr(e \mid h)}{\Pr(e' \mid e)\Pr(e)}\right] = \ln\left[\frac{\Pr(e' \mid e)\Pr(e \mid h)}{\Pr(e' \mid e)\Pr(e)}\right] = \ln\left[\frac{\Pr(e \mid h)}{\Pr(e)}\right] = I(e, h).$$

This is, however, a very counterintuitive result. To see this, consider the following example: let e be a general description of the Brownian motion observed in some particles suspended in a particular liquid, and let h be Einstein’s atomic explanation of this motion. Of course, h constitutes a lovely explanation of e, and this fact is reflected nicely by measure I:

$$I(e, h) = \ln\left[\frac{\Pr(e \mid h)}{\Pr(e)}\right] \gg 0.$$

However, take any irrelevant new statement e′ and conjoin it to e; for example, let e′ be the proposition that the mating season for an American green tree frog takes place from mid-April to mid-August. In this case, measure I judges that Einstein’s hypothesis explains Brownian motion to the same extent that it explains Brownian motion and this fact about tree frogs. Needless to say, this result is deeply unsettling.

Instead, it seems that, as the evidence becomes less statistically relevant to some explanatory hypothesis h (with the addition of irrelevant propositions), it ought to be the case that the explanatory power of h relative to that evidence approaches the value at which it is judged to be explanatorily irrelevant to the evidence (E = 0). Thus, if E(e, h) > 0, then this value should decrease with the addition of e′ to our evidence: 0 < E(e ∧ e′, h) < E(e, h). Similarly, if E(e, h) < 0, then this value should increase with the addition of e′: 0 > E(e ∧ e′, h) > E(e, h). And finally, if E(e, h) = 0, then this value should remain constant at E(e ∧ e′, h) = 0. Measure E gives these general results as shown in the following theorem (proof in app. C):

Theorem 3. If Pr(e′|e ∧ h) = Pr(e′|e)—or equivalently, Pr(h|e ∧ e′) = Pr(h|e)—and Pr(e′|e) ≠ 1, then:

• if Pr(e|h) > Pr(e), then E(e, h) > E(e ∧ e′, h) > 0;
• if Pr(e|h) < Pr(e), then E(e, h) < E(e ∧ e′, h) < 0; and
• if Pr(e|h) = Pr(e), then E(e, h) = E(e ∧ e′, h) = 0.
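Theorem 3 is easy to check numerically. The following sketch uses toy numbers of our own choosing (Pr(h) = 0.1, Pr(e|h) = 0.9, Pr(e|¬h) = 0.05, and an e′ that is independent of everything with Pr(e′) = 0.4): conjoining e′ leaves I untouched but pulls E strictly toward 0, as the theorem requires.

```python
from math import log

p_h = 0.1                    # our toy prior for h
p_e_h, p_e_nh = 0.9, 0.05    # Pr(e|h), Pr(e|not-h)
p_e = p_e_h * p_h + p_e_nh * (1 - p_h)        # Pr(e) = 0.135
p_ep = 0.4                   # Pr(e'), independent of e and of h

def I(likelihood, prior):    # Good/McGrew: ln[Pr(e|h)/Pr(e)]
    return log(likelihood / prior)

def E(likelihood, prior, p_h):
    # E via Pr(h|e) and Pr(h|not-e), both obtained from Bayes's Theorem
    p_h_e = likelihood * p_h / prior
    p_h_ne = (1 - likelihood) * p_h / (1 - prior)
    return (p_h_e - p_h_ne) / (p_h_e + p_h_ne)

# e' is irrelevant: Pr(e & e'|h) = Pr(e|h)Pr(e'), Pr(e & e') = Pr(e)Pr(e').
print(I(p_e_h, p_e), I(p_e_h * p_ep, p_e * p_ep))   # identical values
print(E(p_e_h, p_e, p_h), E(p_e_h * p_ep, p_e * p_ep, p_h))
# E drops from roughly 0.97 to roughly 0.82: still positive, but closer to 0.
```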

3.2. Addition of Relevant Evidence. Next, we explore whether E is well behaved in those circumstances in which we strengthen our explanandum by adding to it relevant evidence. Consider the case in which h has some explanatory power relative to e so that E(e, h) > −1 (i.e., h has any degree of explanatory power relative to e greater than the minimal degree). What should happen to this degree of explanatory power if we gather some new information e′ that, in the light of e, we know is explained by h to the worst possible degree?

To take a simple example, imagine that police investigators hypothesize that Jones murdered Smith (h) in light of the facts that Jones’s fingerprints were found near the dead body and Jones recently had discovered that his wife and Smith were having an affair (e). Now suppose that the investigators discover video footage that proves that Jones was not at the scene of the murder on the day and time that it took place (e′). Clearly, h is no longer such a good explanation of our evidence once e′ is added; in fact, h now seems to be a maximally poor explanation of e ∧ e′ precisely because of the addition of e′ (h cannot possibly explain e ∧ e′ because e′ rules h out entirely). Thus, in such cases, the explanatory power of h relative to the new collection of evidence e ∧ e′ should be less than that relative to the original evidence e; in fact, it should be minimal with the addition of e′. This holds true in terms of E, as shown in the following theorem (proof in app. D):

Theorem 4. If E(e, h) > −1 and Pr(e′|e ∧ h) = 0 (in which case, it also must be true that Pr(e′|e) ≠ 1), then E(e, h) > E(e ∧ e′, h) = −1.

Alternatively, we may ask what intuitively should happen in the same circumstance (adding the condition that h does not have the maximal degree of explanatory power relative to e—i.e., E(e, h) < 1) but where the new information we gain e′ is fully explained by h in the light of our evidence e. Let h and e be the same as in the above example, and now imagine that investigators discover video footage that proves that Jones was at the scene of the murder on the day and time that it took place (e′). In this case, h becomes an even better explanation of the evidence precisely because of the addition of e′ to the evidence. Thus, in such cases, we would expect the explanatory power of h relative to the new evidence e ∧ e′ to be greater than that relative to e alone. Again, E agrees with our intuition here (proof in app. D):

Theorem 5. If 0 < Pr(e′|e) < 1 and h does not already fully explain e or its negation (0 < Pr(e|h) < 1) and Pr(e′|e ∧ h) = 1, then E(e, h) < E(e ∧ e′, h).
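The murder example can likewise be encoded in a few lines. The joint distribution below uses our own hypothetical numbers; only the conditional Pr(e′|e ∧ h) is switched between 0 (the exculpatory footage of theorem 4) and 1 (the incriminating footage of theorem 5):

```python
from itertools import product

# Our toy numbers: h = "Jones murdered Smith", e = fingerprints and affair,
# e1 = the video footage. Worlds are triples (h, e, e1) of truth values.
P_H, P_E_H, P_E_NH, P_E1_ELSE = 0.3, 0.9, 0.2, 0.5

def joint(p_e1_given_eh):
    dist = {}
    for h, e, e1 in product([True, False], repeat=3):
        p = P_H if h else 1 - P_H
        pe = P_E_H if h else P_E_NH
        p *= pe if e else 1 - pe
        pe1 = p_e1_given_eh if (e and h) else P_E1_ELSE
        p *= pe1 if e1 else 1 - pe1
        dist[(h, e, e1)] = p
    return dist

def E(dist, event):   # E(event, h) from Pr(h|event) and Pr(h|not-event)
    p_ev = sum(p for w, p in dist.items() if event(w))
    p_h_ev = sum(p for w, p in dist.items() if event(w) and w[0]) / p_ev
    p_h_nev = sum(p for w, p in dist.items() if not event(w) and w[0]) / (1 - p_ev)
    return (p_h_ev - p_h_nev) / (p_h_ev + p_h_nev)

e = lambda w: w[1]
e_and_e1 = lambda w: w[1] and w[2]

d = joint(0.0)   # theorem 4: Pr(e1|e & h) = 0
print(E(d, e), E(d, e_and_e1))   # E(e,h) > E(e & e1, h) = -1.0
d = joint(1.0)   # theorem 5: Pr(e1|e & h) = 1
print(E(d, e), E(d, e_and_e1))   # E(e & e1, h) > E(e, h)
```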

While these last two theorems are highly intuitive, they are also quite limited in their applicability. Both theorems require in their antecedent conditions that one’s evidence be strengthened with the addition of some e′ that is itself either maximally or minimally explained by h in the light of e. However, our intuitions reach to another class of related examples in which the additional evidence need not be maximally or minimally explained in this way. In situations in which h explains e to some positive degree, it is intuitive to think that the addition of any new piece of evidence that is negatively explained by (made more surprising by) h in the light of e will decrease h’s overall degree of explanatory power. Similarly, whenever h has some negative degree of explanatory power relative to e, it is plausible to think that the addition of any new piece of evidence that is positively explained by (made less surprising by) h in the light of e will increase h’s overall degree of explanatory power. These intuitions are captured in the following theorem of E (proof in app. D):

Theorem 6. If E(e, h) > 0 and Pr(e′|e ∧ h) < Pr(e′|e), then E(e ∧ e′, h) < E(e, h). Alternatively, if E(e, h) < 0 and Pr(e′|e ∧ h) > Pr(e′|e), then E(e ∧ e′, h) > E(e, h).

4. Conclusions. Above, we have shown the following: first, E is a member of the specific family of ordinally equivalent measures that satisfy our first four adequacy conditions. Moreover, among the measures included in this class, E itself uniquely satisfies the additional conditions CA5–CA7. The theorems presented in the last section strengthen the case for E by showing that this measure does indeed seem to correspond well and quite generally to many of our clear explanatory intuitions. In light of all of this, we argue that E is manifestly an intuitively appealing formal account of explanatory power.

The acceptance of E opens the door to a wide variety of potentially fruitful, intriguing further questions for research. Here, we limit ourselves to describing very briefly two of these projects that seem to us to be particularly fascinating and manageable with E in hand.

First, measure E makes questions pertaining to the normativity of explanatory considerations much more tractable, at least from a Bayesian perspective. Given this probabilistic rendering of the concept of explanatory power, one has a new ability to ask and attempt to answer questions such as, “Does the ability of a hypothesis to explain some known fact itself constitute reason in that hypothesis’s favor in any sense?” or, relatedly, “Is there any necessary sense in which explanatory power is tied to the overall probability of an hypothesis?” Such questions call out for more formal work in terms of E attempting to show whether, and how closely, E(e, h) might be related to Pr(h|e). This further work would have important bearing on debates over the general normativity of explanatory power; it would also potentially lend much insight into discussions of Inference to the Best Explanation and its vices or virtues.

Second, we have presented and defended E here as an accurate normative account of explanatory power in the following sense: in the wide space of cases in which our conditions of adequacy are rationally compelling and intuitively applicable, one ought to think of explanatory power in accord with the results of E. However, one may wonder whether people actually have explanatory intuitions that accord with this normative measure. With E in hand, this question becomes quite susceptible to further study. In effect, the question is whether E is, in addition to being an accurate normative account of explanatory power, a good predictor of people’s actual judgments of the same. This question is, of course, an empirical one and thus requires an empirical study into the degree of fit between human judgments and theoretical results provided by E. Such a study could provide important insights both for the psychology of human reasoning and for the philosophy of explanation.15

Appendix A: Proof of Theorem 1 and Corollary 1

Theorem 1. All measures of explanatory power satisfying CA1–CA4 are monotonically increasing functions of the posterior ratio Pr(h|e)/Pr(h|¬e).

Proof. Pr(h|e), Pr(h|¬e), and Pr(e) jointly determine the probability distribution of the pair (e, h), so we can represent E as a function of these values: there is a g : [0, 1]³ → ℝ such that E(e, h) = g(Pr(e), Pr(h|e), Pr(h|¬e)).

First, note that whenever the assumptions of CA3 are satisfied (i.e., whenever h2 is independent of all of e, h1, and e ∧ h1), the following equalities hold:

$$\Pr(h_1 \wedge h_2 \mid e) = \Pr(h_2 \mid h_1 \wedge e)\Pr(h_1 \mid e) = \Pr(h_2)\Pr(h_1 \mid e),$$

$$\Pr(h_1 \wedge h_2 \mid \neg e) = \frac{\Pr(h_1 \wedge h_2 \wedge \neg e)}{\Pr(\neg e)} = \frac{\Pr(h_1 \wedge h_2) - \Pr(h_1 \wedge h_2 \wedge e)}{\Pr(\neg e)} = \Pr(h_2)\,\frac{\Pr(h_1) - \Pr(h_1 \wedge e)}{\Pr(\neg e)} = \Pr(h_2)\Pr(h_1 \mid \neg e). \quad (A1)$$

Now, for all values of c, x, y, z ∈ (0, 1), we can choose propositions e, h1, and h2 and probability distributions over these such that the independence assumptions of CA3 are satisfied, and c = Pr(h2), x = Pr(e), y = Pr(h1|e), and z = Pr(h1|¬e). Due to CA1, we can always find such propositions and distributions so long as E is applicable.

15. This second research project is, in fact, now underway. For a description and report of the first empirical study investigating the descriptive merits of E (and other candidate measures of explanatory power), see Schupbach (forthcoming).

The above equations then imply that Pr(h1 ∧ h2|e) = cy and Pr(h1 ∧ h2|¬e) = cz. Applying CA3 (E(e, h1) = E(e, h1 ∧ h2)) yields the general fact that

$$g(x, y, z) = g(x, cy, cz). \quad (A2)$$

Consider now the case that ¬h entails e; that is, Pr(e|¬h) = Pr(h|¬e) = 1. Assume that g(·, ·, 1) could be written as a function of Pr(e) alone. Accordingly, there would be a function h : [0, 1] → ℝ such that

$$g(x, y, 1) = h(x). \quad (A3)$$

If we choose y = Pr(h|e) < Pr(h|¬e) = z, it follows from equations (A2) and (A3) that

$$g(x, y, z) = g\left(x, \frac{y}{z}, 1\right) = h(x). \quad (A4)$$

In other words, g (and E) would then be constant on the triangle {y < z} = {Pr(h|e) < Pr(h|¬e)} for any fixed x = Pr(e). Now, since g is an analytic function (due to CA1), its restriction g(x, ·, ·) (for fixed x) must be analytic as well. This entails in particular that if g(x, ·, ·) is constant on some nonempty open set S ⊂ ℝ², then it is constant everywhere:

1. All derivatives of a locally constant function vanish in that environment (Theorem of Calculus).

2. We write, by CA1, g(x, ·, ·) as a Taylor series expanded around a fixed point (y*, z*) ∈ S = {y < z}:

$$g(x, y, z) = \sum_{j=0}^{\infty} \frac{1}{j!}\left[\left((y - y^*)\frac{\partial}{\partial y} + (z - z^*)\frac{\partial}{\partial z}\right)^{j} g(x, y^*, z^*)\right]_{y = y^*,\, z = z^*}.$$

Since all derivatives of g(x, ·, ·) in the set S = {y < z} are zero, all terms of the Taylor series, except the first one (g(x, y*, z*)), vanish.

Thus, g(x, ·, ·) must be constant everywhere. But this would violate the statistical relevance condition CA2 since g (and E) would then depend on Pr(e) alone and not be sensitive to any form of statistical relevance between e and h.

Thus, whenever ¬h entails e, g(·, ·, 1) either depends on its second argument alone or on both arguments. The latter case implies that there must be pairs (e, h) and (e′, h′) with Pr(h|e) = Pr(h′|e′) such that

$$g(\Pr(e), \Pr(h \mid e), 1) \neq g(\Pr(e'), \Pr(h' \mid e'), 1). \quad (A5)$$

Note that if Pr(e|¬h) = 1, we obtain

$$\Pr(e) = \Pr(e \mid h)\Pr(h) + \Pr(e \mid \neg h)\Pr(\neg h) = \Pr(h \mid e)\Pr(e) + (1 - \Pr(h)), \qquad\text{hence}\qquad \Pr(e) = \frac{1 - \Pr(h)}{1 - \Pr(h \mid e)}, \quad (A6)$$

and so we can write Pr(e) as a function of Pr(h) and Pr(h|e). Combining (A5) and (A6), and keeping in mind that g cannot depend on Pr(e) alone, we obtain that there are pairs (e, h) and (e′, h′) such that

$$g\left(\frac{1 - \Pr(h)}{1 - \Pr(h \mid e)}, \Pr(h \mid e), 1\right) \neq g\left(\frac{1 - \Pr(h')}{1 - \Pr(h' \mid e')}, \Pr(h' \mid e'), 1\right).$$

This can only be the case if the prior probability (Pr(h) and Pr(h′), respectively) has an impact on the value of g (and thus on E), in contradiction with CA4. Thus, equality in (A5) holds whenever Pr(h|e) = Pr(h′|e′). Hence, g(·, ·, 1) cannot depend on both arguments, and it can be written as a function of its second argument alone.

Thus, for any Pr(h|e) < Pr(h|¬e), there must be a g′ : [0, 1] → ℝ such that

$$E(e, h) = g(\Pr(e), \Pr(h \mid e), \Pr(h \mid \neg e)) = g\left(\Pr(e), \frac{\Pr(h \mid e)}{\Pr(h \mid \neg e)}, 1\right) = g'\left(\frac{\Pr(h \mid e)}{\Pr(h \mid \neg e)}\right).$$

This establishes that E is a function of the posterior ratio if h and e are negatively relevant to each other. By applying analyticity of E once more, we see that E is a function of the posterior ratio Pr(h|e)/Pr(h|¬e) in its entire domain (i.e., also if e and h are positively relevant to each other or independent).

Finally, CA2 implies that this function must be monotonically increasing since, otherwise, explanatory power would not increase with statistical relevance (of which the posterior ratio is a measure). Evidently, any such function satisfies CA1–CA4. QED

Corollary 1. Measure E(e, h) takes maximal value if and only if h entails e and minimal value if and only if h implies ¬e.

Proof. Since E is an increasing function of the posterior ratio Pr(h|e)/Pr(h|¬e), E is maximal if and only if Pr(h|¬e) = 0. Due to the regularity of Pr(·), this is the case if and only if ¬e entails ¬h, in other words, if and only if h entails e. The case of minimality is proven analogously. QED

Appendix B: Proof of Theorem 2 (Uniqueness of E)

Theorem 2. The only measure that satisfies CA1–CA7 is

$$E(e, h) = \frac{\Pr(h \mid e) - \Pr(h \mid \neg e)}{\Pr(h \mid e) + \Pr(h \mid \neg e)}.$$

Let x = Pr(e ∧ h), y = Pr(e ∧ ¬h), z = Pr(¬e ∧ h), and t = Pr(¬e ∧ ¬h), with x + y + z + t = 1. Write E(e, h) = f(x, y, z, t) (by CA5).

Lemma 1. There is no normalized function f(x, y, z, t) of degree 1 that satisfies our desiderata CA1–CA7.

Proof. If there were such a function, the numerator would have the form ax + by + cz + dt. If e and h are independent, the numerator must vanish, by means of CA6. In other words, for those values of (x, y, z, t), we demand ax + by + cz + dt = 0. Below, we list four different realizations of (x, y, z, t) that make e and h independent, namely, (1/2, 1/4, 1/6, 1/12), (1/2, 1/3, 1/10, 1/15), (1/2, 3/8, 1/14, 3/56), and (1/4, 1/4, 1/4, 1/4). Since these vectors are linearly independent (i.e., their span has dimension 4), it must be the case that a = b = c = d = 0. Hence, there is no such function of degree 1. QED

Lemma 2. CA3 entails that for any value of b ∈ (0, 1),

$$f(bx,\; y + (1 - b)x,\; bz,\; t + (1 - b)z) = f(x, y, z, t). \quad (B1)$$

Proof. For any x, y, z, t, we choose e, h1 such that

x = Pr(e ∧ h1), y = Pr(e ∧ ¬h1), z = Pr(¬e ∧ h1), t = Pr(¬e ∧ ¬h1).

Moreover, we choose an h2 such that the antecedent conditions of CA3 are satisfied, and we let b = Pr(h2). Applying the independencies between h2, e, and h1 and recalling (A1), we obtain

$$bx = \Pr(h_2)\Pr(e \wedge h_1) = \Pr(h_2)\Pr(e)\Pr(h_1 \mid e) = \Pr(e)\Pr(h_1 \wedge h_2 \mid e) = \Pr(e \wedge h_1 \wedge h_2),$$

and similarly

$$bz = \Pr(\neg e \wedge (h_1 \wedge h_2)), \quad y + (1 - b)x = \Pr(e \wedge \neg(h_1 \wedge h_2)), \quad t + (1 - b)z = \Pr(\neg e \wedge \neg(h_1 \wedge h_2)).$$

Making use of these equations, we see directly that CA3—that is, E(e, h1) = E(e, h1 ∧ h2)—implies equation (B1). QED

Proof of Theorem 2 (Uniqueness of E). Lemma 1 shows that there is no normalized function f(x, y, z, t) of degree 1 that satisfies our desiderata. Our proof is constructive: we show that there is exactly one such function of degree 2, and then we are done, due to the formal requirements set out in CA5. By CA5, we look for a function of the form

$$f(x, y, z, t) = \frac{ax^2 + bxy + cy^2 + dxz + eyz + gz^2 + ixt + jyt + rzt + st^2}{\bar{a}x^2 + \bar{b}xy + \bar{c}y^2 + \bar{d}xz + \bar{e}yz + \bar{g}z^2 + \bar{i}xt + \bar{j}yt + \bar{r}zt + \bar{s}t^2}. \quad (B2)$$

We begin by investigating the numerator.16 CA6 tells us that it has to be zero if Pr(e ∧ h) = Pr(e) Pr(h), in other words, if

$$x = (x + y)(x + z). \quad (B3)$$

Making use of x + y + z + t = 1, we conclude that this is the case if and only if xt − yz = 0:

$$xt - yz = x(1 - x - y - z) - yz = x - x^2 - xy - xz - yz = x - (x + y)(x + z).$$

The only way to satisfy the constraint (B3) is to set e = −i and to set all other coefficients in the numerator to zero. All other choices of coefficients do not work since the dependencies are nonlinear. Hence, f becomes

$$f(x, y, z, t) = \frac{i(xt - yz)}{\bar{a}x^2 + \bar{b}xy + \bar{c}y^2 + \bar{d}xz + \bar{e}yz + \bar{g}z^2 + \bar{i}xt + \bar{j}yt + \bar{r}zt + \bar{s}t^2}.$$

Now, we make use of corollary 1 and CA7 in order to tackle the coefficients in the denominator. Corollary 1 (maximality) entails that f = 1 if z = 0, and CA7 (symmetry) is equivalent to

$$f(x, y, z, t) = -f(z, t, x, y). \quad (B4)$$

First, applying corollary 1 yields

$$1 = f(x, 0, 0, t) = \frac{ixt}{\bar{a}x^2 + \bar{i}xt + \bar{s}t^2},$$

and by a comparison of coefficients, we get ā = s̄ = 0 and ī = i. Similarly, we obtain c̄ = ḡ = 0 and ē = i from

$$1 = f(x, 0, 0, t) = -f(0, t, x, 0) = \frac{ixt}{\bar{c}t^2 + \bar{e}xt + \bar{g}x^2},$$

combining corollary 1 with CA7 (i.e., eq. [B4]). Now, f has the form

$$f(x, y, z, t) = \frac{i(xt - yz)}{\bar{b}xy + \bar{d}xz + i(xt + yz) + \bar{j}yt + \bar{r}zt}.$$

16. The general method of our proof bears resemblance to Kemeny and Oppenheim’s (1952) theorem 27. However, we would like to point out two crucial differences. First, we use more parsimonious assumptions, and we work in a different—non-Carnapian—framework. Second, their proof contains invalid steps; e.g., they derive ē = 0 by means of symmetry (CA7) alone. (Take the counterexample f = (xt − yz + xz − z²)/(xt + yz + xz + z²), which even satisfies corollary 1.) Hence, our proof is truly original.

Assume now that j̄ ≠ 0. Let x, z → 0. We know by corollary 1 that in this case, f → 1. Since the numerator vanishes, the denominator must vanish too, but by j̄ ≠ 0 it stays bounded away from zero, leading to a contradiction (f → 0). Hence, j̄ = 0. In a similar vein, we can argue for b̄ = 0 by letting z, t → 0 and for r̄ = 0 by letting x, y → 0 (making use of [B4] again: −1 = f(0, 0, z, t)).

Thus, f can be written as

$$f(x, y, z, t) = \frac{i(xt - yz)}{\bar{d}xz + i(xt + yz)} = \frac{xt - yz}{(xt + yz) + axz}, \quad (B5)$$

by letting a = d̄/i.

It remains to make use of CA3 in order to fix the value of a. Set b = 1/2 in (B1) and make use of f(x, y, z, t) = f(bx, (1 − b)x + y, bz, (1 − b)z + t) (lemma 2) and the restrictions on f captured in (B5). By making use of (B1), we obtain the general constraint

$$\frac{xt - yz}{xt + yz + axz} = \frac{x(z/2 + t) - z(x/2 + y)}{x(z/2 + t) + z(x/2 + y) + axz/2} = \frac{xt - yz}{xt + yz + xz(2 + a)/2}. \quad (B6)$$

For (B6) to be true in general, we have to demand that a = 1 + a/2, which implies that a = 2. Hence,

$$f(x, y, z, t) = \frac{xt - yz}{xt + yz + 2xz} = \frac{x(t + z) - z(x + y)}{x(t + z) + z(x + y)},$$

implying

$$E(e, h) = \frac{\Pr(e \wedge h)\Pr(\neg e) - \Pr(\neg e \wedge h)\Pr(e)}{\Pr(e \wedge h)\Pr(\neg e) + \Pr(\neg e \wedge h)\Pr(e)} = \frac{\Pr(h \mid e) - \Pr(h \mid \neg e)}{\Pr(h \mid e) + \Pr(h \mid \neg e)}. \quad (B7)$$

QED
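Since the proof is constructive, the final identity is easy to machine-check. A small sympy sketch (ours, not part of the paper) confirms both the symmetry condition (B4) and the equivalence between the joint-probability form of (B7) and the conditional form of theorem 2:

```python
import sympy as sp

x, y, z, t = sp.symbols('x y z t', positive=True)

# f in the joint form of (B7): x = Pr(e & h), y = Pr(e & not-h),
# z = Pr(not-e & h), t = Pr(not-e & not-h).
f = (x*t - y*z) / (x*t + y*z + 2*x*z)

# (B4): f(x, y, z, t) = -f(z, t, x, y).
swap = f.subs({x: z, y: t, z: x, t: y}, simultaneous=True)
print(sp.simplify(f + swap))        # 0

# Theorem 2 form: Pr(h|e) = x/(x+y), Pr(h|not-e) = z/(z+t).
ph_e, ph_ne = x / (x + y), z / (z + t)
print(sp.simplify(f - (ph_e - ph_ne) / (ph_e + ph_ne)))   # 0
```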


Appendix C: Proof of Theorem 3

Theorem 3. If Pr(e′|e ∧ h) = Pr(e′|e)—or equivalently, Pr(h|e ∧ e′) = Pr(h|e)—and Pr(e′|e) ≠ 1, then:

• if Pr(e|h) > Pr(e), then E(e, h) > E(e ∧ e′, h) > 0;
• if Pr(e|h) < Pr(e), then E(e, h) < E(e ∧ e′, h) < 0; and
• if Pr(e|h) = Pr(e), then E(e, h) = E(e ∧ e′, h) = 0.

Proof. Since E(e, h) and the posterior ratio r(e, h) = Pr(h|e)/Pr(h|¬e) are ordinally equivalent, we can focus our analysis on that quantity:

$$\begin{aligned}
\frac{r(e, h)}{r(e \wedge e', h)} &= \frac{\Pr(h \mid e)}{\Pr(h \mid \neg e)} \times \frac{\Pr(h \mid \neg(e \wedge e'))}{\Pr(h \mid e \wedge e')} \\
&= \frac{1 - \Pr(e)}{\Pr(e)} \times \frac{\Pr(e \mid h)}{1 - \Pr(e \mid h)} \times \frac{1 - \Pr(e' \mid e \wedge h)\Pr(e \mid h)}{\Pr(e' \mid e \wedge h)\Pr(e \mid h)} \times \frac{\Pr(e \wedge e')}{1 - \Pr(e \wedge e')} \quad (C1) \\
&= \frac{1 - \Pr(e)}{1 - \Pr(e \mid h)} \times \frac{1 - \Pr(e' \mid e)\Pr(e \mid h)}{1 - \Pr(e)\Pr(e' \mid e)} \\
&= \frac{1 + \Pr(e)\Pr(e' \mid e)\Pr(e \mid h) - (\Pr(e) + \Pr(e \mid h)\Pr(e' \mid e))}{1 + \Pr(e)\Pr(e' \mid e)\Pr(e \mid h) - (\Pr(e \mid h) + \Pr(e)\Pr(e' \mid e))}.
\end{aligned}$$

This quantity is greater than one if and only if the numerator exceeds the denominator, that is, if and only if

$$0 < (\Pr(e \mid h) + \Pr(e)\Pr(e' \mid e)) - (\Pr(e) + \Pr(e \mid h)\Pr(e' \mid e)) = \Pr(e \mid h)(1 - \Pr(e' \mid e)) - \Pr(e)(1 - \Pr(e' \mid e)) = (\Pr(e \mid h) - \Pr(e))(1 - \Pr(e' \mid e)), \quad (C2)$$

which is satisfied if and only if Pr(e|h) > Pr(e) and not satisfied otherwise. Thus, r(e, h) > r(e ∧ e′, h) (and E(e, h) > E(e ∧ e′, h)) if and only if Pr(e|h) > Pr(e). The other two cases follow directly from (C2).

It remains to show that E(e, h) and E(e ∧ e′, h) always have the same sign. This follows from the fact that

$$\frac{\Pr(e \wedge e' \mid h)}{\Pr(e \wedge e')} = \frac{\Pr(e' \mid e \wedge h)\Pr(e \mid h)}{\Pr(e' \mid e)\Pr(e)} = \frac{\Pr(e \mid h)}{\Pr(e)}.$$

Thus, h is positively relevant to e if and only if it is positively relevant to (e ∧ e′). By CA2 and CA6, this implies that E(e ∧ e′, h) > 0 if and only if E(e, h) > 0 and vice versa for negative relevance. QED


Appendix D: Proofs of Theorems 4–6

Theorem 4. If E(e, h) > −1 and Pr(e′|e ∧ h) = 0 (in which case, it also must be true that Pr(e′|e) ≠ 1), then E(e, h) > E(e ∧ e′, h) = −1.

Proof. Under the assumptions of theorem 4, by application of Bayes’s Theorem,

$$\Pr(h \mid e \wedge e') = \frac{\Pr(h)\Pr(e \wedge e' \mid h)}{\Pr(e \wedge e')} = \frac{\Pr(h)\Pr(e' \mid e \wedge h)\Pr(e \mid h)}{\Pr(e \wedge e')} = 0.$$

Thus, E(e ∧ e′, h) = −1 < E(e, h). QED

Theorem 5. If 0 < Pr(e′|e) < 1 and h does not already fully explain e or its negation (0 < Pr(e|h) < 1) and Pr(e′|e ∧ h) = 1, then E(e, h) < E(e ∧ e′, h).

Proof. Note first that

$$\Pr(e \wedge e' \mid h) = \Pr(e' \mid e \wedge h)\Pr(e \mid h) = \Pr(e \mid h). \quad (D1)$$

Analogous to theorem 3, we prove this theorem by comparing the posterior ratios r(e, h) and r(e ∧ e′, h) and applying equation (D1):

$$\begin{aligned}
\frac{r(e, h)}{r(e \wedge e', h)} &= \frac{\Pr(h \mid e)}{\Pr(h \mid \neg e)} \times \frac{\Pr(h \mid \neg(e \wedge e'))}{\Pr(h \mid e \wedge e')} \\
&= \frac{1 - \Pr(e)}{\Pr(e)} \times \frac{\Pr(e \mid h)}{1 - \Pr(e \mid h)} \times \frac{1 - \Pr(e \wedge e' \mid h)}{\Pr(e \wedge e' \mid h)} \times \frac{\Pr(e \wedge e')}{1 - \Pr(e \wedge e')} \\
&= \frac{1 - \Pr(e)}{\Pr(e)} \times \frac{\Pr(e \wedge e')}{1 - \Pr(e \wedge e')} \\
&= \frac{\Pr(e \wedge e') - \Pr(e)\Pr(e \wedge e')}{\Pr(e) - \Pr(e)\Pr(e \wedge e')} < 1,
\end{aligned}$$

since, by assumption, Pr(e ∧ e′) = Pr(e) Pr(e′|e) < Pr(e). This implies that E(e, h) < E(e ∧ e′, h). QED

Theorem 6. If E(e, h) > 0 and Pr(e′|e ∧ h) < Pr(e′|e), then E(e ∧ e′, h) < E(e, h). Alternatively, if E(e, h) < 0 and Pr(e′|e ∧ h) > Pr(e′|e), then E(e ∧ e′, h) > E(e, h).


Proof. First, we note that if Pr(e′|e ∧ h) < Pr(e′|e), then also Pr(e ∧ e′|h) = Pr(e′|e ∧ h) Pr(e|h) < Pr(e′|e) Pr(e|h). Then we apply the same approach as in the previous proofs:

$$\begin{aligned}
\frac{r(e, h)}{r(e \wedge e', h)} &= \frac{1 - \Pr(e)}{\Pr(e)} \times \frac{\Pr(e \mid h)}{1 - \Pr(e \mid h)} \times \frac{1 - \Pr(e \wedge e' \mid h)}{\Pr(e \wedge e' \mid h)} \times \frac{\Pr(e \wedge e')}{1 - \Pr(e \wedge e')} \\
&> \frac{1 - \Pr(e)}{\Pr(e)} \times \frac{\Pr(e \mid h)}{1 - \Pr(e \mid h)} \times \frac{1 - \Pr(e' \mid e)\Pr(e \mid h)}{\Pr(e' \mid e)\Pr(e \mid h)} \times \frac{\Pr(e' \mid e)\Pr(e)}{1 - \Pr(e' \mid e)\Pr(e)} \\
&= \frac{1 - \Pr(e)}{1 - \Pr(e \mid h)} \times \frac{1 - \Pr(e' \mid e)\Pr(e \mid h)}{1 - \Pr(e' \mid e)\Pr(e)} \\
&= \frac{1 + \Pr(e)\Pr(e' \mid e)\Pr(e \mid h) - (\Pr(e) + \Pr(e \mid h)\Pr(e' \mid e))}{1 + \Pr(e)\Pr(e' \mid e)\Pr(e \mid h) - (\Pr(e \mid h) + \Pr(e)\Pr(e' \mid e))}.
\end{aligned}$$

This is exactly the term in the last line of (C1). We have already shown in the proof of theorem 3 that this quantity is greater than 1 if and only if Pr(e|h) > Pr(e) (i.e., if E(e, h) > 0). This suffices to prove the first half of theorem 6. The reverse case is proved in exactly the same way. QED

REFERENCES

Carnap, Rudolf. 1950. Logical Foundations of Probability. Chicago: University of Chicago Press.

Friedman, Michael. 1974. “Explanation and Scientific Understanding.” Journal of Philosophy 71 (1): 5–19.

Glymour, Clark. 1980. “Explanations, Tests, Unity and Necessity.” Nous 14:31–49.

Good, I. J. 1960. “Weight of Evidence, Corroboration, Explanatory Power, Information and the Utility of Experiments.” Journal of the Royal Statistical Society B 22 (2): 319–31.

Hempel, Carl G. 1965. “Aspects of Scientific Explanation.” In Aspects of Scientific Explanation and Other Essays in the Philosophy of Science, 331–496. New York: Free Press.

Hempel, Carl G., and Paul Oppenheim. 1948. “Studies in the Logic of Explanation.” Philosophy of Science 15 (2): 135–75.

Jeffrey, Richard. 1969. “Statistical Explanation versus Statistical Inference.” In Essays in Honor of Carl G. Hempel, ed. Nicholas Rescher, 104–13. Dordrecht: Reidel.

Kemeny, John G., and Paul Oppenheim. 1952. “Degree of Factual Support.” Philosophy of Science 19:307–24.

Kitcher, Philip. 1989. “Explanatory Unification and the Causal Structure of the World.” In Scientific Explanation, ed. Philip Kitcher and Wesley Salmon, 410–505. Minneapolis: University of Minnesota Press.

Lipton, Peter. 2004. Inference to the Best Explanation. 2nd ed. New York: Routledge.

Machamer, Peter, Lindley Darden, and Carl F. Craver. 2000. “Thinking about Mechanisms.” Philosophy of Science 67 (1): 1–25.

McGrew, Timothy. 2003. “Confirmation, Heuristics, and Explanatory Reasoning.” British Journal for the Philosophy of Science 54:553–67.

Peirce, Charles Sanders. 1931–35. The Collected Papers of Charles Sanders Peirce. Vols. 1–6, ed. Charles Hartshorne and Paul Weiss. Cambridge, MA: Harvard University Press.

———. 1958. The Collected Papers of Charles Sanders Peirce. Vols. 7–8, ed. Arthur W. Burks. Cambridge, MA: Harvard University Press.

Popper, Karl R. 1959. The Logic of Scientific Discovery. London: Hutchinson.

Salmon, Wesley C. 1971. “Statistical Explanation.” In Statistical Explanation and Statistical Relevance, ed. Wesley C. Salmon, 29–87. Pittsburgh: University of Pittsburgh Press.

———. 1984. Scientific Explanation and the Causal Structure of the World. Princeton, NJ: Princeton University Press.

Schupbach, Jonah N. Forthcoming. “Comparing Probabilistic Measures of Explanatory Power.” Philosophy of Science.

Woodward, James. 2003. Making Things Happen: A Theory of Causal Explanation. Oxford: Oxford University Press.

———. 2009. “Scientific Explanation.” In Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. Stanford, CA: Stanford University.
