A Theoretical Calculation of Expected Runs/Inning

by James Jones

Introduction

In this study I calculate how many runs a team should be expected to score per inning, under the assumption that the frequency distribution of events depends only on the base and out situation. This is a fairly reasonable assumption, although not entirely accurate in every case, so I see no need to relax it.

Theoretical Considerations - Single Table Case

In this case we consider the situation where we have a single state transition table. This would be suitable for calculating the Offensive Winning Percentage of a single player (which is the theoretical winning percentage that a lineup made up of nine identical copies of this player would produce facing average pitching). It is also a fairly simple case and thus serves as a useful introduction to the more complete case in which we introduce multiple tables so as to examine the effects of lineup.

Consider any base/out situation (e.g., runners on the corners with one out) to be a "state." Then various events cause the states to transition to other states in a probabilistic way. In general, the transitions are Markov processes, but we will see later that it is useful to modify the stochastic matrix slightly for easier computation.

There are three bases which may be occupied, resulting in 8 possible 'base' situations. There are also 3 possible 'out' situations. So there are 24 possible base/out combinations, corresponding to the first 24 states. We add a 25th state to represent the end of an inning. Let A be the matrix of state transition probabilities. Let B be a matrix of runs-scored expectation for each transition. Finally, let x be a column vector representing an initial state distribution. (Note that for the specific case of determining runs scored in an inning, the entries of x will be set to 1 for the 'none on, none out' state and 0 everywhere else.)
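
The state space described above can be sketched in a few lines of Python. The ordering chosen here (outs varying slowest, base occupancy as a tuple of first/second/third flags, 'end of an inning' last) and the names `STATES` and `INDEX` are assumptions made for illustration; the text does not fix any particular ordering beyond placing the end state last.

```python
from itertools import product

# Hypothetical ordering: outs vary slowest, then base occupancy as a
# (first, second, third) tuple of 0/1 flags. The end-of-inning state is last.
BASES = list(product((0, 1), repeat=3))            # 8 base situations
STATES = [(outs, bases) for outs in range(3) for bases in BASES]
STATES.append("end")                               # 25th, absorbing state

# Map each state to its row/column position in A and B.
INDEX = {state: i for i, state in enumerate(STATES)}

# Initial distribution x: all weight on 'none on, none out'.
x = [0] * len(STATES)
x[INDEX[(0, (0, 0, 0))]] = 1

# 'Runners on the corners with one out' from the text:
corners_one_out = INDEX[(1, (1, 0, 1))]
```

With this ordering, dropping the last entry of any vector (or the last row and column of any matrix) removes exactly the 'end of an inning' state.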

Let $w$ be a vector containing all 1's. Then for any vector $u$, $w^T u$ is the sum of the entries of $u$.

Now the expected number of runs scored in the plate appearance following a state distribution of $u$ is $w^T B u$. The first state distribution is initialized to be $x$, and since transitions are governed by $A$, the $k$'th state distribution will be $A^{k-1} x$. Thus we obtain the following formula for expected runs in an inning:

$$\langle r \rangle = w^T B x + w^T B A x + w^T B A^2 x + \cdots + w^T B A^k x + \cdots \tag{1}$$

This series should converge as long as there is a nonzero probability of making an out from any state. However, the series $1 + A + A^2 + \cdots + A^k + \cdots$ will not converge, since $A^k$ does not tend to zero but rather to a matrix whose 'end of an inning' row is all 1's and whose other entries are all zero (every state is eventually absorbed into the 'end of an inning' state, which transitions only to itself). However, since no runs can be scored from the 'end of an inning' state, we can see that $B + BA + BA^2 + \cdots + BA^k + \cdots$ will always converge. (Note that the full argument is a bit more involved than this, but since we assume a nonzero probability of making an out from any state, we may determine that an upper bound on this series is a geometric series with factor < 1, demonstrating convergence.)
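
As a numerical illustration of this convergence, the sketch below sums the first few terms of series (1) for the homers-and-outs model worked out later in this article. The home-run probability is written `h` here (it plays the role of the scalar p in that example), and the value 1/10 is an arbitrary choice for illustration, not a figure from the text.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def total(M, v):
    """w^T M v: the sum of the entries of the vector M v."""
    return sum(sum(m * vj for m, vj in zip(row, v)) for row in M)

h = Fraction(1, 10)   # assumed home-run probability, for illustration only
g = 1 - h             # probability of an out

# Columns are 'from' states, rows are 'to' states, in the order:
# none on/0 out, none on/1 out, none on/2 out, end of inning.
A = [[h, 0, 0, 0],
     [g, h, 0, 0],
     [0, g, h, 0],
     [0, 0, g, 1]]
B = [[h, 0, 0, 0],
     [0, h, 0, 0],
     [0, 0, h, 0],
     [0, 0, 0, 0]]
x = [1, 0, 0, 0]

def partial_sum(terms):
    """Sum of w^T B A^k x for k = 0 .. terms-1."""
    acc, BAk = Fraction(0), B
    for _ in range(terms):
        acc += total(BAk, x)
        BAk = matmul(BAk, A)
    return acc

sums = [partial_sum(n) for n in (1, 5, 20, 80)]
limit = 3 * h / (1 - h)   # closed form derived later in the article
```

The partial sums increase toward `limit`, even though the powers of `A` themselves never shrink to zero because the 'end of an inning' entry stays at 1.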

To compute (1), then, we cannot simply use the identity $1 + A + A^2 + \cdots + A^k + \cdots = (1 - A)^{-1}$. Instead, we examine block diagrams for $A$ and $B$. Suppose WLOG that the order of states is such that the last row and column correspond to the 'end of an inning' state. Then we have the following block diagrams:

$$A = \begin{bmatrix} A_0 & 0 \\ p^T & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} B_0 & 0 \\ q^T & 0 \end{bmatrix}$$

where the vector $p$ contains, for each state, the probability of making the out that ends the inning, and the vector $q$ contains the expected number of runs scored while ending an inning. Note that it is possible for entries of $q$ to be positive, but we will usually set $q = 0$ for simplicity.

Given our assumption that from every state there is a nonzero probability of making an out, we know that $A_0^k$ tends to 0 and is bounded above by a decreasing geometric sequence, thus the series $1 + A_0 + A_0^2 + \cdots + A_0^k + \cdots$ converges to $(1 - A_0)^{-1}$. A quick block-matrix computation shows that:

$$B A^k = \begin{bmatrix} B_0 A_0^k & 0 \\ q^T A_0^k & 0 \end{bmatrix}$$

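Taking $x_0$ to be the first $n-1$ entries of $x$ and $w_0$ to be the first $n-1$ entries of $w$, where $n$ is the number of states (so that the dropped last entry belongs to the 'end of an inning' state), we see that (1) reduces to:

$$\langle r \rangle = (w_0^T B_0 + q^T) (1 - A_0)^{-1} x_0. \tag{2}$$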
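
A minimal sketch of how (2) might be evaluated follows. Rather than forming the inverse, it solves the linear system for $(1 - A_0)^{-1} x_0$ directly, using exact fractions. The function names and the two-state toy game are invented for illustration: in the toy game every plate appearance is either an out that ends the inning or a single, each with probability 1/2, and a single scores any runner already on base.

```python
from fractions import Fraction

def solve(M, b):
    """Solve M y = b by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def expected_runs(A0, B0, q, x0):
    """Evaluate (w0^T B0 + q^T) (1 - A0)^{-1} x0 as in formula (2)."""
    n = len(A0)
    one_minus_A0 = [[(1 if i == j else 0) - A0[i][j] for j in range(n)]
                    for i in range(n)]
    y = solve(one_minus_A0, x0)                       # y = (1 - A0)^{-1} x0
    row = [sum(B0[i][j] for i in range(n)) + q[j]     # w0^T B0 + q^T
           for j in range(n)]
    return sum(r * yj for r, yj in zip(row, y))

# Hypothetical toy game. States: 0 = bases empty, 1 = runner on.
half = Fraction(1, 2)
A0 = [[0, 0],          # nothing ever returns to 'bases empty'
      [half, half]]    # a single leads to 'runner on' from either state
B0 = [[0, 0],
      [0, half]]       # a single with a runner on scores one run
q = [0, 0]             # no runs score on the inning-ending out

from_empty = expected_runs(A0, B0, q, [1, 0])
from_runner = expected_runs(A0, B0, q, [0, 1])
```

Changing the last argument changes the starting state, which is how the same routine would give run expectancies for any base/out situation.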

Theoretical Considerations - Multiple Matrix Case

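Now suppose we have a sequence of $n$ transition matrices $A_0, A_1, \ldots, A_{n-1}$. This sequence corresponds to the lineup under consideration, so here $n$ is the number of batters and the subscript indexes the batter rather than denoting the block used in the previous section. We also have a sequence $B_0, B_1, \ldots, B_{n-1}$ of scoring expectation matrices. So we may write:

$$\langle r \rangle = w^T (B_0 + B_1 A_0 + B_2 A_1 A_0 + \cdots + B_0 A_{n-1} A_{n-2} \cdots A_0 + \cdots) x \tag{3}$$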

At this point it is useful to introduce a different sequence of matrices, $\{P_i\}$, defined as the partial products $P_i = A_i A_{i-1} \cdots A_0$. Then we may rewrite the infinite series in (3) as

$$B_0 + B_1 P_0 + B_2 P_1 + \cdots + B_0 P_{n-1} + B_1 P_0 P_{n-1} + \cdots$$

Grouping, we obtain

$$B_0 (1 + P_{n-1} + P_{n-1}^2 + \cdots) + B_1 P_0 (1 + P_{n-1} + P_{n-1}^2 + \cdots) + \cdots + B_{n-1} P_{n-2} (1 + P_{n-1} + P_{n-1}^2 + \cdots)$$

Rewriting the geometric series and grouping again, we have

$$(B_0 + B_1 P_0 + \cdots + B_{n-1} P_{n-2}) (1 - P_{n-1})^{-1}.$$

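Thus, applying to each matrix the same block reduction used in the single table case (so that in (4) the symbols $A_i$, $B_i$ and $P_i$ stand for the blocks with the 'end of an inning' row and column removed, and $q_i^T$ is the last row of the full scoring matrix for batter $i$), we may write the result for multiple transition matrices:

$$\langle r \rangle = \left( w_0^T (B_0 + B_1 P_0 + \cdots + B_{n-1} P_{n-2}) + (q_0^T + q_1^T P_0 + \cdots + q_{n-1}^T P_{n-2}) \right) (1 - P_{n-1})^{-1} x_0. \tag{4}$$

Each $q_i^T$ carries the same factor $P_{i-1}$ as the corresponding $B_i$, since both act on the state distribution facing batter $i$; when every $q_i = 0$ these terms vanish.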
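
The lineup formula can be sketched in the same way. The example below assumes every $q_i = 0$ and uses the homers-and-outs model treated later in this article, with a hypothetical two-man lineup whose home-run probabilities (1/10 and 1/5) are arbitrary illustrative values. The helper names are invented here, and a truncated version of series (3) is included only as a cross-check.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def solve(M, b):
    """Solve M y = b by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def homer_blocks(h):
    """Reduced blocks (A, B) for a batter who homers with probability h."""
    g = 1 - h
    return ([[h, 0, 0], [g, h, 0], [0, g, h]],
            [[h, 0, 0], [0, h, 0], [0, 0, h]])

def lineup_expected_runs(blocks, x0):
    """Formula (4) with every q_i = 0."""
    n = len(x0)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    S = [[0] * n for _ in range(n)]          # running sum B_0 + B_1 P_0 + ...
    for A, B in blocks:
        BP = matmul(B, P)
        S = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(S, BP)]
        P = matmul(A, P)                     # P_i = A_i P_{i-1}
    one_minus_P = [[(1 if i == j else 0) - P[i][j] for j in range(n)]
                   for i in range(n)]
    y = solve(one_minus_P, x0)               # (1 - P_{n-1})^{-1} x0
    return sum(sum(s * yj for s, yj in zip(row, y)) for row in S)

def truncated_series(blocks, x0, cycles):
    """Sum series (3) over a fixed number of trips through the lineup."""
    acc, v = Fraction(0), list(x0)           # v: current state distribution
    for _ in range(cycles):
        for A, B in blocks:
            acc += sum(sum(b * vj for b, vj in zip(row, v)) for row in B)
            v = [sum(a * vj for a, vj in zip(row, v)) for row in A]
    return acc

x0 = [1, 0, 0]
identical = lineup_expected_runs([homer_blocks(Fraction(1, 10))] * 2, x0)
lineup = [homer_blocks(Fraction(1, 10)), homer_blocks(Fraction(1, 5))]
mixed = lineup_expected_runs(lineup, x0)
check = truncated_series(lineup, x0, 40)
```

When both batters are identical the result collapses to the single table answer, and for the mixed lineup the closed form agrees with the truncated series to within the truncation error.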

Practical Application: Only homers and outs

As a very simple example, we examine the case in which the result of every play is either a home run or an out. There will never be any runners on base, so the only states that arise are 'none on, none out', 'none on, one out', 'none on, two outs' and 'end of an inning.' I arrange the states in this order to produce the transition matrices below. Letting the probability of hitting a home run be $p$ (a scalar, not to be confused with the vector $p$ of the block diagrams above), we obtain the following matrices:

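$$A = \begin{bmatrix} p & 0 & 0 & 0 \\ 1 - p & p & 0 & 0 \\ 0 & 1 - p & p & 0 \\ 0 & 0 & 1 - p & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} p & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$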

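Given the definitions of $A_0$, $B_0$, and $q$ we can immediately see that $B_0 = p \cdot 1$ (that is, $p$ times the identity), $q = 0$ and

$$1 - A_0 = (1 - p) \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}$$

Whence we obtain

$$(1 - A_0)^{-1} = (1 - p)^{-1} \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$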

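And finally, from (2) we have

$$\langle r \rangle = (w_0^T B_0 + q^T) (1 - A_0)^{-1} x_0$$

$$\langle r \rangle = p \, w_0^T (1 - A_0)^{-1} x_0$$

$$\langle r \rangle = \frac{p}{1 - p} \begin{bmatrix} 3 & 2 & 1 \end{bmatrix} x_0$$

Taking $x_0$ to be $e_1$, our standard 'none on, none out' starting state, we see that $\langle r \rangle = 3p/(1 - p)$, as we can easily check by other means.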
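
One such check, sketched here as an illustration rather than as the author's own derivation, conditions on the first plate appearance. The symbol $E_k$, the expected runs when $k$ more outs remain in the inning, is introduced only for this sketch.

```latex
\begin{aligned}
E_0 &= 0, \\
E_k &= p\,(1 + E_k) + (1 - p)\,E_{k-1} \\
\Rightarrow\quad E_k &= \frac{p}{1 - p} + E_{k-1}, \\
E_3 &= \frac{3p}{1 - p}.
\end{aligned}
```

The intermediate values $E_3$, $E_2$, $E_1$ are 3, 2 and 1 times $p/(1 - p)$, matching the entries of the row vector obtained from the matrix computation.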

Conclusion

The method presented herein is fairly general and can be used, for example, to determine the theoretical value of various baseball events under various event frequency assumptions. By changing the value of x0 we may examine the effect changing the base/out situation has on run expectations, which can help us to determine the value of a stolen base, error, wild pitch, etc., under different conditions. In addition, it can illuminate the change in these values that comes about when event frequencies change. It would make sense if a greater frequency of home runs led to a decrease in the value of a stolen base and an increase in the negative value of a caught stealing. This method could determine precise values for this effect, and help to determine under what conditions the stolen base might still have value.

This method can help to determine success frequencies for managerial choices such as the hit-and-run or sacrifice bunt. If it is necessary to get just one run, B0 can be suitably redefined to determine the probability of scoring at least one run. This could then decide under what conditions a sacrifice bunt should be called for. Of course, there is still plenty of room for managerial decision-making as the relative probability of succeeding is an input to the method, not an output.
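
The text does not spell out how the matrices would be redefined for the 'at least one run' question. One possible reading, offered here purely as an assumption, is to stop following the chain once a run has scored: scoring transitions are removed from the transition block and counted once, with their probability, in the scoring vector. The sketch below applies that idea to the homers-and-outs model; the function name and the symbol `h` for the home-run probability are hypothetical.

```python
from fractions import Fraction

def at_least_one_run(h):
    """P(at least one run) in the homers-and-outs model (homer probability h)."""
    g = 1 - h
    # Modified transient block: the scoring transitions (the diagonal h
    # entries of the original block) are removed, so the chain is no longer
    # followed once a run has scored.
    A0 = [[0, 0, 0],
          [g, 0, 0],
          [0, g, 0]]
    # Probability, from each state, that the next plate appearance scores.
    score = [h, h, h]
    v = [Fraction(1), Fraction(0), Fraction(0)]   # start: none on, none out
    prob = Fraction(0)
    # The modified block is strictly lower triangular, so v reaches zero
    # after three steps and the series terminates.
    while any(v):
        prob += sum(s * vj for s, vj in zip(score, v))
        v = [sum(a * vj for a, vj in zip(row, v)) for row in A0]
    return prob

p_score = at_least_one_run(Fraction(1, 10))
```

For this model the answer can be confirmed directly, since the inning is scoreless only if all three of the first batters make outs, giving 1 - (1 - h)^3.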


Copyright ©2000 by James Jones. Redistribution is permitted, so long as there is no charge and this document is included without alteration in its entirety.