

Ex-Post Egalitarianism and Legal Justice

Alon Harel

Hebrew University of Jerusalem

Zvi Safra

Tel Aviv University

Uzi Segal

Boston College

In any legal system, one finds numerous rules, practices, and constitutional pro-

visions that are incompatible with utilitarian considerations. It is not merely util-

itarianism that fails to explain a diverse range of rules and practices. Other

theories that, like utilitarianism, involve ex ante considerations cannot explain

them either. There are two possible primary explanations for the prevalence

of these nonutilitarian rules and practices: Kantian (deontological) explanations

and a view we label ex post egalitarianism, which requires that the state decide

on its action in an egalitarian manner ex post. Our approach allows for compar-

isons among different societies by giving meaning to statements like ‘‘Society A

is more egalitarian than society B.’’ Furthermore, we show that the more egal-

itarian societies should also employ less extreme criminal law rules and should

be more sensitive to various kinds of injustice, whether it is caused by individual

wrongful behavior or by criminal law rules.

1. Introduction

In any legal system, one finds numerous rules and practices as well as consti-

tutional provisions that are incompatible with utilitarian considerations (i.e.,

with maximizing the sum of individual utilities). These rules and practices

often grant benefits to an individual whose well-being is at risk; yet the costs

of these benefits, in terms of utilities, to other individuals may outweigh these

benefits. Thus, for instance, despite the persistent belief of economists that

efficiency requires imposing the harshest possible sanctions, legal systems

often impose light sanctions and consequently have to bear the high costs

of increasing the probability of detection.

It is not merely utilitarianism that fails to explain a diverse range of

rules and practices. Other theories that, like utilitarianism, involve ex ante

We thank Ariel Rubinstein for helpful discussions. Zvi Safra thanks the Israel Institute of Busi-

ness research for its financial support. Uzi Segal thanks the National Science Foundation (grant

0111541) for financial aid.

The Journal of Law, Economics, & Organization, Vol. 21, No. 1,

doi:10.1093/jleo/ewi003

© The Author 2005. Published by Oxford University Press. All rights reserved. For Permissions,
please email: journals.permissions@oupjournals.org



considerations cannot explain them either. An ex ante analysis ranks distributions
of ex ante utilities before people know the actual outcome they receive.1 Ex
post analysis, on the other hand, relates to the final distribution of utilities. This
analysis, too, is done before information concerning the distribution of final outcomes
is revealed, but relates to the evaluation of distributions of final outcomes.

There are two possible primary explanations for the prevalence of these non-

utilitarian rules and practices: Kantian (deontological) explanations and a view

we label ex post egalitarianism. Kantian explanations are based on the con-

viction that there are nonconsequentialist obligations. The state is sometimes

obliged to act or not to act in certain ways even if acting differently is con-

ducive to utility, or to the ex ante interests of individuals. Typically those

explanations rely on autonomy-based considerations and are based on the

belief that respecting the dignity of individuals constrains the promotion of

their well-being. For contemporary sophisticated deontological explanations

of various rules and practices, see Kamm (1996:143–204).

Ex post egalitarianism, on the other hand, requires that the State decide on its

action in an egalitarian manner ex post. It differs from the Kantian approach since

it is founded on a comparison between the well-being of the people who are

affected by a certain rule and practice rather than presupposing the existence of

moral constraints, which are independent of consequentialist considerations. It

differs from utilitarianism since it is concerned with equality, and not only with

the maximization of the sum of utilities. Finally, it differs from ex ante analysis,

as it considers the well-being of individuals after they have the knowledge of the

prevailing social outcomes.

The purpose of this article is to demonstrate that some legal rules and

practices that are traditionally justified in terms of Kantian explanations

can alternatively be explained in terms of ex post egalitarianism. Two impor-

tant qualifications should be stated. First, we do not reject Kantian consider-

ations. It is not claimed that the rules and practices examined here cannot be

explained on the basis of Kantian considerations. Instead, we simply claim that

they can also be explained in terms of ex post egalitarianism. Second, it is not

claimed here that ex post egalitarianism is an absolute value, which overrides

any conflicting values. The rules and practices examined here demonstrate that

ex post egalitarianism competes with other considerations such as ex ante anal-

ysis and utilitarian considerations.

To demonstrate our approach, consider the issue of sentencing practices.

As was argued by Becker (1968), sentencing practices pose a great challenge

for the utilitarian approach: If criminals react to the expected sanction, then

greater deterrence can be achieved by either increasing the probability of de-

tection or by increasing the size of the sanction. However, increasing the prob-

ability of detection is much more costly than increasing the size of the sanction.

Hence the most efficient way to deter criminals is to impose the harshest

1. Note that when we use the term ‘‘ex ante’’ we do not discuss the extreme situation of ex ante

decision process—the situation in which the decision is made behind the ‘‘veil of ignorance,’’

before the individual is even aware of his personal characteristics.



sanction possible and to reduce accordingly the probability of detection. In

reality, however, sanctions do not conform to Becker’s recommendations

and they are usually proportional to the degree of culpability and wrongfulness

of the crimes committed.

Numerous explanations have been offered for the puzzle raised by Becker. First,

if criminals are risk averse, an increase in the sanction is not a costless transfer

payment. Increasing the sanction and reducing the probability increases the

risk faced by individuals. Second, the system proposed by Becker eliminates

marginal deterrence—the incentive to substitute less for more serious crimes.
We suggest below the following explanation. Sanctions generate inequality

because those who are subjected to the sanctions are worse off relative to those

who are not. The harsher the sanction is, the larger the inequality between
two classes of individuals: on the one hand, offenders who are subjected to the
sanction and nonoffenders who are subjected to it by mistake; on the other hand,
offenders and nonoffenders who are not subjected to the sanction. Harsh

sanctions are indeed, as Becker observed, required by consideration of aggre-

gate utilities. Yet social interests in harsh sanctions must be balanced against

social concerns for equality. Therefore it may well be the case that what dic-

tates limits on the size of sanctions is ex post egalitarianism and not retributive

justice considerations.

Our approach allows for comparisons among different societies by giving

meaning to statements like ‘‘Society A is more egalitarian than society B.’’

Furthermore, we show that the more egalitarian societies should also employ

less extreme criminal law rules and should be more sensitive to various kinds

of injustice, whether it is caused by individual wrongful behavior or by the

society’s criminal law rules. As we elaborate in Section 4, such societies would

use less harsh sentencing practices; furthermore, they would try to avoid wrong-

ful convictions by raising the required burden of proof and, at the same time,

they would try not to sacrifice minorities for the sake of the whole society.

In Section 2 we present social policies and discuss the ex post and ex ante

approaches. We define the notion of being an ex post egalitarian society and

provide a way for comparing levels of egalitarianism among societies. In Section
3 we present our results, and in Section 4 we offer some applications.
Formal analysis is deferred to the appendix.

2. Social Policies and Their Evaluation

Consider a situation where individuals are facing some uncertainty regarding

outcomes that are controlled by society, for example, road accidents or some

health-related issues. There are many individuals and we assume that the risks

they face are independent of each other. If society is sufficiently large, we can

assume, for practical measures, that the proportion of those who receive a cer-

tain outcome equals the probability of receiving this outcome. For example, if

each member of society is facing an independent chance of 2% to fall victim to

a car theft, then in a large society the probability that the true proportion of

victims will differ from 2% by more than $\varepsilon$ is negligible. Social policies are



often aimed at controlling these proportions, but they rarely try (or are even

able) to determine the recipients of each outcome. For example, lower speed

limits or wider road shoulders reduce the probability of fatal car accidents, but

they do not tell who will be involved in accidents.
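
The large-society approximation above can be made concrete with a short sketch. It is only an illustration under stated assumptions: it uses Hoeffding's inequality (a standard bound that the article itself does not invoke) to bound the chance that the realized share of victims differs from the individual probability by more than a tolerance, and it adds a seeded simulation. The function names and the tolerance value are hypothetical.

```python
import math
import random


def hoeffding_bound(n, eps):
    """Upper bound on P(|realized share - p| > eps) for n independent 0/1 risks.

    Uses Hoeffding's inequality, 2 * exp(-2 * n * eps^2), capped at 1.
    """
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps * eps))


def simulated_share(n, p, seed=0):
    """Draw n independent risks with probability p; return the realized share."""
    rng = random.Random(seed)
    return sum(rng.random() < p for _ in range(n)) / n


if __name__ == "__main__":
    p, eps = 0.02, 0.005  # 2% theft risk from the text; eps is an assumed tolerance
    for n in (1_000, 50_000, 200_000):
        print(n, hoeffding_bound(n, eps), simulated_share(n, p))
```

As the society grows, the bound shrinks toward zero, which is the sense in which the proportion of victims can be treated as known even though their identities are not.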

There are two extreme ways in which society can view such issues: ex ante

and ex post. Suppose that society is facing a choice between two economic

policies. One implies an annual increase of 5% in everybody’s utility, the other

provides each person with an independent risk, where there is a 50% chance of

a 10% increase and a 50% chance of no change in utility. From an ex ante per-

spective, both policies are equally attractive, as all expected utility maximizers

are indifferent between the two. But the two policies may differ from an ex post

perspective. The first implies a 5% increase in the utility of all. As individual

risks are independent, the second policy implies (for a sufficiently large so-

ciety) a 10% increase in the utility of half of the population and no change

in the utility of the rest. If initially everybody’s utility is 100, now half of

the population will have 110 and half will get only 100. Ex post, the two pol-

icies are not the same and a nonutilitarian society may not be indifferent

between them.
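
A small numerical sketch of this comparison follows. The utilitarian evaluation is the average utility, as in the text; the second evaluation (the average of the square root of utility) is only an assumed example of a symmetric, strictly concave function and is not taken from the article.

```python
import math


def utilitarian(dist):
    """Average utility; dist is a list of (utility, population share) pairs."""
    return sum(u * share for u, share in dist)


def concave_welfare(dist):
    """Assumed egalitarian evaluation: average of sqrt(utility)."""
    return sum(math.sqrt(u) * share for u, share in dist)


# Ex post distributions when everybody starts at utility 100.
certain_policy = [(105.0, 1.0)]              # 5% increase for all
risky_policy = [(110.0, 0.5), (100.0, 0.5)]  # 10% increase for half, none for the rest

if __name__ == "__main__":
    for name, dist in (("certain", certain_policy), ("risky", risky_policy)):
        print(name, utilitarian(dist), concave_welfare(dist))
```

Both policies have the same average utility, so the utilitarian evaluation cannot separate them, while the concave evaluation ranks the equal distribution higher.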

Which is the correct analysis, the ex ante or the ex post? Consider another

situation, where one unit of an indivisible good needs to be given to one of two

individuals. Both receive utility one from receiving it and zero otherwise. All

allocation procedures lead to an uneven allocation of ex post utility, one person

receives one, the other zero. Diamond (1967:765) suggested randomization

over the two individuals as a tool for improving social well-being, thus cre-

ating ex ante egalitarianism. There is an extensive discussion in the literature

about whether social welfare is really improved by such randomizations.

Broome (1984a,b) argues in favor of ex post analysis, claiming that equality

of expected utilities is not a real equality. Harsanyi (1977) also argues against

attributing ex ante egalitarianism any significant value. Epstein and Segal

(1992) focus attention on ex ante analysis, while Karni and Safra (2000)

and, more explicitly, Ben Porath, Gilboa, and Schmeidler (1997) try to combine
both ex post and ex ante considerations into one evaluation function. This

article offers an ex post analysis, while assuming that society is not necessarily

utilitarian (i.e., evaluations of social policies do not depend solely on the sum

of individual utilities). We restrict attention to situations where each social

policy leads to a known ex post distribution of utilities. As argued above, even

though we cannot predict individual outcomes, we nevertheless know, for each

policy, what proportion of the population will receive each possible utility

level. We evaluate such distributions using a social welfare function W, which,

not being utilitarian, is not necessarily linear.

A utilitarian society is interested in maximizing the sum of individual util-

ities and does not pay attention to the diversity of the distribution of utilities.

Preferences for equality imply that society will be willing to reduce the av-

erage utility in order to make the distribution more concentrated around its

mean. Such concerns can be represented by quasi-concave functions, where

the social evaluation of the average of two equally attractive distributions



is better than both.2 Such functions can be monotonic in individual utilities, but

are also sensitive to utility differences between individuals, and admit some

trade-off between the sum of utilities and their disparity. We now present an

outline of our model; a more formal analysis appears in the appendix.

Given two distributions of utility $x$ and $y$, the mixture $\alpha x + (1 - \alpha)y$ gives
each individual $\alpha$ times his allocation under $x$ plus $1 - \alpha$ times his allocation
under $y$. For example, if $x = (3, 1)$ and $y = (1, 5)$, then $\frac{1}{2}x + \frac{1}{2}y = (2, 3)$. Our first

assumption is that mixtures of equally attractive ex post utility distributions

improve social welfare.

Quasi-Concavity. Let $x$ and $y$ be two distributions of utility. If $W(x) = W(y)$,
then for all $\alpha \in (0, 1)$, $W(\alpha x + (1 - \alpha)y) \geq W(x)$.

Strict quasi-concavity requires that for all $x \neq y$, $W(x) = W(y)$ implies
$W(\alpha x + (1 - \alpha)y) > W(x)$.3
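
The mixture operation and the strict quasi-concavity condition can be checked numerically on a toy case. In the sketch below, `mixture` follows the definition in the text, while `welfare` (the sum of square roots) is a hypothetical strictly quasi-concave function chosen only for illustration.

```python
import math


def mixture(x, y, alpha):
    """Give each individual alpha times his allocation under x plus (1 - alpha) under y."""
    return tuple(alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y))


def welfare(x):
    """Assumed example of a strictly quasi-concave W: sum of square roots."""
    return sum(math.sqrt(xi) for xi in x)


# The example from the text: half of (3, 1) plus half of (1, 5).
example = mixture((3.0, 1.0), (1.0, 5.0), 0.5)

# Two equally attractive distributions under the assumed W.
x, y = (4.0, 1.0), (1.0, 4.0)
gains = [welfare(mixture(x, y, a / 10)) - welfare(x) for a in range(1, 10)]

if __name__ == "__main__":
    print(example)
    print(all(g > 0 for g in gains))
```

Every strict mixture of the two equally ranked distributions is ranked above both, which is what the strict version of the assumption requires.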

As mentioned above, we are interested here in utility distributions and do

not care for the identity of the individual recipients of these utilities. Distri-

bution x of utilities can therefore be represented as a cumulative distribution,

where $F_x(u)$ is the proportion of individuals whose utility given $x$ does not

exceed u. Two distributions that lead to the same cumulative distribution

should therefore be equally attractive.

Symmetry. If the two utility distributions $x$ and $y$ satisfy $F_x = F_y$, then $W(x) = W(y)$.
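
As a minimal sketch of the symmetry requirement, the code below computes the cumulative distribution of a finite utility profile and checks that an assumed welfare function depends on the profile only through that distribution. The function names and the choice of the sum of square roots are illustrative assumptions, and the profiles are taken to describe societies of equal size.

```python
import math


def cumulative_share(x, u):
    """F_x(u): proportion of individuals whose utility under x does not exceed u."""
    return sum(1 for xi in x if xi <= u) / len(x)


def same_distribution(x, y):
    """True when two equal-size profiles induce the same cumulative distribution."""
    return sorted(x) == sorted(y)


def welfare(x):
    """Assumed symmetric W; math.fsum makes the result independent of ordering."""
    return math.fsum(math.sqrt(xi) for xi in x)


x = (3.0, 1.0, 5.0)
y = (5.0, 3.0, 1.0)  # same utilities assigned to different individuals

if __name__ == "__main__":
    print(cumulative_share(x, 3.0), same_distribution(x, y), welfare(x) == welfare(y))
```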

The major concept of the article is that of egalitarian social welfare

functions:

Definition 1. We say that a social welfare function is ex-post egalitarian if it

is strictly quasi-concave and symmetric.

For a meaningful comparative statics analysis, we need to be able to com-

pare different societies. For this, we will need the following concept:

Definition 2. A social welfare function $W'$ is more egalitarian than another
social welfare function $W$ if, whenever the society with the social welfare function
$W$ is willing to sacrifice the difference between the averages of the two
distributions in order to improve equality, so is the society with the social
welfare function $W'$.

2. Recall that from an ex ante perspective, such functions imply preferences for randomizations.

3. Note the relation of this definition to Diamond’s (1967:765) argument in favor of ex ante

evaluation: choosing the utility distribution $x$ with probability $p$ and $y$ with probability $1 - p$
yields each individual expected utility that is equal to $p$ times his utility under $x$ plus $1 - p$ times
his utility under y. Preferences for randomization imply quasi-concave preferences over social

lotteries. Assuming that all individuals are expected utility maximizers, such preferences imply

that the social welfare function is quasi-concave in ex ante utilities (Epstein and Segal, 1992).



(see the Appendix for a more precise definition). Figure 1 depicts this concept

for the case of two individuals. In this picture, $W'$ is more egalitarian than $W$
and distributions between $x$ and $E[x]$ (the average of the distribution $x$) display
more equality. In the region between $x$ and the main diagonal, the indifference
curve through $x$ of $W'$ is lower than the corresponding indifference curve of $W$,
hence $W'$ is showing more egalitarianism than $W$.

In the literature, considerations of equality that are based on cumulative dis-

tributions lead to the concept of aversion to mean-preserving spreads. How-

ever, by Dekel (1986), if W is symmetric and quasi-concave, then W must also

represent such aversion, hence this concept of equality is related to the concept

of quasi-concavity discussed above.

3. More Egalitarian Societies

Suppose that society has control over a decision variable $\alpha \in [0, 1]$ that leads to
a utility distribution $\gamma(\alpha)$. We assume that $\alpha > \beta$ implies that there is $s$ such that
the distribution of $\gamma(\beta) + s$ is a mean-preserving spread of the distribution of
$\gamma(\alpha)$. In other words, as $\alpha$ moves from zero to one, $\gamma(\alpha)$ is becoming more and
more concentrated (even with respect to a changing average). The curve $\gamma(\alpha)$ is
called a track. The optimal value of $\alpha$ depends on social preferences, and our

aim here is to compare these optimal values for societies with different degrees

of ex post egalitarianism.

The main result of the article is that the more egalitarian the society, the

further it is willing to sacrifice average utility in order to reduce the variation

of the optimal utility distribution. This result is stated in the following theorem.

Figure 1. $W'$ is more egalitarian than $W$.



Theorem 1. If $W'$ is more egalitarian than $W$, then along a track $\gamma$ where
there is a substitution between average utility and spread, the optimal point for
the more egalitarian function has a weakly lower sum of utilities and more
concentration than the optimal point for the less egalitarian function.

We present a formal statement of this theorem and provide a proof in the

appendix. Moreover, we provide three conditions for a strict result (Theorem

2). Another result, implied by both theorems, is that an egalitarian society

chooses a utility distribution that is more concentrated than the one chosen

by a utilitarian society. This result is stated in the following corollary (and

is proved in the appendix).

Corollary 1. Let W be an ex-post egalitarian social welfare function and let

$\gamma$ be a differentiable track. The optimal utility distribution of $W$ is more concentrated
(and has a lower sum of utilities) than the optimal distribution of

a utilitarian society.
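
The direction of Theorem 1 and Corollary 1 can be illustrated on a toy track. Everything in this sketch is an assumption made for illustration: the two-person track (average utility 10 - alpha^2, half-gap 4(1 - alpha)), the welfare family (average minus k times the variance, with k = 0 as the utilitarian case), and the grid search. A larger k is meant only to stand in for a more egalitarian society; the article's formal definition is in its appendix.

```python
def track(alpha):
    """Hypothetical track: as alpha rises the average falls and the spread shrinks."""
    mean = 10.0 - alpha ** 2
    half_gap = 4.0 * (1.0 - alpha)
    return (mean - half_gap, mean + half_gap)


def welfare(x, k):
    """Assumed family: average utility minus k times the variance (k = 0 is utilitarian)."""
    mean = sum(x) / len(x)
    variance = sum((xi - mean) ** 2 for xi in x) / len(x)
    return mean - k * variance


def optimal_alpha(k, steps=1000):
    """Grid search for the point on the track that maximizes welfare(., k)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: welfare(track(a), k))


if __name__ == "__main__":
    for k in (0.0, 0.05, 0.2):
        a = optimal_alpha(k)
        low, high = track(a)
        print(k, a, (low + high) / 2, high - low)
```

In this toy case the utilitarian optimum is at alpha = 0, and larger k moves the optimum toward a lower average and a smaller gap, in line with the comparative statics stated above.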

4. Applications

In this section we discuss some applications of our main result, namely, that

a more egalitarian society will go further in the direction of reducing utility

spread, even at the cost of total utility.

4.1 Victims of Crime versus Victims of the Criminal Law System

Under the most fundamental principles of evidence law, facts constitutive of

the defendant’s guilt have to be proven beyond a reasonable doubt.4 This rule is

perceived by practitioners as well as scholars to be grounded in justice-based

considerations; it is often described as a right of defendants against the state.

The rhetoric of rights and justice used in the justification of the strict standard

of proof is often hostile to utilitarian calculations. This is no accident; utili-

tarianism cannot in general justify the heightened burdens of proof associated

with this principle.

To see this, assume that a legal system can adopt either a strict or a lenient rule

of proof. Under the strict rule, guilt has to be proven beyond a reasonable

doubt. Under the lenient rule, guilt has to be proven in a ‘‘satisfactory man-

ner.’’ Each one of these alternative rules involves costs and benefits, and there

4. This is an old principle of common law. An important articulation of it can be found in the

English landmark case of Woolmington v. DPP [1935] A.C. 462. The court stated there unambig-

uously that ‘‘No matter what the charge or where the trial, the principle that the prosecution must

prove the guilt of the prisoner is part of the Common Law of England and no attempt to whittle it

down can be entertained’’ (see Woolmington v. DPP [1935] A.C. 481). In the United States the

same principle is regarded as required by the due process clause. The Supreme Court stated that this

clause ‘‘protects the accused against conviction except upon proof beyond reasonable doubt of

every fact necessary to constitute the crime with which he is charged.’’ See In re Winship, 397

U.S. 358, 364 (1970). The rule however had much longer and deeper roots which can be traced

to Roman law (Williams, 1963:186–190).



is some trade-off between convicting the innocent and acquitting the guilty

(Williams, 1963:188; Wertheimer, 1977:51–52).

It is reasonable to believe that a legal system that adopts the lenient rule
better fulfills its task of protecting its citizens from crime than a system that
adopts the strict rule, and that the costs of the strict rule in terms of deterrence
may outweigh the benefits it provides to the innocent people
who are better protected under it from wrongful conviction. In that case,
utilitarianism dictates a rejection of the strict rule.

In their efforts to justify the strict rule, utilitarians often develop justifica-

tions that depend on some restrictive assumptions. Bentham (1825:197), for

example, supported the claim that there ought to be a presumption in favor of

the accused, writing: ‘‘Generally speaking, a too easy acquittal excites regret

and uneasiness only among men of reflection; while the condemnation of an

accused, who turns out to have been innocent, spreads general dismay; all se-

curity appears to be destroyed; no defence can any longer be found, when even

innocence is insufficient.’’ Yet moral theorists usually reject utilitarian justi-

fications for the principle that guilt should be proven beyond a reasonable

doubt (Dworkin, 1986:72, 81–84). Some of their arguments rely on the intuition
that wrongful conviction seems to be a direct interference in the life of

an innocent person. In contrast, the failure to prevent a crime is merely an

omission on the part of the state.

Ex post egalitarianism is an in-between approach. Like Bentham and like utilitarianism,
it does not ignore the individual costs and benefits of different burdens
of proof. However, it also considers social norms and notions of justice,
which are represented by the quasi-concavity of the social welfare function. Such
functions pay more attention to the well-being of those members of society who
are worse off compared to the rest. To illustrate, consider the following structure.

Society is composed of innocents and criminals. Innocent people face the

risk of wrong conviction and the risk of criminal victimhood. Criminals may or

may not be caught and convicted (for simplicity, we ignore minor groups such

as criminals who are not convicted but are victims of crime). For a given

level of burden of proof $\alpha$, we get an ex post utility distribution of the form
$\gamma(\alpha) = (u_i, p_i)_{i=1}^{5}$, where u1 is the utility of convicted innocents, u2 is the utility
of convicted criminals, u3 is the utility of innocent victims of crime, u4 is the
utility of innocents who are neither convicted nor victims of crime, and u5 is the
utility of the unconvicted criminals. We assume that crime benefits the criminal,
that punishment is more severe than the consequences of the criminal act, and
that wrongful conviction is worse than just conviction to the bearer of the
punishment. Therefore we assume u1 < . . . < u5 (see distribution s1, the
continuous line, in Figure 2). As explained in
Section 2, $p_i$ is the proportion in the population of those who receive utility $u_i$
(and is also the probability of getting this utility level).

Society now raises the burden of proof while simultaneously raising the

probability of detection so that the level of crime in society remains untouched.

The reason for this simultaneous act is the desire that the cost of the increase in

crime level be shared by the whole society and not only by the additional vic-

tims. Criminals are interested in the probability of being punished, which is the



product of the probability of detection and the conditional probability of con-

viction given detection. As crime does not change, this combined probability

too does not change. Moreover, assuming one crime per criminal, the number

of criminals is unchanged. Therefore the number of those who receive utility

levels u2 and u5 is unchanged. Also, the size of the u3 part of society is not

changed—these being the people who suffer from crime.
Obviously the combination of higher burden of proof and higher rate of de-

tection that does not change the crime rate will reduce the number of convicted

innocents, hence the size of the u1 level will go down, while the size of the

u4 group will increase.

Better detection is of course not cost free. Suppose that society finances the

cost by taxing those who are not convicted and did not suffer from crime, that

is, groups 4 and 5 (see footnote 5). The new distribution is depicted by curve s2 (the dotted

line). Obviously the second distribution intersects the first one only once, and

from below. If the first distribution is optimal, then the new distribution must

have a lower expected value, otherwise, as it is less spread than before, a higher

expectation would have prevented distribution s1 from being optimal, even

for a utilitarian society, and certainly for an egalitarian one. We strengthen

this statement by assuming that the expected value is actually monotonically

decreasing as we move from the first to the second distribution.

The conditions of Theorem 1 are thus satisfied. Observe that the set of distributions
society can choose from is a track (Section 3): even when the second
distribution is shifted to the right by the whole increased cost of detection,
there will be only one crossing point, and the expected value of the shifted
distribution will be more than the expected value of distribution s1. We therefore
find that the more egalitarian society will seek a higher level of burden

of proof.
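
The single-crossing claim can be checked on made-up numbers. The sketch below is purely illustrative: the utility levels, the head counts (out of 1,000 people), and the tax of one utility unit on groups 4 and 5 are all hypothetical, chosen only to respect the ordering u1 < ... < u5 and the changes described in the text (fewer convicted innocents, more people in group 4, groups 2, 3 and 5 unchanged in size).

```python
def cdf(dist, u):
    """Number of people whose utility does not exceed u; dist is (utility, count) pairs."""
    return sum(count for util, count in dist if util <= u)


def mean(dist):
    """Average utility of a (utility, count) distribution."""
    return sum(util * count for util, count in dist) / sum(count for _, count in dist)


def crossing_signs(new, old):
    """Signed CDF differences (new minus old) at each point where the sign switches."""
    points = sorted({util for util, _ in new + old})
    signs = []
    for u in points:
        diff = cdf(new, u) - cdf(old, u)
        if diff != 0 and (not signs or (diff > 0) != (signs[-1] > 0)):
            signs.append(diff)
    return signs


# Hypothetical distributions: (utility, number of people out of 1,000).
s1 = [(0, 10), (20, 40), (60, 50), (100, 850), (120, 50)]
s2 = [(0, 5), (20, 40), (60, 50), (99, 855), (119, 50)]  # stricter proof, taxed groups 4 and 5

if __name__ == "__main__":
    print(crossing_signs(s2, s1), mean(s1), mean(s2))
```

With these numbers the cumulative distribution of s2 starts below that of s1 and switches sign exactly once, and s2 has the lower average, matching the pattern the argument relies on.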

Our analysis does not imply that the optimal social policy is to set the burden

of proof at its highest possible level (‘‘beyond any reasonable doubt’’). We

Figure 2. Burden of proof.

5. Arguably, one should tax group 3 as well (recall that groups 1 and 2 cannot be taxed, as they

are serving jail terms). However, we will ignore this tax in order to satisfy the conditions of

Theorem 1. For infinitesimal changes this tax is indeed negligible, as the number of those who

suffer from (serious) crime is small compared to the number of those who are not affected by

it, and the tax is small compared to the sanction against criminals.



recognize the validity of two claims on the social ruling: the obligation to
increase social well-being, but also the obligation to promote equality. Unlike

utilitarianism, and unlike some of the above justifications for the strict rule, we

refuse to give one claim lexicographic dominance over the other. Ex post egal-

itarian social welfare functions do indeed take both factors into consideration.

4.2 Torture and Other Forms of Cruel Punishment

One of the most puzzling concerns of the legal economist is the ceiling on the

size of the criminal sanction. Since Becker (1968), legal economists have tried to
explain the reasons for the widely held intuition that the criminal law system
should not impose the maximal possible sanctions.

Becker’s argument is simple and yet compelling. An expected punishment

of $1000 can be imposed by different combinations of fines and probabilities

of apprehension. If the costs of collecting fines are assumed to be zero, regardless
of the size of the fine, but the costs of apprehending and convicting criminals
rise with the probability of apprehension,6 then the most efficient

combination is a probability close to zero and a fine arbitrarily close to infinity

(see Becker, 1968; see also Posner, 2003:219–227). Under this argument, the

only optimal sanction for every crime is the most extreme, possibly torture
and death. Such a sanction ensures that even a very low probability
of apprehension and conviction would be sufficient to guarantee a sufficiently

large expected sanction. Yet our legal system does not conform to the Becker

model. Instead, our system imposes a ceiling on criminal sanctions. Torture is

prohibited altogether in modern legal systems and most modern legal systems

do not impose capital punishment.
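
Becker's trade-off described above can be traced with a few lines of arithmetic. The sketch assumes, hypothetically, that enforcement cost rises linearly with the probability of apprehension and that collecting fines is free; the cost coefficient and the list of probabilities are made-up values, and only the expected punishment of $1000 comes from the text.

```python
def fine_for_target(prob, expected_sanction=1000.0):
    """Fine needed so that prob * fine equals the target expected sanction."""
    return expected_sanction / prob


def enforcement_cost(prob, cost_per_unit_prob=1_000_000.0):
    """Assumed enforcement cost, rising linearly with the apprehension probability."""
    return cost_per_unit_prob * prob


# Each option delivers the same expected punishment of $1000.
options = [(p, fine_for_target(p), enforcement_cost(p)) for p in (0.5, 0.1, 0.01, 0.001)]

if __name__ == "__main__":
    for prob, fine, cost in options:
        print(prob, fine, cost)
```

Holding the expected sanction fixed, lower probabilities require ever larger fines but cost less to enforce, which is why the cost-minimizing combination drifts toward the most extreme sanction.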

There are several primary explanations for the ceilings on criminal sanc-

tions. Some legal economists dispute some of the presuppositions of Becker’s

analysis. They point out that criminals may be risk averse or risk preferring,

and if criminals are not risk neutral, Becker’s analysis does not yield the same

outcome (Polinsky and Shavell, 1979). Moreover, economists point out the

importance of marginal deterrence, that is, the incentive to substitute less

for more serious crimes as a reason for differentiating between the sanctions

imposed for different crimes (Shavell, 1983:1245–1246; Posner, 2003:222).

Traditional criminal law provides a Kantian justification for the ceiling on sanctions.
The wrongfulness and the culpability of criminals dictate what they deserve,
and criminal law should not impose any sanctions that exceed what is

being deserved. For a use of this Kantian insight, see Nozick (1981/2000:363–

397) and Fletcher (2000:454–491).

Ex post egalitarian considerations could explain the very same phenome-

non. Criminal sanctions yield benefits that are provided to the public at large

at the cost of the particular individual upon whom they are imposed. The im-

position of criminal sanctions contributes to the well-being of others by

6. These costs are increasing with the probability because higher probabilities imply more po-

lice, prosecutors, judges, defense attorneys, etc.



producing deterrence. Yet, if excessively large sanctions are imposed on a small group

of individuals, the disparity between the well-being of those who bear the costs

of deterrence and those who benefit from it is too large. The ceiling on the

criminal sanction is aimed at constraining this disparity.

As before, society is composed of innocents and criminals. The first group

consists of victims of crime (with the utility u2) and of nonvictims (with the

utility u3), while the second group consists of those who are punished (u1) and

of those who are not (u4).7 Here too we assume that crime benefits the criminals

and that punishment is more severe than the consequences of the criminal act.

Therefore we assume u1 < . . . < u4 (see distribution s1 in Figure 3).
Society now reduces the severity of punishment while simultaneously rais-

ing the probability of detection so that the number of convicted criminals is not

higher than before. It follows that the number of criminals is thus decreasing

(as before, we assume one crime and one victim per criminal). This, of course,

is not cost free (otherwise, society would not have been at an optimal point),

and we assume, as before, that the financial burden falls on the shoulders of

those who are not convicted and did not suffer from crime (see footnote 5). The

new distribution is depicted by curve s2. In this distribution, the utility of

convicted criminals is higher than before (as the punishment is less severe),

and there are not more of them than before. The utility of unconvicted crim-

inals is less than before (they have to pay for better detection), and since the

overall crime rate is down and the probability of detection is higher, the size of

this group diminishes. As there are fewer crimes, there are fewer victims of

crime, but their utility is the same as before. Finally, less crime means more

people who are neither criminals nor victims (the old u3 group), but since they
have to pay for better detection, their utility goes down. Obviously the new
distribution intersects the first one only once, and from below.

Figure 3. Cruel punishment.

7. We ignore here the possible existence of innocent people who are wrongly convicted. For the
sake of simplicity, we assume that all innocent people share the same utility. Relaxing this
assumption would not affect the outcome of our analysis. For example, introducing an additional
group of typical and nonactive potential criminals whose social state is inferior to innocent
non-victims and superior to innocent victims would create a new utility level u′ between u2 and
u3. Since it is reasonable to assume that this group is small and less wealthy, and hence only
low taxes are imposed on it, our analysis holds for this case as well.

If the first distribution is optimal, then the new distribution must have a lower
expected value; otherwise, as it is less spread than before, a higher expectation

would have prevented distribution s1 from being optimal even for a utilitarian

society. We thus assume that the expected value is monotonically decreasing

as we move from the first to the second distribution.

It is reasonable to assume that crime benefits criminals less than the harm it

imposes on its victims (otherwise it would have been socially optimal to have

more crime, at least for a utilitarian society). In other words, u4 − u3 < u3 − u2.
Moreover, since the number of convicted criminals is reduced, the change in
the number of victims must be higher than the change in the number of unpunished
criminals. In other words, when the new distribution s2 is shifted to
the right by the full amount of the increased cost of detection, its expected
value will be higher than that of the original distribution, hence the social choice
set is a track. The conditions of Theorem 1 are thus satisfied, and we therefore
find that the more egalitarian society will seek less severe punishment than
the less egalitarian one.
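
The argument of this subsection can be checked on a small numerical sketch. Everything below is an illustrative assumption rather than something taken from the article: the group sizes, the utility levels, the detection cost of 0.5 per taxpayer, and the one-parameter welfare function (mean utility minus `theta` times half the mean absolute difference), which serves only as a convenient stand-in for a symmetric, quasi-concave social welfare function.

```python
from typing import List, Tuple

# A distribution is a list of (utility, population share) pairs.
Dist = List[Tuple[float, float]]


def mean(d: Dist) -> float:
    """Expected utility of the distribution."""
    return sum(u * p for u, p in d)


def half_mean_difference(d: Dist) -> float:
    """Half the mean absolute difference between two random members."""
    return 0.5 * sum(p * q * abs(u - v) for u, p in d for v, q in d)


def welfare(d: Dist, theta: float) -> float:
    """Hypothetical welfare index: theta = 0 is utilitarian."""
    return mean(d) - theta * half_mean_difference(d)


def cdf(d: Dist, x: float) -> float:
    return sum(p for u, p in d if u <= x)


def crossing_signs(new: Dist, old: Dist, tol: float = 1e-12) -> List[int]:
    """Sign pattern of F_new - F_old over the joint support (zeros dropped)."""
    points = sorted({u for u, _ in new} | {u for u, _ in old})
    signs: List[int] = []
    for x in points:
        diff = cdf(new, x) - cdf(old, x)
        if abs(diff) > tol:
            sign = 1 if diff > 0 else -1
            if not signs or signs[-1] != sign:
                signs.append(sign)
    return signs


COST = 0.5  # assumed extra detection cost borne by everyone who is not convicted or victimized

# s1: harsh punishment, low detection. Order: punished, victims, nonvictims, unpunished.
s1: Dist = [(0.0, 0.02), (4.0, 0.10), (10.0, 0.80), (12.0, 0.08)]
# s2: milder punishment, higher detection, less crime.
s2: Dist = [(3.0, 0.02), (4.0, 0.06), (10.0 - COST, 0.88), (12.0 - COST, 0.04)]
s2_shifted: Dist = [(u + COST, p) for u, p in s2]

print(crossing_signs(s2, s1))                 # one crossing, from below
print(mean(s1), mean(s2), mean(s2_shifted))   # s2 has the lowest mean, shifted s2 the highest
print(welfare(s1, 0.0) > welfare(s2, 0.0))    # utilitarian ranking
print(welfare(s2, 0.9) > welfare(s1, 0.9))    # ranking for a strongly egalitarian index
```

With these assumed numbers the utilitarian index ranks s1 above s2, while the index with `theta = 0.9` reverses the ranking, which is the direction the text describes.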

4.3 Sacrificing Some for the Rest

The problem of whether or not society should sacrifice some (hopefully few)

individuals to save the rest is at least 2500 years old. In Iphigeneia at Aulis,

Euripides describes the attempted sacrifice of Iphigeneia to appease Artemis’

wrath, thus obtaining fair wind to carry the Greek navy out of the bay of Aulis

to sail for Troy (Hornblower and Spawforth, 1996:765). Similarly, rabbinical
law deals with the situation of a group of people who are surrounded by
enemies and are asked to surrender one of their number to be killed. The law is that
they should all accept death rather than surrender anyone, unless the enemy specified the
one they wanted (Tosefta, 1977:20). In both sources there is an unambiguous

reluctance to accept the idea that social goals may be achieved by sharply re-

ducing the well-being of some members of society.

Figure 4. Sacrificing some for the rest.



In this subsection we discuss a similar problem, where society can reduce

the utility of a minority to improve the position of the majority. We assume

a homogeneous society, where initially everyone has utility level u1. By re-

ducing the utility of some members of society it is possible to increase the

utility of the rest. Two such possibilities are depicted in Figure 4 (distributions
s2 and s3; distribution s1 represents the fully egalitarian situation). Such
redistributions of utility can be obtained, for example, by a military draft, jury duty,

and other such obligations.

If the expected value of the third distribution is less than that of the second

one, then there is no point in switching from s2 to s3 even in a utilitarian society.

We therefore assume that the expected value is monotonically decreasing as
society moves from distribution s3 to s2. Shifting distribution s3 to the left to
obtain the same expected value as that of distribution s2 maintains the
single-crossing property of the two distributions, hence Theorem 1 may be
applied. We obtain that the more egalitarian society will opt for fewer benefits

for the privileged at the cost of the underprivileged.

We would like to emphasize that our analysis does not imply that society

should not recruit only part of the citizenry, as it may well be that the fully

egalitarian distribution is suboptimal, for example, when a relatively modest

reduction in utility for a small number of individuals can significantly increase

the utility of the rest. But the ex post egalitarian society is certainly more sen-

sitive than the utilitarian society to issues of utility distribution, and will there-

fore ask for greater benefits for the privileged than the utilitarian society before

it agrees to further sacrifices by the underprivileged.
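
A minimal sketch of this comparison, under stated assumptions: the three distributions below and the welfare index (mean utility minus `theta` times half the mean absolute difference) are hypothetical choices made for illustration, with `theta = 0` standing for the utilitarian society and larger `theta` for a more egalitarian one.

```python
from typing import Dict, List, Tuple

Dist = List[Tuple[float, float]]  # (utility, population share)


def mean(d: Dist) -> float:
    return sum(u * p for u, p in d)


def half_mean_difference(d: Dist) -> float:
    return 0.5 * sum(p * q * abs(u - v) for u, p in d for v, q in d)


def welfare(d: Dist, theta: float) -> float:
    """Hypothetical welfare index; not a functional form from the article."""
    return mean(d) - theta * half_mean_difference(d)


# Assumed numbers: s1 is fully egalitarian; s2 and s3 burden 10% of society.
options: Dict[str, Dist] = {
    "s1": [(10.0, 1.0)],
    "s2": [(6.0, 0.1), (11.0, 0.9)],
    "s3": [(1.0, 0.1), (11.6, 0.9)],
}


def best(theta: float) -> str:
    """Name of the distribution that maximizes the index for a given theta."""
    return max(options, key=lambda name: welfare(options[name], theta))


print(best(0.0))  # utilitarian choice
print(best(0.5))  # choice of a more egalitarian society
```

In this example the expected value rises from s1 to s2 to s3, so the utilitarian index selects s3, while the index with `theta = 0.5` stops at s2. Neither selects the fully egalitarian s1, in line with the remark that partial recruitment may well be optimal.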

5. Conclusion

Our most fundamental rules and practices constrain the pursuit of maximiza-

tion of utility. It is often natural to explain these constraints as grounded in

deontological considerations founded on principles of dignity and inviolability

of persons. This indeed has been the explanation traditionally given to many of

the rules and principles constraining the pursuit of utility.

This article provides an additional explanation—one that relies on principles
of ex post egalitarianism. More specifically, it is argued that ex post egalitar-

ianism can explain a broad array of rules and practices, including the require-

ment that guilt be proven beyond reasonable doubt, the prohibition on torture

and cruel and unjust punishment, and the prohibition of sacrificing some for the

sake of saving others. These rules and practices constrain the pursuit of utility.

At the same time, they reallocate utility in a fashion that promotes ex post egal-

itarianism. More specifically, they reduce the costs which otherwise would be

imposed on people whose well-being turns out ex post to be lower at the expense

of those whose well-being turns out ex post to be higher.

Unlike ex ante egalitarianism, which is often achieved through randomiza-

tions over social members, ex post egalitarianism involves a real transfer of

goods from some individuals to others. It is therefore important that these

goods be divisible, as indeed are the goods in all of our examples.



6. Appendix

Let $X$ be the set of real bounded random variables on the probability space
$(S, \Sigma, P)$, and let $F_x$ denote the cumulative distribution function of $x \in X$, where
$X$ is the set of all utility distributions. Generic elements of $X$ are denoted $x, y, z$,
while degenerate random variables are denoted $r, s, t$. Scalars are denoted $a, b, c$.
A preference relation $\succeq$ is a binary relation on $X$ that is complete, transitive,
continuous (see footnote 8), and monotone with respect to the relation of first-order stochastic
dominance (as usual, $\sim$ denotes indifference and $\succ$ denotes strict preference).
The preference relation $\succeq$ is symmetric if for all $x, y \in X$, $F_x = F_y \Rightarrow x \sim y$. It
is quasi-concave if for all $x, y \in X$, $x \sim y \Rightarrow \forall a \in (0, 1),\ ax + (1 - a)y \succeq x$
(strict quasi-concavity is defined with strict preferences). Note that
quasi-concavity is defined with respect to outcomes, and not with respect
to probabilities. By construction, quasi-concavity implies preferences for
averaging, hence it is related to equity-seeking behavior. Let $U$ denote the set of
all symmetric and quasi-concave preference relations on $X$. The social welfare
functions $W$ and $W'$ used in this article represent preferences in $U$.
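
The two defining properties can be illustrated on a finite society. The sketch below is only an assumed example: the welfare index `welfare` (mean utility minus `theta` times half the mean absolute difference) is a hypothetical member of the class described above, not one of the functions $W$ and $W'$ used in the article. It checks symmetry on a permuted utility profile and quasi-concavity on a person-by-person average of two indifferent profiles.

```python
import math
from typing import List


def welfare(x: List[float], theta: float) -> float:
    """Hypothetical index on a profile of individual utilities (equal weights)."""
    n = len(x)
    mean = sum(x) / n
    half_md = sum(abs(a - b) for a in x for b in x) / (2 * n * n)
    return mean - theta * half_md


def mix(x: List[float], y: List[float], a: float) -> List[float]:
    """Averaging over outcomes: each person gets a*x_i + (1-a)*y_i."""
    return [a * xi + (1 - a) * yi for xi, yi in zip(x, y)]


theta = 0.5
x = [2.0, 8.0, 5.0]
y = [8.0, 5.0, 2.0]  # same distribution as x, assigned to different people

# Symmetry: equal distributions are ranked as indifferent.
same = math.isclose(welfare(x, theta), welfare(y, theta))

# Quasi-concavity: averaging two indifferent profiles does not reduce welfare.
gain = welfare(mix(x, y, 0.5), theta) - welfare(x, theta)

print(same, gain)
```

The mixture here averages outcomes person by person, which is the sense of quasi-concavity used in the text; it is not a probability mixture of the two distributions.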
Example 1. Consider the set $L \subset U$ of preference relations that satisfy
positive linearity: for all $a > 0$ and for all $r \in X$,

$$x \succeq y \iff ax + r \succeq ay + r.$$

Let $\succeq \in L$ and pick a non-degenerate utility distribution $x$. The preference
relation $\succeq$ can be fully reconstructed from the set $\{y \in X : y \sim x \text{ and } E[y] = E[x]\}$
and from the unique degenerate distribution $a$ that satisfies
$a \sim x$. The set $L$ was characterized in Safra and Segal (1998).
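
As a hedged illustration of positive linearity, consider a one-parameter index that is our own assumption and not the characterization given in Safra and Segal (1998): mean utility minus $\theta$ times half the mean absolute difference, where $\tilde{x}$ denotes an independent copy of $x$ and $0 \le \theta \le 1$. The short derivation below shows that a preference relation represented by this index satisfies the condition of Example 1.

```latex
W_\theta(x) = E[x] - \theta\, G(x), \qquad G(x) = \tfrac{1}{2}\, E\,\lvert x - \tilde{x} \rvert .

% For a scalar a > 0 and a degenerate r, the spread term is homogeneous
% of degree one and unaffected by the constant shift:
G(ax + r) = \tfrac{1}{2}\, E\,\lvert a x - a \tilde{x} \rvert = a\, G(x),

% hence
W_\theta(ax + r) = a E[x] + r - \theta a\, G(x) = a\, W_\theta(x) + r ,

% and therefore
W_\theta(x) \ge W_\theta(y) \iff W_\theta(ax + r) \ge W_\theta(ay + r).
```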

Definition 3. Let $\succeq, \succeq' \in U$. The preference relation $\succeq'$ is more
egalitarian than $\succeq$ if for all $x \in X$,

$$\{y \in X : y \succeq x \text{ and } E[y] \le E[x]\} \subseteq \{y \in X : y \succeq' x \text{ and } E[y] \le E[x]\}.$$

The preference relation $\succeq'$ is strictly more egalitarian than $\succeq$ if these
inclusions are strict for all non-degenerate $x$.
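
The inclusion in Definition 3 can be probed numerically. The sketch below is a sanity check under stated assumptions, not a proof: it uses a hypothetical family of indices (mean utility minus `theta` times half the mean absolute difference), takes `theta = 0.3` for the less egalitarian relation and `theta = 0.8` for the more egalitarian one, and searches random five-person utility profiles for a pair that violates the inclusion. The tolerance `tol` only guards against floating-point ties.

```python
import random
from typing import List


def welfare(x: List[float], theta: float) -> float:
    """Hypothetical welfare index on a utility profile with equal weights."""
    n = len(x)
    mean = sum(x) / n
    half_md = sum(abs(a - b) for a in x for b in x) / (2 * n * n)
    return mean - theta * half_md


def violates(x: List[float], y: List[float],
             theta_lo: float, theta_hi: float, tol: float = 1e-9) -> bool:
    """True if y is in the theta_lo set of Definition 3 but not in the theta_hi set."""
    mean_ok = sum(y) <= sum(x)  # profiles have equal length, so this compares means
    in_lo = mean_ok and welfare(y, theta_lo) >= welfare(x, theta_lo) + tol
    in_hi = mean_ok and welfare(y, theta_hi) >= welfare(x, theta_hi) - tol
    return in_lo and not in_hi


def count_violations(trials: int = 2000, n: int = 5, seed: int = 0) -> int:
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = [rng.uniform(0.0, 10.0) for _ in range(n)]
        y = [rng.uniform(0.0, 10.0) for _ in range(n)]
        bad += violates(x, y, theta_lo=0.3, theta_hi=0.8)
    return bad


print(count_violations())
```

For this family the check should find no violations: if $y$ has a mean no higher than $x$ yet is weakly preferred under the smaller `theta`, it must be less spread, and a larger `theta` only rewards that further.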

Note that the relation ‘‘more egalitarian than’’ is a partial order on the set of

preference relations U.

Fact 1. Assume $\succeq'$ is more egalitarian than $\succeq$ and that they are both
strictly quasi-concave. Then, for all $x \in X$,

$$\{y \in X : y \sim' x \text{ and } E[y] = E[x]\} = \{y \in X : y \sim x \text{ and } E[y] = E[x]\}.$$

That is, both induce the same preferences over subspaces in which the expected
values are fixed.

8. We use the topology of weak convergence with the Lévy metric on it. This metric is defined
by $d(F, G) = \inf\{\varepsilon > 0 : \text{for all } x,\ G(x - \varepsilon) - \varepsilon \le F(x) \le G(x + \varepsilon) + \varepsilon\}$; see Huber (1977).



Proof. Let $x \in X$ and denote $I_x = \{y \in X : y \sim x \text{ and } E[y] = E[x]\}$ and
$I'_x = \{y \in X : y \sim' x \text{ and } E[y] = E[x]\}$. First consider $y \in I_x$. By definition,
$y \succeq' x$. If $y \succ' x$, then

$$x \in \{z \in X : z \succeq y \text{ and } E[z] \le E[y]\} \setminus \{z \in X : z \succeq' y \text{ and } E[z] \le E[y]\},$$

a contradiction to the assumption that $\succeq'$ is more egalitarian than $\succeq$. Hence
$y \sim' x$, that is, $y \in I'_x$.

Next consider $y \in I'_x \setminus I_x$ and let $a$ denote the degenerate utility distribution in
which all utilities are equal to $E[x]$. By strict quasi-concavity, $a \succ x$ and $a \succ y$.
If $x \succ y$, then, by continuity, there exists $k \in (0, 1)$ such that $z = ky + (1 - k)a$
satisfies $z \sim x$. Since $z \in I_x$, it follows by the above argument that $z \in I'_x$, hence
$z \sim' y$; a contradiction to the strict quasi-concavity of $\succeq'$. Since $y \notin I_x$, it must
be the case that $y \succ x$. By the strict quasi-concavity of $\succeq'$ and the continuity
of $\succeq$, there exists $z$ satisfying $E[z] = E[x]$, $x \succ' z$, and $z \succ x$. Hence,

$$z \in \{y \in X : y \succeq x \text{ and } E[y] \le E[x]\} \setminus \{y \in X : y \succeq' x \text{ and } E[y] \le E[x]\},$$

a contradiction to the assumption that $\succeq'$ is more egalitarian than $\succeq$. $\square$

Let $x \triangleleft y$ denote that $F_y$ is a mean-preserving spread of $F_x$ while $F_x \neq F_y$. The
following fact shows that if $\succeq'$ is (strictly) more egalitarian than $\succeq$, then,
whenever the addition of $r > 0$ suffices to compensate the preference relation
$\succeq'$ for the lack of equity manifested by $y$, $r$ also (more than) suffices for the
preference relation $\succeq$. This result demonstrates the close connection between
our definition and the strong measure of risk aversion suggested by Ross

(1981). Note that the analysis of Ross is restricted to a subset of preferences

in U—preferences that correspond to expected utility preferences in the liter-
ature on risk.

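Fact 2. If $\succeq'$ is more egalitarian than $\succeq$, then, for all $x, y, r \in X$ such
that $x \triangleleft y$,

$$y + r \succeq' x \Rightarrow y + r \succeq x.$$

Similarly, if $\succeq'$ is strictly more egalitarian than $\succeq$, then

$$y + r \succeq' x \Rightarrow y + r \succ x.$$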
Proof. Let $x$, $y$, and $r$ satisfy $x \triangleleft y$, $y + r \succeq' x$, and $x \succ y + r$. By
monotonicity, there exists $z = y + r'$ ($r' > r$) satisfying $z \sim x$ and $z \sim' x + r'' \succ' x$
($r'' > 0$). Hence $x$ belongs to $\{w \in X : w \succeq z \text{ and } E[w] \le E[z]\}$, while it does
not belong to $\{w \in X : w \succeq' z \text{ and } E[w] \le E[z]\}$; a contradiction to the
assumption that $\succeq'$ is more egalitarian than $\succeq$. $\square$
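
To see the content of Fact 2 in a concrete case, one can compute the required compensation explicitly for an assumed family of indices. The family below (mean utility minus $\theta$ times half the mean absolute difference, with $\tilde{x}$ an independent copy of $x$) is our illustrative choice and does not appear in the article.

```latex
W_\theta(x) = E[x] - \theta\, G(x), \qquad G(x) = \tfrac{1}{2}\, E\,\lvert x - \tilde{x} \rvert , \qquad 0 < \theta \le 1 .

% Let F_y be a mean-preserving spread of F_x, so E[y] = E[x] and G(y) \ge G(x).
% Adding a degenerate r shifts the mean and leaves G unchanged:
W_\theta(y + r) \ge W_\theta(x)
  \iff E[y] + r - \theta\, G(y) \ge E[x] - \theta\, G(x)
  \iff r \ge \theta \,\bigl( G(y) - G(x) \bigr).

% The smallest compensation is therefore
r_\theta = \theta \,\bigl( G(y) - G(x) \bigr),
% which is nondecreasing in \theta: any r that satisfies the relation with the
% larger \theta also satisfies the relation with the smaller one.
```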

Definition 4. A track is a function $c : [0, 1] \to X$ such that

$$a > b \Rightarrow c(a) \triangleleft c(b) + s$$

for some $s \in X$ (see footnote 9).

9. s is a function of a and b.



Note that s can be either a positive or a nonpositive distribution. Also note

that if $c(a)$ is a degenerate distribution, then $a = 1$. Otherwise, if $c(a)$ is a
mean-preserving spread of $c(1) + s$, then $c(a) = c(1) + s$, while the definition of
$\triangleleft$ requires the two to be different from each other.
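
A simple family satisfying Definition 4 can be written down directly. The construction below is an assumed example, not one used in the article: $z$ is any fixed non-degenerate distribution with $E[z] = 0$, and $m(a)$ is any degenerate (constant) level.

```latex
c(a) = m(a) + (1 - a)\, z, \qquad a \in [0, 1].

% For a > b take the degenerate shift s = m(a) - m(b). Then
c(b) + s = m(a) + (1 - b)\, z ,
% and with \lambda = (1 - a)/(1 - b) \in [0, 1) we have
c(a) - m(a) = \lambda \,\bigl( c(b) + s - m(a) \bigr).

% For every convex \varphi and w = (1 - b) z (so E[w] = 0), Jensen's inequality gives
E\,\varphi(\lambda w) \le \lambda\, E\,\varphi(w) + (1 - \lambda)\, \varphi(0) \le E\,\varphi(w),
% so F_{c(b)+s} is a mean-preserving spread of F_{c(a)}, and the two differ
% because z is non-degenerate. The only degenerate point is c(1) = m(1).
```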

Theorem 1. Let $\succeq, \succeq' \in U$ and assume that $\succeq'$ is more egalitarian than $\succeq$.
Let $c$ be a track and let $c(a)$ and $c(a')$ be the unique optimal distributions
of $\succeq$ and $\succeq'$ along $c$, respectively. Then $a' \ge a$.

Proof. Suppose that $a > a'$ and let $s$ satisfy $c(a) \triangleleft c(a') + s$. Consider first
the case $s \ge 0$. The quasi-concavity of $\succeq'$ implies that $c(a) \succeq' c(a') + s$. By
monotonicity, $c(a') + s \succeq' c(a')$, hence $c(a) \succeq' c(a')$, a violation of the
assumption that $a'$ is the unique optimal point of $\succeq'$ along $c$.

Suppose now that $s < 0$. By assumption, $c(a') \succeq' c(a)$. Since $\succeq'$ is more
egalitarian than $\succeq$, it follows from Fact 2 and from the fact that $c(a) \triangleleft c(a') + s$ that

$$c(a') = [c(a') + s] + (-s) \succ' c(a) \Rightarrow c(a') \succ c(a),$$

a contradiction to the assumption that $a$ is the unique optimal point of $\succeq$
along $c$. It thus follows that $a' \ge a$. $\square$
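
The monotonicity asserted by Theorem 1 can be illustrated with a small grid search. All ingredients below are assumptions made for the sketch: the track `track(a)`, which shrinks a fixed mean-zero spread `Z` by the factor `1 - a` while lowering everyone's utility by a quadratic cost `k * a**2`, and the welfare index (mean utility minus `theta` times half the mean absolute difference). Under these assumptions the optimum has the closed form `0.34 * theta`, which the search should reproduce.

```python
from typing import List

Z = [-4.0, -1.0, 0.0, 2.0, 3.0]  # assumed mean-zero spread across five people


def welfare(x: List[float], theta: float) -> float:
    """Hypothetical welfare index on a utility profile with equal weights."""
    n = len(x)
    mean = sum(x) / n
    half_md = sum(abs(a - b) for a in x for b in x) / (2 * n * n)
    return mean - theta * half_md


def track(a: float, base: float = 10.0, k: float = 2.0) -> List[float]:
    """Assumed track: a = 1 is fully equal, larger a costs k * a**2 per person."""
    level = base - k * a * a
    return [level + (1.0 - a) * z for z in Z]


def optimum(theta: float, steps: int = 1000) -> float:
    """Grid point a in [0, 1] that maximizes welfare along the track."""
    grid = (i / steps for i in range(steps + 1))
    return max(grid, key=lambda a: welfare(track(a), theta))


for theta in (0.0, 0.3, 0.9):
    print(theta, optimum(theta))
```

The utilitarian index (`theta = 0`) stays at `a = 0`, and the optimal point moves toward the equal distribution as `theta` grows, which is the ordering $a' \ge a$ of the theorem in this particular example.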

Figure 5. Proof of Theorem 2.



Next we want to establish conditions under which $a'$ is strictly greater than
$a$. Define the positive tangent $c'_+(a)$ to satisfy

$$c'_+(a) = \lim_{a_n \downarrow a} \frac{c(a_n) - c(a)}{\lVert c(a_n) - c(a) \rVert},$$

where $\lVert \cdot \rVert$ is with respect to the Lévy metric (see footnote 8).
Definition 5. The track $c : [0, 1] \to X$ is equitably-differentiable if for all
$a$ the tangent $c'_+(a)$ is well defined and there exists $\bar{\varepsilon} > 0$ such that for all
$0 < \varepsilon < \bar{\varepsilon}$,

$$c(a) + \varepsilon\, c'_+(a) \triangleleft c(a) + s$$

for some $s = s(a, \varepsilon) \in X$.

The ray $c(a) + \varepsilon\, c'_+(a)$ is the linear continuation of the curve $c$ from the point
$c(a)$. The condition of the last display equation says that small movements in
this direction increase the level of equity. (Recall that all preferences in $U$ are
quasi-concave.)

Definition 6. The preference relation $\succeq \in U$ is smooth if for all
non-degenerate $x \in X$ there exists a unique linear function $g_x : X \to \mathbb{R}$ such that
$y \sim x$ implies $g_x(y) \ge g_x(x)$, and moreover, $g_x$ is continuous in $x$.

Since a hyperplane in $X$ is an indifference set of a linear function, it follows
that if a preference relation $\succeq \in U$ is smooth, then for all $x \in X$ there is
a unique hyperplane $H_x$ supporting the indifference set at that point. Note that
Fréchet differentiability may not imply this property (Machina, 2001; Safra
and Segal, 2002).

Let $\succeq, \succeq' \in U$ be strictly quasi-concave and smooth and assume that $\succeq'$ is
strictly more egalitarian than $\succeq$. Let $H_x$ (similarly $H'_x$) be the supporting
hyperplane to the indifference set of $\succeq$ ($\succeq'$) through the nondegenerate
distribution $x$. Let $D$ denote the line generated by all degenerate distributions,
$D = \{r : r \in X\}$, and consider the two-dimensional plane $T = \mathrm{Span}\{D, x\}$.
By monotonicity, $D$ is not contained in $H_x$, hence $H_x \cap T$ is a line. By
monotonicity and quasi-concavity there exists $t$ satisfying $\{t\} = (H_x \cap T) \cap D$
and, by symmetry and strict quasi-concavity, $t < E[x]$ (see Figure 5). By
construction, $H_x \cap T = \{x\} + \mathrm{Span}\{x - t\}$. Let $J_x$ ($J'_x$) be the indifference curves
of $\succeq$ ($\succeq'$) in $T$ through $x$. Let $G_x = \{y \in X : E[y] = E[x]\}$. Clearly,
$H_x = \mathrm{Span}\{H_x \cap G_x,\ x - t\}$ and, by Fact 1, $H'_x \cap G_x = H_x \cap G_x$.

By definition, the curves $J_x$ and $J'_x$ intersect each other at $x$ such that the
following holds: $J_x$ lies above $J'_x$ between $x$ and $D$, and $J_x$ lies below $J'_x$
otherwise. This, however, is not sufficient to imply that $x + (x - t)$ does not belong to
$H'_x$. To have this, we assume the following generic assumption:

$$(*)\quad \text{For all nondegenerate } x,\ H_x \neq H'_x.$$



Note that the set of points $\{x : H_x \cap T = H'_x \cap T\}$ is the complement of an open
and dense set (by comparability, $H_x \cap T = H'_x \cap T$ iff $H_x = H'_x$; by smoothness,
the set of points where $H_x \neq H'_x$ is open; and the fact that $\succeq'$ is strictly more
egalitarian than $\succeq$ implies that there is no open set on which $H_x = H'_x$).
Theorem 2. Let $\succeq, \succeq' \in U$ be strictly quasi-concave and smooth. Assume
that $\succeq'$ is strictly more egalitarian than $\succeq$, and that condition (*) is satisfied.
Let $c$ be an equitably-differentiable track and let $c(a)$ and $c(a')$ be the unique
optimal points of $\succeq$ and $\succeq'$ along $c$, respectively. If $a < 1$, then $a' > a$.

Proof. Since $a < 1$, $c(a)$ must be a non-degenerate distribution (see the
discussion after Definition 2 above). By Theorem 1, it is sufficient to show
that $a' \neq a$.

Write $x = c(a)$ and consider the tangent $c'_+(a)$. By the optimality of $x$, $x + c'_+(a) \in H_x$.
Therefore $x + c'_+(a) = z + d(x - t)$ for some $d$, where $z \in H_x \cap G_x$ ($= H'_x \cap G_x$). We
show that $x + c'_+(a)$ does not belong to $H_x \cap G_x$. Assume the contrary. Since
$c$ is equitably differentiable, it follows that for sufficiently small $\varepsilon > 0$,
$c(a) + \varepsilon\, c'_+(a) \triangleleft c(a)$ (note that the $s$ in the definition of an equitably differentiable
track satisfies $s = 0$). The smoothness of $\succeq$ implies that for sufficiently small
$\varepsilon > 0$, $c(a + \varepsilon) \succeq c(a)$; a contradiction. Therefore $d \neq 0$.

It follows by condition (*) that $x + (x - t)$ does not belong to $H'_x$. Now
$d \neq 0$ implies that $x + c'_+(a)$ does not belong to $H'_x$. Therefore the optimality
of $c(a')$ for $\succeq'$ shows that $c(a') \neq x$ and $a' \neq a$. $\square$
Corollary 2. Let $\succeq, \succeq' \in U$ be such that $\succeq'$ is strictly quasi-concave and
smooth and $\succeq$ is a utilitarian preference relation. Let $c$ be an
equitably-differentiable track and let $c(a)$ and $c(a')$ be the unique optimal points
of $\succeq$ and $\succeq'$, respectively. If $a < 1$, then $a' > a$.

Proof. We will use the notation of the previous proof. By Theorem 1, $a' \ge a$.
The condition that $\succeq$ is a utilitarian preference relation implies that the
supporting hyperplane to the indifference set of $\succeq$ through the nondegenerate
distribution $x$ satisfies $H_x = G_x$ (note that $t = E[x]$) and the tangent vector
$x + c'_+(a)$ belongs to $H_x \cap G_x$. As in the former proof, the facts that $c$ is
equitably-differentiable and that $\succeq'$ is smooth, symmetric, and strictly quasi-concave
imply that for sufficiently small $\varepsilon > 0$, $c(a + \varepsilon) \succeq' c(a)$, hence $a' > a$.

References
Becker, G.S. 1968. ‘‘Crime and Punishment: An Economic Approach,’’ 76 Journal of Political

Economy 169–217.

Ben Porath, E., I. Gilboa, and D. Schmeidler. 1997. ‘‘On the Measurement of Inequality Under

Uncertainty,’’ 75 Journal of Economic Theory 194–204.

Bentham, J. 1825. A Treatise on Judicial Evidence. London: Paget.

Broome, J. 1984a. ‘‘Uncertainty and Fairness,’’ 94 Economic Journal 624–32.

———. 1984b. ‘‘Selecting People Randomly,’’ 95 Ethics 38–55.

Dekel, E. 1989. ‘‘Asset Demand Without the Independence Axiom,’’ 57 Econometrica 163–69.

Diamond, P.A. 1967. ‘‘Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparison of

Utility: Comment,’’ 75 Journal of Political Economy 765–66.



Dworkin, R. 1986. ‘‘Principle, Policy, Procedure,’’ in R. Dworkin: A Matter of Principle. Oxford:

Oxford University Press.

Epstein, L.G., and U. Segal. 1992. ‘‘Quadratic Social Welfare Functions,’’ 100 Journal of Political

Economy 691–712.

Fletcher, G. 2000. Rethinking Criminal Law. Oxford: Oxford University Press.

Harsanyi, J.C. 1977. Rational Behavior and Bargaining Equilibrium in Games and Social Situa-

tions. Cambridge: Cambridge University Press.

Hornblower, S., and A. Spawforth. 1996. The Oxford Classical Dictionary. Oxford: Oxford
University Press.

Huber, P.J. 1977. Robust Statistical Procedures. Society for Industrial and Applied Mathematics.

Kamm, F.M. 1996. Morality, Mortality, vol. II. New York: Oxford University Press.

Karni, E., and Z. Safra. 2002. ‘‘Individual Sense of Justice: A Utility Representation,’’ 70 Econ-

ometrica 263–84.

Machina, M. 2001. ‘‘Payoff Kinks in Preferences Over Lotteries,’’ 23 Journal of Risk and Un-

certainty 207–60.

Myerson, R. 1981. ‘‘Utilitarianism, Egalitarianism and the Timing Effect in Social Choice Prob-

lems,’’ 49 Econometrica 883–97.

Nozick, R. 1981. Philosophical Explanations. Cambridge, MA: Harvard University Press.

Polinsky, M., and S. Shavell. 1979. ‘‘The Optimal Tradeoff Between the Probability and Magni-

tude of Fines,’’ 69 American Economic Review 880–91.

Posner, R. 2003. Economic Analysis of Law, 6th ed. New York: Aspen Law & Business.

Roemer, J.E. 1996. Theories of Distributive Justice. Cambridge, MA: Harvard University Press.

Ross, S.A. 1981. ‘‘Some Stronger Measures of Risk Aversion in the Small and in the Large with

Applications,’’ 49 Econometrica 621–38.

Safra, Z., and U. Segal. 1998. ‘‘Constant Risk Aversion,’’ 83 Journal of Economic Theory 19–42.

———. 2002. ‘‘On the Economic Meaning of Machina’s Fréchet Differentiability Assumption,’’
104 Journal of Economic Theory 450–61.

Shavell, S. 1985. ‘‘Criminal Law and the Optimal Use of Nonmonetary Sanctions as a Deterrent,’’

85 Columbia Law Review 1232–62.

Tosefta, 1977. Translated by J. Neusner. New York: Ktav.

Wertheimer, A. 1977. ‘‘Punishing the Innocent Unintentionally,’’ 20 Inquiry 45.

Williams, G. 1963. The Proof of Guilt: A Study of the English Criminal Trial, 3rd ed. London:

Stevens.
