In OEI‐1973 an aggregation property for subgroups of populations is briefly mentioned in discussing Theil's entropy measure of inequality (pp. 35–6). The general property of additive separability is also discussed later on, in the context of welfare measurement (OEI‐1973, pp. 39–41), paying particular attention to an independence axiom due to Hamada (1973). These two types of conditions, now known as ‘decomposability’ and ‘subgroup consistency’, have come to play a central role in inequality analysis, in terms of theory as well as practical application.52 These conditions have also been used to classify inequality indices in terms of their acceptability. Several key characterizations of well‐known measures of inequality are based on these requirements (seen as axioms). Other measures, including most notably the Gini coefficient (still the most commonly used measure of inequality in empirical work), have been criticized for their failure to satisfy them. We now turn to these developments.53
The main idea behind decomposability of inequality measures can be traced to the analysis of variance (or ANOVA), a traditional method of evaluating ‘how much’ of the variance in a variable (such as income) can be ‘explained’ by relevant characteristics (such as age, sex, race, schooling, or work experience). The key formula of ANOVA links overall income variance to ‘between‐group’ and ‘within‐group’ variances. The ‘between‐group’ term B is the variance that would exist if each observation were replaced by the mean income of the group sharing the same characteristics, so that we concentrate only on variations between these groups. The ‘within‐group’ term W, on the other hand, is the weighted average of the variance within each group, where the weight is the ‘population share’ or the share of total observations in the respective group. In the two‐group case this may be written as:
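With notation assumed here for illustration (group distributions x and y with sizes n_x and n_y and means μ_x and μ_y, overall size n = n_x + n_y, and V for the variance), the identity described above would take the following form. This is a sketch built from the verbal definitions of W and B, not a quotation of the original display.

```latex
% \bar{x} denotes x with every income replaced by \mu_x (likewise \bar{y});
% (\bar{x}, \bar{y}) is the 'smoothed' distribution used for the between-group term.
V(x,y) \;=\; \underbrace{\frac{n_x}{n}\,V(x) + \frac{n_y}{n}\,V(y)}_{W}
       \;+\; \underbrace{V(\bar{x},\bar{y})}_{B}
```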
Note, though, that the variance is an absolute measure of dispersion, not a relative inequality indicator (see OEI‐1973, p. 27). Indeed, if each income is doubled, the overall measured dispersion is quadrupled. There are two common ways of converting the variance into a mean‐independent measure, either through taking the variance of logarithms, or through going for the coefficient of variation.55 Both these ways can be interpreted as using the variance over a transformation of incomes that makes them mean independent.
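The within/between identity for the variance, and its behaviour when every income is doubled, can be checked numerically. The following is a minimal sketch: the incomes are hypothetical, and the function names (`variance_breakdown`, `squared_cv`, `variance_of_logs`) are illustrative choices rather than anything taken from the text.

```python
from fractions import Fraction
from math import log, isclose

def mean(v):
    return sum(v) / len(v)

def variance(v):
    """Population variance: mean squared deviation from the mean."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def variance_breakdown(groups):
    """Return (W, B) for a list of group income lists."""
    n = sum(len(g) for g in groups)
    # W: within-group variances weighted by population shares
    within = sum(Fraction(len(g), n) * variance(g) for g in groups)
    # B: variance after replacing each income by its group mean
    smoothed = [mean(g) for g in groups for _ in g]
    return within, variance(smoothed)

def squared_cv(v):
    """Squared coefficient of variation: variance of the unit-mean distribution."""
    return variance(v) / mean(v) ** 2

def variance_of_logs(v):
    """Variance applied to log-incomes (incomes must be strictly positive)."""
    return variance([log(a) for a in v])

# Hypothetical incomes, chosen only for illustration
x = [Fraction(1), Fraction(3)]
y = [Fraction(4), Fraction(8)]
pooled = x + y
W, B = variance_breakdown([x, y])
doubled = [2 * a for a in pooled]

print(variance(pooled), W, B)                # overall variance, within, between
print(variance(doubled) / variance(pooled))  # effect of doubling every income
print(squared_cv(pooled) == squared_cv(doubled))
print(isclose(variance_of_logs(pooled), variance_of_logs(doubled), abs_tol=1e-12))
```

Exact rational arithmetic is used for the variance so that the identity V = W + B holds without rounding; the variance of logs is compared with a tolerance because logarithms are computed in floating point.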
The variance of logarithms is obtained by applying the variance to the distribution of log‐incomes. In fact, following the important and influential work of Mincer (1958, 1970), there has been considerable use of the income variable in logarithmic form in wage‐determination models. The resulting ‘semi‐log’ regression equation yields an ANOVA decomposition invoking a population‐share‐weighted within‐group term (like the variance), but a rather different between‐group term.56 However, the variance of logs, like the variance itself, is not Lorenz consistent. The variance of logs satisfies mean independence, but the basic Pigou–Dalton condition is violated. Such violations arise only when relatively high incomes are involved (OEI‐1973, pp. 28–9).57 Even so, this does not mean that the problem is a minor one. As demonstrated by Foster and Ok (1996), the likelihood of violations is significant, and the extent of the disagreement between the variance of logs and the Lorenz criterion can be surprisingly large.58 These difficulties do not necessarily remove the variance of logs from consideration, but they do provide an incentive to explore other possibilities.
The second procedure applies the variance to the normalized (or unit mean) distribution of incomes to obtain the squared coefficient of variation, $C^2$. This is indeed Lorenz consistent as well as mean independent, and its decomposition has a standard between‐group term. However, the population‐share weights $w_x$ on the within‐group term have to be altered from $(n_x/n)$ to $(n_x/n)(\mu_x/\mu)^2$ to account for the difference in the subgroup normalization factor $\mu_x$ and the overall normalization factor $\mu$. This adjusts the population share upward or downward depending on whether the subgroup mean is higher or lower than the overall mean.59
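The adjusted weights can be seen at work in a small computation. This sketch assumes two illustrative groups with incomes (0, 8) and (4, 20); the helper names are hypothetical.

```python
from fractions import Fraction

def mean(v):
    return sum(v) / len(v)

def variance(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def squared_cv(v):
    """Variance of the unit-mean distribution, i.e. variance / mean^2."""
    return variance(v) / mean(v) ** 2

def cv2_breakdown(groups):
    """Return (weights, W, B) for the squared coefficient of variation."""
    pooled = [a for g in groups for a in g]
    n, mu = len(pooled), mean(pooled)
    # Population share scaled by the squared ratio of subgroup mean to overall mean
    weights = [Fraction(len(g), n) * (mean(g) / mu) ** 2 for g in groups]
    within = sum(w * squared_cv(g) for w, g in zip(weights, groups))
    # Between-group term: the measure applied to the smoothed distribution
    smoothed = [mean(g) for g in groups for _ in g]
    return weights, within, squared_cv(smoothed)

x = [Fraction(0), Fraction(8)]
y = [Fraction(4), Fraction(20)]
weights, W, B = cv2_breakdown([x, y])
print(weights)                 # poorer group weighted below 1/2, richer group above
print(W, B, squared_cv(x + y)) # W + B reproduces the overall value
```

In this example the two weights are 1/8 and 9/8, so they do not sum to one; the poorer group's population share of 1/2 is scaled down and the richer group's is scaled up.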
Theil's ‘entropy’ measure T also has an additive decomposition, but its formula has weights of the form $w_x = (n_x/n)(\mu_x/\mu)$, or the share of the group in the total income. The population share is still adjusted in favour of richer subgroups, but to a lesser extent than the previous measure.60 Theil's second measure D returns to the pure population‐share weighting $w_x = n_x/n$ of the variance. All three of the measures $C^2$, T and D have decompositions of the form:
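In notation assumed here for illustration (I for the inequality measure, and a bar for a group distribution smoothed to its own mean), the common form shared by the three measures can be sketched as follows. The exponents are read off the weights quoted in the text; the display is a reconstruction, not a quotation.

```latex
% \bar{x}: every income in x replaced by \mu_x; \bar{y}: likewise for y.
I(x,y) \;=\; \underbrace{w_x\,I(x) + w_y\,I(y)}_{W}
       \;+\; \underbrace{I(\bar{x},\bar{y})}_{B},
\qquad
w_x = \frac{n_x}{n}\left(\frac{\mu_x}{\mu}\right)^{\alpha},
\quad
w_y = \frac{n_y}{n}\left(\frac{\mu_y}{\mu}\right)^{\alpha}
% \alpha = 2 for C^2, \alpha = 1 for T, \alpha = 0 for D
```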
The ‘if’ portion of the proof follows immediately, since each generalized entropy measure $I_\alpha$ can be additively decomposed with weights $w_x = (n_x/n)(\mu_x/\mu)^\alpha$. The ‘only if’ part of the proof is quite challenging, since it requires the derivation of a specific functional form (viz., $I_\alpha$) from the assumed general properties. This is accomplished using methods from the study of functional equations, which, like differential equations, offers up an entire function as a solution, but from equations that do not involve derivatives.62
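The ‘if’ direction can be illustrated numerically. The sketch below assumes the commonly used normalization of the generalized entropy family, with the factor 1/(α(α − 1)) and the two Theil measures as the limiting cases α = 1 and α = 0; the text does not display the formula, so this parametrization is an assumption. Under it, the α = 2 member equals half the squared coefficient of variation. Incomes are hypothetical and must be strictly positive.

```python
from math import log, isclose

def mean(v):
    return sum(v) / len(v)

def generalized_entropy(v, alpha):
    """Generalized entropy index under an assumed standard normalization."""
    n, mu = len(v), mean(v)
    if alpha == 1:   # Theil's entropy measure T
        return sum((a / mu) * log(a / mu) for a in v) / n
    if alpha == 0:   # Theil's second measure D
        return sum(log(mu / a) for a in v) / n
    return (sum((a / mu) ** alpha for a in v) / n - 1) / (alpha * (alpha - 1))

def ge_breakdown(groups, alpha):
    """Return (W, B) using within-group weights (n_g/n) * (mu_g/mu)**alpha."""
    pooled = [a for g in groups for a in g]
    n, mu = len(pooled), mean(pooled)
    within = sum(
        (len(g) / n) * (mean(g) / mu) ** alpha * generalized_entropy(g, alpha)
        for g in groups
    )
    # Between-group term: the index applied to the smoothed distribution
    smoothed = [mean(g) for g in groups for _ in g]
    return within, generalized_entropy(smoothed, alpha)

# Hypothetical strictly positive incomes
groups = [[1.0, 3.0], [4.0, 8.0]]
pooled = [a for g in groups for a in g]
for alpha in (-1, 0, 0.5, 1, 2, 3):
    W, B = ge_breakdown(groups, alpha)
    total = generalized_entropy(pooled, alpha)
    print(alpha, isclose(total, W + B, rel_tol=1e-12, abs_tol=1e-12))
```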
This characterization theorem shows how drastically the requirement of additive decomposability limits the permissible inequality measures. However, it should be noted that other types of breakdown are also available, and the issue can be characterized differently and less exactingly. An important example is the Gini coefficient which can be given an additive but somewhat artificial form of ‘decomposition’: G(x,y) = [W] + [B] + [R], where W is a weighted average of within‐group Ginis (with weights $w_x = (\mu_x n_x^2)/(\mu n^2)$), B is the Gini applied to the standard ‘smoothed’ distribution, and R is a non‐negative residual term devised to balance the equation. For instance, if x = (0,8) and y = (4,20), then G(x,y) = 1/2 is overall inequality, and W = 3/16 and B = 1/4 are the respective component values, so that R = 1/16 is left unaccounted for by the breakdown. The Gini measure cannot be decomposed neatly into the ‘within’ and ‘between’ group terms required by additive decomposability, which may lessen its appeal in certain applications.
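The figures in this example can be reproduced with a short computation. The sketch below assumes the mean‐difference form of the Gini coefficient (the sum of absolute differences over all ordered pairs, divided by 2n²μ); the function names are illustrative.

```python
from fractions import Fraction

def mean(v):
    return sum(v) / len(v)

def gini(v):
    """Gini coefficient from absolute differences over all ordered pairs."""
    n, mu = len(v), mean(v)
    return sum(abs(a - b) for a in v for b in v) / (2 * n * n * mu)

def gini_breakdown(groups):
    """Return (W, B, R); assumes every group has a positive mean."""
    pooled = [a for g in groups for a in g]
    n, mu = len(pooled), mean(pooled)
    # Within-group term with weights (mu_g * n_g^2) / (mu * n^2)
    within = sum(
        Fraction(len(g) ** 2, n ** 2) * (mean(g) / mu) * gini(g) for g in groups
    )
    # Between-group term: Gini of the smoothed distribution
    smoothed = [mean(g) for g in groups for _ in g]
    between = gini(smoothed)
    # Residual: whatever is needed to balance the equation
    residual = gini(pooled) - within - between
    return within, between, residual

x = [Fraction(0), Fraction(8)]
y = [Fraction(4), Fraction(20)]
W, B, R = gini_breakdown([x, y])
print(gini(x + y), W, B, R)  # G = 1/2, W = 3/16, B = 1/4, R = 1/16
```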
While the presence of R makes the Gini coefficient less suitable for decomposition analysis, the R term does have value from another perspective in giving useful information that decomposable indices must, by definition, ignore. Recall that the weights in the Gini formula depend on all incomes in the distribution. Consequently, when a subgroup's incomes are evaluated without reference to the entire distribution (as in the construction of the within‐group term), or when they are replaced with subgroup means (as in the between‐group term), some information on the rankings of individuals is lost. The residual term conveys the lost information in a natural way: R indicates the extent to which the various subgroup distributions overlap.63 In the special case where subgroup distributions are non‐overlapping, R vanishes and the two standard terms account for all of the inequality. As an example of this, note that each income in x′ = (0,4) is below each income in y′ = (8,20), and that G(x′,y′) = 1/2 is indeed the sum of W′ = 1/8 and B′ = 3/8. In general, though, subgroup distributions tend to overlap, and hence all three terms—the overlap term as well as the standard within‐group and between‐group terms—are required to reconstruct the Gini inequality value.
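The overlap reading of R can also be computed directly from pairwise comparisons across the two groups. This sketch follows the expression given in note 63, reading its pairwise term as |y_i − x_j| − (y_i − x_j) and dividing the sum by μn²; it covers the two‐group case only, and the function name is hypothetical.

```python
from fractions import Fraction

def overlap_residual(x, y):
    """Residual R for two groups; y must be the group with the higher mean."""
    pooled = list(x) + list(y)
    n = len(pooled)
    mu = Fraction(sum(pooled), n)
    # A pair contributes only when an income from y falls below an income from x
    total = sum(abs(b - a) - (b - a) for b in y for a in x)
    return Fraction(total) / (mu * n * n)

print(overlap_residual([0, 8], [4, 20]))  # overlapping groups: 1/16
print(overlap_residual([0, 4], [8, 20]))  # non-overlapping groups: 0
```

Only one cross‐group pair is out of order in the first example (the income 4 from y lies below the income 8 from x), and that single pair accounts for the whole residual.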
Blackorby, Donaldson, and Auersperg (1981) have presented another way of altering the decomposition formula, for the Atkinson family of measures. They use a different form of between‐group term based on ‘equivalent incomes’ of group distributions rather than subgroup means. In contrast to the Gini decomposition, the residual term here is negative (or non‐positive), indicating that the formula's within‐group and between‐group terms account for more inequality than is present in the original distribution. This is, thus, not an exact (residual‐free) decomposition, but Blackorby, Donaldson, and Auersperg's investigation moves the analysis of between‐group inequality more in line with the general Atkinsonian approach of using ‘equally distributed equivalent incomes’.64
When additive decomposability is imposed as a strict requirement, there is, as mentioned earlier, the class of generalized entropy measures $I_\alpha$ to choose from. While, on the one hand, this restriction would eliminate many potential inequality measures, there is still, on the other hand, quite a range of measures from which a choice can be made. This selection can be approached from several directions. The property of transfer sensitivity, for example, may be invoked, which immediately limits consideration to the range α < 2. The form of decomposition (or, more specifically, the weighting structure) also helps distinguish between measures. For instance, we have noted that the within‐group weights for the (squared) coefficient of variation (α = 2) and Theil's entropy measure (α = 1) emphasize the inequality within richer subgroups. Ruling this out would select a measure in the range α ≤ 0. Alternatively, note that Theil's two measures are the only ones with weights that sum up exactly to 1. The sum of weights for the other measures exceeds or falls short of unity by an amount proportional to the between‐group term, clouding the interpretation of the within‐group term (on this, see Shorrocks 1980). Consequently, only α = 0 or 1 would be fully endorsed by this criterion.
The ‘standardization’ analysis of Love and Wolfson (1976) suggests one more way of deciding. The traditional approach defines B through using an ‘as if’ (smoothed) distribution where within‐group inequality has been removed. An alternative is to construct the within‐group term W′ first by rescaling group distributions to remove between‐group inequality, and then define the between‐group term B′ = I − W′. Is there a generalized entropy measure which gives the same answer both ways? As noted by Shorrocks (1980, p. 629) and Anand (1983, p. 200), only the second Theil measure D among the generalized entropy measures satisfies this independence property.65 The precise form of decomposition may, therefore, exercise a powerful influence on the selection of an inequality measure from a generally plausible class.
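The two routes can be compared on a small example. In this sketch the smoothed route takes B from the distribution of group means and sets W = I − B, while the standardized route rescales each group to the overall mean, takes W′ from the result, and sets B′ = I − W′. The incomes are hypothetical and strictly positive, the function names are illustrative, and the formulas for T and D are the usual ones for Theil's two measures rather than displays taken from the text.

```python
from math import log, isclose

def mean(v):
    return sum(v) / len(v)

def theil_t(v):
    """Theil's entropy measure T."""
    mu = mean(v)
    return sum((a / mu) * log(a / mu) for a in v) / len(v)

def theil_d(v):
    """Theil's second measure D (mean logarithmic deviation)."""
    mu = mean(v)
    return sum(log(mu / a) for a in v) / len(v)

def smoothed_route(groups, index):
    """Return (W, B): B from the smoothed distribution, W = I - B."""
    pooled = [a for g in groups for a in g]
    smoothed = [mean(g) for g in groups for _ in g]
    between = index(smoothed)
    return index(pooled) - between, between

def standardized_route(groups, index):
    """Return (W', B'): W' from groups rescaled to the overall mean, B' = I - W'."""
    pooled = [a for g in groups for a in g]
    mu = mean(pooled)
    rescaled = [a * mu / mean(g) for g in groups for a in g]
    within = index(rescaled)
    return within, index(pooled) - within

# Hypothetical strictly positive incomes
groups = [[1.0, 3.0], [4.0, 8.0]]
for name, index in (("D", theil_d), ("T", theil_t)):
    _, b_smoothed = smoothed_route(groups, index)
    _, b_standardized = standardized_route(groups, index)
    print(name, b_smoothed, b_standardized)
```

On these incomes the two between‐group values coincide for D up to rounding, while for T they differ.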
(52) See, for example, the analytical explorations in Bourguignon (1979), Cowell (1980, 1988a, 1988b), Shorrocks (1980, 1984, 1988), Cowell and Kuga (1981a, 1981b), Foster (1983), Kanbur (1984), Russell (1985), as well as such empirical studies as Mookherjee and Shorrocks (1982), Anand (1983), Cowell (1984). A strong case can also be made for decomposing inequality according to income source (e.g., earned and unearned income). Shorrocks (1982) has provided a definitive study of the alternative methodologies.
(53) In his classic study of economic inequality and poverty in Malaysia, Anand (1983) presents an excellent example of the power and cogency of decomposition analysis for descriptive and prescriptive investigations. The monograph also contains a set of extremely useful Appendices (pp. 302–54) on ‘the measurement of income inequality’, from which we have freely drawn.
(54) For example, in a simple regression model if each group is taken to share the same value of the independent variable, $R^2$ measures the between‐group contribution, while $1 - R^2$ is the proportion left unexplained, which corresponds to the within‐group contribution (i.e., owing to variations in other variables with the same value of the chosen independent variable). The analogy with regression analysis is discussed further in Anand (1983, pp. 222–3).
(55) See OEI‐1973 (pp. 27–9), for discussions of the coefficient of variation, and of the standard deviation of logarithms (the square root of the variance of logarithms).
(56) Instead of using the arithmetic mean in the smoothed distribution, the geometric mean (the m‐th root of the product of m incomes) must be used to preserve the exact decomposition. Alternative notions of ‘representative income’ in the between‐group term are explored in Blackorby, Donaldson and Auersperg (1981) and Foster and Shneyerov (1996b). The latter paper characterizes a two‐parameter family of additively decomposable measures which includes the variance of logs and all generalized entropy measures.
(58) Foster and Ok (1996) show the existence of distributions x and y for which x L y, with $L_x$ being arbitrarily close to the line of complete equality and $L_y$ being arbitrarily close to the edge of complete inequality, and yet the variance of logs judges x to have greater inequality than y. They also show that the likelihood of violations of the transfer principle is much higher than previously suggested (see, for example, Creedy 1977).
(59) Interestingly, the within‐group and between‐group contribution terms are the same for $C^2$ as they are for V; while V is not mean independent, its constitutive terms are.
(60) The Theil index is derived from the Shannon entropy measure, which also has a useful decomposition as a measure of information. Khinchin's (1957) axiomatic characterization of the Shannon measure can be converted directly into a characterization of the Theil measure (on this, see Foster 1983, 1985), yielding a characterization that is a bit more transparent than Theil's (1967) own—somewhat cryptic—story.
(61) Here normalization is taken to include the requirement that the inequality measure I be zero at equality. Continuity is the usual ‘no‐jump’ assumption. Actually, Shorrocks (1984) proves a more powerful result. Let us call a measure I aggregative if it can be expressed as a function of subgroup means, population sizes, and inequality levels alone. Then I is a Lorenz‐consistent, normalized, continuous, and aggregative inequality measure if and only if it is a continuous, increasing transformation of a generalized entropy measure.
(62) The interested reader may consult the classic work of Aczel (1966) and the survey of applications in economics by Eichhorn (1978). There have been, by now, quite a few works employing this approach; see Chakravarty (1990) and the references cited there.
(63) Where y is the subgroup with the higher mean, R can be expressed as the sum of the differences $|y_i - x_j| - (y_i - x_j)$, over all i and j, divided by $\mu n^2$. A non‐zero entry in this sum corresponds to the case where an income from y (the higher mean distribution) falls below an income from x, and hence where the subgroup distributions overlap. For other interpretations of this term, see Bhattacharya and Mahalanobis (1969), Pyatt (1976), Love and Wolfson (1976), Silber (1989), Lambert and Aronson (1993), and especially Anand (1983, pp. 311–26).
(64) See also Foster and Shneyerov (1996a, 1996b) who present exact additive decompositions which base the within‐group weights and the between‐group ‘smoothed’ term on a ‘representative income function’ potentially different from the arithmetic mean.
(65) This leaves open the question raised by Anand (1983) whether there are any measures outside the generalized entropy class that yield a decomposition that is independent of the route taken (the ‘smoothed’ or ‘standardized’ approach). Foster and Shneyerov (1996a) have shown that Theil's second measure is indeed the unique ‘path‐independent’ decomposable measure in the usual mean‐based world. But when the scope of decomposition is broadened to allow arbitrary ‘representative income’ functions (in defining standardized and smoothed distributions), the possibilities expand to a single‐parameter family of measures containing the variance of logs, among others.