LORENZ procedure

Plots the Lorenz curve and calculates the Gini and asymmetry coefficients (R.W. Payne).


Options

PRINT = strings
Controls printed output (gini, lorenz, asymmetry); default gini, lore, asym

PLOT = string
Controls graphical output (curve); default curv

TITLE = string
Title for the graph; default uses the identifier of the DATA variate

NBOOT = scalar
Number of samples to make to construct the bootstrap confidence intervals; default 100

SEED = scalar
Seed for the random number generator used to construct the bootstrap samples; default 0 i.e. continue an existing sequence of random numbers or, if none, initialize the generator automatically

CIPROBABILITY = scalar
Probability for the bootstrap confidence interval; default 0.95


Parameters

DATA = variates
Specifies sets of data values

GINI = scalars
Saves the Gini coefficient for each DATA variate

ASYMMETRY = scalars
Saves the asymmetry coefficient for each DATA variate


Description

The Lorenz curve provides a graphical representation of the inequality of a sample of numbers. In economics the numbers could be the annual incomes of a group of people, or in ecology they could be population sizes of a set of species of animal or plant. The y-coordinates for the curve are formed by sorting the numbers into ascending order, calculating their cumulative totals, and then dividing these by the grand total. The x-coordinates are simply the numbers 0, 1, ... n divided by n, where n is the size of the sample, so that both axes run from 0 to 1. If the numbers are all equal, the curve will form a straight line, known as the line of equality, running from the origin to the point (1, 1). Inequalities amongst the numbers cause the curve to lie below the line of equality.
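   The construction of the curve can be illustrated with a short sketch. It is written in Python rather than GenStat, purely for illustration; the function name lorenz_points is hypothetical, and the sketch assumes that the x-coordinates are scaled by n so that the curve ends at (1, 1).

```python
def lorenz_points(values):
    """Return the (x, y) coordinates of the Lorenz curve, each scaled to [0, 1]."""
    data = sorted(values)              # ascending order
    n = len(data)
    total = sum(data)                  # grand total
    xs = [i / n for i in range(n + 1)]
    ys = [0.0]                         # the curve starts at the origin
    running = 0.0
    for value in data:
        running += value               # cumulative total
        ys.append(running / total)
    return xs, ys


xs, ys = lorenz_points([4, 1, 3, 2])
for x, y in zip(xs, ys):
    print(f"{x:.2f}  {y:.2f}")
```

   With equal values, for example lorenz_points([5, 5, 5]), every point has y equal to x, which is the line of equality.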

   The Gini coefficient is the area between the line of equality and the Lorenz curve, divided by the area under the line of equality. So, a value close to zero indicates near equality, while a value close to one indicates a high degree of inequality. The asymmetry coefficient assesses the amount of asymmetry of the Lorenz curve. The axis of symmetry for the curve is the line from (1, 0) to (0, 1). The coefficient is less than one if the point where the Lorenz curve is parallel to the line of equality lies below the axis of symmetry, and greater than one if it lies above the axis.
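   The area definition of the Gini coefficient can be checked numerically. The Python sketch below is an illustration only, not the procedure's own code; the function name gini_from_area is hypothetical. It treats the Lorenz curve as straight segments between successive points, each of width 1/n, and sums the resulting trapezia.

```python
def gini_from_area(values):
    """Gini coefficient as (area between line of equality and curve) / 0.5."""
    data = sorted(values)
    n = len(data)
    total = sum(data)
    area_under_curve = 0.0
    previous_y = 0.0
    running = 0.0
    for value in data:
        running += value
        current_y = running / total
        # Trapezium of width 1/n between successive points on the curve.
        area_under_curve += (previous_y + current_y) / (2 * n)
        previous_y = current_y
    area_under_equality = 0.5          # triangle under the line y = x
    return (area_under_equality - area_under_curve) / area_under_equality


print(gini_from_area([1, 2, 3, 4]))
print(gini_from_area([5, 5, 5, 5]))
```

   For the values 1, 2, 3, 4 the area under the curve is 0.375, giving a coefficient of 0.25; for equal values the coefficient is zero, apart from rounding error.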

   The numbers whose equality is to be studied are specified, in a variate, by the DATA parameter. Their Gini and asymmetry coefficients can be saved, in scalars, using the GINI and ASYMMETRY parameters respectively.

   Printed output is controlled by the PRINT option, with settings:

    asymmetry
prints the coefficient of asymmetry,

    gini
prints the Gini coefficient,

    lorenz
prints the coordinates of the Lorenz curve.

By default, these are all printed.

   The procedure can also print bootstrap confidence intervals for the Gini and asymmetry coefficients. The probability level for the interval is specified by the CIPROBABILITY option; the default of 0.95 gives 95% intervals. The NBOOT option specifies how many bootstrap samples to take (default 100). If you do not want the confidence intervals, you should set NBOOT=0. The SEED option specifies the seed to use in the random number generator used to construct the bootstrap samples. The default value of zero continues an existing sequence of random numbers or, if the generator has not yet been used in this run of GenStat, it initializes the generator automatically.
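   The idea behind the bootstrap intervals can be sketched as follows. This is an illustration in Python, not a description of how the BOOTSTRAP procedure works: it assumes simple percentile intervals, and the names bootstrap_interval and gini_coefficient are hypothetical. The arguments nboot, seed and ciprobability merely mirror the NBOOT, SEED and CIPROBABILITY options; unlike SEED=0 in GenStat, the seed here is always explicit.

```python
import random


def gini_coefficient(values):
    """Gini coefficient of a sample (assumes a positive mean)."""
    data = sorted(values)
    n = len(data)
    mean = sum(data) / n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(data, start=1))
    return weighted / (mean * n * n)


def bootstrap_interval(values, statistic, nboot=100, seed=1, ciprobability=0.95):
    """Percentile bootstrap interval (an assumed method, for illustration)."""
    rng = random.Random(seed)          # explicit seed gives repeatable samples
    data = list(values)
    n = len(data)
    estimates = sorted(
        statistic([rng.choice(data) for _ in range(n)])  # resample with replacement
        for _ in range(nboot)
    )
    tail = (1.0 - ciprobability) / 2.0
    lower_index = int(tail * (nboot - 1))
    upper_index = (nboot - 1) - lower_index
    return estimates[lower_index], estimates[upper_index]


incomes = [12, 15, 18, 22, 25, 31, 40, 58, 75, 120]
lower, upper = bootstrap_interval(incomes, gini_coefficient, nboot=200, seed=42)
print(lower, upper)
```

   Repeating the call with the same seed reproduces the same interval, which is the purpose of supplying a non-zero seed.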

   By default the Lorenz curve is plotted, but you can set PLOT=* to suppress the plot. The TITLE option can supply a title for the graph.

 

Options: PRINT, PLOT, TITLE, NBOOT, SEED, CIPROBABILITY.

Parameters: DATA, GINI, ASYMMETRY.


Method

The Gini coefficient is calculated using the equation

Gini = ∑_{i=1..n} { (2 × i - n - 1) × Dsort_i } / (mean(DATA) × n^2)

where n is the sample size and Dsort_i is the ith of the numbers once they have been sorted into ascending order.
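   As a worked illustration of this equation, the Python sketch below evaluates the sum directly. It is not the procedure's own code, and the function name gini is hypothetical; the index i runs from 1 to n over the sorted values.

```python
def gini(values):
    """Gini coefficient from the sum over the sorted values."""
    dsort = sorted(values)             # Dsort: ascending order
    n = len(dsort)
    mean = sum(dsort) / n
    # i runs from 1 to n, matching the index in the equation.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(dsort, start=1))
    return weighted / (mean * n ** 2)


print(gini([1, 2, 3, 4]))
```

   For the values 1, 2, 3, 4 the weights (2 × i - n - 1) are -3, -1, 1, 3, so the sum is 10; dividing by mean × n^2 = 2.5 × 16 = 40 gives 0.25.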

   The asymmetry coefficient is given by

Asymmetry = Fmu + Lmu

with Fmu and Lmu defined by

Fmu = (m + d) / n

Lmu = (CDsort_m + d × Dsort_{m+1}) / CDsort_n

where m is the index of the largest of the sorted numbers that is less than mean(DATA),

CDsort = CUMULATE(Dsort),

and

d = ( mean(DATA) - Dsort_m ) / ( Dsort_{m+1} - Dsort_m )
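   The calculation of the asymmetry coefficient can be followed step by step in the Python sketch below. It is an illustration only; the function name asymmetry is hypothetical. The sketch also assumes that the case where no interpolation is possible (for example, all values equal) should be reported as an error, which the description above does not specify.

```python
from itertools import accumulate


def asymmetry(values):
    """Asymmetry coefficient Fmu + Lmu, following the equations above."""
    dsort = sorted(values)
    n = len(dsort)
    mean = sum(dsort) / n
    cdsort = list(accumulate(dsort))   # cumulative totals of the sorted values
    # The count of values below the mean is the 1-based index m.
    m = sum(1 for x in dsort if x < mean)
    if m == 0 or m == n:
        # Assumption: treat the undefined case as an error.
        raise ValueError("cannot interpolate around the mean")
    below, above = dsort[m - 1], dsort[m]   # Dsort_m and Dsort_{m+1}
    d = (mean - below) / (above - below)
    fmu = (m + d) / n
    lmu = (cdsort[m - 1] + d * above) / cdsort[-1]
    return fmu + lmu


print(asymmetry([1, 2, 3, 4]))
```

   For the values 1, 2, 3, 4 the mean is 2.5, so m = 2 and d = 0.5; then Fmu = 2.5/4 = 0.625 and Lmu = (3 + 0.5 × 3)/10 = 0.45, giving a coefficient of 1.075 apart from rounding error.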

   The bootstrap confidence intervals are generated using the BOOTSTRAP procedure.


Action with RESTRICT

LORENZ takes account of any restrictions on the DATA variate.