DESCRIBE procedure

Saves and/or prints summary statistics for variates (R.C. Butler & D.A. Murray).


Options

PRINT = string
Controls whether or not the summaries are printed (summaries); default summ

SELECTION = strings
Selects the statistics to be produced (nval, nobs, nmv, mean, median, min, max, range, q1, q3, sd, sem, var, sevar, %cv, sum, ss, uss, skew, seskew, kurtosis, sekurtosis, all); default mean, min, max, nobs, nmv, medi, q1, q3

GROUPS = factor
Allows groups to be defined, so that summaries are produced for each group in turn


Parameters

DATA = variates
Data to summarize

SUMMARIES = variates or pointers
Saves the summaries for each DATA variate: in a variate if GROUPS is unset, or in a pointer to a set of variates (one for each group) if GROUPS is set; will be redefined if necessary


Description

DESCRIBE calculates up to 22 different summary statistics for values stored in a variate. The statistics may be saved, or printed, or both. The statistics to be calculated are indicated by the SELECTION option; the available settings are:

    nval
number of values

    nobs
number of non-missing values

    nmv
number of missing values

    mean
arithmetic mean

    median
median

    min
minimum

    max
maximum

    range
range (max-min)

    q1
lower quartile

    q3
upper quartile

    sd
standard deviation

    sem
standard error of mean

    var
variance

    sevar
standard error of variance

    %cv
coefficient of variation

    sum
total of values

    ss
corrected sum of squares

    uss
uncorrected sum of squares

    skew
skewness (see Method)

    seskew
standard error of skewness

    kurtosis
kurtosis (see Method)

    sekurtosis
standard error of kurtosis

    all
all 22 summaries

By default, the mean, minimum, maximum, number of non-missing values, number of missing values, median and both quartiles are calculated.
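Several of the settings listed above are simple functions of one another. The following is a minimal sketch in Python (not Genstat, and not the procedure's own code) of how they relate. The function name basic_summaries, the use of None for a missing value, the N-1 divisor for the variance and the definition of %cv as 100 × sd / mean are all assumptions made for illustration. The median and quartiles are left out because their exact definition is not stated here.

```python
import math


def basic_summaries(values):
    """Illustrative relationships between some DESCRIBE statistics.

    Missing values are represented by None (an assumption of this sketch).
    Requires at least two non-missing values and a non-zero mean.
    """
    present = [v for v in values if v is not None]
    nval = len(values)            # number of values, missing or not
    nobs = len(present)           # number of non-missing values
    total = sum(present)
    mean = total / nobs
    uss = sum(v * v for v in present)            # uncorrected sum of squares
    ss = sum((v - mean) ** 2 for v in present)   # corrected sum of squares
    var = ss / (nobs - 1)         # assumed divisor: N - 1
    sd = math.sqrt(var)
    return {
        "NVAL": nval,
        "NOBS": nobs,
        "NMV": nval - nobs,
        "MEAN": mean,
        "MIN": min(present),
        "MAX": max(present),
        "RANGE": max(present) - min(present),
        "SD": sd,
        "SEM": sd / math.sqrt(nobs),
        "VAR": var,
        "%CV": 100 * sd / mean,   # assumed definition
        "SUM": total,
        "SS": ss,
        "USS": uss,
    }


stats = basic_summaries([2, 4, None, 6, 8])
# The corrected and uncorrected sums of squares differ by NOBS * MEAN**2.
print(stats["SS"], stats["USS"] - stats["NOBS"] * stats["MEAN"] ** 2)
```

The keys are written in capital letters to mirror the unit labels used by the SUMMARIES parameter.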

   Printing is controlled by the PRINT option. The statistics are printed by default; to suppress printing, set PRINT=*.

   The GROUPS option allows groups of observations to be defined. Summaries are then given for each group.

   The SUMMARIES parameter allows the statistics to be saved in a variate, or in a pointer to a set of variates (one for each group) if GROUPS is set. These need not be declared in advance. The units of each variate are labelled with the corresponding settings of the SELECTION option, in capital letters, so that any individual statistic can be accessed by name. For example, the minimum value can be copied from a SUMMARIES variate v into a scalar m by

CALCULATE m = v$['MIN']
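The saved structure (one labelled set of statistics per group) can be pictured with a small Python analogy. This is only a sketch of the idea, not Genstat code. The names summarize and summarize_by_group, the example data, and the use of nested dictionaries in place of a pointer to variates are all hypothetical.

```python
def summarize(values):
    """Return a few statistics, labelled in capitals like the SUMMARIES units."""
    present = [v for v in values if v is not None]  # None marks a missing value
    return {
        "NOBS": len(present),
        "MEAN": sum(present) / len(present),
        "MIN": min(present),
        "MAX": max(present),
    }


def summarize_by_group(data, groups):
    """Split data by the parallel list of group labels and summarize each group."""
    by_group = {}
    for value, level in zip(data, groups):
        by_group.setdefault(level, []).append(value)
    return {level: summarize(vals) for level, vals in by_group.items()}


# Hypothetical data: six yields, three from each of two treatments.
yields = [12.0, 15.0, 11.0, 20.0, 18.0, 25.0]
treatment = ["control", "control", "control", "treated", "treated", "treated"]

result = summarize_by_group(yields, treatment)
m = result["treated"]["MIN"]  # access one statistic by its label
print(m)
```

Looking up a statistic by label, rather than by position, means the access does not depend on which other statistics were selected.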


Options: PRINT, SELECTION, GROUPS. Parameters: DATA, SUMMARIES.


Method

All the statistics are calculated in a single variate. This variate is then restricted, both to print only the statistics that were requested and to obtain the unit numbers of those to be copied into the SUMMARIES variate.

SE Variance is calculated as

$\sqrt{\left( \frac{N}{N-1} (M_4 - 4 M_1 M_3 + 6 M_1^2 M_2 - 3 M_1^4) - \left( \frac{N}{N-1} (M_2 - M_1^2) \right)^2 \right) / N}$

Skewness is calculated as $(M_3 - 3 M_1 M_2 + 2 M_1^3) / (M_2 - M_1^2)^{3/2}$

SE Skewness is calculated as $\sqrt{6N(N-1) / \{(N-2)(N+1)(N+3)\}}$

Kurtosis is calculated as $(M_4 - 4 M_1 M_3 + 6 M_1^2 M_2 - 3 M_1^4) / (M_2 - M_1^2)^2 - 3$

SE Kurtosis is calculated as $\sqrt{24N(N-1)^2 / \{(N-2)(N-3)(N+5)(N+3)\}}$

where $M_i = \sum x^i / N$ is the ith raw moment of the data values x

and N = NOBSERVATIONS(DATA)
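The formulas above can be checked numerically. The following Python sketch (not Genstat, and not the procedure's source) evaluates them from the raw moments. The function names raw_moment and moment_statistics, and the use of None for a missing value, are assumptions made for illustration. The sketch needs at least four non-missing values that are not all equal.

```python
import math


def raw_moment(x, i):
    """M_i: the mean of the i-th powers of the values."""
    return sum(v ** i for v in x) / len(x)


def moment_statistics(values):
    """Evaluate the Method formulas from raw moments M1..M4."""
    x = [v for v in values if v is not None]  # N counts non-missing values only
    n = len(x)
    m1, m2, m3, m4 = (raw_moment(x, i) for i in (1, 2, 3, 4))

    # Central moments written in terms of the raw moments.
    central2 = m2 - m1 * m1
    central3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    central4 = m4 - 4 * m1 * m3 + 6 * m1 * m1 * m2 - 3 * m1 ** 4

    scale = n / (n - 1)  # the N/(N-1) factor in the SE Variance formula
    return {
        "SEVAR": math.sqrt((scale * central4 - (scale * central2) ** 2) / n),
        "SKEW": central3 / central2 ** 1.5,
        "SESKEW": math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3))),
        "KURTOSIS": central4 / central2 ** 2 - 3,
        "SEKURTOSIS": math.sqrt(
            24 * n * (n - 1) ** 2 / ((n - 2) * (n - 3) * (n + 5) * (n + 3))
        ),
    }


result = moment_statistics([1, 2, 3, 4, 5])
for label, value in result.items():
    print(label, round(value, 4))
```

For the symmetric data 1, 2, 3, 4, 5 the skewness is zero and the kurtosis is negative. The two standard errors for skewness and kurtosis depend only on N, not on the data values.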


Action with RESTRICT

The statistics are calculated for the restricted set of units from each DATA variate. Any existing restrictions are not affected by the procedure.