Example 9: Order Statistics
by Zavan Karian
> restart: with(plots, display): libname:="C:/mylib/statistics",libname: with(stat):
A study of the p.d.f.s of order statistics
First, take example 10.1-3
> f := x -> 1;
> F := x -> int(f(t), t=0..x);
> g := (n, r, y) -> (n!/((r-1)!*(n-r)!)) * F(y)^(r-1)*(1-F(y))^(n-r)*f(y);
> G:= (n,r,y) -> sum(n!/(k!*(n-k)!)*F(y)^k*(1-F(y))^(n-k), k=r..n);
> pdf := g(4,3,y);
> int(pdf, y=1/3..2/3);
> G(4,3,2/3)-G(4,3,1/3);
Note that this "hides the expressions for g and G.
We get around this by the following.
Can this be done if we assume that the underlying distribution
is more complicated? Assume we have an exponential.
> assume(theta > 0);
> f := x-> ExponentialPDF(theta, x);
> F := x-> int(f(t), t=0..x);
> g := (n, r, y) -> (n!/((r-1)!*(n-r)!)) * F(y)^(r-1)*(1-F(y))^(n-r)*f(y);
> G:= (n,r,y) -> sum(n!/(k!*(n-k)!)*F(y)^k*(1-F(y))^(n-k), k=r..n);
We can now compute probabilities. For example, if n=4,
what is the probability that the third order statistic is
between 1/2 and 1.
> P := simplify(G(4,3,1)-G(4,3,1/2));
If we assume a specific theta such as theta=1, we get
a numeric answer.
> evalf(subs(theta=1,P));
We can now study the effect of theta on an order
statistic for specific values of n and r. Let's
take n=4, r=3.
> pdf := simplify(g(4,3,y));
> N := 5;
> for i from 1 to N do
> gg := simplify(subs(theta=i, g(4,3,y)));
> P||i := plot(gg, y=0..10, color=blue):
> od:
> display([seq(P||i, i=1..5)], insequence=true);
For a fixed theta, say theta=1, we can also consider
the shapes of the order statistics for r=1, 2, 3, ...
Taking theta =1 and n=4, we look at r=1,2,3, and 4.
> for i from 1 to 4 do
> gg := simplify(subs(theta=1, g(4,i,y)));
> P||i := plot(gg, y=0..4, color=blue):
> od:
> display({P1,P2,P3,P4});
Unfortunately, the expressions for g and G in both examples
"hide" their actual forms. The following will produce more
explicit representations of g and G, if that is desired.
> assume( theta > 0);
> f := ExponentialPDF(theta, y);
> int(f, y=0..t);
> F := subs(t=y,%);
>
>
>
> g := (n!/((r-1)!*(n-r)!)) * F^(r-1)*(1-F)^(n-r)*f;
> G := sum(n!/(k!*(n-k)!)*F^k*(1-F)^(n-k), k=r..n);
For contrasrt, we now consider order statistics from a
symmetric, two-parameter distribution: N(mu,var).
> assume(sigma > 0);
> f := NormalPDF(mu, sigma^2, y);
> int(f, y=-infinity..t);
> F := subs(t=y,%);
>
>
> g := (n!/((r-1)!*(n-r)!)) * F^(r-1)*(1-F)^(n-r)*f;
> G := simplify(sum(n!/(k!*(n-k)!)*F^k*(1-F)^(n-k), k=r..n));
As before, we can calculate probabilities in
specific cases.
> GG := simplify(subs({n=4, r=3, mu=0, sigma=1}, G));
> evalf(subs(y=1/2, GG)- subs(y=-1/2, GG));
And we can obtain expressions and graphs for
specific p.d.f.s of order statistics or through
animation observe the effect of the symmetry
of the underlying distribution by comparing
g_i with g_(n-i+1).
> gg := simplify(subs({n=4, r=3, mu=0, sigma=1}, g));
> F;
> plot(gg, y=-3.5..3.5);
> N := 10;
> for i from 1 to N do
> gg := simplify(subs({n=N, r=i, mu=0, sigma=1},g));
> P||i := plot(gg, y=-3.5..3.5, color=blue):
> od:
> display([seq(P||i, i=1..N)], insequence=true);
> N:=20;
> for i from 1 to N/2 do
> gg := simplify(subs({n=N, r=i, mu=0, sigma=1},g));
>
ggg := simplify(subs({n=N, r=N-i+1, mu=0, sigma=1},g));
P||i := plot({gg,ggg}, y=-3.5..3.5, color=blue):
> od:
> display([seq(P||i, i=1..N/2)], insequence=true);