eliminate negative value from normal random number distribution

  • Thread starter Thread starter Ahmad
  • Start date Start date
A

Ahmad

hi
I try to generate normal distributed random number, my problem how i
can eliminate negative value, so i used this
function to generate random numbers
=(NORMSINV(RAND())*4)+6
where 4 is stdev, and 6 is average

thanks
 
Ahmad said:
I try to generate normal distributed random number,
my problem how i can eliminate negative value, so i
used this function to generate random numbers
=(NORMSINV(RAND())*4)+6
where 4 is stdev, and 6 is average

If the mean is 6 and the std dev is 4, the normal distribution curve will
become negative to the left of -1.5sd, i.e. when RAND() is less than
NORMSDIST(-1.5) -- about 6.68%.

If your intent is to clip the normal distribution curve at zero on the left,
you can use:

=MAX(0,NORMINV(RAND(),6,4))

Note that NORMINV(RAND(),6,4) is the same as NORMSINV(RAND())*4+6.

If your intent is to shift the normal distribution curve to the right so
that the left tail is non-negative, the mean will no longer be 6.

Moreover, theoretically it cannot be done because the tails are infinitely
asymptotic.

However, in practice, it can be done either by determining the negative-most
return value from NORMSINV (-30 in XL2003), or by arbitrarily assigning zero
to a "large" negative z-score, e.g. -8sd, and clipping anything to the left
of that.

It is risky to rely on the negative-most return value from NORMSINV. I
presume it is not documented; ergo, it might change from release-to-release.
 
PS....
If the mean is 6 and the std dev is 4, the normal distribution curve will
become negative to the left of -1.5sd, i.e. when RAND() is less than
NORMSDIST(-1.5) -- about 6.68%.

If your intent is to clip the normal distribution curve at zero on the
left, you can use:

=MAX(0,NORMINV(RAND(),6,4))

If your intent is to clip the normal distribution curve, but you want all
random values to land in the unclipped portion of the curve, then ostensibly
you want:

=MAX(0,NORMINV(NORMSDIST(-1.5)+(100%-NORMSDIST(-1.5))*RAND(),6,4))

The MAX(0,...) should be superfluous. I added it to accommodate any
floating-point anomalies that might result in less than zero. In XL2003,
NORMINV(NORMSDIST(-1.5),6,4) results in about -1.77636E-15, even though it
should be exactly zero.

Theoretically, NORMINV might return a #NUM error if the first parameter
(probability) exceeds some internal limit. In XL2003, that is only for a
probability of exactly 100% [*]. In the expression above, that would mean
that RAND() is exactly 1, which should not happen [**].

-----
[*] I tried NORMINV(1-2^-53,6,4) in XL2003, and that works. You might try
that in your version of Excel. Note that 1-2^-53 is the largest decimal
fraction less than 1 that can be represented in Excel.

[**] Theoretically, floating-point anomalies might cause the arithmetic to
result in exactly 1 even if RAND() is not. I don't know if that might be a
problem in actual practice. The largest possible result of RAND() in XL2003
is "very far" from 1. YMMV in later Excel versions, especially XL2010,
which redesigned RAND(). Nonetheless, I don't think it is worth it to
bullet-proof the formula on the high end.
 
Clarification....
If your intent is to clip the normal distribution curve,
but you want all random values to land in the unclipped
portion of the curve, then ostensibly you want:
=MAX(0,NORMINV(NORMSDIST(-1.5)+(100%-NORMSDIST(-1.5))*RAND(),6,4)) [....]
[**] Theoretically, floating-point anomalies might cause
the arithmetic to result in exactly 1 even if RAND() is not. [....]
[So,] I don't think it is worth it to bullet-proof the
formula on the high end.

To be consistent, I guess it is also not worth it to bullet-proof on the low
end, as I did. The following is probably sufficient:

=NORMINV(NORMSDIST(-1.5)+(1-NORMSDIST(-1.5))*RAND(),6,4)

Explanation....

The fact that NORMINV(NORMSDIST(-1.5),6,4) results in about -1.77636E-15 is
a floating-point anomaly. In fact, it might be different in XL2010, perhaps
even XL2007. Note that NORMINV(NORMSDIST(-1.5)+2^-56,6,4) returns a
positive result. 2^-56 is the smallest possible increment for the result of
NORMSDIST(-1.5).

Theoretically, the RAND expression would result in NORMSDIST(-1.5) only when
RAND() is exactly zero [*]. In XL2003 and XL2007, the RAND algorithm never
returns exactly zero. I don't know about XL2010. In any case, the
probability is so very small that it probably is not worth the trouble.
 
PS....
If the mean is 6 and the std dev is 4, the normal distribution
curve will become negative to the left of -1.5sd, i.e. when
RAND() is less than NORMSDIST(-1.5) -- about 6.68%.

I should have noted that -1.5sd depends on the mean and std dev. More
generally, the normal distribution becomes negative to the left of -mean/sd;
e.g. -6/4.
 
Back
Top