Is there something wrong with the random generator in Excel?

  • Thread starter Thread starter JasonK
  • Start date Start date
J

JasonK

TIA --

I'm running 2003.

I've created a spreadsheet that plays craps. I've written a lengthy
macro in old basic language that runs fine. It rolls 2 dice, sums the
total, places wagers in columns and totals winners and losers and
keeps a running total of a bankroll. It's designed to play a specific
way that I saw some one play who happened to be winning a lot of
money.

When I call the random generator, "Randomize" on a certain line, the
player wins consistently. When I move the Randomize line down a few
lines of code, the house wins consistently.

Is there something inherently wrong with the Random generator in
Excel? Is there a better way to call a Random number that
"Randomize"? I used to use "Randomize Timer" with the old basic
language. That doesn't seem to make a difference here.

Thanks for your help,

JasonK
 
Do not place "randomize" inside a loop.
Use it just before the loop starts.
--
Jim Cone
Portland, Oregon USA



"JasonK" <[email protected]>
wrote in message
TIA --

I'm running 2003.

I've created a spreadsheet that plays craps. I've written a lengthy
macro in old basic language that runs fine. It rolls 2 dice, sums the
total, places wagers in columns and totals winners and losers and
keeps a running total of a bankroll. It's designed to play a specific
way that I saw some one play who happened to be winning a lot of
money.

When I call the random generator, "Randomize" on a certain line, the
player wins consistently. When I move the Randomize line down a few
lines of code, the house wins consistently.

Is there something inherently wrong with the Random generator in
Excel? Is there a better way to call a Random number that
"Randomize"? I used to use "Randomize Timer" with the old basic
language. That doesn't seem to make a difference here.

Thanks for your help,

JasonK
 
Jim,

Thanks for the response.

I've placed randomize where the computer rolls the two virtual dice.
the rest of the program is based on what number rolls.

It's not actually inside a loop. Either before the first die is
rolled, or before the second. That's the only difference.

Any other ideas?

JasonK
 
Something to chew on (similar issue) ...
http://groups.google.com/group/micr...528540/fcafbf32d068901f?lnk=gst&q=rnd()+2007#
--
Jim Cone
Portland, Oregon USA



"JasonK" <[email protected]>
wrote in message Jim,

Thanks for the response.

I've placed randomize where the computer rolls the two virtual dice.
the rest of the program is based on what number rolls.

It's not actually inside a loop. Either before the first die is
rolled, or before the second. That's the only difference.

Any other ideas?

JasonK
 
JasonK said:
When I call the random generator, "Randomize" on
a certain line, the player wins consistently. When
I move the Randomize line down a few lines of code,
the house wins consistently.

That sounds like a coincidence or a defect in your code. To address, it
would be helpful to see the entire macro (and called procedures).

But note that Randomize should be called only one time. Arguably, it should
not hurt to call it multiple times. But its purpose is to "seed"
(initialize) the pseudorandom number generator. "Seeding" is typically done
only one time.

There are many ways to do this. If your macro is called only one time, the
simplest way is to put the Randomize statement outside any loops, typically
at the beginning of the macro. If your macro is called successively, you
could use a static variable to ensure that Randomize is called only the
first time, e.g.:

Static notFirst As Integer
If notFirst = 0 Then Randomize: notFirst = 1

Note: "first" will be initially zero each time VBA is reset. If that
bothers you, you can put the Randomize in another macro, which is executed
only one time.


----- original message -----
 
Note that if you call Randomize with the same number each time, you will get
the same stream of semi-random numbers. The default call to Randomize
(without any number) uses Timer, so that helps make things truly random. If
you are using something like "Randomize 1234568", then you will get the same
"random" numbers each time you run your code.

You should call Randomize only once and then use the Rnd function to get
successive random numbers for your game.

HTH,

Eric
 
But note that Randomize should be called only one time. Arguably, it
should not hurt to call it multiple times.

The above statement from your post is not really true... it will hurt to
call Randomize multiple times. Here is a Excel UserForm translation done by
Peter T for a compiled VB program I've posted in the past that shows the
problem graphically. To use it, simply add a UserForm to the VB project and
copy/paste all of the code following my signature into the UserForm's code
window, then run the project and then click the UserForm. The two loops are
identical except that the first loop (corresponding to the left-hand
display) runs the Randomize statement one time before executing the loop
whereas the second loop (corresponding to the right-hand display) executes
the Randomize statement with each iteration of the loop. All the loops are
doing is picking a random color value and displaying a pixel of that color
(arranged to fill out a rectangular area). Notice how random the color
distribution is in the left-hand display and how regular looking "patterns"
seem to be displayed in the right-hand display. Randomize should only be
executed once per running of any code that uses the Rnd function call.

--
Rick (MVP - Excel)


Private Declare Function FindWindow Lib "user32" _
Alias "FindWindowA" _
(ByVal lpClassName As String, _
ByVal lpWindowName As String) As Long

Private Declare Function SetPixel Lib "gdi32" _
(ByVal hDc As Long, _
ByVal x As Long, _
ByVal y As Long, _
ByVal crColor As Long) As Long

Private Declare Function GetPixel Lib "gdi32" _
(ByVal hDc As Long, _
ByVal x As Long, _
ByVal y As Long) As Long

Private Declare Function GetDC Lib "user32" ( _
ByVal hWnd As Long) As Long

Private Declare Function ReleaseDC Lib "user32" ( _
ByVal hWnd As Long, ByVal hDc As Long) As Long

Private Sub PaintPixels()
Dim hWnd As Long, hDc As Long
Dim tp As Long, lt As Long
Dim x As Long, y As Long
Dim colr As Long

Const Z As Long = 128 * 2 - 1
Const C As Long = 16711680

Me.Left = 10: Me.Top = 10
Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60
' if form is too small, change 0.75 to 1 or 1.25

hWnd = FindWindow("ThunderDFrame", Me.Caption)
hDc = GetDC(hWnd)

tp = Me.Top + 15
lt = Me.Left + 10

Randomize
For y = tp To tp + Z
For x = lt To lt + Z
colr = Rnd * C
SetPixel hDc, x, y, colr
Next
Next

lt = lt + Z + 15
For y = tp To tp + Z
For x = lt To lt + Z
Randomize
colr = Rnd * C
SetPixel hDc, x, y, colr
Next
Next

ReleaseDC hWnd, hDc
End Sub

Private Sub UserForm_Activate()
Me.Caption = "Click me to (re-) PaintPixels"
PaintPixels
End Sub

Private Sub UserForm_Click()
' Me.Repaint
PaintPixels
End Sub
 
Rick Rothstein said:
The above statement from your post is not really true...
it will hurt to call Randomize multiple times.

First, I want to reiterate and emphasize that I agree wholeheartedly that
"Randomize should be called only one time". That's why I said that.

But let me explain what I meant by "[a]rguably, it should not hurt to call
it multiple times".

First, note that the operative word is "arguably". By that, I mean it is
indeed dubious to do so.

Second, I would never recommend calling Randomize in a tight loop; I would,
indeed, expect that to be "hurtful". Since it is common to seed the PRNG
using the time of day or system uptime, and we do not know (and should not
consider) the internal algorithm for deriving the seed from time, it would
not be unreasonable to assume that consecutive Randomize seeds in a tight
loop might be very close.

However, I think it is less risky (albeit still not recommended) to call
Randomize each time a function or macro is executed. For example, see
BerndP's http://sulprobil.com/html/randint.html, which directs the user to
uncomment the Randomize statement "if you like".

I cannot address the observations with your algorithm.

But the following macro demonstrates that even when Randomize is called in a
tight loop (surprise!), it does not seem to unduly hurt the "randomness" of
the results, measured in simple terms.

Of course, the true measure of "randomness" requires rigorous and complex
tests.

The measurements I used are more simplistic: distribution among the buckets
defined by the tenths digit (i.e., 0.0x, 0.1x, 0.2, etc); the maximum run
length, counting instances in the same bucket +/- 1 to be part of the same
run; and average and standard deviation of the difference between
consecutive random numbers. The latter two statistics attempt to measure
locality.

The macro should be run with the #Const doOnce first set True, then set
False. Then compare the results in A1:B10 and A11:B21. A1 and A12 are the
std dev of the bucket distribution in B1:B10 and B11:B21 respectively. A2
and A13 are the max run legnth. A3:A4 and A14:A15 are the average and std
dev of the difference between consecutive random numbers.

Of course, the results will vary somewhat from run to run. YMMV.

But in the experiments that I tried, A3:A4 and A14:A15 consistently very
similar; for example 3.32E-1 and 2.36E-1 when doOnce is True v. 3.29E-1 and
2.35E-1 when doOnce is False.

Likewise, the A2 and A13 were often close and consistently very small; for
example 8 when doOnce is True v. 5 when doOnce is False.

Sometimes, A1 and A12 were similar; for example, about 80 and 98.

But frequently, A12 was significantly higher than A1, 2 to 3 times higher.
I don't know what to make of that statistic. I did not find any consistent
difference in the bucket distributions, e.g. some buckets consistently
significantly more or less when doOnce is False then when doOnce is True.

Perhaps it suggests some localized ping-ponging between one set of buckets
and another set. That could be "hurtful". But I reiterate: this macro
calls Randomize in a tight loop (when doOnce is False), which I was
surprised performed as well as it did.

Based on the surprisingly good results from a tight loop, I feel more
confident in saying that calling Randomize once every time a function or
macro is executed (assuming the call itself is not in a tight loop), which
is more like what I had in mind, would probably not be hurtful.

Nonethess, I still would not recommend it. I would call Randomize in a
Workbook_Open macro, or I would use a "first-time" static or global variable
so that Randomize is called only once ostensibly.

(As I wrote previously, the first-time variable approach will call Randomize
once each time VBA is reset.)


-----
Macro....


Option Explicit

#Const doOnce = True

Sub doit()
Dim f(0 To 9, 0) As Double
Dim n As Double, r As Double, rprev As Double
Dim nrun As Double, nrunMax as Double
Dim i As Integer, iprev As Integer, d As Integer
Dim x As Double, sumX As Double, sumX2 As Double, a As Double, s As Double

#If doOnce Then
Randomize
#End If

nrun = 1: nrunMax = 1
sumX = 0: sumX2 = 0

For n = 1 To 100000
#If Not doOnce Then
Randomize
#End If
r = Rnd
i = Int(r * 10) 'bucket number (tenths digit)
f(i, 0) = f(i, 0) + 1
If n > 1 Then
x = Abs(r - rprev) 'diff btwn consecutive Rnd's
sumX = sumX + x
sumX2 = sumX2 + x * x
If (i = iprev) Or (i = iprev - 1) Or (i = iprev + 1) Then nrun = nrun
+ 1 _
Else nrunMax = IIf(nrun > nrunMax, nrun, nrunMax): nrun = 1
End If
iprev = i
rprev = r
Next

If nrun > nrunMax Then nrunMax = nrun
a = sumX / (n - 2) 'avg
s = Sqr(sumX2 / (n - 2) - a * a) 'std dev
#If doOnce Then
d = 0 'a1:a4,b1:b10
#Else
d = 11 'a12:a15,b12:b21
#End If
Range("b1:b10").Offset(d, 0).NumberFormat = "0"
Range("b1:b10").Offset(d, 0) = f
Range("a1").Offset(d, 0).NumberFormat = "0.00"
Range("a1").Offset(d, 0).Formula = "=stdevp(offset(b1:b10," & d & ",0))"
Range("a2").Offset(d, 0).NumberFormat = "0"
Range("a2").Offset(d, 0) = nrunMax
Range("a3").Offset(d, 0).NumberFormat = "0.00E+00"
Range("a3").Offset(d, 0) = a
Range("a4").Offset(d, 0).NumberFormat = "0.00E+00"
Range("a4").Offset(d, 0) = s
End Sub



----- original message -----
 
EricG said:
The default call to Randomize (without any number) uses Timer

For Excel 2003 with VBA 6.5.1024, the VB Help page for Randomize says that
the "system timer" is used if "number" is omitted.

I am pretty sure that does not mean Timer per se, although that might vary
from system to system. That is, "Randomize" is not the same as "Randomize
Timer".

Try the following macro. A2 is Timer at the start, A3 is the first Rnd
after Randomize without "number", A4 is the first Rnd after Randomize using
the starting Timer for "number", and A5 is Timer at the end. B4 is TRUE if
A3 and A4 are identical. B5 is TRUE if A2 and A5 are identical. A1 is the
time of day.

On my system (Win XP SP3), A3 and A4 are different and B4 is FALSE, whereas
A2 and A5 are the same and B5 is TRUE.


-----
Macro.....

Option Explicit

Sub testRandomize()
Dim x1 As Double, x2 As Double, r1 As Double, r2 As Double
x1 = Timer
Randomize
r1 = Rnd
Randomize x1
r2 = Rnd
x2 = Timer
Range("a1").NumberFormat = "dd mmm yyyy hh:mm:ss.000"
Range("a1") = Evaluate("now()") 'more accurate than VB Now
Range("a2:a5").NumberFormat = "0.000000000000000E+00"
Range("a2") = x1
Range("a3") = r1
Range("a4") = r2
Range("a5") = x2
Range("b4:b5").NumberFormat = "general"
Range("b4") = (r1 = r2)
Range("b5") = (x1 = x2)
Range("a1:b5").Columns.AutoFit
End Sub


----- original message -----
 
Errata....
"Randomize" is not the same as "Randomize Timer".

I guess I cannot say that impunity.

It appears that successive calls to Randomize with the same "number"
argument result in different first Rnd values(!).

Let me be more clear. When "Randomize 123" is put into the ThisWorkbook
Workbook_Open macro, the first Rnd value is indeed the same each time the
workbook is opened.

But subsequent calls to "Randomize 123" result in different first Rnd
values, whether those calls are in Workbook_Open or in a public macro like
testRandomize2 below.

Am I doing something wrong?(!)

-----
ThisWorkbook macro....

Option Explicit

Private Sub Workbook_Open()
Dim r1 As Double, r2 As Double
Randomize 123
r1 = Rnd
Randomize 123
r2 = Rnd
With Worksheets("sheet3")
.Range("a1").NumberFormat = "dd mmm yyyy hh:mm:ss.000"
.Range("a1") = Evaluate("now()") 'more accurate than VB Now
.Range("a2:a3").NumberFormat = "0.000000000000000E+00"
.Range("a2") = r1
.Range("a3") = r2
.Range("b3").NumberFormat = "general"
.Range("b3") = (r1 = r2)
.Range("a1:b3").Columns.AutoFit
End With
End Sub

-----
Module1 macro....

Option Explicit

Sub testRandomize2()
Dim r1 As Double, r2 As Double
Randomize 123
r1 = Rnd
Randomize 123
r2 = Rnd
With Selection
.NumberFormat = "dd mmm yyyy hh:mm:ss.000"
Selection = Evaluate("now()") 'more accurate than VB Now
.Offset(1, 0).NumberFormat = "0.000000000000000E+00"
.Offset(1, 0) = r1
.Offset(2, 0).NumberFormat = "0.000000000000000E+00"
.Offset(2, 0) = r2
.Offset(2, 1).NumberFormat = "general"
.Offset(2, 1) = (r1 = r2)
End With
Selection.Resize(3, 1).Columns.AutoFit
End Sub


----- original message -----
 
Errata....
"Randomize" is not the same as "Randomize Timer".

I guess I cannot say that impunity.

It appears that successive calls to Randomize with the same "number"
argument result in different first Rnd values(!).

Let me be more clear. When "Randomize 123" is put into the ThisWorkbook
Workbook_Open macro, the first Rnd value is indeed the same each time the
workbook is opened.

But subsequent calls to "Randomize 123" result in different first Rnd
values, whether those calls are in Workbook_Open or in a public macro like
testRandomize2 below.

Am I doing something wrong?(!)

-----
ThisWorkbook macro....

Option Explicit

Private Sub Workbook_Open()
Dim r1 As Double, r2 As Double
Randomize 123
r1 = Rnd
Randomize 123
r2 = Rnd
With Worksheets("sheet3")
.Range("a1").NumberFormat = "dd mmm yyyy hh:mm:ss.000"
.Range("a1") = Evaluate("now()") 'more accurate than VB Now
.Range("a2:a3").NumberFormat = "0.000000000000000E+00"
.Range("a2") = r1
.Range("a3") = r2
.Range("b3").NumberFormat = "general"
.Range("b3") = (r1 = r2)
.Range("a1:b3").Columns.AutoFit
End With
End Sub

-----
Module1 macro....

Option Explicit

Sub testRandomize2()
Dim r1 As Double, r2 As Double
Randomize 123
r1 = Rnd
Randomize 123
r2 = Rnd
With Selection
.NumberFormat = "dd mmm yyyy hh:mm:ss.000"
Selection = Evaluate("now()") 'more accurate than VB Now
.Offset(1, 0).NumberFormat = "0.000000000000000E+00"
.Offset(1, 0) = r1
.Offset(2, 0).NumberFormat = "0.000000000000000E+00"
.Offset(2, 0) = r2
.Offset(2, 1).NumberFormat = "general"
.Offset(2, 1) = (r1 = r2)
End With
Selection.Resize(3, 1).Columns.AutoFit
End Sub


----- original message -----
 
Joe said:
First, I want to reiterate and emphasize that I agree wholeheartedly
that "Randomize should be called only one time". That's why I said that.

It is much worse than that. Calling Randomize once to get a unique
initial start point is fine, but using it repeatedly will get you into
trouble. You won't really notice unless you are doing serious
statistical tests on a decent sized dataset. eg

http://www.amstat.org/sections/srms/proceedings/y2004/files/Jsm2004-000710.pdf

Historically Excels random number generator was very ropey. See
http://support.microsoft.com/kb/828795
And more amusing still is
http://support.microsoft.com/kb/834520/

These days it is better, but I would not want to rely on it for any
Monte-Carlo simulations where I cared about the outcome.
But let me explain what I meant by "[a]rguably, it should not hurt to
call it multiple times".

First, note that the operative word is "arguably". By that, I mean it
is indeed dubious to do so.

Very dubious. The best you can hope for from a pseudorandom number
generator is that it generates plausible random numbers with a suitably
long period and uniform distribution.

A cautionary tale is that random algorithms seldom produce good random
number sequences. The OP might find the chapter on pseudo random number
generation chapter 7 in Numerical Recipes helpful.

http://www.nrbook.com/a/bookcpdf.php

Regards,
Martin Brown
 
Martin Brown said:
Calling Randomize once to get a unique initial start point
is fine, but using it repeatedly will get you into trouble.

A fact that I stumbled upon in my response to EricG. Surely what I step in
is a defect. Right? (See my "errata" response to EricG and respond there.)

The best you can hope for from a pseudorandom number generator is that it
generates plausible random numbers
with a suitably long period and uniform distribution.

I am well aware of the eccentricities of PRNGs, having studied and written
them myself over the years. But Randomize should work as documented. And
there should be no harm in reseeding a PRNG repeatedly, if done correctly,
other than the fact that it defeats the purpose of a PRNG.

By "done correctly", I mean: taking into account the algorithm for default
seeding. If the default seed uses system time measured to the second,
obviously reseeding many times within the same second will result in the
same sequence of random numbers (read: "harmful"). But if the default seed
uses the CPU instruction counter, for example, then reseeding repeatedly
should result in new sequences of random numbers. The only problem is: we
can no longer make assumptions about the period of each sequence, ergo the
uniqueness and locality of each random number. To that extent, repeated
reseeding is risky. But in simple uses like a craps game, I doubt that the
player would notice the difference. Of course, repeated reseeding is
foolish for demanding uses of random numbers like Monte Carlo simulation.

Let's not take my comments out of context. I was referring to the JasonK's
claims, not general usage. I have repeatedly reaffirmed that repeated
reseeding is not a good idea. I would consider it a poor programming
practice.


----- original message -----

Martin Brown said:
Joe said:
First, I want to reiterate and emphasize that I agree wholeheartedly that
"Randomize should be called only one time". That's why I said that.

It is much worse than that. Calling Randomize once to get a unique initial
start point is fine, but using it repeatedly will get you into trouble.
You won't really notice unless you are doing serious statistical tests on
a decent sized dataset. eg

http://www.amstat.org/sections/srms/proceedings/y2004/files/Jsm2004-000710.pdf

Historically Excels random number generator was very ropey. See
http://support.microsoft.com/kb/828795
And more amusing still is
http://support.microsoft.com/kb/834520/

These days it is better, but I would not want to rely on it for any
Monte-Carlo simulations where I cared about the outcome.
But let me explain what I meant by "[a]rguably, it should not hurt to
call it multiple times".

First, note that the operative word is "arguably". By that, I mean it is
indeed dubious to do so.

Very dubious. The best you can hope for from a pseudorandom number
generator is that it generates plausible random numbers with a suitably
long period and uniform distribution.

A cautionary tale is that random algorithms seldom produce good random
number sequences. The OP might find the chapter on pseudo random number
generation chapter 7 in Numerical Recipes helpful.

http://www.nrbook.com/a/bookcpdf.php

Regards,
Martin Brown
 
Let me be more clear. When "Randomize 123" is put into the ThisWorkbook
Workbook_Open macro, the first Rnd value is indeed the same each time the
workbook is opened.

But subsequent calls to "Randomize 123" result in different first Rnd
values, whether those calls are in Workbook_Open or in a public macro like
testRandomize2 below.

Am I doing something wrong?(!)

See the Note in the Remarks section of the Help Files for either the
Randomize or Rnd functions.
 
Rick Rothstein said:
See the Note in the Remarks section of the Help Files for either the
Randomize or Rnd functions.

D'oh! I read EricG's explanation literally [1]. I spotted-checked the Help
page, but I did not look at all of the details.

Empirically, I discovered that Rnd -1 (any negative number) is also needed
in order for multiple calls to Randomize (no argument) to repeat the same
sequence of pseudorandom numbers when called within the same 15.625-msec
window. (System time is updated every 15.625 msec, at least in MSWin XP.)

Not very useful, of course. But that explains why calling Randomize alone
in a tight loop seemed to work suprisingly well, albeit not a good practice.

It appears that Randomize uses some state information (probably the PRNG
factors after the last Rnd call) in conjunction with its argument, or with
"system time" when there is no argument. Calling Rnd -1 first ensures a
consistent state, ergo a repeatable seed. Without calling Rnd -1
before Randomize, the PRNG state is inconsistent, resulting in an
unpredictable seed despite the same or similar "system time".


-----
[1] EricG wrote: ``If you are using something like "Randomize 1234568",
then you will get the same "random" numbers each time you run your code.``


----- original message -----
 
Back
Top