M
David Biddulph said:I'm confused by your reply to Maria.
Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability of
duplicates doesn't become zero until the range of numbers from which you are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is non-zero.]
Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph
Ms-Xl-Learner said:Increase the Maximum Number value in your Randbetween formula.
For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.
If you use the formula like this
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
Then you will not get any duplicates.
Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.
.
David Biddulph said:I'm confused by your reply to Maria.
Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability of
duplicates doesn't become zero until the range of numbers from which you are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is non-zero.]
Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph
Ms-Xl-Learner said:Increase the Maximum Number value in your Randbetween formula.
For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.
If you use the formula like this
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
Then you will not get any duplicates.
Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.
.
Ms-Exl-Learner said:David Sir I am not having any in depth knowledge about excel. I am
learning
it and know some little bit about excel by practice. I have also come
across
this issue and in that time I used to do like this to avoid duplicates.
Open a New work book and in A1 Cell paste this formula
=RANDBETWEEN(1,10)
Apply the formula upto A10, after that select the column or Range A1:A10
and
do copy and paste it as values. In B1 cell apply this formula
=COUNTIF(A:A,A1) and drag it upto B10. Now you can see the values "2" in
B
Column. When we apply the Randbetween for 10 cells and if we mention the
Maximum Randbetween value as 10 then it's creating duplicates.
Now open another new workbook and paste the below formula in A1 Cell
=RANDBETWEEN(1,100)
OR
=RANDBETWEEN(1,50)
As mentioned above repeat the same steps like pasting it upto A10 cell and
do copy and paste it as values and apply countif formula to check for
duplicates. Now all the countif results will be "1".
So if we increase the Randbetween Maximum value depends upon the data then
it's not creating any duplicate values. That is the reason I have
suggested
it.
If this method is wrong then please guide me I will also stop trying this.
--------------------
(Ms-Exl-Learner)
--------------------
David Biddulph said:I'm confused by your reply to Maria.
Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability
of
duplicates doesn't become zero until the range of numbers from which you
are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is
non-zero.]
Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph
Ms-Xl-Learner said:Increase the Maximum Number value in your Randbetween formula.
For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.
If you use the formula like this
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
Then you will not get any duplicates.
Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.
.
David Biddulph said:In this case it's not a knowledge of Excel that counts, but a very basic
knowledge of elementary statistics.
To go back to Excel:
With your RANDBETWEEN(1,100), either recalculate a number of times (just hit
F9) or copy across a number of columns, and then see whether your COUNTIF is
always 1.
I copied this across all 256 column of Excel 2003 & counted in how many
columns out of the 256 there were duplicates. Again the value will change
when you recalculate but the values are typically between 70 and 120 columns
out of the 256 which contain duplicates with 10 samples from
RANDBETWEEN(1,10)
You can use Excel to calculate the probability, as follows.
When you've got a random number in row 1, the probability of the random
number in row 2 not being a duplicate is 99/100
If you've got 2 different random numbers in rows 1 & 2, the probability of
the random number in row 3 not being a duplicate of either of those is
98/100
If you've got 3 different random numbers in rows 1 to 3, the probability of
the random number in row 4 not being a duplicate of any of those is 97/100
and so on down to
If you've got 9 different random numbers in rows 1 to 9, the probability of
the random number in row 10 not being a duplicate of any of those is 91/100
Therefore the probability of there not being a duplicate in your 10 random
samples from 100 is the product of those 9 probabilities above, which works
out at 62.82%.
For interest I then worked out the expected (mean) number of columns with
duplicates out of my 256 columns, and this is =256*(1-62.82%), or about 95
(and 95 is the middle of the typical range of 70 to 120 which I quoted above
when I recalculated a number of times).
If you change from RANDBETWEEN(1,100) to your other suggestion of
RANDBETWEEN(1,50), the probability of no duplicates drops from 62.82% to
38.17%, and the expected number of columns with duplicates increases to
about 158 of the 256 (and typical values seen are between 143 and 173).
Q.E.D.
--
David Biddulph
Ms-Exl-Learner said:David Sir I am not having any in depth knowledge about excel. I am
learning
it and know some little bit about excel by practice. I have also come
across
this issue and in that time I used to do like this to avoid duplicates.
Open a New work book and in A1 Cell paste this formula
=RANDBETWEEN(1,10)
Apply the formula upto A10, after that select the column or Range A1:A10
and
do copy and paste it as values. In B1 cell apply this formula
=COUNTIF(A:A,A1) and drag it upto B10. Now you can see the values "2" in
B
Column. When we apply the Randbetween for 10 cells and if we mention the
Maximum Randbetween value as 10 then it's creating duplicates.
Now open another new workbook and paste the below formula in A1 Cell
=RANDBETWEEN(1,100)
OR
=RANDBETWEEN(1,50)
As mentioned above repeat the same steps like pasting it upto A10 cell and
do copy and paste it as values and apply countif formula to check for
duplicates. Now all the countif results will be "1".
So if we increase the Randbetween Maximum value depends upon the data then
it's not creating any duplicate values. That is the reason I have
suggested
it.
If this method is wrong then please guide me I will also stop trying this.
--------------------
(Ms-Exl-Learner)
--------------------
David Biddulph said:I'm confused by your reply to Maria.
Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability
of
duplicates doesn't become zero until the range of numbers from which you
are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is
non-zero.]
Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph
Ms-Xl-Learner wrote:
Increase the Maximum Number value in your Randbetween formula.
For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.
If you use the formula like this
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
Then you will not get any duplicates.
Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.
.
See JE McGimpsey's site for a solution to this problem.
http://www.mcgimpsey.com/excel/udfs/randint.html
Bernd P said:A faster and more flexible RandInt you can find here:
http://sulprobil.com/html/randint.html
Bernd P said:A faster and more flexible RandInt you can find here:
http://sulprobil.com/html/randint.html
...
I don't think the "late initialization" algorithm works correctly when the
lMin-to-lMax range includes zero.
...
...
If you agree, I think the simplest solution is to disable the "late
initialization" algorithm when the lMin-to-Lmax range includes zero.
...
...
PS: I do not agree that "lRange < CLateInit" is sufficient to test the
condition (lMax-lMin)*lRept >> lCount. But that's a matter of opinion.