how can I use randbetween without repeating numbers in a set

  • Thread starter Thread starter Maria
  • Start date Start date
Increase the Maximum Number value in your Randbetween formula.

For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.

If you use the formula like this

=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)

Then you will not get any duplicates.

Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.
 
David Sir I am not having any in depth knowledge about excel. I am learning
it and know some little bit about excel by practice. I have also come across
this issue and in that time I used to do like this to avoid duplicates.

Open a New work book and in A1 Cell paste this formula

=RANDBETWEEN(1,10)

Apply the formula upto A10, after that select the column or Range A1:A10 and
do copy and paste it as values. In B1 cell apply this formula
=COUNTIF(A:A,A1) and drag it upto B10. Now you can see the values “2†in B
Column. When we apply the Randbetween for 10 cells and if we mention the
Maximum Randbetween value as 10 then it’s creating duplicates.

Now open another new workbook and paste the below formula in A1 Cell

=RANDBETWEEN(1,100)
OR
=RANDBETWEEN(1,50)

As mentioned above repeat the same steps like pasting it upto A10 cell and
do copy and paste it as values and apply countif formula to check for
duplicates. Now all the countif results will be “1â€.

So if we increase the Randbetween Maximum value depends upon the data then
it’s not creating any duplicate values. That is the reason I have suggested
it.

If this method is wrong then please guide me I will also stop trying this.

--------------------
(Ms-Exl-Learner)
--------------------



David Biddulph said:
I'm confused by your reply to Maria.

Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability of
duplicates doesn't become zero until the range of numbers from which you are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is non-zero.]

Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph

Ms-Xl-Learner said:
Increase the Maximum Number value in your Randbetween formula.

For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.

If you use the formula like this

=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)

Then you will not get any duplicates.

Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.


.
 
Maria,

I can't reply directly because you didn't put anything in the body of your
message so I'll use David's response to reply.

Try this macro, Set your maximum number to whatever you want but it must be
larger than the number of cells in the fill range.

Sub Marine()
Dim MyMax As Long
MyMax = 1000 'Change to suit
Dim FillRange As Range
Set FillRange = Range("A1:a100")
For Each c In FillRange
Do
c.Value = Int((MyMax * Rnd) + 1)
Loop Until WorksheetFunction.CountIf(FillRange, c.Value) < 2
Next
End Sub


Mike

David Biddulph said:
I'm confused by your reply to Maria.

Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability of
duplicates doesn't become zero until the range of numbers from which you are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is non-zero.]

Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph

Ms-Xl-Learner said:
Increase the Maximum Number value in your Randbetween formula.

For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.

If you use the formula like this

=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)

Then you will not get any duplicates.

Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.



.
 
In this case it's not a knowledge of Excel that counts, but a very basic
knowledge of elementary statistics.

To go back to Excel:
With your RANDBETWEEN(1,100), either recalculate a number of times (just hit
F9) or copy across a number of columns, and then see whether your COUNTIF is
always 1.
I copied this across all 256 column of Excel 2003 & counted in how many
columns out of the 256 there were duplicates. Again the value will change
when you recalculate but the values are typically between 70 and 120 columns
out of the 256 which contain duplicates with 10 samples from
RANDBETWEEN(1,10)

You can use Excel to calculate the probability, as follows.
When you've got a random number in row 1, the probability of the random
number in row 2 not being a duplicate is 99/100
If you've got 2 different random numbers in rows 1 & 2, the probability of
the random number in row 3 not being a duplicate of either of those is
98/100
If you've got 3 different random numbers in rows 1 to 3, the probability of
the random number in row 4 not being a duplicate of any of those is 97/100
and so on down to
If you've got 9 different random numbers in rows 1 to 9, the probability of
the random number in row 10 not being a duplicate of any of those is 91/100
Therefore the probability of there not being a duplicate in your 10 random
samples from 100 is the product of those 9 probabilities above, which works
out at 62.82%.
For interest I then worked out the expected (mean) number of columns with
duplicates out of my 256 columns, and this is =256*(1-62.82%), or about 95
(and 95 is the middle of the typical range of 70 to 120 which I quoted above
when I recalculated a number of times).

If you change from RANDBETWEEN(1,100) to your other suggestion of
RANDBETWEEN(1,50), the probability of no duplicates drops from 62.82% to
38.17%, and the expected number of columns with duplicates increases to
about 158 of the 256 (and typical values seen are between 143 and 173).

Q.E.D.
--
David Biddulph


Ms-Exl-Learner said:
David Sir I am not having any in depth knowledge about excel. I am
learning
it and know some little bit about excel by practice. I have also come
across
this issue and in that time I used to do like this to avoid duplicates.

Open a New work book and in A1 Cell paste this formula

=RANDBETWEEN(1,10)

Apply the formula upto A10, after that select the column or Range A1:A10
and
do copy and paste it as values. In B1 cell apply this formula
=COUNTIF(A:A,A1) and drag it upto B10. Now you can see the values "2" in
B
Column. When we apply the Randbetween for 10 cells and if we mention the
Maximum Randbetween value as 10 then it's creating duplicates.

Now open another new workbook and paste the below formula in A1 Cell

=RANDBETWEEN(1,100)
OR
=RANDBETWEEN(1,50)

As mentioned above repeat the same steps like pasting it upto A10 cell and
do copy and paste it as values and apply countif formula to check for
duplicates. Now all the countif results will be "1".

So if we increase the Randbetween Maximum value depends upon the data then
it's not creating any duplicate values. That is the reason I have
suggested
it.

If this method is wrong then please guide me I will also stop trying this.

--------------------
(Ms-Exl-Learner)
--------------------



David Biddulph said:
I'm confused by your reply to Maria.

Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability
of
duplicates doesn't become zero until the range of numbers from which you
are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is
non-zero.]

Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph

Ms-Xl-Learner said:
Increase the Maximum Number value in your Randbetween formula.

For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.

If you use the formula like this

=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)

Then you will not get any duplicates.

Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.


.
 
David Sir Thanks a lot for spending your valuable time in explaining /
guiding me by way of giving a brief valuable note about randbetween function.
Now it’s clear to me that the method which was followed by me will not stop
creating the duplicate values.

Before that I used to increase the randbetween maximum value depends upon
the number of cells for which I am going to apply. For example if I am
applying the randbetween formula for 10 cells then I used to square it like
(10*10) so previously I had the thought if I use the randbetween Maximum
value as 100 for 10 cells then it will not create any duplicates. In the same
case for 100 cells I was used 10000 as the maximum value (100*100) and
assumed that the squaring the number of cells and using that as the maximum
value in randbetween will rectify the duplication issue. But now only I
understood that this is not the right way to stop creating the duplicates.

Today I have learned another informative message about the method of using
the Randbetween Function from your post.

Once again Thank you very much!!!

--------------------
(Ms-Exl-Learner)
--------------------



David Biddulph said:
In this case it's not a knowledge of Excel that counts, but a very basic
knowledge of elementary statistics.

To go back to Excel:
With your RANDBETWEEN(1,100), either recalculate a number of times (just hit
F9) or copy across a number of columns, and then see whether your COUNTIF is
always 1.
I copied this across all 256 column of Excel 2003 & counted in how many
columns out of the 256 there were duplicates. Again the value will change
when you recalculate but the values are typically between 70 and 120 columns
out of the 256 which contain duplicates with 10 samples from
RANDBETWEEN(1,10)

You can use Excel to calculate the probability, as follows.
When you've got a random number in row 1, the probability of the random
number in row 2 not being a duplicate is 99/100
If you've got 2 different random numbers in rows 1 & 2, the probability of
the random number in row 3 not being a duplicate of either of those is
98/100
If you've got 3 different random numbers in rows 1 to 3, the probability of
the random number in row 4 not being a duplicate of any of those is 97/100
and so on down to
If you've got 9 different random numbers in rows 1 to 9, the probability of
the random number in row 10 not being a duplicate of any of those is 91/100
Therefore the probability of there not being a duplicate in your 10 random
samples from 100 is the product of those 9 probabilities above, which works
out at 62.82%.
For interest I then worked out the expected (mean) number of columns with
duplicates out of my 256 columns, and this is =256*(1-62.82%), or about 95
(and 95 is the middle of the typical range of 70 to 120 which I quoted above
when I recalculated a number of times).

If you change from RANDBETWEEN(1,100) to your other suggestion of
RANDBETWEEN(1,50), the probability of no duplicates drops from 62.82% to
38.17%, and the expected number of columns with duplicates increases to
about 158 of the 256 (and typical values seen are between 143 and 173).

Q.E.D.
--
David Biddulph


Ms-Exl-Learner said:
David Sir I am not having any in depth knowledge about excel. I am
learning
it and know some little bit about excel by practice. I have also come
across
this issue and in that time I used to do like this to avoid duplicates.

Open a New work book and in A1 Cell paste this formula

=RANDBETWEEN(1,10)

Apply the formula upto A10, after that select the column or Range A1:A10
and
do copy and paste it as values. In B1 cell apply this formula
=COUNTIF(A:A,A1) and drag it upto B10. Now you can see the values "2" in
B
Column. When we apply the Randbetween for 10 cells and if we mention the
Maximum Randbetween value as 10 then it's creating duplicates.

Now open another new workbook and paste the below formula in A1 Cell

=RANDBETWEEN(1,100)
OR
=RANDBETWEEN(1,50)

As mentioned above repeat the same steps like pasting it upto A10 cell and
do copy and paste it as values and apply countif formula to check for
duplicates. Now all the countif results will be "1".

So if we increase the Randbetween Maximum value depends upon the data then
it's not creating any duplicate values. That is the reason I have
suggested
it.

If this method is wrong then please guide me I will also stop trying this.

--------------------
(Ms-Exl-Learner)
--------------------



David Biddulph said:
I'm confused by your reply to Maria.

Can you explain to us why you think that
=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)
would not provide duplicates?
The probability of duplicates will obviously reduce progressively as the
probability for each independent number is reduced, but the probability
of
duplicates doesn't become zero until the range of numbers from which you
are
choosing is infinite.
[It is left as an excercise for the interested reader to calculate the
probability of duplicates for a selection of 10 samples from
RANDBETWEEN(1,100), but I can assure you that the probability is
non-zero.]

Hence you need a VBA solution if duplcates are to be avoided.
--
David Biddulph

Ms-Xl-Learner wrote:


Increase the Maximum Number value in your Randbetween formula.

For Example if you want to use the Randbetween for 10 cells, then if
you use the formula like this =RANDBETWEEN(1,10) will result duplicate
values.

If you use the formula like this

=RANDBETWEEN(1,50)
=RANDBETWEEN(1,100)

Then you will not get any duplicates.

Like this depends upon your data size increase the Maximum Value In
your Randbetween formula to avoid duplicates values.


.
 
Test the error paths.  They all return #VALUE instead of the intended error.

Hello,

Thank you. Corrected.

Regards,
Bernd
 
Bernd P said:
A faster and more flexible RandInt you can find here:
http://sulprobil.com/html/randint.html

I don't think the "late initialization" algorithm works correctly when the
lMin-to-lMax range includes zero.

I am not taking the time to prove it by testing. But consider the following
mental experiment with lMin < 0, lMax = 0, and lRept = 1.

Assume the first random position is lT(x) such that 1 <= x < lRange; thus,
the first random integer is x-1+lMin. Then lT(x) is replaced with
lT(lRange). Since lT(lRange) is zero (uninitialized), lT(x) is replaced
with lRange-i+lMin, which is zero. So far, so good.

Now assume the second random position is the same lT(x) coincidentally. In
that case, the second random integer should be zero; that is, it should be
lT(lRange) that was copied down. But since lT(x) is zero, it will be
interpreted as uninitialized. So the second random integer will be x-1+lMin
again.

Not only is that not the intended random integer, but also it violates the
lRept requirement of one.

If you agree, I think the simplest solution is to disable the "late
initialization" algorithm when the lMin-to-Lmax range includes zero. To
wit:

If lRange < CLateInit Or (lMin <= 0 And lMax >= 0) Then '[sic]
For i = 1 To lRange
lT(i) = Int((i - 1) / lRept) + lMin
Next i
i = 1
For lRow = 1 To UBound(lR, 1)
For lCol = 1 To UBound(lR, 2)
lRnd = Int(((lRange - i + 1) * Rnd) + 1)
lR(lRow, lCol) = lT(lRnd)
lT(lRnd) = lT(lRange - i + 1)
i = i + 1
Next lCol
Next lRow
Else
'If we have a huge range of possible random integers and a
'comparably small number of draws, i.e. if
'(lMax - lMin) * lRept >> lCount
'then we can save some runtime with late initialisation.
i = 1
[...etc...]


PS: I do not agree that "lRange < CLateInit" is sufficient to test the
condition (lMax-lMin)*lRept >> lCount. But that's a matter of opinion.
 
Hello again,

...
I don't think the "late initialization" algorithm works correctly when the
lMin-to-lMax range includes zero.
...

Thanks for spotting. I corrected that.
...
If you agree, I think the simplest solution is to disable the "late
initialization" algorithm when the lMin-to-Lmax range includes zero.
...

I do not agree. A runtime check revealed that there is enough time to
shift the results.
...
PS:  I do not agree that "lRange < CLateInit" is sufficient to test the
condition (lMax-lMin)*lRept >> lCount.  But that's a matter of opinion.

I agree and I changed it - now it's a border for a ratio (lRange /
lCount). But it's still a matter of opinion...

Thanks again for your help.

Regards,
Bernd
 
Back
Top