Extracting coordinates of matching numbers within a matrix

  • Thread starter Thread starter Max
  • Start date Start date
M

Max

Hi guys,

I have in Sheet1, in B3:U502, a 500 row x 20 col matrix containing 10,000
randomized unique numbers from 0000 - 9999. The range is named RData.

B2:U2 are numbered 1 - 20, A3:A502 are numbered 1 - 500

In Sheet2, there's a list of 4-digit numbers in col A, A2 downwards.

I would like to match the numbers in col A against RData & extract the
"coordinates" of the matching cells, ie the row and col numbers (as per
numbering in B2:U1 and in A3:A502 in Sheet1) into cols B & C.

Since RData contains all 10,000 numbers, there will always be a match for
the numbers listed in col A.

For example, supposing the number 0000 is located in row no: 2 and col no: 3
within RData (i.e. in cell D4 in Sheet1)

In sheet2

A2: 0000
B2: 2 < extracted row no.
C2: 3 < extracted col no.

Appreciate any insights on the formulae which can be used for cols B and C.

TIA
Max
 
Max,

Try these two functions.

Put:
=RowN(RData,A2)

in Sheet2, B2

Put:
=ColN(RData,A2)
in Sheet2, C2


Then copy B2 and C2 down as far as there are 4 digit numbers.




This is set up for numbers not text. I mention it because you describe the
randon numbers as 4 digit with leading zero.




Function RowN(myrange As Range, Myvalue)
For Each c In myrange.Cells
If c.Value = Myvalue Then
RowN = c.Row - 2
Exit Function
Else
End If
Next c
RowN = "Not Found"
End Function


Function ColN(myrange As Range, Myvalue)
For Each c In myrange.Cells
If c.Value = Myvalue Then
ColN = c.Column - 1
Exit Function
Else
End If
Next c
ColN = "Not Found"
End Function


Don Pistulka
 
Max said:
I have in Sheet1, in B3:U502, a 500 row x 20 col matrix containing 10,000
randomized unique numbers from 0000 - 9999. The range is named RData.

B2:U2 are numbered 1 - 20, A3:A502 are numbered 1 - 500

In Sheet2, there's a list of 4-digit numbers in col A, A2 downwards.

I would like to match the numbers in col A against RData & extract the
"coordinates" of the matching cells, ie the row and col numbers (as per
numbering in B2:U1 and in A3:A502 in Sheet1) into cols B & C.

Since RData contains all 10,000 numbers, there will always be a match for
the numbers listed in col A.

For example, supposing the number 0000 is located in row no: 2 and col no: 3
within RData (i.e. in cell D4 in Sheet1)

In sheet2

A2: 0000
B2: 2 < extracted row no.
C2: 3 < extracted col no.
I put together these formulas:
=SUMPRODUCT((RData=A2)*ROW(RData)) - ROW(RData)+1
=SUMPRODUCT((RData=A2)*COLUMN(RData)) - COLUMN(RData)+1

Calculation time is slow - really slow. Slow like you might think you
computer has crashed slow.

A faster solution would be using a single column that is a translation
of the 500x20 range. Insert a new worksheet (or in a clean area of any
worksheet). Enter the numbers 1-10,000 down a column and in the column
next to them, put in the formula:
=OFFSET(RData,MOD(A1-1,500),INT((A1-1)/500),1,1)
modifing A1 appropriately. Name the single column of numbers
RDataSingleColumn

Then your formulas in B2 and C2 would be:
=MOD(MATCH(A2,RDataSingleColumn) -1,500)+1
=INT((MATCH(A2,RDataSingleColumn) -1)/500)+1
Another solution would be a VBA user-defined function like:

If you don't want to support a single-column view, you might use a VBA
user-defined function like:

(I believe this is somewhat faster than the SUMPRODUCT method, but
much slower than the single column)

Public Function FindInBlock(valueToFind As Integer, _
block As Range, whichOutput As Integer) As Variant
'whichOutput: 1- Row, 2- Column, 3- Address
Dim rFound As Range

Set rFound = block.Find(valueToFind, _
LookIn:=xlValues, lookat:=xlWhole)
If rFound Is Nothing Then
FindInBlock = CVErr(xlErrNA)
Else
Select Case whichOutput
Case 1
FindInBlock = rFound.Row - block.Row + 1
Case 2
FindInBlock = rFound.Column - block.Column + 1
Case 3
FindInBlock = rFound.Address
Case Else
FindInBlock = CVErr(xlErrValue)
End Select
End If
End Function


Hope this helps,

Matthew
 
Max said:
I have in Sheet1, in B3:U502, a 500 row x 20 col matrix containing 10,000
randomized unique numbers from 0000 - 9999. The range is named RData.

B2:U2 are numbered 1 - 20, A3:A502 are numbered 1 - 500

In Sheet2, there's a list of 4-digit numbers in col A, A2 downwards.

I would like to match the numbers in col A against RData & extract the
"coordinates" of the matching cells, ie the row and col numbers (as per
numbering in B2:U1 and in A3:A502 in Sheet1) into cols B & C. ....
For example, supposing the number 0000 is located in row no: 2 and col no: 3
within RData (i.e. in cell D4 in Sheet1)

In sheet2

A2: 0000
B2: 2 < extracted row no.
C2: 3 < extracted col no.
....

This is a situation in which a macro would be much faster than formulas,
even formulas using UDFs. It's far more efficient to 'push' coordinates from
RData into Sheet2. If Sheet2!A2:A10001 were sorted in ascending order,


Sub foo()
Dim src As Range, dest As Range, r As Range
Dim n As Long

Set src = Names("RData").RefersToRange
Set dest = Worksheets("Sheet2").Range("A2")

Application.Calculation = xlCalculationManual

For Each r In src
dest.Offset(r.Value, 2).Value = r.Offset(2 - r.Row, 0).Value
dest.Offset(r.Value, 3).Value = r.Offset(0, 1 - r.Column).Value
Next r

Application.Calculation = xlCalculationAutomatic
End Sub
 
...
....
I put together these formulas:
=SUMPRODUCT((RData=A2)*ROW(RData)) - ROW(RData)+1
=SUMPRODUCT((RData=A2)*COLUMN(RData)) - COLUMN(RData)+1

Calculation time is slow - really slow. Slow like you might think you
computer has crashed slow.

20K formulas each performing 10K comparisons and several arithmetic
operations. Brute force. No surprise it's slow.

If you start with the value to locate in RData, then find it in RData, you
can't avoid O(N^2) run times. If you start with RData and iterate through it
to populate the result/coordinate range, you get O(N) run times.
 
Thanks for your insights, Matthew!
I'll try out the various options you have offered.

cheers
Max
 
By the way, if in addition you entered

=ArrayMatch(A2,RData,"A") in D2 and filled down, you would also have the
addresses of the matches.

Alan Beban
 
Many thanks Harlan! Took me a while though to understand your sub's output
and how to relate it to my set-up.

cheers
Max
 
Here's my feedback on your options, Matthew.. in-line

Matthew Connor said:
I put together these formulas:
=SUMPRODUCT((RData=A2)*ROW(RData)) - ROW(RData)+1
=SUMPRODUCT((RData=A2)*COLUMN(RData)) - COLUMN(RData)+1
Calculation time is slow - really slow. Slow like you might think you
computer has crashed slow.

Yes, this option returns correct results. Great!
Recalculation time appears not too bad on my PC.
A faster solution would be using a single column that is a translation
of the 500x20 range. Insert a new worksheet (or in a clean area of any
worksheet). Enter the numbers 1-10,000 down a column and in the column
next to them, put in the formula:
=OFFSET(RData,MOD(A1-1,500),INT((A1-1)/500),1,1)
modifing A1 appropriately. Name the single column of numbers
RDataSingleColumn

Then your formulas in B2 and C2 would be:
=MOD(MATCH(A2,RDataSingleColumn) -1,500)+1
=INT((MATCH(A2,RDataSingleColumn) -1)/500)+1

I did all of the above as described, but I'm afraid I failed to get the
correct results.
For example, when I input the top left number in RData (i.e. the random
number in cell B3 in sheet1) in sheet2's cell A2, the results returned in B2
& C2 were 444 , 18, instead of 1, 1.
Putting in the bottom right number in RData (i.e. the number in U502 of
sheet1) returned 472, 11, instead of 500, 20. I'm not sure what went a
little off here. Any ideas how/what to rectify?
Another solution would be a VBA user-defined function like:

If you don't want to support a single-column view, you might use a VBA
user-defined function like:

(I believe this is somewhat faster than the SUMPRODUCT method, but
much slower than the single column)

Public Function FindInBlock(valueToFind As Integer, _
block As Range, whichOutput As Integer) As Variant
'whichOutput: 1- Row, 2- Column, 3- Address
Dim rFound As Range

Set rFound = block.Find(valueToFind, _
LookIn:=xlValues, lookat:=xlWhole)
If rFound Is Nothing Then
FindInBlock = CVErr(xlErrNA)
Else
Select Case whichOutput
Case 1
FindInBlock = rFound.Row - block.Row + 1
Case 2
FindInBlock = rFound.Column - block.Column + 1
Case 3
FindInBlock = rFound.Address
Case Else
FindInBlock = CVErr(xlErrValue)
End Select
End If
End Function

Tried out the above UDF as in:

=FindInBlock($A2,RData,1)
=FindInBlock($A2,RData,2)
=FindInBlock($A2,RData,3)

All 3 returned #NA. I'm using xl97. Is the version the problem?

TIA
Max
 
Sorry Alan. Followed your suggestions diligently and downloaded/copied your
module over to the workbook, but I'm not able to get it to work.

=ArrayMatch(A2,RData), array-entered into B2:C2
=ArrayMatch(A2,RData,"A"), array-entered into D2

The results in B2, C2 & D2 were zeros.

I'm using xl97. Could the version be the problem?

TIA
Max
 
Hi Max,

Another option (untested)

=SUMPRODUCT((RData=A2)*ROW(RData)) -ROW(RData)+1

in B2 and:

=SUMPRODUCT((OFFSET(RData,B2-1,,1)=A2)*COLUMN(RData))) -COLUMN(RData)+1

in C2.

They should be nearly twice the speed of Matthew's original pair of
formulae, because the second formula doesn't need to search the complete
range again, just the row that the value was found in.

HTH
Steve D.
 
Max,

If you don't mind having both coordinates in one cell this should cut the
time in half:

Function BothCN(myrange As Range, Myvalue)
For Each c In myrange.Cells
If c.Value = Myvalue Then
BothCN = c.Row - 2 & "," & c.Column - 1
Exit Function
Else
End If
Next c
BothCN = "Not Found"
End Function

Don Pistulka
 
...
...
Another option (untested)

=SUMPRODUCT((RData=A2)*ROW(RData)) -ROW(RData)+1

in B2 and:

=SUMPRODUCT((OFFSET(RData,B2-1,,1)=A2)*COLUMN(RData))) -COLUMN(RData)+1

in C2.

They should be nearly twice the speed of Matthew's original pair of
formulae, because the second formula doesn't need to search the complete
range again, just the row that the value was found in.
...

Interesting.

Only one lookup is needed. Just use an array formula to combine and split the
row and column numbers.

=INT(MOD(SUMPRODUCT((RData=A2)*(1000*OFFSET(RData,0,-1,,1)
+OFFSET(RData,-1,0,1,))),{1000000,1000})/{1000,1})

Still, you're comparing each of the N values in Sheet2!A2:A10001 to each of the
N values in RData, so it remains an O(N^2) process.
 
...
...
I did all of the above as described, but I'm afraid I failed to get the
correct results.
...

This doesn't work because RDataSingleColumn, which is presumably the column
containing the OFFSET formulas, isn't in strictly ascending order, so the MATCH
calls in B2 and C2 need zero third arguments to force exact matches in the
unsorted RDataSingleColumn. That means linear search, so nothing gained compared
to looking in RData directly.
 
Excellent, Steve!
Tested - it works great, and it's fast too.

Thanks
Max

| Hi Max,
|
| Another option (untested)
|
| =SUMPRODUCT((RData=A2)*ROW(RData)) -ROW(RData)+1
|
| in B2 and:
|
| =SUMPRODUCT((OFFSET(RData,B2-1,,1)=A2)*COLUMN(RData))) -COLUMN(RData)+1
|
| in C2.
|
| They should be nearly twice the speed of Matthew's original pair of
| formulae, because the second formula doesn't need to search the complete
| range again, just the row that the value was found in.
|
| HTH
| Steve D.
 
Many thanks, Alan!
I've got it working now.

cheers
Max

| There should be no problem with the version; I have sent you an email.
|
| Alan Beban
|
| Max wrote:
| > Sorry Alan. Followed your suggestions diligently and downloaded/copied
your
| > module over to the workbook, but I'm not able to get it to work.
| >
| > =ArrayMatch(A2,RData), array-entered into B2:C2
| > =ArrayMatch(A2,RData,"A"), array-entered into D2
| >
| > The results in B2, C2 & D2 were zeros.
| >
| > I'm using xl97. Could the version be the problem?
| >
| > TIA
| > Max
 
Great variant, Don!
Works smooth & fast.

Guess I'm really spoilt for choice
given the many options availed by you,
Alan, Matthew, Harlan & Stephen <g>

Thanks
Max

| Max,
|
| If you don't mind having both coordinates in one cell this should cut the
| time in half:
|
| Function BothCN(myrange As Range, Myvalue)
| For Each c In myrange.Cells
| If c.Value = Myvalue Then
| BothCN = c.Row - 2 & "," & c.Column - 1
| Exit Function
| Else
| End If
| Next c
| BothCN = "Not Found"
| End Function
|
| Don Pistulka
|
 
Back
Top