INDEX function array causes problem when attempting to use LARGEfunction to sort positive and negati

  • Thread starter Thread starter exceluser
  • Start date Start date
E

exceluser

Can someone show me how to accomplish the following by using the LARGE
function with positive and negative numbers "WITHOUT" using an array
entered formula ?

I'm trying to rank each product (most, second most, etc.) by weight.


Data worksheet

A B C
1 Product Type Pounds
2 Orange Fruit -200
3 Tomato Vegetable 500
4 Apple Fruit 700
5 Potato Vegetable 1,000


Using the LARGE function, the goal is to get the following result on
another worksheet:


Order worksheet


A B C
1 Product 1 2
2 Fruit Apple Orange
3 Vegetable Potato Tomato



Currently, I'm using the following forumula:

=IFERROR(INDEX(Data!$A$1:$C$5,MATCH(LARGE(INDEX((Data!$B$2:$B$5=
$A2)*Data!$C$2:$C$5,),B$2),Data!$C$1:$C$5,0)),"")

NOTE:

Unfortunately, this formula only works if the sorted numbers are
positive due to the INDEX function's array being populated with 0's
from the FALSE values.

{TRUE,FALSE,TRUE,FALSE} * {-200,500,700,1000} = {-200,0,700,0}

The problem is that the LARGE function doesn't ignore the 0's created
from the FALSE values in the INDEX array.

As a result, TRUE values in the INDEX array that are negative or zero
may not get properly included in the sorted output.

Basically, I want the INDEX function array output to be {-200,,700,}
which will sort properly.

I do not want to use an array entered formula as the number of cells
(15,000+) that would contain the formula would cause approximately 90
minutes of calculation every time related data is changed.

Any ideas as to how to sort the data with positive and negative
numbers and then use the LARGE function to rank the results ?
 
This simple play uses only non-array, entire col formulas, and embedds
a tiebreaker functionality to better handle the desired sorted extract
(in the event of ties or multiple ties)

In Data,
you have the source data as posted within A1:D1

List the 2 Types into F1:G1, viz: Fruit, Vegetable
Put in F2: =IF($C2=F$1,D2-ROW()/10^10,"")
Copy to G2, fill down to cover the max expected extent of source data,
say down to G15000? This is the simple tiebreaker formula. It should
not drag down recalc despite the large range covered.

In Order,
Put in say, B5: =INDEX(Data!$F$1:$G$1,ROWS($1:1))
Copy down to B6

Put in C5:
=IF(COLUMNS($A:A)>COUNT(OFFSET(Data!$F:
$F,,ROWS($1:1)-1)),"",INDEX(Data!$B:$B,MATCH(LARGE(OFFSET(Data!$F:
$F,,ROWS($1:1)-1),COLUMNS($A:A)),OFFSET(Data!$F:$F,,ROWS($1:1)-1),0)))
Copy across to cover the max expected returns for each "Type", say
across to Z5, fill down to Z6

For the sample source data posted,
the above will auto-extract the desired result below into B5:Z6

Fruit Apple   Orange
Vegetable Tomato Potato
(all neatly packed to the left)

In the event of ties in the source scores, items (ie Apple, Orange)
will be extracted in the same relative order that they appear within
the source.
 
Just replace your formula with this
=IFERROR(INDEX(Data!$A$1:$A$5,MATCH(LARGE(IF(Data!$B$2:$B$5=$A3,Data!$C
$2:$C$5,""),B$2),Data!$C$1:$C$5,0)),"")
array formula, press CTRL+ALT+ENTER instead of ENTER.
This will solve your problem with zero values created from logical
function.
 
In your second Sheet in B2 enter
=IFERROR(INDEX(Data!$A$2:$A$5,MATCH(SUMPRODUCT(LARGE(((Data!$B$2:$B$5=$A2)*(Data!$C$2:$C$5))+((Data!$B$2:$B$5<>$A2)*(-1000)),COLUMN()-1)),Data!$C$2:$C$5,0)),"")

Copy/fill this down and right as needed.
It works in 2007 with the data you provided. The -1000 in the formula is to
trick it into thinking this is the lowest value so if your data goes lower
than that it will need changing to suit.

Good luck

Regards
Steve
 
Errata in this line


should be:
Put in F2: =IF($C2=F$1,$D2-ROW()/10^10,"")

Max,

Wow, that was pretty fast.

Maraming salamat po !

With a little modification, your formula works like a charm.

Since the Data!$F:$G columns would only be used for the benefit of
the Order worksheet formulas, is it possible to create that same
series of data in an INDEX array to be used within the formulas on the
Order worksheet - instead of creating helper columns which is
basically what Data!$F:$G is ?

If I could construct an INDEX array without 0's created from FALSE
values like {-200,,700,} without using an array entered formula that
would be the finishing bit.




Exceluser
 
Just replace your formula with this
=IFERROR(INDEX(Data!$A$1:$A$5,MATCH(LARGE(IF(Data!$B$2:$B$5=$A3,Data!$C
$2:$C$5,""),B$2),Data!$C$1:$C$5,0)),"")
array formula, press CTRL+ALT+ENTER instead of ENTER.
This will solve your problem with zero values created from logical
function.
















- Show quoted text -

Minyeh,

Thank you for responding with your idea.

Actually, when I first started attempting this solution, an array
entered formula was used.

And it worked as needed - no zero's for the FALSE values.

But unfortunately, using an array entered formula like this in over
15,000+ cells on the Order worksheet causes the CPU to hit near %100
utilization for 90 minutes every time new data is imported each day.

This makes working with the spreadsheet an exasperating experience.

It would be useful to know what an array entered formula is doing
behind the scenes to exclude the 0's in an INDEX array.

Alternatively, I could buy an 8 core processor machine (sarcasm).




Exceluser
 
Glad to hear. I'm out on your further queries. Meantime, enjoy your
zip along at Ferrari speed with the construct. Do give the stars in
google a resounding ring
 
I forget to say that this formula was set up to work on the small data set
you gave.
Just change the number $5 to suit your needs. and if you have additional
headings in row B of your data sheet, then just add that to your Order sheet
column A
and copy the formula as needed. For 15000 rows the formula should read

=IFERROR(INDEX(Data!$A$2:$A$15000,MATCH(SUMPRODUCT(LARGE(((Data!$B$2:$B$15000=$A2)*(Data!$C$2:$C$15000))+((Data!$B$2:$B$5<>$A2)*(-1000)),COLUMN()-1)),Data!$C$2:$C$15000,0)),"")

Regards
Steve
 
I forget to say that this formula was set up to work on the small data set
you gave.
Just change the number $5 to suit your needs. and if you have additional
headings in row B of your data sheet, then just add that to your Order sheet
column A
and copy the formula as needed. For 15000 rows the formula should read

=IFERROR(INDEX(Data!$A$2:$A$15000,MATCH(SUMPRODUCT(LARGE(((Data!$B$2:$B$150­00=$A2)*(Data!$C$2:$C$15000))+((Data!$B$2:$B$5<>$A2)*(-1000)),COLUMN()-1)),­Data!$C$2:$C$15000,0)),"")

Regards
  Steve








- Show quoted text -

To Max, Minyeh, Steve, Steve Dunn and Bernd P,

TA DAH !

Here it is in all its working, fattened glory.

=IFERROR(INDEX(Data!$A$2:$A$5,MATCH(INDEX(LARGE(((Data!$B$2:$B$5=
$A2)*(Data!$C$2:$C$5)+(ROW(Data!$C$2:$C$5)/10^10))+((Data!$B$2:$B$5<>
$A2)*(SMALL(Data!$C$2:$C$5,1)-1)),B$1),),INDEX(Data!$C$2:$C$5+
(ROW(Data!$C$2:$C$5)/10^10),),0)),"")

The new formula:

1) Supports sorting positive and negative numbers

2) Creates a tie breaker for duplicate values

3) Is orders faster (and fatter) than a similarly constructed array
entered formula

4) Dispenses with 0's created from FALSE values in an INDEX array

5) Does not require any additional helper columns, rows or worksheets
to store workaround data

With your very kind permission, I'd like to thank the innocent and
unsuspecting parties above who inadvertently subjected themselves to a
ridiculously, tortuous exercise.

This formula has replaced an array entered formula and crushed the
calculation time from 90 minutes to 95 seconds.

It's like replacing a $1,000,000 Rolls Royce with a dump truck that
can do 200 mph - it's not pretty, but it gets the job done.

Thanks to all of you, I've learned more about writing moderately
complex Excel formulas in the last week than I have in the last few
years.

And I shouldn't leave out the fact that I've spent an inordinate
amount of time this past week with a dumber than usual look on my face
as I tried to figure out how each of your solutions worked.

You can now resume your regularly scheduled lives ... until next
time.

MOO HOO HOO HA HA (evil laugh).




Exceluser
 
To Max, Minyeh, Steve, Steve Dunn and Bernd P,

   TA DAH !

   Here it is in all its working, fattened glory.

=IFERROR(INDEX(Data!$A$2:$A$5,MATCH(INDEX(LARGE(((Data!$B$2:$B$5=
$A2)*(Data!$C$2:$C$5)+(ROW(Data!$C$2:$C$5)/10^10))+((Data!$B$2:$B$5<>
$A2)*(SMALL(Data!$C$2:$C$5,1)-1)),B$1),),INDEX(Data!$C$2:$C$5+
(ROW(Data!$C$2:$C$5)/10^10),),0)),"")

   The new formula:

        1) Supports sorting positive and negative numbers

        2) Creates a tie breaker for duplicate values

        3) Is orders faster (and fatter) than a similarly constructed array
entered formula

        4) Dispenses with 0's created from FALSE values in an INDEX array

        5) Does not require any additional helper columns, rows or worksheets
to store workaround data

   With your very kind permission, I'd like to thank the innocent and
unsuspecting parties above who inadvertently subjected themselves to a
ridiculously, tortuous exercise.

   This formula has replaced an array entered formula and crushed the
calculation time from 90 minutes to 95 seconds.

   It's like replacing a $1,000,000 Rolls Royce with a dump truck that
can do 200 mph - it's not pretty, but it gets the job done.

   Thanks to all of you, I've learned more about writing moderately
complex Excel formulas in the last week than I have in the last few
years.

   And I shouldn't leave out the fact that I've spent an inordinate
amount of time this past week with a dumber than usual look on my face
as I tried to figure out how each of your solutions worked.

   You can now resume your regularly scheduled lives ... until next
time.

   MOO HOO HOO HA HA (evil laugh).

Exceluser- Hide quoted text -

- Show quoted text -

The only thing this formula doesn't do is indicate which values are
involved in a tie breaker.

For example, you're a judge that's responsible for selecting 10
athletes out of 30 entrants to go to the Olympics by scoring them on
various criteria.

The top 9 athletes score unique values.

However, athletes #10 and #11 score equally.

Depending on how the data is sorted, #11 could be eliminated from
consideration just because of where their name falls in the list to
sort on.

In this situation, the formula would have to be modified to account
for this.

Perhaps by eliminating the tie breaker and concatenating the names
with equal values.

Any ideas ?



Exceluser
 
Back
Top