How long for the first group

  • Thread starter Thread starter PAL
  • Start date Start date
P

PAL

I have a spreadsheet with many rows (1000s) and columns. I am trying to
determine how long it took for a percentage of franchises to open. Lets say
I have:

Column A: Company Name
Column B: Date when product was ready (ie Company A would have the same date
on each row )
Column C: Franchise Name
Column D: Date franchise opened.
Column E: Col D - Col B

If there were a 1000 rows, 500 of which were for company A. I would like to
find the amount of time it took for 50% (the first 50 sites for this example)
of the franchises to open. I am thinking it has to order the dates in column
D, then find the franchise that was 50th and then take column E as the
answer.

I get that it is an array, but get stuck after that.

Ideas.
 
Pal,

=INDEX(E:E,SUMPRODUCT((D1:D20000=(LARGE(IF(A1:A20000="Company
A",D1:D20000),COUNTIF(A1:A20000,"Company A")/2)))*(A1:A20000="Company
A")*ROW(A1:A20000)))

Array entered using Ctrl-Shift-Enter. "Company A" can also be a cell
reference, in case you want to make a table - use advanced filtering to
extract the unique list from your column of company names.

HTH,
Bernie
MS Excel MVP
 
Also, I should have noted that I have assumed that the dates are unique in
column D for any one company - i.e., they did not open multiple franchises
on the same day.

Bernie
 
This is great. Thank you. A few points to fine tune....

I changed "large" to "small" in order for it to start from the smallest
(10th, 25th percentile....).

Also,

1) multiple franchises can be opened on the same day.
2) it is possible that product ready, col B, may be blank. Anyway to force
a blank instead of #"Val"
3) if the number of franchise is "1" we get the "#NUM!". Anway to force a
blank.
4) I also noticed if the number of franchises is "2" or "3". That while it
calucates a 50th or 80th percentile it gives the "#NUM!" for the 25th
percentile.

Any thoughts to clean this up is appreciated, but regardless thanks much.
 
See my comments in-line....

This is great. Thank you.


You're welcome.
A few points to fine tune....

I changed "large" to "small" in order for it to start from the smallest
(10th, 25th percentile....).

Also,

1) multiple franchises can be opened on the same day.

On second thought, this really shouldn't matter, since the value in column E
should be the same.

2) it is possible that product ready, col B, may be blank. Anyway to
force
a blank instead of #"Val"

We weren't using column B andywhere... perhaps wrap the formula in

=IF(B2=""","",LongFormula)
3) if the number of franchise is "1" we get the "#NUM!". Anway to force a
blank.

=IF(COUNTIF(A1:A20000,"Company A")=1,"",IF(B2=""","",LongFormula))
or
=IF(COUNTIF(A1:A20000,"Company A")=1,E2,IF(B2=""","",LongFormula))

4) I also noticed if the number of franchises is "2" or "3". That while it
calucates a 50th or 80th percentile it gives the "#NUM!" for the 25th
percentile.

Use the same technique, along the lines of

=IF(COUNTIF(A1:A20000,"Company A")<3,"Something other than 25th
percentile",Rest of the formula here)

HTH,
Bernie
MS Excel MVP
 
Lots going on in this one. If I focus in on #2 only, I get the blank I am
looking for, but when it is not blank it returns a "false".

=IF(COUNTIF(A1:A20000,P2)=1,"",IF(B2=""",""",INDEX(H:H,SUMPRODUCT(($F$2:$F$1045=(SMALL(IF($A$2:$A$1045=P2,$F$2:$F$1045),COUNTIF($A$2:$A$1045,P2)/4)))*($A$2:$A$1045=P2)*ROW($A$2:$A$1045)))))
 
PAL,

Try changing

IF(B2=""","""

to

IF(B2="","",

You have three double quotes in a row instead of just two.

HTH,
Bernie
MS Excel MVP
 
Hi Bernie,

Wasn't sure if I should post this as a new thread....anyway....

The formula you provided works great. I have pasted in two of them: one
for 50%, the other for 90%. Both formulas seem to work 95% of the time.
Occasionally, I get a "0" for the 50% (or another percentage), while the 90%
works fine. I am not sure why. I thought this to be a stat problem, but the
number of cells it is using is greater than 10. The other fields look fine.
Any ideas.

50%

=IF(Work!G32="","",IF(COUNTIF(Work!$A$2:$A$2000,A33)<=5,"",INDEX(Work!H:H,SUMPRODUCT((Work!$F$2:$F$2000=(SMALL(IF(Work!$A$2:$A$2000=A33,Work!$F$2:$F$2000),COUNTIF(Work!$A$2:$A$2000,A33)/2)))*(Work!$A$2:$A$2000=A33)*ROW(Work!$A$2:$A$2000)))))

90%

=IF(Work!G32="","",IF(COUNTIF(Work!$A$2:$A$2000,A33)<=5,"",INDEX(Work!H:H,SUMPRODUCT((Work!$F$2:$F$2000=(SMALL(IF(Work!$A$2:$A$2000=A33,Work!$F$2:$F$2000),COUNTIF(Work!$A$2:$A$2000,A33)/1.11)))*(Work!$A$2:$A$2000=A33)*ROW(Work!$A$2:$A$2000)))))
 
It is very hard to diagnose the problems without the data - what I like to do is cut down the data
set to, say, 20 rows, and change the formula to reflect that restricted range. Get the problem to
manifest at that level, then select parts of the formula in Edit mode and press F9 to get them to
evaluate - then you can process the formula part by part to figure out what is happening.

HTH,
Bernie
MS Excel MVP
 
Near as I can tell its a stats problem. I have done it two ways, removed all
the data with the exception of the data generating these numbers or I kept it
all in. The 90th was working originally.

The numbers in the data set are
152,239,239,244,244,244,263,273,364,455,482,482

I can get the 50th and 90th percentile if I delete 3 of the numbers - 482,
244, 244. Obviously the dups are messing it up. If I remove 2 x 244 it
works. Obviously, I can't control the data so it seems like I have a
problem.

Not sure where to go from here.
 
What results would you expect? With those numbers I get 244 as the 50%ile and 455 as the 90%ile.
But if I change

COUNTIF(Work!$A$2:$A$2000,A33)/1.11

to

ROUND(COUNTIF(Work!$A$2:$A$2000,A33)/1.11,0)

then I get 482 as the 90%ile.

HTH,
Bernie
MS Excel MVP
 
Sorry for the delay. Got pulled away, but was probably good to get away.
Not sure what I did last time as it was the wrong data. I need back track
to your previous suggestion and recreate.

Within the big spreadsheet, for 50th, I get 0. For 90th, I get 552.
 
Same thing with the shortened data set....

249,336, 370,341,336,341,461,341,360,552,579,579

If I pull the numbers out and do straight percentile I get 350, 578.
ie =PERCENTILE(range,0.50) ETC...which is what I would expect.
 
PAL,

I'm not sure why you are getting a zero - perhaps you have duplicates, and the SUMPRODUCT is
actually summing for two or more rather than just finding one (that is one drawback to SUMPRODUCT).

Can you email me a book or sheet that shows the problem?

HTH,
Bernie
MS Excel MVP
 
Yes, I can send. Trying to remove the noise and the company name from
properties. Where to post? Thanks. My guess is your assumption is correct
 
PAL,

deitbe at consumer dot org

Bernie

PAL said:
Yes, I can send. Trying to remove the noise and the company name from
properties. Where to post? Thanks. My guess is your assumption is
correct
 
PAL,

It was that there are multiple rows returned by the sumproduct. You can
array enter:

=IF(Work!$B$2:$B$23="","",IF(COUNTIF(Work!$A$2:$A$23,A4)<=5,"",INDEX(Work!D:D,SUMPRODUCT(MAX((Work!$C$2:$C$23=(SMALL(IF(Work!$A$2:$A$23=A4,Work!$C$2:$C$23),COUNTIF(Work!$A$2:$A$23,A4)/2)))*(Work!$A$2:$A$23=A4)*ROW(Work!$A$2:$A$23))))))

The sumproduct in this will return the last entry that meets the criteria
(the MAX that I added), but since you are looking at identical numbers, it
doesn't really matter.

I will send you the fixed workbook example.

HTH,
Bernie
MS Exel MVP
 
Back
Top