vlookup table array

  • Thread starter Thread starter Bob
  • Start date Start date
B

Bob

I have a valid vlookup formula to look up expense data
for individuals in an external workbook. The formula
looks something like this:

vlookup("Joe",[week 1.xls]expenses!A2:B100,2)

I have multiple external databases - [week 1], [week 2],
[week 3], ... I'd like to have a cell on my active
worksheet that contains the name of one of the external
workbooks - for example "week 3". I'd like to have my
vlookup formula reference that cell to reflect the
external data from that workbook. If cell A1
contains "week 3", the the formula would look something
like this:

vlookup("Joe",CONCATENATE("[",CELL("contents",A1),".xls]
expenses!A2:B100),2)

But this returns a #VALUE! error.
 
Hi Bob
Fist an importan prerequisite to achieve the following: Your workbooks
have to be opened! This solution won't work with closed files.
but now to your formula. Try
=vlookup("Joe",INDIRECT("'[" & A1 & ".xls]expenses'!A2:B100"),2)
where A1 stores your workbook name. note the multiple apostrophes at
the beginning:
" ' [ " (with spaces between the single characters)

HTH
Frank
 
I have a valid vlookup formula to look up expense data
for individuals in an external workbook. The formula
looks something like this:

vlookup("Joe",[week 1.xls]expenses!A2:B100,2)

I have multiple external databases - [week 1], [week 2],
[week 3], ... I'd like to have a cell on my active
worksheet that contains the name of one of the external
workbooks - for example "week 3". I'd like to have my
vlookup formula reference that cell to reflect the
external data from that workbook. If cell A1
contains "week 3", the the formula would look something
like this:

vlookup("Joe",CONCATENATE("[",CELL("contents",A1),".xls]
expenses!A2:B100),2)

But this returns a #VALUE! error.

First explanation. All that CONCATENATE produces is a text string. A text string
isn't a range reference. If it were, there'd be no way to use text strings like
"A1" as just text.

If the other workbook were open, you could put your CONCATENATE expression
inside INDIRECT. INDIRECT converts text looking like range references into range
references, but 'ranges' - at least as Excel considers them - only exist in open
workbooks. If you need to access multiple cell ranges in closed workbooks, see
the third alternative in the following.

http://www.google.com/[email protected]
 
Fist an importan prerequisite to achieve the following: Your workbooks
have to be opened! . . .

No they don't. At least not generally. Only when relying on INDIRECT.

There are times when udfs are necessary for unrestricted functionality.
 
Harlan said:
...

No they don't. At least not generally. Only when relying on INDIRECT.

There are times when udfs are necessary for unrestricted
functionality.

Hi Harlan
that was just the thing I tried to express, though due to my non native
usage of the english language I failed including the INDIRECT part :-)
One question though regarding this topic (looking up values from closed
workbooks): Have you tried using SQL.REQUEST to fetch values based on
conditions from closed workbooks as 4th alternative. Just curious as
this functions seems to achieve the desired result (though not tested
on my side)

Regards
Frank
 
Harlan Grove said:
...

No they don't. At least not generally. Only when relying on INDIRECT.

As Frank also wrote '... This solution won't work with closed files ...' ,
note 'This solution', and he was using INDIRECT, your comment seems hasty
and irrelevant.
 
As Frank also wrote '... This solution won't work with closed files ...' ,
note 'This solution', and he was using INDIRECT, your comment seems hasty
and irrelevant.

Since it's purely a semantic point, you're probably correct that it's
irrelevant. Not that I've ever let irrelevance stop me. Frank was explicit about
the need for the workbook to be open in order to use INDIRECT. However, in
*GENERAL* it *ISN'T* necessary for workbooks to be open in order to pull data
from them using dynamically constructed references.
 
...
...
One question though regarding this topic (looking up values from closed
workbooks): Have you tried using SQL.REQUEST to fetch values based on
conditions from closed workbooks as 4th alternative. Just curious as
this functions seems to achieve the desired result (though not tested
on my side)

First, SQL.REQUEST uses SQL and ODBC to pull data. It's my understanding (I
could be wrong) that when using it to pull data from Excel workbooks the range
accessed must appear like a database table and it must be a named range
(including field names in the top row). If your data in the other workbooks has
those characteristics, then SQL.REQUEST can pull that data even when that
workbook is closed. If your data doesn't look like that, then SQL.REQUEST can't.

Second, SQL.REQUEST is implemented as an .XLA file, specifically,

<office dir>\Library\MSQuery\XLODBC.XLA

containing wrapper functions that call routines in a DLL, specifically,

<office dir>\Library\MSQuery\XLODBC32.DLL

Since SQL.REQUEST is either written in VBA or XLM (I don't recall which, and it
may have changed in some recent version), you're stuck with the same sort of
performance hit using it as you would be using any other udf. So no basis for
assuming the call interface required by SQL.REQUEST would be any faster than the
call interface required by any other udf. Once you're in the the udf, what it
actually does could affect performance. SQL.REQUEST calls functions in a DLL.
It's possible XLODBC32.DLL does most of the work of extracting data from Excel
workbooks, but its file size (at least in Excel 97 SR-2) is only 67,584 bytes,
which makes it appear that it too is little more than a collection of wrapper
functions calling functions in other DLLs. While one or two DLL calls may be
fast, possibly dozens of DLL calls may not be.

The main alternative udf appoach involves creating a separate instance of the
Excel application, and using it to call its ExecuteExcel4Macro method with an
external reference as its argument. The drag here, over and above the udf call
interface, is creating the second Excel application instance and executing the
ExecuteExcel4Macro method. Creating the second Excel application instance isn't
quick, though that could mitigated by using code to leave it open until the
calling workbook is closed. That done, only the initial call to such a udf and
closing the workbook containing the last formula calling such a udf would incur
a performance penalty. That leaves the ExecuteExcel4Macro method call. While I
haven't tested it against SQL.REQUEST, it seems to me that ExecuteExcel4Macro
would make direct use of Excel itself to pull data, whereas SQL.REQUEST would
need to use a circuitous chain of DLL function calls to achieve the same result.
While only performance testing would prove this, I have to believe the
ExecuteExcel4Macro method call would run circles around the various DLL calls
made by SQL.REQUEST.

FWIW, MOREFUNC.XLL's INDIRECT.EXT also uses a separate Excel application
instance to do its magic, so whatever drag incurred by udfs using such is shared
by INDIRECT.EXT.

If you want to test the performance of SQL.REQUEST to, say, my pull() udf, feel
free. As long as you test them on ranges both with *AND* without field names in
the top row. One method that works all the time, even if it's slow, will tend to
beat out multiple quicker but more specialized alternatives.
 
Hi
thanks for your detailed answer!
Harlan said:
First, SQL.REQUEST uses SQL and ODBC to pull data. It's my
understanding (I could be wrong) that when using it to pull data from
Excel workbooks the range accessed must appear like a database table
and it must be a named range (including field names in the top row).
If your data in the other workbooks has those characteristics, then
SQL.REQUEST can pull that data even when that workbook is closed. If
your data doesn't look like that, then SQL.REQUEST can't.

Due to my testing I fully agree with your assumption. The source range
has to look like a database

[snip]
If you want to test the performance of SQL.REQUEST to, say, my pull()
udf, feel free. As long as you test them on ranges both with *AND*
without field names in the top row. One method that works all the
time, even if it's slow, will tend to beat out multiple quicker but
more specialized alternatives.

You're right in respect to robustness/flexibility. The UDF approach
beats SQL.REQUEST in this respect. I did some small performance testing
using both alternatives (without optimizing your pull UDF / that is
leaving the workbook opened). Both are REALLY slow on a large lookup
range (lets say 2000 columns) with small advantage for the UDF (seems
your assumption regarding multiple DLL calls could be correct).
So IMHO this kind of application/issue should be solved by a database
application (and returning the results to Excel if required) if
performance is an issue

So thanks again for taking your time to give me / the NG some insights

Regards
Frank
 
...
...
So IMHO this kind of application/issue should be solved by a database
application (and returning the results to Excel if required) if
performance is an issue
...

Depends. The OP's stated task is a simple VLOOKUP. If there are several such
VLOOKUP calls accessing the closed workbook(s), and if they're relatively static
(constants as 3rd arguments, only infrequently changing first arguments), then
pulling a table from a database may be optimal. However, the assumption of
relative unchageability means that udfs would recalc rarely, so the performance
drag would be minimal after initial calculation.

Returning to the DBMS alternaive, in this case the 'database file format' is an
Excel workbook, and no DBMS I'm aware of uses .XLS files as a native file
format. So any database access would, I believe, need to pass through several
layers of DLL calls ultimately leading back to Excel. If so, then presumed speed
improvements due to using a DBMS would prove difficult to realize.

If the results from these VLOOKUP calls could change frequently, then, like it
or not, accept it or not, there's no practical alternative to . . .
 
Harlan said:
Returning to the DBMS alternaive, in this case the 'database file
format' is an Excel workbook, and no DBMS I'm aware of uses .XLS
files as a native file format. So any database access would, I
believe, need to pass through several layers of DLL calls ultimately
leading back to Excel. If so, then presumed speed improvements due to
using a DBMS would prove difficult to realize.

With a DBMS alternative I'd suggest to completly transfer the data to a
native database format (of course linking Excel worksheets is not a
good choice in respect to performance). That is using Excel only as a
final 'reporting engine'.
If this is a feasible/good alternative depends on the data structure,
the kind of lookups required, etc.
So you're right, if Excel is the application of choice, you have to
accept the performance issue.

Regards
Frank
 
...
...
If this is a feasible/good alternative depends on the data structure,
the kind of lookups required, etc.
...

And, perhaps most critically, the anticipated frequency of changing values from
the lookups. If they'd change frequently, e.g., in response to frequently
changing first arguments to VLOOKUP, using a DBMS for any part of this would
prove a slower approach than using Excel alone.
 
Harlan said:
...
..
..

And, perhaps most critically, the anticipated frequency of changing
values from the lookups. If they'd change frequently, e.g., in
response to frequently changing first arguments to VLOOKUP, using a
DBMS for any part of this would prove a slower approach than using
Excel alone.

Agreed! But I think now we have lost the original OP :-)
So as we don't know his detailed functional specification this is
(unfortunately) only an academic though enjoyable discussion.

Best regards
Frank
 
...
...
Agreed! But I think now we have lost the original OP :-)
Possibly.

So as we don't know his detailed functional specification this is
(unfortunately) only an academic though enjoyable discussion.

Oh, it's not so mysterious, convoluted or even underspecified. And 'academic'?

From the OP:

"vlookup("Joe",[week 1.xls]expenses!A2:B100,2)

I have multiple external databases - [week 1], [week 2],
[week 3], ... I'd like to have a cell on my active
worksheet that contains the name of one of the external
workbooks"

Seems pretty clear what's intended. The only possibly unstated details are
whether the first or third arguments might be not-necessarily-constant
expressions or cell references.

As for rendering this into a form a DBMS might be able to handle, looks like
it'd require creating several tables (one for each of the [week #.xls] files),
adding a field to each of them to store the week number, then creating a single
table as the union of these other tables. Now, as an added little complication,
if there were multiple matches for "Joe" in the original leftmost field of any
of these tables, then added query logic would be needed to pick off only the
first (topmost) matching field. This would not be a trivial bit of work to throw
at a DBMS.

As for not knowing with certainty the OP's needs/intentions precisely or in
detail, it's often either unnecessary or easy enough to infer. This is an
example of the latter. This really isn't a terribly difficult problem to solve
as long as you eschew any prejudices or biases with regard to approach.
 
I'm running into a similar issue when trying to use a dynamic file name. I have applied the indirect, but it gives me a #ref! error when I try to use it. Any suggestions?
 
Back
Top