Fastest method to combine 3 large lists into 1 multidimentional ar

  • Thread starter Thread starter Dalt
  • Start date Start date
D

Dalt

Lets say I have 2 text files with 2 rows of data each. The first file
contains the "Occupation", the 2nd file contains the "Salary" and the 3rd
file contains the "Hire Date". Each file contains a "Name column" which
would be the "key" of sorts for each entry.

File 1 contains 15,000 entries
File 2 contains 15,000 entries
File 3 contains 15,000 entries

What is the fastest way, through VBA code only (no database table joins,
etc) to join the 3 lists together into either an excel file or
multidimentional array that could be thrown into an excel file.
 
I guess I'm particularly confused as to why you're saying you don't want to
do any database table joins here?

Do you really want avoid using the great built in database engine to
accomplish your task? Why?

You also have to further explain what you mean by a multidimensional table
for your results.

For example:
File 1 contains 15,000 entries

Is the above 15,000 unique and separate names, or do the names overlap? In
other words will you have a name repeated several times if the person has
more than one occupation?. This is a significant detail needed and it will
decide very much how we approach this problem.

I really can't see much of a practical solution here in which we avoid using
at least a table and some SQL joins.

Keep in mind that what we could perhaps come up with a hundred percent
coding "in memory" solution here, but the problem is then you'll have to
write code to take the data out of that arrary and write the data out to a
table suitable for its port to excel.

So, the first detail we need is in each of the three files is the name value
repeated several times? And, if so then when a person has several
occupations what is the resulting file structure to look like?

As a general rule, it would seem to me that if the person has three possible
occupations, then we would need two tables to store the resulting imported
data (Or else we'll have to come up with some kind of naming convention for
the additional columns).

If the name is NOT repeated more then once in any of the files, then you
could simply import all three text files and then build a sql join which
would then allow you to export the resulting table as an excell sheet (And
this would result in very little if any coding on your part).

so, if name data is to be repeated, then I suggest building one table with
unique index on the name, and import to that the one text file that has this
list of names . (for example, what happens if the second file does not have
a name that was found in the first file, can this occur also?).

For all realistic purposes, I would suggest that you use some SQL and joins
here, otherwise you're gonna have to write a lot of messy code, and I see
absolutely no reason to do that.
 
Dalt said:
Lets say I have 2 text files with 2 rows of data each. The first file
contains the "Occupation", the 2nd file contains the "Salary" and the 3rd
file contains the "Hire Date". Each file contains a "Name column" which
would be the "key" of sorts for each entry.

File 1 contains 15,000 entries
File 2 contains 15,000 entries
File 3 contains 15,000 entries

What is the fastest way, through VBA code only (no database table joins,
etc) to join the 3 lists together into either an excel file or
multidimentional array that could be thrown into an excel file.


Assuming the names are the same in each list, use the
strengths of a database to do it with a quick SQL statement:

First, use the File - Get External Data menu item to link to
the three files as if they were tables. Then create a query
like:

SELECT f1.name, f1.Occupation, f2.Salary, f3.[Hire Date]
FROM (f1 INNER JOIN f2 ON f1.name = f2.name)
INNER JOIN f3 ON f1.name = f3.name

Then use the File - Export menu item to send the query to
Excel
 
It is extremely difficult to believe that in a list of 15000 names you would
not have duplicates that actually belong to different people. In this case,
unless the records are in the exact same sequence in each of the 3 files
(don't know why you start out stating you have 2 text files), I'm not sure
how you would accomplish your goal. That is why most database developers
come up with a numeric value (surrogate key) to identify individual people,
and perpetuate that value throughout their applications, rather than using
peoples names.

--
HTH
Dale

email address is invalid
Please reply to newsgroup only.
 
It's not that I don't want to do table joins or don't know how, i'm just
trying to learn a good method of combining lists through sheer code. In
other words, what coding principals are at work when several tables are
joined together?

For one, I guess it's much faster to combine lists when they're sorted
alphabetically so you can do a binary search for items within an array
instead of standard for loop. So I'm guessing a good start to manually
joining 3 "tables" would be to sort them first.

Again, I realize a table join solution is the best when it's available
(which is usually) but i'm thinking beyond access at this point. Forgive me,
maybe this wasn't the right place to post this.
 
Dalt said:
It's not that I don't want to do table joins or don't know how, i'm just
trying to learn a good method of combining lists through sheer code. In
other words, what coding principals are at work when several tables are
joined together?

Ok, great. Now you've clarified your goals and what you're trying to
accomplish here, and put it in the context that you just trying to learn and
thinking of some different techniques to use here, then we can help you.

Remember when people come here to ask for help, and say they have to do
something a particular way (and likely a wrong way), if you don't explain
why you want to do something that different way, then we feel obliged to
give you good advice as to the correct approach. So you ask for medical
advice on how to gain weight when weight gain is not needed, you not
likely get anyone willing to give you that bad advice. So, people here are
naturally against giving
advice to do things the wrong way.

So lets take a crack at this problem now that we all realize it would be
easier, less work and the code likey would run faster if we used
sql quires. (and, further since the data would be in a table, then we
could easily export the data back out to a text file.
For one, I guess it's much faster to combine lists when they're sorted
alphabetically so you can do a binary search for items within an array
instead of standard for loop. So I'm guessing a good start to manually
joining 3 "tables" would be to sort them first.

Actually I would use a collection. The reason why it's better to use a
collection is that you can pull a value out of this dynamic structure based
on a key value without having to use a for/next loop to scan the data.

Were are also going to assume that for each of the three files is no
duplicate names. (each name appears once in each of the 3 files).

The following code would join the data
together. Were also going to assume that each file has two columns and is
comma delimited.

Name,Value
Albert,Steel Worker
Joe,Plumber

The code would look like:

Public Sub MyTest()


Dim cList As New Collection
Dim intFile As Integer
Dim strOneLine As String
Dim strName As String
Dim strValue As String
Dim itemData(1 To 3) As String

' read in 1st file (name + occupation

intFile = FreeFile
Open "c:\datatest\test1.txt" For Input As intFile
Do While EOF(intFile) = False
Line Input #intFile, strOneLine
strName = Split(strOneLine, ",")(0)
strValue = Split(strOneLine, ",")(1)
itemData(1) = strName
itemData(2) = strValue
cList.Add itemData, strName
Loop
Close intFile

' 2nd file (name + Salary)
intFile = FreeFile
Open "c:\datatest\test2.txt" For Input As intFile
Do While EOF(intFile) = False
Line Input #intFile, strOneLine
strName = Split(strOneLine, ",")(0)
strValue = Split(strOneLine, ",")(1)
cList(strName)(2) = strValue
Loop
Close intFile

' 3nd file (name + HireDate)
intFile = FreeFile
Open "c:\datatest\test3.txt" For Input As intFile
Do While EOF(intFile) = False
Line Input #intFile, strOneLine
strName = Split(strOneLine, ",")(0)
strValue = Split(strOneLine, ",")(1)
cList(strName)(3) = strValue
Loop
Close intFile

' at this point, you can display the results:
For Each itemData In cList
Debug.Print itemData(1), itemData(2), itemData(3)
Next

End Sub

The above is not tested code, but it should quite well do the trick...


--
Albert D. Kallal (Access MVP)
Edmonton, Alberta Canada
(e-mail address removed)












a primary key to plus the new one colu







so why would likely be fastidious SQL and table joins, the following code to
do your Perot would look something like the following



forehand of pull lightens out of the list without having to do a search
 
Back
Top