Key Merge Functionality

Tom Ellison · Oct 29, 2003

I've decided to air this issue publicly.

First, my definition: Key Merge is what you need when you have two
rows in a Foreign Table, one of which is logically a duplicate of the
other. Both Foreign Table rows are presumed to have dependent rows.
The user has selected one of these two rows to be deleted, and all the
dependent rows of that row must be made dependent on the row which is
to remain.

In order to initiate such an action, I suggest the user will be
required to select a the Foreign Table row which is to be retained,
what I'll call the Target Row. Let's just do this simply by having
the user navigate to the Target Row and click a Command Button that
will store the Primary Key value of that row for future reference.

Then the user should navigate to the Foreign Table row that is to be
deleted, what I'll call the Delete Row. The user would then click
another Command Button to delete this row and perform the Key Merge by
updating all the values in dependent tables.

Now, it has been increasingly clear to me lately that this
functionality is needed frequently. However, I've never heard it
discussed, nor do I recall seeing it published anywhere.

The query work is pretty straight forward, changing the references in
the dependent rows which are set to the Delete Row's key to be the
value of the Target Row's key. Then the Delete Row may be deleted
without disturbing referential integrity.

I invite comment on this topic, especially the following questions:

- Do you see a wide need for this functionality?

- Have you implemented it before? Seen it implemented?

- Do you know of any web sites, articles, or books that address this?

- Did I explain the topic so you can understand it?

Any comments are welcome.

Thanks,
Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Michel Walsh · Oct 31, 2003

Hi,

- Do you see a wide need for this functionality? I do not think.

- Have you implemented it before? Seen it implemented? No.

- Do you know of any web sites, articles, or books that address this? No.

- Did I explain the topic so you can understand it? If you have explained
it ... or if it is me who didn't understand...

I am not sure rows must be dependent of other rows... that is against the
definition of a table. Ok, but assuming that is not necessary a fully
normalized table, and I would assume you have data about, say, Byzance,
Constantinople and again about Istanbul. Is it you want eliminate, say,
Byzance and also Constantinople to keep, say, the number of appearance of
the city name in a document:

Byzance 560 times
Constantinople 123 times
Istanbul 54 times

to get

Istanbul (560+123+54) ?

If that is the case, then I would use a translation table:

Translations ' table name
This IntoThat ' fields
Québec Quebec City
Byzance Istanbul
Constantinople Istanbul ' data sample

and I would use an outer join (or inner join if you repeat un-translated
info, like Istanbul -> Istanbul ) :

SELECT SUM(original.numberOrAppearance)
FROM original LEFT JOIN translations ON original.City = translations.this
GROUP BY COALESCE( translations.intoThat, original.City )

( or Nz( ) with Jet, instead of COALESCE ).

That is related to what you pictured?

Vanderghast, Access MVP

Tom Ellison · Nov 1, 2003

Dear Michel:

I see I have not explained myself well.

I have a new application in which this has happened, just since I
posted the original here.

Multiple users are entering data. There is performance data for many
suppliers of seed. What they are entering is harvest yield results.

For one seed supplier, say ABC Seeds, they have received reports for a
seed designated as 1234. Some other reports were received for seed
designated as ABC1234. After a bit of examination, it is determined
that these are the same seed, and we'd prefer to have all the reports
for this same seed to be called just 1234. So we desire to merge all
the reports that use the designation ABC1234 into the 1234
designation, and then remove the ABC1234 from the table of
designations.

The procedure is simply to UPDATE the dependent tables containing
ABC1234 by changing them to 1234. This step is very similar whether
you use a surrogate key or a natural key in the relationship.

Once there are no records dependent on ABC1234, that row is deleted.

A problem with using a translation is that it can become recursive:

A -> B
B -> C
C -> D

With the above, a reference to A then needs to end up as D, but only
by going through an indefinite number of steps. This isn't so easily
implemented in SQL.

read more below:

Hi,

- Do you see a wide need for this functionality? I do not think.

- Have you implemented it before? Seen it implemented? No.

- Do you know of any web sites, articles, or books that address this? No.

- Did I explain the topic so you can understand it? If you have explained
it ... or if it is me who didn't understand...

I am not sure rows must be dependent of other rows... that is against the
definition of a table. Ok, but assuming that is not necessary a fully
normalized table, and I would assume you have data about, say, Byzance,
Constantinople and again about Istanbul. Is it you want eliminate, say,
Byzance and also Constantinople to keep, say, the number of appearance of
the city name in a document:

I was specifically meaning rows in another table being dependent.
Just a typical one-to-many relationship is all I meant.

Byzance 560 times
Constantinople 123 times
Istanbul 54 times

to get

Istanbul (560+123+54) ?

If that is the case, then I would use a translation table:

Translations ' table name
This IntoThat ' fields
Québec Quebec City
Byzance Istanbul
Constantinople Istanbul ' data sample

and I would use an outer join (or inner join if you repeat un-translated
info, like Istanbul -> Istanbul ) :

SELECT SUM(original.numberOrAppearance)
FROM original LEFT JOIN translations ON original.City = translations.this
GROUP BY COALESCE( translations.intoThat, original.City )

( or Nz( ) with Jet, instead of COALESCE ).

That is related to what you pictured?

No, just as it relates to a typical one-to-many relationship between
tables.

Vanderghast, Access MVP

(for example, some user has accidentally misspelled the name of a
city, and there are dependent rows in another table for that
misspelled name. Another row in the FK table has the correct spelling
of the city, and it also has dependent rows in those same dependent
tables. Update all the dependent tables to change the reference to
the wrongly spelled city name (changing the reference to the correctly
spelled city name), then delete the wrongly spelled city name)
Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Michel Walsh · Nov 3, 2003

Hi Tom,

Indeed, it seems there is no much to be done given the recursion. I
would do the recursion on the small table supplying the conversions, and
once I would have got resolve the final resolutions:

A -> D
B -> D
C -> D

I would then engage the modifications in the large table, in one "step".

Vanderghast, Access MVP

Tom Ellison · Nov 3, 2003

Dear Michel:

I did, indeed, anticipate your response regarding the recursion.

My point is, this is all rather more complex than necessary. Again,
consider what is needed when the table is a list of city names. The
name of one city is spelled two different ways:

Bufalo, NY
Buffalo, NY

Both have dependent rows in other tables. Of course, when you report,
you don't want to have two separate results listed, especially if
there is aggregation.

All that is really needed is to update the dependent tables to use the
correct spelling, then delete the incorrect spelling. The user
interface to do this could look just like the interface needed to
create the conversions you suggest. However, subsequent query work
and its performance would be better if the table of city names were
just cleaned up.

On the other hand, there is a possible advantage to using the
conversion. If some user consistently misspells the city name, then
having the wrongly spelled city name in the table with the conversion
link to the correct spelling could be useful. I'm not sure I'd like
to promote that usage, but it's worth thinking on.

Whichever way this is implemented, it's still a surprise to me that
I've not seen any discussion on this before. Is it news to you, too?

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Michel Walsh · Nov 3, 2003

Hi,

It is something quite standard in Accounting, where companies own parts
of other companies and given regulations what kind of "transfer" can be
done, so what belong (tax credits, or part of "common" projects) initially
to CieA is now listed (exclusively, or not, or with different percentages)
under CieB (and furthermore CieA still exists! )... and the need of lawyers
is required, in addition to any SQL statements :-)

Since it is a domain
that can easily become tinted by... irrationality, politic and even
emotional business practices (premium linked to business revenue increase,
etc), it is not frequently discussed in SQL forums, but if you can come with
something being supported by the logic of SQL and incorporating the
versatility required by the potential customers (governments included), all
that with credibility and authority, you are a very rich man.

Vanderghast, Access MVP

Key Merge Functionality

Tom Ellison

Michel Walsh

Tom Ellison

Michel Walsh

Tom Ellison

Michel Walsh