an interesting read

  • Thread starter Thread starter Harlan Grove
  • Start date Start date
Harlan,

Thanks for posting that. It was interesting.
Also, we now an addition to the jargon: "Symbol Mutation error"

Regards,
Jim Cone
San Francisco, CA
 
Hi Harlan!

Interesting and an object lesson.

Perhaps we need an ability to turn off Intellisense but I think that
the main problem was inadequate checks of data on entry / import into
Excel. Both of the error types should have been picked up.
 
Norman Harker said:
Perhaps we need an ability to turn off Intellisense but I think that
the main problem was inadequate checks of data on entry / import into
Excel. Both of the error types should have been picked up.
....

Possibly, but how would one catch, say, 2000 or so missing SEPT2 entries out
of, say, 1,00,000 imported cells? One way would be to be to immediately
resave all imported data as plain text and compare the resaved text file to
the original imported one (using Unix-originated text processing tools like
diff, which has an option to ignore differences in whitespace). How many
naive people (biologists included) would actually do that?

While I agree that manual entry conversions should be and are easy to catch,
it's another thing entirely to catch 0.1% or 0.05% conversion error rates
importing data sets of hundreds of thousands or millions of observations.

THERE SHOULD BE A WAY TO TURN OFF EXCEL'S OH, SO HELPFUL AUTOMATIC
FORMATTING!
 
Hi Harlan!

I sympathise but it's ye olde problems of:

Computers allow us to stuff up more quickly.
Rubbish In Rubbish Out (A stolen expression from HiFi I believe)

Once you know of the possible existence of the problem the data can be
easily mass checked using formulas that check the data. What's wrong
with COUNT? A COUNT of the entire data range which is supposed to
contain text will tell me how many values I have. I can similarly
conditional format to highlight the stuffups.

But..
Re: "THERE SHOULD BE A WAY TO TURN OFF EXCEL'S OH, SO HELPFUL
AUTOMATIC FORMATTING!"

I AGREE!
 
Norman Harker said:
Computers allow us to stuff up more quickly.
Rubbish In Rubbish Out (A stolen expression from HiFi I believe)

A fine description of Excel's text file import functionality: speedy stuff
ups. The functionality comes *WITHOUT* warnings that Excel works the way it
does. It's not the functionality so much as the abysmal documentation that's
the real problem.
Once you know of the possible existence of the problem the data can be
easily mass checked using formulas that check the data. What's wrong
with COUNT? A COUNT of the entire data range which is supposed to
contain text will tell me how many values I have. I can similarly
conditional format to highlight the stuffups.

Yes, once the glitches are discovered, so it's only a question of how much
data gets fried before the problem is discovered.

With respect to using COUNT, if the data in were an arbitrary combination of
numeric strings and nonnumeric strings, using COUNT might not be much help.
It only works when all entries or a subset of known count contain nonnumeric
text. However, this is unlikely to be the case for much data fed into PCs on
RS-232 ports from external measuring devices.

Moral: if one is doing scientific research, one should use appropriate tools
for the task, not an overblown loan calculator (no matter how much fun it is
writing array formulas in it).
 
Hi Harlan!

All agreed. Especially the documentation aspects. Although I would
place greater emphasis on the researcher having to check on the
integrity of their data before they start analysing and interpreting
it. That's basic research skills and will reveal errors in the
original inputs as well as the errors that Excel has created for them.
 
Back
Top