Motivation of software professionals

  • Thread starter: Stefan Kiryazov
Lew wrote:
The point of my example wasn't that Y2K should have been handled earlier, but
that the presence of the bug was not due to developer fault but management
decision, a point you ignored.

At the time (70's etc) hard drive space was VERY expensive. All sorts
of tricks were being used to save that one bit of storage. Remember
COBOL's packed decimal?

So the decision to drop the century from the date was not only based on
management but on hard economics.

Which, I will grant, is not a technical decision, though the solution
was...

And at the time Y2K was created it was not a bug. It was a money saving
feature. Probably worth many millions.
 
I am not an expert at law, so I cannot reason about justification or
necessity. However, I do recall quite a few "mishaps" and software
bugs that cost both money and lives.
Let's see: a) Mariner I, b) 1982, an F-117 crashed, can't recall if
the pilot made it, c) the NIST has estimated that software bugs cost
the US economy $59 billion annually, d) 1997, radar software
malfunction led to a Korean jet crash and 225 deaths, e) 1995, a
flight-management system presented conflicting information to the
pilots of an American Airlines jet, who got lost and crashed into a
mountain, killing 159 people, f) the crash of the Mars
Polar Lander, etc. Common sense tells me that certain people bear
responsibility for those accidents.
http://catless.ncl.ac.uk/risks


How can anybody ignore this? Do more people have to die for us to
start educating software engineers about responsibility, liability,
consequences? Right now, CS students learn that an error in their
program is easily solved by adding carefully placed printf()'s or
running inside a debugger, and that the worst consequence if the TA
discovers a bug in their project solution is losing maybe a tenth of
the credit for the assignment.

I was exposed to the same mentality, but it's totally ****ed up.



So what? We already know how to write more reliable software, it's
just that we don't care.
 
Leif said:
Not really. Remember, you can pack 256 years into a single 8-bit byte if
you want to, but in most cases of the Y2K problem people had stored a
resolution of 100 years into two bytes -- quite wasteful of space.
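
For concreteness, the difference looks something like this in C (the
struct names are invented for the sketch; only the space/range trade-off
matters):

#include <stdio.h>

/* The classic Y2K record stored the year as two decimal characters,
 * which costs two bytes and distinguishes only 100 years.  Storing the
 * year as an offset from some epoch in a single unsigned byte covers
 * 256 years in half the space. */
struct record_2digit {
    char year[2];          /* "00".."99" -- the century is implied */
};

struct record_offset {
    unsigned char year;    /* years since 1900: 0..255 covers 1900-2155 */
};

int main(void)
{
    struct record_2digit a = { { '8', '4' } };
    struct record_offset b = { 84 };

    printf("two-digit field: %c%c (century unknown)\n", a.year[0], a.year[1]);
    printf("offset field:    %d -> %d\n", b.year, 1900 + b.year);
    return 0;
}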

In some cases it came from too tight an adherence to the manual business
process that was modeled -- remember the paper forms with "19" pre-printed
and then two digits worth of space to fill out? Those got computerised
and the two-digit year tagged along.

In other cases it boiled down to "this is how we've always done it."

As to the first case, one large app I am familiar with is a J2EE
modernization of a legacy mainframe program. The business users, for
decades, have produced lots of paper, mostly for auditing reasons. All
the printed output has gone into folders. When the mainframe program was
written it included the folder concept, although the application
did not need it at all. Even better, when the application was
modernized a few years ago, into J2EE, the developers at the time made
strong representations to business to leave the folder concept, and all
of its associated baggage, completely out. No such luck - it's still
there. The folder concept has to be maintained in the application and
carried through transactions, all the book-keeping related to folders
is duplicated there, and it's completely irrelevant to the
proper functioning of the application. In fact it's completely
irrelevant, period, but try selling that.

This kind of thing happens all the time, and it results from a failure
to identify and model the real business processes, as opposed to the
things that simply look like business processes. Back when they had
typewriters the folders made sense...like in the '60's.

AHS
 
Leif said:
Imagine the cook at a soup kitchen storing raw and fried
chicken in the same container.

Or imagine a company giving away a free game as a marketing
stunt and the game turns out to have been infected with a virus
that formats the users' hard-drives.

Or imagine the author of an open-source product not paying
sufficient attention and accepting a patch from a third party
which turns out to have included a backdoor, providing full
access to any system where the program is running.

This is what I am getting at, although we need to have Brian's example
as a baseline. In this day and age, however, I'm not convinced that a
person could even give away a free car (it wouldn't be free in any case,
it would still get taxed, and you'd have to transfer title) and be
completely off the hook, although 99 times out of 100 I'd agree with
Brian that it's not a likely scenario for lawsuits.

With software the law is immature. To my way of thinking there are some
implied obligations that come into effect as soon as a software program
is published, regardless of price. Despite all the "legal" disclaimers
to the effect that all the risk is assumed by the user of the free
software, the fact is that the author would not make the program
available unless he believed that it worked, and unless he believed that
it would not cause harm. This is common sense.

I don't know if there is a legal principle attached to this concept, but
if not I figure one will get identified. Simply put, the act of
publishing _is_ a statement of fitness for use by the author, and to
attach completely contradictory legal disclaimers to the product is
somewhat absurd.

It's early days, and clearly software publishers are able to get away
with this for now. But things may change.

AHS
 
You say that like the developers were at fault.  I cannot tell you how many
times I've seen management overrule developers who wanted to make things
right.  It's been the overwhelming majority of the time, though.  I recall a manager in
1982 refusing to let a team fix the Y2K bug in the project.  Many good
developers have grown resigned to the policies and have given up pushing for
quality.  Many more use stealth quality - they simply don't tell management
they're doing things in an unauthorized way that's better than the official
process.  Only rarely in the last thirty years have I encountered
management alignment with known best practices.

If management overrules developers there should also be a clear,
concise, legal way of assuming responsibility. I'm not blaming
developers; I'm saying they shouldn't be exempt from law (which they
aren't) but they should also be aware of that (which they aren't).
Nearly all projects I've worked on involved many programmers, dozens even.
Parts are written independently of each other, often over a period of years.
Often each part tests perfectly in isolation, and bugs only emerge
under production conditions.

Same old, same old. Bugs of that type emerge when a module is used in
a way not compliant with its interface specification. There's still
someone to blame - the moron that didn't RTFM.
Many of those projects had large test teams.  Products have passed all the
tests, yet still failed to meet spec in production.

There's an easy explanation for that. Most of the time, software is
written to satisfy tests, particularly so in TDD. "Our software passes
the tests, because it was made to pass the tests. Ergo, it works." and
then they gasp in amazement at the first bugs.
Sometimes the provided test environment differed significantly from the
production environment.

And dozens of other factors that must be taken into account.
Carelessness leads to errors. Sometimes fatal ones.
Before you make the developer liable, you'd better darn well be certain the
developer is actually the one at fault.

I've already said that this is neither my job nor my interest.
That's why law officials, judges, law systems, etc. exist. But it's a
reality we have to educate ourselves about.
 
No, it was a bug that wasted a byte and threw away data. And it's still
a bug - some of the "solutions" adopted by the industry just shifted the
problem on a little, by using a "century window" technique. That will
catch up with us eventually.
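
For anyone who hasn't met it, one common form of the window is a fixed
pivot, roughly like this (the pivot value here is arbitrary; real
systems pick their own):

#include <stdio.h>

/* Fixed "century window": two-digit years below the pivot are read as
 * 20xx, the rest as 19xx.  Old data becomes usable again, but anything
 * outside the 1950-2049 window is silently misread -- the problem is
 * deferred, not fixed. */
#define PIVOT 50

static int expand_year(int yy)          /* yy in 0..99 */
{
    return (yy < PIVOT) ? 2000 + yy : 1900 + yy;
}

int main(void)
{
    printf("%d %d %d\n", expand_year(84), expand_year(10), expand_year(49));
    /* prints 1984 2010 2049 -- a 1949 date stored as "49" comes back wrong */
    return 0;
}
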
Let's not forget that up to some time in the '90s COBOL could not read the
century, which created a blind spot about four-digit years in many IT
people, COBOL being the language of choice for many mainframe systems
(and a lot of minicomputers too, thanks to the quality of the Microfocus
implementation).

Until CODASYL changed the language spec, some time in the mid '90s, the
only way you could get the date from the OS was with the "ACCEPT CURRENT-
DATE FROM DATE." where CURRENT-DATE could only be defined as a six digit
field:

01  CURRENT-DATE.
    05  CD-YY    PIC 99.
    05  CD-MM    PIC 99.
    05  CD-DD    PIC 99.
 
Packed decimal (the COBOL COMP-3 datatype) wasn't a "COBOL" thing; it
was an IBM S370 "mainframe" thing. IBM's 370 instructionset included a
large number of operations on "packed decimal" values, including data
conversions to and from fixedpoint binary, and math operations.
You're right that it's an IBM thing, but it goes further back than the S/370.
I'm unsure about the 1400 series, but I know for sure that the smaller S/360s,
model 30 for instance, and several of the other IBM small business
machines, e.g. System/3 and System/36, could *ONLY* do packed decimal
arithmetic.
IBM's COBOL took advantage of these facilities with the (non-ANSI)
COMP-3 datatype.
It had to: you couldn't have run COBOL on the smaller machines if it
hadn't done so.
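
For anyone who hasn't seen the format: packed decimal holds two BCD
digits per byte, with the sign in the final nibble (0xC positive, 0xD
negative). A rough C sketch of the packing, with the buffer size chosen
just for the example:

#include <stdio.h>
#include <string.h>

/* Pack a value into packed-decimal (COMP-3-style) bytes: two digits per
 * byte, sign in the low nibble of the last byte.  +1987 in three bytes
 * comes out as 01 98 7C. */
static void pack_decimal(long value, unsigned char *out, int nbytes)
{
    int sign = (value < 0) ? 0xD : 0xC;
    unsigned long v = (value < 0) ? (unsigned long)(-value) : (unsigned long)value;
    int i;

    memset(out, 0, (size_t)nbytes);
    out[nbytes - 1] = (unsigned char)(((v % 10) << 4) | (unsigned)sign);
    v /= 10;
    for (i = nbytes - 2; i >= 0; i--) {
        out[i] = (unsigned char)(v % 10);          /* low digit  */
        v /= 10;
        out[i] |= (unsigned char)((v % 10) << 4);  /* high digit */
        v /= 10;
    }
}

int main(void)
{
    unsigned char buf[3];
    pack_decimal(1987, buf, (int)sizeof buf);
    printf("%02X %02X %02X\n", buf[0], buf[1], buf[2]);   /* 01 98 7C */
    return 0;
}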
 
With software the law is immature. To my way of thinking there are some
implied obligations that come into effect as soon as a software program
is published, regardless of price. Despite all the "legal" disclaimers
to the effect that all the risk is assumed by the user of the free
software, the fact is that the author would not make the program
available unless he believed that it worked, and unless he believed that
it would not cause harm. This is common sense.

Common sense has the interesting attribute that it is frequently totally
wrong.

I have published a fair amount of code which I was quite sure had at
least some bugs, but which I believed worked well enough for recreational
use or to entertain. Or which I thought might be interesting to someone
with the time or resources to make it work. Or which I believed worked in
the specific cases I'd had time to test.

I do believe that software will not cause harm *unless people do something
stupid with it*. Such as relying on it without validating it.
I don't know if there is a legal principle attached to this concept, but
if not I figure one will get identified. Simply put, the act of
publishing _is_ a statement of fitness for use by the author, and to
attach completely contradictory legal disclaimers to the product is
somewhat absurd.

I don't agree. I think it is a reasonable *assumption*, in the absence of
evidence to the contrary, that the publication is a statement of *suspected*
fitness for use. But if someone disclaims that, well, you should assume that
they have a reason to do so.

Such as, say, knowing damn well that it is at least somewhat buggy.

Wind River Linux 3.0 shipped with a hunk of code I wrote, which is hidden
and basically invisible in the infrastructure. We are quite aware that it
had, as shipped, at least a handful of bugs. We are pretty sure that these
bugs have some combination of the following attributes:

1. Failure will be "loud" -- you can't fail to notice that a particular
failure occurred, and the failure will call attention to itself in some
way.
2. Failure will be "harmless" -- operation of the final system image
built in the run which triggered the failure will be successful because
the failure won't matter to it.
3. Failure will be caught internally and corrected.

So far, out of however many users over the last year or so, plus huge amounts
of internal use, we've not encountered a single counterexample. We've
encountered bugs which had only one of these traits, or only two of them,
but we have yet to find an example of an installed system failing to operate
as expected as a result of a bug in this software. (And believe me, we
are looking!)

That's not to say it's not worth fixing these bugs; I've spent much of my
time for the last couple of weeks doing just that. I've found a fair number
of them, some quite "serious" -- capable of resulting in hundreds or thousands
of errors... All of which were caught internally and corrected.

The key here is that I wrote the entire program with the assumption that I
could never count on any other part of the program working. There's a
client/server model involved. The server is intended to be robust against
a broad variety of misbehaviors from the clients, and indeed, it has been
so. The client is intended to be robust against a broad variety of
misbehavior from the server, and indeed, it has been so. At one point in
early testing, a fairly naive and obvious bug resulted in the server
coredumping under fairly common circumstances. I didn't notice this for two
or three weeks because the code to restart the server worked consistently.
In fact, I only actually noticed it when I noticed the segfault log messages
on the console...

A lot of planning goes into figuring out how to handle bad inputs, how
to fail gracefully if you can't figure out how to handle bad inputs, and so
on. Do enough of that carefully enough and you have software that is at
least moderately durable.
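
A trivial sketch of that style (the function, its limits, and the error
codes are all invented for illustration):

#include <errno.h>
#include <stdlib.h>

/* Never trust the other end: validate everything, and when the input
 * makes no sense, hand back an error the caller must deal with instead
 * of pressing on with a guess. */
enum parse_result { PARSE_OK, PARSE_BAD_INPUT, PARSE_OUT_OF_RANGE };

static enum parse_result parse_port(const char *text, int *port)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return PARSE_BAD_INPUT;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return PARSE_BAD_INPUT;        /* garbage or overflow: reject loudly */
    if (value < 1 || value > 65535)
        return PARSE_OUT_OF_RANGE;     /* syntactically fine, semantically not */

    *port = (int)value;
    return PARSE_OK;
}

Multiply that by every message the client and server exchange, and every
failure path, and you get the "moderately durable" result.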

-s
p.s.: For the curious: It's something similar-in-concept to the "fakeroot"
tool used on Debian to allow non-root users to create tarballs or disk images
which contain filesystems with device nodes, root-owned files, and other
stuff that allows a non-root developer to do system development for targeting
of other systems. It's under GPLv2 right now, and I'm doing a cleanup pass
after which we plan to make it available more generally under LGPL. When
it comes out, I will probably announce it here, because even though it is
probably the least portable code I have EVER written, there is of course a
great deal of fairly portable code gluing together the various non-portable
bits, and some of it's fairly interesting.
 
Same old, same old. Bugs of that type emerge when a module is used in
a way not compliant with its interface specification. There's still
someone to blame - the moron that didn't RTFM.

cf. Ariane 5

you're assuming there *is* an interface specification. And that it is
unambiguous. I submit that unless these things are written *very*
carefully there are going to be odd interactions between sub-systems.


whilst test teams are good they are only half (or less) of the
solution.


the testing was inadequate then. System test is supposed to test
compliance with the requirement.
There's an easy explanation for that.
maybe


Most of the time, software is
written to satisfy tests, particularly so in TDD. "Our software passes
the tests, because it was made to pass the tests. Ergo, it works." and
then they gasp in amazement at the first bugs.

there is confusion between two types of testing. TDD is about
producing "an executable specification". You come close to "proving"
that the software does what you expect of it.
Of course "what you expect" ain't necessarily what the customer asked
for (and probably a million miles away from what he /wanted/!). The
System Test people do black box testing (no access to internals) and
demonstrate that it meets the requirement. The customer then witnesses
a System Acceptance Test (often a cut-down version of System test plus
some goodies of his own (sometimes just ad hoc "what does this do
then?")).

Skipping either of these leads to problems. TDD-type tests don't test
against the requirement (Agile people often purport to despise formal
requirements http://en.wikipedia.org/wiki/Big_Design_Up_Front). And they
often run in non-production environments. Maybe even on the wrong
hardware. Relying only on System Test leads to subtle internal
faults. "it goes wrong when we back up exactly 32 characters on a
message of exactly this size". Systems put together out of untested
components are a ***GIT*** to debug.


oh yes. Sometimes we don't see some of the hardware until we are on a
customer site.
 
I'm terribly sorry, but I didn't get your point, if there was one.
Seriously, no irony at all. Care to elaborate?

oh, sorry. You were listing "software bugs that cost both money and
lives", I thought your list was a bit light (Ariane and Therac spring
to mind immediately). I thought you might not have come across the
RISKS forum, which discusses many computer-related (and often
software-related) bugs.
 
the testing was inadequate then. System test is supposed to test
compliance with the requirement.
Quite. System tests should at least be written by the designers, and
preferably by the commissioning users.

Module tests should NOT be written by the coders.
The System Test people do black box
testing (no access to internals) and demonstrate that it meets the
requirement. The customer then witnesses a System Acceptance Test (often
a cut-down version of System test plus some goodies of his own
(sometimes just ad hoc "what does this do then?")).
These are the only tests that really count apart from performance testing.

It's really important that the project manager keep an eye on all levels
of testing and especially on how the coders design unit tests or it can
all turn to worms.
 
Is it? What about the software that controls the locks, cars, and
airplanes?

As I said, only a few places. Try to find insurance against Word
corrupting your PhD thesis, and you'll understand.

Markus
 
Imagine the cook at a soup kitchen storing raw and fried
chicken in the same container.

Or imagine a company giving away a free game as a marketing
stunt and the game turns out to have been infected with a virus
that formats the users' hard-drives.

Or imagine the author of an open-source product not paying
sufficient attention and accepting a patch from a third party
which turns out to have included a backdoor, providing full
access to any system where the program is running.

You clipped what I wrote about revealing known problems.
If I give someone a twenty-dollar bill and it later turns out
to be counterfeit when they use it, I don't think they will
sue me unless they have some reason to believe I knew it was
counterfeit.



Brian Wood
http://webEbenezer.net
(651) 251-9384

"And David longed, and said, Oh that one would give me
drink of the water of the well of Bethlehem, that is at
the gate!" 1 Chronicles 11:17
 
Yes, because that doesn't really matter when it comes to
legal liability.

Well, that's not entirely true -- if someone can prove that
you _did_ know about a flaw in the product, you're going to
be in hot water. But merely showing that you were ignorant
of a flaw isn't sufficient to absolve you of liability: the
cook in the soup kitchen isn't going to _know_ that the food
he serves is contaminated with salmonella and Toyota didn't
_know_ that their accelerator pedals were faulty.

It's not whether you're aware of flaws in your work that
matters, but whether you did your job properly and with
due diligence.


That is true in a traditional model of exchanging
money for a product or service. If you don't pay
for the good or service, you have no "rights."
If someone asks me for money and I unknowingly give
them a counterfeit bill, for them to become angry
with me for that would be wrong on their part.
At that point you just stop having contact with them.


Brian Wood
http://webEbenezer.net
(651) 251-9384
 
That's quite simply not correct.

It had better become correct, if we don't want to trash all our economies
again.

If you have liabilities to people who grabbed free stuff off the internet
labeled as providing no warranty, no one can afford to give anything away,
and it turns out that there's a huge economic efficiency boost to allowing
people to give away software. Solution: Let people give away software
with no liability or warranty.

-s
 
Lew wrote:
Pretty well everything I saw back in 1982 was out of use by
1999. How much software do you know that made the transition?
Let's see... Operating systems. The PC world was... umm... CP/M
80? Maybe MS-DOS 1.0? And by 1999 I was working on drivers
for Windows 2000. That's at least two, maybe three depending on
how you count it, ground-up re-writes of the OS.
With that, almost all the PC apps had gone from 8-bit versions
in 64 KB of RAM to 16-bit DOS to Windows 3.1 16-bit with
non-preemptive multitasking and finally to a 32-bit app with
multi-threading and pre-emptive multitasking running in
hundreds of megs.
OK, so how about embedded stuff? That dot-matrix printer
became a laserjet. The terminal concentrator lost its RS232
ports, gained a proprietary LAN, then lost that and got
ethernet. And finally evaporated in a cloud of client-server
computing smoke.

The "standard" life of a railway locomotive is thirty or fourty
years. Some of the Paris suburbain trainsets go back to the
early 1970's, or earlier, and they're still running.
I'm not so up on the mainframe world - but I'll be surprised
if the change from dumb terminals to PC clients didn't have a
pretty major effect on the software down the back.

Have you been to a bank lately, and seen what the clerk uses to
ask about your account? In more than a few, what you'll see on
his PC is a 3270 emulator. Again, a technology which goes back
to the late 1960's/early 1970's.
Where do you get your conclusions that there was much software
out there that was worth re-writing eighteen years ahead of
time?

It depends on what you're writing, but planned obsolescence
isn't the rule everywhere.
 
At the time (70's etc) hard drive space was VERY expensive.
All sorts of tricks were being used to save that one bit of
storage. Remember COBOL's packed decimal?
Packed decimal (the COBOL COMP-3 datatype) wasn't a "COBOL"
thing; it was an IBM S/370 "mainframe" thing. IBM's 370
instruction set included a large number of operations on
"packed decimal" values, including data conversions to and
from fixed-point binary, and math operations. IBM's COBOL took
advantage of these facilities with the (non-ANSI) COMP-3
datatype.

Packed decimal and COBOL are a lot older than the S/370.
Although the 1401 used unpacked decimal, I believe it's been
available on a lot of machines since then.
 
Where Brian's example falls down is that the previous owner of
the car is, in effect, just a reseller: he isn't likely to
have manufactured the car or modified it to any degree.
However, let us assume that he _has_ done modifications to the
car such as, say, replacing the fuel tank. If he messed up the
repair and, without realising it, turned the car into a
potential firebomb, he would be liable for this defect even if
he gave the car away free of charge.

He doesn't even have to have done that much. If he knows that
the brakes don't work, and he lets you drive it, he's legally
responsible.
I don't think the law is immature when it comes to software.
Ultimately, software is covered by the same laws as Ford
Pintos. That said, the legal practice might be lagging behind,
as might the market and users' awareness of legal rights and
duties.

It is, because there's relatively little jurisprudence. That's
one of the things that makes liability insurance for a
contractor so expensive---the insurance company doesn't really
know how much they're risking (so they assume the worst).
 