On Sat, 4 Dec 2004 15:28:34 -0500, "Roger Wilco"
This is such a flammable topic, yet the underlying issue - the
program/data distinction - is crucial to safety. So here goes...
I know many programmers who don't know how stuff works.
Exhibit A: all those using C's string copy function (strcpy) as a
generic byte mover, and thus all those buffer overruns.
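
To make Exhibit A concrete, here is a small sketch of why C-string semantics are the wrong tool for moving arbitrary bytes. It uses Python's ctypes purely for illustration (my choice, nothing from the thread), and the payload is made up. A C string ends at the first NUL byte, so string functions such as strcpy stop there when the data has an embedded NUL, and run on past the end when it has none, since they never learn the size of the destination.

```python
import ctypes

# Hypothetical binary payload with an embedded NUL byte.
payload = b"AB\x00CD"

# A C char array holding the payload plus one trailing NUL (6 bytes total).
buf = ctypes.create_string_buffer(payload)

as_c_string = buf.value   # C-string view: stops at the first NUL
as_raw_bytes = buf.raw    # byte view: the whole array, NULs included

print(as_c_string)    # b'AB'
print(as_raw_bytes)   # b'AB\x00CD\x00'
```

The string view silently loses "CD". A byte mover needs an explicit length (memcpy-style), not a terminator.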
The point being that computer complexity means there are few who have
the range of understanding that spans from "big picture" to wires,
volts and milliseconds. Most work in a narrow band, e.g. bind
together these black boxes to create a solution, or code this module
(designed by someone else) using that language's library functions.
The trend is inescapable, and I found it robbed programming of its
allure, so I dropped out back in the early '90s. I can see the
attraction of virus writing; solo work, raw code, not much re-use of
other ppl's functions other than to abuse them, and simple code-level
objectives that are self-defined.
A W32 executable may need some translation to be made into an
executable image, but it is a program file even though it is sort of
like a container. Sending it as inline content in an HTML (with
scripting) email does not make the email a program, any more than
archiving it in a zip file makes the zip file a program.
For our purposes (malware theory), what matters is:
a) Is program material within file "run" when file is "opened"?
b) If so, is what it can do limited to the scope of that file alone?
If Yes and No, the file should be considered "program".
If No and No, the file can be safely considered "data", in terms of
itself - but knowing that any file can contain anything, and that
program files can use the contents of any file for any purpose
(including loading it into its own memory and running it as code),
such files may still have malware relevance.
If Yes and Yes, you may have a safe macro or "text markup" situation,
but experience shows that sooner or later, sandboxes leak, or devs are
tempted into facilitating wider effects e.g. by adding a generic
"shell arbitrary system call" feature etc.
For this reason, I would prefer *any* sort of macro/scripting to be
held within separate files that are identifiable as such, and/or to be
never automatically interpreted when a "data" file is "opened".
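
The (a)/(b) test above can be written down as a tiny decision function. This is only a sketch: the function name, parameter names and return labels are hypothetical, and treating every "(a) = No" file as "data" (since (b) only arises if something runs) is my assumption.

```python
def classify(runs_on_open: bool, confined_to_file: bool) -> str:
    """Classify a file type by the two questions (a) and (b).

    runs_on_open      -- (a) is material within the file "run" when it is "opened"?
    confined_to_file  -- (b) if so, is what it can do limited to that file alone?
    """
    if not runs_on_open:
        # Assumption: (b) is moot when nothing runs, so this is "data"
        # in terms of itself (it may still have malware relevance).
        return "data"
    if confined_to_file:
        # Yes and Yes: safe only as long as the sandbox does not leak.
        return "sandboxed macro/markup"
    # Yes and No: treat as a program.
    return "program"


print(classify(True, False))    # program
print(classify(False, False))   # data
print(classify(True, True))     # sandboxed macro/markup
```

Note that the middle answer is the unstable one: once a sandbox leaks, a "Yes and Yes" file type has quietly become "Yes and No".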
Another detail to avoid getting hung up on is whether code is
"really" code, or is script, macro, etc. Usually this is determined
by whether it is "executed" or "interpreted".
All code is interpreted, whether in hardware such as a CPU or
peripheral device, by software emulation of that hardware, by a
programming runtime engine, as API calls interpreted by the system,
or by an application that parses "document" files in poorly-limited ways.
Not only that, but interpretation of a file may unwittingly be as
powerful as a programming engine, i.e. when a file in a genuine data
format such as JPEG exploits a defective interpreter and thus runs as code.
If users are to retain control over their computers:
1) Users must know what they are doing
2) Computer must limit itself to doing only what it is asked to do
On (1), it is about conceptualizing software in such a way that the
user understands it, and the implications of what it does. If software
does this well, all the user may need to know is something they already
know, e.g. "I want to read a message", "I want to see a list of file
names", "I want to read a data file", "I want to run a program".
On (2), it is about safety WYSIWYG. If the user expects the limited
risk of listing files, the system should not "run" material within
those files. If a user expects the low risk of "reading" a file, the
system should not take the greater risk of running it as a program.
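
One way to picture safety WYSIWYG is as a simple ordering of risk: whatever the system actually does must carry no more risk than what the user asked for. The sketch below is illustrative only; the Risk levels and the permitted() helper are hypothetical names, and the three-level ordering (list < read < run) is just the one implied by the examples above.

```python
from enum import IntEnum


class Risk(IntEnum):
    LIST = 1   # show file names only
    READ = 2   # display a file's contents as data
    RUN = 3    # execute a file's contents as a program


def permitted(requested: Risk, performed: Risk) -> bool:
    """True if the action performed is no riskier than the one requested."""
    return performed <= requested


# The user asked to list files; the system ran code inside one of them.
print(permitted(Risk.LIST, Risk.RUN))    # False
# The user asked to read a file; the system only displayed it.
print(permitted(Risk.READ, Risk.READ))   # True
```

A system that "runs" material while listing or reading fails this check by design, no matter how careful the user is.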
When (1) fails, e.g. meaningless generic concepts such as "open"
require additional and quite in-depth knowledge, or when a consumer PC
is a fully-fledged "network client" requiring sysadmin knowledge, then
you can expect users to "do dumb things" on a regular basis.
When (2) fails, it becomes impossible for users to practice "safe
hex", and effectively, control of their PC is hijacked by anyone with
the fairly simple tech smarts needed to press a few buttons.
We currently live with the consequences of (1) and (2) failure.