Questionable av-test.org heuristics test


Tweakie

Hi all,

Yesterday Frederic Bonroy posted a brief summary of an anti-virus test
conducted by Andreas Marx for a German magazine. Andreas tested heuristic
scanning using two (and a half) different methods (he also tested other
parameters).

- The first method involved scanning recent viruses with 3- and 6-month-old
signature databases*;
- The second method used recent databases and viruses especially
created for the tests (a modified version of CIH, VBS and macro virus
generators, and Optix Pro and Assassin 2.0 "generators").
- When applicable, the malware was also packed with UPX, and both
packed and unpacked versions were tested.
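
For what it's worth, the first method can be sketched as follows. This is
only a minimal illustration under my own assumptions, not the protocol
actually used in the test: the `Sample` record, its `first_seen` field and
the helper names are hypothetical, and "3 and 6 months" is approximated as
90 and 180 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Sample:
    name: str
    first_seen: date  # hypothetical metadata: when the sample first appeared


def retrospective_set(samples, test_date, db_age_days):
    """Keep only samples first seen after the frozen database date."""
    freeze = test_date - timedelta(days=db_age_days)
    return [s for s in samples if freeze < s.first_seen <= test_date]


def detection_rate(detected, total):
    """Percentage of the retrospective set flagged by the scanner."""
    return 100.0 * detected / total if total else 0.0
```

Anything detected in such a set was, by construction, unknown to the
frozen database, so a hit has to come from heuristics or from a generic
signature.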

The first method is OK; it is probably one of the best techniques for
testing heuristics. In my opinion, the second one is very questionable.
Well, I don't know how CIH has been modified. Let's assume that it
was done in such a way that every single byte of CIH has changed
(at execution time), so that none of the tested AVs would recognize it
using signatures. Let's assume that the number of different VBS and
macro viruses that the generators can produce is too large for AV
vendors to include each potential sample individually in their
databases. Let's also assume that these generators produce code that
does not contain any significant non-variable part that could be used
as a signature.

Still, all these assumptions are *not* valid for the backdoor
"generators". This kind of generator, also called a "build server"
utility, produces samples that do not vary much. They let the user
enable/disable some functionalities of the backdoor, which does not
necessarily mean that the corresponding code is absent from the
resulting executable files. Moreover, there are large pieces of
non-variable code in them. I would say that this protocol tests the
choice of a signature rather than the heuristics. Coming from
av-test.org, this is surprisingly amateurish.
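
To illustrate the point about non-variable code, here is a minimal sketch
that lists the byte windows shared by every generated sample. It is an
assumption-laden toy, not a description of how any AV product picks its
signatures: the function name `shared_ngrams` and the 16-byte window size
are arbitrary choices of mine.

```python
def shared_ngrams(samples, n=16):
    """Return the set of n-byte windows present in every sample.

    `samples` is an iterable of bytes objects, e.g. the raw contents of
    several executables produced by the same "build server" utility.
    """
    common = None
    for data in samples:
        # All n-byte windows of this sample (empty if the sample is shorter than n).
        grams = {data[i:i + n] for i in range(len(data) - n + 1)}
        common = grams if common is None else common & grams
    return common or set()
```

If the result is non-empty for all the builds a generator can emit, any of
those windows is a candidate signature, and detecting the "new" samples
says little about heuristics.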

There is therefore a legitimate doubt regarding my initial assumptions.

Frederic, do you have more information concerning the protocol?

I know that some contributors to this group are quite familiar with
anti-virus testing. I would appreciate your opinion on this
particular topic.
 
Tweakie wrote:
Yesterday Frederic Bonroy posted a brief summary of an anti-virus test
conducted by Andreas Marx for a German magazine. Andreas tested heuristic
scanning using two (and a half) different methods (he also tested other
parameters).

- The first method involved scanning recent viruses with 3- and 6-month-old
signature databases*;
- The second method used recent databases and viruses especially
created for the tests (a modified version of CIH, VBS and macro virus
generators, and Optix Pro and Assassin 2.0 "generators").
- When applicable, the malware was also packed with UPX, and both
packed and unpacked versions were tested.

The first method is OK; it is probably one of the best techniques for
testing heuristics.

Yes, although one could argue that this is not indicative of the
heuristic performance of recent versions of a scanner since the engine
may have been improved in the meantime. To answer your question:
the definition files AND the engines were 3/6 months old.

Well, I don't know how CIH has been modified.

And why it was modified. Why pick CIH, and why pick only CIH? CIH alone
certainly isn't representative.

We indeed don't know how it was modified - and what was modified. Its
source code? Or did they patch the binary?

Frederic, do you have more information concerning the protocol?

Well, no. In fact, the authors don't provide their protocol. Keep
in mind that it was published in a magazine and not in a scientific
publication; its readers are interested in the results and probably don't
care about the methodology.
 
Tweakie wrote:

Yes, although one could argue that this is not indicative of the
heuristic performance of recent versions of a scanner since the engine
may have been improved in the meantime. To answer your question:
the definition files AND the engines were 3/6 months old.

Did they give the exact versions of the scanners they used? Or the
exact date of the test?

And why it was modified. Why pick CIH,

Because of its fame. Yet another side effect of the "star system".
More seriously, it is one of the rare viruses (!= network worm)
that succeeded in propagating efficiently. It was written in
asm, which should help a lot of heuristic engines, and it uses several
routines that can be flagged as "suspect" (jumping to ring 0,
looking for cavities, etc.). It also has a destructive
payload that involves uncommon operations (very few legitimate
programs will access the BIOS).

and why pick only CIH? CIH alone
certainly isn't representative.

I'd like to know how many CIH-like viruses have been detected ITW
in 2003 ;-)

Well, no. In fact, the authors don't provide their protocol. Keep
in mind that it was published in a magazine and not in a scientific
publication; its readers are interested in the results and probably don't
care about the methodology.

...but of course, the reliability of the test largely depends on the
methodology. And I suppose that the test results and its methodology
will not be published on av-test.org before...2005 (?)

Does Andreas give some details on detection rates for PE exes /
scripts / macros / trojans, or does he only give a global rate?

Are the figures similar to those mentioned in this paper:

http://www.av-test.org/down/papers/2002-09_vb_2002.zip

[Summary]

Tested new viruses, macro, scripts, worms, trojans against 3 month and
6 month old databases.

[Figures]

Average detection rates 3 month [min-max] ; 6 month [min-max] (%):

Macro : 86 [74-94] ; 75 [47-89]
Scripts : 58 [35-82] ; 43 [17-74]
Win32 file vir: 57 [24-79] ; 37 [08-68]
Win32 others : 19 [08-13] ; 12 [03-26]
 