Generating ILAsm / compiler question

Thread starter: ajk

Hi

When writing a .NET compiler, what are the benefits of generating ILAsm
directly, compared to generating a higher-level language like C# first
and then using the C# compiler?

Round-tripping will not work if I go through C#, and debugging may be
easier with ILAsm. Are there any other reasons why generating ILAsm would
be preferred?

tia/ajk
 
First, a few definitions. They are not essential to the question, but they
help in understanding the process. The "language" is IL (Intermediate
Language), also called MSIL (Microsoft Intermediate Language). It is
considered compiled, as it can be picked up by the CLR and run without any
further "outside" intervention. If it still required more intervention from
a user, it would only be translated.

The main reason someone builds a compiler rather than a translator is to
build running code. Technically, you could build a translator and then fire
off a compiler, but you then have to trust the other compiler. If you write
in C#, C++ or VB, you will find the compilers are each better at something
different. They are functionally equivalent, but not exactly equal in their
IL output.

As a software developer, you would prefer to keep your destiny under your
own control, rather than partly under your control and partly under the
control of another piece of software. That applies if your goal is
compilation. You also want the fewest moving parts, as more moving parts
mean more potential points of failure.

There is certainly value in translating one high level language to another.
It is completely different from compiling.

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

Subscribe to my blog
http://feeds.feedburner.com/GregoryBeamer#

or just read it:
http://feeds.feedburner.com/GregoryBeamer

********************************************
| Think outside the box! |
********************************************
 
Hi,

Thanks for your input. Yes, that was what I meant by MSIL. Thanks for
clarifying that.

How many real .NET compilers exist apart from the standard ones from
Microsoft? I got the impression from reading an ILAsm book that most
existing .NET compilers generate ILAsm instead of MSIL (judging from
the book's topic, that may not be a complete surprise). Is it because it
is so difficult to generate MSIL? Or is it because they want to be
independent of MSIL, which could also change with a new version
of .NET?

How long, in your opinion, would it take to create a compiler for, say,
JavaScript that generates MSIL code, roughly? Just to get a feel for
it.

It seems to me that it would be best to generate ILAsm in order to get
the control and functionality without also having to lock oneself into
a particular .NET version?

TIA
ajk
 
ajk said:
How many real .NET compilers exist apart from the standard ones from
Microsoft?

There are quite a few. MS is close to releasing F#. Fujitsu has a COBOL
compiler for .NET. There are also Eiffel compilers, Delphi compilers, etc.
I am not sure of the full extent, nor whether their last step is the
assembler.

ajk said:
I got the impression from reading an ILAsm book that most
existing .NET compilers generate ILAsm instead of MSIL (judging from
the book's topic, that may not be a complete surprise).

ILASM is the IL assembler, the piece that creates the Portable
Executable (PE) from IL. The compilers create IL. This is a very high-level
explanation. I would have to look at the book to see the context of what the
author is saying about it, but now that you have clarified, this makes sense.

The author may be differentiating IL as a flat file from IL as a PE. I am
not sure.

ajk said:
Is it because it
is so difficult to generate MSIL? Or is it because they want to be
independent of MSIL, which could also change with a new version
of .NET?

IL is lower level than languages like C#, for sure. Actually packaging it
as an assembly is even more involved. I have not looked into the internals
for so long that I am not 100% sure how the MS compilers work; they may end
up using ILASM for the final step.

ajk said:
How long, in your opinion, would it take to create a compiler for, say,
JavaScript that generates MSIL code, roughly? Just to get a feel for
it.

Someone created Assembler.NET in a month or two. However, anyone who knows
assembly that well is pretty advanced. If you want to tackle
JavaScript.NET, I would look for a book on compilers for .NET. I seem to
remember seeing one, though I am not sure it was mainstream. There are also
documents in the .NET Framework install folders, under the tools section,
that might help. Dan Appleman's book on obfuscation also has some great
info, as does Jeffrey Richter's book.

The amount of time it would take depends on experience level and
determination (and time, of course). What would be your goal in doing this?

ajk said:
It seems to me that it would be best to generate ILAsm in order to get
the control and functionality without also having to lock oneself into
a particular .NET version?

If you mean producing IL and then using the assembler to build the PE, that
may be the only way. It would certainly make sense.

Lutz Roeder's Reflector might be something to look at, if you can reverse
engineer it. Understanding how someone breaks apart a PE is a wonderful
learning experience. There may be some open source implementations out
there. The Mono project, which contains its own compilers, would also be a
good thing to look at. Mono is completely open source.
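
To make the "produce IL as text, then let the assembler build the PE" route
a bit more concrete, here is a minimal sketch in Python of a toy front end
that turns an integer arithmetic expression into ILAsm source text. The names
emit_expr and compile_to_ilasm and the assembly name Demo are made up for
illustration, and Python's own ast parser stands in for a real front end.
The emitted directives and opcodes (.assembly, .method, .entrypoint,
.maxstack, ldc.i4, add, sub, mul, call, ret) are ordinary ILAsm.

```python
import ast

# Map Python AST operator nodes to the matching IL arithmetic opcodes.
_IL_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


def emit_expr(node, out):
    """Append stack-machine IL for `node` to `out`; return the stack depth needed."""
    if (isinstance(node, ast.Constant) and isinstance(node.value, int)
            and not isinstance(node.value, bool) and node.value < 2**31):
        out.append(f"ldc.i4 {node.value}")      # push a 32-bit integer constant
        return 1
    if isinstance(node, ast.BinOp) and type(node.op) in _IL_OPS:
        left = emit_expr(node.left, out)        # left operand is evaluated first
        right = emit_expr(node.right, out)      # right operand sits on top of it
        out.append(_IL_OPS[type(node.op)])      # pops two values, pushes one
        return max(left, 1 + right)
    raise ValueError("only int32 constants with +, - and * are supported")


def compile_to_ilasm(source, assembly_name="Demo"):
    """Return ILAsm text for a program that prints the value of `source`."""
    body = []
    depth = emit_expr(ast.parse(source, mode="eval").body, body)
    lines = [
        ".assembly extern mscorlib {}",
        f".assembly {assembly_name} {{}}",
        ".method static void Main() cil managed",
        "{",
        "    .entrypoint",
        f"    .maxstack {depth}",
        *[f"    {instr}" for instr in body],
        "    call void [mscorlib]System.Console::WriteLine(int32)",
        "    ret",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(compile_to_ilasm("2 + 3 * 4"))
```

The resulting text could be saved to a file such as Demo.il and handed to the
assembler (ilasm Demo.il) to produce the executable. The front end only deals
with text here; the metadata tables and the PE layout are left to the
assembler.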

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

 
ajk said:
How many real .NET compilers exist apart from the standard ones from
Microsoft?

Gregory A. Beamer said:
There are quite a few. MS is close to releasing F#. Fujitsu has a COBOL
compiler for .NET. There are also Eiffel compilers, Delphi compilers, etc.
I am not sure of the full extent, nor whether their last step is the
assembler.

You haven't mentioned any specific compiler, only the one from MS, but
I asked explicitly for one that was not done by MS. Delphi AFAIK
generates ILAsm, at least according to the book "Expert .NET 2.0
IL Assembler" by Serge Lidin.

Gregory A. Beamer said:
ILASM is the IL assembler, the piece that creates the Portable
Executable (PE) from IL. The compilers create IL. This is a very high-level
explanation. I would have to look at the book to see the context of what the
author is saying about it, but now that you have clarified, this makes sense.

Yes, that is already clear, thanks.

Gregory A. Beamer said:
The author may be differentiating IL as a flat file from IL as a PE. I am
not sure.

I think the author, who actually wrote the ILAsm assembler, knows.

Gregory A. Beamer said:
IL is lower level than languages like C#, for sure. Actually packaging it
as an assembly is even more involved. I have not looked into the internals
for so long that I am not 100% sure how the MS compilers work; they may end
up using ILASM for the final step.

I was wondering why you are replying, then, since you obviously have no
clue?

Gregory A. Beamer said:
Someone created Assembler.NET in a month or two. However, anyone who knows
assembly that well is pretty advanced. If you want to tackle
JavaScript.NET, I would look for a book on compilers for .NET. I seem to
remember seeing one, though I am not sure it was mainstream. There are also
documents in the .NET Framework install folders, under the tools section,
that might help. Dan Appleman's book on obfuscation also has some great
info, as does Jeffrey Richter's book.

Thanks, but I wasn't looking for rumors; I wanted to know whether
somebody had actually done this. BTW, creating a .NET assembler should
be pretty straightforward, since there is basically a one-to-one
relationship between assembly instructions and actual machine code, and
above all there is no need to optimize. It is obviously more difficult
to create a high-level language compiler. I mentioned JavaScript as an
example.
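
The one-to-one relationship mentioned above can be illustrated with a small
sketch in Python. It assumes a deliberately tiny subset of IL in which each
mnemonic maps to a single opcode byte (for IL the encoded form is bytecode
rather than native machine code; the byte values are the ones listed in the
ECMA-335 CLI specification). The names OPCODES and assemble are hypothetical.
A real assembler such as ILAsm also has to emit metadata, method headers and
the PE container, none of which is shown here.

```python
# One-byte IL opcodes for instructions that take no operand (values per ECMA-335).
OPCODES = {
    "nop": 0x00,
    "ldarg.0": 0x02,
    "ldarg.1": 0x03,
    "ldc.i4.0": 0x16,
    "ldc.i4.1": 0x17,
    "ret": 0x2A,
    "add": 0x58,
    "sub": 0x59,
    "mul": 0x5A,
}


def assemble(source):
    """Translate one mnemonic per line into the matching opcode bytes."""
    code = bytearray()
    for lineno, line in enumerate(source.splitlines(), start=1):
        mnemonic = line.split("//", 1)[0].strip()   # drop comments and whitespace
        if not mnemonic:
            continue                                 # skip blank lines
        if mnemonic not in OPCODES:
            raise ValueError(f"line {lineno}: unknown instruction {mnemonic!r}")
        code.append(OPCODES[mnemonic])               # one instruction, one byte
    return bytes(code)


if __name__ == "__main__":
    body = """
        ldarg.0     // first argument
        ldarg.1     // second argument
        add
        ret
    """
    print(assemble(body).hex(" "))   # prints: 02 03 58 2a
```

The table lookup is the whole translation step, which is why no optimization
pass is needed. Instructions with operands (branch targets, metadata tokens)
would need extra encoding logic on top of this.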

Gregory A. Beamer said:
The amount of time it would take depends on experience level and
determination (and time, of course). What would be your goal in doing this?

Yes, that obviously is a stupid question. The goal is of course to be
able to (1) say that we have a proper, true .NET compiler, (2) have full
control of the output, and (3) be able to optimize code based on the
original language, to mention a few.

Gregory A. Beamer said:
If you mean producing IL and then using the assembler to build the PE, that
may be the only way. It would certainly make sense.

Lutz Roeder's Reflector might be something to look at, if you can reverse
engineer it. Understanding how someone breaks apart a PE is a wonderful
learning experience. There may be some open source implementations out
there. The Mono project, which contains its own compilers, would also be a
good thing to look at. Mono is completely open source.

Yes, I have already looked at Reflector (BTW, nowadays it is called
Red Gate's .NET Reflector). Mono is a good tip; maybe I'll check out
the code there.

thx/ajk
 