Memory Streams and Struct Pointers Revisited

  • Thread starter: _TR

_TR

I love C# and I've been porting a lot of my older C++ code, but there
is one app that I haven't quite figured out yet. I suppose I could
re-engineer the app from the ground up, but I've already accumulated a
lot of data files, so the format is set for now.

Basically, the data format consists of inlined Structs of different
types and lengths. I can decode the first byte to tell what type of
struct it is, then point a struct pointer to it to decode the
following bytes. The format is fairly easy to work with, and again I
already have megabytes of files in that format, so I don't want to
change it yet.

For speed, I read the entire file into RAM ahead of time rather than
paging it from disk.

Here's the question: How well would that approach translate to C#?
Someone claimed that there was no need for 'unsafe' code, given that
memory streams could be used to read the bytes into structs, but this
seems to require additional work (loading the structs as opposed to
simply pointing a *struct directly into the stream). Is there a
graceful way around this?

The fallback plan is to use VS2005's C++/CLI. I find C++'s syntax
tedious compared to C#, but the new CLI pointer mechanisms might be
preferable; I don't want the GC moving streams at awkward moments
(I'm trying to maintain good realtime response).

Any thoughts?
 
Tough one :)

My guess is that you will have to redefine those struct types in C#
whichever implementation strategy you follow. You will also have to
specify packing and other memory-layout attributes so that they stay
compatible with the memory layout of the C++ structs.
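
To illustrate the layout attributes mentioned above, here is a minimal sketch. It assumes a hypothetical 7-byte record (one tag byte, a 16-bit id, a 32-bit value) that was packed with 1-byte alignment on the C++ side. SampleRecord, its fields, and RecordDecoder are invented for illustration; the real layouts would come from the original C++ headers. It copies the record out of a byte array without any 'unsafe' code.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record; mirrors a C++ struct declared under #pragma pack(1).
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SampleRecord
{
    public byte Tag;     // offset 0
    public ushort Id;    // offset 1
    public int Value;    // offset 3
}

public static class RecordDecoder
{
    // Copies one record out of the buffer, starting at the given offset.
    // The array is pinned only for the duration of the copy.
    public static SampleRecord Read(byte[] buffer, int offset)
    {
        int size = Marshal.SizeOf(typeof(SampleRecord));
        if (offset < 0 || offset + size > buffer.Length)
            throw new ArgumentOutOfRangeException("offset");

        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr p = Marshal.UnsafeAddrOfPinnedArrayElement(buffer, offset);
            return (SampleRecord)Marshal.PtrToStructure(p, typeof(SampleRecord));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Note that this copies the bytes into a new struct value rather than aliasing the buffer, so it is the "additional work" variant, traded for staying in verifiable code.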

I hope I have guessed correctly so far.

If that is indeed the case, then I would use P/Invoke to run the C++
code which unpacks the bytes into the C# structs. To me, it looks like
the easiest, albeit not the coolest, way to go.

Sorry I couldn't be of much help. Any other ideas anyone?
 
_TR said:
For speed, I read the entire file into RAM ahead of time rather than
paging it from disk.

I don't really think that's likely to gain anything. You cannot
memory-map the bytes you read into CLR objects anyway.

_TR said:
Here's the question: How well would that approach translate to C#?

Write a parser for the structs, and a registry for them. Something along
the lines of:

using System;
using System.Collections;
using System.IO;

public interface StructParser {
    int Tag { get; }
    object Parse(Stream s);
}

public class FooStructParser : StructParser {
    // The interface needs a property, not a const field.
    public int Tag { get { return 0xFFE4; } }
    public object Parse(Stream s) {
        int x = Tools.ReadInt(s);
        ...
        return new Foo(x);
    }
}

public class FileReader {
    IDictionary Parsers = new Hashtable();

    public void Register(StructParser p) {
        Parsers[p.Tag] = p;
    }

    public class Enumerator : IEnumerator {
        Stream s;
        FileReader r;
        object current;

        public Enumerator(FileReader r, Stream s) { this.s = s; this.r = r; }

        public object Current {
            get {
                if (current == null)
                    throw new InvalidOperationException();
                return current;
            }
        }

        public bool MoveNext() {
            try {
                // Read the tag, then hand off to the parser registered for it.
                int x = Tools.ReadInt(s);
                current = ((StructParser)r.Parsers[x]).Parse(s);
                return true;
            } catch (EndOfStreamException) {
                current = null;
                return false;
            }
        }

        // Required by IEnumerator; a forward-only stream cannot rewind.
        public void Reset() { throw new NotSupportedException(); }
    }

    public Enumerator Objects(Stream s) { return new Enumerator(this, s); }
}

Note that the above can be made to work for composite and recursive
structs... :)
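
The parser sketch leans on a Tools.ReadInt helper that the post does not show. One way it might look, assuming 32-bit little-endian integers (an assumption about the file format, not something stated in the thread) and an EndOfStreamException on a short read, which is what MoveNext relies on to stop:

```csharp
using System;
using System.IO;

public static class Tools
{
    // Reads a 32-bit little-endian integer (assumed format) from the stream.
    // Throws EndOfStreamException if fewer than 4 bytes remain.
    public static int ReadInt(Stream s)
    {
        byte[] buf = new byte[4];
        int got = 0;
        while (got < buf.Length)
        {
            // Stream.Read may return fewer bytes than requested, so loop.
            int n = s.Read(buf, got, buf.Length - got);
            if (n == 0)
                throw new EndOfStreamException();
            got += n;
        }
        return buf[0] | (buf[1] << 8) | (buf[2] << 16) | (buf[3] << 24);
    }
}
```

Wrapping a byte[] that was read up front in a MemoryStream keeps the "whole file in RAM" approach while still going through this helper.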

_TR said:
Someone claimed that there was no need for 'unsafe' code, given that
memory streams could be used to read the bytes into structs, but this
seems to require additional work (loading the structs as opposed to
simply pointing a *struct directly into the stream). Is there a
graceful way around this?

I don't think that's gonna be possible -- in C++ you could probably do
it if the data is POD without RTTI using placement-new for the objects,
but I don't think I would do it if I were you :)

_TR said:
The fallback plan is to use VS2005's C++/CLI. I find C++'s syntax
tedious compared to C#, but the new CLI pointer mechanisms might be
preferable; I don't want the GC moving streams at awkward moments
(I'm trying to maintain good realtime response).

You can pin data with "fixed" (inside an unsafe block); the GC cannot
move it for as long as the fixed block is executing.
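
A minimal sketch of the "fixed" approach, which comes closest to the C++ struct-pointer technique: the byte array is pinned for the duration of the block and a struct pointer is aimed straight into it, with no copying. PackedHeader and its fields are hypothetical, and the code has to be compiled with unsafe blocks enabled (/unsafe, or AllowUnsafeBlocks in the project file).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record layout, packed to 1-byte alignment (3 bytes total).
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedHeader
{
    public byte Tag;
    public short Length;
}

public static class PinnedReader
{
    // Sums the Length field of consecutive PackedHeader records in the buffer.
    public static unsafe int SumLengths(byte[] data)
    {
        int total = 0;
        // The GC cannot relocate 'data' while this block runs.
        fixed (byte* start = data)
        {
            int count = data.Length / sizeof(PackedHeader);
            PackedHeader* rec = (PackedHeader*)start;
            for (int i = 0; i < count; i++)
                total += rec[i].Length;
        }
        return total;
    }
}
```

Keeping the fixed block short matters for the realtime concern: pinned objects cannot be compacted, so long-lived pins can fragment the heap.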

Try coding a parser for a few of the structs and see if it's not really
the easy way out.
 