C# calling simple C++/CLI library deadlock

Dragan

Hi,
We're working in VS 2005, Team edition, if it makes any difference at
all (should be up-to-date and all that, but could not guarantee it is
100%).

We've implemented a simple generic parser wrapper in C++/CLI. It's a
basic project created under Visual C++/CLR - Class Library - and
that's pretty much all it is: one class compiled with /clr, fully
managed, with no native C++ types or processing.
The reason why we did it in the first place was the better handling
of generics, or so we thought at the time - e.g. being able to specify
Enum as a constraint and using safe_cast to switch between types (as
opposed to C#, which requires boxing to e.g. cast a generic T to
float) and so on. Basically, it's a generic class w/ static methods
which take value types or Enums and parse strings into specific values
for each type. It has some more handling specific to our case, plus
error handling, but it's merely one page and there's not much in it
(an excerpt is attached at the end).
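
For illustration, the C# boxing round-trip mentioned above looks roughly like the sketch below. This is not the original code: GenericCast and ToSingle are hypothetical names, and the sketch assumes the C# of that era, where a direct cast from an unconstrained generic T to float does not compile.

```csharp
using System;

// Hypothetical helper, not part of the original project.
static class GenericCast
{
    public static float ToSingle<T>(T value) where T : struct
    {
        // (float)value is rejected by the compiler for a generic T,
        // so the value is boxed to object and then unboxed as float.
        // Unboxing requires the exact type: if T is not float, this
        // throws InvalidCastException at run time.
        return (float)(object)value;
    }
}
```

The cast through object costs one allocation per call, which is the overhead the safe_cast version in C++/CLI was meant to avoid.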

At first it seemed to work just fine. But when we ran it under heavy
load in a multi-threaded scenario, it started to deadlock regularly
and repeatably (it looks like a deadlock and could hardly be anything
else - we're catching all exceptions and handling anything that could
cause things to stop).
Our app is feeding off a stock-market data feed, so it's quite heavy
and requires multiple threads processing things in the background -
but this part could not be simpler. On the threads side, we have 2-3
main threads (up to 5 w/ worker threads): one receiving the data (thru
sockets, all in C#), others processing things in the background, plus
the GUI one. What's important is that this processing, done in C++, is
always called from one thread, so there is no need to lock anything
there; it's also just 'static' one-off processing, so there is no
'state' or any variables to protect.
The problems start when we turn on the data feed and a good deal of
data starts coming in. It works for a minute or so (which is important
to state: it does start up and work), but then regularly, like
clockwork, it deadlocks somewhere inside the C++ library - and always
in a different place, so it 'smells' like a typical deadlock scenario.
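
As a rough picture of the threading layout described above (one thread receiving, one background thread doing all the parsing), here is a minimal sketch. It is not the original code: FeedQueue, FeedDemo and the float.TryParse stand-in are assumptions made purely for illustration, and the real app receives from a socket rather than a loop.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical names throughout; the original post does not show this code.
sealed class FeedQueue
{
    private readonly Queue<string> items = new Queue<string>();
    private readonly object gate = new object();
    private bool closed;

    // Called by the receiving (socket) thread.
    public void Enqueue(string message)
    {
        lock (gate)
        {
            items.Enqueue(message);
            Monitor.Pulse(gate);   // wake the consumer if it is waiting
        }
    }

    // Signals that no more messages will arrive.
    public void Close()
    {
        lock (gate)
        {
            closed = true;
            Monitor.PulseAll(gate);
        }
    }

    // Called by the background thread. Blocks while the queue is empty;
    // returns false once the queue is closed and fully drained.
    public bool TryDequeue(out string message)
    {
        lock (gate)
        {
            while (items.Count == 0 && !closed)
                Monitor.Wait(gate);
            if (items.Count == 0)
            {
                message = null;
                return false;
            }
            message = items.Dequeue();
            return true;
        }
    }
}

static class FeedDemo
{
    static void Main()
    {
        FeedQueue queue = new FeedQueue();
        int parsed = 0;

        // Single consumer: the only thread that ever calls the parser.
        Thread worker = new Thread(delegate()
        {
            string text;
            while (queue.TryDequeue(out text))
            {
                float value;
                // Stand-in for the C++/CLI parser call in the real setup.
                if (float.TryParse(text, out value))
                    parsed++;
            }
        });
        worker.Start();

        // Stand-in for the socket thread pushing feed messages.
        for (int i = 0; i < 1000; i++)
            queue.Enqueue(i.ToString());
        queue.Close();
        worker.Join();

        Console.WriteLine("parsed: " + parsed);
    }
}
```

In a layout like this the only shared state is the queue itself, which matches the claim that the parsing code has nothing of its own to lock.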

We tried tracing it; it's always somewhere inside the C++ code, but
there's nothing particular to point to. Our best guess is that it has
something to do w/ 'global' calls back into the CLR core libraries,
maybe that String Alloc issue which was mentioned, or something
similar. At one point it looked like a safe_cast issue (we had used
float% instead of float, as we're passing 'by ref' to methods) - but
even w/ that 'fixed' it still locks regularly, maybe a tad less often.
We also investigated whether it could be the static initializer, and
read thru posts and everything, but nothing helped.

The other thing is, when we copied the code back to C#, everything
works fine, even under pressure. We ran it for hours/days w/ no
problems when it's C# to C#. But if we switch over to C++/CLI it
deadlocks pretty soon, and rarely ends up running for more than 5
minutes. And we've rewritten everything so that the code is almost
exactly the same on both sides. We also removed any 'exotic' stuff,
like calling our logging libraries (C#) from inside C++/CLI.
Basically it's just dealing w/ core libraries: strings, Enum parsing,
safe_cast here and there, and that's about all there is to it. That's
why it is a bit strange.

Finally, we gave up on this approach, as we have little time to deal
w/ such things w/ real development waiting ahead - so we removed the
C++ part entirely and are doing everything in C#. All is ok, so we're
not overly keen on moving it back to C++.
But, being a long-time .NET programmer, and a C++ one for many years
before that, I'm just a bit curious about what's going on here and
what the issue could be. W/o having this resolved, I'd be very
reluctant to do anything in C++/CLI, while on the other hand I've
always been a fan of mixing things: do in C# what it does best, move
to C++ when you have to deal w/ performance-intensive stuff, and so
on. Also, I expected much more from the CLI version after Managed C++;
I thought it'd be more mature and reliable.
So, in short, any ideas or anybody from Microsoft willing to answer
this and point me to possible/probable places or calls which could
cause this. Not really sure where to look for.

Thanks in advance,
Dragan

part of the code:


public ref class Parser
{
public:
    generic <typename TField, typename TValue>
    where TField : Enum
    where TValue : value class
    static FieldSetStatus Set(TValue% value, TField field,
        String^ strvalue, Context<TField>^ context)
    {
        try
        {
            TValue novalue = TValue();

            FieldSetStatus status = Parse<TValue>(value, novalue, strvalue);
            return status;
            //return ProcessStatus<TField>(context, status, field, strvalue,
            //    TValue::typeid->Name);
        }
        catch( Exception^ e )
        {
            return FieldSetStatus::ParseError;
        }
    }

private:
    generic <typename TValue>
    where TValue : value class
    static FieldSetStatus Parse(TValue% value, TValue novalue,
        String^ strvalue, String^ format, IFormatProvider^ culture)
    {
        value = novalue;

        if (strvalue == "Not Found")
            return FieldSetStatus::NotFound;

        try
        {
            if (!String::IsNullOrEmpty(strvalue))
            {
                Type^ type = TValue::typeid;

                if( type->Equals(float::typeid) )
                    return ParseFloat(safe_cast<float>(value), strvalue,
                        format, culture);
                if( type->Equals(int::typeid) )
                    return ParseInt32(safe_cast<int>(value), strvalue,
                        format, culture);
                if( type->Equals(DateTime::typeid) )
                    return ParseDate(safe_cast<DateTime>(value), strvalue,
                        format, culture);
                if( type->Equals(TimeSpan::typeid) )
                    return ParseTime(safe_cast<TimeSpan>(value), strvalue,
                        format, culture);

                return FieldSetStatus::ParseError;
            }
            return FieldSetStatus::Empty;
        }
        catch( Exception^ e )
        {
            return FieldSetStatus::ParseError;
        }
    }

    static FieldSetStatus ParseFloat(float% value, String^ strvalue,
        String^ format, IFormatProvider^ culture)
    {
        if( Single::TryParse( strvalue, value ) )
            return FieldSetStatus::Success;
        else
            return FieldSetStatus::ParseError;
    }

    static FieldSetStatus ParseInt32(int% value, String^ strvalue,
        String^ format, IFormatProvider^ culture)
    {
        if( Int32::TryParse( strvalue, value ) )
            return FieldSetStatus::Success;
        else
            return FieldSetStatus::ParseError;
    }

    static FieldSetStatus ParseDate(DateTime% value, String^ strvalue,
        String^ format, IFormatProvider^ culture)
    {
        if( format==nullptr )
            format = "MM/dd/yyyy";
        if (DateTime::TryParseExact(strvalue, format, culture,
                System::Globalization::DateTimeStyles::None, value))
            return FieldSetStatus::Success;
        else
            return FieldSetStatus::ParseError;
    }

    static FieldSetStatus ParseTime(TimeSpan% value, String^ strvalue,
        String^ format, IFormatProvider^ culture)
    {
        if( TimeSpan::TryParse( strvalue, value ) )
            return FieldSetStatus::Success;
        else
            return FieldSetStatus::ParseError;
    }

};
 
> We're working in VS 2005, Team edition, if it makes any difference at
> all (should be up-to-date and all that, but could not guarantee it is
> 100%).

SP1?

> So, in short, any ideas or anybody from Microsoft willing to answer
> this and point me to possible/probable places or calls which could
> cause this. Not really sure where to look for.

Could you also post a small example of a stress test that triggers the
deadlock behavior?
That way I could try to see what's happening when it locks up.
 
Hi, sorry for the delay,
Yes, it's the latest build, all ok on that side; I've rechecked,
though I didn't reinstall everything, as that would just be desperate.
Well, I've been trying to find some time to make a small stress test
and finally did, but I was unable to come up w/ something that repeats
the problem w/o using the data feed itself (at least within the time
frame). I tried setting up a test feed and various other things, but
everything works ok. When I switch over to real-time data it starts
to break. And I know this from years of experience w/ live
stock-market data: it's a beast, if anything can possibly go wrong it
will - but it's very hard to simulate live data.
My guess is that something in the data itself causes an error of some
kind that blocks the main thread (the app that deadlocks uses the main
thread to receive the data over sockets, so it could be) - it's the
only logical thing I could come up w/, even though I don't have any
reasonable explanation for it. And just to add (if I didn't mention
it), it's the socket in the main thread (built-in 'managed sockets')
that receives the data, while a background thread processes it and
calls the C++/CLI part which deadlocks.
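
For reference, a stress test of the kind discussed above could be shaped like the sketch below. It is only an assumption about what such a harness might look like, not the one actually tried: ParserStress, ParseCall and the sample strings are made up, and float.TryParse stands in for the C++/CLI Parser call, which would be passed in as the delegate.

```csharp
using System;
using System.Threading;

// Hypothetical harness; names and sample data are illustrative only.
static class ParserStress
{
    // The call under test. In the real setup this would wrap the
    // C++/CLI parser; here any string-to-bool function can be passed.
    public delegate bool ParseCall(string text);

    private static long completed;

    // Runs the call from several threads and returns how many calls
    // finished. A count below threadCount * iterations suggests a hang.
    public static long Run(ParseCall call, int threadCount,
                           int iterations, TimeSpan timeout)
    {
        // Mix of valid, empty, null and malformed inputs.
        string[] samples = { "1.25", "42", "", null, "Not Found",
                             "12/31/2007", "09:30:00", "1.2.3", "\0garbage" };
        Interlocked.Exchange(ref completed, 0);

        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            workers[t] = new Thread(delegate()
            {
                for (int i = 0; i < iterations; i++)
                {
                    call(samples[i % samples.Length]);
                    Interlocked.Increment(ref completed);
                }
            });
            // A hung worker must not keep the process alive.
            workers[t].IsBackground = true;
            workers[t].Start();
        }

        // Wait for each worker in turn; give up at the first timeout.
        foreach (Thread worker in workers)
        {
            if (!worker.Join(timeout))
            {
                Console.WriteLine("Worker did not finish in time; possible hang after "
                    + Interlocked.Read(ref completed) + " calls.");
                break;
            }
        }
        return Interlocked.Read(ref completed);
    }

    static void Main()
    {
        long done = Run(
            delegate(string s) { float f; return float.TryParse(s, out f); },
            4, 100000, TimeSpan.FromSeconds(30));
        Console.WriteLine("completed calls: " + done);
    }
}
```

A synthetic input list like this may well miss whatever the live feed contains, so recording a slice of the real feed and replaying it through the same harness would be the natural next step.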

So, to rephrase my question: within the realm of the code I've pasted
above (as that's effectively the test unit which deadlocks under the
real-time feed data), is there anything within the 'system calls'
(CLR, that is) which could cause such behavior? What about that string
buffer issue I've read about somewhere, or something else?

Also, any recommendations on how to debug such an application? The
debugger deadlocks even faster, so I'm unable to do anything but trace
things, which I did and which didn't help. Is there any other way of
debugging such deadlocks (I know there were pdb-s of system libraries
for C++ so you could debug into them; is there anything similar for
the CLR?). As much as I'd like you to debug it for me :) that's not
what I had in mind; I'd rather get some pointers on what could cause
this, where to look, and how to debug it.

Thanks again
 