Strange memory problem in threaded application.


Andla Rand

Hi,

I almost feel like giving up on developing for .NET. I have written
several threaded programs whose goal is to parse web pages, and I have
tried WebClient and then WinSock. Both programs have memory leaks that are
not handled by the garbage collector. I don't know why my program eats so
much memory. I start the program at about 22 MB, and when running about
50 threads I'm up to about 100 MB within 30 seconds. Then it flattens out,
and after running for 30 minutes I'm at about 300 MB of used memory.

Are there any Windows controls that have this kind of memory problem?
I use a TextBox and a ListBox. The ListBox contains all the links that
should be parsed. I add links at the end and parse links from the beginning.

I'm grateful for any help you can give me on this.

Yours sincerely
Andla
 
Hi,

It depends on your code, I think. If you keep adding threads, memory
consumption will increase. Also, if you keep references to resources, they
will not be released.

The best way to track down those problems is to use a profiler. Take a look
at the memory profiler at http://www.scitech.se/memprofiler
I tested it and it gave me a very good idea of how the memory was being
consumed.

You could also post the code and others will check it and see what is wrong
with it.

Hope this helps,
 
Hi,

As far as I know, the garbage collector will start freeing memory only when
the framework is unable to allocate a new chunk of memory for an object
being created. That's why your application, provided you have enough RAM,
can run with 300 MB used memory and it is not due to memory leaks. You can
check, however, that you Dispose() all classes dealing with unmanaged
resources to prevent resource leaks (that can in turn cause memory leaks).

P.S. I have also heard the garbage collector works differently on server and
workstation OSes...this might also have some effect on the memory
allocation/freeing policy.
 
How do I release memory when the GC doesn't?

Here is some sample code.


void parse(int count) // count is the position in a list where the link is to be parsed
{
    this.Text = count.ToString();
    this.Update();

    try
    {
        string lnk = "";

        System.Threading.Thread.Sleep(200);

        //if(listBox1.Items.Count==0)
        if (arrli.Count == 0)
        {
            if (count > -1)
            {
                lnk = links[count];
            }
            else
            {
                return;
            }
        }
        else
        {
            lnk = (string)arrli[0];
            arrli.RemoveAt(0);
            //lnk=(string)listBox1.Items[0];
            //listBox1.Items.RemoveAt(0);
            //Label6.Text="Count = "+listBox1.Items.Count;
            Label6.Text = "Count = " + arrli.Count;
            Label6.Update();
        }

        // lnk is the URL I want to receive.

        int start = lnk.IndexOf("://");
        if (start == -1)
            return;

        start += "://".Length;
        int stop = lnk.IndexOf("/", start);
        if (stop == -1)
            stop = lnk.Length;
        string domain = lnk.Substring(start, stop - start);
        string path = lnk.Substring(stop, lnk.Length - stop).TrimEnd('/');

        TcpClient cli = new TcpClient();
        cli.Connect(domain, 80);

        NetworkStream stream = cli.GetStream();
        Encoding encoder = Encoding.GetEncoding("ascii");

        Byte[] request;
        if (path == "")
        {
            request = encoder.GetBytes("GET / HTTP/1.0\r\nHost: " + domain + "\r\n\r\n");
        }
        else
        {
            request = encoder.GetBytes("GET " + path + " HTTP/1.0\r\nHost: " + domain
                + "\r\nUser-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows)\r\n\r\n");
        }

        stream.Flush();

        stream.Write(request, 0, request.Length);

        while (!stream.DataAvailable)
        {
            Console.WriteLine("WAIT");
            Thread.Sleep(1000);
        }

        int buffcount = 0;
        Byte[] buffer = new byte[1024];
        String response = String.Empty;

        encoder = Encoding.GetEncoding("iso-8859-1");
        while ((buffcount = stream.Read(buffer, 0, buffer.Length)) > 0)
            response += encoder.GetString(buffer, 0, buffcount);

        stream.Close();
        cli.Close();
 
Andla Rand said:
How do I release memory when the GC doesn't?

You don't. However, using a StringBuilder rather than string
concatenation would help things. Also note that a simpler way of
getting the ASCII encoding is to just use the static property
Encoding.ASCII. Further, a using(...) construct would make sure that
your streams get closed even if an exception is thrown. Finally, note
that this looks like it could be a long-running task - and so it
shouldn't be done in the UI thread, whereas the manipulation of the UI
elements in the method *should* be done in the UI thread (using
Control.Invoke).
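
To make those suggestions concrete, here is a minimal sketch of the read loop using a StringBuilder, Encoding.ASCII and a using block. It is not the original poster's program: the class name ResponseReader and the Demo method are made up for illustration, and a MemoryStream stands in for the NetworkStream so the sketch runs without a network.

```csharp
using System;
using System.IO;
using System.Text;

static class ResponseReader
{
    // Reads a stream to the end and decodes it, appending to one
    // StringBuilder instead of creating a new string on every iteration.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        StringBuilder response = new StringBuilder();
        byte[] buffer = new byte[1024];
        int count;
        while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            response.Append(encoding.GetString(buffer, 0, count));
        }
        return response.ToString();
    }

    // Hypothetical demo: a MemoryStream plays the role of the network stream.
    public static string Demo()
    {
        byte[] fake = Encoding.ASCII.GetBytes("HTTP/1.0 200 OK\r\n\r\nhello");
        // The using block disposes the stream even if ReadAll throws.
        using (Stream stream = new MemoryStream(fake))
        {
            return ReadAll(stream, Encoding.ASCII);
        }
    }
}
```

Decoding each chunk separately like this is only safe for single-byte encodings such as ASCII or iso-8859-1. With a multi-byte encoding, a character could be split across two reads.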
 
Thanks for your answer.

I suppose that the using(...) is only for objects that implement
IDisposable and I suppose these objects have a Dispose() method. The only
Dispose method I saw was in the listbox1 control but I'd rather leave that
alone :-). Thanks for your answer, it makes me feel better and hopefully
leads to a program fix in the end. I read that Service Pack 2 should fix
the 'out of memory' error (an extremely bad error: if the program
is running in debug mode it just ends without any warnings, leaving only
one line in the debugger output, 'out of memory error').

Best regards
Andla
 
Andla Rand said:
Thanks for your answer.

I suppose that the using(...) is only for objects that implement
IDisposable and I suppose these objects have a Dispose() method.

Yes.

The only
Dispose method I saw was in the listbox1 control but I'd rather leave that
alone :-)

Have a look at Stream - that implements IDisposable too.

Thanks for your answer, it makes me feel better and hopefully
leads to a program fix in the end. I read that Service Pack 2 should fix
the 'out of memory' error (an extremely bad error: if the program
is running in debug mode it just ends without any warnings, leaving only
one line in the debugger output, 'out of memory error').

That's possible, yes. I wouldn't have thought your current code
*should* actually leak memory in the first place, but the suggestions I
made should help a bit anyway - certainly avoiding repeated string
concatenation will help to stop so much "memory churn" in the first
place.
 
I think this is a bit misleading. If you watch the GC performance counter on
an app, you can see GCs happening fairly frequently. It doesn't wait till
there's literally no more memory left in the system, which is the impression
I got from your post.

I read somewhere that it runs when the gen 0 heap is full, and hence I guess
it will expand the gen 0 heap size as required if the app is hungry. I don't
know if that's correct or not. Mostly you hear about the GC reacting to
"memory pressure", meaning that it will run more frequently when there is
more stress on the memory, but I haven't really seen a definitive
description of when the GC decides to come out of hibernation.
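
One way to check this without the performance counters is to ask the runtime how many gen 0 collections it has done. The following is only a rough sketch: GC.CollectionCount is a real API (available from .NET 2.0 on), but the class name GcWatch, the 1 KB allocation size and the small "keep" array are arbitrary choices for illustration, and the exact count will vary by machine and runtime.

```csharp
using System;

static class GcWatch
{
    // Allocates many short-lived arrays and returns how many gen 0
    // collections happened while doing so.
    public static int Gen0CollectionsDuringChurn(int allocations)
    {
        // A few slots that hold the newest arrays, so each allocation really
        // lands on the managed heap; older arrays become garbage.
        byte[][] keep = new byte[16][];
        int before = GC.CollectionCount(0);
        for (int i = 0; i < allocations; i++)
        {
            keep[i % keep.Length] = new byte[1024];
        }
        int after = GC.CollectionCount(0);
        GC.KeepAlive(keep);
        return after - before;
    }
}
```

Calling GcWatch.Gen0CollectionsDuringChurn with a couple of million allocations should normally report collections long before the machine is anywhere near out of memory, which would fit the "gen 0 is full" explanation better than the "no memory left" one.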

Niall

 
I read somewhere that it runs when the gen 0 heap is full

This seems to be closer to the truth. I will consult Jeffrey Richter's book
when I have some free time - as far as I know, it describes the GC
algorithm at length.

--
Dmitriy Lapshin [C# / .NET MVP]
X-Unity Test Studio
http://x-unity.miik.com.ua/teststudio.aspx
Bring the power of unit testing to VS .NET IDE

 
In addition, I doubt that 50 threads really helps you, and it will likely
hurt more than it helps. Why 50 threads? Can you refactor to use a few
worker threads? This should also help memory and performance.
 
I can't help myself. I love threads :-)

Seriously, if one thread is waiting for a server to respond, then why not
connect to another server? Right now the bottleneck is actually my code
rather than downloading pages. But if it weren't, then I wouldn't even know
that threads are waiting for slow servers. 50 threads may be a lot, but I'm
still tweaking it for best performance.

Regards
Andla
 
What about the thread pool? If a thread pool thread is blocked, the thread
pool can automatically bring another into operation. Of course, it's not
going to be perfectly tailored to your situation, but it might help.
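
As a rough sketch of what that could look like, the code below queues one work item per link with ThreadPool.QueueUserWorkItem and waits for all of them. The names PoolDemo and FetchStub are invented for illustration, FetchStub just sleeps instead of downloading anything, and CountdownEvent assumes .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class PoolDemo
{
    // Queues one work item per link on the shared thread pool
    // and blocks until every item has finished.
    public static List<string> ProcessAll(IList<string> links)
    {
        List<string> results = new List<string>();
        if (links.Count == 0) return results;

        using (CountdownEvent done = new CountdownEvent(links.Count))
        {
            foreach (string link in links)
            {
                ThreadPool.QueueUserWorkItem(delegate(object state)
                {
                    try
                    {
                        string outcome = FetchStub((string)state);
                        lock (results) { results.Add(outcome); }
                    }
                    finally
                    {
                        done.Signal(); // always count down, even on failure
                    }
                }, link);
            }
            done.Wait();
        }
        return results;
    }

    // Hypothetical stand-in for the real download and parse step.
    static string FetchStub(string url)
    {
        Thread.Sleep(50); // pretend to wait on a slow server
        return "done: " + url;
    }
}
```

The pool decides how many threads actually run at once, so the program no longer has to pick a number like 50 up front.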

Niall
 
True, if each thread is waiting for IO, then you can fire up more requests
and let some IO device (in this case the network) handle work for a while.
The problem is that at some point you have to pay the tax. As these queued-up
requests come back, your threads will all compete for CPU time to do work.
This competition (sync locks, context switches) makes many threads slower
than a few worker threads (in general). With a single CPU the ideal scenario
would be one thread working and one thread waiting for IO; when the first
thread is done, it waits for IO and the other starts working on the IO it
was waiting for. So at all times one thread is in the ready queue or working
and one is waiting for IO. Naturally, reality does not work so cleanly, but
the idea still works better for the most part. Most of the time, you can get
more work done with fewer threads if you factor your consumers/producers
correctly.
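
A minimal sketch of that producer/consumer shape, assuming .NET 4 or later for BlockingCollection: one producer fills a queue with links and a small, fixed number of worker threads drain it. WorkerDemo and ParseStub are hypothetical names, and ParseStub only sleeps where the real fetch and parse would go.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

static class WorkerDemo
{
    // Feeds all links into a shared queue and lets a fixed number of
    // worker threads consume them. Returns how many links were processed.
    public static int Run(IEnumerable<string> links, int workerCount)
    {
        int processed = 0;
        using (BlockingCollection<string> queue = new BlockingCollection<string>())
        {
            Thread[] workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(delegate()
                {
                    // Blocks while the queue is empty; ends after CompleteAdding.
                    foreach (string link in queue.GetConsumingEnumerable())
                    {
                        ParseStub(link);
                        Interlocked.Increment(ref processed);
                    }
                });
                workers[i].Start();
            }

            foreach (string link in links) queue.Add(link); // producer side
            queue.CompleteAdding();

            foreach (Thread worker in workers) worker.Join();
        }
        return processed;
    }

    // Hypothetical stand-in for fetching and parsing one page.
    static void ParseStub(string link)
    {
        Thread.Sleep(10);
    }
}
```

With this layout the worker count is a single parameter, so it is easy to measure whether 2, 5 or 50 workers actually gives the best throughput.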
 