Reading data from a Socket

  • Thread starter: Ross Neilson

Ross Neilson

I have coded a method which reads data from a socket in chunks of 2048
bytes and concatenates those chunks into a string before returning the
string to the caller. Almost all the time this code works perfectly
well. Occasionally, for a reason I've yet to determine, the returned
string seems to be truncated at 2920 characters. This causes problems
in another part of my app because the string is XML, which then fails
to be loaded into an XmlDocument object. The code for the method is:

Friend Shared Function ReadAllDataFromSocket(ByVal reciever As Socket) As String

    Const BUFFER_SIZE As Integer = 2048
    Dim buffer(BUFFER_SIZE - 1) As Byte
    Dim bytesRead As Integer = BUFFER_SIZE
    Dim sTemp As StringBuilder = New StringBuilder(BUFFER_SIZE)
    Dim DataRead As String = Nothing

    ' Loop until all data has been read.
    While (bytesRead = BUFFER_SIZE)

        ' Read this chunk of data.
        bytesRead = reciever.Receive(buffer)

        ' Check that some data was read.
        If bytesRead > 0 Then
            ' Add this data to the string builder.
            DataRead = Encoding.UTF8.GetString(buffer, 0, bytesRead)
            sTemp = sTemp.Append(DataRead)
        End If

    End While

    ' Return the string to the caller.
    Return sTemp.ToString()

End Function

I tried recoding the method so that it reads the data all in one go,
using the Available property of the socket to determine the buffer
size. However, Available seems to be very unreliable and I've seen many
people recommend avoiding it.

Does anyone have any ideas as to why I get this problem, or can anyone
suggest an alternative way of getting the data out of the socket?

Thanks,

Ross
 
Ross Neilson said:
I have coded a method which reads data from a socket in chunks of 2048
and concatenates those chunks into a string before returning the
string to the caller. Almost all the time this code works perfectly
well. Occasionally, for a reason I've yet to determine, the returned
string seems to be truncated at 2920 characters.

You have two problems here:

1) You stop as soon as you read less than your buffer size at a time -
you should stop when you read 0 bytes.

2) A UTF-8 character can span multiple bytes, so you should use a
Decoder rather than just Encoding.GetString. (The Decoder saves the
state in terms of undecoded bytes.)
 
Thanks for the quick reply Jon. If I've understood your first point
correctly, I think I need to change the While condition from

While (bytesRead = BUFFER_SIZE)

to

While (bytesRead > 0)

is this right?

As for the second point I will look into that further.

Regards,

Ross
 
Ross Neilson said:
Thanks for the quick reply Jon. If I've understood your first point
correctly, I think I need to change the While condition from

While (bytesRead = BUFFER_SIZE)

to

While (bytesRead > 0)

is this right?

Yup.

As for the second point I will look into that further.

It's nice and easy - just get the decoder from Encoding.UTF8, and use
the same one for the whole while loop:

Decoder decoder = Encoding.UTF8.GetDecoder();
char[] charBuffer = new char[SomeBufferSize];

while (...)
{
    ...
    int decoded = decoder.GetChars(byteBuffer, 0, bytesRead,
                                   charBuffer, 0);
    builder.Append(charBuffer, 0, decoded);
}
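
To see why the shared Decoder matters, here is a small sketch that skips the socket entirely and feeds two byte chunks through both approaches. The class and method names (ChunkDecoding, DecodePerChunk, DecodeWithDecoder) are made up for illustration; the chunks simulate a two-byte UTF-8 character that happens to be split across two reads.

```csharp
using System;
using System.Text;

public static class ChunkDecoding
{
    // Decodes every chunk on its own, the way Encoding.GetString does.
    // A character split across two chunks cannot be reassembled.
    public static string DecodePerChunk(byte[][] chunks)
    {
        var builder = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            builder.Append(Encoding.UTF8.GetString(chunk, 0, chunk.Length));
        }
        return builder.ToString();
    }

    // Decodes all chunks with one Decoder, which remembers any trailing
    // incomplete byte sequence and completes it with the next chunk.
    public static string DecodeWithDecoder(byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var builder = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // GetMaxCharCount leaves room for a character carried over
            // from the previous chunk.
            char[] chars = new char[Encoding.UTF8.GetMaxCharCount(chunk.Length)];
            int decoded = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            builder.Append(chars, 0, decoded);
        }
        return builder.ToString();
    }
}
```

With the chunks { 0x3C, 0xC3 } and { 0xA9, 0x3E } (the text "<é>" cut in the middle of the é), DecodeWithDecoder gives back the original text, while DecodePerChunk does not.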
 
Jon,

I tried changing the loop as per your instructions in point 1. I'm now
finding that, on the iteration immediately after I've received a number
of bytes less than the buffer size (usually the last chunk of data), the
app is hanging on the line

bytesRead = reciever.Receive(buffer)

Looking at the documentation I think this is because I'm using a
blocking socket. If I were using a non-blocking socket it would just
throw an exception. I'm afraid I don't really understand how sockets
work, but surely there must be a simple, robust way to get all the data
read from a socket?

Thanks,

Ross
 
Ross Neilson said:
I tried changing the loop as per your instructions in point 1. I'm now
finding that, on the iteration immediately after I've received a number
of bytes less than the buffer size (usually the last chunk of data), the
app is hanging on the line

bytesRead = reciever.Receive(buffer)

Looking at the documentation I think this is because I'm using a
blocking socket. If I were using a non-blocking socket it would just
throw an exception. I'm afraid I don't really understand how sockets
work, but surely there must be a simple, robust way to get all the data
read from a socket?

Is the other end not closing the socket? If it isn't, then you will
indeed hang - the client has no way of knowing if the other end is
about to send more data. Typically you fix this by making sure in the
protocol that you know how much data to expect.
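
As a rough illustration of "knowing how much data to expect", here is a sketch that assumes a purely hypothetical protocol in which every message is preceded by a 4-byte big-endian length. Nothing in this thread says the real server does that, and the names LengthPrefixedReader, ReadExactly and ReadMessage are invented for the example. It reads from a Stream (a NetworkStream wrapped around the socket would do) so it can be tried out without a network.

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixedReader
{
    // Reads exactly 'count' bytes. A single Read may return fewer bytes
    // than requested, so keep going until the buffer is full.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException(
                    "Connection closed before the full message arrived.");
            }
            offset += read;
        }
        return buffer;
    }

    // Assumed framing: 4-byte big-endian payload length, then UTF-8 payload.
    public static string ReadMessage(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        int length = (header[0] << 24) | (header[1] << 16)
                   | (header[2] << 8) | header[3];
        if (length < 0)
        {
            throw new InvalidDataException("Negative message length.");
        }
        byte[] payload = ReadExactly(stream, length);
        // The whole payload is in memory, so decoding in one go is safe here.
        return Encoding.UTF8.GetString(payload);
    }
}
```

Because the reader stops after exactly the announced number of bytes, it never makes the extra Receive call that blocks when the other end keeps the connection open.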
 
Jon,

You're right, the other end does not close the socket - I have no way to
alter this behaviour either.

Regarding your original response, you state that

"You stop as soon as you read less than your buffer size at a time - you
should stop when you read 0 bytes."

This seems to imply that, even if the number of bytes received is less
than the buffer size, it is still possible to get further data from the
socket next time you read from it. However that seems at odds with the
behaviour I've observed - I do not receive any data, the code just hangs
on the call to Socket.Receive instead.

To the uninitiated such as myself this seems crazy. Given that my socket
must remain open is there another way of reliably getting this data out?

Thanks,

Ross
 
Ross Neilson said:
You're right, the other end does not close the socket - I have no way to
alter this behaviour either.

Regarding your original response, you state that

"You stop as soon as you read less than your buffer size at a time - you
should stop when you read 0 bytes."

This seems to imply that, even if the number of bytes received is less
than the buffer size, it is still possible to get further data from the
socket next time you read from it.

Yup, quite possibly.

However that seems at odds with the
behaviour I've observed - I do not receive any data, the code just hangs
on the call to Socket.Receive instead.

Imagine if the other side waits for a few minutes and *then* sends some
more data. You could then receive that data. If the other side doesn't
send any more data, your call will block.

To the uninitiated such as myself this seems crazy. Given that my socket
must remain open is there another way of reliably getting this data out?

Not really - you can have a timeout on the read (using asynchronous
reads, for example) but that's about it. How are you expecting to tell
the difference between a server which has finished writing and a server
which is just pausing?
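
For what it's worth, a timeout-based read might look something like the sketch below. It uses the socket's ReceiveTimeout property on a blocking socket rather than asynchronous reads, simply because that is shorter to show. The names TimedSocketReader, ReadUntilQuiet and idleTimeoutMs are invented for the example. It treats "no data for idleTimeoutMs milliseconds" as "the server has finished", which is only a guess: a slow server will produce a truncated result, so the limitation above still applies.

```csharp
using System;
using System.Net.Sockets;
using System.Text;

public static class TimedSocketReader
{
    // Reads until the peer closes the connection or stays silent for
    // idleTimeoutMs milliseconds, whichever happens first.
    public static string ReadUntilQuiet(Socket socket, int idleTimeoutMs)
    {
        const int BufferSize = 2048;
        byte[] byteBuffer = new byte[BufferSize];
        char[] charBuffer = new char[Encoding.UTF8.GetMaxCharCount(BufferSize)];
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var builder = new StringBuilder();

        socket.ReceiveTimeout = idleTimeoutMs;
        while (true)
        {
            int bytesRead;
            try
            {
                bytesRead = socket.Receive(byteBuffer);
            }
            catch (SocketException ex) when (ex.SocketErrorCode == SocketError.TimedOut)
            {
                // Assumption: silence this long means the message is complete.
                break;
            }

            if (bytesRead == 0)
            {
                break; // The other end closed the connection.
            }

            int decoded = decoder.GetChars(byteBuffer, 0, bytesRead, charBuffer, 0);
            builder.Append(charBuffer, 0, decoded);
        }
        return builder.ToString();
    }
}
```

Whether the socket can safely be reused after a Receive call has timed out is worth checking against the documentation for the platform in question before relying on this with a connection that has to stay open.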
 
I'm not sure how I can tell the difference, but I'll keep working on it.
Thanks for your help.

Regards,

Ross
 