Saving Data to a Disconnected (Temporarily Unreachable) Database

  • Thread starter: Charles
Yes, I agree. I think it will make sense for it to be a thread; we have
enough separate processes running already.

Back to the end of my original question, I suppose it is too much to hope
that there is a tried and tested mechanism to handle all this for me?

Charles


r norman said:
If the data must be loaded into the database in chronological order,
then you have no choice. ALL data must be stuffed into a queue as it
arrives. A separate thread (process?) then attempts to store it all
into the database as fast as the database will accept it. If the
database is unavailable for a time, the queue just backs up but all
new data still goes to the back of the line.


Even if you load the files every 30 seconds or every minute?

Hmm. Maybe.

The tricky bit, as I see it, is how to manage the data load when the
database comes back on-line. The incoming data stream is relentless, and
there is no way of stopping it. When the database goes off-line, we can
certainly divert the stream to a file. When the database comes back, the
file has to be emptied into the database whilst data are still coming in.
Managing that last bit so that the data are loaded in chronological order
is where I see the complexity.

Charles


Even if you load the files every 30 seconds or every minute?

Well, you could certainly have a flag that the app looks at that says
"write to DB" or "write to file". When the flag is set to the latter, it
writes to the file instead. When the database comes back up, you change
the flag back and load any files that were written in the meantime.
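The flag idea can be sketched in a few lines (an illustration only; FlaggedWriter, its in-memory "database" and "file" lists, and the reload step are invented stand-ins for the real I/O):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the "write to DB or write to file" flag.
// dbUp would be flipped by whatever monitors the database connection.
class FlaggedWriter {
    private final AtomicBoolean dbUp = new AtomicBoolean(true);
    final List<String> db = new ArrayList<>();   // stand-in for the database
    final List<String> file = new ArrayList<>(); // stand-in for the spill file

    void setDbUp(boolean up) { dbUp.set(up); }

    void write(String record) {
        if (dbUp.get()) {
            db.add(record);   // normal path: straight to the database
        } else {
            file.add(record); // outage path: divert to the file
        }
    }

    // Called after the flag goes back to "up": load the spilled records.
    void reload() {
        db.addAll(file);
        file.clear();
    }
}
```

Note the ordering catch raised elsewhere in the thread: a record written directly once the flag flips back lands in the database before the spilled records, so the reload step needs care if chronological order matters.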



On 5/23/09 2:14 PM, in article (e-mail address removed),

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for
that eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

This would be a very attractive feature when we have the solution in
place, but, as I say, when the database is up the data need to be saved
as they arrive, not in batch.

Charles
 
Charles said:
Yes, I agree. I think it will make sense for it to be a thread; we have
enough separate processes running already.

Back to the end of my original question, I suppose it is too much to hope
that there is a tried and tested mechanism to handle all this for me?

If there is, I sure would love to see it!

When you do it yourself, make sure that your queue is made
thread-safe. The producer and consumer may try simultaneous access.

My own code is C++ and MFC and Access database and you post to .Net
and SQL server groups so I doubt my stuff would be useful. My
interest is because all my new development is in C# .NET and SQL
server and I would rather not port over outdated stuff.
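The queue-plus-consumer design described above might be sketched like this (illustrative Java rather than the poster's C++/MFC or the OP's C#; the DbSink interface is an invented stand-in for the real database call, and a real consumer would run on its own thread):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// The producer enqueues records as they arrive; a consumer drains them to
// the database in arrival order. LinkedBlockingQueue is thread-safe, so the
// producer and consumer can touch it simultaneously.
class QueuedWriter {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Hypothetical stand-in for the real database call.
    interface DbSink {
        boolean tryWrite(String record); // false = database unavailable
    }

    // Producer side: never blocks; if the DB is down the queue just backs up.
    void enqueue(String record) {
        queue.add(record);
    }

    // Consumer side: write queued records until the queue is empty or the
    // database stops accepting. A record is removed only after a successful
    // write, so chronological order survives an outage. (Assumes a single
    // consumer; with several, peek/poll would need coordination.)
    int flush(DbSink sink) {
        int written = 0;
        String record;
        while ((record = queue.peek()) != null) {
            if (!sink.tryWrite(record)) {
                break; // leave it queued; retry on the next flush
            }
            queue.poll();
            written++;
        }
        return written;
    }
}
```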


 
 
Charles said:
Back to the end of my original question, I suppose it is too much to hope
that there is a tried and tested mechanism to handle all this for me?

The problem is that whatever mechanism you choose could fail. You could
write to disk if the database is down, but the disk could run out of
space, or fail outright, and so on.
When the database comes back, the file has to be emptied to the database
whilst data are still coming in. Managing that last bit so that the data
are loaded in chronological order is where I see the complexity.

You can probably win a lot if you can design away from that restriction.
If you can't, maybe your best option is to always write to a file and
have another program read from the file. Or look into queueing
solutions like MQ.
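Erland's fallback suggestion, always write to a file and let another program read it, could look roughly like this (a sketch with invented names; a production spool would also need locking, rotation, and crash-safe offset tracking):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// One process appends every record to a spool file; a separate process
// (or thread) reads the file and loads rows into the database,
// remembering how many lines it has already consumed.
class FileSpool {
    private final Path spool;
    private int consumed = 0; // lines already loaded into the DB

    FileSpool(Path spool) { this.spool = spool; }

    // Producer side: append-only, so arrival order is preserved on disk.
    void append(String record) throws IOException {
        Files.writeString(spool, record + System.lineSeparator(),
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Consumer side: return the records not yet loaded, in order.
    List<String> drain() throws IOException {
        List<String> lines = Files.readAllLines(spool, StandardCharsets.UTF_8);
        List<String> fresh = lines.subList(consumed, lines.size());
        consumed = lines.size();
        return List.copyOf(fresh);
    }
}
```

Because the reader is a separate program, the database can be taken down for maintenance without touching the producer at all, as suggested earlier in the thread.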


--
Erland Sommarskog, SQL Server MVP, (e-mail address removed)

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
 
implement a

Data Strategy
    DB
    MSMQ

Create a Composite of that strategy that attempts to write to the DB; if it
succeeds it exits, else it moves to the next persistence strategy, which IMHO
should be MSMQ. Just persist your object to MSMQ and have a service polling the
DB for availability and MSMQ for messages. This will automatically do what you
want with no flags turned on or off, and therefore no complexity to go wrong.
Patterns to your rescue. The rest is easy. MSMQ can also be used in a
transaction.

If you want a better example of a Strategy/Composite pattern, let me know.
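The Strategy/Composite idea above can be sketched as follows (an illustration with invented names; the real strategies would wrap the database and MSMQ calls rather than in-memory stubs):

```java
import java.util.List;

// Strategy: one way of persisting a record.
interface PersistenceStrategy {
    boolean save(String record); // false if this sink is unavailable
}

// Composite: tries each strategy in order until one succeeds.
class CompositePersistence implements PersistenceStrategy {
    private final List<PersistenceStrategy> strategies;

    CompositePersistence(List<PersistenceStrategy> strategies) {
        this.strategies = strategies;
    }

    @Override
    public boolean save(String record) {
        for (PersistenceStrategy s : strategies) {
            if (s.save(record)) {
                return true; // first success wins: DB first, queue as fallback
            }
        }
        return false; // every sink failed
    }
}
```

In the suggested design the first strategy writes to the database and the second to MSMQ; a background service then drains the queue whenever the database is reachable again.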
 
 
Thank you. MSMQ seems like a valuable tool to know about.

My impression is that the original poster needs to use MSMQ as the
first strategy to ensure that events get posted to the database in the
order they are generated, whether the database is available or not.
 
 
If that's the case then yes, they would need to always update the DB via MSMQ.

A strategy is still the way to go, but this would require no composite.
 
 
Charles,

I would consider using SQL Server Service Broker for this.

On your source system, install SQL Server Express (free). Then create the
message/contract/queue/service on both sides and allow communications
between instances, using Windows or certificate-based
authentication/encryption as desired.

Your SQL Server Express will serve as a "cache" and will deliver data to your
master SQL Server in an asynchronous manner, near real time. Your application
will write into the local queue and Service Broker will take care of
"pushing" the data over. On your receiving side, create an activation
procedure with 2 to 3 threads and make sure there is very little logic in it,
just enough to write a message into a table. The message can be any format
that can be cast to varbinary; XML sounds good, varchar(max) too.

Service Broker is designed to work over not-so-reliable networks and has a
retry mechanism to deliver when the connection comes back up. And it is
guaranteed to deliver in sequence, particularly if you fix the sending
conversation group to a fixed value.

You can also create permanent conversations so as not to waste resources on
opening and closing conversations; remember, conversations can live forever.

good luck...

thanks
 
Hi Farmer

Thanks for that. Sounds great. I came across Service Broker as almost the
first thing, but thought that because it was SQL Server to SQL Server it
either didn't apply, or there would still be a problem if the network went
down. It didn't occur to me that I should install one copy locally. I will
certainly go back and give this another look.

A couple of questions spring to mind:

You say near real time. I have perhaps 10 records per second that need to be
stored. I wonder what sort of delay I could expect to see storing these. Do
you have any feel for that?

Secondly, the data come in to more than one client, behind NLB. If each
client has its own local database instance, what guarantee, if any, is there
that messages from those two local instances will arrive at the remote
server in order?

Thanks for any further thoughts.

Charles
 
Charles,

Why do I say near real time? Because it is an asynchronous process and
therefore not real time. Besides, your network is not reliable, right?

You can get more information on how messages are delivered in
ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10sb_0evalplan/html/274ad58c-3cb7-4ffe-94ef-be2fc819eacc.htm

"Dialogs provide exactly-once-in-order (EOIO) message delivery. Dialogs use
the conversation identifier and sequence numbers that are contained in each
message to identify related messages and deliver messages in the correct
order. A dialog is a reliable, persistent stream of messages between two
services."

If you use client to client, then the above guarantee stands.
I am not sure this guarantee holds across more than one client. If you send
a message from client A (very slow network) and client B (very fast) using
the same related conversation group, then B may win. Yet, you could use the
new SQL 2008 datetime2 type with its 100-nanosecond precision, send it in
the message, and do your own post-receive sequencing.
How do you guarantee that two events A and B, somewhere in the world,
happened in the order A before B? :) Once you solve that, you can transpose
the solution into this one.

from my own experience...

I have an application (a state machine / workflow engine) that is built on
SQL Service Broker (SSB) and, with all the plumbing, it is capable of
processing and executing 120 message actions a second. If it were just
receiving, I bet the rate would be far greater.

Try building with the BEGIN CONVERSATION ... END CONVERSATION model. If that
is not fast enough, you can establish permanent conversations, which exclude
the open/close conversation overhead. Beyond that it also depends on network
latency. On the receiving side, define a simple log table and RECEIVE all
waiting messages at once:

BEGIN TRAN;

RECEIVE *
INTO @table_variable
FROM dbo.MessageQueue;

INSERT INTO log_table
SELECT *
FROM @table_variable;

COMMIT;


IMHO, this model, implemented properly, will blow the socks off the MSMQ
approach.

Good book I own on this is "SQL Server 2005 Service Broker" by Roger Wolter.

thanks

Vladimir
 
Hi Vladimir

Thanks, this is great information. I will enjoy reading up on it and
implementing.

Cheers.

Charles
 