Saving Data to a Disconnected (Temporarily Unreachable) Database

Charles · May 23, 2009

We have a process (a Windows application) that receives data over the
network. After a couple of validity checks, the data must be saved to a
table in a SQL Server 2005 database. This is fine when the database is up,
and reachable, but if it is unreachable for a short time we still need to be
able to save the data in the table.

Obviously, we can't actually save the data to the table while the database
is down, so some mechanism needs to be employed to cache the data until the
database comes back, and then save it when it next can.

This seems like something that might be commonly required, so I wonder if
there is a tried and tested technique that I could employ?

TIA

Charles

Aaron Bertrand [SQL Server MVP] · May 23, 2009

Why not write the data to log files *always*, and have the database pick up
and bulk insert the log files on a schedule. This way the write load on the
database will not be at the mercy of volume patterns, and you can control
how often the log files are loaded (including turning off the process when
you want to perform maintenance on the database).

Aaron Bertrand [SQL Server MVP] · May 23, 2009

Why not write the data to log files *always*, and have the database pick up
and bulk insert the log files on a schedule. This way the write load on the
database will not be at the mercy of volume patterns, and you can control
how often the log files are loaded (including turning off the process when
you want to perform maintenance on the database).

r norman · May 23, 2009

We have a process (a Windows application) that receives data over the
network. After a couple of validity checks, the data must be saved to a
table in a SQL Server 2005 database. This is fine when the database is up,
and reachable, but if it is unreachable for a short time we still need to be
able to save the data in the table.

Obviously, we can't actually save the data to the table while the database
is down, so some mechanism needs to be employed to cache the data until the
database comes back, and then save it when it next can.

This seems like something that might be commonly required, so I wonder if
there is a tried and tested technique that I could employ?

I stash the data in a thread-safe queue and start a worker thread to
process it. It works well but making sure that the queue gets
properly saved somewhere if the main process shuts down and then gets
properly restored when it starts up again is a real pain. I would
love to see a better solution.

r norman · May 23, 2009

We have a process (a Windows application) that receives data over the
network. After a couple of validity checks, the data must be saved to a
table in a SQL Server 2005 database. This is fine when the database is up,
and reachable, but if it is unreachable for a short time we still need to be
able to save the data in the table.

Obviously, we can't actually save the data to the table while the database
is down, so some mechanism needs to be employed to cache the data until the
database comes back, and then save it when it next can.

This seems like something that might be commonly required, so I wonder if
there is a tried and tested technique that I could employ?

I stash the data in a thread-safe queue and start a worker thread to
process it. It works well but making sure that the queue gets
properly saved somewhere if the main process shuts down and then gets
properly restored when it starts up again is a real pain. I would
love to see a better solution.

Charles · May 23, 2009

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

This would be a very attractive feature when we have the solution in place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Charles · May 23, 2009

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

This would be a very attractive feature when we have the solution in place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Charles · May 23, 2009

Do you persist the queue in any way? I had thought of some kind of managed
FIFO, but if it would have to be saved locally to disk to make it safe.

Charles

Charles · May 23, 2009

Do you persist the queue in any way? I had thought of some kind of managed
FIFO, but if it would have to be saved locally to disk to make it safe.

Charles

Aaron Bertrand [SQL Server MVP] · May 23, 2009

Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says "write
to DB" or "write to file"... When the flag is set to the latter, it writes
to the file instead. When the database comes back up, you change the flag
back, and load any files that were written in the meantime.

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

Click to expand...

This would be a very attractive feature when we have the solution in place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Aaron Bertrand [SQL Server MVP] · May 23, 2009

Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says "write
to DB" or "write to file"... When the flag is set to the latter, it writes
to the file instead. When the database comes back up, you change the flag
back, and load any files that were written in the meantime.

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

Click to expand...

This would be a very attractive feature when we have the solution in place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

r norman · May 23, 2009

Do you persist the queue in any way? I had thought of some kind of managed
FIFO, but if it would have to be saved locally to disk to make it safe.

Yes, that is what I meant by saying it gets saved somewhere if the
system shuts down. I write it out to a file and restore the
unprocessed data back to the queue on startup so that it can get
handled.

It really is a dedicated system that runs constantly without user
intervention acting as a server for a remote user interface. But if
it has to be stopped and restarted for local maintenance, all unsaved
data has to be accounted for and handled eventually.

r norman · May 23, 2009

Do you persist the queue in any way? I had thought of some kind of managed
FIFO, but if it would have to be saved locally to disk to make it safe.

Yes, that is what I meant by saying it gets saved somewhere if the
system shuts down. I write it out to a file and restore the
unprocessed data back to the queue on startup so that it can get
handled.

It really is a dedicated system that runs constantly without user
intervention acting as a server for a remote user interface. But if
it has to be stopped and restarted for local maintenance, all unsaved
data has to be accounted for and handled eventually.

Charles · May 24, 2009

Even if you load the files every 30 seconds or every minute?

Hmm. Maybe.

The tricky bit, as I see it, is how to manage the data load when the
database comes back on-line. The incoming data stream is relentless, and
there is no way of stopping it. When the database goes off-line, we can
certainly divert the stream to a file. When the database comes back, the
file has to be emptied to the database whilst data are still coming in.
Managing that last bit so that the data are loaded in chronological order is
where I see the complexity.

Charles

Aaron Bertrand said:
Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says
"write
to DB" or "write to file"... When the flag is set to the latter, it writes
to the file instead. When the database comes back up, you change the flag
back, and load any files that were written in the meantime.

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for
that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

Click to expand...

This would be a very attractive feature when we have the solution in
place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Click to expand...

Charles · May 24, 2009

Even if you load the files every 30 seconds or every minute?

Hmm. Maybe.

The tricky bit, as I see it, is how to manage the data load when the
database comes back on-line. The incoming data stream is relentless, and
there is no way of stopping it. When the database goes off-line, we can
certainly divert the stream to a file. When the database comes back, the
file has to be emptied to the database whilst data are still coming in.
Managing that last bit so that the data are loaded in chronological order is
where I see the complexity.

Charles

Aaron Bertrand said:
Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says
"write
to DB" or "write to file"... When the flag is set to the latter, it writes
to the file instead. When the database comes back up, you change the flag
back, and load any files that were written in the meantime.

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for
that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

Click to expand...

This would be a very attractive feature when we have the solution in
place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Click to expand...

Charles · May 24, 2009

... restore the unprocessed data back to the queue on startup

so that it can get handled.

A little light has just gone on. What you're saying is that the data saved
to file get restored to the head of the queue, so that the mechanism for
loading to the database just carries on as if nothing had happened. I think
that is what I should be looking at.

Great, thanks.

Charles

Charles · May 24, 2009

... restore the unprocessed data back to the queue on startup

so that it can get handled.

A little light has just gone on. What you're saying is that the data saved
to file get restored to the head of the queue, so that the mechanism for
loading to the database just carries on as if nothing had happened. I think
that is what I should be looking at.

Great, thanks.

Charles

r norman · May 24, 2009

If the data must be loaded into the database in chronological order,
then you have no choice. ALL data must be stuffed into a queue as it
arrives. A separate thread (process?) then attempts to store it all
into the database as fast as the database will accept it. If the
database is unavailable for a time, the queue just backs up but all
new data still goes to the back of the line.

Aaron Bertrand said:
Even if you load the files every 30 seconds or every minute?

Click to expand...

Hmm. Maybe.

The tricky bit, as I see it, is how to manage the data load when the
database comes back on-line. The incoming data stream is relentless, and
there is no way of stopping it. When the database goes off-line, we can
certainly divert the stream to a file. When the database comes back, the
file has to be emptied to the database whilst data are still coming in.
Managing that last bit so that the data are loaded in chronological order is
where I see the complexity.

Charles

Aaron Bertrand said:

Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says
"write
to DB" or "write to file"... When the flag is set to the latter, it writes
to the file instead. When the database comes back up, you change the flag
back, and load any files that were written in the meantime.

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for
that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

This would be a very attractive feature when we have the solution in
place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Click to expand...

Click to expand...

r norman · May 24, 2009

If the data must be loaded into the database in chronological order,
then you have no choice. ALL data must be stuffed into a queue as it
arrives. A separate thread (process?) then attempts to store it all
into the database as fast as the database will accept it. If the
database is unavailable for a time, the queue just backs up but all
new data still goes to the back of the line.

Aaron Bertrand said:
Even if you load the files every 30 seconds or every minute?

Click to expand...

Hmm. Maybe.

The tricky bit, as I see it, is how to manage the data load when the
database comes back on-line. The incoming data stream is relentless, and
there is no way of stopping it. When the database goes off-line, we can
certainly divert the stream to a file. When the database comes back, the
file has to be emptied to the database whilst data are still coming in.
Managing that last bit so that the data are loaded in chronological order is
where I see the complexity.

Charles

Aaron Bertrand said:

Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says
"write
to DB" or "write to file"... When the flag is set to the latter, it writes
to the file instead. When the database comes back up, you change the flag
back, and load any files that were written in the meantime.

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for
that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

This would be a very attractive feature when we have the solution in
place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Click to expand...

Click to expand...

Charles · May 24, 2009

Yes, I agree. I think it will make sense for it to be a thread; we have
enough separate processes running already.

Back to the end of my original question, I suppose it is too much to hope
that there is a tried and tested mechanism to handle all this for me?

Charles

r norman said:
If the data must be loaded into the database in chronological order,
then you have no choice. ALL data must be stuffed into a queue as it
arrives. A separate thread (process?) then attempts to store it all
into the database as fast as the database will accept it. If the
database is unavailable for a time, the queue just backs up but all
new data still goes to the back of the line.

Even if you load the files every 30 seconds or every minute?

Click to expand...

Hmm. Maybe.

The tricky bit, as I see it, is how to manage the data load when the
database comes back on-line. The incoming data stream is relentless, and
there is no way of stopping it. When the database goes off-line, we can
certainly divert the stream to a file. When the database comes back, the
file has to be emptied to the database whilst data are still coming in.
Managing that last bit so that the data are loaded in chronological order
is
where I see the complexity.

Charles

message

Even if you load the files every 30 seconds or every minute?

Well you could certainly have a flag that the app looks at that says
"write
to DB" or "write to file"... When the flag is set to the latter, it
writes
to the file instead. When the database comes back up, you change the
flag
back, and load any files that were written in the meantime.

On 5/23/09 2:14 PM, in article (e-mail address removed),

Hi Aaron

Thanks for the quick reply. The data need to be pushed to the database
because it is a [near] realtime system. When the database is up, the
data
need to be saved in the database immediately. The times when it is not
possible to reach the database are very rare, but we need to design for
that
eventuality so that we don't lose data on those rare occasions.

... including turning off the process when
you want to perform maintenance on the database

This would be a very attractive feature when we have the solution in
place,
but, as I say, when the database is up the data need to be saved as it
arrives, and not on batch.

Charles

Click to expand...

Click to expand...

Saving Data to a Disconnected (Temporarily Unreachable) Database

Charles

Aaron Bertrand [SQL Server MVP]

Aaron Bertrand [SQL Server MVP]

r norman

r norman

Charles

Charles

Charles

Charles

Aaron Bertrand [SQL Server MVP]

Aaron Bertrand [SQL Server MVP]

r norman

r norman

Charles

Charles

Charles

Charles

r norman

r norman

Charles