Shared memory coherency problem...


Dave Littell

Greetings,

(Sorry for the massive crossposts, but I need a real answer real
soon, so I'm shotgunning.)

I don't know if this is the right newsgroup for this question,
(there's only about 10,000 or so... :-0), but here goes:


OS: Windows 2000 Professional

Service Pack level: SP3

Problem Summary: Single writer/multiple reader shared memory fails
to maintain coherency, but (usually) recovers.


I have a situation where the writer updates a small area (less than
a physical page size) of shared memory at a relatively high rate
(greater than 50 Hz, less than 1 kHz (and no, I can't be more
specific)). There are multiple readers that are held at bay by a
named event (manual reset, controlled by the writer, initially
reset) during the writer's update and released via the writer's
SetEvent() at the end of the update. The named event remains
signaled until the next update. To protect against the case where a
reader can wake up slightly before the update (due to out-of-order
delivery of trigger events), proceed because the named event is
still signaled, and retrieve stale data, I use a sequence number in
the shared memory that is only ever written by the shared memory
writer (during its update). Each reader keeps a local idea of the
expected sequence number at the next update and polls for that value
to show up in the shared memory sequence number before proceeding.
The readers' polling is gated by the named event so they can't jump
in during the writer's update.
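
For readers following along, the sequence-number check described above might look roughly like the sketch below. It is portable C11 rather than the poster's actual code: shared_block, next_seq, writer_update and reader_try_read are hypothetical names, the payload is a stand-in, and skipping 0 on wraparound is an added assumption. The event gating (ResetEvent/SetEvent and the readers' wait) is assumed to happen around these calls and is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the shared region; all names are illustrative. */
typedef struct {
    _Atomic uint32_t seq;  /* 0 only before the writer's first update */
    uint32_t payload;      /* stand-in for the real data */
} shared_block;

/* Assumption: skip 0 on wraparound so 0 always means "never written". */
static uint32_t next_seq(uint32_t seq)
{
    uint32_t n = seq + 1u;
    return n == 0u ? 1u : n;
}

/* Writer: meant to run between ResetEvent() and SetEvent() on the event. */
static void writer_update(shared_block *blk, uint32_t value)
{
    assert(blk != NULL);
    uint32_t cur = atomic_load_explicit(&blk->seq, memory_order_relaxed);
    blk->payload = value;
    /* Release store: a reader that sees the new sequence number
       (with an acquire load) also sees the payload written above. */
    atomic_store_explicit(&blk->seq, next_seq(cur), memory_order_release);
}

/* Reader: one polling attempt, made only after the event wait returns. */
static bool reader_try_read(shared_block *blk, uint32_t expected,
                            uint32_t *out)
{
    assert(blk != NULL && out != NULL);
    if (atomic_load_explicit(&blk->seq, memory_order_acquire) != expected)
        return false;  /* stale: the expected update has not landed yet */
    *out = blk->payload;
    return true;
}
```

A reader would keep calling reader_try_read() with its locally tracked expected value until it returns true, then advance that value with next_seq(). The payload itself is not atomic here, so the sketch relies on the event keeping readers out while the writer is updating.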

So, the shared memory sequence number has the value 0 *only during
the time before the writer's first update*. During this
initialization period the named event is nonsignaled, so the readers
will all block on the named event and never see the 0.

Here's the problem: After they all run merrily for a while I
suddenly see cases where the readers are getting 0-valued sequence
numbers. This is just not possible after the writer cycles through
its first update. Sometimes a reader finally gets the correct
(non-zero) sequence number and continues along. Occasionally a
reader gets a 0-valued sequence number forever (or at least until I
kill it).

The shared-memory sequence number was initially a 64-bit value
(32-bit machine). Since 64-bit writes to shared memory aren't
guaranteed to be atomic, I changed it to a 32-bit value (for which
shared-memory writes are atomic). Same behavior. I'm running out
of guesses, so...
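
To make the 64-bit concern concrete: on a 32-bit machine a plain 64-bit store may be performed as two 32-bit stores, so a reader can catch it half done. The sketch below is only an arithmetic model of that, not a reproduction of the problem. torn_low_first is a hypothetical helper, and the low-half-first store order is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the value a reader would see if it read the
   counter after the writer stored the new low 32 bits but before it
   stored the new high 32 bits. Assumes the writer increments by one. */
static uint64_t torn_low_first(uint64_t old_val, uint64_t new_val)
{
    assert(new_val == old_val + 1u);
    uint64_t lo = new_val & 0xFFFFFFFFull;          /* already updated */
    uint64_t hi = old_val & 0xFFFFFFFF00000000ull;  /* not yet updated */
    return hi | lo;
}
```

In this model a torn read only differs from the new value when the increment carries into the high half. For example, going from 0x00000000FFFFFFFF to 0x0000000100000000 would be seen as 0. At under 1 kHz a counter starting near 0 needs more than a month to reach that carry, which fits with the switch to 32 bits making no difference.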

This sounds like some color of a shared-memory coherency problem to
me (not to mix metaphors or anything ;-). Yah?

I believe I'm seeing some correlation between these 0-valued
sequence numbers and bursts of disk activity while the kids are
running. I've been able to pretty consistently get the problem to
appear if I do something that touches the disk. Note that the
shared memory was set up using:

CreateFileMapping( INVALID_HANDLE_VALUE,
NULL,
PAGE_READWRITE | SEC_COMMIT, ... );

and

MapViewOfFile( ..., FILE_MAP_ALL_ACCESS, ... );

So, my understanding is that all this tells Windows 2000 there's just a
page of physical memory somewhere that the readers and writer can all
see. I believe that the physical page that is the shared memory
shouldn't ever be evicted to the paging file because of a call to
VirtualLock(), so it should never even come close to the disk.

I'm stumped: any ideas?


Thanks very much,
Dave
 
Most people will not respond to pests who massively crosspost. It is not
considered polite and this is the wrong newsgroup anyway.
 
Mercury said:
Most people will not respond to pests who massively crosspost. It is not
considered polite and this is the wrong newsgroup anyway.

Sir,

My crossposting was never intended to be pesty. I have a serious
technical problem and there are far too many newsgroups with names
(and content) that suggest they are indeed not "the wrong newsgroup
anyway". Some may be not the "right" newsgroup, but I've exhausted
my knowledge on this issue, time is short, and I'm asking for help.

Thank you for your input.


Dave
 
So you are selfish and you demand everybody stop what they are doing no
matter what newsgroups they are in and deal with your problem for free. I
repeat you are a selfish, self-centered pest. If time is short and you want
help --pay for it and stop being an obnoxious twit.
..
 
Mercury said:
So you are selfish and you demand everybody stop what they are doing no
matter what newsgroups they are in and deal with your problem for free. I
repeat you are a selfish, self-centered pest. If time is short and you want
help --pay for it and stop being an obnoxious twit.
.

Top-poster. Please refer to USENET FAQs.

--TW
 
Mercury said:
So you are selfish and you demand everybody stop what they are doing no
matter what newsgroups they are in and deal with your problem for free. I
repeat you are a selfish, self-centered pest. If time is short and you want
help --pay for it and stop being an obnoxious twit.
.

yah, true... no one has the right to free speech here.
this is the internet, please don't post for any help.

you must use ebay for help.

i hate seeing ppl post in newsgroups, it's just too absurd! 8^P

-a|ex
 
Hey you two, I don't see you offering "poor Dave" any *useful help* or are
you just busybodies with nothing better to do but cluck.
 