Message based approach

Ginny

In a service oriented architecture, an application can call an external
service. An alternate approach is to use a message based technique. The
following points need clarification:

1. How does the message based approach help?
2. The .NET application architecture guide says that the message based
approach can make the design of your business logic independent of the
underlying transport protocol used between services. How?
3. In a message based approach you need to take care of special
considerations such as message correlation, optimistic data concurrency
management, business process compensation, and external service
unavailability. What are all these?
 
Hello Ginny,
You appear to be a little confused. SOA is a high level architectural
approach to distributed systems design. What you appear to be referring to
as 'message based approach' is a subset of the SOA approach that uses long
running transactions. It is still SOA.
1. How does the message based approach help?

Long running transactions and messaging solve a particularly difficult
design problem: how to simplify the code around the state machine and
separate it from the rest of the system for easy modification.  The state
machine is traditionally a problem.  In the early days of Computer Science,
greats like Dijkstra and Turing recognized the immense power of state
machines.  The Turing Machine is, itself, a state machine that is able to
simulate the computational capabilities of any modern computer.  The problem
is twofold: 1) our mainstream programming languages are lousy for
describing and maintaining state machines.  There are a great many
practical considerations to take into account that are not built in to the
error handling structures, branching structures, and even data type
declarations.  2) State machines change a lot.  When done properly, a large
portion of the business rules end up embedded in a state machine.  Therefore
they need to be encapsulated from the rest of the system.

In many cases, developers who know little about state machines avoid the
problem.  You don't have to use a state machine.  Most problems can be
solved without state machines, at the cost of a ten-fold increase in code
and complexity.  But hey, if you only have a hammer, every problem is a nail.
Those of us who understood the power of the state machine have always sought
out ways to reduce code, and increase elegance, by implementing state
machines in our systems. Nice thing: in the age of the web and stateless
transaction models, this has required developers to place all of the
stateful aspects of the system in one place... which has done more to drive
the sophistication of state machine tools than any amount of salesmanship or
marketing could ever have. It's a golden time.

Message based approaches are how you drive a state machine.  A message is
an input to the machine in its current state.

Message oriented middleware tools have increased in sophistication steadily
in the past four years.  Look at the huge efforts being spent to improve ESB
tools and BizTalk.  Look at the efforts to describe temporal relationships
in the WS-* standards.
2. The .NET application architecture guide says that the message based
approach can make the design of your business logic independent of the
underlying transport protocol used between services. How?

If you take a message based approach, in your service oriented architecture,
it means that you start by isolating your state machine into a service
component. In a normal scenario, that state machine does not need to
actually implement the business logic.  It implements the state transitions
and temporal error handling. By 'temporal error handling', I'm not
referring to an episode of Star Trek or Dr. Who. I'm talking about the fact
that messages drive state machines, so they have an inherent order to them.
In a banking system, a message to open an account needs to arrive before the
message that asks for the balance on the account. Being able to say: this
message makes sense now, but didn't make sense a minute ago, is a temporal
consideration because of the time relationship with another message.
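
A minimal sketch of that ordering rule, assuming a hypothetical bank
account machine with made-up message names (OpenAccount, GetBalance,
CloseAccount).  The transition table is the only place that knows which
message makes sense in which state:

```python
from enum import Enum, auto


class AccountState(Enum):
    NOT_OPENED = auto()
    OPEN = auto()
    CLOSED = auto()


class OutOfOrderMessage(Exception):
    """Raised when a message does not make sense in the current state."""


# Transition table: (current state, message type) -> next state.
# The message names are illustrative, not taken from any real system.
TRANSITIONS = {
    (AccountState.NOT_OPENED, "OpenAccount"): AccountState.OPEN,
    (AccountState.OPEN, "GetBalance"): AccountState.OPEN,
    (AccountState.OPEN, "CloseAccount"): AccountState.CLOSED,
}


class AccountMachine:
    def __init__(self):
        self.state = AccountState.NOT_OPENED

    def handle(self, message_type):
        """Apply one message; reject it if it arrives at the wrong time."""
        key = (self.state, message_type)
        if key not in TRANSITIONS:
            raise OutOfOrderMessage(
                f"{message_type} is not valid in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Sending GetBalance to a fresh AccountMachine raises OutOfOrderMessage;
sending it after OpenAccount succeeds.  Changing the business rules means
editing the table, not the code around it.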

None of this has anything at all to do with the transport protocol. If I
send a message using SOAP or REST or by dropping commands into a SQL Queue
or even in a file system, the state machine is the logic that understands
the message and responds, not the logic that finds and transmits the
message. That is what is meant by this independence. Sticking with
standards helps.
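
As a rough illustration, assuming two made-up transports (one delivering
JSON text, one delivering rows from a queue table), the handler below never
sees how the message arrived.  All function names and message fields here
are hypothetical:

```python
import json


def handle_message(message: dict) -> dict:
    """Transport-agnostic logic: takes a message, returns a reply."""
    correlation_id = message.get("correlation_id")
    if message.get("type") == "Ping":
        return {"type": "Pong", "correlation_id": correlation_id}
    return {"type": "Unknown", "correlation_id": correlation_id}


def from_json_body(body: str) -> dict:
    """Adapter for a transport that delivers JSON text (e.g. an HTTP POST)."""
    return json.loads(body)


def from_queue_row(row: tuple) -> dict:
    """Adapter for a transport that delivers (type, correlation_id) rows."""
    message_type, correlation_id = row
    return {"type": message_type, "correlation_id": correlation_id}
```

Both adapters produce the same dictionary shape, so handle_message gives
the same reply whichever way the message came in.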
3. In a message based approach you need to take care of special
considerations such as message correlation, optimistic data concurrency
management, business process compensation, and external service
unavailability. What are all these?

* Message correlation - two messages arrive separately, but they drive a
single state machine. Find the machine. Then apply the message. It is
like understanding the difference between a class and an instance of an
object. A class defines the machine, but the instance is the machine. When
a message comes in, you have to find the instance that it applies to.
Therefore, you need some kind of ID that you pass in the messages, and that
all asynchronous collaborators agree to pass back to you, allowing you to
correlate.

It's not as esoteric as it sounds. Your insurance card is a message
correlation id. You get sick. You go to the doctor. She pulls your chart
and examines you. She draws blood. She sends the blood to a lab for
analysis.  You go home and await your results.  The doctor's office puts the
file back in the filing cabinet. The lab gets the blood. It has your ID on
it. They test the blood. You are going to be fine. They send back the
results with your ID. The doctor's office gets the result, and pulls your
chart back out of the filing cabinet. They see that you are waiting for a
call, so they ask the doctor to call you.  [When they looked at the ID and
decided to pull your chart, they were doing message correlation.  The lab is
stateless.]
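
The doctor's office story can be sketched roughly like this.  The class
and field names are invented for illustration; the only point is that the
ID travels out with the request and comes back with the reply:

```python
import uuid


class LabOrder:
    """One running instance of the 'blood test' process (the chart)."""

    def __init__(self, patient):
        self.patient = patient
        self.correlation_id = str(uuid.uuid4())
        self.result = None


class Clinic:
    def __init__(self):
        # The filing cabinet: correlation ID -> waiting process instance.
        self._charts = {}

    def send_to_lab(self, patient):
        order = LabOrder(patient)
        self._charts[order.correlation_id] = order
        # The outgoing message carries the ID.
        return {"correlation_id": order.correlation_id, "sample": "blood"}

    def on_lab_result(self, message):
        # Correlation: find the instance this reply belongs to.
        order = self._charts[message["correlation_id"]]
        order.result = message["result"]
        return order


def stateless_lab(request):
    """The lab remembers nothing; it just passes the ID back."""
    return {"correlation_id": request["correlation_id"], "result": "fine"}
```

If two patients have samples out at once, each reply still lands on the
right chart, because the lookup uses the ID and not the arrival order.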

* Optimistic data concurrency management - In a long running transaction,
you cannot lock DB records. Two-phase commit mechanisms assume synchronous
communication mechanisms, so you cannot use them either... at least not in
the pure sense. So you have to take data transformation into account, often
by understanding the effect of messages on data. It is a design activity as
much as it is a runtime algorithm.

Let's say that an employee requests a week of vacation. The company has a
policy that reads like this: "The employee can request time off if they have
already accumulated it. Check accumulation first. If there is sufficient
accumulation to allow the request, then send it to the manager for
approval". Great. Now Joe has six days built up. He wants to take off
either the last week of November or the last week of December. So he sends
two requests, each for a week.

His first request goes through just fine.  It is sent to his manager.  What
about the second?  Do we introduce a way of saying that the first five days
of vacation are "committed to an in-flight transaction" with the ability to
roll it back if the manager doesn't approve it?  In other words, do we deny
the second request before it gets to the manager?  An alternate approach may
be to do a simple check and send the second request to the manager as well.
If the manager approves one, but not the other, no problems... because only
one will be marked as 'planned and approved time off' while the other
request disappears.  What happens if the manager approves both?  Does the
first one in the door win?  How do you coordinate informing the manager that
he approved two but only one went through the system?  Deciding how to
detect and resolve such conflicts, without locking anything up front, is
optimistic data concurrency management.
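
One common way to detect the conflict is a version number that is checked
only when the approval is committed.  This is a sketch of that general
technique under assumed names (LeaveBalance, commit_approval); it is not
how any particular product does it:

```python
class ConflictError(Exception):
    """The data changed after the request was first checked."""


class LeaveBalance:
    def __init__(self, days_available):
        self.days_available = days_available
        self.version = 0

    def read(self):
        """Snapshot taken when a request is submitted; nothing is locked."""
        return self.days_available, self.version

    def commit_approval(self, days, seen_version):
        """Apply an approval only if nobody changed the balance meanwhile."""
        if seen_version != self.version:
            raise ConflictError("balance changed since the request was checked")
        if days > self.days_available:
            raise ConflictError("not enough days left")
        self.days_available -= days
        self.version += 1
```

With Joe's six days, both requests read version 0.  The first approval
commits and bumps the version to 1; the second approval then raises
ConflictError, which is the point where the system has to tell the manager
that only one request went through.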

* Compensation.  I mentioned a compensation scenario above.  Let's say that
we choose the first alternative.  We have a separate counter for the number
of hours requested but not yet approved.  This means the second request is
rejected by the system without going to the manager.  Now, let's say that
the manager decides to reject the request.  We cannot roll back the original
transaction.  That would be silly.  It may be days have passed.  No, we
deduct the requested hours from the separate counter.  Now, if the employee
submits the second request, it will go through to the manager for
consideration.  This whole notion of having a separate counter, and then
removing hours from it, is a compensation design.  It is there to handle
'rollback' of a request that didn't make it through the system.
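
A small sketch of the separate-counter idea.  The class and method names
are made up, and the unit here is days (to match Joe's six days) rather
than hours:

```python
class LeaveAccount:
    def __init__(self, accrued):
        self.accrued = accrued  # time off the employee has built up
        self.pending = 0        # requested but not yet approved

    def request(self, days):
        """Reserve days; return False if the system rejects the request."""
        if days > self.accrued - self.pending:
            return False  # never reaches the manager
        self.pending += days
        return True

    def approve(self, days):
        """Manager said yes: the reservation becomes real time off."""
        self.pending -= days
        self.accrued -= days

    def reject(self, days):
        """Manager said no: compensate by releasing the reservation."""
        self.pending -= days
```

Nothing is rolled back in the database sense.  The reject step is an
ordinary forward operation that undoes the effect of the earlier request,
possibly days later.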

* External service unavailability: As I mentioned above, the state machine
doesn't do more than simply coordinate things. So if a customer requests to
buy 1500 units of a really complex part from a manufacturing company,
there's a lot of coordinating to do. What is the bill of materials? Are
all the parts ready to ship or do some need to be built? How long will it
take to manufacture and deliver parts? Once delivered, what's the
manufacturing backlog on constructing the finished part? Does this timeline
meet the PO requirements? All this logic can be pulled into the state
machine. (We sometimes refer to this as the 'orchestration' or the
'workflow'). However, the logic to decide if all the parts are ready to
ship can be complicated and may require queries to multiple systems. Let's
say that your bill of materials is managed in one legacy system, while your
inventory is managed in another. Now, let's say that a PO arrives and the
inventory system is down for scheduled maintenance (you are swapping out the
network cards to upgrade to a 1GB backbone in that data center). The
orchestration asks for inventory data. No response comes back. What do you
do?

Your messaging system has to know how to hold a transaction request until
the receiving system is back up, and then to resume the flow.  BizTalk
allows for retry and automatic dehydration.  Other systems queue the message
and wait for a signal that the receiver is back online.  It's a distinction
that becomes important if you don't want to lose that message.  This is a
huge deal.  If you don't have these mechanisms, then the orchestration fails
whenever any one of the remote systems is down, so its failure rate is the
combined downtime of all of them.  With these mechanisms, the orchestration
fails only when a remote system stays down longer than the retry policy
allows... a failure rate that is often very near to zero.  You go from a
very unreliable machine to a very reliable one.
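
The "queue the message and resume later" variant can be sketched as
follows.  This is a toy in-memory version under assumed names
(HoldingQueue, ServiceUnavailable); a real messaging system would persist
the held messages so they survive a restart:

```python
from collections import deque


class ServiceUnavailable(Exception):
    """Raised by the send function when the receiver is down."""


class HoldingQueue:
    """Holds messages until the receiving system accepts them."""

    def __init__(self, send):
        self._send = send      # callable that delivers one message
        self._held = deque()

    def submit(self, message):
        self._held.append(message)
        self.flush()

    def flush(self):
        """Deliver held messages in order; stop at the first failure.

        Returns the number of messages delivered in this attempt.
        """
        delivered = 0
        while self._held:
            try:
                self._send(self._held[0])
            except ServiceUnavailable:
                break  # keep the message and try again later
            self._held.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._held)
```

A message is removed from the queue only after the send succeeds, so a
purchase order submitted during the maintenance window is still there when
flush is called again after the inventory system comes back.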

I hope this helps. Final word: message based processing is not outside SOA.
It is part of it. In the SOA I am working in, we have message based
services as part of the infrastructure.
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
 
Hello Nick Malik [Microsoft],

Well said, very well said Nick

---
WBR,
Michael Nemtsev :: blog: http://spaces.msn.com/laflour

"At times one remains faithful to a cause only because its opponents do not
cease to be insipid." (c) Friedrich Nietzsche
 
What a great answer Nick!

Seems like a great lead up to a pi calculus discussion :)

Cheers,

Greg Young
MVP - C#
http://codebetter.com/blogs/gregyoung

 
Dear Nick,
Thanks a lot for your efforts. I am reading more on SOA to understand your
answer as well :) I have some small questions, if you can assist again:
1. What are coarse-grained interfaces, compared to chatty interfaces, for
remote communication?
2. What is meant by loose coupling and high cohesion?
3. Can you send me a link of a good article on SOA?

Regards
Ginny
Nick Malik said:
Hello Ginny,
You appear to be a little confused. SOA is a high level architectural
approach to distributed systems design. What you appear to be referring to
as 'message based approach' is a subset of the SOA approach that uses long
running transactions. It is still SOA.
1. how does message based approach help?

Long running transactions and messaging solve a particularly difficult
design problem: how to simplify the code around the state machine and
seperate it from the rest of the system for easy modification. The state
machine is traditionally a problem. In the early days of Computer Science,
greats like Djikstra and Turing recognized the immense power of state
machines. The Turing Machine is, itself, a state machine that is able to
simulate the computational capabilities of any modern computer. The problem
is twofold: 1) our declarative programming languages are lousy for
describing and maintaining state machines. There are a great deal of
practical considerations to take into account that are not built in to the
error handling structures, branching structures, and even data type
declarations. 2) state machines change a lot. When done properly, a large
portion of the business rules end up embedded in a state machine. Therefore
they need to be encapsulated from the rest of the system.

In many cases, developers who know little about state machines avoid the
problem. You don't have to use a state machine. Most problems can be
solved without state machines, by adding a ten-fold increase in code and
complexity. But hey, if you only have a hammer, every problem is a nail.
Those of us who understood the power of the state machine have always sought
out ways to reduce code, and increase elegance, by implementing state
machines in our systems. Nice thing: in the age of the web and stateless
transaction models, this has required developers to place all of the
stateful aspects of the system in one place... which has done more to drive
the sophistication of state machine tools than any amount of salesmanship or
marketing could ever have. It's a golden time.

Message based approaches are the way you can understand how to drive a state
machine. A message is input to a state.

Message oriented middleware tools have increased in sophistication steadily
in the past four years. Look at the huge efforts being spent to improve ESB
tools and Biztalk. Look at the efforts to describe temporal relationships
in the WS_* standards.
2. the .net application architecture guide says that, the message based
approcah can make the design of you business logic independent of the
underlying transport protocol used between services. HOW?

If you take a message based approach, in your service oriented architecture,
it means that you start by isolating your state machine into a service
component. In a normal scenario, that state machine does not need to
actually implement the business logic. It implements that state transitions
and temporal error handling. By 'temporal error handling', I'm not
referring to an episode of Star Trek or Dr. Who. I'm talking about the fact
that messages drive state machines, so they have an inherent order to them.
In a banking system, a message to open an account needs to arrive before the
message that asks for the balance on the account. Being able to say: this
message makes sense now, but didn't make sense a minute ago, is a temporal
consideration because of the time relationship with another message.

None of this has anything at all to do with the transport protocol. If I
send a message using SOAP or REST or by dropping commands into a SQL Queue
or even in a file system, the state machine is the logic that understands
the message and responds, not the logic that finds and transmits the
message. That is what is meant by this independence. Sticking with
standards helps.
3. In a message based approach you need to take care of special
considerations as message correlation, optimistic data concurrency
management,business process compensation, and external service
unavailability. What r all these ??

* Message correlation - two messages arrive seperately, but they drive a
single state machine. Find the machine. Then apply the message. It is
like understanding the difference between a class and an instance of an
object. A class defines the machine, but the instance is the machine. When
a message comes in, you have to find the instance that it applies to.
Therefore, you need some kind of ID that you pass in the messages, and that
all asynchronous collaborators agree to pass back to you, allowing you to
correlate.

It's not as esoteric as it sounds. Your insurance card is a message
correlation id. You get sick. You go to the doctor. She pulls your chart
and examines you. She draws blood. She sends the blood to a lab for
analysis. You go home and await your results. The doctors office puts the
file back in the filing cabinet. The lab gets the blood. It has your ID on
it. They test the blood. You are going to be fine. They send back the
results with your ID. The doctor's office gets the result, and pulls your
chart back out of the filing cabinet. They see that you are waiting for a
call, so they ask the doctor to call you. [when they looked at the id and
decided to pull your chart, they were doing message correlation. The lab is
stateless].

* optimistic data concurrency management - In a long running transaction,
you cannot lock DB records. Two-phase commit mechanisms assume synchronous
communication mechanisms, so you cannot use them either... at least not in
the pure sense. So you have to take data transformation into account, often
by understanding the effect of messages on data. It is a design activity as
much as it is a runtime algorithm.

Let's say that an employee requests a week of vacation. The company has a
policy that reads like this: "The employee can request time off if they have
already accumulated it. Check accumulation first. If there is sufficient
accumulation to allow the request, then send it to the manager for
approval". Great. Now Joe has six days built up. He wants to take off
either the last week of November or the last week of December. So he sends
two requests, each for a week.

His first request goes through just fine. It is sent to his manager. What
about the second? Do we introduce a way of saying that the first five days
of vacation are "committed to an inflight transaction" with the ability to
roll it back if the manager doesn't approve it? In other words, do we deny
the second request before it gets to the manager? An alternate approach may
be to do a simple check and send the second request to the manager as well.
If the manager approves one, but not the other, no problems... because only
one will be marked as 'planned and approved time off' while the other
request disappears. What happens if the manager approves both? Does the
first one in the door win? How do you coordinate informing the manager that
he approved two but only one went through the system? This is optimistic
data concurrency management.

* Compensation. I mentioned an compensation scenario above. Let's say that
we choose the first alternative. We have a seperate counter for the number
of hours requested but not yet approved. This means the second request is
rejected by the system without going to the manager. Now, let's say that
the manager decides to reject the request. We cannot roll back the original
transaction. That would be silly. It may be days have passed. No, we
deduct the requested hours from the seperate counter. Now, if the employee
submits the second request, it will go through to the manager for
consideration. This whole notion of having a seperate counter, and then
removing hours from it, is a compensation design. It is there to handle
'rollback' of a request that didn't make it through the system.

* External service unavailability: As I mentioned above, the state machine
doesn't do more than simply coordinate things. So if a customer requests to
buy 1500 units of a really complex part from a manufacturing company,
there's a lot of coordinating to do. What is the bill of materials? Are
all the parts ready to ship or do some need to be built? How long will it
take to manufacture and deliver parts? Once delivered, what's the
manufacturing backlog on constructing the finished part? Does this timeline
meet the PO requirements? All this logic can be pulled into the state
machine. (We sometimes refer to this as the 'orchestration' or the
'workflow'). However, the logic to decide if all the parts are ready to
ship can be complicated and may require queries to multiple systems. Let's
say that your bill of materials is managed in one legacy system, while your
inventory is managed in another. Now, let's say that a PO arrives and the
inventory system is down for scheduled maintenance (you are swapping out the
network cards to upgrade to a 1GB backbone in that data center). The
orchestration asks for inventory data. No response comes back. What do you
do?

Your messaging system has to know how to hold a transaction request until
the receiving system is back up, and then resume the flow. BizTalk
allows for retry and automatic dehydration. Other systems queue the message
and wait for a signal that the receiver is back online. It's a distinction
that becomes important if you don't want to lose that message. This is a
huge deal. Without these mechanisms, the orchestration fails whenever any
one of the remote systems is down, so its failure rate is roughly the
combined failure rate of all of them. With these mechanisms, the
orchestration fails only when an outage outlasts the retry policy, which is
often very near to zero. You go from a very unreliable machine to a very
reliable one.
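
A minimal sketch of the "queue the message and resume later" idea in C#. The HoldingQueue class and the trySend delegate are hypothetical names for illustration; this is not how BizTalk or any particular product implements it, and a real system would persist held messages durably.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory holding queue; a real messaging system would
// store held messages durably so they survive a restart.
public class HoldingQueue
{
    private readonly Queue<string> held = new Queue<string>();
    private readonly Func<string, bool> trySend; // returns false while the receiver is down

    public HoldingQueue(Func<string, bool> trySend)
    {
        this.trySend = trySend;
    }

    public int HeldCount { get { return held.Count; } }

    // Never drop a message: hold it until it can be delivered, in order.
    public void Send(string message)
    {
        held.Enqueue(message);
        Flush();
    }

    // Call on a timer, or when the receiver signals it is back online.
    public void Flush()
    {
        while (held.Count > 0 && trySend(held.Peek()))
        {
            held.Dequeue();
        }
    }
}

public static class RetryDemo
{
    public static void Main()
    {
        bool inventoryUp = false;
        var delivered = new List<string>();
        var queue = new HoldingQueue(msg =>
        {
            if (!inventoryUp) return false;
            delivered.Add(msg);
            return true;
        });

        queue.Send("inventory query for PO");   // receiver down: message is held
        Console.WriteLine(queue.HeldCount);     // 1
        inventoryUp = true;                     // maintenance finished
        queue.Flush();                          // flow resumes
        Console.WriteLine(delivered.Count);     // 1
    }
}
```

The orchestration only ever calls Send; whether the inventory system happens to be up at that moment is the queue's problem, not the state machine's.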

I hope this helps. Final word: message based processing is not outside SOA.
It is part of it. In the SOA I am working in, we have message based
services as part of the infrastructure.
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
 
Ginny,

#1 It is the difference between sending one big message versus many small
messages.

I believe #2 was already answered for you in another post

with http://en.wikipedia.org/wiki/Loose_coupling and
http://en.wikipedia.org/wiki/Coupling_(computer_science)

For #3, there are a lot of good books/articles on SOA. I would personally
recommend
http://www.amazon.com/gp/product/01...0472/ref=pd_bbs_1/104-4532018-6004701?ie=UTF8
Service-Oriented Architecture (SOA): Concepts, Technology, and Design,
Thomas Erl


Ginny said:
Dear Nick,
Thanks a lot for your efforts. I am reading more on SOA to understand your
answer as well :) I have some small questions, if you can assist again:
1. What are coarse-grained interfaces compared to chatty interfaces for
remote communication?
2. What is meant by loose coupling and high cohesion?
3. Can you send me a link of a good article on SOA?

Regards
Ginny
 
Hi Greg,
There are no examples of loose coupling and high cohesion. The links only
talk about the definition. With the definition, it's a little clearer to me.
But I was looking for an example, probably in .NET, related to loose coupling
and high cohesion, to get the concepts thoroughly clear. For example,
you mentioned that coarse grained means a number of small messages and
chatty-interfaces means a big message; now how does it actually affect the
remote communication? What are the advantages of coarse grained over
chatty-interfaces? MS has used this jargon in their guides but with no
explanation at all. Furthermore, not much is available on the net with
examples either. Definitions are everywhere. So I was looking for some
practical examples to get the concepts clear.

Regards

 
Ginny said:
Hi Greg,
There are no examples of loose coupling and high cohesion. The links
only talk about the definition. With the definition, it's a little clearer to
me. But I was looking for an example, probably in .NET, related to
loose coupling and high cohesion, to get the concepts thoroughly clear.
For example, you mentioned that coarse grained means a number
of small messages and chatty-interfaces means a big message,

No, you got that backwards: coarse grained means a few large messages, while
chatty means a large number of small messages.

now how does it actually affect the
remote communication? What are the advantages of coarse grained over
chatty-interfaces? MS has used this jargon in their guides but with no
explanation at all. Furthermore, not much is available on the net
with examples either. Definitions are everywhere. So I was looking for
some practical examples to get the concepts clear.

Here's an example, still conceptual.

Imagine that you need to enter a sales order into a remote system. A chatty
solution would be to expose a remote API (via web services or whatever) with
functions like:

CreateCustomer
CreateSalesOrder
CreateLineItem
SetSalesOrderAddress
SetSalesOrderBillingInfo
....

while a coarse-grained approach would expose just a single function:

CreateSalesOrder

which would take a number of aggregates as parameters (e.g. a struct with
customer info, an array of structs with address info, an array of structs
with line-item info, etc). Internally, this function would likely delegate
the work to a fine-grained API like the one above.
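
A minimal sketch of what that single function might look like in C#. All type and member names here (SalesOrderRequest, OrderService, and so on) are hypothetical, and the in-memory log only stands in for real persistence. The point is that the fine-grained steps still exist, but they run in-process on the server instead of as separate remote calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical aggregate types; the field names are illustrative only.
public class CustomerInfo { public string Name; }
public class AddressInfo { public string Kind; public string Street; }
public class LineItemInfo { public string Sku; public int Quantity; }

// Everything needed to create the order travels in one message.
public class SalesOrderRequest
{
    public CustomerInfo Customer;
    public List<AddressInfo> Addresses = new List<AddressInfo>();
    public List<LineItemInfo> LineItems = new List<LineItemInfo>();
}

public class OrderService
{
    private readonly List<string> log = new List<string>();

    // The only operation exposed remotely: one round trip per order.
    public int CreateSalesOrder(SalesOrderRequest request)
    {
        int orderId = CreateOrderRecord(request.Customer);
        foreach (AddressInfo address in request.Addresses) SetAddress(orderId, address);
        foreach (LineItemInfo item in request.LineItems) CreateLineItem(orderId, item);
        return orderId;
    }

    public int StepsPerformed { get { return log.Count; } }

    // Fine-grained steps are private, so they run in-process on the server.
    private int CreateOrderRecord(CustomerInfo customer)
    {
        log.Add("order for " + customer.Name);
        return 1; // a real service would allocate a fresh identifier
    }

    private void SetAddress(int orderId, AddressInfo address)
    {
        log.Add(orderId + ": " + address.Kind + " address " + address.Street);
    }

    private void CreateLineItem(int orderId, LineItemInfo item)
    {
        log.Add(orderId + ": " + item.Quantity + " x " + item.Sku);
    }
}

public static class CoarseGrainedDemo
{
    public static void Main()
    {
        var request = new SalesOrderRequest { Customer = new CustomerInfo { Name = "Joe" } };
        request.Addresses.Add(new AddressInfo { Kind = "shipping", Street = "1 Main St" });
        request.LineItems.Add(new LineItemInfo { Sku = "PART-A", Quantity = 2 });
        request.LineItems.Add(new LineItemInfo { Sku = "PART-B", Quantity = 1 });

        var service = new OrderService();
        service.CreateSalesOrder(request);           // a single remote call
        Console.WriteLine(service.StepsPerformed);   // 4 internal steps
    }
}
```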

The impact for a remoting scenario is that fewer round-trips are made across
the network. This generally results in three desirable outcomes.

First, the total amount of network traffic is reduced, since you incur the
basic overhead of a single round-trip only once.

Second, the overall latency of the process is reduced, since when a client
is using the "chatty" interface, the steps are generally interlocked, so one
operation cannot begin before the previous one finishes - so the network
latency of a round-trip is effectively inserted between each pair of
consecutive operations.

Third, a coarse-grained API like this is more amenable to asynchronous
invocation. Since the request contains all of the information to complete
the job in a single chunk, a client can wait asynchronously for the result
without having to build any complex asynchronous state machine - which would
be necessary to call the chatty version of the interface asynchronously.
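
To put rough numbers on the latency point, here is a back-of-the-envelope estimate in C#. The figures (50 ms per round trip, 20 ms of total server-side work, 12 fine-grained calls per order) are assumptions chosen only for illustration, and the model ignores payload size.

```csharp
using System;

public static class LatencyEstimate
{
    // Interlocked calls pay the round-trip latency once per call;
    // the server-side work is the same either way.
    public static double TotalMs(int calls, double roundTripMs, double serverWorkMs)
    {
        return calls * roundTripMs + serverWorkMs;
    }

    public static void Main()
    {
        Console.WriteLine(TotalMs(12, 50.0, 20.0)); // chatty: 620
        Console.WriteLine(TotalMs(1, 50.0, 20.0));  // coarse-grained: 70
    }
}
```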

HTH

-cd
 
Thanks Carl, that's really clear. Can you also help me to understand loose
coupling and high cohesion with a similar example?

regards
 
Ginny said:
Thanks Carl, that's really clear. Can you also help me to understand
loose coupling and high cohesion with a similar example?

Coupling between A and B (for any definition of A and B) is an indication of
the likelihood that a change in A will require a change in B. You always
want to strive for loose coupling because it leads to reduced maintenance
costs.

For example, if you define a struct:

struct Params
{
    public int A;
    public string B;
    public float C;
}

and then define an API that uses this struct:

public void StronglyCoupled(Params p);

There's stronger coupling to the consumer of such an API than there would be
if the parameters were passed individually:

public void LooselyCoupled(int a, string b, float c);

The reason - both the implementation and the consumer (or server and client,
if you prefer) depend on the type 'Params', so they have "type coupling".
If the type is truly a fundamental part of the problem domain (e.g. a
complex number, a 3D point in space, the elements of a street address), then
it's appropriate to use it in an interface. But if it's just an artificial
packaging of unrelated things (like this example), then you're better off
keeping those things separated in your API design.
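
For contrast, a sketch of the "appropriate" case in C#: a street address is a natural domain type, so sharing it between client and server is coupling you would have anyway. StreetAddress, Shipping and FormatLabel are hypothetical names used only for illustration.

```csharp
using System;

// A type that is natural to the problem domain: its fields belong together.
public struct StreetAddress
{
    public string Street;
    public string City;
    public string PostalCode;
}

public static class Shipping
{
    // Both sides depend on StreetAddress, but that dependency mirrors the
    // domain, so a change to it would affect both sides in any design.
    public static string FormatLabel(string recipient, StreetAddress address)
    {
        return recipient + ", " + address.Street + ", " + address.City + " " + address.PostalCode;
    }

    public static void Main()
    {
        var address = new StreetAddress { Street = "1 Main St", City = "Springfield", PostalCode = "12345" };
        Console.WriteLine(FormatLabel("Joe", address)); // Joe, 1 Main St, Springfield 12345
    }
}
```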

There are many other kinds of coupling than type coupling - see a good
software engineering book for a thorough treatise.

It's interesting to observe that the desire for loose coupling and the
desire to build a coarse-grained interface are somewhat at odds with one
another: exposing large complex functions that consume a lot of data as
parameters increases coupling (type coupling), but improves network
performance. There's a delicate balance to finding the right amount of
coupling. If you stick to types that are "natural" to the problem domain,
the coupling you end up with is usually appropriate and won't cause undue
maintenance problems.

Cohesion is a qualitative measure of how much things "go together". As with
coupling, there are many ways to categorize items, hence many ways to reason
about the level of cohesion between the elements in an interface (or module,
class, program - you can apply the concept at nearly any scope). You always
want to strive for high cohesion because it makes your code easier to
understand and use.

Imagine two interfaces:

public interface NotCohesive
{
    void StartTheMachine(int machineId);
    int WordsInString(string sentence);
}

This interface has poor cohesion - the functions in the interface clearly
have nothing to do with one another - they've most likely been packaged
together for the convenience of the developer because some pair of client &
server (or consumer/implementor) needed access to both of these functions.
The lack of cohesion between these interface elements, however, increases
the chances that someone will want to either implement or consume one of the
elements and not the other, or that the elements will evolve divergently.

public interface HighlyCohesive
{
    int CreateFooID();
    void ReleaseFooID(int id);
}

This interface has much better cohesion - the elements of the interface
clearly "go together". It's likely that both a consumer and an implementor
of this interface would always need to use both elements.
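
One way to repair the NotCohesive interface above is to split it so each interface groups only operations that belong together. This is a sketch under assumed names (IMachineControl, ITextStatistics, TextStatistics), with a simple whitespace-based word count standing in for a real implementation.

```csharp
using System;

// Hypothetical split of NotCohesive: each interface has a single theme.
public interface IMachineControl
{
    void StartTheMachine(int machineId);
}

public interface ITextStatistics
{
    int WordsInString(string sentence);
}

// An implementor can now provide one capability without stubbing the other.
public class TextStatistics : ITextStatistics
{
    public int WordsInString(string sentence)
    {
        if (string.IsNullOrWhiteSpace(sentence)) return 0;
        char[] separators = new[] { ' ', '\t', '\n', '\r' };
        return sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class CohesionDemo
{
    public static void Main()
    {
        ITextStatistics stats = new TextStatistics();
        Console.WriteLine(stats.WordsInString("loose coupling and high cohesion")); // 5
    }
}
```

A consumer that only counts words now depends on ITextStatistics alone, so changes to machine control cannot force it to change.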

Does that help? Again, both of these topics are well discussed in software
engineering texts (and generally covered badly, if at all, in "programming"
books).

-cd
 
I see CD has already answered some of this but ...
What are the advantages of coarse grained over
chatty-interfaces. MS has used these jargons in their guides but with no
explanation at all.

That is because these are computer science concepts; they do not just belong
to Microsoft.


Ginny said:
Hi Greg,
There are no examples of loose coupling and high cohesion. The links only
talk about the definition. With the definition, it's a little clearer to me.
But I was looking for an example, probably in .NET, related to loose coupling
and high cohesion, to get the concepts thoroughly clear. For example,
you mentioned that coarse grained means one big message and chatty
interfaces mean a number of small messages; now how does it actually affect
remote communication? What are the advantages of coarse-grained over
chatty interfaces? MS has used this jargon in their guides but with no
explanation at all. Furthermore, not much is available on the net with
examples. Definitions are everywhere. So I was looking for some practical
examples to get the concepts clear.

Regards

Greg Young said:
Ginny,

#1 It is the difference between sending one big message versus many small
messages.

I believe #2 was already answered for you in another post

with http://en.wikipedia.org/wiki/Loose_coupling and
http://en.wikipedia.org/wiki/Coupling_(computer_science)

For #3 there are a lot of good books/articles on SOA. I would personally
recommend
http://www.amazon.com/gp/product/0131858580/sr=8-1/qid=1154840472/ref=pd_bbs_1/104-4532018-6004701?ie=UTF8
Service-Oriented Architecture (SOA): Concepts, Technology, and Design,
Thomas Erl


Ginny said:
Dear Nick,
Thanks a lot for your efforts. I am reading more on SOA to understand your
answer as well :) I have some small questions, if you can assist again:
1. What are coarse-grained interfaces compared to chatty interfaces for
remote communication?
2. What is meant by loose coupling and high cohesion?
3. Can you send me a link to a good article on SOA?

Regards
Ginny

Nick Malik said:
Hello Ginny,
You appear to be a little confused. SOA is a high level architectural
approach to distributed systems design. What you appear to be referring to
as 'message based approach' is a subset of the SOA approach that uses long
running transactions. It is still SOA.

1. how does message based approach help?

Long running transactions and messaging solve a particularly difficult
design problem: how to simplify the code around the state machine and
separate it from the rest of the system for easy modification. The state
machine is traditionally a problem. In the early days of Computer Science,
greats like Dijkstra and Turing recognized the immense power of state
machines. The Turing Machine is, itself, a state machine (paired with an
unbounded tape) that is able to simulate the computational capabilities of
any modern computer. The problem is twofold: 1) our general-purpose
programming languages are lousy for describing and maintaining state
machines. There are a great many practical considerations to take into
account that are not built in to the error handling structures, branching
structures, and even data type declarations. 2) state machines change a
lot. When done properly, a large portion of the business rules end up
embedded in a state machine. Therefore they need to be encapsulated from
the rest of the system.

In many cases, developers who know little about state machines avoid the
problem. You don't have to use a state machine. Most problems can be
solved without state machines, by adding a ten-fold increase in code and
complexity. But hey, if you only have a hammer, every problem is a nail.
Those of us who understood the power of the state machine have always sought
out ways to reduce code, and increase elegance, by implementing state
machines in our systems. Nice thing: in the age of the web and stateless
transaction models, this has required developers to place all of the
stateful aspects of the system in one place... which has done more to drive
the sophistication of state machine tools than any amount of salesmanship or
marketing could ever have. It's a golden time.

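Message based approaches are the way you can understand how to drive a state
machine. A message is an input to the machine: it arrives while the machine
is in some state, and it may trigger a transition to another state.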

Message oriented middleware tools have increased in sophistication steadily
in the past four years. Look at the huge efforts being spent to improve ESB
tools and BizTalk. Look at the efforts to describe temporal relationships
in the WS-* standards.

2. the .net application architecture guide says that, the message based
approach can make the design of your business logic independent of the
underlying transport protocol used between services. HOW?

If you take a message based approach in your service oriented architecture,
it means that you start by isolating your state machine into a service
component. In a normal scenario, that state machine does not need to
actually implement the business logic. It implements the state transitions
and temporal error handling. By 'temporal error handling', I'm not
referring to an episode of Star Trek or Doctor Who. I'm talking about the
fact that messages drive state machines, so they have an inherent order to
them. In a banking system, a message to open an account needs to arrive
before the message that asks for the balance on the account. Being able to
say: this message makes sense now, but didn't make sense a minute ago, is a
temporal consideration because of the time relationship with another message.
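
The banking example can be sketched as a tiny message-driven state machine. This is only an illustration under assumed names (AccountState, AccountMessage, AccountStateMachine are hypothetical); a real orchestration engine would carry far more detail.

```csharp
using System;

public enum AccountState { NotOpened, Open, Closed }
public enum AccountMessage { OpenAccount, GetBalance, CloseAccount }

// The machine only decides whether a message makes sense in the current
// state; it does not contain the business logic for computing a balance.
public sealed class AccountStateMachine
{
    public AccountState State { get; private set; } = AccountState.NotOpened;

    // Returns true when the message is valid right now and has been applied.
    public bool Handle(AccountMessage message)
    {
        switch (State)
        {
            case AccountState.NotOpened:
                if (message == AccountMessage.OpenAccount)
                {
                    State = AccountState.Open;
                    return true;
                }
                return false; // e.g. GetBalance arrived too early

            case AccountState.Open:
                if (message == AccountMessage.GetBalance)
                {
                    return true; // valid, but no state change
                }
                if (message == AccountMessage.CloseAccount)
                {
                    State = AccountState.Closed;
                    return true;
                }
                return false;

            default:
                return false; // Closed: no further messages are accepted
        }
    }
}
```

The same GetBalance message is rejected before OpenAccount and accepted after it, which is the temporal consideration described above.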

None of this has anything at all to do with the transport protocol. If I
send a message using SOAP or REST or by dropping commands into a SQL queue
or even in a file system, the state machine is the logic that understands
the message and responds, not the logic that finds and transmits the
message. That is what is meant by this independence. Sticking with
standards helps.
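
A rough sketch of that separation follows. The interface and class names (IMessageSource, InMemorySource, OrderWorkflow, MessagePump) are assumptions made for illustration; the in-memory queue simply stands in for whichever transport is actually used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical transport abstraction: anything that can hand over the next message.
public interface IMessageSource
{
    bool TryReceive(out string message);
}

// One possible transport. A SOAP endpoint, a REST handler, a SQL queue or a
// watched folder could implement the same interface.
public sealed class InMemorySource : IMessageSource
{
    private readonly Queue<string> _pending = new Queue<string>();

    public void Enqueue(string message) { _pending.Enqueue(message); }

    public bool TryReceive(out string message)
    {
        if (_pending.Count > 0)
        {
            message = _pending.Dequeue();
            return true;
        }
        message = "";
        return false;
    }
}

// The workflow side: it interprets messages and never touches a transport API.
public sealed class OrderWorkflow
{
    public List<string> Log { get; } = new List<string>();

    public void Apply(string message) { Log.Add("handled:" + message); }
}

public static class MessagePump
{
    // Drains any source into the workflow and returns how many messages were applied.
    public static int Drain(IMessageSource source, OrderWorkflow workflow)
    {
        int count = 0;
        while (source.TryReceive(out string message))
        {
            workflow.Apply(message);
            count++;
        }
        return count;
    }
}
```

Swapping InMemorySource for another IMessageSource implementation leaves OrderWorkflow untouched, which is the independence the guide is talking about.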

3. In a message based approach you need to take care of special
considerations as message correlation, optimistic data concurrency
management,business process compensation, and external service
unavailability. What r all these ??

* Message correlation - two messages arrive separately, but they drive a
single state machine. Find the machine. Then apply the message. It is
like understanding the difference between a class and an instance of that
class. A class defines the machine, but the instance is the machine. When
a message comes in, you have to find the instance that it applies to.
Therefore, you need some kind of ID that you pass in the messages, and that
all asynchronous collaborators agree to pass back to you, allowing you to
correlate.

It's not as esoteric as it sounds. Your insurance card is a message
correlation ID. You get sick. You go to the doctor. She pulls your chart
and examines you. She draws blood. She sends the blood to a lab for
analysis. You go home and await your results. The doctor's office puts the
file back in the filing cabinet. The lab gets the blood. It has your ID on
it. They test the blood. You are going to be fine. They send back the
results with your ID. The doctor's office gets the result, and pulls your
chart back out of the filing cabinet. They see that you are waiting for a
call, so they ask the doctor to call you. [When they looked at the ID and
decided to pull your chart, they were doing message correlation. The lab is
stateless.]
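
The doctor's office story maps onto code fairly directly. The sketch below is an assumption-laden illustration (PatientCase and DoctorsOffice are hypothetical names): a dictionary plays the filing cabinet, and the patient ID is the correlation ID.

```csharp
using System;
using System.Collections.Generic;

// One state machine instance per patient: the "chart".
public sealed class PatientCase
{
    public string PatientId { get; }
    public bool AwaitingLabResult { get; private set; }
    public string LabResult { get; private set; } = "";

    public PatientCase(string patientId) { PatientId = patientId; }

    public void SendBloodToLab() { AwaitingLabResult = true; }

    public void ReceiveLabResult(string result)
    {
        LabResult = result;
        AwaitingLabResult = false;
    }
}

public sealed class DoctorsOffice
{
    // The filing cabinet: charts indexed by correlation ID.
    private readonly Dictionary<string, PatientCase> _charts =
        new Dictionary<string, PatientCase>();

    public PatientCase OpenChart(string patientId)
    {
        var chart = new PatientCase(patientId);
        _charts[patientId] = chart;
        return chart;
    }

    // Correlation: find the instance from the ID carried in the message,
    // then apply the message to that instance.
    public bool OnLabResult(string patientId, string result)
    {
        if (!_charts.TryGetValue(patientId, out PatientCase chart) || !chart.AwaitingLabResult)
        {
            return false; // unknown ID, or this case is not waiting for a result
        }
        chart.ReceiveLabResult(result);
        return true;
    }
}
```

The lab never needs to know anything about the chart; it only has to echo the ID back with its result.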

* Optimistic data concurrency management - In a long running transaction,
you cannot lock DB records, because the locks would have to be held for the
hours or days that the transaction stays open. Two-phase commit mechanisms
assume synchronous communication mechanisms, so you cannot use them
either... at least not in the pure sense. So you have to take data
transformation into account, often by understanding the effect of messages
on data. It is a design activity as much as it is a runtime algorithm.

Let's say that an employee requests a week of vacation. The company has a
policy that reads like this: "The employee can request time off if they have
already accumulated it. Check accumulation first. If there is sufficient
accumulation to allow the request, then send it to the manager for
approval". Great. Now Joe has six days built up. He wants to take off
either the last week of November or the last week of December. So he sends
two requests, each for a week.

His first request goes through just fine. It is sent to his manager. What
about the second? Do we introduce a way of saying that the first five days
of vacation are "committed to an inflight transaction" with the ability to
roll it back if the manager doesn't approve it? In other words, do we deny
the second request before it gets to the manager? An alternate approach may
be to do a simple check and send the second request to the manager as well.
If the manager approves one, but not the other, no problems... because only
one will be marked as 'planned and approved time off' while the other
request disappears. What happens if the manager approves both? Does the
first one in the door win? How do you coordinate informing the manager that
he approved two but only one went through the system? This is optimistic
data concurrency management.
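
One common technique for the "first one in the door wins" outcome is a version check at commit time. The sketch below assumes that technique and a hypothetical VacationBalance class; it is not necessarily how any particular product resolves the conflict.

```csharp
using System;

public sealed class VacationBalance
{
    public int DaysAvailable { get; private set; }
    public int Version { get; private set; }

    public VacationBalance(int daysAvailable) { DaysAvailable = daysAvailable; }

    // Optimistic check: no lock was held while the manager was deciding.
    // The commit succeeds only if the balance has not changed since the
    // request read it (same version) and enough days remain.
    public bool TryCommit(int days, int versionSeenByRequest)
    {
        if (versionSeenByRequest != Version || days > DaysAvailable)
        {
            return false; // someone else got in first; the caller must be told
        }
        DaysAvailable -= days;
        Version++;
        return true;
    }
}
```

If both of Joe's requests were read against the same version and both are approved, the first commit succeeds and the second fails, and that failure is the point at which the system has to inform the manager.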

* Compensation. I mentioned a compensation scenario above. Let's say that
we choose the first alternative. We have a separate counter for the number
of hours requested but not yet approved. This means the second request is
rejected by the system without going to the manager. Now, let's say that
the manager decides to reject the first request. We cannot roll back the
original transaction. That would be silly. Days may have passed. No, we
deduct the requested hours from the separate counter. Now, if the employee
submits the second request, it will go through to the manager for
consideration. This whole notion of having a separate counter, and then
removing hours from it, is a compensation design. It is there to handle
'rollback' of a request that didn't make it through the system.
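
The separate counter can be sketched as follows. VacationLedger and its members are hypothetical names, and the usage note assumes eight-hour days so that Joe's six days become 48 hours.

```csharp
using System;

public sealed class VacationLedger
{
    public int HoursAccrued { get; private set; }
    public int HoursPending { get; private set; } // requested, not yet approved

    public VacationLedger(int hoursAccrued) { HoursAccrued = hoursAccrued; }

    // Reserve hours before the request is sent to the manager.
    public bool TryReserve(int hours)
    {
        if (HoursPending + hours > HoursAccrued)
        {
            return false; // rejected by the system without reaching the manager
        }
        HoursPending += hours;
        return true;
    }

    // Manager approved: the pending hours become used hours.
    public void Approve(int hours)
    {
        HoursPending -= hours;
        HoursAccrued -= hours;
    }

    // Compensation: the manager rejected the request, so a new action
    // undoes the effect of the earlier reservation.
    public void CompensateRejection(int hours)
    {
        HoursPending -= hours;
    }
}
```

With 48 accrued hours, the first 40-hour request reserves successfully, the second is refused, and after CompensateRejection(40) a new 40-hour request can be reserved again.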

* External service unavailability: As I mentioned above, the state machine
doesn't do more than simply coordinate things. So if a customer requests to
buy 1500 units of a really complex part from a manufacturing company,
there's a lot of coordinating to do. What is the bill of materials? Are
all the parts ready to ship or do some need to be built? How long will it
take to manufacture and deliver parts? Once delivered, what's the
manufacturing backlog on constructing the finished part? Does this timeline
meet the PO requirements? All this logic can be pulled into the state
machine. (We sometimes refer to this as the 'orchestration' or the
'workflow'). However, the logic to decide if all the parts are ready to
ship can be complicated and may require queries to multiple systems. Let's
say that your bill of materials is managed in one legacy system, while your
inventory is managed in another. Now, let's say that a PO arrives and the
inventory system is down for scheduled maintenance (you are swapping out the
network cards to upgrade to a 1GB backbone in that data center). The
orchestration asks for inventory data. No response comes back. What do you
do?

Your messaging system has to know how to hold a transaction request until
the receiving system is back up, and then to resume the flow. BizTalk
allows for retry and automatic dehydration. Other systems queue the message
and wait for a signal that the receiver is back online. It's a distinction
that becomes important if you don't want to lose that message. This is a
huge deal. If you don't have these mechanisms, then the orchestration fails
whenever any one of the remote systems is down, so its failure rate is the
combined downtime of all of them. With these mechanisms, the orchestration
fails only when an outage outlasts the retry policy... often very near to
zero. You go from a very unreliable machine to a very reliable one.
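
The "queue the message and resume later" variant can be sketched in a few lines. RetryingDispatcher is a hypothetical class, and the send delegate is an assumed stand-in for the call to the remote system, returning false while that system is down. This is not how BizTalk implements dehydration.

```csharp
using System;
using System.Collections.Generic;

public sealed class RetryingDispatcher
{
    private readonly Queue<string> _held = new Queue<string>();
    private readonly Func<string, bool> _send; // returns false while the receiver is down

    public RetryingDispatcher(Func<string, bool> send) { _send = send; }

    public int HeldCount { get { return _held.Count; } }

    // Try to deliver. If the receiver is unavailable, or older messages are
    // still waiting, hold the message so that nothing is lost or reordered.
    public void Dispatch(string message)
    {
        if (_held.Count > 0 || !_send(message))
        {
            _held.Enqueue(message);
        }
    }

    // Called on a timer, or when the receiver signals that it is back online.
    // Resends held messages in order and stops at the first failure.
    public int RetryHeld()
    {
        int delivered = 0;
        while (_held.Count > 0 && _send(_held.Peek()))
        {
            _held.Dequeue();
            delivered++;
        }
        return delivered;
    }
}
```

While the inventory system is down, the request simply waits in the held queue; once the system returns, RetryHeld delivers it and the orchestration resumes.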

I hope this helps. Final word: message based processing is not outside SOA.
It is part of it. In the SOA I am working in, we have message based
services as part of the infrastructure.
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
 
Hi Greg,
Thanks for your reply. I am not a computer science graduate thus may be
missing out on some of these concepts. Can you provide me with some links so
that I can learn what I missed, not being a CS graduate :)

Regards
Amit
 
Ginny said:
Hi Greg,
Thanks for your reply. I am not a computer science graduate thus may be
missing out on some of these concepts. Can you provide me with some links so
that I can learn what I missed, not being a CS graduate :)

Here's what I'd recommend:

Go to Amazon (or your favorite book vendor) and get a copy of "Writing Solid
Code" by Steve Maguire. It's possibly the most concentrated collection of
software construction wisdom ever collected into a single book.

I'd also suggest getting a good OO design book - my favorite has always been
"Object Oriented Analysis and Design" by Grady Booch. It's a bit dated now,
but still a great read.

Finally, I'd recommend getting "Design Patterns: Elements of Reusable
Object-Oriented Software" by Erich Gamma, et al (known as the "Gang of Four"
book, or GoF because of the 4 authors).

Work your way through those 3 books and you'll be well on your way. Keep an
eye on the Microsoft Patterns & Practices web site for new papers - they'll
be a lot more meaningful to you with a solid background in the fundamentals.

-cd
 
I personally do not recommend GoF to someone without a CS background looking
to learn patterns. Design Patterns in C# or Head First Design Patterns are
probably better choices.

Booch I would highly recommend (and it is very inexpensive used). To add to
Booch, Object Oriented Software Engineering by Jacobson is also a good read.

I mentioned earlier a good SOA book that covers a lot of the fundamentals for
you.

P of EAA [Fowler] is also worth a read for you (probably close to everything
here http://martinfowler.com/books.html). I would also recommend the POSA
(Pattern-Oriented Software Architecture) series (especially volume 2, which
covers many networked patterns).

MS PnP also has many materials available for download/purchase which are
worth looking at.

Stal
 
Hello Ginny,

Probably the single best source for basic OO info is Robert Martin's
website, which can be found at 'www.ButUncleBob.com'
A good collection of his prior papers on OOAD are on this site at
http://www.butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod

Note that Bob doesn't go into a lot of specifics about low coupling on that
site either. He does talk about cohesion, but primarily from the package
viewpoint, and not so much from the interface viewpoint, which is the level
that Carl replied from. The concept applies at many levels, as Carl so
aptly pointed out.

Primarily the concept of low coupling goes to support the Open-Closed
Principle. Bob does have a good paper on OCP on his blog, and there are
many other good papers available on the web if you search on that term.

Another excellent resource for short articles on OO design can be found in
the NetObjectives e-zine library at:
http://www.netobjectives.com/resources/rs_articles_ours.htm
To wit, one of Scott Bain's articles may offer an excellent set of insights
that you may be able to use:
http://www.netobjectives.com/ezines/ez0411NetObj_EvolutionInCodeStage1.pdf


Lastly, I'd like to expand on another responder's suggestion that you learn
Design Patterns, specifically from the Gang-of-Four (GoF) book. I agree
with the idea that you should learn design patterns. I do not suggest
learning from the Gang of Four book. It is basically Erich Gamma's Ph.D.
thesis. Instead, I suggest an excellent book by Alan Shalloway called
"Design Patterns Explained" which gives you an excellent narrative approach,
welcoming you to OO design rather than overwhelming you with it. After that,
you will have no difficulty with the GoF book.

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

 