What does OpenMP bring to C++?

Vladimir_petter

Hello guys,

It looks like this technology is coming along with VS2005. Does anybody have
experience using it with C++? What are the patterns?

Thanks,
Vladimir.
 
Hello,

Thanks for your response.
With parallel pipelines it will generate faster code (hopefully).
ben

I am not sure what "parallel pipelines" means. The OpenMP 2.0 standard
mostly talks about "parallel regions".

Here is how I see OpenMP in a nutshell:
The traditional (Windows) approach to multithreading is based on a function
that is executed in parallel with the caller (CreateThread). In OpenMP, to
make a piece of code run in parallel I do not need to move it into a
function and think about marshaling data to that function. I just declare
the region as parallel, declare thread-specific and shared variables, and
the compiler takes care of the rest. Moreover, the compiler can now use the
knowledge it gets from my declarations to generate better code. I see this
as a benefit.
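
For example, a minimal sketch of the idea (my illustration, not code from
the original post):

#include <stdio.h>
#include <omp.h>

int main()
{
    int sum = 0;                        // shared between all threads

    // A parallel region: no separate thread function, no manual
    // marshaling of data to it.
    #pragma omp parallel shared(sum)
    {
        int id = omp_get_thread_num();  // declared inside, so private
        #pragma omp critical            // protect the shared variable
        sum += id;
        printf("thread %d running\n", id);
    }

    printf("sum of thread ids: %d\n", sum);
    return 0;
}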

One thing that concerns me is that the OpenMP syntax is kind of noisy and
does not look natural inside a C++ program.

Another thing is that the C++ community does not seem to talk about OpenMP a
lot (or at least I do not see those discussions in comp.lang.c++.moderated
and comp.std.c++), although several major compiler vendors have been
supporting or are going to support OpenMP. There is talk about making a
library like boost::thread part of the C++ standard library. So again this
suggests that OpenMP and the traditional approach will coexist in parallel,
and that no one looks at OpenMP as a replacement for the traditional approach.

It looks like OpenMP is (going to be) widely supported by compiler vendors
like MS, Intel, etc. This big investment on their side rather contradicts
the small amount of attention this technology gets from the C++ community.

So basically my question boils down to the following:
Is OpenMP going to replace the traditional approach, or are these two
solutions targeting different kinds of problems?

Regards,
Vladimir.
 
Vladimir_petter said:
I am not sure what "parallel pipelines" means. The OpenMP 2.0 standard
mostly talks about "parallel regions".

Here is how I see OpenMP in a nutshell:
The traditional (Windows) approach to multithreading is based on a function
that is executed in parallel with the caller (CreateThread). In OpenMP, to
make a piece of code run in parallel I do not need to move it into a
function and think about marshaling data to that function. I just declare
the region as parallel, declare thread-specific and shared variables, and
the compiler takes care of the rest. Moreover, the compiler can now use the
knowledge it gets from my declarations to generate better code. I see this
as a benefit.


Yes, OpenMP is an additional multithreading mechanism, alongside the other
provided ones like .NET multithreading etc.

One thing that concerns me is that the OpenMP syntax is kind of noisy and
does not look natural inside a C++ program.


Yes, it works with #pragma directives, and I guess this involves fewer
run-time decisions and less overhead, but is also less safe. For example:


#include <vector>

int main()
{
    using std::vector;

    vector<int> someVec(100);

    // Each iteration writes a different element, so the
    // iterations are independent of one another.
    #pragma omp for
    for (vector<int>::size_type i = 0; i < someVec.size(); ++i)
        someVec[i] = 10 * i;
}



C:\c>cl /clr temp.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 14.00.41013
for Microsoft (R) .NET Framework version 2.00.41013.0
Copyright (C) Microsoft Corporation. All rights reserved.

temp.cpp
Microsoft (R) Incremental Linker Version 8.00.41013
Copyright (C) Microsoft Corporation. All rights reserved.

/out:temp.exe
temp.obj

C:\c>


With this #pragma directive of the OpenMP 2 standard, the programmer
guarantees that each assignment is independent of the others, and thus the
compiler can divide the loop iterations among separate threads, taking
advantage of any additional processors present in the system.
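
As an aside, and as my own sketch rather than the poster's code: for the
iterations actually to be divided among threads, OpenMP 2.0 needs an
enclosing parallel region (for example the combined parallel for construct),
and it requires a signed loop index:

#include <vector>

int main()
{
    using std::vector;

    vector<int> someVec(100);

    // "parallel for" both creates the thread team and divides the
    // iterations among it; OpenMP 2.0 demands a signed index, hence
    // int rather than vector<int>::size_type.
    #pragma omp parallel for
    for (int i = 0; i < static_cast<int>(someVec.size()); ++i)
        someVec[i] = 10 * i;
}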


Of course, according to the ISO C++ standard unknown #pragmas are ignored,
so the above code poses no portability problem for compilers that do not
support OpenMP:


C:\MinGW\bin\g++.exe -std=c++98 -pedantic-errors -Wall
-fexpensive-optimizations -O3 -ffloat-store -mcpu=pentiumpro temp.cpp -o
temp.exe

temp.cpp: In function `int main()':
temp.cpp:11: warning: ignoring #pragma omp for




However, if we make a mistake, boom. :-)


But this really helps very much with hand-tuning, and given the guarantees
we provide to the compiler, there is much room for compile-time
optimisation.
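
For instance (a sketch of mine, not from the thread), the schedule clause is
one such hand-tuning knob; it tells the compiler and runtime how to divide
the iterations among the threads:

void scale(double* data, int n)
{
    // schedule(static) splits the iterations into equal chunks up
    // front; schedule(dynamic, 16) would instead hand out chunks of
    // 16 iterations to threads as they become free.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i)
        data[i] *= 2.0;
}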


I think that OpenMP should be viewed as separate from and supplementary to
the platform's multithreading support; that is, when we design
multithreading applications we should not take OpenMP into consideration,
but use it separately when we are writing the code.


But I may be missing something, so someone may correct me.



Another thing is that the C++ community does not seem to talk about OpenMP a
lot (or at least I do not see those discussions in comp.lang.c++.moderated
and comp.std.c++), although several major compiler vendors have been
supporting or are going to support OpenMP. There is talk about making a
library like boost::thread part of the C++ standard library. So again this
suggests that OpenMP and the traditional approach will coexist in parallel,
and that no one looks at OpenMP as a replacement for the traditional approach.

Yes!



It looks like OpenMP is (going to be) widely supported by compiler vendors
like MS, Intel, etc. This big investment on their side rather contradicts
the small amount of attention this technology gets from the C++ community.


Consider this in light of the multicore processors coming to the mainstream
toward the end of 2005 or the beginning of 2006.


So basically my question boils down to the following:
Is OpenMP going to replace the traditional approach, or are these two
solutions targeting different kinds of problems?


I think they are targeting different levels of multithreading optimisation.
 
Hello Ioannis,

Thanks for your response,

As I've read in one of the Intel papers, one can distinguish two types of
parallelism:
- "Concurrent programming" is the multithreaded programming we have all got
used to.
- "Parallel programming" is something new for the PC world. So far only the
Intel Itanium processor provides this, with explicit instruction-level
parallelism.

I can see the advantages of using OpenMP for "parallel programming" (because
the C++ language does not give us any other way to express this), but I
rather doubt that it is good for concurrent programming.

In the example you've provided:

#include <vector>

int main()
{
    using std::vector;

    vector<int> someVec(100);

    // Each iteration writes a different element, so the
    // iterations are independent of one another.
    #pragma omp for
    for (vector<int>::size_type i = 0; i < someVec.size(); ++i)
        someVec[i] = 10 * i;
}


I can see that this loop can be optimized a great deal using parallel
programming (on Itanium), but making such loops parallel using concurrent
programming just does not make sense to me. It is simply too expensive to
delegate a single x86 (or x64) "add" instruction to a separate thread. Take
into account that to make this work the compiler must generate a function
whose body is equivalent to the parallel region, and sometimes generate a
structure to marshal data to the OpenMP thread.
That is one of the things that bothers me a lot. All the OpenMP papers I've
read so far contain examples like this that do not make sense to parallelize
in a concurrent program. It is just a waste of CPU.
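
To illustrate the point, here is my own rough sketch (all names invented) of
the kind of outlined function and marshaling structure the compiler has to
generate behind the scenes for such a region:

// Hypothetical illustration only; a real compiler's output differs.
struct LoopArgs            // marshals the region's data to a thread
{
    int* data;
    int  begin;
    int  end;
};

// The outlined function: its body corresponds to the parallel region.
static void loop_chunk(void* p)
{
    LoopArgs* a = static_cast<LoopArgs*>(p);
    for (int i = a->begin; i < a->end; ++i)
        a->data[i] = 10 * i;   // one cheap multiply/store per iteration
}

Setting this up and dispatching it to another thread costs far more than the
handful of instructions in the loop body itself.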
I think that OpenMP should be viewed as separate from and supplementary to
the platform's multithreading support; that is, when we design
multithreading applications we should not take OpenMP into consideration,
but use it separately when we are writing the code.

Then what are the criteria for deciding which tasks should be solved using
OpenMP and which in the old way?
But I may be missing something, so someone may correct me.


Consider this in light of the multicore processors coming to the mainstream
toward the end of 2005 or the beginning of 2006.

I do not understand how multicore CPUs are related to this problem. As far
as I understand, a multicore CPU still leaves us in the "concurrent
programming" world. You buy one chip, but it is actually two CPUs in one
package. So this could be handled perfectly well in the old-fashioned way.
I think they are targeting different levels of multithreading
optimisation.

In that case the question is: what are the criteria that would help us
decide which way to choose for a particular problem?

In particular, I do not see how OpenMP overlaps nicely with the idea of an
(IOCP) thread pool.

Vladimir.
 
Vladimir_petter said:
I can see the advantages of using OpenMP for "parallel programming" (because
the C++ language does not give us any other way to express this), but I
rather doubt that it is good for concurrent programming.

It's providing an important feature that, as far as I know, has only been a
language construct in one language: Occam (for transputers). There, every
compound statement was either PAR or SEQ (IIRC). PAR meant the statements
inside it could be executed in parallel; SEQ meant they had to be executed
in series. One could nest PAR statements inside SEQ ones and vice versa.
I can see that this loop can be optimized a great deal using parallel
programming (on Itanium), but making such loops parallel using concurrent
programming just does not make sense to me. It is simply too expensive to
delegate a single x86 (or x64) "add" instruction to a separate thread. [snip]
That is one of the things that bothers me a lot. All the OpenMP papers I've
read so far contain examples like this that do not make sense to parallelize
in a concurrent program. It is just a waste of CPU.

It's a (big) hint to the optimizing compiler. If the compiler decides that
parallelization isn't worth it, it won't do it. (In an ideal world...)
In that case the question is: what are the criteria that would help us
decide which way to choose for a particular problem?

In particular, I do not see how OpenMP overlaps nicely with the idea of an
(IOCP) thread pool.

Vladimir.

You're right: threads are relatively large-scale things compared to the
parallelization of loops etc., and I think this is the essence of the
answer. Small-scale parallelization is done by coding in OpenMP or Occam's
PAR/SEQ constructs; larger-scale parallelization uses threading,
multitasking, DCOM, CORBA, etc.
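
To make that concrete, a sketch of mine (a hypothetical Win32/OpenMP mix,
not from the thread): a coarse-grained worker thread created the traditional
way can still use OpenMP internally for its fine-grained loops:

#include <windows.h>

static double g_data[100000];   // hypothetical shared buffer

// Large-scale parallelism: a conventional Win32 worker thread.
DWORD WINAPI Worker(LPVOID)
{
    // Small-scale parallelism: OpenMP splits the loop inside it.
    #pragma omp parallel for
    for (int i = 0; i < 100000; ++i)
        g_data[i] = i * 0.5;
    return 0;
}

int main()
{
    HANDLE h = CreateThread(0, 0, Worker, 0, 0, 0);
    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    return 0;
}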

S.
 