Why OpenMP?

   When I attended WWDC, Apple’s Worldwide Developers Conference, a few years ago, some people, if I remember correctly, raised their hands and asked when OpenMP support would be included in the GCC provided by Apple. At that time, I didn’t understand why it was important. I thought OpenMP and MPI were for a special market, like high-performance scientific computing and data analysis. I thought they were in a league of their own.
   Also, I didn’t understand why we needed another multiprocessing/multithreading API when we already had pthreads and various message-passing APIs. I confess that I thought, “Why should programmers learn another threading API? I don’t want to!”

   However, about a month ago, I found out that OpenMP can boost the performance of any individual programmer’s code very “easily”. I would like to put the emphasis on “easily”. In my opinion, when a new library is announced, it should be easy to try without sacrificing your precious time. OpenMP turned out to be in that category.

   At first glance, it looks like a collection of macros that wrap the pthread functions. In fact, it is built into compilers such as the Visual C++ compiler and GCC 4.2.x. In other words, you need a compiler that supports OpenMP.
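One way to check whether a particular build really has OpenMP switched on is the standard _OPENMP macro, which a conforming compiler defines (as a yyyymm date) only when OpenMP is enabled, typically with -fopenmp on GCC or /openmp on Visual C++. The sketch below is only an illustration; the function names openmp_spec_date and report_openmp_support are made up for this example.

```cpp
#include <cstdio>

// Returns the OpenMP specification date (yyyymm) the compiler reports,
// or 0 when this translation unit was built without OpenMP support.
long openmp_spec_date() {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

// Hypothetical helper: prints how this file was compiled.
void report_openmp_support() {
    const long date = openmp_spec_date();
    if (date != 0)
        std::printf("OpenMP enabled, spec date %ld\n", date);
    else
        std::printf("OpenMP not enabled; the pragmas are ignored\n");
}
```

Calling report_openmp_support() once at program start is usually enough to catch a forgotten compiler switch.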

The great features of OpenMP are:

  1. It is very easy to use: there are not many new keywords to memorize, and they are straightforward.
  2. It enables fine-grained multithreading without hassle.
  3. You can use almost the same source code for the single-threaded and multithreaded versions, no matter how many threads you want to create.

Let’s talk about them in more detail to get a better idea of what I mean.

1. The OpenMP keywords are very easy to learn. They are quite clean and don’t introduce new concepts. There are only a few of them, and you can try them very easily without modifying your logic much. Just enclose the logic that should be handled by threads in braces and put the OpenMP directive in front of it. That’s it!

2. When code is written in a multithreaded way, it is usually coarse-grained, because writing fine-grained multithreaded code by hand is tedious. Typically I just create a new thread that runs a whole function as its thread function. With OpenMP, however, you can easily slice a time-consuming for-loop and hand the pieces to separate threads.

3. If OpenMP let you write multithreaded code but forced you to change your code a lot, it would not be useful. Instead, OpenMP lets you convert a single-threaded version of code into a multithreaded version by adding its directives. Usually there is no need to change the structure of the existing code. If you want to use 3 threads instead of 2, you just specify the number of threads, without changing the existing code structure!
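To make the three points above concrete, here is a minimal sketch (the function name sum_range and its threads parameter are hypothetical, chosen only for this illustration). A single directive in front of an ordinary for-loop splits the iterations among threads, the reduction clause gives each thread a private partial sum and adds them up at the end, and the thread count is just a request passed in as a number. If the compiler's OpenMP mode is off, the pragma is ignored and the same source runs single-threaded.

```cpp
// Sums the integers first..last (inclusive).
// `threads` is only a request for how many OpenMP threads to use;
// it must be at least 1.
long sum_range(int first, int last, int threads) {
    (void)threads;  // silences "unused" warnings when OpenMP is disabled
    long sum = 0;

    // Each thread accumulates a private copy of `sum`;
    // the copies are added together when the loop finishes.
    #pragma omp parallel for num_threads(threads) reduction(+:sum)
    for (int i = first; i <= last; ++i) {
        sum += i;
    }
    return sum;
}
```

Calling sum_range(1, 10, 2) and sum_range(1, 10, 3) should both give 55; only the requested thread count changes, not the loop.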

Also, it is quite handy for current multicore processors. The main target of OpenMP is a multiprocessor or multicore machine, i.e. a single computer. MPI, on the other hand, is for distributed environments.
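Because all OpenMP threads live in one process on one machine, they can read and write the same memory directly, with no messages to send. A small sketch of that idea, assuming a hypothetical helper named square_all: every iteration touches a different element of one shared vector, so no locking is needed.

```cpp
#include <vector>

// Squares every element in place. The vector is shared by all threads;
// each loop iteration writes to a distinct element, so there is no race.
void square_all(std::vector<double>& values) {
    const int n = static_cast<int>(values.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        values[i] = values[i] * values[i];
    }
}
```

With MPI, by contrast, each process would own its own piece of the data and the pieces would have to be exchanged explicitly.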

Here is my sample code, which uses OpenMP. It shows how much faster code can be when OpenMP is used. I also tried SIMD instructions to see whether they could achieve better performance than multithreading. I think SIMD is generally more efficient than multithreading, because there is no overhead for creating and maintaining multiple threads. However, my code sample shows that poorly designed SIMD code is slower than simpler but multithreaded code.

// OpenMP.cpp : Defines the entry point for the console application.
#include "stdafx.h"
#include <omp.h>     // omp_* functions
#include <stdio.h>   // printf, printf_s
#include <stdlib.h>  // abs
#include <time.h>    // clock
using namespace std;

#include "performance_measure.h"

#define NUM_THREADS 4
#define NUM_START 1
#define NUM_END 10

void test(int val)
{
    // Run in parallel only when val is non-zero, using val threads.
    #pragma omp parallel if (val) num_threads(val)
    if (omp_in_parallel())
    {
        #pragma omp single
        printf_s("val = %d, parallelized with %d threads\n",
                 val, omp_get_num_threads());
    }
    else
    {
        printf_s("val = %d, serialized\n", val);
    }
}

void AnotherTest( void )
{
    int i, nRet = 0, nSum = 0, nStart = NUM_START, nEnd = NUM_END;
    int nThreads = 0, nTmp = nStart + nEnd;
    unsigned uTmp = (unsigned((abs(nStart - nEnd) + 1)) *
                     unsigned(abs(nTmp))) / 2;
    int nSumCalc = uTmp;

    if (nTmp < 0)
        nSumCalc = -nSumCalc;

    omp_set_num_threads(NUM_THREADS);

    #pragma omp parallel default(none) private(i) shared(nSum, nThreads, nStart, nEnd)
    {
        #pragma omp master
        nThreads = omp_get_num_threads();

        #pragma omp for
        for (i = nStart; i <= nEnd; ++i)
        {
            #pragma omp atomic
            nSum += i;
        }
    }

    if (nThreads == NUM_THREADS)
    {
        printf_s("%d OpenMP threads were used.\n", NUM_THREADS);
        nRet = 0;
    }
    else
    {
        printf_s("Expected %d OpenMP threads, but %d were used.\n",
                 NUM_THREADS, nThreads);
        nRet = 1;
    }

    if (nSum != nSumCalc)
    {
        printf_s("The sum of %d through %d should be %d, "
                 "but %d was reported!\n",
                 NUM_START, NUM_END, nSumCalc, nSum);
        nRet = 1;
    }
    else
        printf_s("The sum of %d through %d is %d\n",
                 NUM_START, NUM_END, nSum);
}

void test2(int iter)
{
    #pragma omp ordered
    printf_s("test2() iteration %d by thread ID %d\n",
             iter, omp_get_thread_num());
}

void AnotherTest2( void )
{
    int i;
    #pragma omp parallel
    {
        #pragma omp for ordered
        for (i = 0 ; i < 5 ; i++)
            test2(i);
    }
}

/*
 * taylor.c
 *
 * This program calculates the value of e*pi by first calculating e
 * and pi by their taylor expansions and then multiplying them
 * together.
 */
#define num_steps 20000000

void sequential_taylor( void )
{
    double start, stop; /* times of beginning and end of procedure */
    double e, pi, factorial, product;
    int i;

    printf("Sequential Taylor\n");

    /* start the timer */
    start = clock();

    /* First we calculate e from its taylor expansion */
    printf("e started\n");
    e = 1;
    factorial = 1;
    /* rather than recalculating the factorial from scratch each
       iteration we keep it in this variable and multiply it by i
       each iteration. */
    for (i = 1; i
For GCC, version 4.2.x or above is required. For Mac OS X, if you log in to the ADC web site, you can download version 4.2.3(?) or above. It is still kind of a beta.

If you want to know more about OpenMP, visit:

GOMP is the OpenMP implementation for C/C++ and Fortran 95 in the GNU Compiler Collection, a.k.a. GCC.


5 responses to this post.

  1. Posted by sad on December 23, 2009 at 1:45 AM

    Thank you for this important article.

    My problem is that I have a C program and I tried to parallelize it using OpenMP, but it gives no performance gain.

    Even the serial code runs faster than the parallel one.

    Is there anyone I could send parts of my code to, who could give me back code parallelized correctly
    so that it runs on dual and quad cores?
    Please reply to my email and I will send my C program.


    • Posted by jongampark on December 23, 2009 at 8:33 PM

      Hello. If serial code runs faster than its parallel counterpart, the problem can be:
      1. It is parallelized too much.
      Context switching also consumes processing power. If a process spawns too many threads or processes, the processors spend more of their time switching among them than doing actual work.

      2. There can be dependencies among processes/threads.
      If there are heavy dependencies among them, they will be processed more sequentially than in parallel.
      While they are processed sequentially, the CPUs still have to maintain the processes and threads. Managing them then becomes an unnecessary burden.

      3. Or there is some other problem in your code.

      It is easy to change the number of threads with OpenMP, so you can experiment by changing the number of threads it creates.
      I’m sorry that I can’t help investigate your code. It is up to you.
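A minimal sketch of such an experiment, assuming a hypothetical helper named time_sum: it runs the same loop with a requested number of threads and returns the elapsed wall-clock time, so the serial and parallel runs can be compared directly. The timings depend entirely on the machine and on how much work the loop does; for a small loop the thread start-up cost can easily outweigh the gain.

```cpp
#include <chrono>
#include <vector>

// Sums `data` using the requested number of threads (at least 1),
// stores the sum in *result, and returns the elapsed time in seconds.
double time_sum(const std::vector<double>& data, int threads, double* result) {
    (void)threads;  // silences "unused" warnings when OpenMP is disabled
    const int n = static_cast<int>(data.size());
    double sum = 0.0;

    const auto start = std::chrono::steady_clock::now();

    #pragma omp parallel for num_threads(threads) reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += data[i];
    }

    const auto stop = std::chrono::steady_clock::now();
    *result = sum;
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling it with 1, 2, 4 and 8 threads on the same data and printing the returned times is one simple way to see where adding threads stops helping.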


  2. Posted by saru on January 15, 2011 at 12:29 AM

    It’s very nice, but I came across another similar blog


    so I just thought of sharing it here. It was very helpful for me.


  3. Posted by anshu on May 1, 2011 at 1:29 AM

    Yes, I found this site very helpful. OpenMP really is amazing!

    I went through the site mentioned by saru (http://openmp.blogspot.com/2011/01/home.html); it was also nice.
    It is a very good site for beginners.

