Parallel programming began about 30 years ago with the development of multiprocessor computers. Generally, multiprocessor computers differ in how the processors are coupled to memory. In a shared-memory architecture, multiple processors share the same memory modules, so they work in a common address space and can access and update the same variables simultaneously. In a distributed-memory architecture, on the other hand, each processor has a private memory that cannot be accessed or modified by the other processors. In this type of architecture the processors communicate over a high-speed network that allows them to send and receive data. Typically, a task is distributed evenly among the processors and the partial results are then reassembled into the final answer.

Because of these architectural differences, there are two standards for programming in parallel: Open Multi-Processing (OpenMP) and Message Passing Interface (MPI). OpenMP is the standard for programming on shared-memory architectures and comprises a set of compiler directives and libraries that control the parallel execution of the program. The advantage of this standard is the ease of converting a serial program into a parallel one: the programmer only needs to add OpenMP directives around certain code blocks, such as for or do loops (a minimal sketch is given at the end of this section). Another advantage is that a program with OpenMP directives can be executed on both serial and parallel computers, because the directives appear as comments to a compiler that does not conform to the OpenMP standard. Programs written with OpenMP directives are therefore portable.

The MPI standard includes a number of libraries for communication and data transfer between the processors over a high-speed network. Unlike OpenMP, which is limited to computers with a shared-memory architecture, programs written with MPI can be deployed on both shared- and distributed-memory architectures (a corresponding sketch is also given at the end of this section). However, parallelizing a serial program with MPI is often a challenging task: the job must be distributed over a number of processors with minimal communication among them, since the speed of the network limits the overall execution of the program.

Which one is better, then? The answer is not trivial. One might think that OpenMP is always preferable because it is easier to implement than MPI, but not all programs benefit from parallelizing with OpenMP. For example, programs that handle databases, also known as "coarse-grained problems", do not benefit from parallelizing with OpenMP. Conversely, "fine-grained problems" such as weather simulation, where there is an intrinsic dependence between the parts of the problem, are not suitable for MPI because of the overhead from the vast number of messages that would be required. Thus, choosing the parallelization scheme often depends on the problem at hand.
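
To illustrate how little changes when a loop is parallelized with OpenMP, the following minimal sketch sums a series in C. The loop body, array size, and compiler flag (e.g. -fopenmp for GCC) are illustrative assumptions, not taken from the text; the point is that the single #pragma line is all that distinguishes the parallel program from its serial counterpart, and a compiler without OpenMP support simply ignores it.

    #include <stdio.h>

    /* Minimal OpenMP sketch: the iterations of the loop are shared among
     * the available threads when compiled with OpenMP support; without
     * it, the #pragma is ignored and the program runs serially. */
    int main(void)
    {
        const int n = 1000000;   /* illustrative problem size */
        double sum = 0.0;

        /* reduction(+:sum) gives each thread a private copy of sum and
         * combines the copies when the loop finishes. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += 1.0 / (i + 1);

        printf("sum = %f\n", sum);
        return 0;
    }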
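
For comparison, a sketch of the same summation written with MPI is shown below. Here the work must be divided explicitly among the processes and the partial results combined with an explicit communication call (MPI_Reduce); the problem size and the cyclic distribution of iterations are illustrative choices.

    #include <stdio.h>
    #include <mpi.h>

    /* Minimal MPI sketch: each process computes a partial sum over its
     * share of the iterations, and the partial sums are combined on
     * process 0 with MPI_Reduce. */
    int main(int argc, char **argv)
    {
        const int n = 1000000;   /* illustrative problem size */
        int rank, size;
        double local = 0.0, total = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Distribute the iterations cyclically among the processes. */
        for (int i = rank; i < n; i += size)
            local += 1.0 / (i + 1);

        /* Collect the partial sums on process 0. */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum = %f\n", total);

        MPI_Finalize();
        return 0;
    }

The contrast between the two sketches reflects the trade-off discussed above: the OpenMP version requires only a directive but assumes a shared address space, whereas the MPI version runs on distributed-memory machines at the cost of managing the data distribution and communication by hand.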