A06 - Parallel Reduction with Performance Analysis
Due Date: Thursday 02/22/2024 @ 11:59 PM
Late Policy
- You have until the assigned due date, after that you will receive 0 points.
Advanced C++ OpenMP Assignment: Parallel Reduction with Performance Analysis
Objective
This assignment aims to deepen your understanding of OpenMP’s parallel loop constructs, the concept of reduction, and the impact of thread count on performance. You will implement a parallel version of array summation and analyze how execution time varies with different numbers of threads.
Starter Code
Below is the starter code for your assignment. Your task is to complete the missing part of the code and extend it to measure performance.
#include <omp.h>
#include <iostream>
#include <vector>
int main() {
const int SIZE = 1e6;
std::vector<int> numbers(SIZE, 1); // Initialize all elements to 1
int total_sum = 0;
// Parallel region begins here
#pragma omp parallel
{
// TODO: Use 'omp for' and 'reduction' to calculate the sum of the array
}
std::cout << "Total sum is " << total_sum << std::endl;
return 0;
}
Instructions
- Complete the Parallel Region: Use the omp for directive and reduction clause to calculate the sum in parallel.
- Performance Measurement: Modify the program to measure and record the execution time for the parallel region with varying numbers of OpenMP threads (1, 2, 4, 8). Use omp_set_num_threads() to set the thread count and omp_get_wtime() to measure execution time.
-
Compilation and Execution: Compile the code using
g++ -fopenmp openmp01.cc -o openmp01
or create a Makefile. Run the compiled program multiple times, each with a different number of threads, and collect the execution time data. (openmp01) - Graphical Analysis: Plot the recorded execution times against the number of threads. This graph should help visualize the performance scalability with the increase in thread count. (openmp01.png)
- Discussion Questions: Reflect on the observed performance changes with varying thread counts. Consider the impact of thread management overhead, the potential for diminishing returns, and the effect of hardware on performance scalability. (openmp01.txt)
Expected Learning Outcomes
- An understanding of parallel loops, race conditions, and the application of the reduction clause in OpenMP is essential.
- The ability to measure and analyze the performance impact of altering the number of threads in a parallel program is crucial.
- Gaining insights into the scalability and efficiency of parallel computing with OpenMP is valuable for optimizing performance.
Advanced Challenge (Optional)
Explore the impact of different scheduling strategies (static, dynamic) on performance. Modify the parallel loop to specify the scheduling strategy and compare the execution times with those obtained using the default setting. (openmp01a.txt and openmp01a.png)