A05 - Getting Loopy with OpenMP
Due Date: Thursday 03/06/2025 @ 11:59 PM
Assignment: GitHub Classroom
Late Policy
- You have until the assigned due date; after that, you will receive 0 points.
Getting Loopy with OpenMP (25 points)
Overview
In this assignment, you will extend the provided code in `loopy.cc` by parallelizing four computational tasks using OpenMP. The original code implements each task serially, and your goal is to modify the code to introduce parallelism where appropriate. The four tasks are:
- Matrix Initialization: Populate a 2D matrix.
- Row Sum Calculation: Compute the sum of each row.
- Stencil Computation with Loop Unrolling: Update matrix elements based on neighboring values.
- Unrolled Sum Reduction: Calculate a total sum of matrix elements with loop unrolling.
Each task is timed using OpenMP’s timing function `omp_get_wtime()`, and the results (including sample data values, sums, and execution times) are written to a file named `loopy.dat`.
Objective and Expected Learning Outcomes
Objectives
- Introduce Parallelism with OpenMP: Apply OpenMP directives (e.g., `#pragma omp parallel for`, `collapse`, `reduction`) to the provided serial code.
- Measure Performance: Use OpenMP’s timing functions to compare the execution times of the serial versus parallel implementations.
- Analyze Speedup: Evaluate the benefits and potential overheads of parallelization by analyzing the output timings.
- Build Automation: Develop a Makefile that compiles `loopy.cc` with the necessary OpenMP flag (`-fopenmp`).
Expected Outcomes
- Enhanced Parallel Programming Skills:
  - Successfully convert serial loops to parallel using OpenMP.
  - Understand common challenges such as data races and proper reductions.
- Accurate Performance Evaluation:
  - Utilize `omp_get_wtime()` to capture precise timing for each task.
  - Compare the cumulative execution times of the serial and parallel implementations.
- Effective Build Management:
  - Create a robust Makefile that compiles the code with OpenMP support.
  - Ensure reproducible builds on designated systems.
- Documentation and Reflection:
  - Write clear, well-commented code.
  - Reflect on your approach, challenges, and outcomes in a reflection file.
Instructions
Part 1: Code Modification and Parallelization
- Review the Provided Code:
  - Examine `loopy.cc`, which includes serial implementations of the four tasks.
  - Understand the purpose and flow of each task, as well as how performance is measured using `omp_get_wtime()`.
- Implement Parallel Versions:
  - Task 1 (Matrix Initialization): Parallelize the nested loops using `#pragma omp parallel for collapse(2)`.
  - Task 2 (Row Sum Calculation): Parallelize the row sum computation using a local sum variable for each thread.
  - Task 3 (Stencil Computation): Parallelize the inner loop (or the outer loop, as appropriate) while keeping the loop unrolling intact.
  - Task 4 (Unrolled Sum Reduction): Use OpenMP’s `reduction` clause to parallelize the total sum computation.
  - Verify that the serial and parallel implementations produce consistent results; you can use `check.py`.
- Output Results:
  - Ensure that your program writes the following to `loopy.dat`:
    - A sample matrix value (e.g., `Matrix[500][500]`).
    - A sample row sum (e.g., `RowSums[500]`).
    - The total sum.
    - The execution time for each version (serial and parallel).
  - Use the provided `writeResults` function as a guide.
Part 2: Makefile Development
- Create a Makefile:
  - Write a Makefile that compiles `loopy.cc` into an executable named `loopy`.
  - Include the OpenMP flag `-fopenmp` in your compilation options.
  - Define at least two targets:
    - `all`: Compiles the executable.
    - `clean`: Removes build artifacts and the executable.
- Testing:
  - Confirm that your Makefile builds the project correctly on the designated systems.
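A minimal Makefile along these lines satisfies the requirements. The compiler name `g++` and the extra flags `-O2 -Wall` are assumptions; use whatever your designated systems require:

```make
CXX      = g++
CXXFLAGS = -O2 -Wall -fopenmp   # -fopenmp enables the OpenMP pragmas

all: loopy

# recipe lines must start with a tab character
loopy: loopy.cc
	$(CXX) $(CXXFLAGS) -o loopy loopy.cc

clean:
	rm -f loopy *.o

.PHONY: all clean
```

`clean` deliberately leaves `loopy.dat` alone, since that file must be committed for grading.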
Part 3: Experiment Analysis and Reflection
- Performance Analysis:
  - Document the execution times for both the serial and parallel versions.
  - Optionally, generate a graph or chart (e.g., `loopy.png`) to illustrate the performance improvements.
- Reflection:
  - Write a brief reflection in a file named `loopy.txt` covering:
    - Your approach to parallelizing each task.
    - Challenges encountered (e.g., handling data dependencies or synchronization).
    - An analysis of whether the observed speedup met your expectations.
    - Lessons learned and ideas for future optimizations.
Submission Guidelines and Evaluation Criteria
- Add and/or edit the following files in your GitHub repository:
  - `loopy.cc` – The source file with your serial and parallel implementations.
  - `Makefile` – A Makefile that compiles your project with OpenMP support.
  - `loopy.dat` – The output file generated by your program.
  - `loopy.txt` – A reflection on your approach, challenges, and insights.
  - (Optional) `loopy.png` – A graph or chart illustrating performance differences.
- When your program is ready for grading, commit and push your local repository to the remote GitHub Classroom repository following the Assignment Submission Instructions.
- Your work will be evaluated on adherence to instructions, code quality, file naming, completeness, and your reflection on the assignment.