In this lab, you are required to develop and compile a parallel program on the Hopper cluster, using the
C programming language and pthreads library.
Necessary instructions to develop code on Hopper are provided in the document:
Code Development on Hopper.
These instructions require you to install VSCode, which is a versatile IDE we will use for CPU and
GPU code development.
Since you did your first lab using MobaXterm, you can test this program by compiling it on your
personal computer (with MobaXterm). However, your submission must include only results obtained
with VSCode on the Hopper cluster.
In this lab, you will be applying the lessons of Lecture 3 to a new application: adding two images.
The provided code (imadd.c) will take two bitmaps and add them as:
output = 0.7 * Image1 + 0.3 * Image2
So really, it’s a weighted average, not a sum. Operations similar to this take place in several
applications, including motion detection and facial recognition.
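To make the formula concrete, here is a minimal sketch of the weighted add applied to raw 8-bit channel values. It is not the code in imadd.c; the names blend_channel() and blend_buffers() are hypothetical, and the sketch assumes both images have already been loaded into equally sized byte buffers.

```c
#include <stddef.h>

/* Hypothetical helper (not from imadd.c): blend one 8-bit channel value
   from each image as 0.7 * a + 0.3 * b, rounded to the nearest integer. */
unsigned char blend_channel(unsigned char a, unsigned char b)
{
    float v = 0.7f * (float)a + 0.3f * (float)b;
    return (unsigned char)(v + 0.5f);   /* weights sum to 1, so v stays within 0..255 */
}

/* Blend n bytes of two equally sized pixel buffers into out.
   Assumes img1, img2 and out each hold at least n bytes. */
void blend_buffers(const unsigned char *img1, const unsigned char *img2,
                   unsigned char *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = blend_channel(img1[i], img2[i]);
}
```

Because the two weights add up to 1, the result can never exceed the largest input value, so no explicit clamping is needed in this sketch.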
I have provided a file (imadd.c), which includes two functions that perform weighted-add:
• AddImages(): A naïve single-threaded version
• AddImagesMT0(): Its multi-threaded version.
• To test both of these functions, I included example Image2’s, named Laser.bmp and
o You can weighted-add these to our familiar images: Astronaut.bmp and DogL.bmp.
o When you add DogL.bmp and Laser.bmp, save the result as LaserDog.bmp.
o When you add Astronaut.bmp and Moon.bmp, save the result as MoonAstronaut.bmp.
• Your goal is to improve the performance of the AddImagesMT0() function by writing three
versions of it, named AddImagesMT1(), AddImagesMT2(), and AddImagesMT3().
• Your versions are expected to be progressively faster, with AddImagesMT3() the fastest.
• Your program must be capable of accepting two command line options:
o The number of threads
o The function version.
• Here are some example command lines:
$ imadd 1        runs the serial version, AddImages(); the 2nd argument is ignored
$ imadd 16 3     runs AddImagesMT3() with 16 threads
• You must run the original and your three versions and report runtimes in your lab report.
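One possible way to handle the two command line options is sketched below. This is an illustration only: parse_options(), parse_int(), MAX_THREADS and SERIAL_VERSION are hypothetical names, the thread limit of 128 and the default to version 0 when the second argument is missing are assumptions, and the real imadd.c may take additional arguments.

```c
#include <stdlib.h>

#define MAX_THREADS    128   /* assumed upper bound, not stated in the handout */
#define SERIAL_VERSION (-1)  /* hypothetical marker meaning "run AddImages()" */

/* Parse a decimal integer in [lo, hi]; returns 0 on success, -1 on error. */
int parse_int(const char *s, long lo, long hi, int *out)
{
    char *end;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v < lo || v > hi)
        return -1;
    *out = (int)v;
    return 0;
}

/* argv[1] = number of threads, argv[2] = function version (0..3).
   With 1 thread the serial version is selected and argv[2] is ignored. */
int parse_options(int argc, char **argv, int *threads, int *version)
{
    if (argc < 2)
        return -1;
    if (parse_int(argv[1], 1, MAX_THREADS, threads) != 0)
        return -1;
    if (*threads == 1) {
        *version = SERIAL_VERSION;
        return 0;
    }
    if (argc < 3) {
        *version = 0;        /* assumption: default to AddImagesMT0() */
        return 0;
    }
    return parse_int(argv[2], 0, 3, version);
}
```

The caller can then switch on the version value to pick AddImages() or one of the AddImagesMT0() to AddImagesMT3() functions, and print a usage message when parse_options() returns -1.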
Ideas for developing AddImagesMT1() through AddImagesMT3():
• implement buffers like imflipPM and try different buffer sizes
• modify the access patterns to be more sequential/localized
• split up the work differently to be memory-friendly
• 0.7 and 0.3 are floating-point numbers (float), while pixels are unsigned char. Can you think of
something to speed this up? Play with the variable types; a tiny difference in accuracy won’t
noticeably change your results.
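Two of the ideas above can be sketched in a few lines. This is one possible direction, not the expected solution: blend_fixed() replaces the float multiplications with integer weights (0.7 is approximated by 179/256 and 0.3 by 77/256), and thread_rows() gives each thread one contiguous block of rows so its memory accesses stay sequential. Both function names and the 8-bit scaling are assumptions made for illustration.

```c
/* Integer weights scaled by 256: 179/256 is about 0.699, 77/256 is about 0.301.
   They sum to 256, so the shifted result never exceeds 255. */
#define W1 179u
#define W2 77u

/* Hypothetical integer-only replacement for 0.7 * a + 0.3 * b.
   Adding 128 before the shift rounds to the nearest integer. */
unsigned char blend_fixed(unsigned char a, unsigned char b)
{
    return (unsigned char)((W1 * a + W2 * b + 128u) >> 8);
}

/* Hypothetical work split: thread tid (0-based) of nthreads processes
   rows [*first, *last). Leftover rows go one each to the lowest thread IDs. */
void thread_rows(int tid, int nthreads, int rows, int *first, int *last)
{
    int base  = rows / nthreads;
    int extra = rows % nthreads;
    *first = tid * base + (tid < extra ? tid : extra);
    *last  = *first + base + (tid < extra ? 1 : 0);
}
```

The integer version can differ from the float result by one intensity level for some inputs, which is the small accuracy trade-off mentioned above. Whether it is actually faster on Hopper is something to measure and report, not assume.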