CSE 512/CS 554 - Machine Problem 1
Due Apr 1, 2008
-
Implement the agglomerated version (multiple grid points per
coarse-grain task) of the 1-D Laplace example from Chapter 2 using
MPI. You may use some fixed number of iterations rather than iterating
until convergence.
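The decomposition can be checked before any MPI code is written. The sketch below is a serial Python simulation, not an MPI program: it splits the interior points into p contiguous blocks, gives each block one ghost value per side, and confirms that the result matches an unpartitioned Jacobi sweep. The function names, the 0.5 * (left + right) update, and the fixed boundary values are assumptions made for illustration. It also assumes n >= p so that no block is empty.

```python
def jacobi_serial(u, iterations):
    """Reference sweeps on the whole grid; the two endpoints are fixed boundary values."""
    u = list(u)
    for _ in range(iterations):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def jacobi_agglomerated(u, p, iterations):
    """Same sweeps with the interior split into p contiguous blocks plus ghost values."""
    left_bc, right_bc = u[0], u[-1]
    interior = list(u[1:-1])
    n = len(interior)
    # Block k owns interior[starts[k]:starts[k + 1]]; block sizes differ by at most one.
    starts = [k * n // p for k in range(p + 1)]
    blocks = [interior[starts[k]:starts[k + 1]] for k in range(p)]
    for _ in range(iterations):
        # "Communication" phase: each task obtains one value from each neighbor.
        ghosts = []
        for k in range(p):
            left = blocks[k - 1][-1] if k > 0 else left_bc
            right = blocks[k + 1][0] if k < p - 1 else right_bc
            ghosts.append((left, right))
        # "Computation" phase: each task updates only the points it owns.
        new_blocks = []
        for k in range(p):
            ext = [ghosts[k][0]] + blocks[k] + [ghosts[k][1]]
            new_blocks.append(
                [0.5 * (ext[i - 1] + ext[i + 1]) for i in range(1, len(ext) - 1)]
            )
        blocks = new_blocks
    return [left_bc] + [x for b in blocks for x in b] + [right_bc]


if __name__ == "__main__":
    u0 = [1.0] + [0.0] * 11          # 10 interior points, hypothetical boundary values
    print(jacobi_agglomerated(u0, 3, 25) == jacobi_serial(u0, 25))
```

In an MPI version, the ghost-value phase is where the sends and receives would go; everything else is local to a task.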
-
Discuss the effects of the various communication modes of MPI with
respect to the ordering and placement of sends and receives in this
program.
- Is there any choice that risks deadlock?
- Is there any choice that will certainly deadlock?
- Is there any choice that results in effectively serial execution
(i.e., only one task actually executes at a time)?
- What communication mode and placement of sends and receives
enables the greatest overlap of communication and computation?
Verify your conclusions by making test runs to demonstrate the issues
discussed above (beware of explicit deadlock, however — do not let
your program hang indefinitely).
-
Run your program for a range of values of n and p, then plot and
discuss the speedup and efficiency you observe.
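For reference, speedup is the one-task time divided by the p-task time, and efficiency is speedup divided by p. A minimal sketch of that arithmetic follows. The timings are hypothetical placeholders, not measurements, and the function name is an assumption.

```python
def speedup_and_efficiency(t_serial, t_parallel, p):
    """Return (speedup, efficiency) for a run on p tasks."""
    speedup = t_serial / t_parallel
    return speedup, speedup / p


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by p, for one fixed n.
    timings = {1: 8.0, 2: 4.4, 4: 2.5, 8: 1.6}
    for p, t_p in sorted(timings.items()):
        s, e = speedup_and_efficiency(timings[1], t_p, p)
        print(f"p={p:2d}  speedup={s:5.2f}  efficiency={e:4.2f}")
```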
-
Formulate a performance model for this parallel program in terms of the
usual parameters n, p, t_c, t_s, and t_w. Compare the predictions of
your model with performance actually observed. You will need values
for the computation and communication parameters. How can you obtain
these?
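As a starting point only, one plausible model assumes that each task updates n/p points per iteration at cost t_c each and exchanges a single word with each of two neighbors, with no overlap, so that the time per iteration is roughly t_c * n / p + 2 * (t_s + t_w). The sketch below encodes that assumption. The function names and the parameter values in the example are hypothetical, and your own model may need different terms (for example, for end tasks with only one neighbor).

```python
def model_time(n, p, t_c, t_s, t_w, iterations=1):
    """Predicted run time under the assumed model (no communication when p == 1)."""
    compute = t_c * n / p
    communicate = 2.0 * (t_s + t_w) if p > 1 else 0.0
    return iterations * (compute + communicate)


def model_efficiency(n, p, t_c, t_s, t_w):
    """Predicted efficiency T_1 / (p * T_p) under the same assumed model."""
    t_1 = model_time(n, 1, t_c, t_s, t_w)
    t_p = model_time(n, p, t_c, t_s, t_w)
    return t_1 / (p * t_p)


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary time units, not measured ones.
    t_c, t_s, t_w = 1.0, 50.0, 2.0
    for p in (1, 2, 4, 8):
        print(p, model_time(1000, p, t_c, t_s, t_w),
              round(model_efficiency(1000, p, t_c, t_s, t_w), 3))
```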
-
What is the isoefficiency function for this parallel program? Using
the parameters determined in the previous question, plot isoefficiency
curves given by your performance model and compare with your previous
observations of actual efficiency. How would the isoefficiency change
if a convergence test were required rather than taking a fixed number
of iterations?
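To plot an isoefficiency curve, you need the problem size n that holds efficiency at a chosen value E as p grows. The sketch below assumes, purely for illustration, a per-iteration time of t_c * n / p + 2 * (t_s + t_w), which gives E = t_c * n / (t_c * n + 2 * p * (t_s + t_w)), and solves that expression for n. The function name and parameter values are hypothetical; substitute the expression that follows from your own model.

```python
def n_for_efficiency(E, p, t_c, t_s, t_w):
    """Problem size n giving efficiency E on p tasks under the assumed model."""
    if not 0.0 < E < 1.0:
        raise ValueError("E must lie strictly between 0 and 1")
    # From E = t_c*n / (t_c*n + 2*p*(t_s + t_w)), solved for n.
    return (E / (1.0 - E)) * 2.0 * p * (t_s + t_w) / t_c


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary time units, not measured ones.
    t_c, t_s, t_w = 1.0, 50.0, 2.0
    for p in (2, 4, 8, 16):
        print(p, n_for_efficiency(0.5, p, t_c, t_s, t_w))
```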
-
Using a suitable choice of communication mode, implement the version of
the Laplace example with overlapped communication and computation and
compare its performance with that of the original version. How does
overlapping affect your performance model?
-
Generate trace files using SLOG or MPICL so that you can visualize the
behavior and performance of your programs using Jumpshot or ParaGraph.
Use such visualizations to illustrate some of the points made in
answering the previous questions.
You should submit your program listings, output, and any written
discussion either in hard copy or by email (PDF format preferred).
For information on building an executable program and running it on
Turing, see the tutorials and FAQ on the Turing website. To obtain
MPICL tracing, you should use an appropriate compiler script (e.g.,
mpiclcc instead of mpicc) to build your executable.