
BGAS MPI user's guide

General information

The MPI implementation installed on the BGAS system is MVAPICH2-1.8. On all BGAS nodes it is installed here:

/usr/lib64/mvapich2

Compiling an MPI program

Use any of the following GNU compiler wrappers:

  • /usr/lib64/mvapich2/bin/mpicc
  • /usr/lib64/mvapich2/bin/mpic++ or /usr/lib64/mvapich2/bin/mpicxx
  • /usr/lib64/mvapich2/bin/mpif77
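As a minimal sketch, a C source file could be compiled with the mpicc wrapper as shown below. The file names hello.c and hello.x are assumptions chosen for illustration. The block prints the command and only runs it when both the wrapper and the source file are present.

```shell
# Installation prefix taken from this guide.
MPI_BIN=/usr/lib64/mvapich2/bin

# hello.c and hello.x are illustrative names, not files shipped with BGAS.
SRC=hello.c
EXE=hello.x

COMPILE_CMD="$MPI_BIN/mpicc -o $EXE $SRC"
echo "$COMPILE_CMD"

# Compile only when the wrapper and the source file actually exist.
if [ -x "$MPI_BIN/mpicc" ] && [ -f "$SRC" ]; then
    $COMPILE_CMD
fi
```

The C++ and Fortran 77 wrappers listed above should accept the same -o option, since they pass it through to the underlying GNU compiler.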

Executing an MPI program

To execute an MPI application, use the mpiexec executable. It launches the process manager (and therefore handles the process distribution), passes the chosen environment variables to the user application, handles various MPI options and, finally, executes the user application on the given set of hosts. The mpiexec command is located in the bin directory of the mvapich2 installation:

/usr/lib64/mvapich2/bin/mpiexec -env MV2_IBA_HCA=roq -env MV2_USE_RDMA_CM=1 -env MV2_USE_IWARP_MODE=1 ...

Note that setting the environment variable MV2_USE_RDMA_CM is mandatory; the others are required when using the I/O torus links for communication (see below).

To run a sample MPI program hello.x on the local host using just one process, execute:

/usr/lib64/mvapich2/bin/mpiexec -env MV2_USE_RDMA_CM=1 hello.x

To run the hello.x program using 8 processes on 2 hosts communicating via the interfaces 12.64.0.0 and 12.65.0.0, execute:

/usr/lib64/mvapich2/bin/mpiexec -env MV2_USE_RDMA_CM=1 -n 8 -hosts 12.64.0.0,12.65.0.0 hello.x

Please note that in the above case, the processes will be distributed between the hosts in a round-robin fashion.

Alternatively, a hosts file containing the list of hosts can be specified as a command line option. To execute the command mentioned above using a hosts file instead of the -hosts command line option, use the following command:

/usr/lib64/mvapich2/bin/mpiexec -env MV2_USE_RDMA_CM=1 -n 8 -f hosts.in hello.x

A hosts.in file that reproduces the round-robin process distribution can look like this:

cat hosts.in
  12.64.0.0
  12.65.0.0

A more complex hosts file can limit the number of processes assigned to each host. In the following case, 12.64.0.0 will execute ranks 0 and 1, and 12.65.0.0 will execute ranks 2 to 7:

cat hosts.in
  12.64.0.0:2
  12.65.0.0:6

Comments can be added to the hosts file, starting with the # character:

cat hosts.in
  12.64.0.0:2  # Q70-I2-J00
  12.65.0.0:6  # Q70-I2-J01
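To preview which ranks end up on which host, the placement rules described above can be modelled with a short script. This is only a sketch of the behavior as this guide describes it: it does not query mpiexec, and the variable names NPROCS and PLACEMENT are chosen for illustration.

```shell
# Write the example hosts file from this guide (host:limit plus comments).
cat > hosts.in <<'EOF'
12.64.0.0:2  # Q70-I2-J00
12.65.0.0:6  # Q70-I2-J01
EOF

# Model of the placement: each host takes ranks up to its limit
# (1 if no limit is given) before the next host is used, and the
# host list is reused cyclically while processes remain.
NPROCS=8
PLACEMENT=$(awk -v n="$NPROCS" '
    { sub(/#.*/, "") }                  # drop comments
    NF == 0 { next }                    # skip blank lines
    {
        k = split($1, part, ":")
        host[++m] = part[1]
        slots[m] = (k > 1 ? part[2] + 0 : 1)
        if (slots[m] < 1) slots[m] = 1  # guard against a zero limit
    }
    END {
        if (m == 0) exit 1              # no hosts found
        r = 0
        while (r < n)
            for (i = 1; i <= m && r < n; i++)
                for (s = 0; s < slots[i] && r < n; s++)
                    printf "rank %d -> %s\n", r++, host[i]
    }
' hosts.in)
echo "$PLACEMENT"
```

With hosts listed without a limit, the same model yields the round-robin order noted earlier.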

For a list of all available options, execute:

/usr/lib64/mvapich2/bin/mpiexec -h
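The mpiexec commands in this section repeat the same -env settings, so it may be convenient to keep them in one place. The following is a sketch only: bgas_mpiexec, MV2_ENV and RUN are hypothetical names introduced here, not part of MVAPICH2. By default the helper just prints the command it would run.

```shell
MPIEXEC=/usr/lib64/mvapich2/bin/mpiexec

# Settings quoted from this guide for communication over the I/O torus links.
MV2_ENV="-env MV2_IBA_HCA=roq -env MV2_USE_RDMA_CM=1 -env MV2_USE_IWARP_MODE=1"

# Hypothetical helper: prints the full command line; set RUN=1 to execute it.
bgas_mpiexec() {
    if [ "${RUN:-0}" = "1" ]; then
        # MV2_ENV is left unquoted on purpose so it splits into separate options.
        "$MPIEXEC" $MV2_ENV "$@"
    else
        echo "$MPIEXEC $MV2_ENV $*"
    fi
}

bgas_mpiexec -n 8 -f hosts.in hello.x
```

Running the same line with RUN=1 set in the environment would start the job instead of printing the command.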

Network interfaces

The BGAS nodes can communicate either via 10 Gigabit Ethernet or via the I/O torus links. The latter is preferred, so the addresses associated with the roq0 interface should be used.
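One way to look up the address bound to roq0 on a node is sketched below. It assumes the iproute2 ip tool is available on the BGAS nodes, which this guide does not state. The helper names strip_prefix and roq0_address are hypothetical.

```shell
# Hypothetical helper: reads `ip -o -4 addr show` output on stdin and
# prints each IPv4 address without its prefix length.
strip_prefix() {
    awk '$3 == "inet" { sub(/\/.*/, "", $4); print $4 }'
}

# Query the roq0 interface named in this guide. Prints nothing if the
# interface or the ip tool is not present on this machine.
roq0_address() {
    ip -o -4 addr show dev roq0 2>/dev/null | strip_prefix
}

roq0_address
```

The printed address is the one to pass to -hosts or to list in the hosts file.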

Troubleshooting

If you encounter issues or unexpected behavior, add the -verbose option to the mpiexec command.

When processes fail to launch or an ssh error appears, the ssh configuration and host-based authentication have to be examined.
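A quick way to test non-interactive ssh access to every host in a hosts file is sketched below. The function names hosts_from_file and check_ssh are hypothetical; BatchMode and ConnectTimeout are standard OpenSSH client options. The block only defines the helpers and does not contact any host until check_ssh is called.

```shell
# Extract bare host names or addresses from a hosts file:
# strips comments and any ":limit" suffix.
hosts_from_file() {
    awk '{ sub(/#.*/, "") } NF { split($1, p, ":"); print p[1] }' "$1"
}

# Try a non-interactive login on each host. BatchMode makes ssh fail
# instead of prompting for a password.
check_ssh() {
    for h in $(hosts_from_file "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "$h: ok"
        else
            echo "$h: FAILED"
        fi
    done
}

# Usage: check_ssh hosts.in
```

A host reported as FAILED is a candidate for a closer look at its ssh configuration before retrying mpiexec.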

In case of any issues with the functionality of MPI, contact m.foszczynski@… or m.stephan@….

Last modified on 06/04/15 22:08:20