
Compiling and running BGAS applications

General information

JBRT (Juelich BGAS RunTime) has been available on the BGAS system since 08.09.2014.

User accounts and access

To use JBRT and log in to the BGAS nodes, a Juqueen account with a special extension is required. Please contact one of the system administrators or project coordinators for details.

Location of the JBRT components

The JBRT with header (include) files, executables and library files is located under the following path, which can be accessed from any of the NFS-enabled (front-end, ION, or BGAS) nodes:

/bgsys/local/bgas/jbrt

Because JBRT is multi-platform (parts of it run on the front-end node, the CNs and the IONs), the main JBRT directory is divided into subdirectories, one for each targeted part of the system:

ls -l /bgsys/local/bgas/jbrt
total 64
drwxrwxr-x 4 bgas bgas   512 Sep  4 15:10 jbcn
drwxrwxr-x 3 bgas bgas   512 Sep  4 15:12 jbfe
drwxrwxr-x 5 bgas bgas 32768 Sep  8 14:51 jbion

Compute node components (JBCN)

The jbcn directory contains all the CN components:

ls -l /bgsys/local/bgas/jbrt/jbcn
total 0
drwxrwxr-x 2 bgas bgas 512 Sep  8 16:11 include
drwxrwxr-x 2 bgas bgas 512 Sep  4 15:10 lib

The include directory of JBCN contains all header files that a CN application using JBRT needs:

ls -l /bgsys/local/bgas/jbrt/jbcn/include 
total 64
-rwxrwxr-x 1 bgas    bgas 531 Sep  4 15:10 jbcnl.h
-rw-r--r-- 1 vandenb zam  959 Sep  8 16:11 jbcnl_impl.h

Please note that currently only the jbcnl.h file needs to be included in the user application. The jbcnl_impl.h file is internal to the implementation and should not be used. To include the JBCN header files in a user application, add /bgsys/local/bgas/jbrt/jbcn/include to the compiler's include path.

The lib subdirectory of JBCN contains the JBCN library, libjbcn.a:

ls -l /bgsys/local/bgas/jbrt/jbcn/lib
total 160
-rwxrwxr-x 1 bgas bgas 140574 Sep  4 15:10 libjbcn.a

This library must be linked into the user application that runs on the CNs, with /bgsys/local/bgas/jbrt/jbcn/lib as the library search path.

Front-end components (JBFE)

The jbfe directory contains all the FE components:

ls -l /bgsys/local/bgas/jbrt/jbfe
total 0
drwxrwxr-x 2 bgas bgas 512 Sep  8 14:36 bin

The bin directory contains the jbrunjob command. It is used in the LoadLeveler script in place of the regular runjob command to launch the processes that run on the CNs. It also prepares the environment on the CNs and IONs so that the CN-side application can launch user code on the IONs through dedicated function calls from the library.

The other file in this directory, jbrunjob.x, is an implementation file and should not be invoked directly.

I/O node components (JBION)

The jbion directory contains all the ION components:

ls -l /bgsys/local/bgas/jbrt/jbion
total 64
drwxrwxr-x 2 bgas bgas   512 Sep  8 15:07 bin
drwxrwxr-x 2 bgas bgas   512 Sep  8 14:52 etc
drwxrwxr-x 2 bgas bgas 32768 Sep  8 15:50 log

Of these, the bin directory contains the following implementation files, which should not be used directly by the user:

ls -l /bgsys/local/bgas/jbrt/jbion/bin 
total 1888
-rwxrwxr-x 1 bgas bgas  313393 Sep 16 11:36 jblauncher.x
-rwxrwxr-x 1 bgas bgas  302008 Sep 16 11:36 jbmd.x
-rwxrwxr-x 1 bgas bgas 1267910 Sep 16 11:36 jbsd.x

The etc directory holds the configuration files. Currently it contains only the zlog configuration, which should not be modified by the user without consulting the development team:

ls -l /bgsys/local/bgas/jbrt/jbion/etc 
total 0
-rwxrwxr-x 1 bgas bgas 103 Sep  4 15:17 zlog.config

The log directory is where the system logs are placed. Like the configuration above, it is mainly of use to the developers.

Writing a BGAS program

A sample CN-side program can be written as follows:

/* Needed for asprintf(), a GNU extension declared in <stdio.h>. */
#define _GNU_SOURCE

#include <stdio.h>
#include <unistd.h>   /* sleep() */
#include "mpi.h"
#include "jbcnl.h"

#define JB_BUFFER_SIZE  1048576
#define JB_BUFFER_COUNT 1

#define I_COUNT 128   /* total number of requests sent by each rank */
#define J_COUNT 8     /* number of requests sent before waiting */

int main (int argc, char** argv) {
  int i  = 0;
  int j  = 0;
  int ic = I_COUNT;
  int jc = J_COUNT;
  int rc = 0;
  int ec = 0;

  int mpiRank = 0;
  int mpiSize = 0;

  /* Only J_COUNT requests are outstanding at any time. */
  int tag[J_COUNT];

  char* execvArgs[3];

  //printf("# Test starting\n");

  MPI_Init(&argc, &argv);

  rc = jb_init(JB_BUFFER_SIZE, JB_BUFFER_COUNT);

  //printf("# jb_init: %d\n", rc);

  MPI_Comm_rank(MPI_COMM_WORLD, &mpiRank);
  MPI_Comm_size(MPI_COMM_WORLD, &mpiSize);

  /* Send the requests in batches of jc, then wait for each batch. */
  for (i = 0; i < ic; i += jc) {
    for (j = 0; j < jc; j++) {
      if (i + j < ic) {
        printf("# Sending request %d of %d\n", i + j + 1, ic);

        /* Arguments for the ION executable: rank, size, loop index.
           Note: the strings allocated by asprintf() are not freed here. */
        asprintf(&execvArgs[0], "%d", mpiRank);
        asprintf(&execvArgs[1], "%d", mpiSize);
        asprintf(&execvArgs[2], "%d", i);

        tag[j] = jb_execv("/path-to-the-ion-executable/ion-executable.x", execvArgs);

        //printf("# jb_execv[%d]: %d\n", i, tag[j]);
      }
    }

    for (j = 0; j < jc; j++) {
      if (i + j < ic) {
        printf("# MPI rank %d of %d waiting for request %d of %d\n", mpiRank, mpiSize, i + j + 1, ic);

        /* Poll once per second while the status is negative. */
        while ((rc = jb_execv_status(tag[j])) < 0) {
          sleep(1);
        }

        if (rc != 0) ec++;

        //printf("# jb_execv_status[%d]: %d\n", i, rc);
      }
    }
  }

  jb_finalize();

  MPI_Finalize();

  printf("# Test done, errors: %d\n", ec);

  return 0;
}
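
The wait loop in the sample above polls without any upper bound. A minimal sketch of a bounded variant follows, assuming (as the sample does) that a negative value from the status call means the request is still pending. The names wait_for_request, status_fn, fake_calls and fake_status are hypothetical and not part of JBRT; the status call is passed in as a callback so that the helper can be tried without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Callback type mirroring how the sample calls jb_execv_status(tag). */
typedef int (*status_fn)(int tag);

/*
 * Poll query(tag) until it returns a non-negative value or max_polls
 * attempts have been made, sleeping delay_s seconds between attempts.
 * Returns 0 and stores the final value in *status when the request
 * finished, or -1 when the poll limit was reached.
 */
int wait_for_request(status_fn query, int tag, int max_polls,
                     unsigned delay_s, int *status)
{
    assert(query != NULL && status != NULL);

    for (int attempt = 0; attempt < max_polls; attempt++) {
        int rc = query(tag);
        if (rc >= 0) {
            *status = rc;       /* finished; the sample treats non-zero as an error */
            return 0;
        }
        if (attempt + 1 < max_polls)
            sleep(delay_s);     /* still pending: wait before the next poll */
    }
    return -1;
}

/*
 * Stand-in for trying the helper without JBRT (an assumption, not the
 * real API): reports "pending" (-1) for the first `tag` polls, then 0.
 */
int fake_calls = 0;
int fake_status(int tag)
{
    return (++fake_calls > tag) ? 0 : -1;
}
```

In the sample, the call might look like wait_for_request(jb_execv_status, tag[j], 600, 1, &rc), provided jb_execv_status takes an int tag and returns an int, as its use in the sample suggests.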

On the ION-side (BGAS node), a corresponding sample application can be written as in the following example:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>   /* gethostname() */

#define LOGS_DIRECTORY "/path-to-the-user-logs"

int main(int argc, char** argv) {
  char hostName[256];
  char currentTime[128];
  char currentDate[128];
  char logFileName[512];

  FILE *LOGFILE;

  struct tm *tm;

  time_t timeT;

  /* The CN-side sample passes three arguments: MPI rank, MPI size, loop index. */
  if (argc < 4) {
    printf("# Expected 3 arguments: rank, size, loop\n");

    return 1;
  }

  timeT = time(NULL);

  tm = localtime(&timeT);

  strftime(currentDate, sizeof(currentDate), "%Y-%m-%d", tm);
  strftime(currentTime, sizeof(currentTime), "%H:%M:%S", tm);

  gethostname(hostName, sizeof(hostName));
  hostName[sizeof(hostName) - 1] = '\0';  /* a truncated name may lack the terminator */

  /* One log file per host; every invocation appends one line. */
  //snprintf(logFileName, sizeof(logFileName), "%s/%s-%s-%s.log", LOGS_DIRECTORY, hostName, currentDate, currentTime);
  snprintf(logFileName, sizeof(logFileName), "%s/%s.log", LOGS_DIRECTORY, hostName);

  LOGFILE = fopen(logFileName, "a+");

  if (LOGFILE == NULL) {
    printf("# Error opening the log file\n");

    return 1;
  }

  fprintf(LOGFILE, "%s %s Host name: %s, process rank %d of %d, loop %d\n", currentDate, currentTime, hostName, atoi(argv[1]), atoi(argv[2]), atoi(argv[3]));

  fclose(LOGFILE);

  return 0;
}
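
The ION-side sample converts its arguments with atoi, which returns 0 for malformed input without reporting an error. A stricter conversion could be sketched as follows, using only the standard strtol function; parse_int_arg is a hypothetical helper name, not part of JBRT.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Convert text to an int. Returns 0 and stores the value in *out on
 * success; returns -1 if text is NULL or empty, contains characters that
 * are not part of the number, or is out of range for int.
 */
int parse_int_arg(const char *text, int *out)
{
    char *end = NULL;
    long value;

    assert(out != NULL);   /* passing no output location is a programming error */

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);

    /* Reject overflow, trailing characters, and input without any digits. */
    if (errno != 0 || end == text || *end != '\0' ||
        value < INT_MIN || value > INT_MAX)
        return -1;

    *out = (int)value;
    return 0;
}
```

In the sample, a call such as parse_int_arg(argv[1], &rank) could replace atoi(argv[1]), with the program returning a non-zero exit code when the conversion fails.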

Compiling a BGAS program

Compiling a BGAS application requires some understanding of the system architecture. The main point to note is the target architecture: the application running on the CN side and the application running on the ION side must each be compiled for their own target, using the CN or ION header files and libraries respectively.

The required JBCN header files are in /bgsys/local/bgas/jbrt/jbcn/include, and the CN library is in /bgsys/local/bgas/jbrt/jbcn/lib. Both are needed for a CN application to trigger remote process execution on the IONs.

Currently, no specific including or linking on the ION-side is required.

A sample Makefile for the CN-side application can look as follows:

MPICC = mpigcc

CCFLAGS = -O3 -g
LDFLAGS =

JBRTDIR = /bgsys/local/bgas/jbrt

INC = -I$(JBRTDIR)/jbcn/include
LIB = -L$(JBRTDIR)/jbcn/lib -ljbcn

TARGET = cn-executable.x

SRC = $(wildcard *.c)
OBJ = $(SRC:.c=.o)

.PHONY: all clean

all: $(TARGET)

.c.o:
	$(MPICC) $(CCFLAGS) $(INC) -c $< -o $@

$(TARGET): $(OBJ)
	$(MPICC) $(LDFLAGS) $(OBJ) -o $@ $(LIB)

clean:
	rm -f *.o *.x

An example Makefile for the ION-side application can look as follows:

MPICC = mpigcc

CCFLAGS = -O3 -g
LDFLAGS =

JBRTDIR = /bgsys/local/bgas/jbrt

INC =
LIB =

TARGET = ion-executable.x

SRC = $(wildcard *.c)
OBJ = $(SRC:.c=.o)

.PHONY: all clean

all: $(TARGET)

.c.o:
	$(MPICC) $(CCFLAGS) $(INC) -c $< -o $@

$(TARGET): $(OBJ)
	$(MPICC) $(LDFLAGS) $(OBJ) -o $@ $(LIB)

clean:
	rm -f *.o *.x

Executing a BGAS program

To execute a BGAS program, the jbrunjob command has to be used in the LoadLeveler job script. This command is a wrapper around the standard runjob command and supports the same set of arguments. The jbrunjob command can be found in the front-end part of the JBRT: /bgsys/local/bgas/jbrt/jbfe/bin.

In the current setup, a special reservation is required to ensure proper binding between CNs and BGAS nodes and to allow the system administrators to monitor the job's status. To request a reservation, contact one of the system administrators or project coordinators.

Below is an example of a LoadLeveler script:

  1 # @ job_name = BGAS_JOB_NAME
  2 # @ job_type = bluegene
  3 # @ comment = "BGL Job by Size"
  4 # @ error = $(job_name).$(jobid).out
  5 # @ output = $(job_name).$(jobid).out
  6 # @ environment = COPY_ALL;
  7 # @ wall_clock_limit = 00:10:00
  8 # @ notification = never
  9 # @ notify_user = your-email-address@your-domain
 10 # @ bg_size = 512
 11 # @ ll_res_id = your-reservation-id
 12 # @ class = system
 13 # @ queue
 14 
 15 /bgsys/local/bgas/jbrt/jbfe/bin/jbrunjob -n 8 -p 1 --envs BG_SHAREDMEMSIZE=32 : cn-executable.x

Addendum: It is now possible to use JBRT on non-BGAS (i.e. standard production) I/O nodes. In this setup, however, no (HS4) flash memory is available and no ION-ION communication can be performed. To run this way, comment out the "ll_res_id" and "class" lines (lines 11 and 12) in the example script above. No reservation is required in this case.

Troubleshooting

In case of any issues with the functionality of JBRT, contact n.vandenbergen@…, m.foszczynski@… or m.stephan@….

Last modified on 02/05/15 18:17:10