PEPC benchmark

old versions

2.0_internal (05/04/11) (not available to the public)

  • hybrid (MPI+PThreads) tree traversal --> up to 300k cores and up to 2 billion particles possible
  • included new and more efficient algorithm for a-priori branch node estimation and much more efficient branch node determination
  • added pepc-mw and pepc-mini frontend
  • put a lot of code into modules, some of them are shared between frontends
  • removed unused sorting functions and choose_sort parameter
  • removed a number of unused variables and some unused types
  • further cleanup in makefiles
  • added mpi-io checkpointing function (pepc-e & pepc-mw)
  • added direct vtk output (pepc-e & pepc-mw)
  • added some new setups, see special_start() for details (pepc-e & pepc-mw)
  • several bugfixes to improve runtime stability
  • deactivated check for valid particle labels (attention: currently, pelabel(p) may not be zero for any p after particle setup)
  • replaced locaddress()-calls by mpi_get_address()
  • added support for OSX (see note in makefiles/makefile.defs.osx for important details on MPI implementations on OSX)
  • updated example run.h file

1.5.2 (05/15/11) (not available to the public, part of the ScaFaCoS-project)

  • added pepc-s frontend for library integration
  • added computation and output of virial tensor

1.5.1 (05/03/11)

  • support for OSX added (see note in makefiles/makefile.defs.osx for important details on MPI implementations on OSX)
  • minor bugfixes in sorting library
  • example run.h file updated

1.5 (03/08/11)

  • added framework for Doxygen support:
    • type 'make doc' to create source code documentation from supported inline comments - this creates a ./doc subdir with html-files
    • type 'make clean-doc' to get rid of all documentation (is done automatically on 'make clean')
    • on future changes, comments and inline documentation will be added where appropriate (see module_fmm_framework.f90 or module_math_tools.f90 for examples)
  • added 'make dist' rule for automatic creation of a redistributable tarball from the current svn working copy
  • removed choose_build parameter --> switched tree-buildup to tree_local, tree_global_, tree_exchange - approach after some fixes and optimizations therein
  • fixed typo in periodicity framework which caused a problem with non-unit-box setups
  • fixed workload computation with periodic setup

1.4 (01/10/11)

new features

  • included periodic framework in pepc; new input parameters in run.h and their standard values:
    ! lattice basis vectors
        t_lattice_1 = 1.0 0.0 0.0
        t_lattice_2 = 0.0 1.0 0.0
        t_lattice_3 = 0.0 0.0 1.0
    
    ! periodicity in x-, y-, and z-direction
        periodicity = .false. .false. .false.
    
    ! extrinsic-to-intrinsic correction
        do_extrinsic_correction = .false.
    
  • added 3D Madelung test case (accessible via ispecial=7 and ne=ni=4,8,...).
  • added dump of field data (charge, potential, electric field) for selected diagnostic particles to fielddump.dat (in the same structure as trajectory.dat)
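
As a rough illustration of what the lattice and periodicity parameters describe, the sketch below enumerates the shift vectors of the nearest periodic image boxes. It is plain Python, not PEPC code. Only the names t_lattice_1..3 and periodicity come from run.h; the function image_shifts and its shells argument are hypothetical, and it is an assumption here that the n-th periodicity flag applies to the n-th lattice vector.

```python
from itertools import product

# Standard values from run.h
t_lattice_1 = (1.0, 0.0, 0.0)
t_lattice_2 = (0.0, 1.0, 0.0)
t_lattice_3 = (0.0, 0.0, 1.0)
periodicity = (False, False, False)


def image_shifts(lattice, periodic, shells=1):
    """Return the shift vectors i*t1 + j*t2 + k*t3 of all image boxes.

    Hypothetical helper: along a non-periodic direction only index 0 is
    used, so without any periodicity the only shift is the zero vector.
    """
    ranges = [range(-shells, shells + 1) if p else (0,) for p in periodic]
    shifts = []
    for i, j, k in product(*ranges):
        shifts.append(tuple(
            i * a + j * b + k * c
            for a, b, c in zip(lattice[0], lattice[1], lattice[2])
        ))
    return shifts


lattice = (t_lattice_1, t_lattice_2, t_lattice_3)
print(len(image_shifts(lattice, periodicity)))          # no periodicity: 1 box
print(len(image_shifts(lattice, (True, True, False))))  # periodic in x and y: 9 boxes
```

With non-orthogonal basis vectors the same enumeration yields the images of a sheared box, which is the kind of non-unit-box setup the later periodicity fixes refer to.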

minor improvements and cleanup

  • bundled all routines concerning the velocity setup into one module (module_velocity_setup.f90)
  • removed (almost empty) utils.f90 module
  • removed some debug output, unnecessary variables and output files
  • adjusted several standard values of parameters
  • included f90_mod_deps.py build script with modifications to comply with Fortran2008 standard
  • added makefile.defs for gcc-4.5.x on SUSE workstation.
  • increased printed precision in energy.dat (at least for eps=0.0 and theta=0.0, this is necessary) and activated printing of header in this file

bugfixes

  • removed dispensable label check in trajectory output. this fixes an older issue and should not have any visible side effects
  • fixed serious errors that appeared in case of very unequal workloads (unbalanced chunks --> unequal number of chunks --> "Missed some..."-warning and companions)
  • fixed silent crash when domain_debug=.true. and number of particles per PE < 10.
  • fixed benchmarking routine to output correct data even for small (< NUM_DIAGNOSTIC_PARTICLES) particle numbers

1.3 (12/23/10)

  • removed (almost) all compiler warnings
  • fixed all(!) invalid data conversions
  • removed verbose debug output of previous version
  • fixed slow hashtable initialization to "hashtable = hash(0,0,..)"
  • prevent creation of fort.20 file in a usual run

1.2 (12/18/10)

  • proper integration of JuBE for internal revision verification, not available in the public version
  • interaction list length correction

1.1 (10/19/10)

  • minor fixes in sorting library integration

1.0 (01/08/10)

  • Added new sorting routine sl_pepc by M. Hofmann, use choose_sort and weighted parameter for switching
    • Better scaling behavior (even beyond 8k cores)
    • Weighted sorting is now optional (and recommended)
    • Choose between key or full particle sorting
  • Added alternative tree construction phase, use choose_build parameter for switching
  • Added much more timings and statistics, switched to MPI_WTIME
  • Cleaned up MPI_BARRIERs
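
The idea behind a weighted sort can be sketched as follows: instead of giving every process the same number of particles, the key-sorted particle list is cut so that every process gets roughly the same summed work. This is a serial Python toy under the assumption that the weights are per-particle workloads; it is not the sl_pepc algorithm, and weighted_split and its arguments are hypothetical names.

```python
def weighted_split(keys, work, nparts):
    """Sort particles by key and cut the sorted list into nparts contiguous
    chunks of roughly equal summed work (greedy cut on the running sum)."""
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    total = sum(work)
    chunks = [[] for _ in range(nparts)]
    acc = 0.0
    part = 0
    for i in order:
        # advance to the next chunk once the running sum reaches its target
        while part < nparts - 1 and acc >= total * (part + 1) / nparts:
            part += 1
        chunks[part].append(keys[i])
        acc += work[i]
    return chunks


keys = [5, 1, 4, 2, 3, 6]
work = [1.0, 5.0, 1.0, 1.0, 1.0, 1.0]  # hypothetical per-particle workloads
print(weighted_split(keys, work, 2))   # [[1], [2, 3, 4, 5, 6]]
```

In this example the particle with key 1 is as expensive as the other five together, so it ends up alone in the first chunk, whereas an unweighted split would put three particles in each chunk.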

0.9

0.9.2 (07/16/09)

  • default parameter in example run.h changed, removed unused parameter

0.9.1 (07/16/09)

  • switch for point-to-point vs. collective communication added
  • executable naming fix in makefiles and job-scripts

0.9.0 (07/13/09)

This is the initial benchmarking version. Its source differs from the trunk only in file naming and the exclusion of the configure script. Basic documentation.

how to run the benchmark

Instructions to compile and execute the benchmark are available in the tutorial section.

porting

The current version includes predefined environments for the following machines.

  • simple make files and submit scripts
    • IBM Power6 jump
    • IBM BlueGene/P jugene
    • Intel Nehalem Cluster JuRoPa
    • SUSE 10 Workstations with gcc 4.5.x
    • Ubuntu with gcc
    • OSX with gcc and MPICH2
  • JuBE integration (not available in the public version)
    • IBM BlueGene/P jugene
    • Intel Nehalem Cluster JuRoPa
    • GNU Linux

all versions

Last modified on 03/22/12 11:06:31