Changes between Version 37 and Version 38 of WikiStart
- Timestamp:
- 01/13/12 15:31:53 (12 years ago)
WikiStart
--- v37
+++ v38

 == Concept ==
-The logger is started with `mpiexec` and subsequently starts the application to monitor. It creates a directory (default: `.memlog` in the `PBS_O_WORKDIR`, => `wrkdir`, option `-w`), and by default each task creates its own logfile in that directory. At each time step (=> `delay`, option `-d`), the logger records the values of the following keys from the following files:
+The logger is started with `mpiexec` and subsequently starts the application to monitor. Monitoring is done per task and/or per node. At each time step, information is gathered from files provided by the operating system or from commands issued by the logger itself. Currently, the following resources are used:

- * `/proc/<PID>/status:`
-   * `VmExe`
-   * `VmSt`
-   * `VmData`
-   * `VmSize`
-   * `VmLck`
-   * `VmLib`
-   * `VmRSS`
-Each task writes the value of each key to the file `.memlog/task<MPI-rank>.log` and waits for the next time step.
+ * Monitoring per task:
+   * file `/proc/<PID>/status` with keys
+     * `VmExe`
+     * `VmSt`
+     * `VmData`
+     * `VmSize`
+     * `VmLck`
+     * `VmLib`
+     * `VmRSS`
+
+ * Monitoring per node:
+   * command `vmstat` with keys
+     * MFree
+     * TWait
+     * Idle
+     * TDead
+     * UsedUs
+     * UsedKe
+
+When monitoring per task, each task writes the value of each key to the file `.memlog/task<MPI-rank>.log` and waits for the next time step. When monitoring per node, the process running on core 0 of each node writes the value of each key to the file `node<node-name>_task<MPI-task>.log`.

 After the run is finished, the analyzer `juman` is run in the `PBS_O_WORKDIR`, either from within the same job script or afterwards on the login node, to analyze the consumed resources.
 It creates graphs of the selected key's value (default: `VmSize`, => `key`, option `-k`) for each task at each time step (use `juman -k help` for a list of available keys), of the process with the maximum value of the key at each time step, and of the total sum of the key's values across all tasks at each time step (=> `statistics`, option `-s`). If `-i` is specified, the graphs are displayed immediately.
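
The per-task sampling and the analyzer statistics described above can be illustrated with a small Python sketch. It is not the logger's or `juman`'s actual code: the function names `parse_vm_keys` and `summarize`, the `SAMPLE` text, and the in-memory layout of the samples are all assumptions made for illustration. The only thing taken from the text is the idea of reading `Vm*` keys from `/proc/<PID>/status` and reporting, per time step, the task with the maximum value and the sum across all tasks.

```python
# Hypothetical sketch: parse Vm* keys from a /proc/<PID>/status dump and
# compute per-time-step statistics. Names and data layout are assumptions,
# not the real logger or analyzer implementation.

# Shortened, made-up excerpt of a /proc/<PID>/status file.
SAMPLE = "Name:\tdemo\nVmSize:\t  204800 kB\nVmRSS:\t   51200 kB\n"


def parse_vm_keys(status_text):
    """Return {key: value_in_kB} for every line whose key starts with 'Vm'."""
    values = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.startswith("Vm"):
            fields = rest.split()  # e.g. ["204800", "kB"]
            if fields:
                values[name] = int(fields[0])
    return values


def summarize(samples, key="VmSize"):
    """Compute (rank_with_max, max_value, total) for each time step.

    `samples` is a list of time steps; each step maps an MPI rank to the
    dict returned by parse_vm_keys for that task (assumed layout).
    """
    summary = []
    for step in samples:
        per_rank = {rank: vals[key] for rank, vals in step.items()}
        top = max(per_rank, key=per_rank.get)
        summary.append((top, per_rank[top], sum(per_rank.values())))
    return summary


if __name__ == "__main__":
    print(parse_vm_keys(SAMPLE))
    steps = [
        {0: {"VmSize": 100}, 1: {"VmSize": 300}},
        {0: {"VmSize": 500}, 1: {"VmSize": 200}},
    ]
    for rank, peak, total in summarize(steps):
        print(f"max on rank {rank}: {peak} kB, total: {total} kB")
```

On a Linux system, the same parser could be applied to the contents of `/proc/self/status` to sample the running process; how the real logger stores its samples in the `.memlog` files is not specified here.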