MPI2007 license: | 0005 | Test date: | Oct-2008 |
---|---|---|---|
Test sponsor: | IBM Corporation | Hardware Availability: | Nov-2008 |
Tested by: | IBM Corporation | Software Availability: | Nov-2008 |

Results appear in the order in which they were run. Bold underlined text indicates a median measurement.

Benchmark | Base Ranks | Base Run 1 Seconds | Base Run 1 Ratio | Base Run 2 Seconds | Base Run 2 Ratio | Base Run 3 Seconds | Base Run 3 Ratio | Peak Ranks | Peak Run 1 Seconds | Peak Run 1 Ratio | Peak Run 2 Seconds | Peak Run 2 Ratio | Peak Run 3 Seconds | Peak Run 3 Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
104.milc | 16 | 1713 | 0.913 | 1711 | 0.915 | 1708 | 0.916 | 16 | 1713 | 0.913 | 1711 | 0.915 | 1708 | 0.916 |
107.leslie3d | 16 | 3524 | 1.48 | 3674 | 1.42 | 3538 | 1.48 | 16 | 3459 | 1.51 | 3727 | 1.40 | 3719 | 1.40 |
113.GemsFDTD | 16 | 2945 | 2.14 | 2949 | 2.14 | 2950 | 2.14 | 16 | 2945 | 2.14 | 2949 | 2.14 | 2950 | 2.14 |
115.fds4 | 16 | 1719 | 1.14 | 1689 | 1.16 | 1710 | 1.14 | 16 | 1770 | 1.10 | 1683 | 1.16 | 1675 | 1.17 |
121.pop2 | 16 | 2505 | 1.65 | 2513 | 1.64 | 2530 | 1.63 | 16 | 2505 | 1.65 | 2513 | 1.64 | 2530 | 1.63 |
122.tachyon | 16 | 4118 | 0.679 | 4115 | 0.680 | 4119 | 0.679 | 16 | 4030 | 0.694 | 4031 | 0.694 | 4031 | 0.694 |
126.lammps | 16 | 2425 | 1.20 | 2459 | 1.19 | 2413 | 1.21 | 16 | 2425 | 1.20 | 2459 | 1.19 | 2413 | 1.21 |
127.wrf2 | 16 | 5309 | 1.47 | 5299 | 1.47 | 5297 | 1.47 | 16 | 3578 | 2.18 | 3580 | 2.18 | 3559 | 2.19 |
128.GAPgeofem | 16 | 1361 | 1.52 | 1359 | 1.52 | 1362 | 1.52 | 16 | 1361 | 1.52 | 1359 | 1.52 | 1362 | 1.52 |
129.tera_tf | 16 | 3961 | 0.699 | 3961 | 0.699 | 3960 | 0.699 | 16 | 2857 | 0.969 | 2859 | 0.968 | 2861 | 0.968 |
130.socorro | 16 | 2199 | 1.74 | 2200 | 1.73 | 2206 | 1.73 | 16 | 792 | 4.82 | 792 | 4.82 | 789 | 4.84 |
132.zeusmp2 | 16 | 2558 | 1.21 | 2564 | 1.21 | 2559 | 1.21 | 16 | 2558 | 1.21 | 2564 | 1.21 | 2559 | 1.21 |
137.lu | 16 | 3312 | 1.11 | 3410 | 1.08 | 3326 | 1.11 | 16 | 3312 | 1.11 | 3410 | 1.08 | 3326 | 1.11 |
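
SPEC suites conventionally summarize a run as the geometric mean of the per-benchmark median ratios. The overall scores are not part of this excerpt, so the sketch below only illustrates that arithmetic on the ratios copied from the table above. The names `RESULTS`, `geometric_mean`, and `overall` are illustrative, and the printed values are computed from rounded ratios, so they may differ slightly from any officially published figure.

```python
import math
from statistics import median

# Ratios copied from the results table: (base runs, peak runs) per benchmark.
RESULTS = {
    "104.milc":      ([0.913, 0.915, 0.916], [0.913, 0.915, 0.916]),
    "107.leslie3d":  ([1.48, 1.42, 1.48],    [1.51, 1.40, 1.40]),
    "113.GemsFDTD":  ([2.14, 2.14, 2.14],    [2.14, 2.14, 2.14]),
    "115.fds4":      ([1.14, 1.16, 1.14],    [1.10, 1.16, 1.17]),
    "121.pop2":      ([1.65, 1.64, 1.63],    [1.65, 1.64, 1.63]),
    "122.tachyon":   ([0.679, 0.680, 0.679], [0.694, 0.694, 0.694]),
    "126.lammps":    ([1.20, 1.19, 1.21],    [1.20, 1.19, 1.21]),
    "127.wrf2":      ([1.47, 1.47, 1.47],    [2.18, 2.18, 2.19]),
    "128.GAPgeofem": ([1.52, 1.52, 1.52],    [1.52, 1.52, 1.52]),
    "129.tera_tf":   ([0.699, 0.699, 0.699], [0.969, 0.968, 0.968]),
    "130.socorro":   ([1.74, 1.73, 1.73],    [4.82, 4.82, 4.84]),
    "132.zeusmp2":   ([1.21, 1.21, 1.21],    [1.21, 1.21, 1.21]),
    "137.lu":        ([1.11, 1.08, 1.11],    [1.11, 1.08, 1.11]),
}


def geometric_mean(values):
    """Geometric mean, computed in log space for numerical stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def overall(results, column):
    """Geometric mean of median ratios; column 0 is base, column 1 is peak."""
    return geometric_mean(median(runs[column]) for runs in results.values())


if __name__ == "__main__":
    print(f"base: {overall(RESULTS, 0):.2f}")
    print(f"peak: {overall(RESULTS, 1):.2f}")
```

Benchmarks marked `basepeak = yes` in the flags section reuse their base results for peak, which is why several rows show identical base and peak columns.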
Hardware Summary | |
---|---|
Type of System: | Heterogeneous |
Compute Nodes: | IBM System JS22, IBM System JS22 |
Interconnects: | InfiniBand, Ethernet |
File Server Node: | IBM System JS22 |
Head Node: | IBM System JS22 |
Total Compute Nodes: | 2 |
Total Chips: | 4 |
Total Cores: | 8 |
Total Threads: | 16 |
Total Memory: | 48 GB |
Base Ranks Run: | 16 |
Minimum Peak Ranks: | 16 |
Maximum Peak Ranks: | 16 |
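
The totals in the hardware summary can be cross-checked against the two per-node descriptions that follow in this report. The sketch below is only an illustration of that bookkeeping; the `Node` class and `totals` function are hypothetical names, and the per-node figures are copied from the node tables (2 chips, 2 cores per chip, 2 threads per core, with 32 GB and 16 GB of memory respectively).

```python
from dataclasses import dataclass


@dataclass
class Node:
    chips: int
    cores_per_chip: int
    threads_per_core: int
    memory_gb: int


# One entry per JS22 blade, values taken from the node description tables.
NODES = [
    Node(chips=2, cores_per_chip=2, threads_per_core=2, memory_gb=32),
    Node(chips=2, cores_per_chip=2, threads_per_core=2, memory_gb=16),
]


def totals(nodes):
    """Aggregate per-node hardware counts into cluster-wide totals."""
    cores = [n.chips * n.cores_per_chip for n in nodes]
    threads = [c * n.threads_per_core for c, n in zip(cores, nodes)]
    return {
        "nodes": len(nodes),
        "chips": sum(n.chips for n in nodes),
        "cores": sum(cores),
        "threads": sum(threads),
        "memory_gb": sum(n.memory_gb for n in nodes),
    }


if __name__ == "__main__":
    print(totals(NODES))
```

With 16 hardware threads in total, the 16 base and peak ranks correspond to one MPI rank per hardware thread.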
Software Summary | |
---|---|
C Compiler: | IBM XL C/C++ Enterprise Edition V9 for AIX Updated with the September 2008 Fix level |
C++ Compiler: | IBM XL C/C++ Enterprise Edition V9 for AIX Updated with the September 2008 Fix level |
Fortran Compiler: | IBM XL Fortran Enterprise Edition V11.1 for AIX Updated with the September 2008 Fix level |
Base Pointers: | 32-bit |
Peak Pointers: | 32/64-bit |
MPI Library: | IBM Parallel Environment for AIX, Version 5 Release 1 |
Other MPI Info: | None |
Pre-processors: | None |
Other Software: | IBM Engineering and Scientific Subroutine Library (ESSL) for AIX Version 4 Release 3 Updated with PTF Set 3 |
Hardware | |
---|---|
Number of nodes: | 1 |
Uses of the node: | compute, head, fileserver |
Vendor: | IBM Corporation |
Model: | IBM System JS22 |
CPU Name: | POWER6 |
CPU(s) orderable: | 4 cores per blade |
Chips enabled: | 2 |
Cores enabled: | 4 |
Cores per chip: | 2 |
Threads per core: | 2 |
CPU Characteristics: | |
CPU MHz: | 4000 |
Primary Cache: | 64 KB I + 64 KB D on chip per core |
Secondary Cache: | 4 MB I+D on chip per core |
L3 Cache: | None |
Other Cache: | None |
Memory: | 32 GB (4x8 GB) DDR2 500 MHz |
Disk Subsystem: | 1x146 GB SAS 15K RPM |
Other Hardware: | BladeCenter-H chassis Voltaire 4X InfiniBand Pass-thru Module (P/N 43W4419) |
Adapter: | 4X InfiniBand DDR Expansion Card (CFFh) for IBM BladeCenter (P/N 43W4423) |
Number of Adapters: | 1 |
Slot Type: | PCIe x8 Gen2 |
Data Rate: | 4x DDR 20Gbps |
Ports Used: | 1 |
Interconnect Type: | InfiniBand |
Software | |
---|---|
Adapter: | 4X InfiniBand DDR Expansion Card (CFFh) for IBM BladeCenter (P/N 43W4423) |
Adapter Driver: | devices.pciex.b3157862.rte 6.1.2.0 |
Adapter Firmware: | 2.3.0 |
Operating System: | IBM AIX V6.1 with the 6100-02 Technology Level |
Local File System: | AIX/JFS2 |
Shared File System: | NFSv3 |
System State: | Multi-user |
Other Software: | None |

Blade[1] runs the following commands to compose the cluster:

    mkdev -c management -s infiniband -t icm
    /usr/sbin/mkiba -a 192.1.10.1 -m 255.255.255.0 -i ib0 -A iba0 -p 1 -P 0xFFFF -M 65532 -q 4000 -k off -Q 0x1E -S up
    startsrc -s ctcas
    preprpnode mpiblade1
    mkrpdomain mpiblades mpiblade1 mpiblade2
    startrpdomain mpiblades
    cd /usr/lpp/ppe.poe/samples/nrt
    make
    chmod 4755 nrt_api
    shutdown -rF
    su spec
    cd mpiblades.64ranks.load
    ../nrt_api -l

Hardware | |
---|---|
Number of nodes: | 1 |
Uses of the node: | compute |
Vendor: | IBM Corporation |
Model: | IBM System JS22 |
CPU Name: | POWER6 |
CPU(s) orderable: | 4 cores per blade |
Chips enabled: | 2 |
Cores enabled: | 4 |
Cores per chip: | 2 |
Threads per core: | 2 |
CPU Characteristics: | |
CPU MHz: | 4000 |
Primary Cache: | 64 KB I + 64 KB D on chip per core |
Secondary Cache: | 4 MB I+D on chip per core |
L3 Cache: | None |
Other Cache: | None |
Memory: | 16 GB (4x4 GB) DDR2 667 MHz |
Disk Subsystem: | 1x146 GB SAS 15K RPM |
Other Hardware: | BladeCenter-H chassis Voltaire 4X InfiniBand Pass-thru Module (P/N 43W4419) |
Adapter: | 4X InfiniBand DDR Expansion Card (CFFh) for IBM BladeCenter (P/N 43W4423) |
Number of Adapters: | 1 |
Slot Type: | PCIe x8 Gen2 |
Data Rate: | 4x DDR 20Gbps |
Ports Used: | 1 |
Interconnect Type: | InfiniBand |
Software | |
---|---|
Adapter: | 4X InfiniBand DDR Expansion Card (CFFh) for IBM BladeCenter (P/N 43W4423) |
Adapter Driver: | devices.pciex.b3157862.rte 6.1.2.0 |
Adapter Firmware: | 2.3.0 |
Operating System: | IBM AIX V6.1 with the 6100-02 Technology Level |
Local File System: | AIX/JFS2 |
Shared File System: | NFSv3 |
System State: | Multi-user |
Other Software: | None |

Blade[2] runs the following commands to compose the cluster:

    mkdev -c management -s infiniband -t icm
    /usr/sbin/mkiba -a 192.1.10.2 -m 255.255.255.0 -i ib0 -A iba0 -p 1 -P 0xFFFF -M 65532 -q 4000 -k off -Q 0x1E -S up
    startsrc -s ctcas
    preprpnode mpiblade1
    cd /usr/lpp/ppe.poe/samples/nrt
    make
    chmod 4755 nrt_api
    shutdown -rF
    su spec
    cd mpiblades.64ranks.load
    ../nrt_api -l

Hardware | |
---|---|
Vendor: | IBM Corporation |
Model: | 4x DDR InfiniBand |
Switch Model: | QLogic SilverStorm 9024 |
Number of Switches: | 1 |
Number of Ports: | 24 |
Data Rate: | 4x DDR 20Gbps |
Firmware: | 4.2.1.1.1 |
Topology: | single switch |
Primary Use: | MPI Communication |
Hardware | |
---|---|
Vendor: | IBM Corporation |
Model: | 4-port Gigabit Ethernet |
Switch Model: | IBM BladeCenter 4-port Gigabit Ethernet switch module (P/N 26K6483) |
Number of Switches: | 1 |
Number of Ports: | 18 |
Data Rate: | 1Gbps |
Firmware: | 1.08 |
Topology: | single switch |
Primary Use: | File system |
Blade[1], with 32GB of memory and 32GB of paging space, was used to compile the benchmarks.

The config file option 'submit' was used:

    submit = poe task_stride.2level.32+64rank 4 2 8 $ranks $command -procs $ranks -hostfile /spec/MapFiles/ib0hosts.8x.1-8
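
To make the `submit` line easier to read, the sketch below expands its `$ranks` and `$command` placeholders for one run. It uses Python's `string.Template` purely as a stand-in for the substitution the SPEC tools perform themselves; `expand_submit` and the example command `./milc_base` are hypothetical and do not appear in the report.

```python
from string import Template

# The submit template, copied from the config note above.
SUBMIT = Template(
    "poe task_stride.2level.32+64rank 4 2 8 $ranks $command "
    "-procs $ranks -hostfile /spec/MapFiles/ib0hosts.8x.1-8"
)


def expand_submit(ranks, command):
    """Fill in the rank count and benchmark command (illustrative only)."""
    return SUBMIT.substitute(ranks=ranks, command=command)


if __name__ == "__main__":
    # "./milc_base" is a made-up executable name used for illustration.
    print(expand_submit(16, "./milc_base"))
```

For the 16-rank runs reported here, both occurrences of `$ranks` become `16`.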

Environment settings:

    All ulimits set to unlimited
    ranks = 16
    CWD = /spec/mpi2007
    MEMORY_AFFINITY = MCM
    XLFRTEOPTS = intrinthds=1
    MP_PGMMODEL = spmd
    MP_MSG_API = mpi
    MP_DEVTYPE = ib
    MP_CLOCK_SOURCE = AIX
    MP_STDINMODE = none
    MP_SHARED_MEMORY = yes
    MP_SINGLE_THREAD = yes
    MP_EUILIB = us
    NRT_WINDOW_COUNT = 1
    MP_RESD = no
    MP_PULSE = 0
    ADAPTER_USE = shared
    EUIDEVICE = sn_single
    MP_CSS_INTERRUPT = no
    MP_BUFFER_MEM = 67108864
    MP_USE_BULK_XFER = yes
    MP_BULK_MIN_MSG_SIZE = 8192
    MP_EAGER_LIMIT = 65536
    MP_WAIT_MODE = yield
    MP_INFOLEVEL = 0
    MP_LABELIO = no
    MP_STDOUTMODE = unordered
    MP_PMDLOG = no
    NRT_JOB_KEY = 64

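
Several of the environment settings above are byte counts that are easier to compare in binary units. The following sketch converts them; the `SIZES` dictionary simply repeats values from the settings, and `human` is a hypothetical helper.

```python
# Byte-valued settings copied from the environment list above.
SIZES = {
    "MP_BUFFER_MEM": 67108864,
    "MP_EAGER_LIMIT": 65536,
    "MP_BULK_MIN_MSG_SIZE": 8192,
}


def human(n):
    """Render an exact multiple of a binary unit, e.g. 65536 -> '64 KiB'."""
    for unit, factor in (("GiB", 1 << 30), ("MiB", 1 << 20), ("KiB", 1 << 10)):
        if n >= factor and n % factor == 0:
            return f"{n // factor} {unit}"
    return f"{n} B"


if __name__ == "__main__":
    for name, value in SIZES.items():
        print(f"{name} = {value} ({human(value)})")
```

In these units the buffer memory is 64 MiB, the eager limit is 64 KiB, and the bulk-transfer minimum message size is 8 KiB.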
C benchmarks: | /usr/bin/mpcc_r |
C++ benchmarks (126.lammps): | /usr/bin/mpCC_r |
Fortran benchmarks: | /usr/bin/mpxlf95_r |
Benchmarks using both Fortran and C: | /usr/bin/mpcc_r /usr/bin/mpxlf95_r |
107.leslie3d: | -qfixed |
115.fds4: | -DSPEC_MPI_LC_NO_TRAILING_UNDERSCORE -qfixed |
121.pop2: | -DSPEC_MPI_AIX |
127.wrf2: | -DNOUNDERSCORE -DSPEC_MPI_AIX |
130.socorro: | -DSPEC_NO_UNDERSCORE -qcpluscmt |
132.zeusmp2: | -qfixed -DSPEC_SINGLE_UNDERSCORE |
137.lu: | -qfixed |
C benchmarks: | -bmaxdata:0x80000000 -O5 -D_ILS_MACROS -bdatapsize:64K -bstackpsize:64K -btextpsize:64K |
C++ benchmarks (126.lammps): | -bmaxdata:0x80000000 -O5 |
Fortran benchmarks: | -bmaxdata:0x80000000 -O4 -qstrict -qalias=nostd -qhot=level=0 -qsave -bdatapsize:64K -bstackpsize:64K -btextpsize:64K |
Benchmarks using both Fortran and C: | -bmaxdata:0x80000000 -O5 -D_ILS_MACROS -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -O4 -qstrict -qalias=nostd -qhot=level=0 -qsave |
104.milc: | basepeak = yes |
122.tachyon: | -O5 -lessl -D_ILS_MACROS -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -q64 |
126.lammps: | basepeak = yes |
107.leslie3d: | -O5 -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -bmaxdata:0x70000000 |
113.GemsFDTD: | basepeak = yes |
129.tera_tf: | -O5 -qessl -lessl -bdatapsize:64K -bstackpsize:64K -btextpsize:64K |
137.lu: | basepeak = yes |
115.fds4: | -O5 -lessl -D_ILS_MACROS -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -qstrict -qalias=nostd -qhot=level=0 -qsave -q64 |
121.pop2: | basepeak = yes |
127.wrf2: | -O5 -bmaxdata:0x80000000 |
128.GAPgeofem: | basepeak = yes |
130.socorro: | -O5 -lessl -D_ILS_MACROS -bdatapsize:64K -bstackpsize:64K -btextpsize:64K -qessl -bmaxdata:0x80000000 |
132.zeusmp2: | basepeak = yes |
C benchmarks: | -w -qsuppress=1500-036 -qipa=noobject -qipa=threads |
C++ benchmarks (126.lammps): | -w -qsuppress=1500-036 -qipa=noobject -qipa=threads |
Fortran benchmarks: | -w -qsuppress=1500-036 -qsuppress=cmpmsg -qspillsize=32648 |
Benchmarks using both Fortran and C: | -w -qsuppress=1500-036 -qipa=noobject -qipa=threads -qsuppress=cmpmsg -qspillsize=32648 |
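
The `-bmaxdata` values in the flag lists above are hexadecimal byte counts. In the 32-bit AIX process model the address space is divided into 256 MB segments, and `-bmaxdata` sets how much of it the program's data area may occupy. The sketch below decodes the two values that appear in this report; `maxdata_segments` is a hypothetical helper, and the interpretation in terms of segments is a simplified reading, not a statement about how these particular binaries were laid out.

```python
SEGMENT_BYTES = 0x10000000  # 256 MiB, the size of one 32-bit AIX segment


def maxdata_segments(flag):
    """Return (bytes, segments) for a linker flag such as '-bmaxdata:0x80000000'."""
    prefix = "-bmaxdata:"
    if not flag.startswith(prefix):
        raise ValueError(f"not a -bmaxdata flag: {flag!r}")
    size = int(flag[len(prefix):], 16)  # base 16 accepts the '0x' prefix
    return size, size / SEGMENT_BYTES


if __name__ == "__main__":
    for flag in ("-bmaxdata:0x80000000", "-bmaxdata:0x70000000"):
        size, segments = maxdata_segments(flag)
        print(f"{flag}: {size / 2**30:.2f} GiB, {segments:g} segments")
```

Under this reading, `0x80000000` corresponds to 2 GiB (eight segments) and the `0x70000000` used for peak 107.leslie3d to 1.75 GiB (seven segments).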