iOpenShell » Q-Chem specific questions » Error in libvmm::io_exception

Error in libvmm::io_exception

Member
Registered: Jun 2008
Posts: 30
I compiled Q-Chem with the Intel compiler, MKL, OpenMP and relwdeb on an SGI UV1000 shared-memory system running SUSE Linux.

I am trying to calculate the CCSD(T) energy of a system containing H, C, O and V with the aug-cc-pVTZ-DK basis set on the same machine, using 64 cores, 380 GB of memory and up to 1.5 TB of RAM disk, but I get an error immediately after the SCF converges. Can anyone help point out the cause of the problem?


Input
$molecule
0 1
V -0.973142000 0.084610000 -0.012389000
...
H 2.495522000 1.340684000 -0.022383000
$end

$rem
JOBTYPE sp
exchange hf
correlation ccsd(t)
MAX_SCF_CYCLES 200
BASIS gen
PURECART 1111
mem_total 380000
$end

$basis

(defn of aug-cc-pVTZ-DK)

$end


Here is a snippet of the output:

...
Nuclear Repulsion Energy = 509.5599530177 hartrees
There are 38 alpha and 38 beta electrons
Requested basis set is non-standard
There are 190 shells and 576 basis functions
Total memory of 1679678MB is distributed as follows:
QALLOC including MEM_STATIC uses 190062MB
MEM_STATIC is set to 62MB
CCMAN JOB total memory use is 1679616MB
Warning: actual memory use might exceed 1679678MB

Total QAlloc Memory Limit 190062 MB
Mega-Array Size 61 MB
MEM_STATIC part 62 MB
A cutoff of 1.0D-14 yielded 16964 shell pairs
There are 154686 function pairs ( 212074 Cartesian)

-------------------------------------------------------
OpenMP Integral Computing Module
Release: version 1.0, May 2013, Q-Chem Inc. Pittsburgh
-------------------------------------------------------
Integral Job Info:
Integral job number is 11
Integral operator is 1
short-range coefficients 0
long-range coefficients 100000000
Omega coefficients 0
if combine SR and LR in K 0
Integral screening is 0
Integral computing path is 2
max size of driver memory is 3200000
size of driver memory is 1704330
size of scratch memory is 24773880
max col of scratch BK array 50625
max len of scratch array in speh3 2114
max len of scratch index in speh4 50
max int batch size is 520
min int batch size is 52
fixed nKL is 2
max L of basis functions is 4
order of int derivative is 0
number of shells is 190
number of basis is 674
number of cartesian basis is 674
number of contracted shell pairs 16964
number of primitive shell pairs 91567
maxK2 (contraction) of shell pair 361
max number of K2 of shell pair 1
max number of CS2 of shell pair 2269
max number of PS2 of shell pair 8550
mem total for path MDJ 543872
-------------------------------------------------------
Smallest overlap matrix eigenvalue = 9.21E-06

Scale SEOQF with 1.000000e-01/1.000000e-01/1.000000e-01

Standard Electronic Orientation quadrupole field applied
Nucleus-field energy = 0.0000000094 hartrees
Guess MOs from core Hamiltonian diagonalization
A restricted Hartree-Fock SCF calculation will be
performed using Pulay DIIS extrapolation
SCF converges when DIIS error is below 1.0E-08
using 64 threads for integral computing
---------------------------------------
Cycle Energy DIIS Error
---------------------------------------
1 -769.1471175481 7.35E-02
...
26 -1321.3610706995 7.16E-09 Convergence criterion met
---------------------------------------
SCF time: CPU 89445.99 s wall 2244.35 s
SCF energy in the final basis set = -1321.3610706995
Total energy in the final basis set = -1321.3610706995


------------------------------------------------------------------------------

CCMAN2: suite of methods based on coupled cluster
and equation of motion theories.

Components:
* libvmm-1.3-trunk
by Evgeny Epifanovsky, Ilya Kaliman.
* libtensor-2.5-release
by Evgeny Epifanovsky, Michael Wormit, Dmitry Zuev, Sam Manzer,
Ilya Kaliman.
* libcc-2.5-trunk
by Evgeny Epifanovsky, Arik Landau, Tomasz Kus, Kirill Khistyaev,
Dmitry Zuev, Prashant Manohar, Xintian Feng, Anna Krylov,
Matthew Goldey, Alec White, Thomas Jagau, Kaushik Nanda,
Anastasia Gunina, Alexander Kunitsa, Joonho Lee.

CCMAN original authors:
Anna I. Krylov, C. David Sherrill, Steven R. Gwaltney,
Edward F. C. Byrd (2000)
Sergey V. Levchenko, Lyudmila V. Slipchenko, Tao Wang,
Ana-Maria C. Cristian (2003)
Piotr A. Pieniazek, C. Melania Oana, Evgeny Epifanovsky (2007)
Prashant Manohar (2009)

------------------------------------------------------------------------------


terminate called after throwing an instance of 'libvmm::io_exception'
what(): libvmm::page_file<T>::advance(size_t, pos_t&), /home01/acrc/chiensh/scratch/q-chem/qchem/libvmm/evmm/../page_file.h (233), io_exception
fseek (Invalid argument)
/home01/acrc/chiensh/scratch/q-chem/qchem/bin/mpi/mpirun_qchem.ch_p4: line 243: 409814 Aborted (core dumped)
/home01/acrc/chiensh/scratch/q-chem/qchem/exe/qcprog.exe ".input.409732.qcin.1" "/dev/shm/chiensh/qchem409732/"
"</dev/null" -p4pg /home01/acrc/chiensh/scratch/qtest/PI409762 -p4wd /home01/acrc/chiensh/scratch/qtest


Here is what I got from the core dump file, but it is not very informative.


GNU gdb (GDB) SUSE (7.3-0.6.1)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /scratch_test/acrc/chiensh/q-chem/qchem/build/qcprog.exe...
warning: Range for type (null) has invalid bounds 0..-121

warning: Range for type (null) has invalid bounds 0..-121
done.
[New LWP 409814]
...
[New LWP 409827]
Missing separate debuginfo for /lib64/libdl.so.2
Try: zypper install -C "debuginfo(build-id)=3e4f6bfee9fdf77ca975b77b8c325347d9228bb8"
Missing separate debuginfo for /lib64/libpthread.so.0
Try: zypper install -C "debuginfo(build-id)=09dae90d04b1e2e43758ce58845f026b4085aec9"
Missing separate debuginfo for /lib64/libm.so.6
Try: zypper install -C "debuginfo(build-id)=e05f2e72f47391363a03eff3cde10ad4c007c045"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=469835eb7eeb4aa3f653537c36a72810e01fb602"
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=9ae0815beac8378108daf1be3c6fd506bde67e03"
Missing separate debuginfo for /lib64/libnss_files.so.2
Try: zypper install -C "debuginfo(build-id)=22603a8ce8adec0e2b1855223fc9d4db3a0e5f3b"
Missing separate debuginfo for
Try: zypper install -C "debuginfo(build-id)=9ae0815beac8378108daf1be3c6fd506bde67e03"
Missing separate debuginfo for /lib64/libdl.so.2
Try: zypper install -C "debuginfo(build-id)=3e4f6bfee9fdf77ca975b77b8c325347d9228bb8"
Missing separate debuginfo for /lib64/libpthread.so.0
Try: zypper install -C "debuginfo(build-id)=09dae90d04b1e2e43758ce58845f026b4085aec9"
[Thread debugging using libthread_db enabled]
Missing separate debuginfo for /lib64/libm.so.6
Try: zypper install -C "debuginfo(build-id)=e05f2e72f47391363a03eff3cde10ad4c007c045"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=469835eb7eeb4aa3f653537c36a72810e01fb602"
Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
Try: zypper install -C "debuginfo(build-id)=181860a35c8e9a0456dd6675f85c2eb0f062e956"
Missing separate debuginfo for /lib64/libnss_files.so.2
Try: zypper install -C "debuginfo(build-id)=22603a8ce8adec0e2b1855223fc9d4db3a0e5f3b"
Core was generated by `/home01/acrc/chiensh/scratch/q-chem/qchem/exe/qcprog.exe .input.409732.qcin.1 /'.
Program terminated with signal 6, Aborted.
#0 0x00002aaaabaf9b35 in raise () from /lib64/libc.so.6
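
Since the backtrace stops at raise(), the most informative line is still the "fseek (Invalid argument)" message in the output. "Invalid argument" is the text for errno EINVAL, which POSIX seek calls return when, for example, the resulting file offset would be negative. The minimal sketch below reproduces that generic failure mode with os.lseek on a temporary file. Whether the libvmm page file actually hit a negative or overflowed offset in this job is an assumption, not something the log confirms.

```python
import errno
import os
import tempfile

# Create a small regular file and try to seek to a negative absolute offset.
with tempfile.TemporaryFile() as f:
    f.write(b"x" * 16)
    f.flush()
    try:
        os.lseek(f.fileno(), -1, os.SEEK_SET)
        seek_errno = None  # the seek unexpectedly succeeded
    except OSError as exc:
        seek_errno = exc.errno

# Report whether the failure was EINVAL and show the matching message text.
print(seek_errno == errno.EINVAL, os.strerror(errno.EINVAL))
```

If an offset computed from a very large memory request overflowed or went negative, a seek would fail in this way, but other causes of EINVAL are possible.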
Member
Registered: Jun 2008
Posts: 30
Following up on the previous question:
After I set CC_BACKEND = XM, the job gets a little further and prints the MO information, but it hangs right before the CCSD iterations start,
with these error messages:

[uv1|08:54:20|#1041|~/scratch/qtest/Alex]: ~/scratch/q-chem/qchem/bin/qchem -nt 64 input output mo
This is a multi-thread run using 64 threads
/home01/acrc/chiensh/scratch/q-chem/qchem/bin/parallel.csh, , input, 1, 0, /dev/shm/chiensh/mo/
MPIRUN in parallel.csh is /home01/acrc/chiensh/scratch/q-chem/qchem/bin/mpi/mpirun_qchem
P4_RSHCOMMAND in parallel.csh is ssh
QCOUTFILE is output
Q-Chem machineFile is /home01/acrc/chiensh/scratch/q-chem/qchem/bin/mpi/machines
ERROR in qints omp thread 31/64: std::bad_alloc
ERROR in qints omp thread 11/64: std::bad_alloc
ERROR in qints omp thread 31/64: std::bad_alloc
ERROR in qints omp thread 47/64: std::bad_alloc
ERROR in qints omp thread 62/64: std::bad_alloc
ERROR in qints omp thread 22/64: std::bad_alloc
...



Here is the output

...
Point group: C1 (1 irreducible representation).

A All
-------------------------------------
All molecular orbitals:
- Alpha 576 576
- Beta 576 576
-------------------------------------
Alpha orbitals:
- Frozen occupied 0 0
- Active occupied 38 38
- Active virtual 538 538
- Frozen virtual 0 0
-------------------------------------
Beta orbitals:
- Frozen occupied 0 0
- Active occupied 38 38
- Active virtual 538 538
- Frozen virtual 0 0
-------------------------------------

Import integrals: CPU 0.05 s wall 0.00 s


Rem section of the input:

$rem
JOBTYPE sp
exchange hf
correlation ccsd(t)
MAX_SCF_CYCLES 200
BASIS gen
scf_guess read
PURECART 1111
mem_total 300000
mem_static 2000
cc_memory 150000
CC_BACKEND XM
$end
« Last edit by chiensh on Tue Nov 15, 2016 1:48 am. »
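
The memory header of the first run can be compared with the settings in this second $rem section using a little arithmetic. The sketch below uses only numbers printed in the two posts. Reading the CCMAN figure as the amount the coupled-cluster code tries to use is an assumption, as is any link between that figure and the fseek failure.

```python
# First run: figures copied from the memory header of the output (MB).
total_reported = 1679678      # "Total memory of 1679678MB"
qalloc_incl_static = 190062   # "QALLOC including MEM_STATIC"
mem_static_first = 62         # "MEM_STATIC is set to 62MB"
ccman_first = 1679616         # "CCMAN JOB total memory use"
mem_total_first = 380000      # mem_total in the first $rem section

# Second run: values from the $rem section above (MB).
mem_total_second = 300000
cc_memory_second = 150000

# The reported total is the CCMAN figure plus MEM_STATIC.
assert ccman_first + mem_static_first == total_reported

# QALLOC without the static part is half of the requested mem_total.
qalloc_dynamic = qalloc_incl_static - mem_static_first
assert 2 * qalloc_dynamic == mem_total_first

ratio_first = ccman_first / mem_total_first
print(f"first run:  CCMAN figure is {ratio_first:.1f}x mem_total")
print(f"second run: cc_memory is {cc_memory_second / mem_total_second:.1f}x mem_total")
```

In the first run the CCMAN figure is roughly 4.4 times the requested mem_total, and larger than both the 380 GB of memory and the 1.5 TB RAM disk mentioned in the first post. Setting cc_memory explicitly, as in the second run, caps it at half of mem_total.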
Administrator
Registered: Sep 2007
Posts: 175
This seems to be a perfectly reasonable job size for the memory you have. It appears to crash in libqints (the new integral routine). I would suggest running it with fewer cores, about 16. I do not think this relatively small job would scale beyond 16.

In any case, the crash should not be happening. We should try to rerun this job after the new release, with the most up-to-date libqints.
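
As a rough way to gauge the job size discussed above, the orbital counts printed in the output (38 active occupied, 538 active virtual) give the dense double-precision sizes of a few typical CCSD quantities. This is a naive sketch: it ignores permutational symmetry and assumes full dense storage, which is not necessarily how Q-Chem stores or computes these tensors, so the numbers are upper bounds for illustration only.

```python
n_occ = 38    # active occupied orbitals (from the output above)
n_virt = 538  # active virtual orbitals (from the output above)
BYTES_PER_DOUBLE = 8
GIB = 2**30

def dense_gib(*dims):
    """Size in GiB of a dense double-precision tensor with the given dimensions."""
    n = 1
    for d in dims:
        n *= d
    return n * BYTES_PER_DOUBLE / GIB

t2_gib = dense_gib(n_occ, n_occ, n_virt, n_virt)      # T2-like amplitudes
ovvv_gib = dense_gib(n_occ, n_virt, n_virt, n_virt)   # one-occupied integral block
vvvv_gib = dense_gib(n_virt, n_virt, n_virt, n_virt)  # all-virtual integral block

for name, size in [("oovv", t2_gib), ("ovvv", ovvv_gib), ("vvvv", vvvv_gib)]:
    print(f"{name}: {size:8.1f} GiB")
```

Under these assumptions the amplitudes take only a few GiB, while the all-virtual block dominates and is the term most sensitive to how the program stores it.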
