I have a Python 2.7 daemon process using the module http://www.jejik.com/files/examples/daemon.py
The process is a heavy one: 40 GB RAM usage and 9 child threads. The server runs RHEL 6.3 with 192 GB RAM and enough CPU power.
After starting, the process lasts around 3-7 hours and is then killed by someone, possibly the kernel. I could not find any hints in dmesg or in the kernel log (which I had to activate manually); nothing is there. When I start it without daemonizing, I get this message in the terminal: "killed".
The following precautions have been taken:
- resetting the OOM score in /proc//oom_score_adj so that the OOM killer does not pick this process when short of resources
- increasing the rlimits (those that can be increased) to the maximum
- setting the process nice/priority higher (prio -15)
The problem existed before applying these precautions, so they are not responsible for the killing.
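To double-check that these precautions are actually in effect inside the running daemon (limits are inherited across forks, but it costs little to verify), something like the following could be logged at startup. This is only a sketch: it assumes Linux, it assumes the elided path component above is the process ID (so `/proc/self/oom_score_adj` refers to the current process), and the function names `read_oom_score_adj` and `current_rlimits` are made up for illustration.

```python
import resource


def read_oom_score_adj(pid="self"):
    """Return the OOM score adjustment for a process, or None if unreadable.

    Assumes a Linux /proc filesystem that exposes oom_score_adj.
    """
    path = "/proc/%s/oom_score_adj" % pid
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return None


def current_rlimits():
    """Return {name: (soft, hard)} for every RLIMIT_* this platform defines."""
    limits = {}
    for name in dir(resource):
        if not name.startswith("RLIMIT_"):
            continue
        try:
            limits[name] = resource.getrlimit(getattr(resource, name))
        except (ValueError, OSError):
            # Some constants may not be usable on every kernel.
            pass
    return limits


if __name__ == "__main__":
    print("oom_score_adj: %r" % read_oom_score_adj())
    for name, (soft, hard) in sorted(current_rlimits().items()):
        print("%s soft=%s hard=%s" % (name, soft, hard))
```

A soft or hard value equal to `resource.RLIM_INFINITY` means "unlimited". Writing this output to the rotated log file at startup makes it possible to compare the values against what was intended.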
I have a mechanism to catch exceptions, stderr, and stdout, and to log everything to a rotated log file. There is nothing interesting in it from before the process died.
Modules used within the process include, among others: oracle_cx, ibm_db, suds, wsgi_utils. All of them write logs when errors occur.
Does anyone know how to trace the killing, and why it happens?
Thanks in advance.
To see who was logged in at the time when the process was killed, use the command last.
If no one was logged in at that time, the process was killed by a signal.
Since this is Python, the easiest way to find out what killed the process is to write a signal handler for all signals and log them. See here for how to write a signal handler. See this question for how to catch them all.
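A minimal sketch of such a catch-all handler, using only the standard `signal` and `logging` modules, might look like this. The logger name `signal-trace` and the helper names are made up for illustration. Note that SIGKILL and SIGSTOP can never be caught, so a kill by SIGKILL would still leave no entry in this log.

```python
import logging
import signal

log = logging.getLogger("signal-trace")  # hypothetical logger name

# These two signals cannot be caught or ignored by any process.
UNCATCHABLE = ("SIGKILL", "SIGSTOP")


def signal_names():
    """Map signal numbers to names, e.g. {15: 'SIGTERM'}."""
    names = {}
    for name in dir(signal):
        # Skip SIG_DFL, SIG_IGN and similar non-signal constants.
        if name.startswith("SIG") and not name.startswith("SIG_"):
            value = getattr(signal, name)
            if isinstance(value, int):
                names.setdefault(int(value), name)
    return names


NAMES = signal_names()


def log_signal(signum, frame):
    """Handler that only records which signal arrived."""
    log.warning("received signal %d (%s)", signum, NAMES.get(signum, "unknown"))


def install_handlers():
    """Install log_signal for every catchable signal; return the names."""
    installed = []
    for signum, name in sorted(NAMES.items()):
        if name in UNCATCHABLE:
            continue
        try:
            signal.signal(signum, log_signal)
            installed.append(name)
        except (ValueError, OSError, RuntimeError):
            # The platform refused a handler for this signal.
            pass
    return installed
```

`install_handlers()` has to be called from the main thread, since Python only allows setting handlers there. Because the handler only logs, signals such as SIGTERM and SIGINT no longer terminate the process while it is installed; that is acceptable for a tracing run but probably not something to leave in permanently.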
If there is a core dump, attach an external hard disk with enough space. Or limit the size of the core to 1 GB using ulimit; that might be enough to see where it crashed.
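The core size limit can also be set from inside the process with the standard `resource` module, which avoids depending on the shell that starts the daemon. A rough sketch, where `limit_core_size` is a made-up helper name and the value is in bytes:

```python
import resource

ONE_GB = 1024 ** 3


def limit_core_size(max_bytes=ONE_GB):
    """Set the soft core-file limit to at most max_bytes.

    The soft limit may not exceed the hard limit, so the requested
    value is capped when the hard limit is lower.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_CORE, (max_bytes, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print("core limit (soft, hard): %r" % (limit_core_size(),))
```

If the hard limit is 0, the soft limit stays 0 and no core file will be written; in that case the hard limit has to be raised by whatever starts the process. With a 40 GB process, a 1 GB core will be truncated, so it may or may not contain the part that matters.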
Alternatively, start the process using a debugger like gdb; that will make sure you get a prompt when one of the "core dump" signals has been sent to the process.