While running HarmonEPS over the METCOOP25B domain, we have hit crashes in CANARI on two dates: 2017-06-13 06Z and 2017-09-06 00Z. The crashes are reproducible with harmonie-40h1.1.bf1@cca on the first assimilation cycle. The traceback is shown below: a floating-point exception (signal 8, SIGFPE), with ACHMT, called from PPOBSAC, as the innermost routine in the DrHook call chain. Does anyone have any suggestions?
Ulf
signal_harakiri(SIGALRM=14): New handler installed at 0x202d0140; old preserved at (nil)
***Received signal = 8 and ActivatED SIGALRM=14 and calling alarm(10), time =1510918066.32
[myproc#4,tid#1,pid#18048,signal#8(SIGFPE)]: Received signal :: 4425MB (heap), 2158MB (rss), 0MB (stack), 0 (paging), nsigs 1, time 1510918066.32
tid#1 starting drhook traceback, time =1510918066.32
[myproc#4,tid#1,pid#18048]: 4425 MB (maxheap), 2158 MB (maxrss), 0 MB (maxstack), walltime = 1510918066.32s
[myproc#4,tid#1,pid#18048]: MASTER
[myproc#4,tid#1,pid#18048]: CNT0<1>
[myproc#4,tid#1,pid#18048]: CAN1
[myproc#4,tid#1,pid#18048]: CANARI
[myproc#4,tid#1,pid#18048]: CADAVR
[myproc#4,tid#1,pid#18048]: STEPO
[myproc#4,tid#1,pid#18048]: OBSV
[myproc#4,tid#1,pid#18048]: TASKOB
[myproc#4,tid#1,pid#18048]: TASKOB>KSET_LOOP
[myproc#4,tid#1,pid#18048]: TASKOB>OBSGRP=01
[myproc#4,tid#1,pid#18048]: HOP
[myproc#4,tid#1,pid#18048]: PPOBSAC
[myproc#4,tid#1,pid#18048]: ACHMT
tid#1 starting sigdump traceback, time =1510918066.32