TOPIC: MEMKILL

MEMKILL 8 years 1 month ago #497

  • Sigurdur Thorsteinsson
Hi,
Yesterday I got a memory error in my job EUW1 on
/gpfs/home/ms/ic/ics/hl_home/EUW1.
The following error message appeared in
/c1a/ms_dir/ms/ic/ics/EUW1/Postprocessing/Hour/PPCycle/Verify.1:

All of c1a0103.10362438.0's processes on c1a1001 were sent SIGTERM or SIGKILL
by the operating system, as the step was attempting to use too much memory.

Please review the ConsumableMemory() specification in the job's submit script.
.
.
MEMKILL-WLMINFO-c1a1001: WLM_RES_TOTALRMEM.hardmax=781
.
.
CPU and MEMORY summary for the step:

Node: c1a1001   ?: .   tasks: 1   thrds: 1
CPU:
  total physical CPU seconds consumed:         21.92
  average consumed physCpu/alloc'd SMT-thrd:   *40%
  current consumed physCpu/alloc'd SMT-thrd:   *0%
MEMORY:
  total alloc'd memory on node [MB]:           781
  *** MEMKILLED ***

* denotes that this step allowed SMT: CPU efficiency may depend on coallocated foreign steps


I have tried to increase the memory in submission.db:
} elsif ( $SMSNAME =~ m~/Boundaries~ ) {
$edit{ LL_MEMORY_LIMIT } = "5500mb";

How should I increase the memory?
I hope to get a quick answer.
Best regards,
Sigurdur

Re:MEMKILL 8 years 1 month ago #501

  • Ulf Andrae
  • Administrator
Sigurdur,

If you check the file that is submitted

ecgate:/gpfs/scratch/ms/ic/ics/hl_home/EUW1/EUW1/Postprocessing/Hour/PPCycle/Verify.job1-q

you will find that no memory request is given. For some reason LL_MEMORY_LIMIT is not set and hence no information about memory is given in:

hirlam.org/trac/browser/trunk/hirlam/scripts/submission.db#L400

Could you add some print statements in submission.db, both where you set LL_MEMORY_LIMIT and just after the if (ECMWF), to check what we get?
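
A minimal sketch of such print statements, written as a standalone Perl script so it can be run outside the system. The value given to $SMSNAME and the helper debug_memory_limit are hypothetical and only for illustration; in submission.db, $SMSNAME and %edit come from the surrounding script, and the real chain of branches will look different. Only the /Boundaries test and the "5500mb" value are taken from the post above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for what the surrounding script provides.
# The task name below is an assumption based on the failing Verify job.
my $SMSNAME = "/EUW1/Postprocessing/Hour/PPCycle/Verify";
my %edit;

# Hypothetical helper: report the task name and the current memory limit.
# Writes to STDERR so the message ends up in the log, and returns the text.
sub debug_memory_limit {
    my ($where) = @_;
    my $limit = defined $edit{LL_MEMORY_LIMIT}
              ? $edit{LL_MEMORY_LIMIT}
              : "(not set)";
    my $msg = "DEBUG ${where}: SMSNAME=${SMSNAME} LL_MEMORY_LIMIT=${limit}";
    print STDERR "$msg\n";
    return $msg;
}

# The branch quoted in the first post: it only fires for Boundaries tasks.
if ( $SMSNAME =~ m~/Boundaries~ ) {
    $edit{ LL_MEMORY_LIMIT } = "5500mb";
    debug_memory_limit("Boundaries branch");
}

# A second print after the task matching shows what is finally used.
my $report = debug_memory_limit("after task matching");
```

With the assumed task name the /Boundaries pattern does not match, so the second print reports the limit as not set. Whether the same happens in the real run is what the print statements are meant to reveal.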

Ulf

Re:MEMKILL 8 years 1 month ago #513

  • Ulf Andrae
Sigurdur,

I tried with your submission.db file and had no problems. Could you please try again?

Ulf