TOPIC: Hirlam 7.3 with Intel 12.1.0

Hirlam 7.3 with Intel 12.1.0 8 years 5 months ago #639

Hi.

While replacing Intel compiler 11.059 with Intel 12.1.0 went smoothly for Harmonie, I ran into some problems with Hirlam (7.3).

I am using pretty much the same configuration as with Intel 11 (with the corresponding MKL libraries changed, of course):

FCFLAGS := -xSSE4.2 -O3 -cpp -ftz -align all -fno-alias -no-prec-div -no-prec-sqrt -ip #(openmp is not used just to be sure)

(or with -xHost in place of -xSSE4.2)

With this configuration, optimisation level -O3 combined with -xHost (or the corresponding -xSSE4.2) has so far resulted in a forecast abort (Truncation in HALO ZONE etc.), whatever combination of the other flags I tried. The same model set-up works with Intel 11.

File attachment: intel12_opHL7.txt (file size: 1807)

If -O2 is used together with the other flags, including -xHost, everything works.
(I don't think I should start changing the ENV_domain right now.)

This is not a call for help to solve this situation, but if any of you using recent Intel compiler versions have faced such problems, please share your experiences.


UPD.

At the moment it looks like src/grdy/hhsolv.f & friends are what doesn't work properly with -xSSE4.2 and -O3 (overriding the optimisation level to -O2 for hhsolv.f solves it for now).
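
One way such a per-file substitution could be expressed, as a minimal sketch assuming a GNU make build in which each object is compiled with $(FC) $(FCFLAGS). The pattern rule, the object name hhsolv.o and FC := ifort are assumptions for illustration, and the actual Hirlam build system may handle per-file flags differently:

```make
# Sketch only: FC, the pattern rule and the object name are assumptions,
# not taken from the Hirlam build system.
FC      := ifort
FCFLAGS := -xSSE4.2 -O3 -cpp -ftz -align all -fno-alias -no-prec-div -no-prec-sqrt -ip

# Target-specific variable: hhsolv.o is built with -O2 instead of -O3,
# every other object keeps the global FCFLAGS.
hhsolv.o: FCFLAGS := $(filter-out -O3,$(FCFLAGS)) -O2

# Generic compile rule (the recipe line must start with a tab).
%.o: %.f
	$(FC) $(FCFLAGS) -c $< -o $@
```

With hhsolv.f present, `make -n hhsolv.o` prints the compile command without running it, which is a quick way to check that -O2 replaced -O3 for that file only.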
Last Edit: 8 years 5 months ago by Martynas Kazlauskas.

Re:Hirlam 7.3 with Intel 12.1.0 8 years 4 months ago #662

Upd:

In grdy/hhsolv.f, the solution for our local platform with Intel 12.1.0 is now:
(assuming LHHITER=no in Env_expdesc)

!DEC$ novector
      DO jrec = 1, nlat
         j = jdatastart + jmin - 2 + jrec
         jrec_fft = (k - 1) * nlat + jrec
         IF (j .EQ. (kpbpts + 1) .OR.
     .       j .EQ. (klat_global - kpbpts - 1) .OR.
     .       j .EQ. (klat_global - kpbpts)) THEN
            DO i = kpbpts + 1, klon_global - kpbpts - 2
               div_fft(jrec_fft,i) = 0.0
            END DO
         END IF
         IF (j .GE. (kpbpts + 2) .AND.
     .       j .LE. (klat_global - kpbpts - 2))
     .      div_fft(jrec_fft,kpbpts+1) = 0.0
      END DO
      END DO

It seems that Intel 12 vectorizes much more aggressively than Intel 11 with the same combination of options below, with -O3 as the key player (judging by the -vec-report* output for hhsolv.f):

FCFLAGS := -xSSE4.2 -O3 -g -traceback -cpp -ftz -align all -openmp -fno-alias -no-prec-div -no-prec-sqrt -ip

(No significant effect on speed due to this change)
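
To compare what the two compiler versions vectorize in this file, the report can be captured per compiler and diffed. A minimal sketch, assuming GNU make and that hhsolv.f compiles standalone with these flags (in practice it will likely need the model's include paths); the target name, the output file names, TAG and FC are hypothetical:

```make
# Sketch only: target, TAG and file names are hypothetical.
FC      := ifort
FCFLAGS := -xSSE4.2 -O3 -cpp -ftz -align all -fno-alias -no-prec-div -no-prec-sqrt -ip
SRC     := src/grdy/hhsolv.f
TAG     ?= intel12

# -vec-report2 lists loops that were and were not vectorized; all compiler
# output (stdout and stderr) is captured in a file named after TAG.
vecreport:
	$(FC) $(FCFLAGS) -vec-report2 -c $(SRC) -o hhsolv.vec.o > vecreport.$(TAG).txt 2>&1

.PHONY: vecreport
```

Running `make vecreport TAG=intel11` and `make vecreport TAG=intel12` in the respective compiler environments, followed by `diff vecreport.intel11.txt vecreport.intel12.txt`, would show which loops are treated differently.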
Last Edit: 8 years 4 months ago by Martynas Kazlauskas.

Re:Hirlam 7.3 with Intel 12.1.0 8 years 2 months ago #675

Well, some nice conclusions after all.

A more recent Intel 12 release (12.1.4, vs. 12.1.0, which needed the extra directive) seems to be more stable on vectorization (some fixes, perhaps) and runs the default code without any extra directives or alterations.
No significant impact on performance, but it is still a bit faster than Intel 11 with the same configuration.

Keep in mind that these results are from our platform/configuration only.
Last Edit: 8 years 2 months ago by Martynas Kazlauskas.