Intel® Fortran Compiler 17.0 Developer Guide and Reference

qopt-streaming-stores, Qopt-streaming-stores

Enables generation of streaming stores for optimization.

Syntax

Linux and macOS:

-qopt-streaming-stores=keyword

Windows:

/Qopt-streaming-stores:keyword

Arguments

keyword

Specifies whether streaming stores are generated. Possible values are:

always

Enables generation of streaming stores for optimization. The compiler optimizes under the assumption that the application is memory bound.

When this setting is specified, you are responsible for inserting any memory barriers (fences) required to ensure correct memory ordering within a thread or across threads. See the Example section for one way to do this.

never

Disables generation of streaming stores for optimization. Normal stores are performed.

auto

Lets the compiler decide which instructions to use.

Default

-qopt-streaming-stores=auto
or /Qopt-streaming-stores:auto

The compiler decides whether to use streaming stores or normal stores.

Description

This option enables generation of streaming stores for optimization. This method stores data with instructions that use a non-temporal buffer, which minimizes memory hierarchy pollution.

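This option may be useful for memory-bound applications that write large amounts of data that are not read again soon afterward, because such stores would otherwise displace useful data from the caches.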
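To make the idea concrete, the following C sketch shows what a streaming (non-temporal) store looks like when written by hand with x86 SSE2 intrinsics. It is an illustration only, not the code the compiler generates for this option. The function name fill_streaming is hypothetical, and the sketch assumes an x86 target, a compiler that provides <immintrin.h>, and a 16-byte-aligned destination.

```c
#include <immintrin.h>  /* x86 intrinsics: _mm_set1_pd, _mm_stream_pd, _mm_sfence */
#include <stddef.h>

/* Hypothetical helper: fill dst[0..n-1] with value using non-temporal stores.
   Assumption: dst is 16-byte aligned, as _mm_stream_pd requires. */
void fill_streaming(double *dst, size_t n, double value)
{
    __m128d v = _mm_set1_pd(value);   /* two copies of value in one SSE register */
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        _mm_stream_pd(dst + i, v);    /* non-temporal store of two doubles */
    }
    if (i < n) {
        dst[i] = value;               /* odd tail element: ordinary store */
    }

    _mm_sfence();  /* order the streaming stores before any later stores */
}
```

The trailing _mm_sfence call plays the same role as the fence that you must insert yourself when you specify the always setting.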

IDE Equivalent

None

Alternate Options

None

Example

The following example shows one way to insert memory barriers (fences) when specifying -qopt-streaming-stores=always or /Qopt-streaming-stores:always. It makes the C/C++ processor intrinsic _mm_sfence callable from Fortran by declaring an interface for it and then calling that interface:

subroutine sub1(a, b, c, d, len, n1, n2)
integer len, n1, n2
real(8) a(len), b(len), c(len), d(len)

! Make the C/C++ processor intrinsic _mm_sfence callable from Fortran
interface 
  subroutine ftn_sfence() bind (C, name = "_mm_sfence") 
     !DEC$ attributes known_intrinsic, default :: ftn_sfence
  end subroutine ftn_sfence
end interface

integer i, j

!$OMP parallel do
  do j = 1,n1
     a(j) = 1.0
     b(j) = 2.0
     c(j) = 0.0
  enddo

! Store fence: orders the streaming stores above before any later stores
call ftn_sfence()

!$OMP parallel do
  do i = 1,n2
     a(i) = b(i)*c(i) 
  enddo

end

Another way to do this is to call a C/C++ function from the Fortran code after the loop.
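A minimal sketch of that alternative, written in C, might look like the following. The wrapper name sfence_wrapper is hypothetical, and the sketch assumes an x86 target and a C compiler that provides <immintrin.h>.

```c
#include <immintrin.h>  /* declares _mm_sfence on x86 targets */

/* Hypothetical wrapper: gives the intrinsic an ordinary external C symbol
   that Fortran can bind to. */
void sfence_wrapper(void)
{
    /* Store fence: every store issued before this point becomes globally
       visible before any store issued after it. */
    _mm_sfence();
}
```

On the Fortran side, an interface such as subroutine sfence_wrapper() bind(C, name="sfence_wrapper") would make the wrapper callable, and the call would go after the loop that performs the streaming stores.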
