Intel® Fortran Compiler 17.0 Developer Guide and Reference
This topic only applies when targeting Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
By default, the OFFLOAD directive causes the CPU thread that encounters the directive to wait for completion of the offload before continuing to the next statement. You can execute an asynchronous offload computation, which enables the CPU to initiate the offload and immediately continue to the next statement.
To run an offloaded computation asynchronously, add a signal clause to the OFFLOAD directive to initiate the computation, and subsequently use the OFFLOAD_WAIT directive to wait for its completion.
Alternatively, you can use the non-blocking API OFFLOAD_SIGNALED() to check whether a section of offloaded code has completed running on a specific target device.
On Intel® MIC Architecture, the signal and wait clauses, the OFFLOAD_WAIT construct and the OFFLOAD_SIGNALED() API refer to a specific target device, so you must specify target-number in the target() clause.
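The non-blocking check described above can be sketched as a polling loop. This is a minimal sketch, not a definitive implementation. It assumes that OFFLOAD_SIGNALED is provided by the MIC_LIB module, takes the target number and the signal variable as arguments, and returns a logical result; verify these details against the OFFLOAD_SIGNALED reference page. The routines long_running_mic_compute and concurrent_cpu_activity are hypothetical stubs.

```fortran
program poll_offload_signal
  use mic_lib            ! ASSUMPTION: module that provides OFFLOAD_SIGNALED
  implicit none
  integer :: signal_var
  !DIR$ ATTRIBUTES OFFLOAD:MIC :: long_running_mic_compute

  ! Initiate the asynchronous offload on target device 0.
  !DIR$ OFFLOAD TARGET(MIC:0) SIGNAL(signal_var)
  call long_running_mic_compute()

  ! Poll only after the signal has been initiated, and on the same device.
  ! ASSUMPTION: OFFLOAD_SIGNALED(target_number, signal) returns a logical.
  do while (.not. offload_signaled(0, signal_var))
    call concurrent_cpu_activity()
  end do
end program poll_offload_signal

! Hypothetical stub for the offloaded work.
subroutine long_running_mic_compute()
  !DIR$ ATTRIBUTES OFFLOAD:MIC :: long_running_mic_compute
  implicit none
  integer :: i
  real :: acc
  acc = 0.0
  do i = 1, 1000000
    acc = acc + sqrt(real(i))
  end do
end subroutine long_running_mic_compute

! Hypothetical stub for work the CPU does while the offload runs.
subroutine concurrent_cpu_activity()
  implicit none
end subroutine concurrent_cpu_activity
```

Unlike OFFLOAD_WAIT, the polling loop lets the CPU decide how much work to do between checks, at the cost of repeatedly querying the signal.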
Querying a signal before it has been initiated results in undefined behavior and a runtime abort of the application. For example, suppose a signal (SIG1) is initiated for target device 1 but queried on target device 0. No signal SIG1 is associated with target device 0, so the application aborts.
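The SIG1 scenario can be illustrated with a short sketch that keeps the signal and the wait on the same target device. It assumes a system with at least two coprocessors, and long_running_mic_compute is a hypothetical stub.

```fortran
program matched_signal_target
  implicit none
  integer :: sig1
  !DIR$ ATTRIBUTES OFFLOAD:MIC :: long_running_mic_compute

  ! SIG1 is initiated for target device 1.
  !DIR$ OFFLOAD TARGET(MIC:1) SIGNAL(sig1)
  call long_running_mic_compute()

  ! The wait must name the same device. Writing TARGET(MIC:0) here would
  ! query a signal that was never initiated on device 0, and the
  ! application would abort.
  !DIR$ OFFLOAD_WAIT TARGET(MIC:1) WAIT(sig1)
end program matched_signal_target

! Hypothetical stub for the offloaded work.
subroutine long_running_mic_compute()
  !DIR$ ATTRIBUTES OFFLOAD:MIC :: long_running_mic_compute
  implicit none
end subroutine long_running_mic_compute
```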
The following example enables the CPU to issue offloaded computations and continue concurrent activity without using any additional CPU threads:
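integer signal_var
integer counter
! Make the routine available on the coprocessor (declaration part).
!DIR$ ATTRIBUTES OFFLOAD:MIC :: long_running_mic_compute
counter = 10000
do while (counter .gt. 0)
  ! Initiate the offload on device 0; the CPU does not wait.
  !DIR$ OFFLOAD TARGET(MIC:0) SIGNAL(signal_var)
  call long_running_mic_compute()
  ! The CPU does other work while the offload runs.
  call concurrent_cpu_activity()
  ! Block until the offload associated with signal_var completes.
  !DIR$ OFFLOAD_WAIT TARGET(MIC:0) WAIT(signal_var)
  counter = counter - 1
end do
end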