Merge branch 'master' into 2661_operate_on_halos
arporter committed Nov 13, 2024
2 parents 70fadc7 + d865146 commit 78ad828
Showing 97 changed files with 2,857 additions and 1,575 deletions.
6 changes: 6 additions & 0 deletions changelog
@@ -269,6 +269,12 @@
102) PR #2763 towards #2717. Upstreams PSyACC functionality: ensures
that clauses on an OpenACC directive are consistent.

103) PR #2734 towards #2704. Implements the forward_dependency part of
the new DefinitionUseChain PSyIR tool.

104) PR #2761 for #2739. Updates the signature of the PSyKAl
transformation script to accept a PSyIR node.

release 2.5.0 14th of February 2024

1) PR #2199 for #2189. Fix bugs with missing maps in enter data
26 changes: 26 additions & 0 deletions doc/developer_guide/dependency.rst
@@ -662,3 +662,29 @@ can be parallelised:
:hide:

Error: The write access to 'a(i,i)' and the read access to 'a(i + 1,i + 1)' are dependent and cannot be parallelised. Variable: 'a'.

DefinitionUseChain
==================
PSyclone also provides a DefinitionUseChain class, which can search for forward
dependencies (backward dependencies are not yet implemented) for a given Reference
inside a region of code. This implementation differs from the DependencyTools in
that it is control-flow aware, so it can find multiple dependencies for a single
Reference in a given Routine or scope.

This is primarily used to implement the `References.next_accesses` function, but can be
used directly as follows:

.. code::

    chain = DefinitionUseChain(reference)
    accesses = chain.find_forward_accesses()
    # accesses contains Nodes that are dependent on reference
    accesses[0].....

By default the dependencies will be searched for in the containing Routine.

Limitations
-----------
At the moment the DefinitionUseChain assumes that any control-flow branch might not be
taken, i.e. any code inside a Loop or If statement is not guaranteed to execute. Accesses
inside such regions will be found, but they will not stop the search from continuing
through the rest of the tree.
Additionally, GOTO statements are not supported and, if one is found, an exception is raised.
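
The limitation described above can be illustrated with a small stand-alone toy model. This is only a sketch under stated assumptions: `Access`, `Branch` and `forward_accesses` are hypothetical names invented for the illustration and are not part of PSyclone, and the real DefinitionUseChain works on PSyIR nodes rather than on these simplified records. The sketch shows the one idea from the text: a write inside a conditional region is reported, but only an unconditional write ends the forward search.

```python
from dataclasses import dataclass, field


@dataclass
class Access:
    """Hypothetical record of one read or write of a named variable."""
    name: str
    kind: str   # either "read" or "write"
    label: str  # identifier used only to make the result easy to inspect


@dataclass
class Branch:
    """Hypothetical region that may or may not execute (an If or Loop body)."""
    body: list = field(default_factory=list)


def forward_accesses(stmts, name):
    """Collect the accesses to `name` that can follow the start of `stmts`.

    Returns a (found, killed) pair, where `killed` is True if a write at
    this nesting level ended the search.
    """
    found = []
    for stmt in stmts:
        if isinstance(stmt, Branch):
            # The branch may not be taken, so a write inside it is recorded
            # but is not allowed to end the search at the outer level.
            inner, _ = forward_accesses(stmt.body, name)
            found.extend(inner)
        elif stmt.name == name:
            found.append(stmt)
            if stmt.kind == "write":
                # A write at this level overwrites the value, so later
                # accesses at this level no longer depend on the original.
                return found, True
    return found, False


code = [
    Access("a", "read", "r1"),
    Branch([Access("a", "write", "w_in_if"), Access("a", "read", "r_in_if")]),
    Access("a", "read", "r2"),
    Access("a", "write", "w2"),
    Access("a", "read", "r3"),
]
found, killed = forward_accesses(code, "a")
labels = [access.label for access in found]
```

In this toy, the write inside the branch is reported but the search carries on to `r2` and `w2`; the unconditional write `w2` is what finally ends it, so `r3` is never reported. Loop-carried dependencies are deliberately ignored here.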
22 changes: 21 additions & 1 deletion doc/user_guide/getting_going.rst
@@ -533,7 +533,7 @@ input source file is required:
However, we usually want to redirect the output to a file so that we can later
compile. We can do this using the `-o` flag:
compile it. We can do this using the `-o` flag:

.. code-block:: console
@@ -566,6 +566,26 @@ with a `trans` function defined. For example:
print(f"Loop not parallelised because: {err.value}")
.. warning::

Before PSyclone 3.0 the transformation scripts took a PSy object as argument:

.. code-block:: python
def trans(psy):
''' Add OpenMP Parallel Loop directives.
:param psy: the PSy object that PSyclone has constructed for the
'invoke'(s) found in the Algorithm file.
:type psy: :py:class:`psyclone.dynamo0p3.DynamoPSy`
'''
for invoke in psy.invokes.invoke_list:
invoke.schedule
This is deprecated and will stop working in PSyclone releases after version 3.0.


And can be applied using the `-s` flag:

.. code-block:: console
26 changes: 11 additions & 15 deletions doc/user_guide/psy_data.rst
@@ -119,25 +119,21 @@ be printed at runtime, e.g.::

The transformation that adds read-only-verification to an application
can be applied for both the :ref:`LFRic <lfric-api>` and
:ref:`GOcean API <gocean-api>` - no API-specific
transformations are required. Below is an example that searches for each
loop in an invoke (which will always surround kernel calls) and applies the
transformation to each one. This code has been successfully used as a
global transformation with the LFRic Gravity Wave miniapp (the executable
is named ``gravity_wave``)::

def trans(psy):
:ref:`GOcean API <gocean-api>` - no API-specific transformations are required.
Below is an example that searches for each loop in the code of a PSyKAl invoke (which
will always surround kernel calls) and applies the transformation to each one.
This code has been successfully used as a global transformation with the LFRic
Gravity Wave application (the executable is named ``gravity_wave``):

.. code-block:: python
def trans(psyir):
from psyclone.psyir.transformations import ReadOnlyVerifyTrans
from psyclone.psyir.nodes import Loop
read_only_verify = ReadOnlyVerifyTrans()
for invoke in psy.invokes.invoke_list:
schedule = invoke.schedule
for node in schedule:
if isinstance(node, Loop):
read_only_verify.apply(node)

return psy
for loop in psyir.walk(Loop):
read_only_verify.apply(loop)
Besides the transformation, a library is required to do the actual
verification at runtime. There are two implementations of the
31 changes: 13 additions & 18 deletions doc/user_guide/transformations.rst
@@ -771,29 +771,25 @@ the Python search path **PYTHONPATH** as before. For example::
PSyclone also provides the same functionality via a function (which is
what the **psyclone** script calls internally).

###.. autofunction:: psyclone.generator.generate
### :noindex:
A valid script file must contain a **trans** function which accepts a
:ref:`PSyIR node<psyir-ug>` representing the root of the psy-layer
code (as a FileContainer)::

A valid script file must contain a **trans** function which accepts a **PSy**
object as an argument and returns a **PSy** object, i.e.:
::

>>> def trans(psy):
>>> def trans(psyir):
... # ...
... return psy

It is up to the script what it does with the PSy object. The example
below does the same thing as the example in the
It is up to the script how to modify the PSyIR representation of the code.
The example below does the same thing as the example in the
:ref:`sec_transformations_interactive` section.
::

>>> def trans(psy):
>>> def trans(psyir):
... from psyclone.transformations import OMPParallelLoopTrans
... invoke = psy.invokes.get('invoke_0_v3_kernel_type')
... schedule = invoke.schedule
... ol = OMPParallelLoopTrans()
... ol.apply(schedule.children[0])
... return psy
... from psyclone.psyir.nodes import Routine
... for subroutine in psyir.walk(Routine):
... if subroutine.name == 'invoke_0_v3_kernel_type':
... ol = OMPParallelLoopTrans()
... ol.apply(subroutine.children[0])
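
The traversal pattern used in the new-style script can be sketched without PSyclone installed. The classes below are hypothetical stand-ins written for this illustration only (they are not PSyclone's `Node`, `FileContainer`, `Routine` or `Loop`), and the `walk` shown is an assumed depth-first, pre-order search that mimics how such a method is typically used. The string marker stands in for applying a real transformation. The point it illustrates is that `trans` modifies the tree in place and returns nothing.

```python
class Node:
    """Hypothetical stand-in for a tree node with a name and children."""

    def __init__(self, name="", children=None):
        self.name = name
        self.children = list(children or [])

    def walk(self, node_type):
        """Return this node and all descendants that are instances of
        node_type, in depth-first, pre-order sequence."""
        result = [self] if isinstance(self, node_type) else []
        for child in self.children:
            result.extend(child.walk(node_type))
        return result


class FileContainer(Node):
    """Stand-in for the root node handed to the script."""


class Routine(Node):
    """Stand-in for a subroutine node."""


class Loop(Node):
    """Stand-in for a loop node."""


def trans(psyir):
    """Mark the first child of one named routine. The tree is modified in
    place and nothing is returned."""
    for routine in psyir.walk(Routine):
        if routine.name == "invoke_0_v3_kernel_type":
            # Stand-in for something like applying a loop transformation.
            routine.children[0].name += " [omp parallel do]"


tree = FileContainer("file", [
    Routine("invoke_0_v3_kernel_type", [Loop("loop0"), Loop("loop1")]),
    Routine("other", [Loop("loop2")]),
])
result = trans(tree)
loop_names = [loop.name for loop in tree.walk(Loop)]
```

After the call, only the first loop of the named routine carries the marker, and `result` is `None` because the script has nothing to return.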

In the gocean API (and in the future the lfric API) an
optional **trans_alg** function may also be supplied. This function
@@ -803,7 +799,6 @@ returns **PSyIR** i.e.:

>>> def trans_alg(psyir):
... # ...
... return psyir

As with the `trans()` function it is up to the script what it does with
the algorithm PSyIR. Note that the `trans_alg()` script is applied to
@@ -981,7 +976,7 @@ multiple InvokeSchedule and kernel-specific optimization options.
.. literalinclude:: ../../examples/gocean/eg3/ocl_trans.py
:language: python
:linenos:
:lines: 39-79
:pyobject: trans


OpenCL delays the decision of which and where kernels will execute until
20 changes: 7 additions & 13 deletions examples/gocean/eg1/opencl_transformation.py
@@ -36,38 +36,34 @@
''' Module providing a PSyclone transformation script that converts the
Schedule of each Invoke to use OpenCL. '''

from psyclone.psyGen import TransInfo
from psyclone.psyGen import TransInfo, InvokeSchedule
from psyclone.domain.gocean.transformations import GOOpenCLTrans, \
GOMoveIterationBoundariesInsideKernelTrans


def trans(psy):
def trans(psyir):
'''
Transformation routine for use with PSyclone. Converts any imported-
variable accesses into kernel arguments and then applies the OpenCL
transformation to the PSy layer.
:param psy: the PSy object which this script will transform.
:type psy: :py:class:`psyclone.psyGen.PSy`
:returns: the transformed PSy object.
:rtype: :py:class:`psyclone.psyGen.PSy`
:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`
'''

# Get the necessary transformations
tinfo = TransInfo()
import_trans = tinfo.get_trans_name('KernelImportsToArguments')
move_boundaries_trans = GOMoveIterationBoundariesInsideKernelTrans()
cltrans = GOOpenCLTrans()

for invoke in psy.invokes.invoke_list:
print("Converting to OpenCL invoke: " + invoke.name)
schedule = invoke.schedule
for schedule in psyir.walk(InvokeSchedule):
print("Converting to OpenCL invoke: " + schedule.name)

# Skip invoke_2 as its time_smooth_code kernel contains a
# module variable (alpha) which is not dealt with by the
# KernelImportsToArguments transformation, see issue #826.
if invoke.name == "invoke_2":
if schedule.name == "invoke_2":
continue

# Remove the imports from inside each kernel and move PSy-layer
@@ -79,5 +75,3 @@ def trans(psy):

# Transform invoke to OpenCL
cltrans.apply(schedule)

return psy
17 changes: 5 additions & 12 deletions examples/gocean/eg1/openmp_taskloop_trans.py
@@ -39,37 +39,30 @@
'''

from __future__ import print_function
from psyclone.psyir.nodes import Loop
from psyclone.transformations import OMPParallelTrans, OMPSingleTrans
from psyclone.transformations import OMPTaskloopTrans
from psyclone.psyir.transformations import OMPTaskwaitTrans


def trans(psy):
def trans(psyir):
'''
Transformation routine for use with PSyclone. Applies the OpenMP
taskloop and taskwait transformations to the PSy layer.
:param psy: the PSy object which this script will transform.
:type psy: :py:class:`psyclone.psyGen.PSy`
:returns: the transformed PSy object.
:rtype: :py:class:`psyclone.psyGen.PSy`
:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`
'''

singletrans = OMPSingleTrans()
paralleltrans = OMPParallelTrans()
tasklooptrans = OMPTaskloopTrans(nogroup=False)
taskwaittrans = OMPTaskwaitTrans()
for invoke in psy.invokes.invoke_list:
print("Adding OpenMP tasking to invoke: " + invoke.name)
schedule = invoke.schedule
for schedule in psyir.children[0].children:
print("Adding OpenMP tasking to invoke: " + schedule.name)
for child in schedule.children:
if isinstance(child, Loop):
tasklooptrans.apply(child)
singletrans.apply(schedule)
paralleltrans.apply(schedule)
taskwaittrans.apply(schedule[0])

return psy
2 changes: 1 addition & 1 deletion examples/gocean/eg2/Makefile
@@ -119,5 +119,5 @@ psy_prof.f90 alg_prof.f90: alg.f90
# start-up. Therefore wildcards do not pick-up any files generated
# during execution.
kernels:
${MAKE} inc_field_mod.o inc_field_0_mod.o
${MAKE} inc_field_mod.o

18 changes: 6 additions & 12 deletions examples/gocean/eg2/acc_prof_transform.py
@@ -38,32 +38,26 @@
function via the -s option. Transforms the invoke with the addition of
OpenACC directives and then encloses the whole in a profiling region. '''

from __future__ import print_function
from acc_transform import trans as acc_trans
from psyclone.psyir.transformations import ProfileTrans


def trans(psy):
def trans(psyir):
'''
Take the supplied psy object, add OpenACC directives and then enclose
the whole schedule within a profiling region.
:param psy: the PSy layer to transform.
:type psy: :py:class:`psyclone.gocean1p0.GOPSy`
:returns: the transformed PSy object.
:rtype: :py:class:`psyclone.gocean1p0.GOPSy`
:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`
'''
proftrans = ProfileTrans()

# Use the trans() routine in acc_transform.py to add the OpenACC directives
psy = acc_trans(psy)
acc_trans(psyir)

invoke = psy.invokes.get('invoke_0_inc_field')
schedule = invoke.schedule
schedule = next(x for x in psyir.children[0].children
if x.name == 'invoke_0_inc_field')

# Enclose everything in a profiling region
proftrans.apply(schedule.children)
print(schedule.view())
return psy
51 changes: 25 additions & 26 deletions examples/gocean/eg2/acc_transform.py
@@ -38,42 +38,41 @@
function via the -s option. Transforms all kernels in the invoke
to have them compiled for an OpenACC accelerator. '''

from psyclone.domain.common.transformations import KernelModuleInlineTrans
from psyclone.psyir.nodes import Loop
from psyclone.transformations import (
ACCParallelTrans, ACCEnterDataTrans, ACCLoopTrans, ACCRoutineTrans)
from psyclone.psyir.nodes import Loop


def trans(psy):
''' Take the supplied psy object, apply OpenACC transformations
to the schedule of invoke_0 and return the new psy object '''
def trans(psyir):
''' Apply OpenACC transformations to the invoke_0 subroutine
:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`
'''
ptrans = ACCParallelTrans()
ltrans = ACCLoopTrans()
dtrans = ACCEnterDataTrans()
ktrans = ACCRoutineTrans()
itrans = KernelModuleInlineTrans()

invoke = psy.invokes.get('invoke_0_inc_field')
schedule = invoke.schedule
print(schedule.view())

# Apply the OpenACC Loop transformation to *every* loop
# nest in the schedule
for child in schedule.children:
if isinstance(child, Loop):
ltrans.apply(child, {"collapse": 2})
for schedule in psyir.children[0].children:
if schedule.name == 'invoke_0_inc_field':

# Put all of the loops in a single parallel region
ptrans.apply(schedule.children)
# Apply the OpenACC Loop transformation to *every* loop
# nest in the schedule
for child in schedule.children:
if isinstance(child, Loop):
ltrans.apply(child, {"collapse": 2})

# Add an enter-data directive
dtrans.apply(schedule)
# Put all of the loops in a single parallel region
ptrans.apply(schedule.children)

# Put an 'acc routine' directive inside each kernel
for kern in schedule.coded_kernels():
ktrans.apply(kern)
# Ideally we would module-inline the kernel here (to save having to
# rely on the compiler to do it) but this does not currently work
# for the fparser2 AST (issue #229).
# itrans.apply(kern)
# Add an enter-data directive
dtrans.apply(schedule)

print(schedule.view())
return psy
# Put an 'acc routine' directive inside each kernel
for kern in schedule.coded_kernels():
ktrans.apply(kern)
itrans.apply(kern)