Skip to content

Commit

Permalink
Add compilation and usage of the device compiler with PoCL
Browse files Browse the repository at this point in the history
  • Loading branch information
keryell committed Jan 31, 2018
1 parent 90a5558 commit 3063a56
Show file tree
Hide file tree
Showing 2 changed files with 122 additions and 1 deletion.
6 changes: 6 additions & 0 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,12 @@ See `Possible futures <doc/possible-futures.rst>`_
News
----

- 2018/02/01 there is now some documentation about the architecture of
triSYCL on GPU and accelerators with its device compiler based on
Clang/LLVM in `<doc/architecture.rst>`_. While this is wildly
experimental, there is a growing interested around it and it is
always useful to get started as a contributor.

- 2018/01/05 there are some internship openings at Xilinx to work on
triSYCL for FPGA
https://xilinx.referrals.selectminds.com/jobs/compiler-engineer-intern-on-sycl-for-fpga-4685
Expand Down
117 changes: 116 additions & 1 deletion doc/architecture.rst
Original file line number Diff line number Diff line change
Expand Up @@ -93,7 +93,8 @@ and finish as explained by the ``make`` output.
triSYCL architecture for CPU
============================

The implementation for CPU is a pure C++17 templated header library.
The implementation for CPU is a pure C++17 templated header library
and does not require a specific C++ or SYCL compiler.

The dataflow SYCL infrastructure between kernels related by
buffer/accessors dependencies is implemented in
Expand Down Expand Up @@ -121,6 +122,120 @@ barrier-spaghetti code. But this relies on the triSYCL device
compiler...



triSYCL architecture for accelerator
====================================

When targeting an accelerator, even if SYCL is a pure C++ DSEL, a
specific compiler is required to extract the kernel code and compile
it to some target device and at the same time to compile on the host side
some glue code around the extraction boundary to transfer data to and
from the device and call the kernel itself.

The device compiler is based on Clang/LLVM 3.9 for now.

Since it is quite more experimental than the CPU path, it is not yet
merged into the main branches:

- device branch of triSYCL:
https://github.com/triSYCL/triSYCL/tree/device

- Clang supporting triSYCL:
https://github.com/triSYCL/clang/tree/sycl/release_39/master

- LLVM supporting triSYCL:
https://github.com/triSYCL/llvm/tree/sycl/release_39/master


Installation & compilation
--------------------------

First download or clone the device compiler repositories, for example
with::

git clone --branch sycl/release_39/master [email protected]:triSYCL/llvm.git
cd llvm/tools
git clone --branch sycl/release_39/master [email protected]:triSYCL/clang.git
cd ../..

Then compile for example with::

mkdir build
cd build
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=1 \
-DLLVM_ENABLE_CXX1Y=On \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_TARGETS_TO_BUILD="X86" \
-DLLVM_BUILD_LLVM_DYLIB:BOOL=ON \
-DLLVM_LINK_LLVM_DYLIB:BOOL=ON \
../llvm
# Use -j8 to speed up compilation if you have 8 cores for example
make -j8

You might replace the ``Release`` by ``Debug`` above if you want to
debug the compiler itself. Look at https://llvm.org/docs/CMake.html
for more information.

Compilation and installation of the triSYCL runtime::

git clone --branch device [email protected]:triSYCL/triSYCL.git
# Compile the triSYCL_tool command
cd triSYCL/src
make


Usage
-----

Unfortunately there is no driver yet to generate directly the host and
device part and it is up to the end-user for now, since it is still
experimental and in development. So using the compiler
is... painful. :-(

It is expected to be used as for example with examples from
`<../tests/device_compiler>`_. Everything is done from
`<../tests/Makefile>`_ when making a target ending with the
``.kernel_caller`` extension such as
``tests/device_compiler/single_task_vector_add_drt.kernel_caller``.

triSYCL assumes some recent Clang/LLVM installed, independently from
the one used by device compiler which might not be new enough.

A recent version of Boost is required. It is available with package
``libboost-all-dev`` on Debian/Ubuntu or with some more modern
specific versions such as ``libboost1.63-all-dev``.

The device compiler generates the kernels as SPIR-df (*de-facto*),
which is SPIR 2.0 encoded with LLVM IR of a more recent version than
LLVM 3.4 expected by the SPIR specification. So a very modern SPIR
consumer is required, such as a recent PoCL. It is not the version
available in Ubuntu 17.10 for example, so you might compile and
install PoCL on your own...

Set up the environment::

export LLVM_BUILD_DIR=<directory_where_LLVM_is_built>
# Use PoCL OpenCL stack
export BOOST_COMPUTE_DEFAULT_PLATFORM='Portable Computing Language'
# Do not use another OpenCL stack if the one requested is not available
export BOOST_COMPUTE_DEFAULT_ENFORCE=1

Compile and execute a small example::

cd tests
make -j2 device_compiler/single_task_vector_add_drt.kernel_caller
device_compiler/single_task_vector_add_drt.kernel_caller
[...]
Queue waiting for kernel completion

**** no errors detected



Workflow
--------


Testing infrastructure
======================

Expand Down

0 comments on commit 3063a56

Please sign in to comment.