Releases: CNugteren/CLTune

Version 2.7.0

26 Jun 19:27

  • CLTune now automatically ensures that the global size is a multiple of the local workgroup size
  • Added GetBestResult() to the tuner's API to retrieve the best parameters programmatically
  • Changed std::initializer_list in the AddParameters API to std::vector
  • Fixed a bug in the simulated annealing search method
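
The global-size adjustment in the first item above can be illustrated with a minimal sketch. The release note does not say how CLTune makes the global size a multiple of the local size; rounding each dimension up is an assumption here, and RoundUpGlobalSize is a hypothetical helper, not part of the CLTune API.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of the CLTune API): rounds every global
// dimension up to the nearest multiple of the matching local dimension.
std::vector<std::size_t> RoundUpGlobalSize(const std::vector<std::size_t>& global,
                                           const std::vector<std::size_t>& local) {
  if (global.size() != local.size()) {
    throw std::invalid_argument("global and local need the same number of dimensions");
  }
  std::vector<std::size_t> result(global.size());
  for (std::size_t i = 0; i < global.size(); ++i) {
    if (local[i] == 0) {
      throw std::invalid_argument("local size must be non-zero");
    }
    // Assumption: pad upwards rather than truncate, so no work-item is dropped.
    const std::size_t remainder = global[i] % local[i];
    result[i] = (remainder == 0) ? global[i] : global[i] + (local[i] - remainder);
  }
  return result;
}
```

With a global size of {1000, 64} and a local size of {128, 16}, this sketch yields {1024, 64}. A kernel launched with a padded global size would typically need a bounds check so that the extra work-items do nothing.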

Version 2.6.0

23 Oct 13:51

  • Changed timing measurements to also include the (varying) kernel launch overhead
  • It is now possible to set OpenCL compiler options through the environment variable CLTUNE_BUILD_OPTIONS
  • Added support for compilation under Visual Studio 2013 (MSVC++ 12.0)
  • Added an option to build a static version of the library
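
As a rough illustration of the CLTUNE_BUILD_OPTIONS item above, the sketch below reads the variable with std::getenv and appends its contents to a set of existing options. How CLTune itself combines the two is not stated in the release note, so the append order and the helper name BuildOptionsFromEnvironment are assumptions.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not CLTune's implementation): returns base_options
// followed by the contents of CLTUNE_BUILD_OPTIONS, or base_options alone
// when the variable is unset or empty.
std::string BuildOptionsFromEnvironment(const std::string& base_options) {
  const char* extra = std::getenv("CLTUNE_BUILD_OPTIONS");
  if (extra == nullptr || *extra == '\0') {
    return base_options;
  }
  if (base_options.empty()) {
    return std::string(extra);
  }
  return base_options + " " + extra;
}
```

In a POSIX shell the variable could be set for a single run by prefixing the command, for example CLTUNE_BUILD_OPTIONS="-cl-fast-relaxed-math" followed by the name of the tuner executable.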

Version 2.5.0

27 Sep 19:05

  • Updated to version 8.0 of the CLCudaAPI header
  • Made it possible to configure the number of times each kernel is run (to average results)
  • Minor bugfixes
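
The idea behind running each kernel several times can be sketched as follows. This is a host-side illustration only: it times an arbitrary callable with std::chrono and returns the mean, whereas the tuner times real kernel launches. AverageRuntimeMs is a hypothetical helper and not the CLTune API.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <stdexcept>

// Hypothetical helper: calls `run` num_runs times and returns the mean
// wall-clock time per call in milliseconds.
double AverageRuntimeMs(const std::function<void()>& run, std::size_t num_runs) {
  if (num_runs == 0) {
    throw std::invalid_argument("num_runs must be at least 1");
  }
  double total_ms = 0.0;
  for (std::size_t i = 0; i < num_runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    run();
    const auto stop = std::chrono::steady_clock::now();
    total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
  }
  return total_ms / static_cast<double>(num_runs);
}
```

More runs smooth out run-to-run noise at the cost of a longer tuning session, which is presumably why the count is configurable.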

Version 2.4.0

29 Jun 17:52

  • Made it possible to run the unit-tests independently of the provided OpenCL kernel samples
  • Added an option to compile in verbose mode for additional diagnostic messages (-DVERBOSE=ON)
  • Now using version 6.0 of the CLCudaAPI header
  • Fixed the RPATH settings on OSX
  • Added AppVeyor continuous integration and increased coverage of the Travis builds

Version 2.3.1 (bug-fix release)

25 May 11:04

  • Fixed a bug where an output buffer could not be used as input at the same time
  • Fixed computing the validation error for half-precision fp16 data-types

Version 2.3.0

22 May 15:06

  • Added support for 'short' and 'cl_half' data-types as kernel buffer and scalar arguments
  • Fixed a bug where failed results would still show up in the tuning results
  • Made MSVC link the run-time libraries statically

Version 2.2.0

27 Apr 09:08

  • Added two new, simpler samples that use the tuner (vector-add and convolution)
  • Updated the general documentation
  • Added API documentation
  • Now using version 5.0 of the CLCudaAPI header

Version 2.1.0

31 Mar 04:13

  • Added exports to be able to create a DLL on Windows (thanks to Marco Hutter)
  • Added command-line OpenCL platform selection in the examples (thanks to William J Shipman)

Version 2.0.0

22 Nov 11:21

  • Added support for machine learning models. These models can be trained on a small fraction of the
    tuning configurations and can be used to predict the remainder. Two models are supported:
    • Linear regression
    • A 3-layer neural network
  • Now using version 4.0 of the CLCudaAPI header (previously known as Claduc)
  • Added experimental support for CUDA kernels
  • Added support for MSVC (Visual Studio) 2015
  • Using Catch instead of GTest for unit-testing
  • Various minor fixes
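
The model-based approach in the first item above (measure a fraction of the configurations, predict the rest) can be sketched with a one-feature least-squares fit. This is only an illustration of the general idea: the single feature, the LinearModel struct, and FitLinearModel are assumptions, and the tuner's own models are not shown in these notes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// One-feature linear model: predicted time = slope * x + intercept.
struct LinearModel {
  double slope;
  double intercept;
  double Predict(double x) const { return slope * x + intercept; }
};

// Hypothetical helper: ordinary least-squares fit on the measured
// (parameter value, execution time) pairs.
LinearModel FitLinearModel(const std::vector<double>& x, const std::vector<double>& y) {
  if (x.size() != y.size() || x.size() < 2) {
    throw std::invalid_argument("need at least two matching samples");
  }
  const double n = static_cast<double>(x.size());
  double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sum_x += x[i];
    sum_y += y[i];
    sum_xx += x[i] * x[i];
    sum_xy += x[i] * y[i];
  }
  const double denominator = n * sum_xx - sum_x * sum_x;
  if (denominator == 0.0) {
    throw std::invalid_argument("all x values are identical");
  }
  const double slope = (n * sum_xy - sum_x * sum_y) / denominator;
  return {slope, (sum_y - slope * sum_x) / n};
}
```

Fitting on the measured samples (1, 3), (2, 5) and (4, 9) gives a slope of 2 and an intercept of 1, so an unmeasured configuration with x = 8 would be predicted at 17. Predictions like this can be used to rank the unmeasured configurations before spending time on running them.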

Version 1.7.0

03 Aug 15:17