lc0-cudnn : add support for fp16 network eval #685

ankan-ban · 2018-05-28T16:48:59Z

Slightly more than 2x speedup over the fp32 version (for large batch sizes) on supported hardware, without much loss of precision.
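To give a rough feel for the "not much loss of precision" part, here is a minimal sketch that rounds a few values to IEEE 754 half precision and measures the relative error. It is not code from this PR: it uses Python's standard `struct` module (format `e`) instead of the CUDA/cuDNN path, and the helper names `to_fp16` and `relative_error` and the sample values are made up for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # Packing with format "e" stores the value as a 16-bit float;
    # unpacking reads it back as a regular Python float.
    return struct.unpack("<e", struct.pack("<e", x))[0]


def relative_error(x: float) -> float:
    """Relative rounding error introduced by storing x in fp16."""
    return abs(to_fp16(x) - x) / abs(x)


# Hypothetical sample values, not taken from any real network.
samples = [0.1, 0.5, 1.2345, 3.14159, 100.7]

worst = max(relative_error(v) for v in samples)
print(f"worst relative error over samples: {worst:.2e}")
```

For values in fp16's normal range the relative rounding error per stored value is bounded by 2^-11 (about 0.05%), since half precision keeps 10 explicit mantissa bits. This only covers storage rounding; errors accumulated inside convolutions and the limited fp16 range (largest finite value 65504) are separate concerns that this sketch does not model.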

get latest

slightly more than 2X speedup compared to fp32 version (without noticeable loss of precision)

ankan-ban added 4 commits May 15, 2018 21:30

Merge pull request #6 from glinscott/next

d4efbf2

get latest

Merge pull request #7 from glinscott/next

66311ae

get latest

Merge pull request #8 from glinscott/next

4fc6080

get latest

lc0-cudnn: Add fp16 support

99b344e

slightly more than 2X speedup compared to fp32 version (without noticeable loss of precision)

mooskagh added the lc0 new tensorflow based implementation label May 28, 2018

gonzalezjo mentioned this pull request Jun 29, 2018

Implement some sort of compression for the networks LeelaChessZero/lc0#121

Closed
