Remove EnzymeInterpreter and instead reuse GPUInterpreter #1893
+35
−257
With the stack of changes in JuliaGPU/GPUCompiler.jl#582, JuliaGPU/GPUCompiler.jl#633, and JuliaGPU/GPUCompiler.jl#634, we can hopefully address several issues.
The primary goal is to make nested AD work better, primarily 2nd order on the GPU and higher than 3rd order on the CPU.
GPUCompiler and Enzyme make more and more advanced use of AbstractInterpreter, and we need to make sure that modifications made in one apply to the other. So here I drop EnzymeInterpreter and make GPUInterpreter extensible enough that inline blocking, deferred codegen, and `autodiff` -> `autodiff_deferred` all work.
Eventually I would like to simplify `autodiff_deferred` such that it only works within a GPUCompiler context, and that it looks something like
And then we use the plugin interface in JuliaGPU/GPUCompiler.jl#633 to behave more like the Clang plugin and transform the code there instead of in the current pipeline. This ought to simplify the current pipeline even more, since it will then be focused on CPU codegen.