-add-exception-handling
Adds exception handling for async QNodes.
This pass sets async tokens and async values to the error state when a QNode raises an exception.
It also replaces the logic generated by the async dialect that aborts execution with a call to a runtime function that generates an error.
Options
-stop-after-step : Useful for tests. This will stop the execution of the transformation after step N has been executed. Defaults to 0, which is equivalent to running all steps to completion.
-apply-transform-sequence
Apply the passes scheduled with the transform dialect.
-buffer-deallocation
Adds all required dealloc operations for all allocations in the input program.
This pass implements an algorithm to automatically introduce all required deallocation operations for all buffers in the input program. This ensures that the resulting program does not have any memory leaks.
Input
#map0 = affine_map<(d0) -> (d0)>
module {
  func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
    cf.cond_br %arg0, ^bb1, ^bb2
  ^bb1:
    cf.br ^bb3(%arg1 : memref<2xf32>)
  ^bb2:
    %0 = memref.alloc() : memref<2xf32>
    linalg.generic {
      indexing_maps = [#map0, #map0],
      iterator_types = ["parallel"]} %arg1, %0 {
    ^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
      %tmp1 = exp %gen1_arg0 : f32
      linalg.yield %tmp1 : f32
    }: memref<2xf32>, memref<2xf32>
    cf.br ^bb3(%0 : memref<2xf32>)
  ^bb3(%1: memref<2xf32>):
    "memref.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
    return
  }
}
Output
#map0 = affine_map<(d0) -> (d0)>
module {
  func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
    cf.cond_br %arg0, ^bb1, ^bb2
  ^bb1: // pred: ^bb0
    %0 = memref.alloc() : memref<2xf32>
    memref.copy(%arg1, %0) : memref<2xf32>, memref<2xf32>
    cf.br ^bb3(%0 : memref<2xf32>)
  ^bb2: // pred: ^bb0
    %1 = memref.alloc() : memref<2xf32>
    linalg.generic {
      indexing_maps = [#map0, #map0],
      iterator_types = ["parallel"]} %arg1, %1 {
    ^bb0(%arg3: f32, %arg4: f32):
      %4 = exp %arg3 : f32
      linalg.yield %4 : f32
    }: memref<2xf32>, memref<2xf32>
    %2 = memref.alloc() : memref<2xf32>
    memref.copy(%1, %2) : memref<2xf32>, memref<2xf32>
    dealloc %1 : memref<2xf32>
    cf.br ^bb3(%2 : memref<2xf32>)
  ^bb3(%3: memref<2xf32>): // 2 preds: ^bb1, ^bb2
    memref.copy(%3, %arg2) : memref<2xf32>, memref<2xf32>
    dealloc %3 : memref<2xf32>
    return
  }
}
-convert-arraylist-to-memref
Lower array list operations to memref operations.
This pass implements dynamically resizable array lists via lowering them to mutable memrefs.
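As a rough illustration of the idea (not the pass's actual output), a resizable list can be modelled as a fixed-size backing buffer plus a size and a capacity, with the buffer swapped for a larger one when it fills up. The class name `ArrayList`, the doubling growth factor, and the initial capacity below are assumptions made for this sketch; the source does not state the growth policy the pass uses.

```python
class ArrayList:
    """Toy model of a resizable list stored in a fixed-size buffer.

    `data` stands in for the mutable backing memref. The growth factor of 2
    is an assumption of this sketch, not something stated by the pass docs.
    """

    def __init__(self, capacity=4):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.data = [None] * capacity
        self.size = 0
        self.capacity = capacity

    def push(self, value):
        # When the buffer is full, allocate a larger one and copy elements over.
        if self.size == self.capacity:
            new_capacity = self.capacity * 2
            new_data = [None] * new_capacity
            new_data[:self.size] = self.data[:self.size]
            self.data = new_data
            self.capacity = new_capacity
        self.data[self.size] = value
        self.size += 1

    def pop(self):
        # Popping only shrinks the logical size; the buffer is kept.
        if self.size == 0:
            raise IndexError("pop from empty ArrayList")
        self.size -= 1
        return self.data[self.size]


lst = ArrayList(capacity=2)
for value in range(5):
    lst.push(value)
last = lst.pop()
```

In this model only `push` can replace the buffer, which is why the backing storage has to be mutable.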
-convert-catalyst-to-llvm
Lower catalyst utility operations to the LLVM dialect.
-detensorize-scf
Detensorize for, if, and while operations from the SCF dialect.
-disable-assertion
Disable all Catalyst assertions.
-gep-inbounds
Mark GEPOp inbounds.
-hlo-custom-call-lowering
Lower custom call ops from Stable HLO to CallOp.
-inline-nested-module
Inline nested modules with qnode attribute.
Options
-stop-after-step : Useful for tests. This will stop the pass after the given step has been executed (steps run from 1 to 5). Defaults to 0 which is equivalent to running all steps to completion.
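The stop-after-step semantics described above can be sketched in a few lines. This is only a model of the option's documented behaviour, assuming 1-based step numbering; the helper `run_steps` and the five placeholder steps are hypothetical and do not correspond to the pass's real internal steps.

```python
def run_steps(steps, stop_after_step=0):
    """Run `steps` in order and return the 1-based numbers that executed.

    A `stop_after_step` of 0 runs every step; a value N > 0 stops once
    step N has finished.
    """
    executed = []
    for number, step in enumerate(steps, start=1):
        step()
        executed.append(number)
        if stop_after_step != 0 and number == stop_after_step:
            break
    return executed


# Five placeholder steps standing in for the pass's internal stages.
steps = [lambda: None] * 5
all_steps = run_steps(steps)
first_two = run_steps(steps, stop_after_step=2)
```

Here `all_steps` covers steps 1 through 5, while `first_two` stops once step 2 has run, which is the behaviour a test would rely on to inspect intermediate IR.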
-memref-to-llvm-tbaa
Lower the memref load and store operations to LLVM and add the TBAA tags.
-memrefcpy-to-linalgcpy
Switch memref.copy to linalg.copy when the layout is not the identity.
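To make the condition concrete, the sketch below treats "identity layout" as row-major contiguous strides with a zero offset and picks the copy op accordingly. That reading, and the helper names `identity_strides`, `has_identity_layout`, and `choose_copy_op`, are assumptions for illustration; the pass's exact matching rules are not described here, and edge cases such as unit dimensions are ignored.

```python
def identity_strides(shape):
    """Row-major contiguous strides (in elements) for `shape`."""
    strides = []
    running = 1
    for dim in reversed(shape):
        strides.append(running)
        running *= dim
    return list(reversed(strides))


def has_identity_layout(shape, strides, offset=0):
    # Assumption: identity means zero offset and row-major contiguous strides.
    return offset == 0 and list(strides) == identity_strides(shape)


def choose_copy_op(shape, strides, offset=0):
    """Name of the copy op this sketch would keep or switch to."""
    if has_identity_layout(shape, strides, offset):
        return "memref.copy"
    return "linalg.copy"


contiguous = choose_copy_op((2, 3), (3, 1))
transposed = choose_copy_op((2, 3), (1, 2))
```

A 2x3 buffer with strides (3, 1) keeps `memref.copy`, while a transposed view with strides (1, 2) would be switched to `linalg.copy`.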
-qnode-to-async-lowering
Lower QNode func and call operations to async func and call operations.
-register-inactive-callback
Register __catalyst_inactive_callback as inactive with Enzyme.
-scatter-lowering
Lower scatter op from Stable HLO to loops.
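What "lowering to loops" means can be shown with a heavily simplified, one-dimensional scatter written as an explicit loop. The function `scatter_1d` and its `combine` parameter are hypothetical names for this sketch; the real Stable HLO scatter op supports multi-dimensional windows and index maps that are not modelled here.

```python
def scatter_1d(operand, indices, updates, combine=lambda old, new: new):
    """Apply `updates` to a copy of `operand` at `indices`, one by one.

    `combine` plays the role of the scatter's update computation; the
    default simply overwrites the old value.
    """
    result = list(operand)
    for i, index in enumerate(indices):
        result[index] = combine(result[index], updates[i])
    return result


overwritten = scatter_1d([0, 0, 0, 0], [1, 3], [5, 7])
accumulated = scatter_1d(
    [1, 1, 1], [0, 0, 2], [10, 20, 30], combine=lambda old, new: old + new
)
```

With an additive `combine`, repeated indices accumulate, which is why the updates are applied sequentially inside a loop rather than as a single bulk assignment.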