-add-exception-handling

Adds exception handling for async qnodes.

This pass sets async tokens and async values to an error state when a qnode raises an exception.

It also replaces the logic generated by the async dialect that aborts execution with a call to a runtime function that generates an error.
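As a loose analogy only, the behavior can be pictured with Python's asyncio: an exception raised inside an asynchronous task is stored on the task as an error, and the caller inspects it instead of the whole process aborting. The names `qnode` and `run` are hypothetical and nothing here reflects the actual runtime functions used by the pass.

```python
import asyncio


async def qnode(fail):
    # Hypothetical stand-in for an asynchronously executed qnode.
    if fail:
        raise RuntimeError("qnode failed")
    return 42


async def run(fail):
    task = asyncio.ensure_future(qnode(fail))
    # Wait for completion without re-raising; the error stays on the task.
    await asyncio.wait([task])
    if task.exception() is not None:
        # The task is in an error state; report it rather than aborting.
        return ("error", str(task.exception()))
    return ("ok", task.result())
```

Calling `asyncio.run(run(True))` returns an error tuple while the interpreter keeps running, which mirrors the idea of marking the token or value as an error instead of aborting.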

Options

-stop-after-step : Useful for tests. Stops the transformation after step N has been executed. Defaults to 0, which is equivalent to running all steps to completion.

-apply-transform-sequence

Apply the passes scheduled with the transform dialect.

-buffer-deallocation

Adds all required dealloc operations for all allocations in the input program.

This pass implements an algorithm to automatically introduce all required deallocation operations for all buffers in the input program. This ensures that the resulting program does not have any memory leaks.

Input

#map0 = affine_map<(d0) -> (d0)>
module {
  func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
    cf.cond_br %arg0, ^bb1, ^bb2
  ^bb1:
    cf.br ^bb3(%arg1 : memref<2xf32>)
  ^bb2:
    %0 = memref.alloc() : memref<2xf32>
    linalg.generic {
      indexing_maps = [#map0, #map0],
      iterator_types = ["parallel"]} %arg1, %0 {
    ^bb0(%gen1_arg0: f32, %gen1_arg1: f32):
      %tmp1 = exp %gen1_arg0 : f32
      linalg.yield %tmp1 : f32
    }: memref<2xf32>, memref<2xf32>
    cf.br ^bb3(%0 : memref<2xf32>)
  ^bb3(%1: memref<2xf32>):
    "memref.copy"(%1, %arg2) : (memref<2xf32>, memref<2xf32>) -> ()
    return
  }
}

Output

#map0 = affine_map<(d0) -> (d0)>
module {
  func.func @condBranch(%arg0: i1, %arg1: memref<2xf32>, %arg2: memref<2xf32>) {
    cf.cond_br %arg0, ^bb1, ^bb2
  ^bb1:  // pred: ^bb0
    %0 = memref.alloc() : memref<2xf32>
    memref.copy(%arg1, %0) : memref<2xf32>, memref<2xf32>
    cf.br ^bb3(%0 : memref<2xf32>)
  ^bb2:  // pred: ^bb0
    %1 = memref.alloc() : memref<2xf32>
    linalg.generic {
      indexing_maps = [#map0, #map0],
      iterator_types = ["parallel"]} %arg1, %1 {
    ^bb0(%arg3: f32, %arg4: f32):
      %4 = exp %arg3 : f32
      linalg.yield %4 : f32
    }: memref<2xf32>, memref<2xf32>
    %2 = memref.alloc() : memref<2xf32>
    memref.copy(%1, %2) : memref<2xf32>, memref<2xf32>
    dealloc %1 : memref<2xf32>
    cf.br ^bb3(%2 : memref<2xf32>)
  ^bb3(%3: memref<2xf32>):  // 2 preds: ^bb1, ^bb2
    memref.copy(%3, %arg2) : memref<2xf32>, memref<2xf32>
    dealloc %3 : memref<2xf32>
    return
  }

}

-convert-arraylist-to-memref

Lower array list operations to memref operations.

This pass implements dynamically resizable array lists via lowering them to mutable memrefs.
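To illustrate what a resizable list built on fixed-size buffers involves, here is a minimal Python sketch. The class name `ArrayList`, the initial capacity, and the doubling growth policy are assumptions made for illustration; they are not taken from the pass itself.

```python
class ArrayList:
    """Toy model of a resizable list backed by a fixed-size buffer."""

    def __init__(self, initial_capacity=2):
        # Assumed initial capacity; the buffer plays the role of a memref.
        self.capacity = initial_capacity
        self.size = 0
        self.buffer = [None] * self.capacity

    def push(self, value):
        if self.size == self.capacity:
            # Buffer is full: allocate a larger one and copy elements over.
            # Doubling is an assumed growth policy.
            new_capacity = self.capacity * 2
            new_buffer = [None] * new_capacity
            new_buffer[: self.size] = self.buffer[: self.size]
            self.buffer = new_buffer
            self.capacity = new_capacity
        self.buffer[self.size] = value
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty ArrayList")
        self.size -= 1
        return self.buffer[self.size]
```

The list state is a buffer plus a size and a capacity, all of which change in place. That is why the lowering targets mutable memrefs rather than immutable values.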

-convert-catalyst-to-llvm

Lower Catalyst utility operations to the LLVM dialect.

-detensorize-scf

Detensorize for, if, and while operations from the SCF dialect.

-disable-assertion

Disable all Catalyst assertions.

-gep-inbounds

Mark GEPOp operations as inbounds.

-hlo-custom-call-lowering

Lower custom call ops from Stable HLO to CallOp.

-inline-nested-module

Inline nested modules with the qnode attribute.

Options

-stop-after-step : Useful for tests. Stops the pass after the given step has been executed (steps run from 1 to 5). Defaults to 0, which is equivalent to running all steps to completion.
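The semantics of the option can be sketched in a few lines of Python. The function `run_steps` and the representation of steps as callables are hypothetical; only the 1-based numbering and the meaning of 0 come from the description above.

```python
def run_steps(steps, stop_after_step=0):
    """Run `steps` in order and return the 1-based indices that executed.

    A value of 0 for `stop_after_step` means "run everything".
    """
    executed = []
    for index, step in enumerate(steps, start=1):
        step()
        executed.append(index)
        # Stop once the requested step has finished.
        if stop_after_step != 0 and index == stop_after_step:
            break
    return executed
```

With five steps, `run_steps(steps, 2)` runs steps 1 and 2 only, which makes it possible to inspect the intermediate result in a test.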

-memref-to-llvm-tbaa

Lower memref load and store operations to LLVM and add TBAA tags.

-memrefcpy-to-linalgcpy

Switch memref.copy to linalg.copy when the layout is not the identity.
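The decision this pass makes can be modelled as follows, assuming that "identity layout" means a zero offset with contiguous row-major strides. The helper names are hypothetical and the check is a simplification; the pass itself inspects the memref type rather than explicit stride lists.

```python
def row_major_strides(shape):
    """Strides (in elements) of a contiguous row-major buffer."""
    strides = []
    running = 1
    for dim in reversed(shape):
        strides.append(running)
        running *= dim
    return list(reversed(strides))


def is_identity_layout(shape, strides, offset=0):
    # Assumed definition: no offset and canonical row-major strides.
    return offset == 0 and list(strides) == row_major_strides(shape)


def choose_copy_op(shape, strides, offset=0):
    # Keep memref.copy for identity layouts, otherwise use linalg.copy.
    if is_identity_layout(shape, strides, offset):
        return "memref.copy"
    return "linalg.copy"
```

For example, a 2x3 buffer with strides `[3, 1]` keeps `memref.copy`, while a transposed view with strides `[1, 2]` would be switched to `linalg.copy`.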

-qnode-to-async-lowering

Lower QNode func and call operations to async func and call operations.

-register-inactive-callback

Register __catalyst_inactive_callback as inactive with Enzyme.

-scatter-lowering

Lower the scatter op from Stable HLO to loops.