Manual

The scope of problems this package intends to solve are computational problems whose structure is large and complex, but statically defined. For example, the Strassen matrix multiplication algorithm, implemented as a unit test in this package, has a static structure of slicing, matrix addition, and matrix multiplication, when the initial matrix size is known.

The rest of this page is a manual describing how to use this package. For a very simple example of a usage, you can refer to the Fibonacci implementation.

1. Building a DAG

The most central concept of this package is the Computable DAG. A DAG is a directed acyclic graph. For more information on the basics check out the wikipedia article. In the context of this package, every node in a CDAG represents a computation or a data transfer, and every edge represents a dependency between these nodes. For a valid CDAG, we can generate a function that computes all the compute nodes and produces the CDAGs result, hence, "computable".

To create a DAG, some steps are necessary:

A problem defining struct needs to be defined. This does not have to inherit from any type and is completely free in its definition. It can contain information about the parameters of the problem, for example the matrix size in the Strassen matrix multiplication example.
The types of compute steps need to be defined. They are represented by AbstractComputeTasks and can be defined easily using the @compute_task macro.
The input nodes of the CDAG must have a way of reading their value from the input to the actually callable function later. For this, these input nodes are required to have a name and a ComputableDAGs.input_expr must be defined to map from this input to the value of the node. To remove overhead from handling the strings at execution time, this is done by returning an expression that is later evaluated in the function.
ComputableDAGs.input_type needs to be implemented for the problem instance defined earlier, returning the expected type that the generated function will later be called with.
Implement the ComputableDAGs.graph interface function, creating and returning the CDAG. The macros @assemble_dag, @add_entry, and @add_call can be used for this.

2. Working with the CDAG

Once a CDAG has been created, it can be analyzed and optimized. For this, the concepts of Operations, AbstractEstimator, and AbstractOptimizers exist in this package. Operations can be pushed onto a CDAG, which applies it to the structure. Applied operations change the CDAG's structure, but not its computation. An estimator can estimate the required time (or some other metric, depending on its implementation) of a CDAG. These metrics may change by applying operations. Finally, these estimations and operations can be used together to optimize the CDAG.

If the intention is to use this package mostly as an overhead-free Dagger.jl alternative, this step may be skipped entirely.

3. Executing the CDAG

When the CDAG is ready, it can be compiled for the machine you're running on.

Note

Currently, as of version 0.2.4, only cpu_st (single threaded CPU) is working properly as machine definition.

Now, compute_function can be used to create a function that can be called on inputs. compute_function supports and in some cases requires keyword arguments, please refer to its documentation for more information.

Alternatively, GPU kernels can be generated by using kernel instead of compute_function. This is implemented for several GPU backends and produces a regular function for the given backend. Since RuntimeGeneratedFunctions.jl does not support GPU kernels at this time, this function will only be callable if the world age has been increased since its generation. Furthermore, the compute functions in the graph need to comply with all the normal requirements for GPU kernels, such as not calling dynamic functions.

Application repositories

The following repositories use this package productively:

QEDFeynmanDiagrams.jl: Compute differential cross-sections of scattering processes in perturbative QED. Successor of QEDFeynman.jl below.
deprecated QEDFeynman: Compute differential cross-sections of scattering processes in ABC and QED models.