IO Complexity lower and upper bounds

How does it work?

The IOUB tool takes as an input a yaml file describing the accesses and the loops of a program, and the sizes of the arrays and the caches. It produces:

  • A permutation of tiling loops that minimizes the volume of communication required, by maximizing the reuse
  • A symbolic expression of an upper bound on the volume of I/O required to perform a computation, parametrized by the tile sizes
  • Optimal tile sizes values that minimize this expression for the provided parameter values and cache sizes

Description of the input

The input is a yaml file following this convention:

  • dims : list of the dimensions of the program. Ex: "dims: [i, j, k]".
  • arrays : map associating each array with their access function and their number of occurrences (read and write). Ex: "arrays:\n   A: [ [i, j], 1]".
  • reuse: map associating each array with their reuse dimension. A reuse dimension of an array is a dimension along which the accessed memory location does not change. Ex: "reuse:\n   A: [k]".
  • values: a map containing various information about the sizes of the problem:
    • n_levels: number of cache levels
    • cache: list of sizes of the caches. It must have at least as many elements than "n_levels".
    • dims: list of dimensions of the problem. Their order corresponds to the one from "dims".

Description of the outputs

The output format is a list of information

  • Cache size (in words): list of the sizes of the caches. Information already in the input.
  • Permutation: the optimal tiled loop permutation, partitioned across levels of memory.
  • Footprint: symbolic expression of the footprint of the tile, below the level of cache saturation in the loop nest.
  • Analytical cost expression: symbolic expression of the volume of data to be transmitted at each level of cache. Cost function that combines these expressions.
  • Optimal sizes: tile sizes that minimizes the previous cost function.
  • Parameters: Reminder of the cache sizes and the problem sizes. Information already in the input.
  • Total Cost: Value of the cost function for the optimal tile sizes.
  • Data volume: Detailed repartitions of the volume data to be transmitted across each cache level.
  • Data footprint: Data kept by a tile, below the level of cache saturation in the loop nest.