fort()
creates an object (that inherits from class FastTransform) which represents a fast random
\(\mathbb{R}^{dim\_in} \to \mathbb{R}^{dim\_out}\) linear transform. This transform will be
orthonormal when \(dim\_in = dim\_out\) and they are a power of 2, and approximately
orthogonal or semi-orthogonal (in the sense that either \(W^T W \approx I_{dim\_in}\) or
\(W W^T \approx I_{dim\_out}\), if \(W\) represents the transform and \(I_n\) represents an
N-dimensional identity matrix) otherwise.
Arguments
- dim_in
Either a scalar indicating the input dimensionality, or a vector of length 2 indicating the input and output dimensionality of the transform (if dim_out is not specified).
- dim_out
A scalar indicating the output dimensionality of the transform (not required if the first parameter is a vector of length 2).
- type
A string indicating the type of transform to use (optional); current valid options are:
fft2
(i.e.default
).- cache_matrix
Logical that controls whether matrices are cached when as.matrix() is called; should be set to FALSE if saving memory is important (optional, default = TRUE).
- seed
If set, defines the seed used to generate the random transform (optional, default = NULL).
Value
An object of a class that inherits from class FastTransform and which represents a fast linear transform.
Details
The goal of fort()
is to provide an easy and efficient way of calculating fast orthogonal random
transforms (when dim_in
is the same as dim_out
) or semi-orthogonal transforms (when dim_in
is
different from dim_out
) within R, by using fast structured transforms (like the Fast Fourier
Transform or the Fast Walsh-Hadamard Transform) to avoid matrix multiplications, in the same spirit
as the Fastfood (Rahimi et al. (2007)), ACDC (Moczulski et al. (2015)), HD (Yu et al. (2016)) and
SD (Choromanski et al. (2017)) families of random structured transforms.
Internally, all fort
transforms assume a blocksize which must be a power of 2 and no smaller
than max(dim_in, dim_out)
. The resulting transform will be practically orthonormal when
\(dim\_in = dim\_out\) and they match the blocksize of the transform, and practically semi-orthogonal
when \(dim\_in \neq dim\_out\) and max(dim_in, dim_out)
matches the blocksize. Otherwise, these
properties will only approximately hold, since the output will result from a decimated
transform (i.e., the rows and columns of the transform should be decorrelated, but not necessarily
orthogonal).
fort
transform types
The specific type of transform returned depends on the value passed in the type
parameter,
but all methods rely on alternating between applying permutations (complexity \(O(N)\)),
diagonal scaling matrices (complexity \(O(N)\)) and structured fast linear transforms
(such as the Fast Fourier Transform or the Fast Walsh-Hadamard Transform, which can be
implemented with complexity \(O(N \mathrm{log} N)\)). Thus, it becomes possible to reduce the
complexity of transforming an \(\mathbb{R}^N\) vector from \(O(N^{2})\) (using matrix
multiplication) to \(O(N \mathrm{log} N)\).
Currently, the available options for the type
parameter are:
default
: this is the default option, if notype
is specified; currently, it assumes thefft2
type, but this is subject to change (so avoid this option in non-interactive usage);fft1
: this type offort
transform uses the Fast Fourier Transform as base transform (which is used once); for more technical details, see FastTransformFFT1.fft2
: this type offort
transform uses the Fast Fourier Transform as base transform (which is used twice); for more technical details, see FastTransformFFT2.
Using fort
transforms
In practice, to apply the fast transform to the columns of a matrix, you should use the %*%
operator
as if the output of fort()
was a matrix (e.g., fort(4,6) %*% matrix(1:12,4,3)
will output a 6 by 3
matrix that results from applying the transform on the left to the matrix on the right of the %*%
operator).
Objects generated by fort()
are also compatible with other methods applicable to matrix
objects,
such as dim()
, ncol()
, nrow()
, solve()
, t()
and det()
. Furthermore, these object can also
be easily converted to matrices (using as.matrix()
), if required.
References
Krzysztof M. Choromanski, Mark Rowland, and Adrian Weller. (2017). The unreasonable effectiveness of structured random orthogonal embeddings. Conference and Workshop on Neural Information Processing Systems. http://papers.neurips.cc/paper/6626-the-unreasonable-effectiveness-of-structured-random-orthogonal-embeddings
Felix Xinnan X. Yu, Ananda Theertha Suresh, Krzysztof M. Choromanski, Daniel N. Holtmann-Rice, and Sanjiv Kumar. (2016). Orthogonal random features. Conference and Workshop on Neural Information Processing Systems. http://papers.neurips.cc/paper/6246-orthogonal-random-features
Marcin Moczulski, Misha Denil, Jeremy Appleyard, and Nando de Freitas. (2015). ACDC: A structured efficient linear layer. https://arxiv.org/abs/1511.05946
Quoc Le, Tamás Sarlós and Alex Smola. (2013). Fastfood - approximating kernel expansions in loglinear time. International Conference on Machine Learning. https://proceedings.mlr.press/v28/le13-supp.pdf
See also
How to apply
fort
transforms:%*%.FastTransform
How to obtain a
fort
transform in matrix form:as.matrix.FastTransform()
How to invert
fort
transforms:solve.FastTransform()
How to access low-level functionality of
fort
transforms:FastTransform
Examples
fort(16) # a random orthogonal transform from R^16 to R^16
#> fort linear operation: R^16 -> [fft2] -> R^16
fort(5, 33) # a random transform from R^5 to R^33
#> fort linear operation: R^5 -> [R^64 -> fft2 -> R^64] -> R^33
fort(c(5, 33)) # same as previous line
#> fort linear operation: R^5 -> [R^64 -> fft2 -> R^64] -> R^33
# apply a random orthogonal transformation to the canonical R^4 basis
fort(4) %*% diag(4)
#> [,1] [,2] [,3] [,4]
#> [1,] 0.10670396 0.97471439 0.04746131 0.19050865
#> [2,] 0.91454452 -0.09463491 -0.38725963 0.06842912
#> [3,] 0.04746131 0.19050865 -0.10670396 -0.97471439
#> [4,] 0.38725963 -0.06842912 0.91454452 -0.09463491