Fast Convolution

A simple and fast algorithm to compute the convolution of an arbitrary sequence x with a sequence of a specific type, a. The number of steps for computing the convolution depends on a certain complexity of a and not on its length. This makes it feasible to convolve a sequence with very large kernels fast.

Uploaded by

gulanitoyo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

89 views2 pages

Fast Convolution

Uploaded by

gulanitoyo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 2

Fast Convolution

Michael Werman
School of Computer Science and Engineering
The Hebrew University of Jerusalem
Jerusalem 91904, Israel
werman@cs.huji.ac.il
Abstract
We present a very simple and fast algorithm to compute the convolution of an arbitrary sequence
x with a sequence of a specific type, a. The sequence a is any linear combination of polynomials,
exponentials and trigonometric terms. The number of steps for computing the convolution depends
on a certain complexity of a and not on its length, thus making it feasible to convolve a sequence
with very large kernels fast.

Computing the convolution (correlation, filter- depend on r), or equivalently, ar = di=1 αi ar+i .
ing) of a sequence x together with a fixed sequence For d smaller than log |m| this is faster and much
a is one of the ubiquitous operations in graphics, simpler than using FFT.
image and signal processing. Often the sequence Examples of such sequences are; d−1
a is a polynomial, exponential or trigonometric • polynomials of degree d − 1, ai = j=0 λj ij ,
function sampled at discrete points or a piecewise the
d LHE isj d
sum of such terms, such as, splines, or else the se- j=0 (−1) j ai+j = 0, this is of complexity d.
quence can can be well approximated with a few • ai = βλi , the LHE is λai − ai+1 = 0, this is
such terms. The computation of these convolutions of complexity2.
d−1
is usually computed straight from the definition • ai = λi j=0 αj ij , the LHE is
taking O(|x||a|) time or using a more complicated d
j d
d−j
j=0 (−1) j ai+j λ = 0, this is of complexity
FFT based O(|x| log |a|) time algorithm.
d.
Here we present a simple and fast algorithm
• ai = α sin(iθ) + β cos(iθ), the LHE is ai −
to compute the convolution of x1 , x2 , . . . , xn with
2 cos(θ)ai+1 + ai+2 = 0, this is of complexity 3.
am , am−1
m, . . . , a1 , namely y1 , y2 , . . . , yn−m where • ai = λi (α sin(iθ) + β cos(iθ)), the LHE is
yi = k=1 ak xi+k−1 . The number of steps of the 2
λ ai − 2λ cos(θ)ai+1 + ai+2 = 0, this is of com-
algorithm depends on a measure of complexity of
plexity 3.
a and not on m, its length. The number of steps to
Sums of above like terms also satisfy a linear
compute the convolution is O(dn) (m < n) where
homogeneous equation with a complexity that is
the sequence asatisfies a linear homogeneous equa-
d additive, such as ai = 3 sin(21πi/4)+(−2)i +i3 −4,
tion, (LHE), i=0 βi ar+i = 0 (where the β do not
this is of complexity 3 + 2 + 4 = 9.
The complete algorithm, consists of two steps;
Permission to make digital or hard copies of all or initialization and the running computation:
part of this work for personal or classroom use is
granted without fee provided that copies are not • for i=1 . . . d
made or distributed for profit or commercial ad- m
– yi = k=1 ak xi+k−1
vantage and that copies bear this notice and the
full citation on the first page. To copy otherwise, – Fd+1
d+1−i
= yi − d−i+1 ak xi+k−1
m+d−i+1 k=1
or republish, to post on servers or to redistribute + k=m ak xi+k−1 .
to lists, requires prior specific permission and/or
a fee. • for i = d + 1 . . . n − m
Journal of WSCG, Vol.11, No.1., ISSN 1213- d
– yi = Fi0 = j=1 αj Fij
6972 WSCG(92)2003, February 3-7, 2003, Plzen,
Czech Republic. Copyright UNION Agency (96) – for j = 1 . . . d
j
Science Press ∗ Fi+1 = Fij−1 −aj+1 xi +am+j+1 xm+i
m
The invariant is Fij = k=1 aj+k xi+k−1 . signals, images, whenever the function is separable
dThe correctness d of the malgorithm follows from, in x and y and if the 1-d functions are simple, for
2πj
α
j=1 j i F j
= j=1 jα k=1 aj+k xi+k−1 example, a(i, j) = 3 sin( 2πi 3 2
10 )(2j − j + cos( 20 )).
m d Then the amount of work for each output is just
= k=1 ( j=1 αj aj+k )xi+k−1
m the sum of the two complexities.
= k=1 ak xi+k−1 = yi .
There is no reason to save all the previous F ’s The family of 2-d functions that satisfy a linear
so that the extra memory requirement over the in- homogeneous equation is much larger than what
put/output is just O(d). can be computed with the method of separable
Notes: functions, as it is possible to add arbitrary 1-d
• This is a special case of a recursive filter[Smi97], functions of linear combinations of x and y for free.
whereit is easy to define The reason that we do not propose using a straight
d d the sequence a.
d+1−k forward generalization of the the 1-d algorithm is
yi = j=1 αj yi−j − k=1 ( l=1 al )xi−k
d d+1−k that the updating of the F ’s (in the algorithm) now
+ k=1 ( l=1 al )xm+i−k takes time proportional to the the perimeter of the
• Special cases of LHE’s with a fast algorithm √
signal which is now very large, (O( m) in 2-d).
such as the constant function (order 0 polynomial),
box filtering, [SBHC88], and exponential functions Experiments
have appeared before [Smi97, SBW02]. In order to test the algorithm we compared dif-
• The case of polynomials where the filter can ferent convolutions with different algorithms. The
be space variant was treated in [SBHC88, Hec86] algorithms were convolution using a straight for-
using repeated integration and differentiation. They ward implementation of the definition, convolution
gave slightly different formula as they used a con- based on FFT and our proposed fast convolution.
tinuous set up. The code was written in JAVA and only the FFT
Formulas equivalent to ours can be derived us- based algorithm was optimized, time is in ms.
ing finite differences m[MT33] and the summation N M d Conv FFT Fast
by parts formula: k=1 x(r + k)a(k)
d 16384 16 3 160 611 110
= l=0 (−1)l x−l (r + m + l)∆l a(m). 16384 128 3 1212 751 110
x (sum) is defined by, x0 (i) = x(i) and xp (i) =
i −l l 0
16384 1024 3 9273 731 81
j=0 xp−1 (j), and ∆ a(m) by ∆ a(m) = a(m) 16384 2048 3 18066 751 101
and ∆ a(m) = ∆ a(m + 1) − ∆ a(m).
p+1 p p 8186 16 5 140 120 70
∆d+1 a(m) = 0 whenever a(m) is a sequence of 8186 128 5 991 101 70
equally spaced samples of a polynomial of degree d. 8186 1024 5 8242 100 60
So that as in [SBHC88, Hec86] once the sequences 8186 2048 5 15332 90 50
xl and ∆l a(m) for l = 0..d are precomputed m
O(d) per element the computation of k=1 x(r +
in
References
k)a(k) takes only O(d) time. [Hec86] P. Heckbert. Filtering by repeated inte-
l gration. In Proceedings of SIGGRAPH,
∆ a(m) isx especially
l x
easy to compute using that
∆ n = n−l and that any polynomial can be pages 317–321, 1986.
x
written as βj j .
[MT33] L. M. Milne-Thomson. The Calculus of
• The complexity of convolving with a sequence
Finite Differences. Macmillan, 1933.
that is a piecewise sum of simple functions such as
a spline can be computed by adding up the contri- [SBHC88] P. Y. Simard, L. Bottou, P. Haffner,
butions of each of the pieces. and Y. Le Cun. Boxlets: a fast convolu-
• Almost exactly the same same formulas can tion algorithm for signal processing and
be used to compute the convolution for sequences neural networks. Neural Information
defined by linear equations that are not homoge- Processing Systems, NIPS 11, 1988.
neous. This saves a few arithmetic operations
d but
does not add any new functions, for if i=0 βi ar+i = [SBW02] H. Schweitzer, W. Bell, and F. Wu.
Very fast template matching. In ECCV,
d−1 two consecutive sums gives β0 ar −
c then subtracting
pages 358–372, 2002.
βd ar+d+1 + i=1 (βi+1 − βi)ar+i = 0 a LHS of or-
der d + 1. [Smi97] S. W. Smith. The Scientist and Engi-
• Of course the algorithm can be used succes- neer’s Guide to Digital Signal Process-
sively on rows then on columns for convolving 2-d1 , ing. California Technical Publishing,
1 or any dimension http://www.dspguide.com/, 1997.

Convolutions and The Discrete Fourier Transform: 3.1 The Cooley-Tukey Algorithm
No ratings yet
Convolutions and The Discrete Fourier Transform: 3.1 The Cooley-Tukey Algorithm
17 pages
Slides06 FFT
No ratings yet
Slides06 FFT
60 pages
FFT and Algebraic Computation
No ratings yet
FFT and Algebraic Computation
29 pages
Lab4 Report
No ratings yet
Lab4 Report
19 pages
5 FFT NP Randomized
No ratings yet
5 FFT NP Randomized
72 pages
Chapter 2 Supplymentary Notes
No ratings yet
Chapter 2 Supplymentary Notes
8 pages
Assignment 10 Linear and Circular Convolution: April 3, 2019
No ratings yet
Assignment 10 Linear and Circular Convolution: April 3, 2019
5 pages
Polynomial Multiplication via FFT
No ratings yet
Polynomial Multiplication via FFT
28 pages
Polymultiply
No ratings yet
Polymultiply
5 pages
Fast Convolution Cook Toom Algorithm
No ratings yet
Fast Convolution Cook Toom Algorithm
24 pages
Divide-and-Conquer: Polynomial Multiplication: FFT
No ratings yet
Divide-and-Conquer: Polynomial Multiplication: FFT
22 pages
Cook Toom Algorithm
No ratings yet
Cook Toom Algorithm
27 pages
DFT and FFT
No ratings yet
DFT and FFT
8 pages
3 Goertzel's Algorithm
100% (1)
3 Goertzel's Algorithm
41 pages
Lab4 - Report 2 2 2 2 2 1 9
No ratings yet
Lab4 - Report 2 2 2 2 2 1 9
9 pages
Advanced Fourier Analysis
No ratings yet
Advanced Fourier Analysis
6 pages
Polynomial Multiplication Via Fast Fourier Transforms
No ratings yet
Polynomial Multiplication Via Fast Fourier Transforms
7 pages
Lec07 FFT
No ratings yet
Lec07 FFT
14 pages
Algorithms For Efficient Computation of Convolution
No ratings yet
Algorithms For Efficient Computation of Convolution
30 pages
Tutorial3 Adv Writeup
No ratings yet
Tutorial3 Adv Writeup
7 pages
Fast Fourier Transforms
No ratings yet
Fast Fourier Transforms
19 pages
Topic22 DFT and FFT
No ratings yet
Topic22 DFT and FFT
7 pages
Fast Fourier Transform
100% (1)
Fast Fourier Transform
16 pages
Phys760 FinalFormulaSheet
No ratings yet
Phys760 FinalFormulaSheet
4 pages
Fast Fourier Transform (FFT) - Digital Signal Processing Lecture 6
No ratings yet
Fast Fourier Transform (FFT) - Digital Signal Processing Lecture 6
34 pages
Fast Digital Convolution Methods
No ratings yet
Fast Digital Convolution Methods
11 pages
DSP Simulation Assignment Using Octave
No ratings yet
DSP Simulation Assignment Using Octave
33 pages
Fast Fourier Transform: Key Papers in Computer Science Seminar 2005 Dima Batenkov
No ratings yet
Fast Fourier Transform: Key Papers in Computer Science Seminar 2005 Dima Batenkov
90 pages
CS 179: GPU Programming: Lecture 9 / Homework 3
No ratings yet
CS 179: GPU Programming: Lecture 9 / Homework 3
33 pages
Convolution Fourier Series in Detail
No ratings yet
Convolution Fourier Series in Detail
29 pages
Lab4 - Report 2 2 2 2
No ratings yet
Lab4 - Report 2 2 2 2
19 pages
Convolution, Continuous Harmonic Signals: Bounds of Sum
No ratings yet
Convolution, Continuous Harmonic Signals: Bounds of Sum
4 pages
Week 5
No ratings yet
Week 5
28 pages
Convolution, Fourier Series, and The Fourier Transform: CS414 - Spring 2007
0% (1)
Convolution, Fourier Series, and The Fourier Transform: CS414 - Spring 2007
29 pages
Convolution Theorem
No ratings yet
Convolution Theorem
4 pages
Sol 5
No ratings yet
Sol 5
6 pages
Chuong - 6 - Efficient Computation of The DFT Fast Fourier Transform Algorithms
No ratings yet
Chuong - 6 - Efficient Computation of The DFT Fast Fourier Transform Algorithms
67 pages
Polynomials FFT and DFT
No ratings yet
Polynomials FFT and DFT
17 pages
Chapter3 Solutions
No ratings yet
Chapter3 Solutions
5 pages
(J.W.cooley, J.W.tukey) An Algorithm For The Machine Calculation of Complex Fourier Series
No ratings yet
(J.W.cooley, J.W.tukey) An Algorithm For The Machine Calculation of Complex Fourier Series
6 pages
Circular Convolution & DFT Guide
No ratings yet
Circular Convolution & DFT Guide
2 pages
Fourier Transform for Theorists
No ratings yet
Fourier Transform for Theorists
9 pages
Isaac Williams 3164407 Assignsubmission File PDET1 3
No ratings yet
Isaac Williams 3164407 Assignsubmission File PDET1 3
12 pages
ELL319 Experiment 3
No ratings yet
ELL319 Experiment 3
4 pages
FFT and It's App
No ratings yet
FFT and It's App
8 pages
The FFT Artigo 2000
No ratings yet
The FFT Artigo 2000
5 pages
Fourier Series and Transforms Guide
No ratings yet
Fourier Series and Transforms Guide
7 pages
Fast Fourier Transform: XK Xne K N
No ratings yet
Fast Fourier Transform: XK Xne K N
44 pages
DSAP Lab 2
No ratings yet
DSAP Lab 2
11 pages
Assignment 3
No ratings yet
Assignment 3
29 pages
1981 Book FastFourierTransformAndConvolu
No ratings yet
1981 Book FastFourierTransformAndConvolu
260 pages
HO6 HW2Solutions
No ratings yet
HO6 HW2Solutions
11 pages
Daa CS F364 L04 05 Eval Interpolate Polynomials DFT FFT
No ratings yet
Daa CS F364 L04 05 Eval Interpolate Polynomials DFT FFT
29 pages
Daa - CS - F364 - L06 - Fractional Knapsack Problem
No ratings yet
Daa - CS - F364 - L06 - Fractional Knapsack Problem
29 pages
Benjamin Hope 3164568 Assignsubmission File PDET1 Benhope
No ratings yet
Benjamin Hope 3164568 Assignsubmission File PDET1 Benhope
9 pages
Free Ebooks Download
No ratings yet
Free Ebooks Download
33 pages
IB Physics IA Guide for Students
No ratings yet
IB Physics IA Guide for Students
27 pages
Collaborative Feedback Form
No ratings yet
Collaborative Feedback Form
2 pages
Practical Research Final Complete 1
No ratings yet
Practical Research Final Complete 1
15 pages
Development Communication Outline
No ratings yet
Development Communication Outline
9 pages
Gupta and Gupta Vol 1 - PDF
No ratings yet
Gupta and Gupta Vol 1 - PDF
604 pages
Is4s767 CW1M
No ratings yet
Is4s767 CW1M
7 pages
Reaction Paper in Forgery and Forged Document
No ratings yet
Reaction Paper in Forgery and Forged Document
3 pages
Acoustic-Induced-Vibration Case Study For WDDM Onshore Facilities
No ratings yet
Acoustic-Induced-Vibration Case Study For WDDM Onshore Facilities
5 pages
Human Capacity Development PDF
40% (5)
Human Capacity Development PDF
33 pages
Transmisionespt
100% (1)
Transmisionespt
66 pages
Chapter 1 Elementary Lesson Plans Social Studies Grade 5 Lesson 3
100% (1)
Chapter 1 Elementary Lesson Plans Social Studies Grade 5 Lesson 3
5 pages
Interpretable Machine Learning
100% (4)
Interpretable Machine Learning
251 pages
Binary and Shuttle Vector
100% (1)
Binary and Shuttle Vector
9 pages
PDU For TMA Catalog (Add 3DT) - Add CEQ V3.0 - 001 PDF
No ratings yet
PDU For TMA Catalog (Add 3DT) - Add CEQ V3.0 - 001 PDF
3 pages
2020 Powersports Catalog Hkqxey
No ratings yet
2020 Powersports Catalog Hkqxey
292 pages
List of Documents EU GDPR ISO 27001 2022 Integrated Documentation Toolkit EN
No ratings yet
List of Documents EU GDPR ISO 27001 2022 Integrated Documentation Toolkit EN
10 pages
Bc12 Manual Black and Decker
No ratings yet
Bc12 Manual Black and Decker
38 pages
Language & Brain-Neurolinguistics
No ratings yet
Language & Brain-Neurolinguistics
21 pages
Forests 16 00164
No ratings yet
Forests 16 00164
32 pages
Political Philosophy of Thomas Hobbes & State of Nature
100% (2)
Political Philosophy of Thomas Hobbes & State of Nature
22 pages
WBM G - II Tests
No ratings yet
WBM G - II Tests
15 pages
Cidam in Applied Economics
No ratings yet
Cidam in Applied Economics
10 pages
February 2023
No ratings yet
February 2023
2 pages
Jefferson - Catalogo de Solenoides
No ratings yet
Jefferson - Catalogo de Solenoides
8 pages
Instruments
No ratings yet
Instruments
31 pages
DipTrace Tutorial
No ratings yet
DipTrace Tutorial
134 pages
Day 4
No ratings yet
Day 4
6 pages
Header Linked List Basics
No ratings yet
Header Linked List Basics
13 pages
Ge Ethics Lesson 1 Ethics As A Science of Morality
No ratings yet
Ge Ethics Lesson 1 Ethics As A Science of Morality
42 pages
English7 - Lesson Exemplar - forDEMO
No ratings yet
English7 - Lesson Exemplar - forDEMO
5 pages

Fast Convolution

Uploaded by

Fast Convolution

Uploaded by

Fast Convolution

You might also like