Informer

(The formulation of this paper is not so clear.)

Problem Description

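The input at time $t$ is $\mathcal{X}^t = \{\mathbf{x}^t_1, \ldots, \mathbf{x}^t_{L_x} \mid \mathbf{x}^t_i \in \mathbb{R}^{d_x}\}$, and the output is to predict the corresponding sequence $\mathcal{Y}^t = \{\mathbf{y}^t_1, \ldots, \mathbf{y}^t_{L_y} \mid \mathbf{y}^t_i \in \mathbb{R}^{d_y}\}$.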
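As a rough illustration of this setup, the sketch below cuts a multivariate series into (input, target) pairs with a sliding window. The function name `make_windows` and the parameter names `L_x` (input length) and `L_y` (prediction length) are assumptions made here for illustration, and the sketch assumes inputs and targets share the same features; it is not the paper's data pipeline.

```python
import numpy as np

def make_windows(series, L_x, L_y):
    """Split a (T, d) series into inputs of length L_x and targets of length L_y.

    Hypothetical helper: returns X with shape (N, L_x, d) and
    Y with shape (N, L_y, d), where N = T - L_x - L_y + 1.
    """
    T = series.shape[0]
    N = T - L_x - L_y + 1
    if N <= 0:
        raise ValueError("series is too short for the requested window lengths")
    # Input window starts at t; the target window follows it directly.
    X = np.stack([series[t:t + L_x] for t in range(N)])
    Y = np.stack([series[t + L_x:t + L_x + L_y] for t in range(N)])
    return X, Y

series = np.arange(20, dtype=float).reshape(10, 2)  # T = 10 steps, d = 2 features
X, Y = make_windows(series, L_x=4, L_y=3)
print(X.shape, Y.shape)
```

In the long-sequence setting the point is that `L_y` is large, so the target window is a long sequence rather than a handful of steps.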

Solution abstraction

Long sequence time-series forecasting (LSTF) demands a high prediction capacity from the model. Transformers can do this job but have the following drawbacks:

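The quadratic computation of self-attention: time and memory per layer are $O(L^2)$ in the sequence length $L$.
The memory bottleneck in stacking layers for long inputs: $J$ stacked layers need $O(J \cdot L^2)$ memory in total.
The speed plunge in predicting long outputs: step-by-step (dynamic) decoding makes inference as slow as an RNN-based model.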

The paper proposes Informer to address these issues.
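To make the first two drawbacks concrete, here is a back-of-the-envelope estimate of the memory needed just to hold the attention score matrices of full self-attention. The function name, the default of 4 bytes per value (float32), and the layer and head counts in the loop are illustrative assumptions, not figures from the paper.

```python
def attention_score_bytes(seq_len, n_layers=1, n_heads=1, bytes_per_value=4):
    """Bytes needed to store the seq_len x seq_len score matrices
    for every head in every layer (activations only, batch size 1)."""
    return seq_len * seq_len * n_heads * n_layers * bytes_per_value

for L in (512, 2048, 8192):
    mib = attention_score_bytes(L, n_layers=3, n_heads=8) / 2**20
    print(f"L={L:5d}: {mib:8.1f} MiB")
```

Quadrupling the sequence length multiplies this cost by 16, and stacking layers multiplies it again by the number of layers.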

Methodology

Efficient Self-attention Mechanism

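$$\mathcal{A}(\mathbf{Q}, \mathbf{K}, \mathbf{V}) = \mathrm{Softmax}\left(\frac{\bar{\mathbf{Q}}\mathbf{K}^\top}{\sqrt{d}}\right)\mathbf{V}$$

where $\bar{\mathbf{Q}}$ is a sparse matrix of the same size as $\mathbf{Q}$ that contains only the Top-$u$ queries under the sparsity measurement $M(\mathbf{q}, \mathbf{K})$.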
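A minimal single-head NumPy sketch of this idea follows. It scores each query with the max-mean measurement (the maximum scaled dot product over all keys minus the mean), keeps the top `u = ceil(c * ln L_Q)` queries, and runs ordinary softmax attention only for those. Two simplifications are assumptions of this sketch rather than the paper's procedure: the measurement is computed against all keys instead of a sampled subset, so the sketch does not save any computation, and the non-selected queries simply output the mean of `V`. The names `probsparse_attention` and `c` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def probsparse_attention(Q, K, V, c=5.0):
    """Q: (L_Q, d), K: (L_K, d), V: (L_K, d_v). Returns (output, selected query indices)."""
    L_Q, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                  # full (L_Q, L_K) scores, for clarity only
    M = scores.max(axis=1) - scores.mean(axis=1)   # max-mean sparsity measurement per query
    u = min(L_Q, max(1, int(np.ceil(c * np.log(L_Q)))))
    top = np.argsort(-M)[:u]                       # indices of the u most "active" queries
    out = np.tile(V.mean(axis=0), (L_Q, 1))        # assumed fallback for the other queries
    out[top] = softmax(scores[top]) @ V            # regular attention for the selected queries
    return out, top

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(32, 8)), rng.normal(size=(32, 8)), rng.normal(size=(32, 4))
out, top = probsparse_attention(Q, K, V)
print(out.shape, len(top))
```

Because only about `c * ln L_Q` rows of the score matrix are passed through the softmax, the attention step itself scales as $O(L \ln L)$ rather than $O(L^2)$ once the measurement is approximated cheaply.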

Encoder

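$$\mathbf{X}^t_{j+1} = \mathrm{MaxPool}\left(\mathrm{ELU}\left(\mathrm{Conv1d}\left([\mathbf{X}^t_j]_{AB}\right)\right)\right)$$

where $[\cdot]_{AB}$ contains the multi-head ProbSparse self-attention and the essential operations in the attention block.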
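The distilling step between encoder layers, as described in the Informer paper, passes the attention block's output through a 1-D convolution over time (kernel width 3), an ELU activation, and a max-pooling layer with stride 2, which halves the sequence length. The NumPy sketch below illustrates that shape behaviour. The zero padding, the pooling window of 3 with padding 1, the random weights, and all function names are assumptions of this sketch, and normalization layers are left out.

```python
import numpy as np

def conv1d_same(x, w, b):
    """x: (L, d_in), w: (3, d_in, d_out), b: (d_out,). Zero padding keeps length L."""
    L = x.shape[0]
    xp = np.pad(x, ((1, 1), (0, 0)))
    # Sum the contribution of each of the 3 kernel taps along the time axis.
    return sum(xp[k:k + L] @ w[k] for k in range(3)) + b

def elu(x):
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0)))

def max_pool_stride2(x):
    """Max-pool over time with window 3, stride 2, padding 1 (assumed settings)."""
    L = x.shape[0]
    xp = np.pad(x, ((1, 1), (0, 0)), constant_values=-np.inf)
    n_out = (L - 1) // 2 + 1
    return np.stack([xp[2 * i:2 * i + 3].max(axis=0) for i in range(n_out)])

def distill(x_ab, w, b):
    """x_ab stands for the attention block output [X_j]_AB, shape (L, d)."""
    return max_pool_stride2(elu(conv1d_same(x_ab, w, b)))

rng = np.random.default_rng(0)
x = rng.normal(size=(96, 16))               # L = 96 time steps, d = 16 features
w = rng.normal(size=(3, 16, 16)) * 0.1      # random stand-in for learned weights
b = np.zeros(16)
print(distill(x, w, b).shape)               # time axis shrinks from 96 to 48
```

Applying `distill` after every encoder layer shrinks the time axis geometrically, which is how the encoder keeps memory manageable for long inputs.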

Decoder

References

[1] Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., and Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. arXiv preprint arXiv:2012.07436, 2020.

Ye Yuan

Note on "Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting"