Tensors for Data Processing: Theory, Methods, and Applications
Tensors for Data Processing: Theory, Methods, and Applications presents both classical and state-of-the-art methods of tensor computation for data processing, covering computational theory, processing methods, and computing and engineering applications, with an emphasis on data-processing techniques.
Saved in:
| Main author: | |
|---|---|
| Format: | E-book |
| Language: | English |
| Published: | Chantilly: Elsevier Science & Technology / Academic Press, 2021 |
| Edition: | 1 |
| Subject: | |
| ISBN: | 9780128244470, 012824447X |
| Online access: | Get full text |
Contents:
- Front Cover -- Tensors for Data Processing -- Copyright -- Contents -- List of contributors -- Preface -- 1 Tensor decompositions: computations, applications, and challenges -- 1.1 Introduction -- 1.1.1 What is a tensor? -- 1.1.2 Why do we need tensors? -- 1.2 Tensor operations -- 1.2.1 Tensor notations -- 1.2.2 Matrix operators -- 1.2.3 Tensor transformations -- 1.2.4 Tensor products -- 1.2.5 Structural tensors -- 1.2.6 Summary -- 1.3 Tensor decompositions -- 1.3.1 Tucker decomposition -- 1.3.2 Canonical polyadic decomposition -- 1.3.3 Block term decomposition -- 1.3.4 Tensor singular value decomposition -- 1.3.5 Tensor network -- 1.3.5.1 Hierarchical Tucker decomposition -- 1.3.5.2 Tensor train decomposition -- 1.3.5.3 Tensor ring decomposition -- 1.3.5.4 Other variants -- 1.4 Tensor processing techniques -- 1.5 Challenges -- References -- 2 Transform-based tensor singular value decomposition in multidimensional image recovery -- 2.1 Introduction -- 2.2 Recent advances of the tensor singular value decomposition -- 2.2.1 Preliminaries and basic tensor notations -- 2.2.2 The t-SVD framework -- 2.2.3 Tensor nuclear norm and tensor recovery -- 2.2.4 Extensions -- 2.2.4.1 Nonconvex surrogates -- 2.2.4.2 Additional prior knowledge -- 2.2.4.3 Multiple directions and higher-order tensors -- 2.2.5 Summary -- 2.3 Transform-based t-SVD -- 2.3.1 Linear invertible transform-based t-SVD -- 2.3.2 Beyond invertibility and data adaptivity -- 2.4 Numerical experiments -- 2.4.1 Examples within the t-SVD framework -- 2.4.2 Examples of the transform-based t-SVD -- 2.5 Conclusions and new guidelines -- References -- 3 Partensor -- 3.1 Introduction -- 3.1.1 Related work -- 3.1.2 Notation -- 3.2 Tensor decomposition -- 3.2.1 Matrix least-squares problems -- 3.2.1.1 The unconstrained case -- 3.2.1.2 The nonnegative case -- 3.2.1.3 The orthogonal case
- 3.2.2 Alternating optimization for tensor decomposition -- 3.3 Tensor decomposition with missing elements -- 3.3.1 Matrix least-squares with missing elements -- 3.3.1.1 The unconstrained case -- 3.3.1.2 The nonnegative case -- 3.3.2 Tensor decomposition with missing elements: the unconstrained case -- 3.3.3 Tensor decomposition with missing elements: the nonnegative case -- 3.3.4 Alternating optimization for tensor decomposition with missing elements -- 3.4 Distributed memory implementations -- 3.4.1 Some MPI preliminaries -- 3.4.1.1 Communication domains and topologies -- 3.4.1.2 Synchronization among processes -- 3.4.1.3 Point-to-point communication operations -- 3.4.1.4 Collective communication operations -- 3.4.1.5 Derived data types -- 3.4.2 Variable partitioning and data allocation -- 3.4.2.1 Communication domains -- 3.4.3 Tensor decomposition -- 3.4.3.1 The unconstrained and the nonnegative case -- 3.4.3.2 The orthogonal case -- 3.4.3.3 Factor normalization and acceleration -- 3.4.4 Tensor decomposition with missing elements -- 3.4.4.1 The unconstrained case -- 3.4.4.2 The nonnegative case -- 3.4.5 Some implementation details -- 3.5 Numerical experiments -- 3.5.1 Tensor decomposition -- 3.5.2 Tensor decomposition with missing elements -- 3.6 Conclusion -- Acknowledgment -- References -- 4 A Riemannian approach to low-rank tensor learning -- 4.1 Introduction -- 4.2 A brief introduction to Riemannian optimization -- 4.2.1 Riemannian manifolds -- 4.2.1.1 Riemannian gradient -- 4.2.1.2 Retraction -- 4.2.2 Riemannian quotient manifolds -- 4.2.2.1 Riemannian gradient on quotient manifold -- 4.2.2.2 Retraction on quotient manifold -- 4.3 Riemannian Tucker manifold geometry -- 4.3.1 Riemannian metric and quotient manifold structure -- 4.3.1.1 The symmetry structure in Tucker decomposition -- 4.3.1.2 A metric motivated by a particular cost function
- 4.3.1.3 A novel Riemannian metric -- 4.3.2 Characterization of the induced spaces -- 4.3.2.1 Characterization of the normal space -- 4.3.2.2 Decomposition of tangent space into vertical and horizontal spaces -- 4.3.3 Linear projectors -- 4.3.3.1 The tangent space projector -- 4.3.3.2 The horizontal space projector -- 4.3.4 Retraction -- 4.3.5 Vector transport -- 4.3.6 Computational cost -- 4.4 Algorithms for tensor learning problems -- 4.4.1 Tensor completion -- 4.4.2 General tensor learning -- 4.5 Experiments -- 4.5.1 Choice of metric -- 4.5.2 Low-rank tensor completion -- 4.5.2.1 Small-scale instances -- 4.5.2.2 Large-scale instances -- 4.5.2.3 Low sampling instances -- 4.5.2.4 Ill-conditioned and low sampling instances -- 4.5.2.5 Noisy instances -- 4.5.2.6 Skewed dimensional instances -- 4.5.2.7 Ribeira dataset -- 4.5.2.8 MovieLens 10M dataset -- 4.5.3 Low-rank tensor regression -- 4.5.4 Multilinear multitask learning -- 4.6 Conclusion -- References -- 5 Generalized thresholding for low-rank tensor recovery: approaches based on model and learning -- 5.1 Introduction -- 5.2 Tensor singular value thresholding -- 5.2.1 Proximity operator and generalized thresholding -- 5.2.2 Tensor singular value decomposition -- 5.2.3 Generalized matrix singular value thresholding -- 5.2.4 Generalized tensor singular value thresholding -- 5.3 Thresholding based low-rank tensor recovery -- 5.3.1 Thresholding algorithms for low-rank tensor recovery -- 5.3.2 Generalized thresholding algorithms for low-rank tensor recovery -- 5.4 Generalized thresholding algorithms with learning -- 5.4.1 Deep unrolling -- 5.4.2 Deep plug-and-play -- 5.5 Numerical examples -- 5.6 Conclusion -- References -- 6 Tensor principal component analysis -- 6.1 Introduction -- 6.2 Notations and preliminaries -- 6.2.1 Notations -- 6.2.2 Discrete Fourier transform -- 6.2.3 T-product -- 6.2.4 Summary
- 6.3 Tensor PCA for Gaussian-noisy data -- 6.3.1 Tensor rank and tensor nuclear norm -- 6.3.2 Analysis of tensor PCA on Gaussian-noisy data -- 6.3.3 Summary -- 6.4 Tensor PCA for sparsely corrupted data -- 6.4.1 Robust tensor PCA -- 6.4.1.1 Tensor incoherence conditions -- 6.4.1.2 Exact recovery guarantee of R-TPCA -- 6.4.1.3 Optimization algorithm -- 6.4.2 Tensor low-rank representation -- 6.4.2.1 Tensor linear representation -- 6.4.2.2 TLRR for data clustering -- 6.4.2.3 TLRR for exact data recovery -- 6.4.2.4 Optimization technique -- 6.4.2.5 Dictionary construction -- 6.4.3 Applications -- 6.4.3.1 Application to data recovery -- 6.4.3.2 Application to data clustering -- 6.4.4 Summary -- 6.5 Tensor PCA for outlier-corrupted data -- 6.5.1 Outlier robust tensor PCA -- 6.5.1.1 Formulation of OR-TPCA -- 6.5.1.2 Exact subspace recovery guarantees -- 6.5.1.3 Optimization -- 6.5.2 The fast OR-TPCA algorithm -- 6.5.2.1 Sketch of fast OR-TPCA -- 6.5.2.2 Guarantees for fast OR-TPCA -- 6.5.3 Applications -- 6.5.3.1 Evaluation on synthetic data -- 6.5.3.2 Evaluation on real applications -- 6.5.3.3 Outlier detection -- 6.5.3.4 Unsupervised and semi-supervised learning -- 6.5.3.5 Experiments on fast OR-TPCA -- 6.5.4 Summary -- 6.6 Other tensor PCA methods -- 6.7 Future work -- 6.8 Summary -- References -- 7 Tensors for deep learning theory -- 7.1 Introduction -- 7.2 Bounding a function's expressivity via tensorization -- 7.2.1 A measure of capacity for modeling input dependencies -- 7.2.2 Bounding correlations with tensor matricization ranks -- 7.3 A case study: self-attention networks -- 7.3.1 The self-attention mechanism -- 7.3.1.1 The operation of a self-attention layer -- 7.3.1.2 Partition invariance of the self-attention separation rank -- 7.3.2 Self-attention architecture expressivity questions -- 7.3.2.1 The depth-to-width interplay in self-attention
- 7.3.2.2 The input embedding rank bottleneck in self-attention -- 7.3.2.3 Mid-architecture rank bottlenecks in self-attention -- 7.3.3 Results on the operation of self-attention -- 7.3.3.1 The effect of depth in self-attention networks -- 7.3.3.2 The effect of bottlenecks in self-attention networks -- 7.3.4 Bounding the separation rank of self-attention -- 7.3.4.1 An upper bound on the separation rank -- 7.3.4.2 A lower bound on the separation rank -- 7.4 Convolutional and recurrent networks -- 7.4.1 The operation of convolutional and recurrent networks -- 7.4.2 Addressed architecture expressivity questions -- 7.4.2.1 Depth efficiency in convolutional and recurrent networks -- 7.4.2.2 Further results on convolutional networks -- 7.5 Conclusion -- References -- 8 Tensor network algorithms for image classification -- 8.1 Introduction -- 8.2 Background -- 8.2.1 Tensor basics -- 8.2.2 Tensor decompositions -- 8.2.2.1 Rank-1 tensor decomposition -- 8.2.2.2 Canonical polyadic decomposition -- 8.2.2.3 Tucker decomposition -- 8.2.2.4 Tensor train decomposition -- 8.2.3 Support vector machines -- 8.2.4 Logistic regression -- 8.3 Tensorial extensions of support vector machine -- 8.3.1 Supervised tensor learning -- 8.3.2 Support tensor machines -- 8.3.2.1 Methodology -- 8.3.2.2 Examples -- 8.3.2.3 Conclusion -- 8.3.3 Higher-rank support tensor machines -- 8.3.3.1 Methodology -- 8.3.3.2 Complexity analysis -- 8.3.3.3 Examples -- 8.3.3.4 Conclusion -- 8.3.4 Support Tucker machines -- 8.3.4.1 Methodology -- 8.3.4.2 Examples -- 8.3.5 Support tensor train machines -- 8.3.5.1 Methodology -- 8.3.5.2 Complexity analysis -- 8.3.5.3 Effect of TT ranks on STTM classification -- 8.3.5.4 Updating in site-k-mixed-canonical form -- 8.3.5.5 Examples -- 8.3.5.6 Conclusion -- 8.3.6 Kernelized support tensor train machines -- 8.3.6.1 Methodology
- 8.3.6.2 Kernel validity of K-STTM

