Repository: valeman/Transformers_And_LLM_Are_What_You_Dont_Need
Branch: main
Commit: 6e212334b5b4
# Transformers_And_LLM_Are_What_You_Dont_Need
By far the best and only repository showing why transformers don’t work in time series forecasting 

## Star History
![GitHub Repo stars](https://img.shields.io/github/stars/valeman/Transformers_And_LLM_Are_What_You_Dont_Need?style=social)


[![Star History Chart](https://api.star-history.com/svg?repos=valeman/Transformers_And_LLM_Are_What_You_Dont_Need&type=Date)](https://www.star-history.com/#valeman/Transformers_And_LLM_Are_What_You_Dont_Need&Date)

<a href="https://www.buymeacoffee.com/valeman" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/default-orange.png" alt="Buy Me A Coffee" height="41" width="174"></a>



## Table of Contents

* [Videos](#videos)

* [PhD and MSc Theses](#theses)

* [Papers](#papers)

* [Articles](#articles)

## Videos
1. [Problems in the current research on forecasting with transformers, foundational models, etc.](https://www.youtube.com/watch?v=vNul_AjRPFw&t=1084s) by Christoph Bergmeir

## Theses
1. [Cotton Price Long-Term Time Series Forecasting: A look at Transformers Suitability](https://repository.eafit.edu.co/server/api/core/bitstreams/27739bb8-7237-498f-8fba-ddb4ed6ca4fe/content)

## Papers
1. [Are Transformers Effective for Time Series Forecasting?](https://arxiv.org/abs/2205.13504) by Ailing Zeng, Muxi Chen, Lei Zhang, Qiang Xu (The Chinese University of Hong Kong, International Digital Economy Academy (IDEA), 2022) [code](https://github.com/cure-lab/LTSF-Linear) 🔥🔥🔥🔥🔥
2. [LLMs and foundational models for time series forecasting: They are not (yet) as good as you may hope](https://www.linkedin.com/pulse/llms-foundational-models-time-series-forecasting-yet-good-bergmeir-bprwf) by Christoph Bergmeir (2023) 🔥🔥🔥🔥🔥
3. [Transformers Are What You Do Not Need](https://medium.com/@valeman/transformers-are-what-you-do-not-need-cf16a4c13ab7) by Valeriy Manokhin (2023) 🔥🔥🔥🔥🔥
4. [Time Series Foundational Models: Their Role in Anomaly Detection and Prediction](https://arxiv.org/abs/2412.19286v1) (2024) [code](https://github.com/smtmnfg/TSFM)
5. [Deep Learning is What You Do Not Need](https://medium.com/@valeman/-86655805a676) by Valeriy Manokhin (2022) 🔥🔥🔥🔥🔥
6. [Why do Transformers suck at Time Series Forecasting](https://machine-learning-made-simple.medium.com/why-do-transformers-suck-at-time-series-forecasting-46ae3a4d6b11) by Devansh (2023)
7. [Frequency-domain MLPs are More Effective Learners in Time Series Forecasting](https://arxiv.org/abs/2311.06184) by Kun Yi, Qi Zhang, Wei Fan, Shoujin Wang, Pengyang Wang, Hui He, Defu Lian, Ning An, Longbing Cao, Zhendong Niu (Beijing Institute of Technology, Tongji University, University of Oxford, University of Technology Sydney, University of Macau, HeFei University of Technology, Macquarie University) (2023) 🔥🔥🔥🔥🔥
8. [Forecasting CPI inflation under economic policy and geo-political uncertainties](https://arxiv.org/abs/2401.00249) by Shovon Sengupta, Tanujit Chakraborty, Sunny Kumar Singh (Fidelity Investments, Sorbonne University, BITS Pilani Hyderabad). (2024) 🔥🔥🔥🔥🔥
9. [Revisiting Long-term Time Series Forecasting: An Investigation on Linear Mapping](https://arxiv.org/abs/2305.10721) by Zhe Li, Shiyi Qi, Yiduo Li, Zenglin Xu (Harbin Institute of Technology, Shenzhen, 2023) [code](https://github.com/plumprc/RTSF)
10. [SCINet: Time Series Modeling and Forecasting with Sample Convolution and Interaction](https://arxiv.org/abs/2106.09305) by Minhao Liu, Ailing Zeng, Muxi Chen, Zhijian Xu, Qiuxia Lai, Lingna Ma, Qiang Xu (The Chinese University of Hong Kong,2022) [code](https://github.com/cure-lab/SCINet)
11. [WinNet: Time Series Forecasting with a Window-Enhanced Period Extracting and Interacting](https://arxiv.org/pdf/2311.00214.pdf) by Wenjie Ou, Dongyue Guo, Zheng Zhang, Zhishuo Zhao, Yi Lin (Sichuan University, China, 2023)
12. [A Multi-Scale Decomposition MLP-Mixer for Time Series Analysis](https://arxiv.org/abs/2310.11959) by Shuhan Zhong, Sizhe Song, Guanyao Li, Weipeng Zhuo, Yang Liu, S.-H. Gary Chan (The Hong Kong University of Science and Technology, Hong Kong, 2023) [code](https://github.com/zshhans/MSD-Mixer) 🔥🔥🔥🔥🔥
15. [TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis](https://arxiv.org/abs/2210.02186) by Haixu Wu, Tengge Hu, Yong Liu, Hang Zhou, Jianmin Wang, Mingsheng Long (Tsinghua University, 2023) [code](https://github.com/thuml/TimesNet) 🔥🔥🔥🔥🔥
16. [MTS-Mixers: Multivariate Time Series Forecasting via Factorized Temporal and Channel Mixing](https://arxiv.org/abs/2302.04501) [code](https://github.com/plumprc/MTS-Mixers) 🔥🔥🔥🔥🔥
17. [Reversible Instance Normalization for Accurate Time-Series Forecasting against Distribution Shift](https://openreview.net/forum?id=cGDAkQo1C0p) by Taesung Kim, Jinhee Kim, Yunwon Tae, Cheonbok Park, Jang-Ho Choi, Jaegul Choo (Kaist AI, Vuno, Naver Corp, ETRI, ICLR 2022) [code](https://github.com/ts-kim/RevIN) [project page](https://seharanul17.github.io/RevIN/) 🔥🔥🔥🔥🔥
18. [WinNet: Time Series Forecasting with a Window-Enhanced Period Extracting and Interacting](https://arxiv.org/abs/2311.00214) by Wenjie Ou, Dongyue Guo, Zheng Zhang, Zhishuo Zhao, Yi Lin (College of Computer Science, Sichuan University, China, 2023) 🔥🔥🔥🔥🔥
19. [Mlinear: Rethink the Linear Model for Time-series Forecasting](https://arxiv.org/abs/2305.04800) by Wei Li, Xiangxu Meng, Chuhao Chen and Jianing Chen (Harbin Engineering University, 2023) 🔥🔥🔥🔥🔥
20. [Minimalist Traffic Prediction: Linear Layer Is All You Need](https://arxiv.org/abs/2308.10276) by Wenying Duan, Hong Rao, Wei Huang, Xiaoxi He (Nanchang University, University of Macau, 2023)
21. [Frequency-domain MLPs are More Effective Learners in Time Series Forecasting](https://arxiv.org/abs/2311.06184) by Kun Yi, Qi Zhang, Wei Fan, Shoujin Wang, Pengyang Wang, Hui He, Defu Lian, Ning An, Longbing Cao, Zhendong Niu (Beijing Institute of Technology, Tongji University, University of Oxford, University of Technology Sydney, University of Macau, USTC, HeFei University of Technology, Macquarie University, 2023) [code](https://github.com/aikunyi/FreTS) 🔥🔥🔥🔥🔥
22. [An End-to-End Time Series Model for Simultaneous Imputation and Forecast](https://arxiv.org/abs/2306.00778) by Trang H. Tran, Lam M. Nguyen, Kyongmin Yeo, Nam Nguyen, Dzung Phan, Roman Vaculin, Jayant Kalagnanam (School of Operations Research and Information Engineering, Cornell University; IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, USA, 2023) 🔥🔥🔥🔥🔥
23. [Long-term Forecasting with TiDE: Time-series Dense Encoder](https://arxiv.org/abs/2304.08424) by Abhimanyu Das, Weihao Kong, Andrew Leach, Shaan Mathur, Rajat Sen, Rose Yu (Google Cloud, University of California, San Diego, 2023)
24. [TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting](https://arxiv.org/abs/2306.09364) by Vijay Ekambaram, Arindam Jati, Nam Nguyen, Phanwadee Sinthong, Jayant Kalagnanam (IBM Research, 2023) [code](https://huggingface.co/docs/transformers/main/en/model_doc/patchtsmixer) [code](https://github.com/IBM/tsfm/blob/main/wiki.md)
25. [Koopa: Learning Non-stationary Time Series Dynamics with Koopman Predictors](https://arxiv.org/abs/2305.18803) by Yong Liu, Chenyu Li, Jianmin Wang, Mingsheng Long (Tsinghua University, 2023) [code](https://github.com/thuml/Time-Series-Library/blob/main/models/Koopa.py) 🔥🔥🔥🔥🔥
26. [Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective](https://arxiv.org/abs/2402.11463) by Jiaxi Hu, Yuehong Hu, Wei Chen, Ming Jin, Shirui Pan, Qingsong Wen, Yuxuan Liang (2024) 🔥🔥🔥🔥🔥
27. [When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting](https://arxiv.org/abs/2402.12767) (2024) 🔥🔥🔥🔥🔥
28. [Deep Coupling Network For Multivariate Time Series Forecasting](https://arxiv.org/abs/2402.15134) (2024)
29. [Linear Dynamics-embedded Neural Network for Long-Sequence Modeling](https://arxiv.org/abs/2402.15290) by Tongyi Liang and Han-Xiong Li (City University of Hong Kong, 2024).
30. [PDETime: Rethinking Long-Term Multivariate Time Series Forecasting from the perspective of partial differential equations](https://arxiv.org/abs/2402.16913) (2024)
31. [CATS: Enhancing Multivariate Time Series Forecasting by Constructing Auxiliary Time Series as Exogenous Variables](https://arxiv.org/abs/2403.01673) (2024) 🔥🔥🔥🔥🔥
32. [Is Mamba Effective for Time Series Forecasting?](https://arxiv.org/abs/2403.11144) [code](https://github.com/wzhwzhwzh0921/S-D-Mamba) (2024) 🔥🔥🔥🔥🔥
33. [STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model](https://arxiv.org/abs/2403.12418) (2024)
34. [TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting](https://arxiv.org/abs/2403.09898) [code](https://github.com/Atik-Ahamed/TimeMachine) (2024) 🔥🔥🔥🔥🔥
35. [FITS: Modeling Time Series with 10k Parameters](https://arxiv.org/abs/2307.03756) [code](https://github.com/VEWOXIC/FITS) (2023)
36. [TSLANet: Rethinking Transformers for Time Series Representation Learning](https://arxiv.org/abs/2404.08472) [code](https://github.com/emadeldeen24/TSLANet) (2024) 🔥🔥🔥🔥🔥
37. [WFTNet: Exploiting Global and Local Periodicity in Long-term Time Series Forecasting](https://arxiv.org/abs/2309.11319) [code](https://github.com/Hank0626/WFTNet) (2024) 🔥🔥🔥🔥🔥
38. [SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series](https://arxiv.org/abs/2403.15360) [code](https://github.com/badripatro/Simba) (2024) 🔥🔥🔥🔥🔥
39. [SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion](https://arxiv.org/abs/2404.14197) [code](https://github.com/Secilia-Cxy/SOFTS) (2024) 🔥🔥🔥🔥🔥
40. [Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting](https://arxiv.org/abs/2404.14757) [code](https://github.com/XiongxiaoXu/Mambaformer-in-Time-Series) (2024) 🔥🔥🔥🔥🔥
41. [SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters](https://arxiv.org/abs/2405.00946) (2024) 🔥🔥🔥🔥🔥
42. [Boosting MLPs with a Coarsening Strategy for Long-Term Time Series Forecasting](https://arxiv.org/abs/2405.03199) (2024) 🔥🔥🔥🔥🔥
43. [Multi-Scale Dilated Convolution Network for Long-Term Time Series Forecasting](https://arxiv.org/abs/2405.05499) (2024)
44. [ModernTCN: A Modern Pure Convolution Structure for General Time Series Analysis](https://openreview.net/forum?id=vpJMJerXHU#) [code](https://github.com/luodhhh/ModernTCN) (ICLR 2024 Spotlight)
45. [Adaptive Extraction Network for Multivariate Long Sequence Time-Series Forecasting](https://arxiv.org/abs/2405.12038) (2024) 🔥🔥🔥🔥🔥
46. [Interpretable Multivariate Time Series Forecasting Using Neural Fourier Transform](https://arxiv.org/abs/2405.13812) (2024) 🔥🔥🔥🔥🔥
47. [Periodicity Decoupling Framework for Long-Term Series Forecasting](https://openreview.net/pdf?id=dp27P5HBBt) [code](https://github.com/Hank0626/PDF) (2024) 🔥🔥🔥🔥🔥
48. [Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models](https://arxiv.org/abs/2406.04320) (2024) 🔥🔥🔥🔥🔥
49. [Time Evidence Fusion Network: Multi-source View in Long-Term Time Series Forecasting](https://arxiv.org/abs/2405.06419) [code](https://github.com/ztxtech/Time-Evidence-Fusion-Network) (2024)
50. [ATFNet: Adaptive Time-Frequency Ensembled Network for Long-term Time Series Forecasting](https://arxiv.org/abs/2404.05192) [code](https://github.com/yhyhyhyhyhy/atfnet) (2024) 🔥🔥🔥🔥
51. [C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting](https://arxiv.org/abs/2406.05316) (2024) 🔥🔥🔥🔥
52. [The Power of Minimalism in Long Sequence Time-series Forecasting](https://openreview.net/pdf?id=hF8jnnexSB)
53. [WindowMixer: Intra-Window and Inter-Window Modeling for Time Series Forecasting](https://arxiv.org/abs/2406.12921)
54. [xLSTMTime: Long-term Time Series Forecasting With xLSTM](https://arxiv.org/abs/2407.10240) [code](https://github.com/muslehal/xLSTMTime) (2024)
55. [Not All Frequencies Are Created Equal: Towards a Dynamic Fusion of Frequencies in Time-Series Forecasting](https://arxiv.org/abs/2407.12415) (2024) 🔥🔥🔥🔥
56. [FMamba: Mamba based on Fast-attention for Multivariate Time-series Forecasting](https://arxiv.org/abs/2407.14814) (2024)
57. [Long Input Sequence Network for Long Time Series Forecasting](https://arxiv.org/abs/2407.15869) (2024)
58. [Time-series Forecasting with Tri-Decomposition Linear-based Modelling and Series-wise Metrics](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4913290) (2024) 🔥🔥🔥🔥
59. [An Evaluation of Standard Statistical Models and LLMs on Time Series Forecasting](https://arxiv.org/abs/2408.04867) (2024) LLM 🔥🔥🔥🔥
60. [Macroeconomic Forecasting with Large Language Models](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4881094) (2024) LLM 🔥🔥🔥🔥
61. [Language Models Still Struggle to Zero-shot Reason about Time Series](https://arxiv.org/abs/2404.11757) (2024) LLM 🔥🔥🔥🔥
62. [KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting?](https://arxiv.org/abs/2408.11306) (2024) [code](https://github.com/2448845600/EasyTSF)
63. [Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting](https://arxiv.org/abs/2408.12068) (2024)
64. [Transformers are Expressive, But Are They Expressive Enough for Regression?](https://arxiv.org/abs/2402.15478) (2024) paper showing that transformers cannot approximate smooth functions
65. [MixLinear: Extreme Low Resource Multivariate Time Series Forecasting with 0.1K Parameters](https://arxiv.org/abs/2410.02081)
66. [MMFNet: Multi-Scale Frequency Masking Neural Network for Multivariate Time Series Forecasting](https://arxiv.org/abs/2410.02070)
67. [Neural Fourier Modelling: A Highly Compact Approach to Time-Series Analysis](https://arxiv.org/abs/2410.04703) [code](https://github.com/minkiml/NFM)
68. [CMMamba: channel mixing Mamba for time series forecasting](https://journalofbigdata.springeropen.com/articles/10.1186/s40537-024-01001-9)
69. [EffiCANet: Efficient Time Series Forecasting with Convolutional Attention](https://arxiv.org/abs/2411.04669)
70. [Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond](https://arxiv.org/abs/2412.06061)
71. [CycleNet: Enhancing Time Series Forecasting through Modeling Periodic Patterns](https://arxiv.org/abs/2409.18479) [code](https://github.com/ACAT-SCUT/CycleNet)
72. [Are Language Models Actually Useful for Time Series Forecasting?](https://arxiv.org/abs/2406.16964)
73. [SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion](https://arxiv.org/abs/2404.14197) [code](https://github.com/Secilia-Cxy/SOFTS)
74. [FTLinear: MLP based on Fourier Transform for Multivariate Time-series Forecasting](https://www.researchsquare.com/article/rs-5654336/v1)
75. [WPMixer: Efficient Multi-Resolution Mixing for Long-Term Time Series Forecasting](https://arxiv.org/abs/2412.17176) [code](https://github.com/Secure-and-Intelligent-Systems-Lab/WPMixer)
76. [Zero Shot Time Series Forecasting Using Kolmogorov Arnold Networks](https://arxiv.org/abs/2412.17853)
77. [BEAT: Balanced Frequency Adaptive Tuning for Long-Term Time-Series Forecasting](https://arxiv.org/abs/2501.19065) (2025) 🔥🔥🔥🔥🔥
78. [A Multi-Task Learning Approach to Linear Multivariate Forecasting](https://arxiv.org/abs/2502.03571) (2025)
79. [Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications](https://arxiv.org/abs/2502.03395) (2025)
80. [Day-ahead demand response potential prediction in residential buildings with HITSKAN: A fusion of Kolmogorov-Arnold networks and N-HiTS](https://www.sciencedirect.com/science/article/pii/S0378778825001859) (2025)
81. [Do We Really Need Deep Learning Models for Time Series Forecasting?](https://arxiv.org/abs/2101.02118) (2021)
82. [Two Steps Forward and One Behind: Rethinking Time Series Forecasting with Deep Learning](https://arxiv.org/abs/2304.04553) (2023)
83. [Are Self-Attentions Effective for Time Series Forecasting?](https://arxiv.org/abs/2405.16877) (2024)
84. [What Matters in Transformers? Not All Attention is Needed](https://arxiv.org/abs/2406.15786) (2024)
85. [Time Series Foundational Models: Their Role in Anomaly Detection and Prediction](https://arxiv.org/abs/2412.19286v1) (2024)
86. [Performance of Zero-Shot Time Series Foundation Models on Cloud Data](https://arxiv.org/abs/2502.12944) (2025) 🔥🔥🔥🔥🔥
87. [Position: There are no Champions in Long-Term Time Series Forecasting](https://arxiv.org/abs/2502.12161) (2025)
88. [FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting](https://arxiv.org/abs/2502.18834)
89. [Cherry-Picking in Time Series Forecasting: How to Select Datasets to Make Your Model Shine](https://arxiv.org/abs/2412.14435)
90. [TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods](https://arxiv.org/abs/2403.20150) 🔥🔥🔥🔥🔥
91. [DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting](https://arxiv.org/abs/2412.10859)
92. [CycleNet: Enhancing Time Series Forecasting through Modeling Periodic Patterns](https://arxiv.org/abs/2409.18479) [code](https://github.com/ACAT-SCUT/CycleNet)
93. [Can LLMs Understand Time Series Anomalies?](https://arxiv.org/abs/2410.05440) [code](https://github.com/rose-stl-lab/anomllm)
94. [EMForecaster: A Deep Learning Framework for Time Series Forecasting in Wireless Networks with Distribution-Free Uncertainty Quantification](https://arxiv.org/abs/2504.00120)
95. [Times2D: Multi-Period Decomposition and Derivative Mapping for General Time Series Forecasting](https://arxiv.org/abs/2504.00118) [code](https://github.com/Tims2D/Times2D)
96. [FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting](https://arxiv.org/abs/2505.04158)
97. [OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain](https://arxiv.org/abs/2505.08550)
98. [Does Scaling Law Apply in Time Series Forecasting?](https://arxiv.org/abs/2505.10172)
99. [TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning](https://arxiv.org/abs/2505.23719) [code](https://github.com/NX-AI/tirex) [article](https://huggingface.co/NX-AI/TiRex) [podcast](https://open.spotify.com/episode/5PmGnhjPf5JMI1SdVqwOu4) 🔥🔥🔥🔥🔥
100. [FreDF: Learning to Forecast in the Frequency Domain](https://arxiv.org/abs/2402.02399) [code](https://github.com/Master-PLC/FreDF)
101. [KARMA: A Multilevel Decomposition Hybrid Mamba Framework for Multivariate Long-Term Time Series Forecasting](https://arxiv.org/abs/2506.08939) [code](https://github.com/yedadasd/KARMA)
102. [MS-TVNet:A Long-Term Time Series Prediction Method Based on Multi-Scale Dynamic Convolution](https://arxiv.org/abs/2506.17253)
103. [Variational Mode Decomposition and Linear Embeddings are What You Need For Time-Series Forecasting](https://arxiv.org/abs/2408.16122) [code](https://github.com/Espalemit/VMD-With-LTSF-Linear) 🔥🔥🔥🔥🔥
104. [TimeSieve: Extracting Temporal Dynamics through Information Bottlenecks](https://arxiv.org/abs/2406.05036) [code](https://github.com/xll0328/TimeSieve)
105. [VisionTS++: Cross-Modal Time Series Foundation Model with Continual Pre-trained Visual Backbones](https://arxiv.org/abs/2508.04379) [code](https://github.com/HALF111/VisionTSpp) VisionTS 🔥🔥🔥🔥🔥
106. [VISIONTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters](https://arxiv.org/abs/2408.17253) [code](https://github.com/Keytoyze/VisionTS) VisionTS 🔥🔥🔥🔥🔥
107. [Wavelet Mixture of Experts for Time Series Forecasting](https://arxiv.org/abs/2508.08825)
108. [TFKAN: Time-Frequency KAN for Long-Term Time Series Forecasting](https://arxiv.org/abs/2506.12696)
109. [Why Do Transformers Fail to Forecast Time Series In-Context?](https://arxiv.org/abs/2510.09776) 🔥🔥🔥🔥🔥 [code](https://github.com/MasterZhou1/ICL-Time-Series)
110. [In-Context Learning of Linear Dynamical Systems with Transformers: Error Bounds and Depth-Separation](https://arxiv.org/abs/2502.08136)
111. [Time Series Foundation Models: Benchmarking Challenges and Requirements](https://arxiv.org/abs/2510.13654) 🔥🔥🔥🔥🔥
112. [DoFlow: Causal Generative Flows for Interventional and Counterfactual Time-Series Prediction](https://arxiv.org/abs/2511.02137) 🔥🔥🔥🔥🔥
113. [Hydra: Dual Exponentiated Memory for Multivariate Time Series Analysis](https://arxiv.org/abs/2511.00989) 🔥🔥🔥🔥🔥
114. [Mixture-of-KAN for Multivariate Time Series Forecasting](https://dl.acm.org/doi/abs/10.1145/3746252.3760836) 🔥🔥🔥🔥🔥
115. [A Realistic Evaluation of Cross-Frequency Transfer Learning and Foundation Forecasting Models](https://arxiv.org/abs/2509.19465) paper showing foundational models systematically underperform ARIMA and a simple statistical ensemble 🔥🔥🔥🔥🔥
116. [XLinear: A Lightweight and Accurate MLP-Based Model for Long-Term Time Series Forecasting with Exogenous Inputs](https://arxiv.org/abs/2601.09237)
117. [Position: The Inevitable End of One-Architecture-Fits-All-Domains in Time Series Forecasting](https://arxiv.org/abs/2602.01736)
118. [Assessing Electricity Demand Forecasting with Exogenous Data in Time Series Foundation Models](https://arxiv.org/abs/2602.05390)
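
Many of the linear baselines above (starting with LTSF-Linear in entry 1, and echoed by entries 19, 41, 65, and 97) rest on the same mechanism: decompose the lookback window into a moving-average trend plus a seasonal remainder, then map each part to the forecast horizon with a single linear layer. A minimal NumPy sketch of that idea follows; the class name, the closed-form least-squares fit (standing in for the papers' gradient training), and the toy series are illustrative assumptions, not any paper's actual code.

```python
import numpy as np

def moving_average(x, k=5):
    """Per-row moving average of shape (n, L), edge-padded so output stays (n, L)."""
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, k - 1 - pad)), mode="edge")
    kern = np.ones(k) / k
    return np.apply_along_axis(lambda r: np.convolve(r, kern, mode="valid"), 1, xp)

class DLinearSketch:
    """Trend/seasonal decomposition with one linear head each (DLinear-style sketch)."""
    def __init__(self, k=5):
        self.k = k

    def _features(self, X):
        trend = moving_average(X, self.k)       # smooth component of the window
        seasonal = X - trend                    # what the moving average removes
        bias = np.ones((X.shape[0], 1))
        return np.hstack([trend, seasonal, bias])

    def fit(self, X, Y):
        # Closed-form least squares stands in for SGD: both fit the same linear model.
        self.W, *_ = np.linalg.lstsq(self._features(X), Y, rcond=None)
        return self

    def predict(self, X):
        return self._features(X) @ self.W

# Usage: sliding windows over a toy trend-plus-seasonality series.
t = np.arange(200, dtype=float)
series = 0.1 * t + np.sin(2 * np.pi * t / 10)
L, H = 24, 4
X = np.stack([series[i:i + L] for i in range(len(series) - L - H)])
Y = np.stack([series[i + L:i + L + H] for i in range(len(series) - L - H)])
model = DLinearSketch().fit(X[:120], Y[:120])
err = np.abs(model.predict(X[120:]) - Y[120:]).mean()
```

On data whose future values are a linear function of the window (trend plus short-period seasonality, as here), this single-matrix model is already near-exact, which is the crux of the "linear beats Transformer" argument these papers make empirically.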

## Articles
1. [TimeGPT vs TiDE: Is Zero-Shot Inference the Future of Forecasting or Just Hype?](https://towardsdatascience.com/timegpt-vs-tide-is-zero-shot-inference-the-future-of-forecasting-or-just-hype-9063bdbe0b76) by Luís Roque and Rafael Guedes (2024) 🔥🔥🔥🔥🔥
2. [TimeGPT-1, discussion on Hacker News](https://news.ycombinator.com/item?id=37874891) (2023) 
3. [TimeGPT : The first Generative Pretrained Transformer for Time-Series Forecasting](https://www.reddit.com/r/MachineLearning/comments/176wsne/r_timegpt_the_first_generative_pretrained/)

