While Transformers [4] have been the driving force behind the rise of LLMs, before them the Long Short-Term Memory (LSTM) network [5] was one of the state-of-the-art machine learning architectures.
However, it had shortcomings compared to Transformers, which the inventor of LSTM, Sepp Hochreiter, sought to rectify with the introduction of xLSTM.
While Transformers are a powerful machine learning architecture, they are extremely resource-hungry: self-attention's time and memory complexity is O(N²) in the sequence length N. xLSTM, on the other hand, has O(N) time complexity and O(1) memory complexity [6].
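To make that contrast concrete, here is a minimal toy sketch (my own illustration, not code from the talk or the xLSTM paper): attention materializes an N×N score matrix, whereas a recurrent cell only updates a fixed-size state per token. A plain tanh RNN cell stands in here for the more elaborate LSTM/xLSTM update, since the complexity argument is the same.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: the N x N score matrix
    makes time and memory scale as O(N^2) in sequence length N."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # shape (N, N)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V

def recurrent_scan(xs, W, U, h0):
    """A recurrent pass (plain tanh cell as a stand-in for LSTM):
    one fixed-size state is updated per token, so time is O(N)
    and the state memory is O(1) in the sequence length."""
    h = h0
    for x in xs:                                       # N sequential steps
        h = np.tanh(W @ x + U @ h)                     # constant-size update
    return h

rng = np.random.default_rng(0)
N, d = 8, 4                                            # toy sizes
X = rng.normal(size=(N, d))
print(attention(X, X, X).shape)                        # (8, 4), via an (8, 8) score matrix
print(recurrent_scan(X, rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)), np.zeros(d)).shape)  # (4,)
```

The trade-off is also visible in the sketch: the recurrent loop is inherently sequential, which is part of what the xLSTM work addresses with its parallelizable variants.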
I gave this talk at our machine learning journal club after being invited to do so. My research work is usually far removed from the concrete implementation details of machine learning architectures; however, I was still interested, and accepting the invitation forced me to dedicate time to investigating this topic.
While the depth and perspective of the talk very much reflect my own view of the topic, the slides should still be useful for gaining an overview of xLSTM and its surrounding technologies.
The full slides of the talk can be found here: Slides.
References
- ERROR: Missing or incorrect citation ID ("ksi")
- ERROR: Missing or incorrect citation ID ("saia")
- ERROR: Missing or incorrect citation ID ("method")
- ERROR: Missing or incorrect citation ID ("transformers")
- ERROR: Missing or incorrect citation ID ("lstm")
- ERROR: Missing or incorrect citation ID ("xlstm")