xLSTM, The One To Overcome Transformers

2024 Jul 23

talk

While Transformers [4] have been the driving force behind the rise of LLMs, before them, LSTM (Long Short-Term Memory) [5] was one of the state-of-the-art machine learning model architectures.
However, it had shortcomings compared to Transformers, which Sepp Hochreiter, co-inventor of LSTM, sought to rectify with the introduction of xLSTM.
While Transformers are a powerful machine learning architecture, they are extremely resource hungry, as the time and memory complexity of their self-attention is O(N²) in the sequence length N.
xLSTM, on the other hand, has time complexity O(N) and memory complexity O(1) in the sequence length [6].
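To make the difference in scaling concrete, here is a minimal Python sketch. It is not an xLSTM implementation: `recurrent_scan` is a hypothetical toy recurrence (an exponential moving average) that only stands in for any model carrying a fixed-size state, and `attention_score_entries` simply counts the pairwise scores that full self-attention would compute. Both function names and the `decay` parameter are assumptions made for this illustration.

```python
def attention_score_entries(n_tokens: int) -> int:
    """Count the pairwise scores full self-attention computes for n_tokens."""
    # Every token attends to every token, giving an n x n score matrix.
    return n_tokens * n_tokens


def recurrent_scan(inputs, decay=0.9):
    """Process a sequence with a fixed-size state (toy stand-in, NOT xLSTM).

    Returns the final state and the number of update steps performed.
    """
    state = 0.0  # a single float: the state size is independent of the sequence length
    steps = 0
    for x in inputs:
        # One constant-cost update per token, so total time grows linearly.
        state = decay * state + (1.0 - decay) * x
        steps += 1
    return state, steps


if __name__ == "__main__":
    # Doubling the sequence length quadruples the attention scores,
    # while the recurrent model only needs twice as many update steps.
    for n in (1_000, 2_000, 4_000):
        _, steps = recurrent_scan([1.0] * n)
        print(n, attention_score_entries(n), steps)
```

The point of the sketch is only the growth rate: the attention count grows with the square of the sequence length, whereas the recurrent loop performs one update per token and never stores more than its single state value.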

I gave this talk in our machine learning journal club after being invited to do so.
My research is usually removed from the concrete implementation details of machine learning architectures. However, I was still interested, and by accepting the invitation I forced myself to dedicate time to investigating this topic.
While the depth and perspective of the talk very much reflect my own view of the topic, the talk is still useful for gaining an overview of xLSTM and its surrounding technologies.
The full slides of the talk can be found here: Slides.

References

ERROR: Missing or incorrect citation ID ("ksi")
ERROR: Missing or incorrect citation ID ("saia")
ERROR: Missing or incorrect citation ID ("method")
ERROR: Missing or incorrect citation ID ("transformers")
ERROR: Missing or incorrect citation ID ("lstm")
ERROR: Missing or incorrect citation ID ("xlstm")

About Me

Dr. Jonathan Decker

Jonathan is a scientific employee of the Georg-August-University of Göttingen and a postdoc researcher. He takes the role of a system architect and is focused on designing systems that enable new and novel ways of utilizing Cloud and HPC resources, while also being efficient, secure and scalable. Most notably, he strives to combine HPC with Kubernetes. In addition to conducting research in these topics, he handles university teaching activities.

inbox@dr-decker.example.science
