MatthewBerman (@MatthewBerman)

Published: January 15, 2025

1/ Google Research unveils a new paper: "Titans: Learning to Memorize at Test Time." It introduces human-like memory structures to overcome the limits of Transformers, with one "SURPRISING" feature. Here's why this is huge for AI. 🧵👇

2/ The Problem: Transformers, the backbone of most AI today, struggle with long-term memory because attention's compute and memory grow quadratically with context length. Basically, there's a big penalty for long context windows! Titans aim to solve this while scaling to far longer sequences.

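A quick illustration (mine, not from the paper) of why that penalty exists: vanilla attention builds a seq_len × seq_len score matrix, so doubling the context quadruples the memory needed for the scores.

```python
import torch

def attention_score_elements(seq_len: int, d_model: int = 64) -> int:
    """Count the entries in the full attention score matrix for one head."""
    q = torch.randn(seq_len, d_model)
    k = torch.randn(seq_len, d_model)
    scores = q @ k.T              # shape (seq_len, seq_len): quadratic in context length
    return scores.numel()

for n in (1_000, 2_000, 4_000):
    print(n, attention_score_elements(n))   # 1e6, 4e6, 16e6 entries: 2x context -> 4x memory
```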

3/ What Makes Titans Different? Inspired by human memory, Titans integrate:
• Short-term memory (real-time processing)
• Long-term memory (retaining key past information)
• Persistent memory (task-specific baked-in knowledge)
This modular approach mimics how the brain works.

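To make those three pieces concrete, here's a minimal sketch (my own illustration, with made-up module names and sizes, not the paper's implementation) of how one block could combine a short-term attention window, a long-term memory network, and a set of persistent learned tokens:

```python
import torch
import torch.nn as nn

class TitansStyleBlock(nn.Module):
    """Illustrative only: short-term + long-term + persistent memory in one block."""
    def __init__(self, d_model: int = 128, n_persistent: int = 16, window: int = 64):
        super().__init__()
        self.window = window                       # short-term = attention over recent tokens
        self.short_term = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.long_term = nn.Sequential(            # small MLP acting as learned memory
            nn.Linear(d_model, d_model), nn.SiLU(), nn.Linear(d_model, d_model))
        # persistent memory: input-independent tokens holding task knowledge, fixed after training
        self.persistent = nn.Parameter(torch.randn(1, n_persistent, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        recent = x[:, -self.window:, :]                        # keep only the recent window
        persistent = self.persistent.expand(x.size(0), -1, -1)
        context = torch.cat([persistent, recent], dim=1)       # prepend persistent tokens
        attended, _ = self.short_term(recent, context, context)
        recalled = self.long_term(recent)                      # long-term recall pathway
        return attended + recalled

block = TitansStyleBlock()
print(block(torch.randn(2, 200, 128)).shape)   # torch.Size([2, 64, 128])
```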

4/ Game-Changer: Memory at Test Time
Titans can learn and adapt during inference (test time), unlike standard Transformers, whose weights are frozen after pre-training. This means:
• Dynamic updating of memory during real-time use.
• Better generalization and contextual understanding.

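A rough sketch of what "learning during inference" can look like: a small memory MLP gets a gradient step on each incoming chunk while the rest of the model stays frozen. The loss and learning rate here are simplified placeholders, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def test_time_update(memory: torch.nn.Module, keys: torch.Tensor,
                     values: torch.Tensor, lr: float = 1e-2) -> float:
    """One inner-loop step: teach the memory to map this chunk's keys to its values."""
    loss = F.mse_loss(memory(keys), values)          # how poorly memory recalls this chunk
    grads = torch.autograd.grad(loss, list(memory.parameters()))
    with torch.no_grad():
        for p, g in zip(memory.parameters(), grads):
            p -= lr * g                              # memory weights change at inference time
    return loss.item()

# Usage sketch: keys/values would come from projections of the current tokens.
memory = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.SiLU(), torch.nn.Linear(64, 64))
keys, values = torch.randn(32, 64), torch.randn(32, 64)
print(test_time_update(memory, keys, values))
```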

5/ The "Surprise" Mechanism: Humans remember surprising events better. Titans use a "surprise" metric to prioritize what to memorize and what to forget.
• Adaptive forgetting ensures efficiency.
• Surprising inputs create stronger memory retention.
This leads to smarter, leaner models.

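Sketched as code, the surprise update looks roughly like momentum plus a forgetting gate: a big gradient on the current chunk means it was surprising, so it gets written into memory more strongly, while a small decay gradually erases stale content. The coefficients below (eta, theta, alpha) are fixed placeholders; the paper makes them data-dependent.

```python
import torch
import torch.nn.functional as F

def surprise_step(memory, keys, values, state, eta=0.9, theta=1e-2, alpha=1e-3):
    """Momentum-like surprise update with adaptive forgetting (coefficients are illustrative)."""
    loss = F.mse_loss(memory(keys), values)                 # large loss => surprising chunk
    grads = torch.autograd.grad(loss, list(memory.parameters()))
    with torch.no_grad():
        new_state = []
        for p, g, s in zip(memory.parameters(), grads, state):
            s_new = eta * s - theta * g                     # past surprise decays, new surprise flows in
            p.mul_(1 - alpha).add_(s_new)                   # forget a little, then write the surprise
            new_state.append(s_new)
    return new_state, loss.item()

memory = torch.nn.Linear(64, 64)
state = [torch.zeros_like(p) for p in memory.parameters()]
keys, values = torch.randn(32, 64), torch.randn(32, 64)
state, surprise = surprise_step(memory, keys, values, state)
print(surprise)
```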

6/ Three Architectural Variants: Titans offer flexible implementations based on the use case:
• Memory as Context (MAC): Best for tasks needing detailed historical context.
• Memory as Gate (MAG): Balances short- and long-term memory.
• Memory as Layer (MAL): Most efficient, slightly less powerful.
Trade-offs for every need!

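Here's a toy version (my illustration, not the paper's code) of the gating idea behind MAG: a learned gate decides, per channel, how much to trust the short-term attention branch versus the long-term memory branch.

```python
import torch
import torch.nn as nn

class GatedMemoryCombine(nn.Module):
    """Toy MAG-style mixer: blend attention output with long-term memory output."""
    def __init__(self, d_model: int = 128):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, attn_out: torch.Tensor, memory_out: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(torch.cat([attn_out, memory_out], dim=-1)))
        return g * attn_out + (1 - g) * memory_out      # per-channel blend of the two branches

mixer = GatedMemoryCombine()
attn_out, memory_out = torch.randn(2, 10, 128), torch.randn(2, 10, 128)
print(mixer(attn_out, memory_out).shape)                # torch.Size([2, 10, 128])
```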

7/ Performance: Titans outperform Transformers and other models in:
• Language modeling.
• Common-sense reasoning.
• Needle-in-a-haystack tasks (retrieving data in vast contexts).
• DNA modeling & time-series forecasting.
They maintain high accuracy even with millions of tokens.

8/ Why This Matters:
• Massive context: far looser limits on how much info models can process at once.
• Real-time adaptation: models learn dynamically, like humans.
• Scalability: opens the door for AI in genomics, long video understanding, and reasoning across massive datasets.

9/ Key Innovations:
• Surprise-based memory prioritization.
• Efficient, scalable architectures with adaptive forgetting.
• Parallelizable training algorithms for better hardware utilization.
Titans bridge the gap between AI and human-like reasoning.

10/ What’s Next? With Titans, we could see breakthroughs in AI applications that demand massive context, from personalized healthcare to real-time video analytics.
Read the paper here: https://arxiv.org/abs/2501.006...
Check out my video breakdown here: https://www.youtube.com/watch?...
What do you think of Titans? Let’s discuss. 💬
