Brain Titan
May 11, 2024
Expand the context window of Gemma 2B to 10 million tokens

Using a technique called Infini-Attention, the context window of Gemma 2B is expanded to 10 million tokens.

The expanded Gemma-10M model keeps memory and compute costs low:

- Supports sequences up to 10 million tokens in length.
- Uses no more than 32 GB of memory.
- O(1) memory and O(n) time complexity, so it can still run on consumer hardware.

Its main approach preserves long-range dependencies by combining recurrent local attention over segments with a compressive memory that summarizes earlier context.
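To make the idea concrete, here is a minimal single-head NumPy sketch in the style of Infini-Attention: the long input is processed segment by segment, each segment attends locally, retrieves older context from a fixed-size compressive memory, and then writes its keys/values into that memory. The segment length, feature map, and gate value are illustrative assumptions, not the actual Gemma-10M implementation.

```python
import numpy as np

def elu_plus_one(x):
    # Non-negative feature map sigma(x) = ELU(x) + 1 used for linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def infini_attention(segments, d_k, beta=0.0):
    """segments: list of (Q, K, V) arrays, each of shape (seg_len, d_k)."""
    M = np.zeros((d_k, d_k))             # compressive memory: fixed size, O(1) in sequence length
    z = np.zeros((d_k, 1))               # normalization term for memory retrieval
    gate = 1.0 / (1.0 + np.exp(-beta))   # a learned scalar gate in the real model; fixed here
    outputs = []
    for Q, K, V in segments:
        # 1) Ordinary softmax attention restricted to the current segment.
        scores = Q @ K.T / np.sqrt(d_k)
        scores -= scores.max(axis=-1, keepdims=True)
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        local = weights @ V
        # 2) Retrieve long-range context from the compressive memory.
        sigma_q = elu_plus_one(Q)
        mem = (sigma_q @ M) / (sigma_q @ z + 1e-6)
        # 3) Blend memory retrieval with local attention.
        outputs.append(gate * mem + (1.0 - gate) * local)
        # 4) Write this segment's keys/values into the memory (size stays constant).
        sigma_k = elu_plus_one(K)
        M += sigma_k.T @ V
        z += sigma_k.sum(axis=0, keepdims=True).T
    return np.concatenate(outputs, axis=0)

# Toy usage: four segments of 128 tokens each, head dimension 64.
rng = np.random.default_rng(0)
segs = [tuple(rng.standard_normal((128, 64)) for _ in range(3)) for _ in range(4)]
out = infini_attention(segs, d_k=64)
print(out.shape)  # (512, 64)
```

Because the memory matrix never grows with sequence length, per-step memory stays constant and total work scales linearly with the number of tokens, which is what allows very long inputs to fit on consumer hardware.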

Model Download: https://t.co/3Auv2AkQCK