What is: Grouped-query attention?

Source: GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
Year: 2023
Data Source: CC BY-SA - https://paperswithcode.com

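Grouped-query attention is an interpolation of multi-query and multi-head attention that achieves quality close to multi-head attention at a speed comparable to multi-query attention. It divides the query heads into groups, and each group shares a single key head and value head. With one group it reduces to multi-query attention; with one group per query head it reduces to multi-head attention.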
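
As a rough illustration of the idea, the sketch below computes attention where several query heads share one key/value head. It is a minimal NumPy example and not the paper's implementation: the function name `grouped_query_attention`, the helper `softmax`, the parameter `num_groups`, and the tensor layout `(heads, sequence, head_dim)` are all assumptions made for this example, and it omits batching, masking, and the learned projections.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def grouped_query_attention(q, k, v, num_groups):
    """Hypothetical helper: attention with shared key/value heads.

    q:    (num_heads,  seq_len, head_dim)  one query head per attention head
    k, v: (num_groups, seq_len, head_dim)  one key/value head per group
    """
    num_heads, seq_len, head_dim = q.shape
    assert k.shape[0] == num_groups and v.shape[0] == num_groups
    assert num_heads % num_groups == 0, "heads must split evenly into groups"
    heads_per_group = num_heads // num_groups

    # Repeat each key/value head so that consecutive query heads share it:
    # group 0 serves heads 0..heads_per_group-1, group 1 the next block, etc.
    k_shared = np.repeat(k, heads_per_group, axis=0)
    v_shared = np.repeat(v, heads_per_group, axis=0)

    # Standard scaled dot-product attention, applied per head.
    scores = q @ k_shared.transpose(0, 2, 1) / np.sqrt(head_dim)
    weights = softmax(scores, axis=-1)
    return weights @ v_shared


# Example: 8 query heads sharing 2 key/value heads (4 query heads per group).
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))
k = rng.standard_normal((2, 4, 16))
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v, num_groups=2)
print(out.shape)  # (8, 4, 16)
```

In this sketch, setting `num_groups=1` gives the multi-query case and `num_groups` equal to the number of query heads gives the multi-head case. The speed benefit in practice comes from storing and loading only `num_groups` key/value heads; the `np.repeat` here is used only to keep the example short.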