What is: Routing Transformer?

Source: Efficient Content-Based Sparse Attention with Routing Transformers
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

The Routing Transformer is a Transformer that endows self-attention with a sparse routing module based on online k-means. Each attention module considers a clustering of the space: the current timestep attends only to context positions that belong to the same cluster. In other words, the query at the current timestep is routed to a limited number of context positions through its cluster assignment.
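The routing idea can be illustrated with a small NumPy sketch. It is not the authors' implementation. The function names `nearest_centroid`, `routing_attention`, and `update_centroids` are hypothetical, and so are the Euclidean nearest-centroid assignment, the exponential-moving-average centroid update, and the rule that a position may always attend to itself. The sketch also builds a dense T x T mask for readability, so it shows which positions are visible to each query but does not deliver the compute savings of a real sparse implementation.

```python
import numpy as np

def nearest_centroid(x, centroids):
    """Return, for each row of x, the index of the closest centroid."""
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(dists, axis=1)

def routing_attention(q, k, v, centroids):
    """Causal attention restricted to keys in the same cluster as the query.

    q, k, v: arrays of shape (T, d); centroids: array of shape (C, d).
    Returns the attended values (T, d) and the boolean mask (T, T).
    """
    T, d = q.shape
    q_cluster = nearest_centroid(q, centroids)
    k_cluster = nearest_centroid(k, centroids)

    # A query at timestep i may see key j only if j <= i and both share a cluster.
    same_cluster = q_cluster[:, None] == k_cluster[None, :]
    causal = np.tril(np.ones((T, T), dtype=bool))
    mask = same_cluster & causal
    # Assumption for this sketch: a position can always attend to itself,
    # which guarantees that no softmax row is empty.
    mask |= np.eye(T, dtype=bool)

    scores = (q @ k.T) / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, mask

def update_centroids(centroids, x, decay=0.999):
    """One online k-means step: move each centroid toward the mean of its members.

    Assumption: an exponential moving average with a hypothetical `decay` value.
    """
    assign = nearest_centroid(x, centroids)
    new_centroids = centroids.copy()
    for c in range(len(centroids)):
        members = x[assign == c]
        if len(members) > 0:
            new_centroids[c] = decay * centroids[c] + (1 - decay) * members.mean(axis=0)
    return new_centroids

# Usage with random data: 8 timesteps, 4 dimensions, 2 clusters.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 4)) for _ in range(3))
centroids = rng.normal(size=(2, 4))
out, mask = routing_attention(q, k, v, centroids)
centroids = update_centroids(centroids, np.concatenate([q, k]))
```

Each row of `mask` lists the context positions one query is routed to, so the number of `True` entries per row is the effective attention span for that timestep.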