
What is: Extended Transformer Construction?

Source: ETC: Encoding Long and Structured Inputs in Transformers
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Extended Transformer Construction, or ETC, is an extension of the Transformer architecture with a new attention mechanism that improves on the original in two main ways: (1) it scales the input length from 512 tokens to several thousand; and (2) it can ingest structured inputs rather than just linear sequences. The key ideas that enable this are a new global-local attention mechanism coupled with relative position encodings. ETC also allows lifting weights from existing BERT models, saving computational resources during training.
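The global-local attention pattern can be illustrated with a small sketch. This is not the actual ETC implementation (the function name and layout below are illustrative assumptions): it builds a boolean mask in which a short "global" segment attends to everything, while the long input attends only to the global tokens and to a fixed local neighborhood, which is what makes the cost linear rather than quadratic in the long input's length.

```python
import numpy as np

def etc_attention_mask(n_global, n_long, radius):
    """Illustrative sketch of ETC-style global-local attention.

    Token order: [global tokens | long-input tokens].
    - Global tokens attend to all tokens (g2g and g2l).
    - Long tokens attend to all global tokens (l2g) and to long
      tokens within a fixed local radius (l2l).
    mask[i, j] is True when token i may attend to token j.
    """
    n = n_global + n_long
    mask = np.zeros((n, n), dtype=bool)
    # Global tokens: full attention over the whole sequence.
    mask[:n_global, :] = True
    # Long tokens attend to every global token.
    mask[n_global:, :n_global] = True
    # Long tokens attend locally within the radius.
    for i in range(n_long):
        lo = max(0, i - radius)
        hi = min(n_long, i + radius + 1)
        mask[n_global + i, n_global + lo:n_global + hi] = True
    return mask

mask = etc_attention_mask(n_global=2, n_long=8, radius=1)
```

Each long token sees at most 2*radius+1 long tokens plus the global tokens, so the number of attended positions per token is constant; information between distant long tokens still flows indirectly through the global segment.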