
What is: CharacterBERT?

Source: CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

CharacterBERT is a variant of BERT that drops the wordpiece system and replaces it with a CharacterCNN module, the same kind of module ELMo uses to produce its first-layer representation. Because each token is encoded directly from its characters, CharacterBERT can represent any input token as a single unit without splitting it into wordpieces. This also frees BERT from the burden of a domain-specific wordpiece vocabulary, which may not be suited to your domain of interest (e.g. the medical domain). Finally, it makes the model more robust to noisy inputs such as misspellings.
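To make the contrast concrete, here is a minimal sketch in plain Python of the two input schemes. It is not the CharacterBERT implementation: the toy vocabulary `TOY_VOCAB`, the padding length `MAX_CHARS`, the padding ID, and the helper names `wordpiece_split` and `char_ids` are all hypothetical choices made for illustration. The first helper does a greedy longest-match wordpiece split; the second turns a token into a fixed-length list of character (byte) IDs, the kind of input a character-level CNN could consume to produce one vector per token.

```python
# Toy illustration only: the vocabulary, padding length, and ID scheme below
# are hypothetical and are not CharacterBERT's actual values.

TOY_VOCAB = {"para", "##ce", "##ta", "##mol", "[UNK]"}
MAX_CHARS = 16  # assumed number of character slots per token
PAD_ID = 0      # assumed padding ID; byte values are shifted by 1 to avoid it


def wordpiece_split(token, vocab):
    """Greedy longest-match-first split into wordpieces."""
    pieces, start = [], 0
    while start < len(token):
        end = len(token)
        piece = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while end > start:
            candidate = token[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: the whole token is unknown.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def char_ids(token, max_chars=MAX_CHARS):
    """Map one token to a fixed-length list of UTF-8 byte IDs (truncate, then pad)."""
    ids = [b + 1 for b in token.encode("utf-8")][:max_chars]
    return ids + [PAD_ID] * (max_chars - len(ids))


if __name__ == "__main__":
    print(wordpiece_split("paracetamol", TOY_VOCAB))
    print(wordpiece_split("xyz", TOY_VOCAB))
    print(char_ids("paracetamol"))
    print(char_ids("xyz"))
```

With this toy vocabulary, a medical term is broken into several pieces and an unseen string collapses to `[UNK]`, while the character-level view gives every token, known or not, exactly one fixed-size row of IDs.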