What is: CRISS?

Source: Cross-lingual Retrieval for Iterative Self-Supervised Training
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

CRISS (Cross-lingual Retrieval for Iterative Self-Supervised Training) is a self-supervised learning method for multilingual sequence generation. It builds on the finding that the encoder outputs of a multilingual denoising autoencoder can serve as language-agnostic representations for retrieving parallel sentence pairs, and that training the model on these retrieved pairs further improves its sentence retrieval and translation capabilities in an iterative manner. Using only unlabeled data from many different languages, CRISS iteratively mines parallel sentences across languages, trains a new, better multilingual model on the mined pairs, mines again for better parallel sentences, and repeats.
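
The mine-then-train loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `mine_pairs`, `criss_loop`, `encode`, and `train` are hypothetical, the mutual-nearest-neighbour test with a cosine threshold is a simplified stand-in for whatever scoring a real system would use, and the toy vectors stand in for real encoder outputs.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Encoder = Callable[[str], Vector]
Pair = Tuple[str, str]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mine_pairs(encode: Encoder, src: List[str], tgt: List[str],
               threshold: float = 0.9) -> List[Pair]:
    """Keep (src, tgt) pairs that are mutual nearest neighbours above `threshold`.

    Assumption: this criterion is a simplification chosen for the sketch.
    """
    if not src or not tgt:
        return []
    src_vecs = [encode(s) for s in src]
    tgt_vecs = [encode(t) for t in tgt]
    sims = [[cosine(u, v) for v in tgt_vecs] for u in src_vecs]
    pairs = []
    for i, row in enumerate(sims):
        j = max(range(len(tgt)), key=row.__getitem__)            # best target for source i
        best_i = max(range(len(src)), key=lambda k: sims[k][j])  # best source for target j
        if best_i == i and row[j] >= threshold:
            pairs.append((src[i], tgt[j]))
    return pairs


def criss_loop(encode: Encoder,
               train: Callable[[Encoder, List[Pair]], Encoder],
               src: List[str], tgt: List[str],
               iterations: int = 3) -> Tuple[Encoder, List[Pair]]:
    """Alternate mining and training; `train` must return the updated encoder."""
    mined: List[Pair] = []
    for _ in range(iterations):
        mined = mine_pairs(encode, src, tgt)  # 1. mine with the current model
        encode = train(encode, mined)         # 2. train on the mined pairs
    return encode, mined


# Toy data: hand-written vectors play the role of language-agnostic embeddings.
toy_vectors = {
    "the cat sleeps": (1.0, 0.0, 0.1),
    "it is raining": (0.0, 1.0, 0.0),
    "le chat dort": (0.9, 0.1, 0.1),
    "il pleut": (0.1, 0.9, 0.0),
    "bonjour": (0.0, 0.1, 1.0),
}
english = ["the cat sleeps", "it is raining"]
french = ["le chat dort", "il pleut", "bonjour"]

# Hypothetical placeholder: a real `train` would fine-tune the model on the pairs.
identity_train = lambda enc, pairs: enc

_, mined_pairs = criss_loop(toy_vectors.__getitem__, identity_train, english, french)
print(mined_pairs)
```

With these toy vectors the loop pairs each English sentence with its French counterpart and leaves "bonjour" unmatched. In a real setting the benefit comes from `train` actually changing the encoder, so that each mining pass works with better representations than the previous one.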