
What is: End-to-End Neural Diarization?

Source: End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

End-to-End Neural Diarization (EEND) is a speaker diarization method in which a single neural network directly outputs diarization results given a multi-speaker recording. To realize such an end-to-end model, speaker diarization is formulated as a multi-label classification problem: for each frame, the network predicts whether each speaker is active, independently of the other speakers. A permutation-free objective function is introduced to directly minimize diarization errors, regardless of the order in which speakers are assigned to the network outputs. Because several labels can be active in the same frame, EEND explicitly handles speaker overlaps during both training and inference. The model can also be adapted to real conversations simply by training it on multi-speaker recordings with the corresponding speaker segment labels.
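
As an illustration of the permutation-free objective, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names (`binary_cross_entropy`, `permutation_free_loss`, `active_speakers`), the brute-force search over all speaker permutations, and the 0.5 decision threshold are assumptions made for this example. The sketch assumes the network has already produced, for each frame, one activity probability per speaker.

```python
import itertools
import math


def binary_cross_entropy(prob, label, eps=1e-7):
    """Binary cross-entropy for one probability and one 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def permutation_free_loss(probs, labels):
    """Smallest mean cross-entropy over all speaker orderings.

    probs, labels: lists of frames, each frame a list with one entry
    per speaker. Returns (loss, perm), where perm[s] is the label
    column matched to output column s.
    """
    num_frames = len(probs)
    num_speakers = len(probs[0])
    best_loss, best_perm = float("inf"), None
    # Brute force is fine for a handful of speakers (S! orderings).
    for perm in itertools.permutations(range(num_speakers)):
        total = sum(
            binary_cross_entropy(probs[t][s], labels[t][perm[s]])
            for t in range(num_frames)
            for s in range(num_speakers)
        )
        loss = total / (num_frames * num_speakers)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


def active_speakers(probs, threshold=0.5):
    """Per-frame list of output columns whose probability exceeds the threshold."""
    return [[s for s, p in enumerate(frame) if p > threshold] for frame in probs]


# Toy example: 4 frames, 2 speakers; frame 1 is an overlap (both active).
labels = [[1, 0], [1, 1], [0, 1], [0, 0]]
# Hypothetical network outputs with the two speaker columns swapped.
probs = [[0.1, 0.9], [0.8, 0.9], [0.9, 0.2], [0.1, 0.1]]

loss, perm = permutation_free_loss(probs, labels)
decoded = active_speakers(probs)
```

In this toy input the output columns are swapped relative to the labels, so the search selects the swapped ordering rather than penalizing the model for an arbitrary speaker order. Thresholding each output independently is what lets a frame carry zero, one, or several active speakers.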