
What is: Audiovisual SlowFast Network?

Source: Audiovisual SlowFast Networks for Video Recognition
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Audiovisual SlowFast Network, or AVSlowFast, is an architecture for integrated audiovisual perception. It combines Slow and Fast visual pathways with a Faster Audio pathway to model vision and sound in a unified representation. Audio and visual features are fused at multiple layers, so audio contributes to the formation of hierarchical audiovisual concepts. Because the audio and visual modalities have different learning dynamics, joint training is difficult; AVSlowFast addresses this with DropPathway, a regularization technique that randomly drops the Audio pathway during training. The network also performs hierarchical audiovisual synchronization, inspired by prior studies in neuroscience, to learn joint audiovisual features.
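
To make the DropPathway idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes that fusion is a simple element-wise addition of audio features into visual features (a stand-in for the actual fusion used in the network), that one drop decision is sampled per training iteration and applied at every fusion stage, and that the Audio pathway is always kept at inference. The names `fuse_stage`, `forward_with_drop_pathway`, and `drop_prob`, and the value 0.8, are hypothetical and chosen only for illustration.

```python
import random

def fuse_stage(visual, audio, drop_audio):
    """Fuse one stage: add audio features to visual ones unless audio is dropped.

    Element-wise addition is an assumption made for this sketch.
    """
    if drop_audio:
        return list(visual)  # visual features pass through unchanged
    return [v + a for v, a in zip(visual, audio)]

def forward_with_drop_pathway(visual_stages, audio_stages, drop_prob, training, rng=random):
    """Fuse audio into visual features at every stage, with DropPathway.

    One drop decision is sampled per call (one training iteration) and
    applied to all fusion stages. Outside training, audio is never dropped.
    """
    drop_audio = training and rng.random() < drop_prob
    fused = [fuse_stage(v, a, drop_audio)
             for v, a in zip(visual_stages, audio_stages)]
    return fused, drop_audio

# Toy features: two fusion stages, three channels each (made-up numbers).
visual = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
audio = [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]

rng = random.Random(0)  # seeded so repeated runs give the same decision
fused, dropped = forward_with_drop_pathway(
    visual, audio, drop_prob=0.8, training=True, rng=rng)
print(dropped, fused)
```

When the Audio pathway is dropped, the fused features equal the visual features, so that iteration updates only the visual pathways. In this sketch that is how random dropping lets the two modalities train at different effective rates.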